[syslog-ng] allowed concurrent connections - bug?
Evan Rempel
erempel at uvic.ca
Fri Feb 18 17:45:59 UTC 2022
Some more anecdotal details. I know this is not really a systematic
approach to tyrouble shooting this.
I guess I'll have to turn on the verbose stats. We collect them
regularly so that might get overwhelmed.
The issue started again.
Number of allowed concurrent connections reached, rejecting connection;
client='AF_INET(XXXX:50773)', local='AF_INET(YYYY:6514)',
group_name='client_network_tcp',
location='/etc/syslog-ng/syslog-ng.server.conf:61:9', max='15000'
The established connection count was 2747
$ netstat -an | grep ESTABLISHED | grep -c 6514
2747
The output destination stats were
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;1945402
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;105619
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;memory_usage;148574728
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;written;1839783
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;truncated_count;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;truncated_bytes;0
repeating this statistics report a few times with 2-3 seconds between
then it shows
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;1945406
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;105619
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;1945406
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;105619
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;1945410
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;105619
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;1945410
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;105619
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;1960812
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;121005
reload syslong-ng
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;2084011
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;97926
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;2084013
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;97938
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;2103861
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;19842
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;2107477
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;23463
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;2112825
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;28810
By the time I was this far composingthis message, things seemed to have
stalled again
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;2909715
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;54738
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;2909718
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;54738
eventually it started cycling through the large queueing and flushing
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;3177968
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;157790
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;3187677
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;2150
with periods of getting stuck
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;3418242
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;62294
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;3418246
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;62294
and judging by the cpu usage, not all of the log messages are being
processed.
Verbose stats ... here I come.
On 2022-02-18 04:08, Laszlo Varady (lvarady) wrote:
>
> Hi,
>
> A macroed file destination is unlikely to cause such an issue if the
> location is writable.
>
> Just a tip that might help ruling out the case I mentioned:
> Setting the stats-level() to 4 results in an extremely verbose stats
> output, where a counter called "free_window" can be found for each
> network connection.
> This can be used to check whether a connection is suspended or not.
> Note that this is a momentary value, which oscillates between 0 and
> full_window, so a momentary 0 does not mean anything bad,
> but we're looking for fixed 0 "free_window" values.
>
> --
> László Várady
> ------------------------------------------------------------------------
> *From:* syslog-ng <syslog-ng-bounces at lists.balabit.hu> on behalf of
> Evan Rempel <erempel at uvic.ca>
> *Sent:* Friday, February 18, 2022 3:22
> *To:* syslog-ng at lists.balabit.hu <syslog-ng at lists.balabit.hu>
> *Subject:* Re: [syslog-ng] allowed concurrent connections - bug?
> CAUTION: This email originated from outside of the organization. Do
> not follow guidance, click links, or open attachments unless you
> recognize the sender and know the content is safe.
>
> There is only 1 destination, although it is a file named with macros
> of date and hour.
>
> destination workstation.log {
> file("/var/syslog/workstation.log.$R_YEAR$R_MONTH$R_DAY.${R_HOUR}0000"
> ); };
>
> And that volume has never become full.
>
> Also, with a file based destination I can't actually turn flow-control
> off since files have soft flow-control.
>
> Because I have 3500 real connections that are all active (total of
> 10,000 messages per second) syslog-ng cycles through reading 100
> messages from each source for 35000 messages and then writing those to
> disk. it is very common for the queued messages to fluctuate from a
> few hundred to 200,000 messages.
>
> I may have to add some new metrics to our statistics gathering to
> understand more about what is happening.
>
> Evan Rempel.
>
>
> On 2022-02-17 13:13, Laszlo Varady (lvarady) wrote:
>>
>> Hi,
>>
>> Do you have flags(flow-control) specified in your log paths?
>> If so, a dead destination in such log paths might cause the mentioned
>> issue.
>>
>> When flow-control is activated, the corresponding sources will be
>> suspended. This suspended state does not even allow syslog-ng to
>> truly release connections that have been closed by the clients.
>> This is actually more of expected behavior as we don't want to allow
>> new connections in situations where logs could not be delivered anyway.
>>
>> Please check the queued statistic counters of "syslog-ng-ctl stats"
>> to see whether this is the case.
>>
>> In case of anything else, I would suspect a bug.
>>
>> --
>> László Várady
>> ------------------------------------------------------------------------
>> *From:* syslog-ng <syslog-ng-bounces at lists.balabit.hu>
>> <mailto:syslog-ng-bounces at lists.balabit.hu> on behalf of Evan Rempel
>> <erempel at uvic.ca> <mailto:erempel at uvic.ca>
>> *Sent:* Thursday, February 17, 2022 19:01
>> *To:* syslog-ng at lists.balabit.hu <mailto:syslog-ng at lists.balabit.hu>
>> <syslog-ng at lists.balabit.hu> <mailto:syslog-ng at lists.balabit.hu>
>> *Subject:* [syslog-ng] allowed concurrent connections - bug?
>>
>> I am having an issue that is a little difficult to reproduce so I wanted
>> some input from others.
>>
>> I have a syslog-ng 3.35.1 that has a TLS source defined with
>> max-connections(10000)
>>
>> After some time the server starts logging a lot of messages
>>
>> syslog-ng[12802]: Number of allowed concurrent connections reached,
>> rejecting connection; client='AF_INET(XXXX:61062)',
>> local='AF_INET(YYYY:6514)', group_name='client_network_tcp',
>> location='/etc/syslog-ng/syslog-ng.server.conf:61:9', max='10000'
>>
>> To the best of my ability I can only find about 2500 actual connections.
>>
>> Both lsof and netstat report around the 2500 connections.
>>
>> I had to restart syslog-ng to stop this situation.
>>
>> Has anyone seen this behavior before?
>>
>> I get a lot of TLS connections without a certificate.
>>
>> Error reading RFC6587 style framed data
>>
>> Pperhaps the counters are not decremented for those timed out
>> connections?
>>
>> --
>> Evan Rempel
--
Evan Rempel 250.721.7691
Senior Systems Administratorerempel at uvic.ca
Data Centre Services, University Systems, University of Victoria
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.balabit.hu/pipermail/syslog-ng/attachments/20220218/731f37bd/attachment-0001.htm>
More information about the syslog-ng
mailing list