[syslog-ng] allowed concurrent connections - bug?

Evan Rempel erempel at uvic.ca
Fri Feb 18 17:45:59 UTC 2022


Some more anecdotal details. I know this is not really a systematic 
approach to tyrouble shooting this.
I guess I'll have to turn on the verbose stats. We collect them 
regularly so that might get overwhelmed.

The issue started again.

Number of allowed concurrent connections reached, rejecting connection; 
client='AF_INET(XXXX:50773)', local='AF_INET(YYYY:6514)', 
group_name='client_network_tcp', 
location='/etc/syslog-ng/syslog-ng.server.conf:61:9', max='15000'


The established connection count was 2747

$ netstat -an | grep ESTABLISHED | grep -c 6514
2747

The output destination stats were

dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;1945402
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;105619
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;memory_usage;148574728
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;written;1839783
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;truncated_count;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;truncated_bytes;0

repeating this statistics report a few times with 2-3 seconds between 
then it shows

dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;1945406
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;105619

dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;1945406
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;105619

dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;1945410
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;105619

dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;1945410
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;105619

dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;1960812
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;121005

reload syslong-ng

dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;2084011
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;97926

dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;2084013
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;97938


dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;2103861
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;19842

dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;2107477
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;23463

dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;2112825
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;28810

By the time I was this far composingthis message, things seemed to have 
stalled again

dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;2909715
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;54738

dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;2909718
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;54738

eventually it started cycling through the large queueing and flushing

dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;3177968
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;157790

dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;3187677
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;2150

with periods of getting stuck

dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;3418242
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;62294

dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;dropped;0
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;processed;3418246
dst.file;workstation.log#0;/var/syslog/workstation.log.20220218.090000;a;queued;62294

and judging by the cpu usage, not all of the log messages are being 
processed.

Verbose stats ... here I come.


On 2022-02-18 04:08, Laszlo Varady (lvarady) wrote:
>
> Hi,
>
> A macroed file destination is unlikely to cause such an issue if the 
> location is writable.
>
> Just a tip that might help ruling out the case I mentioned:
> Setting the stats-level() to 4 results in an extremely verbose stats 
> output, where a counter called "free_window" can be found for each 
> network connection.
> This can be used to check whether a connection is suspended or not. 
> Note that this is a momentary value, which oscillates between 0 and 
> full_window, so a momentary 0 does not mean anything bad,
> but we're looking for fixed 0 "free_window" values.
>
> --
> László Várady
> ------------------------------------------------------------------------
> *From:* syslog-ng <syslog-ng-bounces at lists.balabit.hu> on behalf of 
> Evan Rempel <erempel at uvic.ca>
> *Sent:* Friday, February 18, 2022 3:22
> *To:* syslog-ng at lists.balabit.hu <syslog-ng at lists.balabit.hu>
> *Subject:* Re: [syslog-ng] allowed concurrent connections - bug?
> CAUTION: This email originated from outside of the organization. Do 
> not follow guidance, click links, or open attachments unless you 
> recognize the sender and know the content is safe.
>
> There is only 1 destination, although it is a file named with macros 
> of date and hour.
>
> destination workstation.log { 
> file("/var/syslog/workstation.log.$R_YEAR$R_MONTH$R_DAY.${R_HOUR}0000" 
> ); };
>
> And that volume has never become full.
>
> Also, with a file based destination I can't actually turn flow-control 
> off since files have soft flow-control.
>
> Because I have 3500 real connections that are all active (total of 
> 10,000 messages per second) syslog-ng cycles through reading 100 
> messages from each source for 35000 messages and then writing those to 
> disk. it is very common for the queued messages to fluctuate from a 
> few hundred to 200,000 messages.
>
> I may have to add some new metrics to our statistics gathering to 
> understand more about what is happening.
>
> Evan Rempel.
>
>
> On 2022-02-17 13:13, Laszlo Varady (lvarady) wrote:
>>
>> Hi,
>>
>> Do you have flags(flow-control) specified in your log paths?
>> If so, a dead destination in such log paths might cause the mentioned 
>> issue.
>>
>> When flow-control is activated, the corresponding sources will be 
>> suspended. This suspended state does not even allow syslog-ng to 
>> truly release connections that have been closed by the clients.
>> This is actually more of expected behavior as we don't want to allow 
>> new connections in situations where logs could not be delivered anyway.
>>
>> Please check the queued statistic counters of "syslog-ng-ctl stats" 
>> to see whether this is the case.
>>
>> In case of anything else, I would suspect a bug.
>>
>> --
>> László Várady
>> ------------------------------------------------------------------------
>> *From:* syslog-ng <syslog-ng-bounces at lists.balabit.hu> 
>> <mailto:syslog-ng-bounces at lists.balabit.hu> on behalf of Evan Rempel 
>> <erempel at uvic.ca> <mailto:erempel at uvic.ca>
>> *Sent:* Thursday, February 17, 2022 19:01
>> *To:* syslog-ng at lists.balabit.hu <mailto:syslog-ng at lists.balabit.hu> 
>> <syslog-ng at lists.balabit.hu> <mailto:syslog-ng at lists.balabit.hu>
>> *Subject:* [syslog-ng] allowed concurrent connections - bug?
>>
>> I am having an issue that is a little difficult to reproduce so I wanted
>> some input from others.
>>
>> I have a syslog-ng 3.35.1 that has a TLS source defined with
>> max-connections(10000)
>>
>> After some time the server starts logging a lot of messages
>>
>> syslog-ng[12802]: Number of allowed concurrent connections reached,
>> rejecting connection; client='AF_INET(XXXX:61062)',
>> local='AF_INET(YYYY:6514)', group_name='client_network_tcp',
>> location='/etc/syslog-ng/syslog-ng.server.conf:61:9', max='10000'
>>
>> To the best of my ability I can only find about 2500 actual connections.
>>
>> Both lsof and netstat report around the 2500 connections.
>>
>> I had to restart syslog-ng to stop this situation.
>>
>> Has anyone seen this behavior before?
>>
>> I get a lot of TLS connections without a certificate.
>>
>> Error reading RFC6587 style framed data
>>
>> Pperhaps the counters are not decremented for those timed out 
>> connections?
>>
>> --
>> Evan Rempel


-- 
Evan Rempel                                         250.721.7691
Senior Systems Administratorerempel at uvic.ca
Data Centre Services, University Systems, University of Victoria
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.balabit.hu/pipermail/syslog-ng/attachments/20220218/731f37bd/attachment-0001.htm>


More information about the syslog-ng mailing list