[syslog-ng] Syslog-ng shipping logs through AMQP with huge memory leaks
Scheidler, Balázs
balazs.scheidler at balabit.com
Wed Mar 21 00:57:19 UTC 2018
126% CPU usage? that would be great to know the details there, although I
understand that the memory is more of a concern now. :)
can you run a perf record on that process, perhaps once the memory issue is
solved? I have my suspicion where it is spending its time, but it would be
great to confirm. (my guess is value-pairs while formatting json messages).
cheers,
--
Bazsi
On Tue, Mar 20, 2018 at 8:26 PM, Michal Purzynski <michal at mozilla.com>
wrote:
> Hello Gábor!
>
> Answers inline.
>
> On Mon, Mar 19, 2018 at 9:09 AM, Nagy, Gábor <gabor.nagy at balabit.com>
> wrote:
>
>>
>> - have you built syslog-ng from source or you downloaded a package?
>>
>
> It's a package from
>
> deb http://download.opensuse.org/repositories/home:/laszlo_
> budai:/syslog-ng/xUbuntu_14.04 ./
>
>
> - have you tried to reproduce the issue in a different environment with a
>> minimal config? We are using your configuration, but if you narrowed down
>> the problem it would be helpful.
>>
>
> If I disable the Bro via AMQP the problem goes away, so that must be
> something there :/
>
>
> - we were experimenting with a very simple JSON message, can you show us
>> an example log to see the complexity of it, please? We are thinking to
>> check Bro out for log message structure.
>>
>
> Example logs follow sent to Peter via a private channel, making them
> public would be kind of difficult.
>
> Appreciate you looking into it! And BTW, I just restarted syslog-ng on the
> most busy server
>
> 14910 root 20 0 59.899g 0.057t 3784 S 126.4 92.9 1166:22 syslog-ng
>
> ;-)
>
>
>>
>> I saw that you have gave a huge log-fetch-limit() in the global config
>> compared to the default. Setting log-fetch-limit() in global config is
>> deprecated, you need to set it up per source.
>>
>> We have a couple of ideas and will continue to try reproducing the
>> memleak you reported.
>>
>> Regards,
>> Gabor
>>
>> On Sat, Mar 17, 2018 at 12:45 AM, Michal Purzynski <michal at mozilla.com>
>> wrote:
>>
>>> Hello!!
>>>
>>> Could you help us troubleshoot a memory leak or a misconfiguration that
>>> makes the syslog-ng process memory usage grow? Like, a lot.
>>>
>>> We use syslog-ng to read some JSON files and ship them to a RabbitMQ
>>> server via AMQP. As you can see, this is just a client, it does not accept
>>> connections from other systems, it works mostly with Bro logs (plus it
>>> handles a local syslog because that's convenient).
>>>
>>> We have around 6000 events per second on this server. What's
>>> interesting, syslog-ng's memory grows quickly without flow control (and
>>> slower, but it still continues to grow with it). I'll switch that to TLS
>>> soon, a more secure configuration is ready to be deployed.
>>>
>>> Things look pretty good on the RabbitMQ side. That server is not under
>>> pressure and handles the load just fine, the queue is consumed, there's
>>> nothing building up that would make me believe we have RabbitMQ server
>>> overloaded.
>>>
>>> How much does syslog-ng grow?
>>>
>>> I'd say - if I disable flow-control it will eat 55GB of RAM in less than
>>> 24h, if not faster. With flow-control enabled on the most 'busy' files
>>> things are way better, but the memory usage still keep growing -
>>>
>>> syslog-ng.conf looks like below - BTW that's Ubuntu 14.04 LTS, 3.14.1-3
>>> of syslog-ng
>>>
>>> Let me know what other data you might need.
>>>
>>> @version: 3.14
>>> @include "scl.conf"
>>>
>>> # Syslog-ng configuration file, compatible with default Debian syslogd
>>> # installation.
>>>
>>> # First, set some global options.
>>> options {
>>> threaded (yes);
>>> flush_lines (50000);
>>> flush_timeout (1000);
>>> time_reopen (10);
>>> log_fetch_limit (50000);
>>> log_fifo_size (500000);
>>> use_dns (yes);
>>> dns_cache (5000);
>>> dns_cache_expire(87600);
>>> use_fqdn (yes);
>>> owner("root");
>>> group("adm");
>>> perm(0640);
>>> keep_hostname (yes);
>>> chain_hostnames (off);
>>> };
>>>
>>> ########################
>>> # Sources
>>> ########################
>>> # This is the default behavior of sysklogd package
>>> # Logs may come from unix stream, but not from another machine.
>>> #
>>> source s_src {
>>> system();
>>> internal();
>>> };
>>>
>>> # If you wish to get logs from remote machine you should uncomment
>>> # this and comment the above source line.
>>> #
>>> #source s_net { tcp(ip(127.0.0.1) port(1000)); };
>>>
>>> ########################
>>> # Destinations
>>> ########################
>>> # First some standard logfile
>>> #
>>> destination d_auth { file("/var/log/auth.log"); };
>>> destination d_cron { file("/var/log/cron.log"); };
>>> destination d_daemon { file("/var/log/daemon.log"); };
>>> destination d_kern { file("/var/log/kern.log"); };
>>> destination d_mail { file("/var/log/mail.log"); };
>>> destination d_syslog { file("/var/log/syslog"); };
>>>
>>> # This files are the log come from the mail subsystem.
>>> #
>>> #destination d_mailinfo { file("/var/log/mail.info"); };
>>> #destination d_mailwarn { file("/var/log/mail.warn"); };
>>> #destination d_mailerr { file("/var/log/mail.err"); };
>>>
>>> # Logging for INN news system
>>> #
>>> #destination d_newscrit { file("/var/log/news/news.crit"); };
>>> #destination d_newserr { file("/var/log/news/news.err"); };
>>> #destination d_newsnotice { file("/var/log/news/news.notice"); };
>>>
>>> # Some 'catch-all' logfiles.
>>> #
>>> destination d_debug { file("/var/log/debug"); };
>>> destination d_error { file("/var/log/error"); };
>>>
>>> # Syslog1 in SCL3
>>> destination d_scl3 {
>>> udp("syslog1.private.scl3.mozilla.com" port(514));
>>> };
>>>
>>> ########################
>>> # Filters
>>> ########################
>>> # Here's come the filter options. With this rules, we can set which
>>> # message go where.
>>>
>>> filter f_dbg { level(debug); };
>>> filter f_info { level(info); };
>>> filter f_notice { level(notice); };
>>> filter f_warn { level(warn); };
>>> filter f_err { level(err); };
>>> filter f_crit { level(crit .. emerg); };
>>> filter f_debug { level(debug) and not facility(auth, authpriv, news,
>>> mail); };
>>> filter f_error { level(err .. emerg) ; };
>>> filter f_auth { facility(auth, authpriv) and not filter(f_debug); };
>>> filter f_cron { facility(cron) and not filter(f_debug); };
>>> filter f_daemon { facility(daemon) and not filter(f_debug); };
>>> filter f_kern { facility(kern) and not filter(f_debug); };
>>> filter f_local { facility(local0, local1, local3, local4, local5,
>>> local6, local7) and not filter(f_debug); };
>>> filter f_mail { facility(mail) and not filter(f_debug); };
>>> filter f_syslog3 { not facility(auth, authpriv, mail) and not
>>> filter(f_debug); };
>>>
>>> ########################
>>> # Log paths
>>> ########################
>>> log { source(s_src); filter(f_auth); destination(d_auth); };
>>> log { source(s_src); filter(f_cron); destination(d_cron); };
>>> log { source(s_src); filter(f_daemon); destination(d_daemon); };
>>> log { source(s_src); filter(f_kern); destination(d_kern); };
>>> log { source(s_src); filter(f_syslog3); destination(d_syslog); };
>>> log { source(s_src); filter(f_mail); destination(d_mail); };
>>> log { source(s_src); filter(f_debug); destination(d_debug); };
>>> log { source(s_src); filter(f_error); destination(d_error); };
>>>
>>>
>>> # All messages send to a remote site
>>> #
>>> log { source(s_src); destination(d_scl3); };
>>>
>>> ###
>>> # Include all config files in /etc/syslog-ng/conf.d/
>>> ###
>>> @include "/etc/syslog-ng/conf.d/*.conf"
>>>
>>>
>>>
>>> There's another file, amqp.conf where we actually read and ship those
>>> Bro logs.
>>>
>>>
>>> source bro_conn {
>>> file( "/nsm/bro/logs/current/conn.log" flags(no-parse));
>>> };
>>>
>>> source bro_ssl {
>>> file( "/nsm/bro/logs/current/ssl.log" flags(no-parse));
>>> };
>>>
>>> source bro_dns {
>>> file( "/nsm/bro/logs/current/dns.log" flags(no-parse));
>>> };
>>>
>>> source bro_smtp {
>>> file( "/nsm/bro/logs/current/smtp.log" flags(no-parse));
>>> };
>>>
>>> source bro_ssh {
>>> file( "/nsm/bro/logs/current/ssh.log" flags(no-parse));
>>> };
>>>
>>> source bro_notice {
>>> file( "/nsm/bro/logs/current/notice.log" flags(no-parse));
>>> };
>>>
>>> source bro_intel {
>>> file( "/nsm/bro/logs/current/intel.log" flags(no-parse));
>>> };
>>>
>>> source bro_dce_rpc {
>>> file( "/nsm/bro/logs/current/dce_rpc.log" flags(no-parse));
>>> };
>>>
>>> source bro_dhcp {
>>> file( "/nsm/bro/logs/current/dhcp.log" flags(no-parse));
>>> };
>>>
>>> source bro_files {
>>> file( "/nsm/bro/logs/current/files.log" flags(no-parse));
>>> };
>>>
>>> source bro_kerberos {
>>> file( "/nsm/bro/logs/current/kerberos.log" flags(no-parse));
>>> };
>>>
>>> source bro_http {
>>> file( "/nsm/bro/logs/current/http.log" flags(no-parse));
>>> };
>>>
>>> source bro_software {
>>> file( "/nsm/bro/logs/current/software.log" flags(no-parse));
>>> };
>>>
>>> source bro_snmp {
>>> file( "/nsm/bro/logs/current/snmp.log" flags(no-parse));
>>> };
>>>
>>> source bro_socks {
>>> file( "/nsm/bro/logs/current/socks.log" flags(no-parse));
>>> };
>>>
>>> source bro_tunnel {
>>> file( "/nsm/bro/logs/current/tunnel.log" flags(no-parse));
>>> };
>>>
>>> source bro_ntlm {
>>> file( "/nsm/bro/logs/current/ntlm.log" flags(no-parse));
>>> };
>>>
>>> source bro_pe {
>>> file( "/nsm/bro/logs/current/pe.log" flags(no-parse));
>>> };
>>>
>>> source bro_sip {
>>> file( "/nsm/bro/logs/current/sip.log" flags(no-parse));
>>> };
>>>
>>> source bro_smb_files {
>>> file( "/nsm/bro/logs/current/smb_files.log" flags(no-parse));
>>> };
>>>
>>> source bro_smb_mapping {
>>> file( "/nsm/bro/logs/current/smb_mapping.log" flags(no-parse));
>>> };
>>>
>>> source bro_x509 {
>>> file( "/nsm/bro/logs/current/x509.log" flags(no-parse));
>>> };
>>>
>>> source bro_known_certs {
>>> file( "/nsm/bro/logs/current/known_certs.log" flags(no-parse));
>>> };
>>>
>>> source bro_known_devices {
>>> file( "/nsm/bro/logs/current/known_devices.log" flags(no-parse));
>>> };
>>>
>>> source bro_known_hosts {
>>> file( "/nsm/bro/logs/current/known_hosts.log" flags(no-parse));
>>> };
>>>
>>> source bro_known_services {
>>> file( "/nsm/bro/logs/current/known_services.log" flags(no-parse));
>>> };
>>>
>>>
>>> destination d_amqp {
>>> amqp(
>>> vhost("nsm")
>>> host("<our happy rabbit>
>>> <http://syslog-proxy1.dmz.mdc1.mozilla.com>")
>>> port(5672)
>>> exchange("eventtask")
>>> exchange-type("direct")
>>> routing-key("eventtask")
>>> body("$(format-json --scope nv_pairs --pair category=\"bro\"
>>> --pair source=$source --pair customendpoint=\" \" --pair tags=\"bro\")")
>>> persistent(yes)
>>> username("USERNAME")
>>> password("PASSWORD")
>>> );
>>> };
>>>
>>>
>>> parser p_json { json-parser(); };
>>>
>>>
>>> log { source(bro_conn); parser(p_json); destination(d_amqp); };
>>> log { source(bro_http); parser(p_json); destination(d_amqp); };
>>> log { source(bro_ssl); parser(p_json); destination(d_amqp); };
>>> log { source(bro_dns); parser(p_json); destination(d_amqp); };
>>> log { source(bro_smtp); parser(p_json); destination(d_amqp); };
>>> log { source(bro_ssh); parser(p_json); destination(d_amqp); };
>>> log { source(bro_intel); parser(p_json); destination(d_amqp); };
>>> log { source(bro_notice); parser(p_json); destination(d_amqp); };
>>> log { source(bro_dce_rpc); parser(p_json); destination(d_amqp);
>>> flags(flow-control); };
>>> log { source(bro_dhcp); parser(p_json); destination(d_amqp); };
>>> log { source(bro_files); parser(p_json); destination(d_amqp);
>>> flags(flow-control); };
>>> log { source(bro_kerberos); parser(p_json); destination(d_amqp);
>>> flags(flow-control); };
>>> log { source(bro_software); parser(p_json); destination(d_amqp);
>>> flags(flow-control); };
>>> log { source(bro_snmp); parser(p_json); destination(d_amqp);
>>> flags(flow-control); };
>>> log { source(bro_socks); parser(p_json); destination(d_amqp);
>>> flags(flow-control); };
>>> log { source(bro_tunnel); parser(p_json); destination(d_amqp);
>>> flags(flow-control); };
>>> log { source(bro_ntlm); parser(p_json); destination(d_amqp);
>>> flags(flow-control); };
>>> log { source(bro_pe); parser(p_json); destination(d_amqp);
>>> flags(flow-control); };
>>> log { source(bro_sip); parser(p_json); destination(d_amqp);
>>> flags(flow-control); };
>>> log { source(bro_smb_files); parser(p_json); destination(d_amqp);
>>> flags(flow-control); };
>>> log { source(bro_smb_mapping); parser(p_json); destination(d_amqp);
>>> flags(flow-control); };
>>> log { source(bro_x509); parser(p_json); destination(d_amqp);
>>> flags(flow-control); };
>>> log { source(bro_known_certs); parser(p_json); destination(d_amqp);
>>> flags(flow-control); };
>>> log { source(bro_known_devices); parser(p_json); destination(d_amqp);
>>> flags(flow-control); };
>>> log { source(bro_known_hosts); parser(p_json); destination(d_amqp);
>>> flags(flow-control); };
>>> log { source(bro_known_services); parser(p_json); destination(d_amqp);
>>> flags(flow-control); };
>>>
>>>
>>>
>>>
>>>
>>> *Statistics*
>>>
>>>
>>> Mar 15 *00:17:30* nsmserver syslog-ng[11278]: Log statistics;
>>> processed='source(bro_conn)=112360513', processed='source(s_src)=227349',
>>> processed='source(bro_known_devices)=3791',
>>> processed='global(sdata_updates)=0', processed='center(received)=310790955',
>>> processed='source(bro_ssh)=622441', processed='source(bro_smb_files)=5815964',
>>> processed='source(bro_socks)=0', processed='destination(d_daemon)=21',
>>> dropped='dst.amqp(d_amqp#0,amqp,nsm,happyrabbit,5672,eventtask,direct)=2',
>>> processed='dst.amqp(d_amqp#0,amqp,nsm,happyrabbit,5672,event
>>> task,direct)=310563565 <(31)%20056%203565>',
>>> queued='dst.amqp(d_amqp#0,amqp,nsm,happyrabbit,5672,eventtask,direct)=0',
>>> processed='destination(d_error)=189386', processed='destination(d_syslog)=207595',
>>> processed='source(bro_ssl)=49788364', processed='source(bro_kerberos)=133177',
>>> processed='source(bro_dhcp)=69970', processed='destination(d_mail)=0',
>>> processed='source(bro_http)=60085539', processed='global(msg_clones)=1576',
>>> processed='destination(d_amqp)=310563565 <(31)%20056%203565>',
>>> processed='destination(d_kern)=146', processed='source(bro_tunnel)=520921',
>>> processed='source(bro_software)=18851236 <(1)%20885%201236>',
>>> processed='source(bro_known_services)=13403',
>>> processed='source(bro_known_certs)=2070', processed='source(bro_dce_rpc)=501875',
>>> processed='destination(d_scl3)=227349', processed='source(bro_known_hosts)=14604',
>>> processed='source(bro_smb_mapping)=116412',
>>> processed='source(bro_files)=15152100', processed='center(queued)=311210449',
>>> processed='destination(d_debug)=10280', processed='src.internal(s_src#2)=26785',
>>> stamp='src.internal(s_src#2)=1521073048', processed='source(bro_ntlm)=16823',
>>> processed='destination(d_auth)=9474', processed='global(internal_queue_length)=0',
>>> processed='source(bro_smtp)=1067448', dropped='dst.udp(d_scl3#0,udp,
>>> syslog1.private.scl3.mozilla.com:514)=0', processed='dst.udp(d_scl3#0,ud
>>> p,syslog1.private.scl3.mozilla.com:514)=227349',
>>> queued='dst.udp(d_scl3#0,udp,syslog1.private.scl3.mozilla.com:514)=221705',
>>> written='dst.udp(d_scl3#0,udp,syslog1.private.scl3.mozilla.com:514)=5644',
>>> processed='global(payload_reallocs)=310467149 <(31)%20046%207149>',
>>> queued='global(scratch_buffers_count)=17875655781170',
>>> processed='destination(d_cron)=2633', processed='source(bro_snmp)=9854568',
>>> processed='source(bro_notice)=119415', processed='source(bro_dns)=19303431',
>>> processed='source(bro_sip)=97822', processed='source(bro_intel)=26969',
>>> processed='source(bro_pe)=531103', processed='source(bro_x509)=15493647',
>>> queued='global(scratch_buffers_bytes)=2304'
>>>
>>>
>>> Mar 15 *00:27:30* nsmserver <http://nsmserver1.private.scl3.mozilla.com>
>>> syslog-ng[11278]: Log statistics; processed='source(bro_conn)=112988941',
>>> processed='source(s_src)=228651', processed='source(bro_known_devices)=3791',
>>> processed='global(sdata_updates)=0', processed='center(received)=312659144',
>>> processed='source(bro_ssh)=627013', processed='source(bro_smb_files)=5863697',
>>> processed='source(bro_socks)=0', processed='destination(d_daemon)=21',
>>> dropped='dst.amqp(d_amqp#0,amqp,nsm,happyrabbit
>>> <http://syslog-proxy1.dmz.mdc1.mozilla.com>,5672,eventtask,direct)=2',
>>> processed='dst.amqp(d_amqp#0,amqp,nsm,happyrabbit
>>> <http://syslog-proxy1.dmz.mdc1.mozilla.com>,5672,eventtask,direct)=
>>> 312430452 <(31)%20243%200452>', queued='dst.amqp(d_amqp#0,amqp
>>> ,nsm,happyrabbit <http://syslog-proxy1.dmz.mdc1.mozilla.com>
>>> ,5672,eventtask,direct)=0', processed='destination(d_error)=190429',
>>> processed='destination(d_syslog)=208759', processed='source(bro_ssl)=50077572',
>>> processed='source(bro_kerberos)=134215', processed='source(bro_dhcp)=70487',
>>> processed='destination(d_mail)=0', processed='source(bro_http)=60446166',
>>> processed='global(msg_clones)=1594', processed='destination(d_amqp)=
>>> 312430452 <(31)%20243%200452>', processed='destination(d_kern)=146',
>>> processed='source(bro_tunnel)=524450', processed='source(bro_software)=
>>> 18938552 <(1)%20893%208552>', processed='source(bro_known_services)=13532',
>>> processed='source(bro_known_certs)=2073', processed='source(bro_dce_rpc)=505206',
>>> processed='destination(d_scl3)=228651', processed='source(bro_known_hosts)=14630',
>>> processed='source(bro_smb_mapping)=117177',
>>> processed='source(bro_files)=15252368', processed='center(queued)=313080999',
>>> processed='destination(d_debug)=10352', processed='src.internal(s_src#2)=26966',
>>> stamp='src.internal(s_src#2)=1521073648', processed='source(bro_ntlm)=16848',
>>> processed='destination(d_auth)=9540', processed='global(internal_queue_length)=0',
>>> processed='source(bro_smtp)=1074012', dropped='dst.udp(d_scl3#0,udp,
>>> syslog1.private.scl3.mozilla.com:514)=0', processed='dst.udp(d_scl3#0,ud
>>> p,syslog1.private.scl3.mozilla.com:514)=228651',
>>> queued='dst.udp(d_scl3#0,udp,syslog1.private.scl3.mozilla.com:514)=223007',
>>> written='dst.udp(d_scl3#0,udp,syslog1.private.scl3.mozilla.com:514)=5644',
>>> processed='global(payload_reallocs)=312333723 <(31)%20233%203723>',
>>> queued='global(scratch_buffers_count)=17970145061685',
>>> processed='destination(d_cron)=2649', processed='source(bro_snmp)=9917302',
>>> processed='source(bro_notice)=120140', processed='source(bro_dns)=19462256',
>>> processed='source(bro_sip)=98565', processed='source(bro_intel)=27061',
>>> processed='source(bro_pe)=535753', processed='source(bro_x509)=15598686',
>>> queued='global(scratch_buffers_bytes)=2304'
>>>
>>>
>>> ____________________________________________________________
>>> __________________
>>> Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
>>> Documentation: http://www.balabit.com/support
>>> /documentation/?product=syslog-ng
>>> FAQ: http://www.balabit.com/wiki/syslog-ng-faq
>>>
>>>
>>>
>>
>> ____________________________________________________________
>> __________________
>> Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
>> Documentation: http://www.balabit.com/support/documentation/?product=
>> syslog-ng
>> FAQ: http://www.balabit.com/wiki/syslog-ng-faq
>>
>>
>>
>
> ____________________________________________________________
> __________________
> Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
> Documentation: http://www.balabit.com/support/documentation/?
> product=syslog-ng
> FAQ: http://www.balabit.com/wiki/syslog-ng-faq
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.balabit.hu/pipermail/syslog-ng/attachments/20180321/fffc007c/attachment-0001.html>
More information about the syslog-ng
mailing list