Syslog-ng + consuming syslog alerts via FIFO: alarms being lost, no related logs anywhere
This is my first time posting here. I hope you can help me.

We have 8 collector servers running syslog probes. They also run a collector application. Syslog-ng receives all alarms on UDP ports, filters them, and writes them to a FIFO. It also forwards these alarms over UDP to another server, which we use as a validation point. There are some filter rules for the pipe delivery, and no filters for the UDP delivery. However, we see several alarms that should not have been filtered out arriving at the UDP receiver server but never reaching the collector application, which reads from the FIFO (/opt/solution/src1.pipe). The collector logs have no error lines.

I tried rewriting the syslog-ng configuration file to use flow-control, which, according to the syslog-ng documentation, is necessary for high-volume environments (as is the case here: 500-1000 alarms per second). I applied it yesterday, together with net.core.rmem_max=1024000, and the situation got even worse. We noticed missing messages both on the UDP destination server and on the FIFO. The dropped-packets counter in ifconfig was increasing quickly. We rolled back to the previous syslog-ng configuration.

Below you will find the first version of the syslog-ng.conf file, without flow-control (several names, filters, and addresses were changed for the sake of privacy):

@version:3.19
@include "scl.conf"

# syslog-ng configuration file.
#
# This should behave pretty much like the original syslog on RedHat. But
# it could be configured a lot smarter.
#
# See syslog-ng(8) and syslog-ng.conf(5) for more information.
#
# Note: it also sources additional configuration files (*.conf)
# located in /etc/syslog-ng/conf.d/

options {
    flush_lines (0);
    time_reopen (10);
    log_fifo_size (1000);
    chain_hostnames (off);
    use_dns (no);
    use_fqdn (no);
    create_dirs (no);
    keep_hostname (yes);
};

source s_sys {
    system();
    internal();
    # udp(ip(0.0.0.0) port(514));
};

template t_customer {
    template("${DATE} ${SOURCEIP} ${HOST} ${MSGHDR} ${MESSAGE}\n");
};

# UDP Source
destination d_cons { file("/dev/console"); };
destination d_mesg { file("/var/log/messages"); };
destination d_auth { file("/var/log/secure"); };
destination d_mail { file("/var/log/maillog" flush_lines(10)); };
destination d_spol { file("/var/log/spooler"); };
destination d_boot { file("/var/log/boot.log"); };
destination d_cron { file("/var/log/cron"); };
destination d_kern { file("/var/log/kern"); };
destination d_mlal { usertty("*"); };

filter f_kernel { facility(kern); };
filter f_default { level(info..emerg) and not (facility(mail) or facility(authpriv) or facility(cron)); };
filter f_auth { facility(authpriv); };
filter f_mail { facility(mail); };
filter f_emergency { level(emerg); };
filter f_news { facility(uucp) or (facility(news) and level(crit..emerg)); };
filter f_boot { facility(local7); };
filter f_cron { facility(cron); };

#log { source(s_sys); filter(f_kernel); destination(d_cons); };
#log { source(s_sys); filter(f_kernel); destination(d_kern); };
log { source(s_sys); filter(f_default); destination(d_mesg); };
log { source(s_sys); filter(f_auth); destination(d_auth); };
log { source(s_sys); filter(f_mail); destination(d_mail); };
log { source(s_sys); filter(f_emergency); destination(d_mlal); };
log { source(s_sys); filter(f_news); destination(d_spol); };
log { source(s_sys); filter(f_boot); destination(d_boot); };
log { source(s_sys); filter(f_cron); destination(d_cron); };

#Debug
filter f_debu_1 { not level(debug); };

#Information
filter f_info_1 { not match("PROGRAM-" value("MESSAGE")) };
filter f_info_2 { not match("Multicast Client" value("MESSAGE")); };
filter f_info_3 { not match("executeCommand" value("MESSAGE")); };
filter f_info_4 { not match("has the privilege to execute" value("MESSAGE")); };
filter f_info_5 { not match("Telnet " value("MESSAGE")); };

#Warning
filter f_warn_1 { not match("LSP went Up" value("MESSAGE")); };
filter f_warn_2 { not match("LSP went Down" value("MESSAGE")); };
filter f_warn_3 { not match("Subscriber created" value("MESSAGE")); };
filter f_warn_4 { not match("Subscriber deleted" value("MESSAGE")); };
filter f_warn_5 { not match("BGP-WARN" value("MESSAGE")); };
filter f_warn_6 { not match("DHCPS-WARNING" value("MESSAGE")); };
filter f_warn_7 { not match("DHCP-WARNING" value("MESSAGE")); };

#Notice
filter f_noti_1 { not match("iCDR\[[0-9]+\]:"); };

#Error
filter f_erro_1 { not match("FAN started working" value("MESSAGE")); };
filter f_erro_2 { not match("FAN is not working" value("MESSAGE")); };

#Alert
filter f_aler_1 { not match("LDP virtual tunnel went" value("MESSAGE")); };

#Critical
filter f_crit_1 { not match("traps dropped" value("MESSAGE")); };
filter f_crit_2 { not match("upstream E1 signals" value("MESSAGE")); };
filter f_crit_3 { not match(",ABCDE," value("MESSAGE")); };
filter f_crit_4 { not match(",XYZW," value("MESSAGE")); };
filter f_crit_5 { not match(" for ABCDE-[0-9]" value("MESSAGE")); };

source s_dummy{ udp(ip(0.0.0.0) port(5140)); };
source src1{ udp(ip(0.0.0.0) port(514)); };
source s_10514{ udp(ip(0.0.0.0) port(10514)); };
source src2{ udp(ip(0.0.0.0) port(515)); };

destination d_src1 { pipe("/opt/solution/src1.pipe" template(t_customer)); };
destination d_src2 { pipe("/opt/solution/src2.pipe" template(t_customer)); };
destination d_syslogCustom{ pipe("/opt/solution/syslog_custom.pipe" template(t_customer)); };
destination d_udp_dst{ syslog("192.168.1.151" transport("udp") port(514) spoof_source(no)); };

filter f_src2 {
    host("r(core|distribution|management).*") or
    host("^abcd.*") or
    host("^d2") or
    host("src2") or
    host("xyz1") or
    host("^jkl") or
    host("asddf") or
    host("zxcvb");
};

log {
    source(src1);
    source(s_10514);
    filter(f_debu_1);
    filter(f_info_1);
    filter(f_info_2);
    filter(f_info_3);
    filter(f_info_4);
    filter(f_info_5);
    filter(f_warn_1);
    filter(f_warn_2);
    filter(f_warn_3);
    filter(f_warn_4);
    filter(f_warn_5);
    filter(f_warn_6);
    filter(f_warn_7);
    filter(f_noti_1);
    filter(f_erro_1);
    filter(f_erro_2);
    filter(f_aler_1);
    filter(f_crit_1);
    filter(f_crit_2);
    filter(f_crit_3);
    filter(f_crit_4);
    filter(f_crit_5);
    destination(d_src1);
};

log {
    source(src2);
    filter(f_src2);
    destination(d_src2);
};

log {
    source(src1);
    destination(d_udp_dst);
};

That's all. I also tried using flow-control. The main changes are below:

source src1{
#    udp(
    network(                #new driver syntax
        transport("udp")    #new driver syntax
        ip(0.0.0.0) port(514)
        max-connections(3000)
    );
};

source s_10514{
#    udp(
    network(                #new driver syntax
        transport("udp")    #new driver syntax
        ip(0.0.0.0) port(10514)
        max_connections(3000)
    );
};

source src2{
#    udp(
    network(                #new driver syntax
        transport("udp")    #new driver syntax
        ip(0.0.0.0) port(515)
        max_connections(3000)
    );
};

destination d_src1 {
    pipe(
        "/opt/solution/src1.pipe"
        template(t_customer)
        log_fifo_size(600000)
    );
};

destination d_src2 {
    pipe(
        "/opt/solution/src2.pipe"
        template(t_customer)
        log_fifo_size(300000)
    );
};

destination d_udp_dst{
    syslog(
        "192.168.1.151"
        transport("udp")
        port(514)
        spoof_source(no)
        log_fifo_size(300000)
    );
};

And the situation got even worse. So, I would like to ask you: How can I fix it? I need to be sure that all data received is forwarded to their destinations, respecting the filters, and without any information loss.

I'm quite sure all data is being received: I've checked it using tcpdump, and there is no UDP packet loss during transmission.

The most important sections are:

log {
    source(src1);
    source(s_10514);
    filter(f_debu_1);
    filter(f_info_1);
    filter(f_info_2);
    filter(f_info_3);
    filter(f_info_4);
    filter(f_info_5);
    filter(f_warn_1);
    filter(f_warn_2);
    filter(f_warn_3);
    filter(f_warn_4);
    filter(f_warn_5);
    filter(f_warn_6);
    filter(f_warn_7);
    filter(f_noti_1);
    filter(f_erro_1);
    filter(f_erro_2);
    filter(f_aler_1);
    filter(f_crit_1);
    filter(f_crit_2);
    filter(f_crit_3);
    filter(f_crit_4);
    filter(f_crit_5);
    destination(d_src1);
};

log {
    source(src1);
    destination(d_udp_dst);
};

Thanks a lot, and sorry for the huge message,
Richter
<https://www.linkedin.com/company/icarotech>
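
For readers following along: the flow-control attempt described in the message above shows only source and destination changes. In syslog-ng, flow-control is normally switched on per log path with flags(flow-control), and the socket receive buffer is requested per source with so-rcvbuf(). The fragment below is a minimal sketch under stated assumptions, not the poster's actual configuration: the source name s_udp_example and the buffer size are hypothetical, and the kernel caps the requested buffer at net.core.rmem_max. As far as I understand, flow-control cannot slow down a UDP sender, so when a destination stalls, datagrams may still be dropped in the kernel receive buffer.

```
# Hypothetical source name and buffer size, for illustration only.
source s_udp_example {
    network(
        transport("udp")
        ip(0.0.0.0) port(514)
        so-rcvbuf(1024000)    # requested socket buffer; capped by net.core.rmem_max
    );
};

log {
    source(s_udp_example);
    destination(d_src1);      # the pipe destination defined in the post
    flags(flow-control);      # flow-control is enabled per log path
};
```

The point of the sketch is only where each setting lives: the receive buffer on the source, flow-control on the log path.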
Hi,
On Thu, Apr 08, 2021 at 12:25:44PM -0300, Richter Vitali Guedes wrote:
How can I fix it? I need to be sure that all data received is forwarded to their destinations, respecting the filters, and without any information loss.
You should really consider using TCP. We switched our 2000+ nodes a month ago, and it has saved us a lot of headaches. It also makes monitoring much easier, as TCP connections are all easily traceable and can be plotted to assess, for instance, the load balancer's situation.
If you really want to keep UDP, the advice I could give would be to use reusable sockets, which were implemented quite recently.
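
A minimal sketch of what "reusable sockets" could look like, assuming the advice refers to the so-reuseport() option of the network() source (the reply does not name the option, so this is a guess). With it, several UDP sockets bind to the same port and the kernel spreads incoming datagrams across them. The persist-name() values are hypothetical labels added so that the identical address/port pairs do not collide, and the number of sockets is arbitrary; check that your syslog-ng version supports both options.

```
# Sketch only: assumes so-reuseport() is what "reusable sockets" means.
source src1 {
    network(transport("udp") ip(0.0.0.0) port(514)
            so-reuseport(1) persist-name("src1-udp-1"));   # hypothetical name
    network(transport("udp") ip(0.0.0.0) port(514)
            so-reuseport(1) persist-name("src1-udp-2"));   # hypothetical name
    network(transport("udp") ip(0.0.0.0) port(514)
            so-reuseport(1) persist-name("src1-udp-3"));   # hypothetical name
};
```

Each network() entry gets its own socket, so a single busy listener is less likely to let the kernel receive buffer overflow.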
Hello,

Thanks for replying. I would really love to use TCP, but I'm dealing with several legacy network/telecom devices, and most of them don't support syslog over TCP, so I really need to deal with UDP.

I just applied reusable sockets to the configuration. Other settings I tried that gave me better results were mem-buf-length and disk-buf-size: the data loss reached zero.

I have one more question: while reading a pipe with the cat command, is it normal for cat to exit while message streams are still being processed?

Thanks,
Richter
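
The two settings named in the reply above are sub-options of disk-buffer(). A minimal sketch follows, with assumptions flagged: the message does not say which destination the buffer was attached to or which values were used, so attaching it to d_src1 and both sizes are hypothetical. It also assumes your syslog-ng build accepts disk-buffer() on a pipe() destination, which is worth verifying with a syntax check before deploying.

```
# Sketch only: destination choice and sizes are hypothetical.
destination d_src1 {
    pipe(
        "/opt/solution/src1.pipe"
        template(t_customer)
        disk-buffer(
            mem-buf-length(100000)       # messages held in memory (non-reliable mode)
            disk-buf-size(1073741824)    # bytes of on-disk queue, here 1 GiB
            reliable(no)                 # mem-buf-length() applies when reliable is off
        )
    );
};
```

With a buffer like this, bursts that the FIFO reader cannot absorb immediately are queued on disk instead of overflowing the in-memory output queue.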
On Thu, Apr 15, 2021 at 03:18:32PM -0300, Richter Vitali Guedes wrote:
I have one more question: while reading a pipe with the cat command, is it normal for cat to exit while message streams are still being processed?
Just so things are clear: are you referring to using a program() or pipe() source? And if so, are you saying that syslog-ng closes the pipe (SIGPIPE)?
participants (2)
- Fabien Wernli
- Richter Vitali Guedes