Hi Szilard,

 

After the upgrade to 3.36.1 this issue has been seen more frequently.

 

We separated the internal messages into their own log file, but it does not contain much information; please check the attached file.

You can see syslog-ng reloading in it because we installed a job that tries to recover from the error: it reloads syslog-ng if no logs are written within a 30-second period.
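
A watchdog of this kind can be sketched as follows. This is only an illustration, not the actual job: the `is_stale` helper name and the watched file path are hypothetical, and it assumes GNU `stat` (`-c %Y` prints the modification time in seconds since the epoch). `syslog-ng-ctl reload` is the regular reload command.

```shell
#!/bin/sh
# Hypothetical watchdog sketch: detect that a log file has not been
# written for longer than a given number of seconds.

# is_stale FILE MAX_AGE_SECONDS
# Succeeds (exit 0) when FILE is missing or its mtime is older than
# MAX_AGE_SECONDS; fails (exit 1) when the file was written recently.
is_stale() {
    file=$1
    max_age=$2
    [ -e "$file" ] || return 0
    now=$(date +%s)
    mtime=$(stat -c %Y "$file")   # assumption: GNU stat
    [ $((now - mtime)) -gt "$max_age" ]
}

# Example use from a periodic job (the path is an assumption):
# if is_stale /var/log/messages 30; then
#     syslog-ng-ctl reload
# fi
```

Note that a reload triggered this way also shows up in the internal log, which is why reload messages appear there.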

 

One of the things I noticed is that syslog-ng's open handles on the journal files seem to vanish during the error situation (the syslog-ng entries are gone from the second listing):

root@machine:~# lsof /run/log/journal/98101a328524447d88917bea845a8966/system*

COMMAND    PID USER   FD   TYPE DEVICE SIZE/OFF  NODE NAME

systemd-j 1723 root  mem    REG   0,19  8388608 31745 /run/log/journal/98101a328524447d88917bea845a8966/system.journal

systemd-j 1723 root  mem    REG   0,19  8388608 26165 /run/log/journal/98101a328524447d88917bea845a8966/system@3721b31246e54dc0baab1ac0f68c3f43-0000000000000001-000581d7e3fe20ba.journal

systemd-j 1723 root   16u   REG   0,19  8388608 26165 /run/log/journal/98101a328524447d88917bea845a8966/system@3721b31246e54dc0baab1ac0f68c3f43-0000000000000001-000581d7e3fe20ba.journal

systemd-j 1723 root   24u   REG   0,19  8388608 31745 /run/log/journal/98101a328524447d88917bea845a8966/system.journal

syslog-ng 3201 root  mem    REG   0,19  8388608 26165 /run/log/journal/98101a328524447d88917bea845a8966/system@3721b31246e54dc0baab1ac0f68c3f43-0000000000000001-000581d7e3fe20ba.journal

syslog-ng 3201 root  mem    REG   0,19  8388608 31745 /run/log/journal/98101a328524447d88917bea845a8966/system.journal

syslog-ng 3201 root   14r   REG   0,19  8388608 31745 /run/log/journal/98101a328524447d88917bea845a8966/system.journal

syslog-ng 3201 root   15r   REG   0,19  8388608 26165 /run/log/journal/98101a328524447d88917bea845a8966/system@3721b31246e54dc0baab1ac0f68c3f43-0000000000000001-000581d7e3fe20ba.journal

journalct 6861 root  mem    REG   0,19  8388608 26165 /run/log/journal/98101a328524447d88917bea845a8966/system@3721b31246e54dc0baab1ac0f68c3f43-0000000000000001-000581d7e3fe20ba.journal

journalct 6861 root  mem    REG   0,19  8388608 31745 /run/log/journal/98101a328524447d88917bea845a8966/system.journal

journalct 6861 root    5r   REG   0,19  8388608 31745 /run/log/journal/98101a328524447d88917bea845a8966/system.journal

journalct 6861 root    6r   REG   0,19  8388608 26165 /run/log/journal/98101a328524447d88917bea845a8966/system@3721b31246e54dc0baab1ac0f68c3f43-0000000000000001-000581d7e3fe20ba.journal

root@machine:~# lsof /run/log/journal/98101a328524447d88917bea845a8966/system*

COMMAND    PID USER   FD   TYPE DEVICE SIZE/OFF  NODE NAME

systemd-j 1723 root  mem    REG   0,19  8388608 31745 /run/log/journal/98101a328524447d88917bea845a8966/system.journal

systemd-j 1723 root  mem    REG   0,19  8388608 26165 /run/log/journal/98101a328524447d88917bea845a8966/system@3721b31246e54dc0baab1ac0f68c3f43-0000000000000001-000581d7e3fe20ba.journal

systemd-j 1723 root   16u   REG   0,19  8388608 26165 /run/log/journal/98101a328524447d88917bea845a8966/system@3721b31246e54dc0baab1ac0f68c3f43-0000000000000001-000581d7e3fe20ba.journal

systemd-j 1723 root   24u   REG   0,19  8388608 31745 /run/log/journal/98101a328524447d88917bea845a8966/system.journal

journalct 6861 root  mem    REG   0,19  8388608 26165 /run/log/journal/98101a328524447d88917bea845a8966/system@3721b31246e54dc0baab1ac0f68c3f43-0000000000000001-000581d7e3fe20ba.journal

journalct 6861 root  mem    REG   0,19  8388608 31745 /run/log/journal/98101a328524447d88917bea845a8966/system.journal

journalct 6861 root    5r   REG   0,19  8388608 31745 /run/log/journal/98101a328524447d88917bea845a8966/system.journal

journalct 6861 root    6r   REG   0,19  8388608 26165 /run/log/journal/98101a328524447d88917bea845a8966/system@3721b31246e54dc0baab1ac0f68c3f43-0000000000000001-000581d7e3fe20ba.journal
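
The difference between the two listings can also be checked mechanically. The sketch below counts how many numbered file descriptors a given command holds in `lsof` output read from standard input. The `count_fds` helper name is hypothetical, and it assumes the default `lsof` column layout shown above (COMMAND in the first column, FD in the fourth). Keep in mind that `lsof` truncates command names to nine characters by default.

```shell
#!/bin/sh
# count_fds COMMAND
# Reads lsof output on stdin and prints how many rows belong to COMMAND
# and carry a numeric file descriptor (e.g. 14r, 16u). Rows with FD "mem"
# and the header row are not counted.
count_fds() {
    awk -v cmd="$1" '$1 == cmd && $4 ~ /^[0-9]/ { n++ } END { print n + 0 }'
}

# Example use (journal path taken from the listings above):
# lsof /run/log/journal/98101a328524447d88917bea845a8966/system* | count_fds syslog-ng
```

Applied to the first listing this counts the two read descriptors (14r and 15r) held by syslog-ng; applied to the second it prints 0.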

 

Is this expected behavior?

 

Thanks & Regards,

Alex

 

From: Alexandre Santos
Sent: 5 May 2022 15:31
To: Syslog-ng users' and developers' mailing list <syslog-ng@lists.balabit.hu>
Subject: RE: [syslog-ng] Local sources seem not to be working

 

Hi Szilard,

 

The logs being written to /var/log/linecard.log are the ones coming from the ‘syslog(ip(10.20.30.40) transport("udp") port(514) keep-alive(no));’ source, because log messages received by the syslog() source always have the local4 facility.

 

I have attached the write_with_rotation.sh script.

 

Thanks,

Alex

 

From: syslog-ng <syslog-ng-bounces@lists.balabit.hu> On Behalf Of Szilard Parrag (sparrag)
Sent: 5 May 2022 07:32
To: syslog-ng@lists.balabit.hu
Subject: Re: [syslog-ng] Local sources seem not to be working

 

Hi Alex,

 

After checking the stats you sent, we can see that there have been some writes:

from

dst.program;d_localfile_linecard#0;/opt/machine/local/bin/write_with_rotation.sh /var/log/linecard.log 10 10;a;written;4518

to

dst.program;d_localfile_linecard#0;/opt/machine/local/bin/write_with_rotation.sh /var/log/linecard.log 10 10;a;written;4549
· We do not see an increase in the counters of the /var/log destinations, except for this one destination.
· The only source-side increase we could see is in the syslog-udp processed counters.
· There are no dropped or queued counters.
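
The counter comparison can be reproduced from two saved snapshots of `syslog-ng-ctl stats`. The sketch below is only illustrative: the `stats_delta` helper and the snapshot file names are hypothetical, and it assumes the semicolon-separated layout of the lines quoted above, with the counter value in the last field.

```shell
#!/bin/sh
# stats_delta BEFORE_FILE AFTER_FILE
# Prints every counter whose value changed between the two snapshots,
# with the last field replaced by the difference (after - before).
stats_delta() {
    awk -F';' '
        # First file: remember each value, keyed by all fields but the last.
        NR == FNR { key = $0; sub(/;[^;]*$/, "", key); before[key] = $NF; next }
        # Second file: report counters that differ from the first snapshot.
        {
            key = $0; sub(/;[^;]*$/, "", key)
            if (key in before && $NF != before[key])
                print key ";" ($NF - before[key])
        }
    ' "$1" "$2"
}

# Example use (file names are assumptions):
# syslog-ng-ctl stats > stats_before.txt
# sleep 900
# syslog-ng-ctl stats > stats_after.txt
# stats_delta stats_before.txt stats_after.txt
```

For the two d_localfile_linecard lines quoted above (4518 and 4549), this reports a difference of 31 written messages.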

 

 

 

We would guess this is due to flow-control, but in that case we would expect to see non-zero queued counter values, which is not the case. It could be that one destination hangs or cannot send messages out, which suspends the sources due to flow-control, while the syslog() source is not affected because it does not send messages to the hung destination(s).

 

Based on the stats, only "d_localfile_linecard" is active (~30 messages in 15 minutes). Without the filtering, the syslog() source might be affected too.

 

 

 

We would need to see more internal logs, which is problematic because the internal() source seems to be stopped too. I would therefore recommend extracting the internal() source from the s_src statement and putting it in a separate log path with a simple file destination.

Also, if possible, could you please share your `write_with_rotation.sh` script? It is unlikely that it interferes with syslog-ng, but a double check would be nice. 🙂