syslog-ng restart: postfix/nqmgr logline missing on next received e-mail
Hi again, during yesterdays problem my postfix statistics reporting tool (pflogstats) reported about incomplete logging in 4 cases. In all 4 cases I have syslog-ng restarted before. Example: Feb 12 22:22:54 syslog-ng[1871]: syslog-ng version 1.5.26 going down Feb 12 22:22:54 syslog-ng[5317]: syslog-ng version 1.5.26 starting Feb 12 22:22:54 syslog-ng: syslog-ng startup succeeded Feb 12 22:23:01 crond: crond shutdown succeeded <- already learnt :-( Feb 12 22:23:02 crond: crond startup succeeded Missing nqmgr: Feb 12 22:29:58 (7 minutes later, first received e-mail after syslog-ng restart) postfix/smtpd[5507]: C12F31386E: client=outgoing3... postfix/cleanup[5521]: C12F31386E: message-id=<20030212180349.... postfix/smtpd[5507]: disconnect from outgoing3... postfix/pipe[5525]: C12F31386E: to=<pbieringer@... Normally (e.g. next delivered e-mail): Feb 12 22:29:59 postfix/cleanup[5521]: 3AEAD1386F: message-id=<20030212180... postfix/pipe[5525]: C12F31386E: to=<pbieringer@aer... postfix/nqmgr[19977]: 3AEAD1386F: from=<bugtraq-r.... <--!!!!! postfix/smtp[5530]: 3AEAD1386F: to=<pbieringer@aerasec.de>,... Where is the nqmgr log line lost in first event? Is this a well-known issue (feature or bug)? If yes, how can this be fixed? Pls. don't tell me that I have to restart every service after I restart syslog-ng (already known: crond, now postfix, which one is still missing?). This would be a very unnice "feature". If bug is in a common used glibc-version, then imho a workaround by syslog-ng would be required. I don't like to loose loglines, you sure not, too. Does this also happen if old syslogd would be used? Don't believe so. It's all very strange...looks like I have to switch back to syslog, loosing all the nice features :-( Peter, wondering -- Dr. Peter Bieringer http://www.bieringer.de/pb/ GPG/PGP Key 0x958F422D mailto: pb at bieringer dot de Deep Space 6 Co-Founder and Core Member http://www.deepspace6.net/
On Thu, Feb 13, 2003 at 09:54:52AM +0100, Peter Bieringer wrote:
Pls. don't tell me that I have to restart every service after I restart syslog-ng (already known: crond, now postfix, which one is still missing?). This would be a very unnice "feature". If bug is in a common used glibc-version, then imho a workaround by syslog-ng would be required. I don't like to loose loglines, you sure not, too.
Does this also happen if old syslogd would be used? Don't believe so.
It's all very strange...looks like I have to switch back to syslog, loosing all the nice features :-(
the same happens to old syslogd as well as this is a libc issue. just try it and you will see. -- Bazsi PGP info: KeyID 9AF8D0A9 Fingerprint CD27 CFB0 802C 0944 9CFD 804E C82C 8EB1
On Thursday 13 February 2003 10:56, Balazs Scheidler wrote:
On Thu, Feb 13, 2003 at 09:54:52AM +0100, Peter Bieringer wrote:
Pls. don't tell me that I have to restart every service after I restart syslog-ng (already known: crond, now postfix, which one is still missing?). This would be a very unnice "feature". If bug is in a common used glibc-version, then imho a workaround by syslog-ng would be required. I don't like to loose loglines, you sure not, too.
Does this also happen if old syslogd would be used? Don't believe so.
It's all very strange...looks like I have to switch back to syslog, loosing all the nice features :-(
the same happens to old syslogd as well as this is a libc issue. just try it and you will see.
Sorry, but at least on a Linux 2.4 system this is wrong! I just tested it: With the old syslogd 1.4.1 no message will be lost: Feb 13 13:50:16 test ./syslog_test[183]: 21 Feb 13 13:50:21 test ./syslog_test[183]: 22 Feb 13 13:50:23 test exiting on signal 15 Feb 13 13:50:24 test syslogd 1.4.1: restart. Feb 13 13:50:26 test ./syslog_test[183]: 23 Feb 13 13:50:31 test ./syslog_test[183]: 24 The same with syslog-ng 1.5.25: Feb 13 13:34:03 s_local@Elwing ./syslog_test[2118]: 71 Feb 13 13:34:08 s_local@Elwing ./syslog_test[2118]: 72 Feb 13 13:34:08 s_local@Elwing syslog-ng[2204]: syslog-ng version 1.5.25 going down Feb 13 13:34:09 s_local@Elwing syslog-ng[2252]: syslog-ng version 1.5.25 starting Feb 13 13:34:18 s_local@Elwing ./syslog_test[2118]: 74 Feb 13 13:34:23 s_local@Elwing ./syslog_test[2118]: 75 Message 73 is missing! The test program "syslog_test.c": #include <syslog.h> int main( int argc, char *argv[] ) { int cnt=0; openlog( argv[0], LOG_NDELAY|LOG_PID, LOG_DAEMON ); while( 1 ) { syslog( LOG_INFO, "%d", ++cnt ); sleep( 5 ); } return( 0 ); } All programs (syslog-ng, syslogd, syslog_test) are linked with glibc-2.3.1. Achim
--On Thursday, February 13, 2003 02:04:52 PM +0100 Achim Gsell <achim@cybercity.ch> wrote:
On Thursday 13 February 2003 10:56, Balazs Scheidler wrote:
On Thu, Feb 13, 2003 at 09:54:52AM +0100, Peter Bieringer wrote:
Pls. don't tell me that I have to restart every service after I restart syslog-ng (already known: crond, now postfix, which one is still missing?). This would be a very unnice "feature". If bug is in a common used glibc-version, then imho a workaround by syslog-ng would be required. I don't like to loose loglines, you sure not, too.
Does this also happen if old syslogd would be used? Don't believe so.
It's all very strange...looks like I have to switch back to syslog, loosing all the nice features :-(
the same happens to old syslogd as well as this is a libc issue. just try it and you will see.
Now I've tested in on a very clean RHL 7.3 glibc-2.25-42 sysklogd-1.4.1-8 postfix-1.1.7-2 First, Balazs is right in the postfix issue, in this case it's even more worse: If syslogd/klogd restarts, postfix *completly* stops logging forever until postfix is reloaded or restarted.
Sorry, but at least on a Linux 2.4 system this is wrong!
I just tested it: With the old syslogd 1.4.1 no message will be lost:
Me too.
The test program "syslog_test.c":
# include <syslog.h>
int main( int argc, char *argv[] ) { int cnt=0;
openlog( argv[0], LOG_NDELAY|LOG_PID, LOG_DAEMON );
while( 1 ) { syslog( LOG_INFO, "%d", ++cnt ); sleep( 5 ); }
return( 0 ); }
Is someone able to write a short program which opens a /dev/log and log through this channel all the time? Will dig now through RHL errata to check whether this problem is already filed. Peter -- Dr. Peter Bieringer http://www.bieringer.de/pb/ GPG/PGP Key 0x958F422D mailto: pb at bieringer dot de Deep Space 6 Co-Founder and Core Member http://www.deepspace6.net/
The test program "syslog_test.c":
# include <syslog.h>
int main( int argc, char *argv[] ) { int cnt=0;
openlog( argv[0], LOG_NDELAY|LOG_PID, LOG_DAEMON );
while( 1 ) { syslog( LOG_INFO, "%d", ++cnt ); sleep( 5 ); }
return( 0 ); }
Is someone able to write a short program which opens a /dev/log and log through this channel all the time?
:) Look at the code above posted by Achim.
Will dig now through RHL errata to check whether this problem is already filed.
Best regards, Roberto Nibali, ratz -- echo '[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc
On Thu, Feb 13, 2003 at 02:04:52PM +0100, Achim Gsell wrote:
On Thursday 13 February 2003 10:56, Balazs Scheidler wrote:
On Thu, Feb 13, 2003 at 09:54:52AM +0100, Peter Bieringer wrote: the same happens to old syslogd as well as this is a libc issue. just try it and you will see.
Sorry, but at least on a Linux 2.4 system this is wrong!
I just tested it: With the old syslogd 1.4.1 no message will be lost:
Feb 13 13:50:16 test ./syslog_test[183]: 21 Feb 13 13:50:21 test ./syslog_test[183]: 22 Feb 13 13:50:23 test exiting on signal 15 Feb 13 13:50:24 test syslogd 1.4.1: restart. Feb 13 13:50:26 test ./syslog_test[183]: 23 Feb 13 13:50:31 test ./syslog_test[183]: 24
The same with syslog-ng 1.5.25:
Feb 13 13:34:03 s_local@Elwing ./syslog_test[2118]: 71 Feb 13 13:34:08 s_local@Elwing ./syslog_test[2118]: 72 Feb 13 13:34:08 s_local@Elwing syslog-ng[2204]: syslog-ng version 1.5.25 going down Feb 13 13:34:09 s_local@Elwing syslog-ng[2252]: syslog-ng version 1.5.25 starting Feb 13 13:34:18 s_local@Elwing ./syslog_test[2118]: 74 Feb 13 13:34:23 s_local@Elwing ./syslog_test[2118]: 75
Message 73 is missing!
The test program "syslog_test.c":
#include <syslog.h>
int main( int argc, char *argv[] ) { int cnt=0;
openlog( argv[0], LOG_NDELAY|LOG_PID, LOG_DAEMON );
while( 1 ) { syslog( LOG_INFO, "%d", ++cnt ); sleep( 5 ); }
return( 0 ); }
All programs (syslog-ng, syslogd, syslog_test) are linked with glibc-2.3.1.
glibc 2.3.1 might have been fixed so it tries to resend messages immediately, therefore the sysklogd case didn't drop messages. I've checked, libc 2.2.5 has this issue fixed, here's an strace that shows this: send(3, "<30>Feb 13 14:30:52 ./a.out[1655]: 5", 36, 0) = -1 ECONNREFUSED (Connection refused) close(3) = 0 socket(PF_UNIX, SOCK_DGRAM, 0) = 3 fcntl64(3, F_SETFD, FD_CLOEXEC) = 0 connect(3, {sin_family=AF_UNIX, path="/dev/log"}, 16) = 0 send(3, "<30>Feb 13 14:30:52 ./a.out[1655]: 5", 36, 0) = 36 Note that send() was called two times. If you check the same with libc 2.1.3, you'll see that sending the message is not reattempted. So this is a different issue. .... hmm... I've found the reason. libc does reattempt sending messages provided you use a unix-dgram source. sysklogd uses a unix-dgram source by default, thus I assume your syslog-ng config uses unix-stream. -- Bazsi PGP info: KeyID 9AF8D0A9 Fingerprint CD27 CFB0 802C 0944 9CFD 804E C82C 8EB1
On Thursday 13 February 2003 14:38, Balazs Scheidler wrote:
So this is a different issue. .... hmm... I've found the reason. libc does reattempt sending messages provided you use a unix-dgram source.
sysklogd uses a unix-dgram source by default, thus I assume your syslog-ng config uses unix-stream.
You are right! With unix-dgram it works fine ... Achim
--On Thursday, February 13, 2003 03:08:10 PM +0100 Achim Gsell <achim@cybercity.ch> wrote:
On Thursday 13 February 2003 14:38, Balazs Scheidler wrote:
So this is a different issue. .... hmm... I've found the reason. libc does reattempt sending messages provided you use a unix-dgram source.
sysklogd uses a unix-dgram source by default, thus I assume your syslog-ng config uses unix-stream.
You are right! With unix-dgram it works fine ...
Me to here on my clean system, postfix and crond are logging fine further on even syslog-ng is restarted. Thank you very very much!!!! Looks like FAQ, doc and contrib/syslog2ng need to be updated. Peter -- Dr. Peter Bieringer http://www.bieringer.de/pb/ GPG/PGP Key 0x958F422D mailto: pb at bieringer dot de Deep Space 6 Co-Founder and Core Member http://www.deepspace6.net/
Hi again, --On Thursday, February 13, 2003 09:54:52 AM +0100 Peter Bieringer <pb@bieringer.de> wrote: only for doublecheck, the missing nqmgr line on first attemp after syslog-ng was confirmed on a clean RHL 7.3 system without libsafe and Openwall patch. BTW: I found no errata on RH (neither postfix nor glibc), any other daemon than postfix known, which has such logging problems (either with sysklogd or syslog-ng)? Peter -- Dr. Peter Bieringer http://www.bieringer.de/pb/ GPG/PGP Key 0x958F422D mailto: pb at bieringer dot de Deep Space 6 Co-Founder and Core Member http://www.deepspace6.net/
participants (4)
-
Achim Gsell
-
Balazs Scheidler
-
Peter Bieringer
-
Roberto Nibali