RE: [syslog-ng]cannot get sec.pl to exit after syslog-ng does
From: Balazs Scheidler [mailto:bazsi@balabit.hu]
Sent: Tuesday, May 04, 2004 10:30 AM
To: syslog-ng@lists.balabit.hu
Subject: RE: [syslog-ng]cannot get sec.pl to exit after syslog-ng does
I've tried to reproduce the problem you are describing, but without success. I attached with strace to syslog-ng and its child process (this time it was /bin/cat), and here are the results:
[ results snipped ]
So cat receives the TERM signal; maybe it is blocked for you in some way? Another thing that should terminate your process is that syslog-ng closes the write side of its pipe, i.e. your perl script should have received an EOF on its standard input even if it does not receive the TERM signal.
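To illustrate the EOF point above, here is a minimal Python sketch (not syslog-ng code). The child is a hypothetical stand-in for sec.pl that simply copies stdin to stdout; `run_until_eof` and `CHILD` are names made up for this example. Once the parent closes the write side of the pipe, the child's read loop ends and it exits on its own, with no signal involved.

```python
import subprocess
import sys

# Hypothetical stand-in for sec.pl: copy stdin to stdout until EOF, then exit.
CHILD = "import sys\nfor line in sys.stdin:\n    sys.stdout.write(line)\n"


def run_until_eof(lines):
    """Feed lines to a child over a pipe, close the pipe, and wait for the child."""
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    for line in lines:
        child.stdin.write(line)
    # Closing the write side is what delivers EOF to the child's stdin.
    child.stdin.close()
    output = child.stdout.read()
    # The child should exit by itself; the timeout only guards against a hang.
    returncode = child.wait(timeout=10)
    return output, returncode
```

If the child never exits here, something other than the parent is still holding the write side of the pipe open, or the child is not reading its standard input at all.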
Well, it looks like the intervening "ignore SIGTERM" action is the /bin/sh spawned as the parent of all syslog-ng children (v1.6.2, line 112 of afprogram.c). Sorry I didn't think of using truss/strace sooner. I've included the truss output below. The command line used was "truss -d -f /usr/local/sbin/syslog-ng 2>&1 | tee debug1.txt", so each line has the pid and the offset-timestamp. You'll see that the /bin/sh ignores the SIGTERM and keeps waiting for its children. It was only after I SIGTERM'd sec.pl that the /bin/sh exited.

The processes were laid out as follows:

14340 /usr/local/sbin/syslog-ng
14342 /bin/sh -c /usr/local/sbin/sec.pl -intevents -input="-" -conf=/usr/local/etc/se
14344 /usr/local/bin/perl -w /usr/local/sbin/sec.pl -intevents -input=- -conf=/usr/lo

I sent the first SIGTERM to syslog-ng at timestamp ~7.5 seconds, at which point it exited and /bin/sh didn't. I SIGTERM'd /bin/sh at timestamp ~17.4 seconds, with no result. It was only after I SIGTERM'd sec.pl at ~23.4 seconds that /bin/sh gave up, at ~26.4 seconds.

I'm not sure what to do at this point about the /bin/sh - but I'm open to suggestions. I've checked syslog-ng 1.5.26 that Nate said he was using and it has the same /bin/sh line in afprogram.c. I've found this info from the 'sh' manpage:

    Signals
    The INTERRUPT and QUIT signals for an invoked command are ignored if the command is followed by &; otherwise signals have the values inherited by the shell from its parent, with the exception of signal 11 (but see also the trap command below).

But I'm not sure what it would inherit from syslog-ng.
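The process layout above can be reproduced in miniature. The following Python sketch is only an illustration under assumptions, not syslog-ng's code: `term_via_shell` is a made-up helper, and `sleep 30` stands in for sec.pl. It starts `/bin/sh -c <command>` and sends SIGTERM to the shell's pid only, as the kill(14342, SIGTERM) call in the truss output does. If the command string begins with `exec`, the shell replaces itself with the program, so that pid is the program itself and the signal reaches it directly. Whether a plain command (no `exec`) goes away on SIGTERM depends on the shell, which is the behaviour in question here.

```python
import signal
import subprocess
import time


def term_via_shell(command, grace=5.0):
    """Run `/bin/sh -c command`, SIGTERM the shell's pid, and report the result.

    Returns the exit status seen by the parent (negative signal number if the
    process was killed by a signal), or None if the process was still alive
    after `grace` seconds and had to be killed with SIGKILL.
    """
    proc = subprocess.Popen(["/bin/sh", "-c", command])
    time.sleep(0.5)  # give the shell a moment to start (and to exec, if asked)
    proc.send_signal(signal.SIGTERM)  # only this one pid is signalled
    try:
        return proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        # The shell did not go away on SIGTERM; clean up forcibly.
        proc.kill()
        proc.wait()
        return None
```

Calling `term_via_shell("exec sleep 30")` returns the negative SIGTERM number, because the process holding that pid is the sleep itself. A return value of None would correspond to the situation described above, where the shell is still waiting for its child after being signalled.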
this is in main.c around line 77:
Ah yes! ok! Once again, thanks for the help.

--
"Computer science is as much about computers as astronomy is about telescopes" -- Edsger Dijkstra
---------------------------------------------------------
Anthony Tonns, UNIX Administrator - atonns@mail.ivillage.com

RELEVANT TRUSS OUTPUT:

syslog-ng
---------
14340: 0.5666 poll(0xFFBEF908, 8, 100) = 0
14340: poll(0xFFBEF908, 8, 60000) (sleeping...)
14340: signotifywait() (sleeping...)
14340: lwp_sema_wait(0xFEF0DE78) (sleeping...)
14340: door_return(0x00000000, 0, 0x00000000, 0) (sleeping...)
14340: 7.5103 signotifywait() = 15
14340: 7.5107 Received signal #15, SIGTERM, in poll() [caught]
14340: siginfo: SIGTERM pid=13532 uid=0
14340: 7.5106 lwp_sigredirect(1, SIGTERM) = 0
14340: 7.5109 poll(0xFFBEF908, 8, 60000) Err#4 EINTR
14340: 7.5111 sigaction(SIGTERM, 0xFFBEF398, 0x00000000) = 0
14340: 7.5113 sysconfig(_CONFIG_SIGRT_MIN) = 38
14340: 7.5114 sigprocmask(SIG_SETMASK, 0xFF0DD9A8, 0x00000000) = 0
14340: 7.5116 sigaction(SIGTERM, 0xFFBEF228, 0xFFBEF328) = 0
14340: 7.5118 sigprocmask(SIG_SETMASK, 0xFF0E96D8, 0x00000000) = 0
14340: 7.5119 sysconfig(_CONFIG_SIGRT_MIN) = 38
14340: 7.5121 setcontext(0xFFBEF570)
14340: 7.5123 getpid() = 14340 [1]
14340: 7.5124 time() = 1083681719
14340: 7.5135 time() = 1083681719
14340: 7.5137 time() = 1083681719
14340: 7.5138 time() = 1083681719
14340: 7.5140 poll(0xFFBEF908, 8, 100) = 2
14340: 7.5142 write(4, " M a y 4 1 0 : 4 1".., 82) = 82
14340: 7.5145 write(9, " < 4 5 > M a y 4 1".., 86) = 86
14340: 7.5148 getpid() = 14340 [1]
14340: 7.5149 kill(14342, SIGTERM) = 0
14340: 7.5152 door_revoke(5) = 0
14340: 7.5153 close(5) Err#9 EBADF
14340: 7.5162 llseek(0, 0, SEEK_CUR) = 0
14340: 7.5164 _exit(0)

/bin/sh
-------
14342: 0.2626 fork() = 14344
14342: waitid(P_PID, 14344, 0xFFBEF850, WEXITED|WTRAPPED|WNOWAIT) (sleeping...)
14342: 7.5170 Received signal #15, SIGTERM, in waitid() [caught]
14342: siginfo: SIGTERM pid=14340 uid=0
14342: 7.5178 waitid(P_PID, 14344, 0xFFBEF850, WEXITED|WTRAPPED|WNOWAIT) Err#4 EINTR
14342: 7.5180 setcontext(0xFFBEF538)
14342: waitid(P_PID, 14344, 0xFFBEF850, WEXITED|WTRAPPED|WNOWAIT) (sleeping...)
14342: 17.4322 Received signal #15, SIGTERM, in waitid() [caught]
14342: siginfo: SIGTERM pid=13532 uid=0
14342: 17.4324 waitid(P_PID, 14344, 0xFFBEF850, WEXITED|WTRAPPED|WNOWAIT) Err#4 EINTR
14342: 17.4326 setcontext(0xFFBEF538)
14342: waitid(P_PID, 14344, 0xFFBEF850, WEXITED|WTRAPPED|WNOWAIT) (sleeping...)
14342: 26.4310 waitid(P_PID, 14344, 0xFFBEF850, WEXITED|WTRAPPED|WNOWAIT) = 0
14342: 26.4318 ioctl(0, TIOCGPGRP, 0xFFBEF80C) Err#22 EINVAL
14342: 26.4319 getpgid(14344) = 14337
14342: 26.4321 ioctl(0, TIOCGPGRP, 0xFFBEF80C) Err#22 EINVAL
14342: 26.4322 waitid(P_PID, 14344, 0xFFBEF850, WEXITED|WTRAPPED) = 0
14342: 26.4326 llseek(0, 0, SEEK_CUR) Err#29 ESPIPE
14342: 26.4327 _exit(2000)

sec.pl
------
14344: 23.4165 Received signal #15, SIGTERM, in poll() [caught]
14344: siginfo: SIGTERM pid=13532 uid=0
14344: 23.4167 poll(0xFFBEF838, 0, 100) Err#91 ERESTART
14344: 23.4172 sigaction(SIGTERM, 0xFFBEF048, 0xFFBEF0C8) = 0
14344: 23.4175 setcontext(0xFFBEF520)
14344: 23.4179 time() = 1083681735
14344: 23.4181 waitid(P_ALL, 0, 0xFFBEF788, WEXITED|WTRAPPED|WNOHANG) Err#10 ECHILD
14344: 23.4183 ioctl(2, TCGETA, 0xFFBEF874) Err#6 ENXIO
14344: 23.4185 ioctl(2, TCGETA, 0xFFBEF874) Err#6 ENXIO
14344: 23.4187 time() = 1083681735
14344: 23.4189 ioctl(2, TCGETA, 0xFFBEF874) Err#6 ENXIO
14344: 23.4201 time() = 1083681735
14344: 23.4204 time() = 1083681735
14344: 23.4206 ioctl(2, TCGETA, 0xFFBEF874) Err#6 ENXIO
14344: 23.4208 time() = 1083681735
14344: 23.4211 time() = 1083681735
14344: 23.4213 ioctl(2, TCGETA, 0xFFBEF874) Err#6 ENXIO
14344: 23.4215 ioctl(2, TCGETA, 0xFFBEF874) Err#6 ENXIO
14344: 23.4217 time() = 1083681735
14344: 23.4219 alarm(0) = 0
14344: 23.4221 sigaction(SIGALRM, 0xFFBEF798, 0xFFBEF848) = 0
14344: 23.4222 sigfillset(0xFF23C4C4) = 0
14344: 23.4224 sigprocmask(SIG_BLOCK, 0xFFBEF838, 0xFFBEF828) = 0
14344: 23.4225 alarm(3) = 0
14344: sigsuspend(0xFFBEF818) (sleeping...)
14344: 26.4168 Received signal #14, SIGALRM, in sigsuspend() [caught]
14344: 26.4171 sigsuspend(0xFFBEF818) Err#4 EINTR
14344: 26.4173 setcontext(0xFFBEF500)
14344: 26.4174 alarm(0) = 0
14344: 26.4176 sigprocmask(SIG_UNBLOCK, 0xFFBEF838, 0x00000000) = 0
14344: 26.4178 sigaction(SIGALRM, 0xFFBEF798, 0x00000000) = 0
14344: 26.4179 time() = 1083681738
14344: 26.4181 getcontext(0xFFBEF5E8)
14344: 26.4183 setcontext(0xFFBEF5E8)
14344: 26.4212 getcontext(0xFFBEF4D0)
14344: 26.4219 getcontext(0xFFBEF370)
14344: 26.4221 getcontext(0xFFBEF370)
14344: 26.4244 llseek(0, 0, SEEK_CUR) Err#29 ESPIPE
14344: 26.4246 _exit(0)

iVillage Inc., 500 Seventh Avenue, New York, NY 10018 - iVillage Inc. is a leading women's media company that includes iVillage.com, Women.com, gURL.com, Astrology.com, Promotions.com, iVillage Parenting Network, The Newborn Channel, Lamaze Publishing, Business Women's Network, Diversity Best Practices, Best Practices in Corporate Communications, and iVillage Consulting. The information contained in this communication may be confidential, is intended only for the use of the recipient named above, and may be construed under applicable law to be a commercial email. If you have received this communication in error, please delete this message from your computer system. If you are the recipient named above and do not wish to receive any future commercial emails, please reply to the sender with a message stating such preference.
atonns@mail.ivillage.com - Tue, May 04, 2004:
I'm not sure what to do at this point about the /bin/sh - but I'm open to suggestions. I've checked syslog-ng 1.5.26 that Nate said he was using and it has the same /bin/sh line in afprogram.c. I've found this info from the 'sh' manpage:
I don't know whether it's worth it, but you can try /usr/xpg4/bin/sh, which is a POSIX-compatible version of sh. I use it because it suffers from fewer bugs than /bin/sh.

Regards,
--
Loïc Minier <lool@dooz.org>