[syslog-ng] Problem with syslog-ng 1.6.8 and HUP
Tessier.Marc.2 at hydro.qc.ca
Tessier.Marc.2 at hydro.qc.ca
Wed Sep 10 17:13:18 CEST 2008
Hi,
I goofed when I explained our rotation, compression happens after files are moved and HUP is sent, like you're proposing actually. I'll try to remove the touch in the rotation and see if that changes anything by forcing syslog-ng to recreate the files. One other thing I noticed is that our syslog-ng build is a 32-bit version :
root at machine01# file /usr/sbin/syslog-ng
/usr/sbin/syslog-ng: ELF 32-bit MSB executable SPARC32PLUS Version 1, V8+ Required, dynamically linked, stripped
Of course the machine is running on 64 bit hardware with a 64 bit OS. I guess if you don't have any ideas, I could compile 1.6.12 as a 64 bit binary and see if that resolves the issue too.
Thank you!
Marc Tessier
Analyste Support Technique
Support Technique des Infrastructures
courriel : tessier.marc.2 at hydro.qc.ca
Tel: (514) 840-3000 Poste #6467
-----Message d'origine-----
De : syslog-ng-bounces at lists.balabit.hu [mailto:syslog-ng-bounces at lists.balabit.hu] De la part de Evan Rempel
Envoyé : mercredi 10 septembre 2008 10:19
À : Syslog-ng users' and developers' mailing list
Objet : Re: [syslog-ng] Problem with syslog-ng 1.6.8 and HUP
I don't see anything that would leave the stale file handles, however, you really should do things in a different order.
rename/move the files
touch the new files (not necessary with syslog-ng) HUP/reload syslog-ng compress the files
If you do the compression before the HUP, then syslog-ng will be writing to the files as it is being compressed. That will loose records in the files as the compressions will potentially miss some lines at the end due to a race condition.
Evan.
Tessier.Marc.2 at hydro.qc.ca wrote:
> Hi,
>
> I have been working on an issue for the last few days with one of our
> main centralized syslog servers which is a SPARC Solaris 9 box.
> It is running syslog-ng 1.6.8. Our log rotation script moves,
> compresses and then touches a new file depending on size. Then script
> then sends an HUP to syslog-ng in the following way :
>
> /usr/bin/pkill -HUP syslog-ng > /dev/null 2>&1
>
> I have added some debugging to the script to make sure it grabs the
> proper PID, and logs it independantly :
>
> Sep 10 07:58:01 machine01 syslogmgr[8304]: info: signaling
> Syslog/Syslog-ng (HUP) at PID 4362.
>
> And it is :
>
> root at spdmze21# ps -ef | grep syslog-ng
> root 4362 1 2 Sep 07 ? 160:38 /usr/sbin/syslog-ng
>
> Now in the log files, it seems syslog-ng isn't always reacting to the
> kill. This is what I get as far as syslog-ng noticing the signal in
> the last 2 days :
>
> root at machine01# grep syslog-ng complete.log | grep restarting Sep 9
> 09:58:06 machine01 syslog-ng[4362]: SIGHUP received, restarting
> syslog-ng Sep 9 15:58:03 machine01 syslog-ng[4362]: SIGHUP received,
> restarting syslog-ng Sep 9 19:58:07 machine01 syslog-ng[4362]: SIGHUP
> received, restarting syslog-ng Sep 10 01:58:04 machine01
> syslog-ng[4362]: SIGHUP received, restarting syslog-ng Sep 10 03:58:05
> machine01 syslog-ng[4362]: SIGHUP received, restarting syslog-ng
>
> However, it was signaled everyhour :
>
> Sep 10 00:58:02 machine01 syslogmgr[20532]: [ID 702911 daemon.info]
> info: signaling Syslog/Syslog-ng (HUP) at PID 4362.
> Sep 10 01:58:04 machine01 syslogmgr[26686]: [ID 702911 daemon.info]
> info: signaling Syslog/Syslog-ng (HUP) at PID 4362.
> Sep 10 02:58:02 machine01 syslogmgr[3702]: [ID 702911 daemon.info] info:
> signaling Syslog/Syslog-ng (HUP) at PID 4362.
> Sep 10 03:58:05 machine01 syslogmgr[9935]: [ID 702911 daemon.info] info:
> signaling Syslog/Syslog-ng (HUP) at PID 4362.
> Sep 10 04:58:04 machine01 syslogmgr[17262]: [ID 702911 daemon.info]
> info: signaling Syslog/Syslog-ng (HUP) at PID 4362.
> Sep 10 05:58:02 machine01 syslogmgr[24178]: [ID 702911 daemon.info]
> info: signaling Syslog/Syslog-ng (HUP) at PID 4362.
> Sep 10 06:58:02 machine01 syslogmgr[1748]: [ID 702911 daemon.info] info:
> signaling Syslog/Syslog-ng (HUP) at PID 4362.
> Sep 10 07:58:01 machine01 syslogmgr[8304]: [ID 702911 daemon.info] info:
> signaling Syslog/Syslog-ng (HUP) at PID 4362.
>
> Now the problem this creates for me is that it leaves a bunch of open
> file descriptors which don't point to a file anymore :
>
> root at machine01# lsof -p 4362 -s | grep d80
> syslog-ng 4362 root 7w VREG 85,80 9699746 40
> /var/adm/log (/dev/md/dsk/d80)
> syslog-ng 4362 root 17w VREG 85,80 3223056 500412
> /var/adm/log (/dev/md/dsk/d80)
> syslog-ng 4362 root 18w VREG 85,80 11604669 612873
> /var/adm/log (/dev/md/dsk/d80)
> syslog-ng 4362 root 20w VREG 85,80 8617243 90
> /var/adm/log (/dev/md/dsk/d80)
> syslog-ng 4362 root 23w VREG 85,80 11698446 612889
> /var/adm/log (/dev/md/dsk/d80)
> ...
>
> This causes our rotation to mess up and thus the filesystem to
> eventually become filled. If I fire pkill manually, it works :
>
> root at machine01# /usr/bin/pkill -HUP syslog-ng > /dev/null 2>&1
> root at machine01# lsof -p 4362 -s | grep d80 root at machine01#
>
> The script itself is called every hour from root's crontab. At this
> point, I am completely stumped. Even looking at the changelog from
> 1.6.8 to 1.6.12 (latest 1.6 tree version) I don't see anything that
> addresses this issue. Now since I have not been able to reproduce a
> non-working HUP, I don't have a truss output of when it happens.
>
> Thank you for any help!
>
> Marc Tessier
> Analyste Support Technique
> Support Technique des Infrastructures courriel :
> tessier.marc.2 at hydro.qc.ca <mailto:tessier.marc.2 at hydro.qc.ca>
> Tel: (514) 840-3000 Poste #6467
>
> ______________________________________________________________________
> ________ Member info:
> https://lists.balabit.hu/mailman/listinfo/syslog-ng
> Documentation:
> http://www.balabit.com/support/documentation/?product=syslog-ng
> FAQ: http://www.campin.net/syslog-ng/faq.html
>
______________________________________________________________________________
Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng
FAQ: http://www.campin.net/syslog-ng/faq.html
More information about the syslog-ng
mailing list