Security: syslog-ng 1.4.x and 1.5.x are vulnerable to a buffer overflow
Hi,

I'm writing this mail to announce that syslog-ng 1.4.x and 1.5.x are both vulnerable to a buffer overflow. Exploiting the bug requires a site-specific exploit, as the way the buffer is overwritten depends on the local configuration file. The buffer overflow can be triggered when templated output files or filename templates are used.

Everybody is urged to upgrade to 1.4.16 or 1.5.21; these are available at the usual place, http://www.balabit.hu/en/downloads/syslog-ng/downloads/

The bug was found by me, so possibly nobody else knows the details. Of course, diffing the new version against the previous one reveals the problem.

A Bugtraq announcement will be sent out soon. A Debian package has been released and accepted (though mirrors need time to get the new one).

ps: sigh, this was my first BoF :(

--
Bazsi
PGP info: KeyID 9AF8D0A9 Fingerprint CD27 CFB0 802C 0944 9CFD 804E C82C 8EB1
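The announcement does not show the vulnerable code, so the following is only a sketch of the general class of fix: expanding a filename template into a fixed-size buffer with a length check on every write. The names `append_bounded` and `expand_template`, and the single `$HOST` macro, are assumptions made for illustration; they are not taken from the syslog-ng sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not syslog-ng code): append n bytes of src to dst
 * without ever writing past dst[cap - 1]. *len is the number of bytes used
 * so far. Returns 0 on success, -1 if the text would not fit. */
static int append_bounded(char *dst, size_t cap, size_t *len,
                          const char *src, size_t n)
{
    assert(dst != NULL && len != NULL && src != NULL);
    if (cap == 0 || *len >= cap || n > cap - 1 - *len)
        return -1;                 /* would overflow: refuse to copy */
    memcpy(dst + *len, src, n);
    *len += n;
    dst[*len] = '\0';              /* keep the buffer NUL-terminated */
    return 0;
}

/* Expand every "$HOST" in tmpl into out, checking bounds on each write.
 * Returns 0 on success, -1 if the result does not fit in cap bytes. */
int expand_template(char *out, size_t cap, const char *tmpl, const char *host)
{
    size_t len = 0;

    if (cap == 0)
        return -1;
    out[0] = '\0';
    while (*tmpl != '\0') {
        if (strncmp(tmpl, "$HOST", 5) == 0) {
            if (append_bounded(out, cap, &len, host, strlen(host)) != 0)
                return -1;
            tmpl += 5;
        } else {
            if (append_bounded(out, cap, &len, tmpl, 1) != 0)
                return -1;
            tmpl++;
        }
    }
    return 0;
}
```

The point of the sketch is that the expanded length depends on both the template (from the local configuration) and the message contents, which is why an unchecked copy can overrun the buffer in a configuration-dependent way.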
Balazs Scheidler <bazsi@balabit.hu> wrote: [snip]
Everybody is urged to upgrade to 1.4.16 or 1.5.21, these are available at the usual place, http://www.balabit.hu/en/downloads/syslog-ng/downloads/
I am having difficulties on Solaris 2.6 and 8 building 1.5.21. syslog-ng seems to need to link with libresolv, although it's not picked up. Linking it by hand gets the compile finished, but then it segfaults after a few seconds with:

    poll(0xFFBEFC70, 2, 600000) (sleeping...)
    signotifywait() (sleeping...)
    door_return(0x00000000, 0, 0x00000000, 0) (sleeping...)
    lwp_cond_wait(0xFF0D5550, 0xFF0D5560, 0xFF0CEDB8) (sleeping...)
    door_return(0x00000000, 0, 0x00000000, 0) (sleeping...)
    poll(0xFFBEFC70, 2, 600000) = 1
    accept(2, 0xFFBEFB00, 0xFFBEFAFC, 1) = 4
    fcntl(4, F_GETFL, 0xFFFFFFFF) = 130
    fstat64(4, 0xFFBEF7C8) = 0
    getsockopt(4, 65535, 8192, 0xFFBEF8C8, 0xFFBEF8C0, 0) = 0
    fstat64(4, 0xFFBEF7C8) = 0
    getsockopt(4, 65535, 8192, 0xFFBEF8C8, 0xFFBEF8C4, 0) = 0
    setsockopt(4, 65535, 8192, 0xFFBEF8C8, 4, 0) = 0
    fcntl(4, F_SETFL, 0x00000082) = 0
    fcntl(4, F_SETFD, 0x00000001) = 0
    time() = 1033145607
    poll(0xFFBEFC68, 3, 100) = 1
    read(4, " < 1 8 3 > S e p 2 7 ".., 2049) = 2049
    Incurred fault #6, FLTBOUNDS %pc = 0xFF141AD8
    siginfo: SIGSEGV SEGV_MAPERR addr=0x3804A888
    Received signal #11, SIGSEGV [default]
    siginfo: SIGSEGV SEGV_MAPERR addr=0x3804A888
    *** process killed ***

Any ideas? Thanks in advance.
William Yodlowsky <wyodlows@andromeda.rutgers.edu> wrote:
Balazs Scheidler <bazsi@balabit.hu> wrote:
[snip]
Everybody is urged to upgrade to 1.4.16 or 1.5.21, these are available at the usual place, http://www.balabit.hu/en/downloads/syslog-ng/downloads/
I am having difficulties on Solaris 2.6 and 8 building 1.5.21. syslog-ng seems to need to link with libresolv, although it's not picked up. Linking it by hand gets the compile finished, but then it segfaults after a few seconds with:
[snip]
Any ideas? Thanks in advance.
Here's a trace... This version was compiled without the res_init() call, and without -lresolv.

    # gdb ./syslog-ng
    GNU gdb 5.0
    Copyright 2000 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB. Type "show warranty" for details.
    This GDB was configured as "sparc-sun-solaris2.8"...
    (gdb) set args -F -C /common/logs -u logs -g logs
    (gdb) run
    Starting program: ./syslog-ng -F -C /common/logs -u logs -g logs
    [New LWP 1]
    [New LWP 2]
    [New LWP 3]
    [New LWP 4]
    [New LWP 5]

    Program received signal SIGSEGV, Segmentation fault.
    0xff141f74 in realfree () from /usr/lib/libc.so.1
    (gdb) bt
    #0 0xff141f74 in realfree () from /usr/lib/libc.so.1
    #1 0xff142880 in cleanfree () from /usr/lib/libc.so.1
    #2 0xff1419b4 in _malloc_unlocked () from /usr/lib/libc.so.1
    #3 0xff1418a8 in malloc () from /usr/lib/libc.so.1
    #4 0x2abf8 in xalloc ()
    #5 0x2adc0 in ol_space_alloc ()
    #6 0x199c0 in make_log_info ()
    #7 0x1628c in do_handle_line ()
    #8 0x16750 in do_read_line ()
    #9 0x28e9c in read_callback ()
    #10 0x28b78 in io_iter ()
    #11 0x1548c in main_loop ()
    #12 0x1607c in main ()
    (gdb) The program is running. Exit anyway? (y or n) y
    #

If there's a way I can help in debugging this further, please let me know. I refrain from posting my config file because it's quite large (over 100 lines).

Thanks.
In article <3D9D1811.nail1Z311GE@andromeda.rutgers.edu>, William Yodlowsky <wyodlows@andromeda.rutgers.edu> wrote;
} Program received signal SIGSEGV, Segmentation fault.
} 0xff141f74 in realfree () from /usr/lib/libc.so.1
} (gdb) bt
} #0 0xff141f74 in realfree () from /usr/lib/libc.so.1
} #1 0xff142880 in cleanfree () from /usr/lib/libc.so.1
} #2 0xff1419b4 in _malloc_unlocked () from /usr/lib/libc.so.1
} #3 0xff1418a8 in malloc () from /usr/lib/libc.so.1
} #4 0x2abf8 in xalloc ()
} #5 0x2adc0 in ol_space_alloc ()

As I wrote before, mine also died in malloc(). I just tried an ugly patch to avoid zero-sized allocations, so that at least one byte is allocated whenever xalloc() is asked for 0 bytes, like this:

    if (size == 0) size++;

Even with the patch, I still get the core:

    (gdb) bt
    #0 0xff141bec in realfree () from /usr/lib/libc.so.1
    #1 0xff1424f8 in cleanfree () from /usr/lib/libc.so.1
    #2 0xff14162c in _malloc_unlocked () from /usr/lib/libc.so.1
    #3 0xff141520 in malloc () from /usr/lib/libc.so.1
    #4 0x21cc0 in xalloc ()
    #5 0x17840 in make_log_info (length=0, msg=0x0, prefix=0x0, flags=0) at log.c:272
    #6 0x15ba4 in do_handle_line (self=0x55b68, length=164, data=0x56490 "<150>Oct 4 17:29:47 local@ks0003 named[13188]: [ID 866145 local2.info] Oct 04 17:29:47.956security: client 10.17.0.25#58886: query 'ks0302.XXX.ne.jp/IN' denied\n<150>Oct 4 17:29:50 local@ks0003 n"..., addr=0x56490, addrlen=0) at sources.c:68
    #7 0x15d48 in do_read_line (h=0x56534, read=0xffbef998) at sources.c:138
    #8 0x1ff54 in read_callback ()
    #9 0x1fbcc in io_iter ()
    #10 0x152d8 in main_loop (backend=0x39758) at main.c:192
    #11 0x159c8 in main (argc=0, argv=0xffbefd04) at main.c:516
    (gdb) frame 5
    #5 0x17840 in make_log_info (length=0, msg=0x0, prefix=0x0, flags=0) at log.c:272
    272 NEW_SPACE(self);

From the above trace and my experience on other development projects, I think this is caused by a memory-handling bug in the program.

--
Katsuhiro Kondou
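For readers who want to see the zero-size guard in context, here is a minimal sketch. `xalloc_sketch` is a hypothetical stand-in: the real libol xalloc() signature and its error handling are not shown in this thread, so the abort-on-failure behaviour below is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for libol's xalloc(); only the size == 0 guard
 * comes from the message above, the rest is assumed for illustration. */
void *xalloc_sketch(size_t size)
{
    void *p;

    /* malloc(0) may return either NULL or a unique pointer, depending on
     * the platform. Bumping the request to one byte removes that ambiguity. */
    if (size == 0)
        size++;
    assert(size > 0);

    p = malloc(size);
    if (p == NULL) {
        fprintf(stderr, "out of memory allocating %lu bytes\n",
                (unsigned long)size);
        abort();
    }
    return p;
}
```

Since the crash persisted with this guard in place, zero-sized requests are unlikely to be the cause; a guard like this only rules out one hypothesis.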
On Fri, Oct 04, 2002 at 12:24:49AM -0400, William Yodlowsky wrote:
William Yodlowsky <wyodlows@andromeda.rutgers.edu> wrote:
Balazs Scheidler <bazsi@balabit.hu> wrote:
[snip]
Everybody is urged to upgrade to 1.4.16 or 1.5.21, these are available at the usual place, http://www.balabit.hu/en/downloads/syslog-ng/downloads/
I am having difficulties on Solaris 2.6 and 8 building 1.5.21. syslog-ng seems to need to link with libresolv, although it's not picked up. Linking it by hand gets the compile finished, but then it segfaults after a few seconds with:
[snip]
Any ideas? Thanks in advance.
Here's a trace... This version was compiled without the res_init() call, and without -lresolv.
[snip]
If there's a way I can help in debugging this further, please let me know. I refrain from posting my config file because it's quite large (over 100 lines).
Hmm.. I've started syslog-ng on a Solaris 8 system. It started and seems to work. I tried sending log traffic to it, but at first I was unable to reproduce the problem. (The system I run it on is a Sun Ultra II with two 300 MHz processors and 768 MB RAM.)

The fact that it segfaults in malloc() seems to indicate that there's a problem in syslog-ng (double free, overwritten memory block chains or something).

--
Bazsi
PGP info: KeyID 9AF8D0A9 Fingerprint CD27 CFB0 802C 0944 9CFD 804E C82C 8EB1
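A crash inside malloc() usually means some earlier write damaged the allocator's bookkeeping, so the faulting call is not where the bug is. One common way to catch the offending write closer to its source is to wrap allocations in guard bytes and check them. The sketch below is an assumption-laden illustration of that technique: `guarded_alloc`, `guarded_check` and `guarded_free` are hypothetical names and are not part of syslog-ng or libol.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_BYTE 0xA5
#define GUARD_LEN  8
#define HEADER_LEN 16   /* size field plus front guard; keeps user data aligned */

/* Block layout: [size | front guard][user data][rear guard] */
void *guarded_alloc(size_t size)
{
    unsigned char *raw;

    assert(sizeof(size_t) + GUARD_LEN <= HEADER_LEN);
    if (size > (size_t)-1 - HEADER_LEN - GUARD_LEN)
        return NULL;                       /* request would overflow size_t */
    raw = malloc(HEADER_LEN + size + GUARD_LEN);
    if (raw == NULL)
        return NULL;
    memcpy(raw, &size, sizeof size);       /* remember the user size */
    memset(raw + HEADER_LEN - GUARD_LEN, GUARD_BYTE, GUARD_LEN);
    memset(raw + HEADER_LEN + size, GUARD_BYTE, GUARD_LEN);
    return raw + HEADER_LEN;
}

/* Returns 1 if both guards are intact, 0 if something wrote outside the
 * user data area of this block. */
int guarded_check(const void *ptr)
{
    const unsigned char *user = ptr;
    const unsigned char *raw = user - HEADER_LEN;
    size_t size, i;

    memcpy(&size, raw, sizeof size);
    for (i = 0; i < GUARD_LEN; i++) {
        if (raw[HEADER_LEN - GUARD_LEN + i] != GUARD_BYTE)
            return 0;                      /* underrun before the block */
        if (user[size + i] != GUARD_BYTE)
            return 0;                      /* overrun past the block */
    }
    return 1;
}

/* Verify the guards, then release the whole block. */
void guarded_free(void *ptr)
{
    if (ptr == NULL)
        return;
    assert(guarded_check(ptr));
    free((unsigned char *)ptr - HEADER_LEN);
}
```

Calling `guarded_check` at suspicious points (for example after each message is parsed) would narrow down which code path writes out of bounds, instead of waiting for a later malloc() to trip over the damage.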
Balazs Scheidler <bazsi@balabit.hu> wrote:
On Fri, Oct 04, 2002 at 12:24:49AM -0400, William Yodlowsky wrote:
[snip]
If there's a way I can help in debugging this further, please let me know. I refrain from posting my config file because it's quite large (over 100 lines).
Hmm.. I've started syslog-ng on a solaris 8 system. It started and seems to work. I try to send log traffic to it, but at first I was unable to reproduce the problem. (the system I run it on is a Sun Ultra II with two 300 MHz processors and 768 MB RAM)
The fact that it segfaults in malloc() seems to indicate that there's a problem in syslog-ng (double free, overwritten memory block chains or something)
Some more info...

* 1.5.13 runs flawlessly on that system
* Our syslog clients seem to work fine (minimal config file)
* Central syslog server segfaults (I know kondou@isc.org mentioned that it was their central server too)

Since I haven't tried running 1.5.14-1.5.20 I'm going to give them a try to see if the problem is in one of those previous releases. That may make it easier to track down.

Thanks!
In article <3D9D9B57.nailFTQ1OU1@andromeda.rutgers.edu>, William Yodlowsky <wyodlows@andromeda.rutgers.edu> wrote;
} Since I haven't tried running 1.5.14-1.5.20 I'm going to give them a try
} to see if the problem is in one of those previous releases. That may
} make it easier to track down.

And I think I can also give it a try if you give me the method (or patch) to track it down.

--
Katsuhiro Kondou
On Sat, Oct 05, 2002 at 12:24:42AM +0900, Katsuhiro Kondou wrote:
In article <3D9D9B57.nailFTQ1OU1@andromeda.rutgers.edu>, William Yodlowsky <wyodlows@andromeda.rutgers.edu> wrote;
} Since I haven't tried running 1.5.14-1.5.20 I'm going to give them a try } to see if the problem is in one of those previous releases. That may } make it easier to track down.
And I think I can also give it try if you give me the method(or patch) to track down.
Thanks in advance. And please also change libol versions accordingly, as it might also be the culprit.

--
Bazsi
PGP info: KeyID 9AF8D0A9 Fingerprint CD27 CFB0 802C 0944 9CFD 804E C82C 8EB1
In article <20021007134411.GC24401@balabit.hu>, Balazs Scheidler <bazsi@balabit.hu> wrote;
} thanks in advance. And please also change libol versions accordingly, as it
} might also be the culprit.

Yeah, I've already upgraded to 0.3.3 for 1.5.21.

--
Katsuhiro Kondou
William Yodlowsky <wyodlows@andromeda.rutgers.edu> wrote:
* Central syslog server segfaults (I know kondou@isc.org mentioned that it was their central server too)
Since I haven't tried running 1.5.14-1.5.20 I'm going to give them a try to see if the problem is in one of those previous releases. That may make it easier to track down.
Ok, here's what I did. I tested each the same way:

- Compiled libol with: ./configure && make
- Compiled syslog-ng with: ./configure --with-libol=../libol-VERSION && make
- Tested with server (large) config file and invoked with:

    # cd src
    # truss -f ./syslog-ng -f ~/syslog-ng.conf -F -C /tmp/a -u logs -g logs

Results:

libol-0.3.1 & syslog-ng-1.5.14 - worked
libol-0.3.1 & syslog-ng-1.5.15 - worked
libol-0.3.2 & syslog-ng-1.5.16 - build failed
libol-0.3.2 & syslog-ng-1.5.17 - worked
libol-0.3.3 & syslog-ng-1.5.18 - build failed
libol-0.3.3 & syslog-ng-1.5.19 - segfault
libol-0.3.3 & syslog-ng-1.5.20 - worked
libol-0.3.3 & syslog-ng-1.5.21 - (removed res_init call) - WORKED

Hmm. Before, I was linking with libresolv. Since removing res_init, that's no longer necessary, and it doesn't seem to segfault anymore.

I'm going to poke at this a bit more, and if anything else turns up, I'll post.

Thanks...
William Yodlowsky <wyodlows@andromeda.rutgers.edu> wrote:
William Yodlowsky <wyodlows@andromeda.rutgers.edu> wrote:
* Central syslog server segfaults (I know kondou@isc.org mentioned that it was their central server too)
Since I haven't tried running 1.5.14-1.5.20 I'm going to give them a try to see if the problem is in one of those previous releases. That may make it easier to track down.
Ok, here's what I did. I tested each the same way:
- Compiled libol with: ./configure && make
- Compiled syslog-ng with: ./configure --with-libol=../libol-VERSION && make
- Tested with server (large) config file and invoked with:
    # cd src
    # truss -f ./syslog-ng -f ~/syslog-ng.conf -F -C /tmp/a -u logs -g logs
Results:
libol-0.3.1 & syslog-ng-1.5.14 - worked
libol-0.3.1 & syslog-ng-1.5.15 - worked
libol-0.3.2 & syslog-ng-1.5.16 - build failed
libol-0.3.2 & syslog-ng-1.5.17 - worked
libol-0.3.3 & syslog-ng-1.5.18 - build failed
libol-0.3.3 & syslog-ng-1.5.19 - segfault
libol-0.3.3 & syslog-ng-1.5.20 - worked
libol-0.3.3 & syslog-ng-1.5.21 - (removed res_init call) - WORKED
Hmm. Before, I was linking with libresolv. Since removing res_init, that's no longer necessary, and it doesn't seem to segfault anymore.
I'm going to poke at this a bit more, and if anything else turns up, I'll post.
Sigh. I spoke too soon:

    # gdb /usr/local/sbin/syslog-ng
    GNU gdb 5.0
    Copyright 2000 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB. Type "show warranty" for details.
    This GDB was configured as "sparc-sun-solaris2.8"...
    (gdb) set args -F -C /common/logs -u logs -g logs
    (gdb) r
    Starting program: /usr/local/sbin/syslog-ng -F -C /common/logs -u logs -g logs
    [New LWP 1]
    [New LWP 2]
    [New LWP 3]
    [New LWP 4]
    [New LWP 5]

    Program received signal SIGSEGV, Segmentation fault.
    0xff141f74 in realfree () from /usr/lib/libc.so.1
    (gdb) bt
    #0 0xff141f74 in realfree () from /usr/lib/libc.so.1
    #1 0xff142880 in cleanfree () from /usr/lib/libc.so.1
    #2 0xff1419b4 in _malloc_unlocked () from /usr/lib/libc.so.1
    #3 0xff1418a8 in malloc () from /usr/lib/libc.so.1
    #4 0x2abf8 in xalloc ()
    #5 0x2adc0 in ol_space_alloc ()
    #6 0x199c0 in make_log_info ()
    #7 0x1628c in do_handle_line ()
    #8 0x16750 in do_read_line ()
    #9 0x28e9c in read_callback ()
    #10 0x28b78 in io_iter ()
    #11 0x1548c in main_loop ()
    #12 0x1607c in main ()
    (gdb)

The only difference between my test config file and the production one is that the test one listened on a different port, and didn't have the same volume of traffic.
William Yodlowsky <wyodlows@andromeda.rutgers.edu> wrote:

Ok let me update again (sorry for the multiple posts):

Retested in production:

libol-0.3.2 & syslog-ng-1.5.17 - works fine
libol-0.3.3 & syslog-ng-1.5.18 - build failed, untested
libol-0.3.3 & syslog-ng-1.5.19 - segfaults in production
libol-0.3.3 & syslog-ng-1.5.20 - segfaults in production
libol-0.3.3 & syslog-ng-1.5.21 - segfaults in production
libol-0.3.2 & syslog-ng-1.5.21 - segfaults in production

So, it seems the bug is in syslog-ng, introduced somewhere between 1.5.17 and 1.5.19. I'll see if I can get 1.5.18 to build to narrow it down even further.
On Wed, Oct 09, 2002 at 12:27:24PM -0400, William Yodlowsky wrote:
William Yodlowsky <wyodlows@andromeda.rutgers.edu> wrote:
Ok let me update again (sorry for the multiple posts):
Retested in production:
libol-0.3.2 & syslog-ng-1.5.17 - works fine
libol-0.3.3 & syslog-ng-1.5.18 - build failed, untested
libol-0.3.3 & syslog-ng-1.5.19 - segfaults in production
libol-0.3.3 & syslog-ng-1.5.20 - segfaults in production
libol-0.3.3 & syslog-ng-1.5.21 - segfaults in production
libol-0.3.2 & syslog-ng-1.5.21 - segfaults in production
So, it seems the bug is in syslog-ng, introduced somewhere between 1.5.17 and 1.5.19. I'll see if I can get 1.5.18 to build to narrow it down even further.
My suspicion is this code:

    void do_destroy_afinet_dest(struct log_handler *c, struct syslog_config *cfg, struct persistent_config *persistent)
    {
        CAST(afinet_dest, self, c);

        if (self->conn_fd) {
            /* KILL_RESOURCE(&self->conn_fd->super.super); */
            closekill_fd(&self->conn_fd->super, 0);
            self->conn_fd = NULL;
        }
    }

1.5.17 had the commented-out version; anything since 1.5.18 has the closekill_fd version.

This code path is only used _iff_ a HUP is sent to syslog-ng. Is the segfault triggered by sending a HUP to the process, or is it simply crashing without HUP?

--
Bazsi
PGP info: KeyID 9AF8D0A9 Fingerprint CD27 CFB0 802C 0944 9CFD 804E C82C 8EB1
In article <20021010083117.GC24001@balabit.hu>, Balazs Scheidler <bazsi@balabit.hu> wrote;
} This code path is only used _iff_ a HUP is sent to syslog-ng. Is the
} segfault triggered by sending a HUP to the process, or it is simply crashing
} without HUP?

Mine is dead soon after syslog-ng starts, so without SIGHUP.

--
Katsuhiro Kondou
On Thu, Oct 10, 2002 at 05:41:20PM +0900, Katsuhiro Kondou wrote:
In article <20021010083117.GC24001@balabit.hu>, Balazs Scheidler <bazsi@balabit.hu> wrote;
} This code path is only used _iff_ a HUP is sent to syslog-ng. Is the } segfault triggered by sending a HUP to the process, or it is simply crashing } without HUP?
Mine is dead soon after syslog-ng starts, so without SIGHUP.
I've started loading my Sparc server with logs. No crashes so far.

Do you have remote sources? (e.g. udp or tcp?) If yes, which one is loaded?

This is my config now:

    source src { udp(port(2000)); sun-streams("/dev/log" door("/etc/.syslog_door")); internal(); };
    destination dst { file("/var/log/messages"); };
    log { source(src); destination(dst); };

I'm sending messages through udp and from locally as well.

--
Bazsi
PGP info: KeyID 9AF8D0A9 Fingerprint CD27 CFB0 802C 0944 9CFD 804E C82C 8EB1
In article <20021010090900.GD24001@balabit.hu>, Balazs Scheidler <bazsi@balabit.hu> wrote;
} do you have remote sources? (e.g. udp or tcp?)

Yes.

} if yes, which one is loaded?

Both. tcp from Solaris hosts has 32 connections, and udp from network equipment like Catalyst and NetScreen has 50 sources.

} this is my config now:

Mine is;

    source local { sun-streams("/dev/log" door("/etc/.syslog_door")); internal(); };
    source net_tcp { tcp(max-connections(100)); };
    source net_udp { udp(); };

} I'm sending messages through udp and from locally as well.

Mine sends thru tcp to another (backup) syslog server.

--
Katsuhiro Kondou
On Thu, Oct 10, 2002 at 06:53:58PM +0900, Katsuhiro Kondou wrote:
In article <20021010090900.GD24001@balabit.hu>, Balazs Scheidler <bazsi@balabit.hu> wrote;
} do you have remote sources? (e.g. udp or tcp?)
Yes.
} if yes, which one is loaded?
Both. tcp from solaris has 32 connections and udp from network equipments like Catalyst and NetScreen has 50 sources.
I've tested my setup with UDP. This means I have to add a TCP source. Do those TCP connections break?
} this is my config now:
Mine is;
source local { sun-streams("/dev/log" door("/etc/.syslog_door")); internal(); }; source net_tcp { tcp(max-connections(100)); }; source net_udp { udp(); };
} I'm sending messages through udp and from locally as well.
Mine sends thru tcp to another(backup) syslog server.
is the backup server restarted? I'm interested in whether the connection between your log server and the backup server is broken for some reason. -- Bazsi PGP info: KeyID 9AF8D0A9 Fingerprint CD27 CFB0 802C 0944 9CFD 804E C82C 8EB1
In article <20021010100823.GA28012@balabit.hu>, Balazs Scheidler <bazsi@balabit.hu> wrote;
} Do those TCP connections break?

What do you mean by 'break'? If you mean that those connections may cause the segfault, yes, they may, since the backtrace of the core shows a message received thru tcp.

} is the backup server restarted? I'm interested in whether the connection
} between your log server and the backup server is broken for some reason.

No. I never touched the backup server while I restarted the segfaulted server.

--
Katsuhiro Kondou
On Thu, Oct 10, 2002 at 07:20:47PM +0900, Katsuhiro Kondou wrote:
In article <20021010100823.GA28012@balabit.hu>, Balazs Scheidler <bazsi@balabit.hu> wrote;
} Do those TCP connections break?
What do you mean by 'break'? If you mean those connections may cause the segfault, yes it may do, since back trace of the core shows the message thru tcp.
I mean whether those connections are persistent, or they get closed from time-to-time (e.g. syslog-ng reload on the other side)
} is the backup server restarted? I'm interested in whether the connection } between your log server and the backup server is broken for some reason.
No. I never touched the backup server while I restarted segfaulted server.
I mean whether syslog-ng server itself is restarted (which means that the TCP connection gets closed) -- Bazsi PGP info: KeyID 9AF8D0A9 Fingerprint CD27 CFB0 802C 0944 9CFD 804E C82C 8EB1
In article <20021010102317.GA3131@balabit.hu>, Balazs Scheidler <bazsi@balabit.hu> wrote;
} I mean whether those connections are persistent, or they get closed from
} time-to-time (e.g. syslog-ng reload on the other side)

The former. The other syslog-ng servers keep their connections.

} > No. I never touched the backup server while I restarted
} > segfaulted server.
}
} I mean whether syslog-ng server itself is restarted (which means that the
} TCP connection gets closed)

I hope I don't misunderstand your question. I restarted the syslog-ng server, which soon segfaulted, and the backup syslog-ng server was never restarted while doing so.

--
Katsuhiro Kondou
Katsuhiro Kondou <kondou@isc.org> wrote:
In article <20021010090900.GD24001@balabit.hu>, Balazs Scheidler <bazsi@balabit.hu> wrote;
} do you have remote sources? (e.g. udp or tcp?)
Yes.
Me too.
} if yes, which one is loaded?
Both. tcp from solaris has 32 connections and udp from network equipments like Catalyst and NetScreen has 50 sources.
I run just TCP. All clients are Solaris 2.6-2.8 running syslog-ng.
} this is my config now:
Mine is;
source local { sun-streams("/dev/log" door("/etc/.syslog_door")); internal(); }; source net_tcp { tcp(max-connections(100)); }; source net_udp { udp(); };
Mine is similar.
On Thu, Oct 10, 2002 at 09:44:44AM -0400, William Yodlowsky wrote:
Katsuhiro Kondou <kondou@isc.org> wrote:
In article <20021010090900.GD24001@balabit.hu>, Balazs Scheidler <bazsi@balabit.hu> wrote;
} do you have remote sources? (e.g. udp or tcp?)
Yes.
Me too.
} if yes, which one is loaded?
Both. tcp from solaris has 32 connections and udp from network equipments like Catalyst and NetScreen has 50 sources.
I run just TCP. All clients are Solaris 2.6-2.8 running syslog-ng.
} this is my config now:
Mine is;
source local { sun-streams("/dev/log" door("/etc/.syslog_door")); internal(); }; source net_tcp { tcp(max-connections(100)); }; source net_udp { udp(); };
Mine is similar.
My testbox has been up for more than 3 hours now. I send about 5-10 messages per second through udp, and I've started 12 jobs from this shell script:

    while true; do echo valami | nc -q0 sun 2000; sleep 1; done &

from my box. It's still up and running. I'll now add another TCP source which is not dropped after each message. (though the script above should stress syslog-ng much more)

--
Bazsi
PGP info: KeyID 9AF8D0A9 Fingerprint CD27 CFB0 802C 0944 9CFD 804E C82C 8EB1
In article <20021010152652.GA12473@balabit.hu>, Balazs Scheidler <bazsi@balabit.hu> wrote;
} from my box. It's still up and running. I'll now add another TCP source
} which is not dropped after each message. (though the script above should
} stress syslog-ng much more)

Is it helpful to define DEBUG_ALLOC and REALLY_DEBUG_ALLOC to see what is going on?

--
Katsuhiro Kondou
Balazs Scheidler <bazsi@balabit.hu> wrote:
On Wed, Oct 09, 2002 at 12:27:24PM -0400, William Yodlowsky wrote:
So, it seems the bug is in syslog-ng, introduced somewhere between 1.5.17 and 1.5.19. I'll see if I can get 1.5.18 to build to narrow it down even further.
My suspicion is this code:
    void do_destroy_afinet_dest(struct log_handler *c, struct syslog_config *cfg, struct persistent_config *persistent)
    {
        CAST(afinet_dest, self, c);

        if (self->conn_fd) {
            /* KILL_RESOURCE(&self->conn_fd->super.super); */
            closekill_fd(&self->conn_fd->super, 0);
            self->conn_fd = NULL;
        }
    }
1.5.17 had the commented out version, anything since 1.5.18 has the closekill_fd version.
This code path is only used _iff_ a HUP is sent to syslog-ng. Is the segfault triggered by sending a HUP to the process, or it is simply crashing without HUP?
I don't HUP the process, it just segfaults on its own. Just to test, I changed the code above back to what it was in 1.5.17, and it segfaulted the same way.
On Fri, Sep 27, 2002 at 04:16:24PM +0200, Balazs Scheidler wrote:
ps: sigh, this was my first BoF :(
I really do feel bad for you, you must have felt great about the lack of overflows. That's ok, we'll keep using syslog-ng ;)

--
If JavaScript is walking alone late at night through a bad part of town with a pocket full of $20 bills, ActiveX is dropping your trousers in the middle of the yard of a maximum-security prison, bending over, and yelling 'Come and get it, boys!'
Everybody is urged to upgrade to 1.4.16 or 1.5.21, these are available at the usual place, http://www.balabit.hu/en/downloads/syslog-ng/downloads/
Is there a signature file for syslog-ng-1.4.16.tar.gz? The URL leads to a blank page.
participants (5)
- Balazs Scheidler
- Gorm Jensen
- Katsuhiro Kondou
- Nate Campi
- William Yodlowsky