segfault signal 11, core dump, 1.5.26, 1.6.0rc1
Hi all,

I have a Debian woody box with kernel 2.4.20 on i386. It is a network log collector server with several UDP and TCP sources; some of the sources are behind firewalls.

It worked for weeks without problems with the 1.5.26-1 .deb downloaded directly from the balabit site. Before that it worked for months with the default 1.5.1x woody package, which doesn't really work well with TCP connections (that's why the upgrade happened).

For the past few days the daemon has been segfaulting every 2-5 minutes and dumping core. I compiled 1.6.0rc1 on a woody machine, but it behaves the same way. I've tried various config modifications, with the same result.

After debugging it for days I am really angry at not finding the root of the problem. memtest86 found the memory clean, and all other services on the machine are working fine.

How can I fix this and get it working again? We have critical business apps that depend on this service.

Thanks
On Fri, Apr 11, 2003 at 01:37:13PM +0200, narancs wrote:
> Hi all,
>
> I have a Debian woody box with kernel 2.4.20 on i386. It is a network log collector server with several UDP and TCP sources; some of the sources are behind firewalls.
>
> It worked for weeks without problems with the 1.5.26-1 .deb downloaded directly from the balabit site. Before that it worked for months with the default 1.5.1x woody package, which doesn't really work well with TCP connections (that's why the upgrade happened).
>
> For the past few days the daemon has been segfaulting every 2-5 minutes and dumping core. I compiled 1.6.0rc1 on a woody machine, but it behaves the same way.
>
> I've tried various config modifications, with the same result.
>
> After debugging it for days I am really angry at not finding the root of the problem.
>
> memtest86 found the memory clean, and all other services on the machine are working fine.
>
> How can I fix this and get it working again? We have critical business apps that depend on this service.
Please compile libol and syslog-ng with symbols (--enable-debug for both the libol and the syslog-ng configure scripts), have syslog-ng dump core, and run:

  gdb syslog-ng -c core
  (gdb) bt

The results of the backtrace command are useful for finding the problem. Thanks.

-- 
Bazsi
PGP info: KeyID 9AF8D0A9 Fingerprint CD27 CFB0 802C 0944 9CFD 804E C82C 8EB1
Balazs Scheidler wrote:
> On Fri, Apr 11, 2003 at 01:37:13PM +0200, narancs wrote:
> please compile libol and syslog-ng with symbols (--enable-debug for both libol and syslog-ng configure script), have syslog-ng dump core and do

Done: both are compiled with --enable-debug, and I have installed the new package.

> gdb syslog-ng -c core
> (gdb) bt

The core dump is currently 14 MB, and 'bt' prints thousands of lines like this:

  #3094 0x3243ae3 in ?? ()
  #$line_num $hex_code in ?? ()

> the results of the backtrace command is useful for finding the problems.

In this situation it does not seem to be :-( What could make this core dump unusable? The kernel has the grsec patch applied, but /sbin/syslog-ng has the chpax -srmp flags set (meaning the PaX functions are disabled for this process). What should I try next?
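
A backtrace made up of thousands of `?? ()` frames means gdb could not map the addresses to symbols. Common causes include a core file that does not match the binary given to gdb, a stripped binary, or a corrupted stack; which of these applies here is not established in this thread. As a rough aid, the following Python sketch counts resolved and unresolved frames in saved `bt` output. The function name `summarize_backtrace` and the regular expression are illustrative assumptions, not part of gdb or syslog-ng.

```python
import re
from collections import Counter

# Matches gdb frames such as "#3094 0x3243ae3 in ?? ()" or
# "#0  main () at main.c:10". The address part is optional because
# gdb omits it for some frames.
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\S+)\s*\(")


def summarize_backtrace(text):
    """Count resolved vs. unresolved ('??') frames in gdb 'bt' output."""
    counts = Counter()
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if not match:
            continue  # not a frame line (e.g. wrapped arguments)
        symbol = match.group(2)
        counts["unresolved" if symbol == "??" else "resolved"] += 1
    return dict(counts)


if __name__ == "__main__":
    # Hypothetical sample in the shape quoted in the message above.
    sample = "#3093 0x3243ae3 in ?? ()\n#3094 0x3243ae3 in ?? ()\n"
    print(summarize_backtrace(sample))
```

If nearly every frame is unresolved, it is probably worth checking first that gdb was pointed at the exact binary that produced the core.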
On Tue, Apr 15, 2003 at 04:02:47PM +0200, narancs wrote:
> Balazs Scheidler wrote:
> > Also memtest86 found the memory clean and all other services on the machine are working fine.
>
> Using 'less core' I have just found that this message appears in the core: syslog-ng[pid] Memory corrupted!

No. It usually means that the consistency checks syslog-ng performs found that some part of a data structure was malformed. It seems to be more of a syslog-ng problem than a hardware problem.

-- 
Bazsi
PGP info: KeyID 9AF8D0A9 Fingerprint CD27 CFB0 802C 0944 9CFD 804E C82C 8EB1
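
Paging through a 14 MB core with `less` to spot a message is tedious. A minimal Python sketch of the same idea is shown here: it extracts printable ASCII runs from a binary file and keeps those containing a marker. The helper name `find_strings`, the default path `core`, and the sample bytes in the test are assumptions made for illustration; only the "Memory corrupted!" text comes from the thread.

```python
import re
import sys


def find_strings(data, needle, min_len=4):
    """Return printable ASCII runs in `data` (bytes) that contain `needle` (bytes)."""
    # Runs of at least `min_len` printable characters (space through tilde).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [
        m.group().decode("ascii")
        for m in pattern.finditer(data)
        if needle in m.group()
    ]


if __name__ == "__main__":
    # Path of the core file; "core" is only a default guess.
    path = sys.argv[1] if len(sys.argv) > 1 else "core"
    with open(path, "rb") as fh:
        for hit in find_strings(fh.read(), b"Memory corrupted"):
            print(hit)
```

This only shows that the message text is present in the dump; it says nothing about which data structure failed the consistency check.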
participants (2)
- Balazs Scheidler
- narancs