Thanks for the detailed report. Seems to be a use after free or double free in the error path. I'll have a closer look.

On Jun 15, 2013 2:15 AM, "Johnson, Chris (HP TippingPoint Roseville)" <chris.johnson3@hp.com> wrote:

Hi all,

I've come across a situation where syslog-ng (3.3.3 and 3.3.9) aborts after trying to write to a disk that has no space and then is cleaned up.

 

I've been able to reproduce the 'error' condition with the following Set up:

In the config file:

###########################################################

# test log destination

#

filter f_test{program("LOGID_99-*" type("glob"));};

destination d_test { file("/var/testpartition/test.log" perm(0644) flags(no-multi-line)); };

log {

        source(s_local);

        filter(f_test);

        destination(d_test);

        flags(final);

};

 

1)      Fill up the '/var/testpartition' disk:

root@device:/var/testpartition# dd if=/dev/zero of=/var/testpartition/foo bs=1M

dd: writing '/var/testpartition/foo': No space left on device

3531+0 records in

3529+1 records out

2)      Send messages ('TEST: pre I/O error')to 'test.log' until syslog-ng complains:

2013-06-14 22:41:09.623 [device] [syslog-ng-ERROR:] "I/O error occurred while writing; fd='22', error='No space left on device (28)'"

2013-06-14 22:41:09.623 [device] [syslog-ng-NOTICE:] "Suspending write operation because of an I/O error; fd='22', time_reopen='60'"

3)      Restart syslog-ng with a SIGHUP

root@device:/var/testpartition# kill -s HUP $(cat /var/run/syslog-ng.pid)

 

Program received signal SIGHUP, Hangup.

0x00007f38ee8dad63 in __epoll_wait_nocancel () from /xxx/xxx/xxx/xxx/xxx/xxx/lib/libc.so.6

(gdb) continue

Continuing.

4)      Send another message ('TEST: post I/O error; post SIGHUP')to syslog-ng (it complains):

2013-06-14 22:41:37.147 [device] [syslog-ng-NOTICE:] "Configuration reload request received, reloading configuration;"

2013-06-14 22:41:59.610 [device] [syslog-ng-ERROR:] "I/O error occurred while writing; fd='23', error='No space left on device (28)'"

2013-06-14 22:41:59.610 [device] [syslog-ng-NOTICE:] "Suspending write operation because of an I/O error; fd='23', time_reopen='60'"

5)      Clean up '/var/testpartition' disk:

root@device:/var/testpartition# rm foo

6)      Send another message ('TEST: post I/O error; post SIGHUP 2'); syslog-ng aborts:

Program received signal SIGABRT, Aborted.

0x00007f38ee83eda5 in raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64

64      ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.

        in ../nptl/sysdeps/unix/sysv/linux/raise.c

(gdb) bt

#0  0x00007f38ee83eda5 in raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64

#1  0x00007f38ee8402c3 in abort () at abort.c:88

#2  0x00007f38ee87af18 in __libc_message (do_abort=2, fmt=0x7f38ee927cc8 "*** glibc detected *** %s: %s: 0x%s ***\n")

    at ../sysdeps/unix/sysv/linux/libc_fatal.c:170

#3  0x00007f38ee8804b8 in malloc_printerr (action=2, str=0x7f38ee927dd0 "double free or corruption (!prev)",

    ptr=<value optimized out>) at malloc.c:5891

#4  0x00007f38ee8825c6 in __libc_free (mem=0x7f38ee91e820) at malloc.c:3626

#5  0x00007f38ef7d989a in ?? ()

#6  0x0000000000401800 in ?? () at elf-init.c:99

#7  0x00007f38efcf0257 in log_proto_file_writer_flush (s=0x674de0) at logproto.c:306

#8  0x00007f38efcf04a8 in log_proto_file_writer_post (s=0x674de0,

    msg=0x6558a0 "2013-06-14T22:42:30.476+00:00 device LOGID_99-INFO: TEST: post I/O error; post SIGHUP 2\n", msg_len=89,

    consumed=0x7fffa53e570c) at logproto.c:380

#9  0x00007f38efcfc274 in log_proto_post (s=0x674de0,

    msg=0x6558a0 "2013-06-14T22:42:30.476+00:00 device LOGID_99-INFO: TEST: post I/O error; post SIGHUP 2\n", msg_len=89,

    consumed=0x7fffa53e570c) at logproto.h:95

#10 0x00007f38efcfc095 in log_writer_flush (self=0x655a30, flush_mode=LW_FLUSH_NORMAL) at logwriter.c:983

#11 0x00007f38efcf9be3 in log_writer_work_perform (s=0x655a30) at logwriter.c:129

#12 0x00007f38efcf9efe in log_writer_io_flush_output (s=0x655a30) at logwriter.c:198

#13 0x00007f38efd238b5 in iv_run_tasks (st=0x6036a0) at iv_task.c:46

#14 0x00007f38efd22a86 in iv_main () at iv_main.c:266

#15 0x00007f38efcfedf9 in main_loop_run () at mainloop.c:736

#16 0x00000000004017c9 in main (argc=1, argv=0x7fffa53e5968) at main.c:263

(gdb)

 

The last two test messages are being written (run together with the last partial message that generated the 'disk full' I/O error):

root@device:/var/testpartition# tail test.log

2013-06-14T22:40:58.140+00:00 device LOGID_99-INFO: TEST: pre I/O error

2013-06-14T22:40:59.150+00:00 device LOGID_99-INFO: TEST: pre I/O error

2013-06-14T22:41:00.180+00:00 device LOGID_99-INFO: TEST: pre I/O error

2013-06-14T22:41:01.176+00:00 device LOGID_99-INFO: TEST: pre I/O error

2013-06-14T22:41:02.170+00:00 device LOGID_99-INFO: TEST: pre I/O error

2013-06-14T22:41:03.010+00:00 device LOGID_99-INFO: TEST: pre I/O error

2013-06-14T22:41:03.930+00:00 device LOGID_99-INFO: TEST: pre I/O error

2013-06-14T22:41:05.040+00:00 device LOGID_99-INFO: TEST: pre I/O error

2013-06-14T22:41:06.180+00:00 device LOGID_99-INFO: TEST: pre I/O error

2013-06-14T22:41:07.140+00:00 device LOGID_99-INFO: T2013-06-14T22:42:30.476+00:00 device LOGID_99-INFO: TEST: post I/O error; post SIGHUP 2013-06-14T22:42:30.476+00:00 device LOGID_99-INFO: TEST: post I/O error; post SIGHUP 2

root@device:/var/testpartition#

 

Thanks,

Chris

----------------------------------------

Christopher Johnson

chris.johnson3@hp.com

HP Software - Security Product Group

(916) 785-2817

----------------------------------------

 


______________________________________________________________________________
Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng
FAQ: http://www.balabit.com/wiki/syslog-ng-faq