[Bug 164] New: syslog-ng sql destination starts dropping all messages
https://bugzilla.balabit.com/show_bug.cgi?id=164

           Summary: syslog-ng sql destination starts dropping all messages
           Product: syslog-ng
           Version: 3.3.x
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: unspecified
         Component: syslog-ng
        AssignedTo: bazsi@balabit.hu
        ReportedBy: bugzilla.syslogng@feystorm.net
Type of the Report: ---
   Estimated Hours: 0.0

syslog-ng 3.3.4 with afsql explicit commit deadlock patch applied
libdbi 0.8.3

I have a box writing to an SQLite3 database, and it seems that every few days, syslog-ng gets into a state where it starts dropping all messages supposed to be written to the database. There are no errors written to the syslog-ng log.

# syslog-ng-ctl stats
SourceName;SourceId;SourceInstance;State;Type;Number
...
dst.sql;d_sqlite#0;sqlite3,,,/var/log/logs.sql3,logs;a;dropped;61372
dst.sql;d_sqlite#0;sqlite3,,,/var/log/logs.sql3,logs;a;stored;30
destination;d_sqlite;;a;processed;122885

destination d_sqlite {
    sql(
        type('sqlite3')
        database("/var/log/logs.sql3")
        table("logs")
        columns("time", "time_r", "host", "facility", "priority", "program", "pid", "tag", "message")
        values("$S_UNIXTIME", "$R_UNIXTIME", "$FULLHOST", "$FACILITY_NUM", "$LEVEL_NUM", "$PROGRAM", "$PID", "$DBTAG", "$MSG")
        null("")
        flags(explicit-commits)
        flush_lines(10)
        flush_timeout(200)
        log_fifo_size(30)
        #debug
    );
};

(gdb) info threads
  Id   Target Id         Frame
  2    Thread 0x3b1963e2700 (LWP 606) "syslog-ng" 0x000003b1959e84bc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
* 1    Thread 0x3b1963e5b00 (LWP 605) "syslog-ng" 0x000003b19572fd73 in epoll_wait () from /lib64/libc.so.6
(gdb) bt
#0  0x000003b19572fd73 in epoll_wait () from /lib64/libc.so.6
#1  0x000003b195fa176d in iv_epoll_poll (numfds=36, active=0x3c985736130, msec=3600000) at iv_method_epoll.c:73
#2  0x000003b195fa0d3b in iv_main () at iv_main.c:265
#3  0x000003b195f764d1 in main_loop_run () at mainloop.c:731
#4  0x0000004ff3a61d08 in main (argc=1, argv=0x3c9857362b8) at main.c:260
(gdb) thread 2
[Switching to thread 2 (Thread 0x3b1963e2700 (LWP 606))]
#0  0x000003b1959e84bc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
(gdb) bt full
#0  0x000003b1959e84bc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x000003b192b16170 in afsql_dd_database_thread (arg=0x4ff3cb5e00) at afsql.c:900
        self = 0x4ff3cb5e00
#2  0x000003b195f784d5 in worker_thread_func (st=0x4ff3c82570) at misc.c:623
        p = 0x4ff3c82570
        res = 0x0
        mask = {__val = {81923, 0 <repeats 15 times>}}
#3  0x000003b195c77dc6 in ?? () from /usr/lib64/libglib-2.0.so.0
No symbol table info available.
#4  0x000003b1959e3b2a in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#5  0x000003b19572f71d in clone () from /lib64/libc.so.6
No symbol table info available.
(gdb) frame 1
#1  0x000003b192b16170 in afsql_dd_database_thread (arg=0x4ff3cb5e00) at afsql.c:900
900     afsql.c: No such file or directory.
        in afsql.c
(gdb) print *self
$1 = {super = {super = {super = {ref_cnt = {counter = 1}, flags = 1, cfg = 0x4ff3c7c4c0, pipe_next = 0x0, queue_data = 0x0,
        queue = 0x3b192b16f02 <afsql_dd_queue>, init = 0x3b192b165b5 <afsql_dd_init>, deinit = 0x3b192b16d42 <afsql_dd_deinit>,
        free_fn = 0x3b192b1703d <afsql_dd_free>, notify = 0}, optional = 0, group = 0x4ff3cb1420 "d_sqlite",
      id = 0x4ff3cb14b0 "d_sqlite#0", plugins = 0x0, drv_next = 0x0}, acquire_queue_data = 0x0,
    acquire_queue = 0x3b195f5715c <log_dest_driver_acquire_queue_method>, release_queue_data = 0x0,
    release_queue = 0x3b195f5724f <log_dest_driver_release_queue_method>, queues = 0x4ff3c724e0, log_fifo_size = 30,
    throttle = 0}, type = 0x4ff3cb5da0 "sqlite3", host = 0x4ff3cb5de0 "", port = 0x4ff3cb5fe0 "",
  user = 0x4ff3cb6000 "syslog-ng", password = 0x4ff3cb6020 "", database = 0x4ff3cb6040 "/var/log/logs.sql3",
  encoding = 0x4ff3cb6060 "UTF-8", columns = 0x4ff3cb38c0, values = 0x4ff3c72580, indexes = 0x0, table = 0x4ff3cb6080,
  fields_len = 9, fields = 0x4ff3cb17a0, null_value = 0x4ff3cb63a0 "", time_reopen = 60, num_retries = 3,
  flush_lines = 10, flush_timeout = 200, flush_lines_queued = 0, flags = 1, session_statements = 0x0,
  template_options = {ts_format = 0, time_zone = {0x0, 0x0}, time_zone_info = {0x4ff3cb2850, 0x4ff3cb2870},
    frac_digits = 0}, dropped_messages = 0x4ff3cb14d0, stored_messages = 0x4ff3cb14d8, db_thread = 0x4ff3c82590,
  db_thread_mutex = 0x4ff3c82540, db_thread_wakeup_cond = 0x4ff3c82500, db_thread_terminate = 0,
  db_thread_suspended = 0, db_thread_suspend_target = {tv_sec = 0, tv_usec = 0}, queue = 0x4ff3cb1560,
  seq_num = 61484, pending_msg = 0x0, pending_msg_ack_needed = 0, dbi_ctx = 0x4ff3c88ad0,
  validated_tables = 0x4ff3c77d80, failed_message_counter = 0}

For some reason gdb is unable to work with a gcore file I've taken, only with the live process, so I can get further information, but I may have to wait a day or two for the issue to recur.

-- 
Configure bugmail: https://bugzilla.balabit.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching all bug changes.
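Since the report notes that nothing is written to the syslog-ng log when the drops begin, one way to notice the condition is to watch the `dropped` counters in the `syslog-ng-ctl stats` output. The following is a minimal sketch, assuming the semicolon-separated layout shown in the report; the function name `dropped_counters` and the `SAMPLE` constant are hypothetical, and the lines elided with `...` in the report are simply left out of the sample.

```python
import csv
import io

# Sample rows copied from the report; the rows elided there ("...") are omitted.
SAMPLE = """\
SourceName;SourceId;SourceInstance;State;Type;Number
dst.sql;d_sqlite#0;sqlite3,,,/var/log/logs.sql3,logs;a;dropped;61372
dst.sql;d_sqlite#0;sqlite3,,,/var/log/logs.sql3,logs;a;stored;30
destination;d_sqlite;;a;processed;122885
"""


def dropped_counters(stats_text):
    """Return {SourceId: count} for every 'dropped' counter above zero."""
    # The first row of the stats output is the header, so DictReader can use it.
    reader = csv.DictReader(io.StringIO(stats_text), delimiter=";")
    return {
        row["SourceId"]: int(row["Number"])
        for row in reader
        if row["Type"] == "dropped" and int(row["Number"]) > 0
    }


if __name__ == "__main__":
    print(dropped_counters(SAMPLE))
```

In practice the text could come from running `syslog-ng-ctl stats` periodically and comparing the counter against its previous value, since a counter that keeps growing is the symptom described above.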
https://bugzilla.balabit.com/show_bug.cgi?id=164

--- Comment #1 from Balazs Scheidler <bazsi@balabit.hu> 2012-02-29 12:42:17 ---
Maybe the error that caused the transaction to fail persists, and causes syslog-ng not to be able to write messages. But it's just a wild guess.

I'll try to work on my syslog-ng related backlog in the coming days, but I'm quite distracted at the moment.
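As an illustration of how a failed transaction could keep causing errors, here is a small sketch using Python's standard `sqlite3` module. It shows only SQLite's own behaviour: a statement that fails inside an explicit transaction leaves that transaction open, and a later `BEGIN` is rejected until a `ROLLBACK` is issued. It is an assumption that anything similar happens in afsql or libdbi; the thread does not confirm it, and the table and function names are hypothetical.

```python
import sqlite3


def failed_transaction_demo():
    """Show that an error inside BEGIN...COMMIT leaves the transaction open."""
    # isolation_level=None: the module issues no implicit BEGIN/COMMIT,
    # so the explicit statements below are the only transaction control.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE logs (message TEXT)")

    conn.execute("BEGIN")
    try:
        # Hypothetical failure: the target table does not exist.
        conn.execute("INSERT INTO no_such_table VALUES ('x')")
    except sqlite3.OperationalError:
        pass  # the statement failed, but no rollback has happened

    still_open = conn.in_transaction

    try:
        conn.execute("BEGIN")  # rejected while the old transaction is open
        second_begin_error = None
    except sqlite3.OperationalError as exc:
        second_begin_error = str(exc)

    # Only after an explicit ROLLBACK can a new transaction start.
    conn.execute("ROLLBACK")
    conn.execute("BEGIN")
    conn.execute("INSERT INTO logs VALUES ('ok')")
    conn.execute("COMMIT")
    stored = conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
    conn.close()
    return still_open, second_begin_error, stored


if __name__ == "__main__":
    print(failed_transaction_demo())
```

If something like this were the cause, every later batch would fail at its `BEGIN` until the stale transaction was rolled back, which would match a destination that drops everything. That remains a guess, as the comment itself says.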
https://bugzilla.balabit.com/show_bug.cgi?id=164

--- Comment #2 from Patrick <bugzilla.syslogng@feystorm.net> 2012-02-29 13:37:59 ---
I have the internal syslog-ng logs going out to a separate file, and this file is still being written to: when I run `syslog-ng-ctl stats`, I receive messages such as

Feb 27 07:53:34 storm.feystorm.net syslog-ng[605]: EOF on control channel, closing connection;
https://bugzilla.balabit.com/show_bug.cgi?id=164

--- Comment #3 from Gergely Nagy <algernon@balabit.hu> 2012-02-29 14:54:17 ---
(In reply to comment #2)
> I have the internal syslog-ng logs going out to a separate file, and this file is still being written to: when I run `syslog-ng-ctl stats`, I receive messages such as
>
> Feb 27 07:53:34 storm.feystorm.net syslog-ng[605]: EOF on control channel, closing connection;

That is "normal", even if a bit misleading. It is harmless: the message is logged whenever the syslog-ng-ctl program disconnects from syslog-ng.
https://bugzilla.balabit.com/show_bug.cgi?id=164

--- Comment #4 from Patrick <bugzilla.syslogng@feystorm.net> 2012-02-29 14:57:45 ---
(In reply to comment #3)
> (In reply to comment #2)
> > I have the internal syslog-ng logs going out to a separate file, and this file is still being written to: when I run `syslog-ng-ctl stats`, I receive messages such as
> >
> > Feb 27 07:53:34 storm.feystorm.net syslog-ng[605]: EOF on control channel, closing connection;
>
> That is "normal", even if a bit misleading. It is harmless: the message is logged whenever the syslog-ng-ctl program disconnects from syslog-ng.

Yes, I know. I was only pointing out that the syslog-ng log was still working. So if there were errors writing to the database, they should be showing up in that log, but they're not.