Hello, I'm testing syslog-ng with TCP transport, and I'm a little surprised- perhaps I'm doing something wrong? I'm finding severe message loss, which I had imagined TCP would eliminate. Briefly, my testing paradigm is as follows: On a client, I run a program which generates 5000 messages (LOCAL6/INFO) in rapid succession. The client is a Sparc Ultra 10, Solaris 7, and so uses the reliable door IPC mechanism for local message delivery. The loghost is an IBM 43P, AIX4.3.3, listening on TCP port 514. The messages have sequence numbers in their body, for example: Jan 4 10:34:24 local@peewee/peewee TEST[2362]: (3241) TEST MESSAGE (3241) Jan 4 10:34:24 local@peewee/peewee TEST[2362]: (3242) TEST MESSAGE (3242) I use a perl script to detect which are missing from the local (Sparc) logs and the remote (IBM) logs. Here is a sample result: SPARC (Client) ----------------------------------- Missing trials: Total found: 5000 (missing 0) IBM (remote Loghost) ----------------------------------- Missing trials: 3231-3240 3475-3523 3637-3662 3776-3805 4019-4087 4201-4232 4346-4381 4516-4532 4746-4791 4905-4907 Total found: 4682 (missing 318) Searching the IBM log corroborates this result. Therefore, log messages are being dropped somewhere. As this example shows, they are not lost on the Sparc; only on the IBM loghost. This effect is exacerbated by multiple clients- the client test program can listen on a port for a "go" signal so that I can coordinate a synchronized stress-test on the loghost. Massive losses using UDP are not surprising; but I would expect that with TCP the messages are not lost. Here are excerpts from the client and loghost config files to demostrate my belief that I'm using TCP: Client (SPARC): ----------------------------------- # ... destination d_vader { tcp("130.126.112.253" port(514)); }; destination d_local6 { file("/var/log/syslog/local6"); }; log { source(local); filter(f_local6); destination(d_vader); destination(d_local6); }; Loghost (IBM) ----------------------------------- source local { unix-dgram("/dev/log"); tcp(ip(0.0.0.0) port(514)); internal(); }; #... destination d_local6 { file("/var/log/syslog/local6"); }; log { source(local); filter(f_local6); destination(d_local6); }; Please ignore that I'm calling messages received from a network source "local"-- this is only during testing. Would anybody like to share some insight into this problem? I have another comment: When I run syslog-ng on Solaris (not AIX, though), I get the following result. # /usr/local/sbin/syslog-ng Terminated A daemon process is left running, however, and seems to operate normally (save for the previous issue). Following is 'truss' output for the initial syslog-ng process, containing a few lines preceeding the receipt of a kill signal: open("/etc/passwd", O_RDONLY) = 4 fstat64(4, 0xFFBEEB88) = 0 ioctl(4, TCGETA, 0xFFBEEB14) Err#25 ENOTTY read(4, " r o o t : x : 0 : 1 : S".., 8192) = 756 llseek(4, 0xFFFFFFFFFFFFFD2D, SEEK_CUR) = 33 close(4) = 0 ioctl(3, TCGETA, 0xFFBEF294) Err#25 ENOTTY llseek(3, 0, SEEK_CUR) = 4128 close(3) = 0 fork1() = 2539 signotifywait() = 15 lwp_mutex_wakeup(0xFF3DFDF0) = 0 lwp_mutex_lock(0xFF3DFDF0) = 0 lwp_mutex_wakeup(0xFF14D4D0) = 0 lwp_mutex_lock(0xFF14D4D0) = 0 lwp_mutex_wakeup(0xFF150C58) = 0 lwp_mutex_lock(0xFF150C58) = 0 lwp_sema_post(0xFF00FE78) = 0 lwp_sema_wait(0xFF00FE78) = 0 lwp_mutex_wakeup(0xFF150C58) = 0 lwp_mutex_lock(0xFF150C58) = 0 Received signal #15, SIGTERM [default] siginfo: SIGTERM pid=2539 uid=0 lwp_sigredirect(1, SIGTERM) = 0 *** process killed *** Does anyone know offhand why this should happen? Thanks very much for any advice you can offer. -- Jonathan Marks Systems Administrator, Production Systems Group Computing and Communication Services Office University of Illinois at Urbana-Champaign