Hi everyone, I am using syslog-ng in a system which is supposed to be fail safe to avoid data loss. However I encountered an unexpected delay and messages are lost.. To test it I am proceeding like this: syslog-ng-config: options { sync (0); time_reopen (10); log_fifo_size (1000); long_hostnames (off); use_dns (no); use_fqdn (no); create_dirs (no); keep_hostname (yes); } ... destination d_cron { file("/var/log/cron"); }; ... logger -t tester -p cron.info "power failure test" & tail -f /var/log/cron Waiting aprox. 2 seconds then I power down. The previous command shows the message, but after reboot it is lost. I supposed that it's happening due the fact that the buffers are not flushed to disk. So I tried strace on syslog-ng process. No read/write syscall for the message is made.. Another case is with a cycle of messages: for i in `seq 1 1000`;do logger -t tester -p cron.info "power failure test: $i";echo ">> $i";done After 3-4 seconds all messages are written. But not before that time... I know about the fsync(yes) option and I tried it, but with high load of messages It doesn't work proper (syslog-ng.config becomes corrupted and messages are also lost) There is something I am missing here? Configuration option, maybe? Regards
On Tue, 15 Jan 2008 13:50:20 +0200, dstuxo said:
I am using syslog-ng in a system which is supposed to be fail safe to avoid data loss. However I encountered an unexpected delay and messages are lost.
This turns out to be a *lot* harder to achieve than you would expect...
I know about the fsync(yes) option and I tried it, but with high load of messages It doesn't work proper (syslog-ng.config becomes corrupted and messages are also lost)
On any given hardware, there is an *absolute* upper limit to how many fsync() operations it can support per second - and it's often quite low, on the order of 50-75 per second. This is mainly due to the fact that an fsync will almost always cause a disk seek, which can take up to 10 milliseconds (restricting you to 100 per second). There's only a few ways around this: 0) Many/most disk drives have a rather small (8M to 64M) cache in front of them - if you care about performance, you enable it, if you care about "fail safe", you turn it off, because if you do a "wait 2 seconds then I power down", anything that had made it to the disk's cache but wasn't written out *will go away*. If you need to run it with cache on, you need to make sure that (a) you have a UPS so it never loses power before you can (b) make sure that you properly shut down the disks, including a 'flush cache" and waiting for it to complete, before pulling the power... 1) Tell syslog-ng to only fsync every <insert time unit here>, and be prepared to lose the last few second's worth if there's a crash. 2) Be prepared to spend US$400,000 and up for a high-end EMC or similar disk array that can handle insanely high I/O loads. Part of this cost is the battery backup for the disks, so that even if you "wait 2 seconds and power down", the disks don't actually spin down. 3) Learn to accept that unless you spend lots of money, there's lots of ways that you can lose messages.
Valdis.Kletnieks@vt.edu wrote:
On Tue, 15 Jan 2008 13:50:20 +0200, dstuxo said:
I am using syslog-ng in a system which is supposed to be fail safe to avoid data loss. However I encountered an unexpected delay and messages are lost.
This turns out to be a *lot* harder to achieve than you would expect...
I know about the fsync(yes) option and I tried it, but with high load of messages It doesn't work proper (syslog-ng.config becomes corrupted and messages are also lost)
On any given hardware, there is an *absolute* upper limit to how many fsync() operations it can support per second - and it's often quite low, on the order of 50-75 per second. This is mainly due to the fact that an fsync will almost always cause a disk seek, which can take up to 10 milliseconds (restricting you to 100 per second). There's only a few ways around this:
0) Many/most disk drives have a rather small (8M to 64M) cache in front of them - if you care about performance, you enable it, if you care about "fail safe", you turn it off, because if you do a "wait 2 seconds then I power down", anything that had made it to the disk's cache but wasn't written out *will go away*. If you need to run it with cache on, you need to make sure that (a) you have a UPS so it never loses power before you can (b) make sure that you properly shut down the disks, including a 'flush cache" and waiting for it to complete, before pulling the power...
1) Tell syslog-ng to only fsync every <insert time unit here>, and be prepared to lose the last few second's worth if there's a crash.
2) Be prepared to spend US$400,000 and up for a high-end EMC or similar disk array that can handle insanely high I/O loads. Part of this cost is the battery backup for the disks, so that even if you "wait 2 seconds and power down", the disks don't actually spin down.
3) Learn to accept that unless you spend lots of money, there's lots of ways that you can lose messages.
There are a lot cheaper solutions to this problem. You can purchase a raid controller card with its own battery backed cache, often refferred to as a fast write cache. This cache will immediately respond to a fsync since the data is "permenant" with regards to the storage system capturing it. In reality it may only live for 1 week in the event of a power loss. You can use standard disk behind the conroller, but you should ensure that the disk drives themselves do NOT have the write cache enabled. If it is, then a power loss will loose the data in the disk cache. When the disk drives spin up again, the caching raid card will committ the writes to the disk from its battery backed cache. These cards are in the order of $1-2,000. This means that you can enable the fsync feature of syslog-ng without the huge performance penalty. The caching raid card will coellesce the individual fsync writes into a large write to the actual media. We have seen write speeds twice as fast as read speeds even with fsync applications. You will still have some upper limit which is often around 100,000 IO/sec, so that means that you have a limit of 100,000 syslog messages per second. As an example, see the LSI MegaRAID SAS 8480E controller which has 256MB of battery backed cache that lasts 72 hours without power and supports up to 140,000 I/O per second. You don't have to boot the OS to finish the writes, just get power to the box/drives for 30 seconds and you have it committed to disk. A small portable UPS can do this. I am in no way affiliated with LSI, and have never use the product mentioned. Evan Rempel.
Thank you for your suggestions. However, my mistake... I forgot to tell you that it is about an embedded board, with 4GB compact flash card for storage (CF2IDE adapter). This a development platform and no changes can be made to it at this time. Anyways, there is still one missing part... After few more tests: I changed block I/O scheduler from CFQ to NOOP (more suitable in this case). Result: Why sysklog can handle the same messages very well and syslog-ng not? The same delay when starting to log. This gives me the clue that the problem is not with flushing buffers. I am using syslog-ng-2.0.3 (subjective choice) There is any major improvement (in term of speed) between this version and the last one? ,Regards Evan Rempel wrote:
Valdis.Kletnieks@vt.edu wrote:
On Tue, 15 Jan 2008 13:50:20 +0200, dstuxo said:
I am using syslog-ng in a system which is supposed to be fail safe to avoid data loss. However I encountered an unexpected delay and messages are lost.
This turns out to be a *lot* harder to achieve than you would expect...
I know about the fsync(yes) option and I tried it, but with high load of messages It doesn't work proper (syslog-ng.config becomes corrupted and messages are also lost)
On any given hardware, there is an *absolute* upper limit to how many fsync() operations it can support per second - and it's often quite low, on the order of 50-75 per second. This is mainly due to the fact that an fsync will almost always cause a disk seek, which can take up to 10 milliseconds (restricting you to 100 per second). There's only a few ways around this:
0) Many/most disk drives have a rather small (8M to 64M) cache in front of them - if you care about performance, you enable it, if you care about "fail safe", you turn it off, because if you do a "wait 2 seconds then I power down", anything that had made it to the disk's cache but wasn't written out *will go away*. If you need to run it with cache on, you need to make sure that (a) you have a UPS so it never loses power before you can (b) make sure that you properly shut down the disks, including a 'flush cache" and waiting for it to complete, before pulling the power...
1) Tell syslog-ng to only fsync every <insert time unit here>, and be prepared to lose the last few second's worth if there's a crash.
2) Be prepared to spend US$400,000 and up for a high-end EMC or similar disk array that can handle insanely high I/O loads. Part of this cost is the battery backup for the disks, so that even if you "wait 2 seconds and power down", the disks don't actually spin down.
3) Learn to accept that unless you spend lots of money, there's lots of ways that you can lose messages.
There are a lot cheaper solutions to this problem. You can purchase a raid controller card with its own battery backed cache, often refferred to as a fast write cache. This cache will immediately respond to a fsync since the data is "permenant" with regards to the storage system capturing it. In reality it may only live for 1 week in the event of a power loss.
You can use standard disk behind the conroller, but you should ensure that the disk drives themselves do NOT have the write cache enabled. If it is, then a power loss will loose the data in the disk cache.
When the disk drives spin up again, the caching raid card will committ the writes to the disk from its battery backed cache.
These cards are in the order of $1-2,000.
This means that you can enable the fsync feature of syslog-ng without the huge performance penalty. The caching raid card will coellesce the individual fsync writes into a large write to the actual media.
We have seen write speeds twice as fast as read speeds even with fsync applications.
You will still have some upper limit which is often around 100,000 IO/sec, so that means that you have a limit of 100,000 syslog messages per second.
As an example, see the LSI MegaRAID SAS 8480E controller which has 256MB of battery backed cache that lasts 72 hours without power and supports up to 140,000 I/O per second. You don't have to boot the OS to finish the writes, just get power to the box/drives for 30 seconds and you have it committed to disk. A small portable UPS can do this.
I am in no way affiliated with LSI, and have never use the product mentioned.
Evan Rempel. _______________________________________________ syslog-ng maillist - syslog-ng@lists.balabit.hu https://lists.balabit.hu/mailman/listinfo/syslog-ng Frequently asked questions at http://www.campin.net/syslog-ng/faq.html
On Wed, 2008-01-16 at 16:18 +0200, dstuxo wrote:
Thank you for your suggestions.
However, my mistake... I forgot to tell you that it is about an embedded board, with 4GB compact flash card for storage (CF2IDE adapter). This a development platform and no changes can be made to it at this time.
Anyways, there is still one missing part... After few more tests: I changed block I/O scheduler from CFQ to NOOP (more suitable in this case). Result: Why sysklog can handle the same messages very well and syslog-ng not? The same delay when starting to log. This gives me the clue that the problem is not with flushing buffers.
I am using syslog-ng-2.0.3 (subjective choice)
There is any major improvement (in term of speed) between this version and the last one?
syslog-ng has multiple input/output channels and the writes to disk is also controlled by the main poll loop. e.g. * when a message is received, it is added to the outgoing queue * then when the poll() indicates that the destination file/tcp/udp socket becomes writable, it is written to the target. Using poll() this way can add some latencies. strace has a parameter to display timing information related to syscalls, can you measure how long it takes for syslog-ng to receive the message (recvfrom() syscall) and write it out (write() syscall). I might be able to reduce this latency somewhat, if this is indeed the problem. -- Bazsi
Hi, Here is a new test result.. I also attached some strace logs. ------------------------------------------- * syslog-ng commnad: $ strace -T -tt -f -ff -s 4096 -o /root/localfile syslog-ng & ------------------------------------------- * file desciptors: root@localhost:/root> lsof -p 2506 COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME syslog-ng 2506 root cwd DIR 3,1 4096 46980 /root syslog-ng 2506 root rtd DIR 3,1 4096 2 / syslog-ng 2506 root txt REG 3,1 129364 109801 /usr/sbin/syslog-ng syslog-ng 2506 root mem REG 3,1 110488 31754 /lib/ld-2.5.so syslog-ng 2506 root mem REG 3,1 1207784 32811 /lib/libc-2.5.so syslog-ng 2506 root mem REG 3,1 12916 31647 /lib/libevtlog.so.0.0.0 syslog-ng 2506 root mem REG 3,1 82052 32865 /lib/libpthread-2.5.so syslog-ng 2506 root mem REG 3,1 77132 32851 /lib/libnsl-2.5.so syslog-ng 2506 root mem REG 3,1 594160 31648 /lib/libglib-2.0.so.0.1200.0 syslog-ng 2506 root mem REG 3,1 32632 32896 /lib/libwrap.so.0.7.6 syslog-ng 2506 root mem REG 3,1 28384 31758 /lib/librt-2.5.so syslog-ng 2506 root 0r CHR 1,3 1919 /dev/null syslog-ng 2506 root 1w CHR 1,3 1919 /dev/null syslog-ng 2506 root 2w CHR 1,3 1919 /dev/null syslog-ng 2506 root 3r REG 0,3 0 4026531849 /proc/kmsg syslog-ng 2506 root 4u unix 0xdd9d4b00 8863 /dev/log syslog-ng 2506 root 5w REG 3,1 91370 62735 /var/log/messages syslog-ng 2506 root 7w REG 3,1 6386 62771 /var/log/maillog syslog-ng 2506 root 10w REG 3,1 103498 62724 /var/log/cron.log root@localhost:/root> ------------------------------------------- * The script used for testing: #!/bin/sh for i in `seq 1 10000`; do echo "THIS IS A TEST MESSAGE $i" logger -t tester -p cron.info "THIS IS A TEST MESSAGE $i" done #eof ------------------------------------------- Balazs Scheidler wrote:
syslog-ng has multiple input/output channels and the writes to disk is also controlled by the main poll loop.
e.g.
* when a message is received, it is added to the outgoing queue * then when the poll() indicates that the destination file/tcp/udp socket becomes writable, it is written to the target.
Using poll() this way can add some latencies.
strace has a parameter to display timing information related to syscalls, can you measure how long it takes for syslog-ng to receive the message (recvfrom() syscall) and write it out (write() syscall).
I might be able to reduce this latency somewhat, if this is indeed the problem.
11:52:16.621420 time(NULL) = 1200570736 <0.000008> 11:52:16.621476 gettimeofday({1200570736, 621502}, NULL) = 0 <0.000008> 11:52:16.621533 time(NULL) = 1200570736 <0.000008> 11:52:16.621589 gettimeofday({1200570736, 621616}, NULL) = 0 <0.000009> 11:52:16.621647 time(NULL) = 1200570736 <0.000008> 11:52:16.621703 gettimeofday({1200570736, 621730}, NULL) = 0 <0.000008> 11:52:16.621760 time(NULL) = 1200570736 <0.000008> 11:52:16.621817 gettimeofday({1200570736, 621844}, NULL) = 0 <0.000008> 11:52:16.621875 time(NULL) = 1200570736 <0.000008> 11:52:16.621931 gettimeofday({1200570736, 621958}, NULL) = 0 <0.000009> 11:52:16.621988 time(NULL) = 1200570736 <0.000008> 11:52:16.622045 gettimeofday({1200570736, 622072}, NULL) = 0 <0.000008> 11:52:16.622103 time(NULL) = 1200570736 <0.000008> 11:52:16.622159 gettimeofday({1200570736, 622185}, NULL) = 0 <0.000008> 11:52:16.622215 time(NULL) = 1200570736 <0.000008> 11:52:16.622273 gettimeofday({1200570736, 622301}, NULL) = 0 <0.000009> 11:52:16.622332 poll([{fd=5, events=POLLOUT, revents=POLLOUT}, {fd=4, events=POLLIN}, {fd=3, events=POLLIN}], 3, 599859) = 1 <0.000009> 11:52:16.622405 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=56, ...}) = 0 <0.000010> 11:52:16.622516 write(5, "Jan 17 11:52:16 localhost kernel: EXT3-fs: mounted filesystem with ordered data mode.\n", 86) = 86 <0.000011> 11:52:16.622607 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=56, ...}) = 0 <0.000010> 11:52:16.622716 write(5, "Jan 17 11:52:16 localhost kernel: kjournald starting. Commit interval 5 seconds\n", 81) = 81 <0.000011> 11:52:16.622805 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=56, ...}) = 0 <0.000010> 11:52:16.622913 write(5, "Jan 17 11:52:16 localhost kernel: EXT3 FS on hda4, internal journal\n", 68) = 68 <0.000012> 11:52:16.623012 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=56, ...}) = 0 <0.000010> 11:52:16.623121 write(5, "Jan 17 11:52:16 localhost kernel: EXT3-fs: recovery complete.\n", 62) = 62 <0.000011> 11:52:16.623203 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=56, ...}) = 0 <0.000009> 11:52:16.623311 write(5, "Jan 17 11:52:16 localhost kernel: EXT3-fs: mounted filesystem with ordered data mode.\n", 86) = 86 <0.000011> 11:52:16.623402 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=56, ...}) = 0 <0.000010> 11:52:16.623509 write(5, "Jan 17 11:52:16 localhost kernel: ipmi message handler version 39.1\n", 68) = 68 <0.000012> 11:52:16.623593 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=56, ...}) = 0 <0.000009> 11:52:16.623700 write(5, "Jan 17 11:52:16 localhost kernel: ipmi device interface\n", 56) = 56 <0.000011> 11:52:16.623779 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=56, ...}) = 0 <0.000010> 11:52:16.623887 write(5, "Jan 17 11:52:16 localhost kernel: IPMI System Interface driver.\n", 64) = 64 <0.000011> 11:52:16.623969 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=56, ...}) = 0 <0.000010> 11:52:16.624080 write(5, "Jan 17 11:52:16 localhost kernel: ipmi_si: Unable to find any System Interface(s)\n", 82) = 82 <0.000011> 11:52:16.624169 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=56, ...}) = 0 <0.000010> 11:52:16.624295 write(5, "Jan 17 11:52:16 localhost kernel: Clocksource tsc unstable (delta = -450184348 ns)\n", 83) = 83 <0.000011> 11:52:16.624385 gettimeofday({1200570736, 624412}, NULL) = 0 <0.000008> 11:52:16.624444 poll([{fd=4, events=POLLIN, revents=POLLIN}, {fd=3, events=POLLIN}], 2, 599857) = 1 <26.867669> 11:52:43.492202 gettimeofday({1200570763, 492258}, NULL) = 0 <0.000009> 11:52:43.492306 accept(4, {sa_family=AF_FILE, path="üH"}, [2]) = 6 <0.000016> 11:52:43.492424 setsockopt(6, SOL_SOCKET, SO_KEEPALIVE, [0], 4) = 0 <0.000009> 11:52:43.492485 fcntl64(6, F_GETFL) = 0x2 (flags O_RDWR) <0.000008> 11:52:43.492543 fcntl64(6, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000011> 11:52:43.492599 fcntl64(6, F_GETFD) = 0 <0.000007> 11:52:43.492651 fcntl64(6, F_SETFD, FD_CLOEXEC) = 0 <0.000008> 11:52:43.492720 gettimeofday({1200570763, 492747}, NULL) = 0 <0.000009> 11:52:43.492780 poll([{fd=3, events=POLLIN}, {fd=6, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=POLLIN}], 3, 572989) = 1 <0.000009> 11:52:43.492850 gettimeofday({1200570763, 492875}, NULL) = 0 <0.000008> 11:52:43.492908 read(6, "<19>Jan 17 11:52:43 mail.local: lockmailbox /var/mail/root failed; error code 75 \0", 8192) = 82 <0.000010> 11:52:43.493000 gettimeofday({1200570763, 493026}, NULL) = 0 <0.000008> 11:52:43.493060 time(NULL) = 1200570763 <0.000008> 11:52:43.493130 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=56, ...}) = 0 <0.000009> 11:52:43.493239 gettimeofday({1200570763, 493265}, NULL) = 0 <0.000008> 11:52:43.493304 time(NULL) = 1200570763 <0.000008> 11:52:43.493358 open("/var/log/maillog", O_WRONLY|O_CREAT|O_NOCTTY|O_APPEND|O_NONBLOCK|O_LARGEFILE, 0600) = 7 <0.000022> 11:52:43.493438 fcntl64(7, F_GETFD) = 0 <0.000007> 11:52:43.493490 fcntl64(7, F_SETFD, FD_CLOEXEC) = 0 <0.000008> 11:52:43.493542 fchown32(7, 0, -1) = 0 <0.000013> 11:52:43.493601 fchown32(7, -1, 0) = 0 <0.000009> 11:52:43.493656 fchmod(7, 0600) = 0 <0.000010> 11:52:43.493714 time(NULL) = 1200570763 <0.000008> 11:52:43.493773 read(6, "", 8192) = 0 <0.000008> 11:52:43.493830 close(6) = 0 <0.000012> 11:52:43.493891 gettimeofday({1200570763, 493917}, NULL) = 0 <0.000008> 11:52:43.493954 poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN, revents=POLLIN}], 2, 10000) = 1 <0.000182> 11:52:43.494199 gettimeofday({1200570763, 494227}, NULL) = 0 <0.000008> 11:52:43.494264 accept(4, {sa_family=AF_FILE, path="üH"}, [2]) = 6 <0.000012> 11:52:43.494375 setsockopt(6, SOL_SOCKET, SO_KEEPALIVE, [0], 4) = 0 <0.000008> 11:52:43.494433 fcntl64(6, F_GETFL) = 0x2 (flags O_RDWR) <0.000008> 11:52:43.494515 fcntl64(6, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000010> 11:52:43.494573 fcntl64(6, F_GETFD) = 0 <0.000008> 11:52:43.494624 fcntl64(6, F_SETFD, FD_CLOEXEC) = 0 <0.000008> 11:52:43.494688 gettimeofday({1200570763, 494714}, NULL) = 0 <0.000008> 11:52:43.494747 poll([{fd=3, events=POLLIN}, {fd=6, events=POLLIN, revents=POLLIN}, {fd=4, events=POLLIN}], 3, 9999) = 1 <0.000020> 11:52:43.494828 gettimeofday({1200570763, 494854}, NULL) = 0 <0.000008> 11:52:43.494887 read(6, "<22>Jan 17 11:52:43 sendmail[2403]: m0HA11G6002573: to=<root@localhost.localdomain>, ctladdr=<root@localhost.localdomain> (0/0), delay=01:51:42, xdelay=00:00:30, mailer=local, pri=660874, dsn=4.0.0, stat=Deferred: local mailer (/usr/sbin/mail.local) exited with EX_TEMPFAIL\0", 8192) = 274 <0.000009> 11:52:43.495038 gettimeofday({1200570763, 495064}, NULL) = 0 <0.000008> 11:52:43.495095 time(NULL) = 1200570763 <0.000008> 11:52:43.495154 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=56, ...}) = 0 <0.000010> 11:52:43.495260 gettimeofday({1200570763, 495286}, NULL) = 0 <0.000008> 11:52:43.495319 time(NULL) = 1200570763 <0.000008> 11:52:43.495377 read(6, 0x8077188, 8192) = -1 EAGAIN (Resource temporarily unavailable) <0.000008> 11:52:43.495438 gettimeofday({1200570763, 495464}, NULL) = 0 <0.000008> 11:52:43.495497 poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN}, {fd=6, events=POLLIN}], 3, 9998) = 0 <9.996768> 11:52:53.492361 gettimeofday({1200570773, 492400}, NULL) = 0 <0.000009> 11:52:53.492448 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=56, ...}) = 0 <0.000010> 11:52:53.492579 write(7, "Jan 17 11:52:43 localhost mail.local: lockmailbox /var/mail/root failed; error code 75 \n", 88) = 88 <0.000786> 11:52:53.493450 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=56, ...}) = 0 <0.000010> 11:52:53.493567 write(7, "Jan 17 11:52:43 localhost sendmail[2403]: m0HA11G6002573: to=<root@localhost.localdomain>, ctladdr=<root@localhost.localdomain> (0/0), delay=01:51:42, xdelay=00:00:30, mailer=local, pri=660874, dsn=4.0.0, stat=Deferred: local mailer (/usr/sbin/mail.local) exited with EX_TEMPFAIL\n", 280) = 280 <0.000012> 11:52:53.493724 gettimeofday({1200570773, 493751}, NULL) = 0 <0.000008> 11:52:53.493783 poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN, revents=POLLIN}, {fd=6, events=POLLIN}], 3, 562988) = 1 <12.102484> 11:53:05.596377 gettimeofday({1200570785, 596411}, NULL) = 0 <0.000009> 11:53:05.596457 accept(4, {sa_family=AF_FILE, path="üH"}, [2]) = 8 <0.000017> 11:53:05.596577 setsockopt(8, SOL_SOCKET, SO_KEEPALIVE, [0], 4) = 0 <0.000009> 11:53:05.596638 fcntl64(8, F_GETFL) = 0x2 (flags O_RDWR) <0.000007> 11:53:05.596695 fcntl64(8, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000008> 11:53:05.596750 fcntl64(8, F_GETFD) = 0 <0.000008> 11:53:05.596802 fcntl64(8, F_SETFD, FD_CLOEXEC) = 0 <0.000008> 11:53:05.596875 gettimeofday({1200570785, 596903}, NULL) = 0 <0.000009> 11:53:05.596937 poll([{fd=3, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=POLLIN, revents=POLLIN}], 4, 550885) = 2 <0.000011> 11:53:05.597013 gettimeofday({1200570785, 597040}, NULL) = 0 <0.000009> 11:53:05.597075 accept(4, {sa_family=AF_FILE, path="üH"}, [2]) = 9 <0.000016> 11:53:05.597186 setsockopt(9, SOL_SOCKET, SO_KEEPALIVE, [0], 4) = 0 <0.000008> 11:53:05.597244 fcntl64(9, F_GETFL) = 0x2 (flags O_RDWR) <0.000008> 11:53:05.597298 fcntl64(9, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000007> 11:53:05.597352 fcntl64(9, F_GETFD) = 0 <0.000008> 11:53:05.597404 fcntl64(9, F_SETFD, FD_CLOEXEC) = 0 <0.000008> 11:53:05.597468 read(8, "<78>Jan 17 11:53:05 tester: THIS IS A TEST MESSAGE 1\0", 8192) = 53 <0.000012> 11:53:05.597553 gettimeofday({1200570785, 597580}, NULL) = 0 <0.000009> 11:53:05.597613 time(NULL) = 1200570785 <0.000008> 11:53:05.597678 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=56, ...}) = 0 <0.000010> 11:53:05.597787 gettimeofday({1200570785, 597815}, NULL) = 0 <0.000008> 11:53:05.597856 time(NULL) = 1200570785 <0.000008> 11:53:05.597946 open("/var/log/cron.log", O_WRONLY|O_CREAT|O_NOCTTY|O_APPEND|O_NONBLOCK|O_LARGEFILE, 0600) = 10 <0.000015> 11:53:05.598020 fcntl64(10, F_GETFD) = 0 <0.000007> 11:53:05.598074 fcntl64(10, F_SETFD, FD_CLOEXEC) = 0 <0.000008> 11:53:05.598126 fchown32(10, 0, -1) = 0 <0.000015> 11:53:05.598189 fchown32(10, -1, 0) = 0 <0.000010> 11:53:05.598244 fchmod(10, 0600) = 0 <0.000010> 11:53:05.598302 time(NULL) = 1200570785 <0.000008> 11:53:05.598360 read(8, "", 8192) = 0 <0.000009> 11:53:05.598417 close(8) = 0 <0.000013> 11:53:05.598480 gettimeofday({1200570785, 598507}, NULL) = 0 <0.000008> 11:53:05.598540 poll([{fd=10, events=POLLOUT, revents=POLLOUT}, {fd=3, events=POLLIN}, {fd=6, events=POLLIN}, {fd=9, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=POLLIN, revents=POLLIN}], 5, 550883) = 3 <0.000009> 11:53:05.598627 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=56, ...}) = 0 <0.000010> 11:53:05.598746 write(10, "Jan 17 11:53:05 localhost tester: THIS IS A TEST MESSAGE 1\n", 59) = 59 <0.000017> 11:53:05.598836 gettimeofday({1200570785, 598865}, NULL) = 0 <0.000008> 11:53:05.598896 poll([{fd=3, events=POLLIN}, {fd=6, events=POLLIN}, {fd=9, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=POLLIN, revents=POLLIN}], 4, 550883) = 2 <0.000009> 11:53:05.598972 gettimeofday({1200570785, 598998}, NULL) = 0 <0.000052> 11:53:05.599079 accept(4, {sa_family=AF_FILE, path="üH"}, [2]) = 8 <0.000015> 11:53:05.599190 setsockopt(8, SOL_SOCKET, SO_KEEPALIVE, [0], 4) = 0 <0.000009> 11:53:05.599247 fcntl64(8, F_GETFL) = 0x2 (flags O_RDWR) <0.000008> 11:53:05.599302 fcntl64(8, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000007> 11:53:05.599356 fcntl64(8, F_GETFD) = 0 <0.000008> 11:53:05.599408 fcntl64(8, F_SETFD, FD_CLOEXEC) = 0 <0.000008> 11:53:05.599468 read(9, "<78>Jan 17 11:53:05 tester: THIS IS A TEST MESSAGE 2\0", 8192) = 53 <0.000010> 11:53:05.599547 gettimeofday({1200570785, 599574}, NULL) = 0 <0.000009> 11:53:05.599605 time(NULL) = 1200570785 <0.000008> 11:53:05.599662 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=56, ...}) = 0 <0.000010> 11:53:05.599764 gettimeofday({1200570785, 599792}, NULL) = 0 <0.000009> 11:53:05.599825 time(NULL) = 1200570785 <0.000008> 11:53:05.599881 read(9, "", 8192) = 0 <0.000009> 11:53:05.599936 close(9) = 0 <0.000012> 11:53:05.599997 gettimeofday({1200570785, 600024}, NULL) = 0 <0.000009> 11:53:05.600055 poll([{fd=10, events=POLLOUT, revents=POLLOUT}, {fd=3, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=POLLIN, revents=POLLIN}], 5, 550882) = 3 <0.000009> 11:53:05.600141 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=56, ...}) = 0 <0.000010> 11:53:05.600291 write(10, "Jan 17 11:53:05 localhost tester: THIS IS A TEST MESSAGE 2\n", 59) = 59 <0.000011> 11:53:05.600375 gettimeofday({1200570785, 600402}, NULL) = 0 <0.000008> 11:53:05.600434 poll([{fd=3, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=POLLIN, revents=POLLIN}], 4, 550881) = 2 <0.000010> 11:53:05.600509 gettimeofday({1200570785, 600535}, NULL) = 0 <0.000008> 11:53:05.600571 accept(4, {sa_family=AF_FILE, path="üH"}, [2]) = 9 <0.000037> 11:53:05.600705 setsockopt(9, SOL_SOCKET, SO_KEEPALIVE, [0], 4) = 0 <0.000016> 11:53:05.600771 fcntl64(9, F_GETFL) = 0x2 (flags O_RDWR) <0.000008> 11:53:05.600826 fcntl64(9, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000008> 11:53:05.600879 fcntl64(9, F_GETFD) = 0 <0.000008> 11:53:05.600932 fcntl64(9, F_SETFD, FD_CLOEXEC) = 0 <0.000007> 11:53:05.600990 read(8, "<78>Jan 17 11:53:05 tester: THIS IS A TEST MESSAGE 3\0", 8192) = 53 <0.000011> 11:53:05.601070 gettimeofday({1200570785, 601097}, NULL) = 0 <0.000009> 11:53:05.601129 time(NULL) = 1200570785 <0.000009> 11:53:05.601186 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=56, ...}) = 0 <0.000010> 11:53:05.601289 gettimeofday({1200570785, 601316}, NULL) = 0 <0.000009> 11:53:05.601349 time(NULL) = 1200570785 <0.000008> 11:53:05.601423 read(8, "", 8192) = 0 <0.000008> 11:53:05.601480 close(8) = 0 <0.000012> 11:53:05.601541 gettimeofday({1200570785, 601568}, NULL) = 0 <0.000008> 11:53:05.601600 poll([{fd=10, events=POLLOUT, revents=POLLOUT}, {fd=3, events=POLLIN}, {fd=6, events=POLLIN}, {fd=9, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=POLLIN, revents=POLLIN}], 5, 550880) = 3 <0.000009> 11:53:05.601684 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=56, ...}) = 0 <0.000009> 11:53:05.601795 write(10, "Jan 17 11:53:05 localhost tester: THIS IS A TEST MESSAGE 3\n", 59) = 59 <0.000012> 11:53:05.601876 gettimeofday({1200570785, 601903}, NULL) = 0 <0.000008> 11:53:05.601935 poll([{fd=3, events=POLLIN}, {fd=6, events=POLLIN}, {fd=9, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=POLLIN, revents=POLLIN}], 4, 550880) = 2 <0.000009> 11:53:05.602009 gettimeofday({1200570785, 602036}, NULL) = 0 <0.000008> 11:53:05.602071 accept(4, {sa_family=AF_FILE, path="üH"}, [2]) = 8 <0.000014> 11:53:05.602180 setsockopt(8, SOL_SOCKET, SO_KEEPALIVE, [0], 4) = 0 <0.000008> 11:53:05.602238 fcntl64(8, F_GETFL) = 0x2 (flags O_RDWR) <0.000008> 11:53:05.602293 fcntl64(8, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000007> 11:53:05.602346 fcntl64(8, F_GETFD) = 0 <0.000008> 11:53:05.602398 fcntl64(8, F_SETFD, FD_CLOEXEC) = 0 <0.000007> 11:53:05.602456 read(9, "<78>Jan 17 11:53:05 tester: THIS IS A TEST MESSAGE 4\0", 8192) = 53 <0.000010> 11:53:05.602534 gettimeofday({1200570785, 602561}, NULL) = 0 <0.000009> 11:53:05.602592 time(NULL) = 1200570785 <0.000008> 11:53:05.602649 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=56, ...}) = 0 <0.000010> 11:53:05.602750 gettimeofday({1200570785, 602777}, NULL) = 0 <0.000008> 11:53:05.602811 time(NULL) = 1200570785 <0.000008> 11:53:05.602866 read(9, "", 8192) = 0 <0.000008> 11:53:05.602921 close(9) = 0 <0.000012> 11:53:05.602981 gettimeofday({1200570785, 603008}, NULL) = 0 <0.000009> 11:53:05.603039 poll([{fd=10, events=POLLOUT, revents=POLLOUT}, {fd=3, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=POLLIN, revents=POLLIN}], 5, 550879) = 3 <0.000009> 11:53:05.603123 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=56, ...}) = 0 <0.000010> 11:53:05.603232 write(10, "Jan 17 11:53:05 localhost tester: THIS IS A TEST MESSAGE 4\n", 59) = 59 <0.000012> 11:53:05.603314 gettimeofday({1200570785, 603341}, NULL) = 0 <0.000008> 11:53:05.603373 poll([{fd=3, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN, revents=POLLIN|POLLHUP}, {fd=4, events=POLLIN, revents=POLLIN}], 4, 550878) = 2 <0.000009> 11:53:05.603447 gettimeofday({1200570785, 603474}, NULL) = 0 <0.000009> 11:53:05.603509 accept(4, {sa_family=AF_FILE, path="üH"}, [2]) = 9 <0.000013> 11:53:05.603617 setsockopt(9, SOL_SOCKET, SO_KEEPALIVE, [0], 4) = 0 <0.000008> 11:53:05.603675 fcntl64(9, F_GETFL) = 0x2 (flags O_RDWR) <0.000008> 11:53:05.603729 fcntl64(9, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000008> 11:53:05.603782 fcntl64(9, F_GETFD) = 0 <0.000008> 11:53:05.603835 fcntl64(9, F_SETFD, FD_CLOEXEC) = 0 <0.000007> 11:53:05.603893 read(8, "<78>Jan 17 11:53:05 tester: THIS IS A TEST MESSAGE 5\0", 8192) = 53 <0.000010>
On Thu, 2008-01-17 at 14:12 +0200, dstuxo wrote:
Hi,
Here is a new test result.. I also attached some strace logs.
as it seems syslog-ng waits 10 seconds to write out the log message. Hmm.. this seems to indicate that flush_timeout is 10 seconds and flush_lines is non-zero. The default flush_lines value is 0. Can you post your complete configuration file? -- Bazsi
Since I made some changes to the config since then, just to be sure, I attached another 3 strace logs with corresponding config file. I can see some delay for each test... * The script used for testing: #!/bin/sh for i in `seq 1 10000`; do echo "THIS IS A TEST MESSAGE $i" logger -t tester -p cron.info "THIS IS A TEST MESSAGE $i" done #eof Regards, Balazs Scheidler wrote:
On Thu, 2008-01-17 at 14:12 +0200, dstuxo wrote:
Hi,
Here is a new test result.. I also attached some strace logs.
as it seems syslog-ng waits 10 seconds to write out the log message.
Hmm.. this seems to indicate that flush_timeout is 10 seconds and flush_lines is non-zero.
The default flush_lines value is 0. Can you post your complete configuration file?
# syslog-ng configuration file. # # This should behave pretty much like the original syslog on RedHat. But # it could be configured a lot smarter. # # See syslog-ng(8) and syslog-ng.conf(5) for more information. # # 20000925 gb@sysfive.com # # Updated by Frank Crawford (<Frank.Crawford@ac3.com.au>) - 10 Aug 2002 # - for Red Hat 7.3 # - totally do away with klogd # - add message "kernel:" as is done with klogd. # # Updated by Frank Crawford (<Frank.Crawford@ac3.com.au>) - 22 Aug 2002 # - use the log_prefix option as per Balazs Scheidler's email # # Updated by Jose Pedro Oliveira (<jpo at di.uminho.pt>) - 05 Apr 2003 # - corrected filters 'f_filter2' and 'f_filter6' # these filters were only allowing messages of one specific # priority level; they should be allowing messages from that # priority and upper levels. # # Updated by Jose Pedro Oliveira (<jpo at di.uminho.pt>) - 25 Jan 2005 # - Don't sync the d_mail destination # # Updated by Jose Pedro Oliveira (<jpo at di.uminho.pt>) - 01 Feb 2005 # - /proc/kmsg is a file not a pipe. # (https://lists.balabit.hu/pipermail/syslog-ng/2005-February/006963.html) # options { sync (0); time_reopen (10); log_fifo_size (1000); long_hostnames (off); use_dns (no); use_fqdn (no); create_dirs (no); keep_hostname (yes); }; source s_sys { file ("/proc/kmsg" log_prefix("kernel: ")); unix-stream ("/dev/log"); internal(); # udp(ip(0.0.0.0) port(514)); }; destination d_cons { file("/dev/console"); }; destination d_mesg { file("/var/log/messages"); }; destination d_auth { file("/var/log/secure"); }; destination d_mail { file("/var/log/maillog" sync(10)); }; destination d_spol { file("/var/log/spooler"); }; destination d_boot { file("/var/log/boot.log"); }; destination d_cron { file("/var/log/cron"); }; destination d_mlal { usertty("*"); }; #filter f_filter1 { facility(kern); }; filter f_filter2 { level(info..emerg) and not facility(mail,authpriv,cron); }; filter f_filter3 { facility(authpriv); }; filter f_filter4 { facility(mail); }; filter f_filter5 { level(emerg); }; filter f_filter6 { facility(uucp) or (facility(news) and level(crit..emerg)); }; filter f_filter7 { facility(local7); }; filter f_filter8 { facility(cron); }; #log { source(s_sys); filter(f_filter1); destination(d_cons); }; log { source(s_sys); filter(f_filter2); destination(d_mesg); }; log { source(s_sys); filter(f_filter3); destination(d_auth); }; log { source(s_sys); filter(f_filter4); destination(d_mail); }; log { source(s_sys); filter(f_filter5); destination(d_mlal); }; log { source(s_sys); filter(f_filter6); destination(d_spol); }; log { source(s_sys); filter(f_filter7); destination(d_boot); }; log { source(s_sys); filter(f_filter8); destination(d_cron); };
On Thu, 2008-01-17 at 17:27 +0200, dstuxo wrote:
Since I made some changes to the config since then, just to be sure, I attached another 3 strace logs with corresponding config file.
I can see some delay for each test...
* The script used for testing: #!/bin/sh
for i in `seq 1 10000`; do echo "THIS IS A TEST MESSAGE $i" logger -t tester -p cron.info "THIS IS A TEST MESSAGE $i" done #eof
I've checked the strace dumps that you posted, and I see nothing out of the ordinary here. For example: first three test messages in test_1.txt: 14:41:09.227599 read(6, "<78>Jan 17 14:41:09 tester: THIS IS A TEST MESSAGE 1\0", 8192) = 53 <0.000012> 14:41:09.228875 write(10, "Jan 17 14:41:09 localhost tester: THIS IS A TEST MESSAGE 1\n", 59) = 59 <0.000017> 14:41:09.229576 read(9, "<78>Jan 17 14:41:09 tester: THIS IS A TEST MESSAGE 2\0", 8192) = 53 <0.000010> 14:41:09.230433 write(10, "Jan 17 14:41:09 localhost tester: THIS IS A TEST MESSAGE 2\n", 59) = 59 <0.000011> 14:41:09.231096 read(6, "<78>Jan 17 14:41:09 tester: THIS IS A TEST MESSAGE 3\0", 8192) = 53 <0.000010> 14:41:09.231924 write(10, "Jan 17 14:41:09 localhost tester: THIS IS A TEST MESSAGE 3\n", 59) = 59 <0.000011> test_2.txt: 14:48:06.211659 read(6, "<78>Jan 17 14:48:06 tester: THIS IS A TEST MESSAGE 1\0", 8192) = 53 <0.000011> 14:48:06.212941 write(8, "Jan 17 14:48:06 localhost tester: THIS IS A TEST MESSAGE 1\n", 59) = 59 <0.001210> 14:48:06.214877 read(7, "<78>Jan 17 14:48:06 tester: THIS IS A TEST MESSAGE 2\0", 8192) = 53 <0.000011> 14:48:06.215676 write(8, "Jan 17 14:48:06 localhost tester: THIS IS A TEST MESSAGE 2\n", 59) = 59 <0.000013> 14:48:06.216339 read(6, "<78>Jan 17 14:48:06 tester: THIS IS A TEST MESSAGE 3\0", 8192) = 53 <0.000010> 14:48:06.217114 write(8, "Jan 17 14:48:06 localhost tester: THIS IS A TEST MESSAGE 3\n", 59) = 59 <0.000011> 14:48:06.217798 read(7, "<78>Jan 17 14:48:06 tester: THIS IS A TEST MESSAGE 4\0", 8192) = 53 <0.000010> 14:48:06.218596 write(8, "Jan 17 14:48:06 localhost tester: THIS IS A TEST MESSAGE 4\n", 59) = 59 <0.000011> test_3.txt: 14:54:19.460453 read(8, "<78>Jan 17 14:54:19 tester: THIS IS A TEST MESSAGE 1\0", 8192) = 53 <0.000011> 14:54:19.460891 read(8, 0x8079138, 8192) = -1 EAGAIN (Resource temporarily unavailable) <0.000008> 14:54:19.461267 write(9, "Jan 17 14:54:19 localhost tester: THIS IS A TEST MESSAGE 1\n", 59) = 59 <0.000016> 14:54:19.463589 read(8, "<78>Jan 17 14:54:19 tester: THIS IS A TEST MESSAGE 2\0", 8192) = 53 <0.000009> 14:54:19.464037 read(8, 0x8079138, 8192) = -1 EAGAIN (Resource temporarily unavailable) <0.000009> 14:54:19.464409 write(9, "Jan 17 14:54:19 localhost tester: THIS IS A TEST MESSAGE 2\n", 59) = 59 <0.000013> 14:54:19.466650 read(8, "<78>Jan 17 14:54:19 tester: THIS IS A TEST MESSAGE 3\0", 8192) = 53 <0.000010> 14:54:19.467075 read(8, 0x8079138, 8192) = -1 EAGAIN (Resource temporarily unavailable) <0.000008> 14:54:19.467447 write(9, "Jan 17 14:54:19 localhost tester: THIS IS A TEST MESSAGE 3\n", 59) = 59 <0.000014> The latency between the read and the accompanying write system call is about 0.001sec. If syslog-ng issues the write system call, then it is up to the kernel and the disks to get it written. You can ask syslog-ng to issue an fsync on the file(), but as you mention, that seriously degrades performance. Or was I missing something? syslog-ng received the message and wrote it to a destination file within a couple of msecs. If you turn off power after 2-3 seconds, and the message is lost, it is probably not syslog-ng's fault. -- Bazsi
Hi again, I forgot to say that the test from previous message was made with version 2.0.7
participants (4)
-
Balazs Scheidler
-
dstuxo
-
Evan Rempel
-
Valdis.Kletnieks@vt.edu