On Mon, 2007-07-02 at 11:11 +0100, Hari Sekhon wrote:
Hi,
I've got syslog-ng on my mail server, and today I noticed while trying to debug something else that there was nothing in the maillog. I thought this was very odd as there should be lots of stuff, so I checked that syslog-ng was running. It was, but I decided to restart it and then the maillog started to fill up again.
It had been several days since it had written anything to the mail log, but it had continued writing to the /var/log/messages destination during that time.
Here is my config:
options {
    chain_hostnames(off);
    sync(0);
    stats(43200);
    log_fifo_size(30000);
};

source src {
    unix-stream("/dev/log" max-connections(1000));
    internal();
    pipe("/proc/kmsg");
};

destination messages { file("/var/log/messages"); };
destination d_net { tcp("ip_to_logserver" port(logserver_port)); };
destination maillog { file("/var/log/maillog"); };
destination mailerr { file("/var/log/mail.err"); };

filter f_mail { facility(mail); };
filter f_mailerr { facility(mail) and level(err); };
filter f_notmailjunk { not (program("postfix/*") and not level(err)); };

log { source(src); filter(f_mail); destination(maillog); };
log { source(src); filter(f_mailerr); destination(mailerr); };
log { source(src); filter(f_notmailjunk); destination(messages); destination(d_net); };
Possible idea: my logserver was flooded as I was doing something else on it, and it's likely that logs didn't get through (I know, I know - I wanted a separate machine for this of course but my boss holds the money...). Is it possible that the logger filled up locally with backlogged messages for the logserver, and that this caused logs to be lost for the other destination, maillog, because syslog-ng was full or something?
Does this sound like it or not at all?
I am using syslog-ng 1.6.11 by the way.
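For readers following the config above, here is a small Python model of which destinations a message would reach under its three log statements. It is only a sketch and not syslog-ng code: it assumes that program("postfix/*") behaves as an unanchored regular expression and that level(err) matches exactly the err level. The function names are made up for illustration.

```python
import re

# Assumption: program("postfix/*") is an unanchored regular expression.
POSTFIX = re.compile(r"postfix/*")


def f_mail(program, facility, level):
    return facility == "mail"


def f_mailerr(program, facility, level):
    # Assumption: level(err) matches the err level only.
    return facility == "mail" and level == "err"


def f_notmailjunk(program, facility, level):
    return not (POSTFIX.search(program) is not None and level != "err")


# One entry per log statement in the config: (filter, destinations).
LOG_PATHS = [
    (f_mail, ["maillog"]),
    (f_mailerr, ["mailerr"]),
    (f_notmailjunk, ["messages", "d_net"]),
]


def route(program, facility, level):
    """Return the set of destination names a message would be sent to."""
    destinations = set()
    for matches, names in LOG_PATHS:
        if matches(program, facility, level):
            destinations.update(names)
    return destinations
```

Under these assumptions, route("postfix/smtpd", "mail", "info") gives only maillog, so routine postfix traffic never reaches /var/log/messages in the first place.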
syslog-ng uses a completely nonblocking I/O model, and it should process messages as long as it is not blocked on something external. If it is blocked, then it will do nothing at all.

You stated that it was doing everything except writing messages to your mail log.

An strace dump of the syslog-ng process while it was stalled like this could have helped somewhat. Your config says that you have STATS enabled; can you check whether it indicated message loss?

As I see you are using postfix: postfix is often chrooted (Debian uses /var/spool/postfix, for example), and you are not reading "/var/spool/postfix/dev/log". This might be a possible problem.

Personally I've never seen anything like this (but hey, that's what bug reports are for :)

The best practice when something like this happens is to gather all the information possible, like:

* strace syslog-ng (with long strings, e.g. -s 256)
* strace the sending process
* check the syslog transport medium if applicable (e.g. via tcpdump)
* let the syslog-ng process dump core (by enabling core files and sending a QUIT signal)

If messages flow into syslog-ng (confirmed using tcpdump or by stracing the sender process), and it actually receives the messages (confirmed by stracing syslog-ng and checking whether it gets the message via recv), and still it does not write it into any of its destinations while it should have done so (confirmed by checking the configuration file and log files), then it is a bug in syslog-ng.

The problem is that we have not yet confirmed the first couple of steps.

-- Bazsi
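As a rough complement to the strace and tcpdump steps above, the end-to-end path can also be probed from the outside: send one message with the mail facility through the local syslog socket and check whether it turns up in the file. The sketch below uses Python's standard syslog module. The function names, the "probe" ident and the marker format are made up for illustration, and /var/log/maillog is simply the path from the posted config. It says nothing about why a message goes missing, only whether it arrives.

```python
import syslog
import time
import uuid


def send_mail_marker():
    """Send one mail.info message through the local syslog socket.

    Returns the unique marker embedded in the message text.
    """
    marker = "syslog-ng-probe-" + uuid.uuid4().hex
    # The ident "probe" is arbitrary; LOG_MAIL makes the message
    # match a facility(mail) filter.
    syslog.openlog("probe", 0, syslog.LOG_MAIL)
    syslog.syslog(syslog.LOG_INFO, marker)
    syslog.closelog()
    return marker


def marker_in_log(path, marker, tail_bytes=65536):
    """Return True if marker appears in the last tail_bytes of the file."""
    with open(path, "rb") as fh:
        fh.seek(0, 2)                       # jump to the end of the file
        size = fh.tell()
        fh.seek(max(0, size - tail_bytes))  # back up by at most tail_bytes
        return marker.encode() in fh.read()


def probe(path="/var/log/maillog", wait_seconds=2.0):
    """Send a marker, wait briefly, and report whether it reached the file."""
    marker = send_mail_marker()
    time.sleep(wait_seconds)
    return marker_in_log(path, marker)
```

Calling probe() as a user who can read the log file returns False if the marker did not reach the file within the wait, which would be the moment to start the strace and core dump collection described above.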
Yes, I appreciate that you need debug info, but like I said, it started working again after I restarted syslog-ng, so there isn't really any opportunity for me to strace the actual problem.

I know about getting logs from chroots, but this service is not chrooted; it should use the same logging socket as everything else.

Is it possible that one blocked destination could affect another? Last I heard from you on this was that each destination was a separate line and if one blocked it would only block that one destination, i.e. each destination is independent. Is that right?

If I can make this error reproducible then I'll try to get you some straces etc. and possibly a core dump.

Thanks

-h

Hari Sekhon
On Mon, 2007-07-02 at 12:16 +0100, Hari Sekhon wrote:
Yes, I appreciate that you need debug info, but like I said, it started working again after I restarted syslog-ng, so there isn't really any opportunity for me to strace the actual problem.
I know about getting logs from chroots, but this service is not chrooted; it should use the same logging socket as everything else.
Is it possible that one blocked destination could affect another? Last I heard from you on this was that each destination was a separate line and if one blocked it would only block that one destination, i.e. each destination is independent. Is that right?
I don't really remember the context where I said that. Destinations are indeed somewhat independent (they use a different transport state), but I have never seen one destination stall while all the others keep working.
If I can make this error reproducible then I'll try to get you some straces etc. and possibly a core dump.
Thanks.
-- Bazsi
In the syslog-ng 1.6.x series, there is no "reopen" mechanism for disk-based files that become closed. The question is how they get closed.

If they are closed due to an idle timeout (all syslog destinations do this, I think), then when a new message to that destination is processed, the destination will be reopened, even disk files.

If the disk file was closed due to an error (an I/O error of some kind), then the file is never reopened unless the destination goes through an idle timeout and reopen sequence.

In all cases a reload/restart of syslog-ng causes all destinations to be closed and reopened.

I have seen cases where a busy destination (ours was mail, just like yours) becomes closed due to a full filesystem. No other destinations became closed because they did not have messages processed during the interval when the filesystem was full. Something then frees up some space on the filesystem, so new messages are all processed correctly. The mail destination, however, never became idle, and so was never opened again.

I would really like to have file destinations handled just like network destinations and adhere to the reopen configuration setting.

I am not sure how the syslog-ng 2.0.x series behaves in these circumstances.

Evan Rempel.
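Since Evan's account implies that a file destination closed by an error can stay silent until syslog-ng is reloaded, one pragmatic safeguard is to watch for log files that have stopped growing. The following is a minimal sketch of such a check, not anything shipped with syslog-ng: the function names and the one-hour threshold in the usage note are assumptions, and the paths are the ones from the posted config.

```python
import os
import time


def seconds_since_write(path, now=None):
    """Return how many seconds ago the file was last modified."""
    if now is None:
        now = time.time()
    return now - os.stat(path).st_mtime


def stalled_destinations(paths, max_quiet_seconds, now=None):
    """Return the paths not written to for longer than max_quiet_seconds.

    A missing file is reported as stalled too, since nothing is reaching it.
    """
    stalled = []
    for path in paths:
        try:
            quiet = seconds_since_write(path, now)
        except FileNotFoundError:
            stalled.append(path)
            continue
        if quiet > max_quiet_seconds:
            stalled.append(path)
    return stalled
```

Run periodically (from cron, say) as stalled_destinations(["/var/log/maillog", "/var/log/messages"], 3600), a result containing /var/log/maillog on a busy mail server would be a hint to collect the debugging information Bazsi asked for and then reload syslog-ng, for example with SIGHUP.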
Thanks for that, it sounds like a very plausible explanation. In my case, though, the filesystem is monitored for space and I don't remember it becoming full.

Good explanation though; I will keep an eye out for that in case it does happen on any other machines as well.

-h

Hari Sekhon
_______________________________________________
syslog-ng maillist - syslog-ng@lists.balabit.hu
https://lists.balabit.hu/mailman/listinfo/syslog-ng
Frequently asked questions at http://www.campin.net/syslog-ng/faq.html
participants (3)
- Balazs Scheidler
- Evan Rempel
- Hari Sekhon