[syslog-ng] remote logging not reliable
stucky
stucky101 at gmail.com
Tue Feb 28 20:09:43 CET 2006
guys
I've been pulling my hair out here. I gonna go insane....
Here's the story.
I first setup syslog-ng-1.6.6 for remote syslogging a year ago.
I seemed to work out great - all my stuff appeared to be logged remotely
and everything was dandy.
Then I decided to come up with some sort of check that would tell me if a
machine was not logging remotely anymore
cause of network problems or whatever. I have a perl logchecker which I
basically told to check for the cron.hourly log message
and if it finds it to grep for the source host and add this host to a file.
I was then gonna check this list against a pre-defined array
of hosts that should be logging blablaba....
This is how I discovered that not everything is really logged. All my
clients have this conf file:
options { chain_hostnames(0);
time_reopen(10);
time_reap(360);
log_fifo_size(2048);
sync(1);
stats(0);
create_dirs(yes);
owner(root);
group(root);
perm(0644);
dir_perm(0755);
use_dns(yes);
dns_cache(yes); };
source src { internal();
unix-stream("/dev/log");
file("/proc/kmsg"); };
destination messages { file("/var/log/messages"); };
destination auth { file("/var/log/auth"); };
destination cron { file("/var/log/cron"); };
destination mail { file("/var/log/mail"); };
destination loghost { tcp("**********" port(5000)); };
filter auth { facility(auth, authpriv); };
filter cron { facility(cron); };
filter mail { facility(mail); };
filter nagios { not ( match("Accepted publickey for nagios from ********")
or
match("COMMAND=/usr/local/nagios/home/check_duplex")
or
match("session opened for user nagios") or
match("session closed for user nagios")); };
log { source(src); filter(cron); destination(cron); destination(loghost);
flags(final); };
log { source(src); filter(mail); destination(mail); destination(loghost);
flags(final); };
log { source(src); filter(auth); filter(nagios); destination(auth);
destination(loghost); flags(final); };
log { source(src); filter(nagios); destination(messages);
destination(loghost); };
If I check the cronlog locally on a client I can see that all the hourly
logs are there:
Feb 27 12:01:01 sbeta crond[31964]: (root) CMD (run-parts /etc/cron.hourly)
Feb 27 13:01:01 sbeta crond[9232]: (root) CMD (run-parts /etc/cron.hourly)
Feb 27 14:01:01 sbeta crond[18969]: (root) CMD (run-parts /etc/cron.hourly)
Feb 27 15:01:01 sbeta crond[28630]: (root) CMD (run-parts /etc/cron.hourly)
Feb 27 16:01:01 sbeta crond[5311]: (root) CMD (run-parts /etc/cron.hourly)
Feb 27 17:01:01 sbeta crond[14977]: (root) CMD (run-parts /etc/cron.hourly)
Feb 27 18:01:01 sbeta crond[22335]: (root) CMD (run-parts /etc/cron.hourly)
Feb 27 19:01:01 sbeta crond[31995]: (root) CMD (run-parts /etc/cron.hourly)
Feb 27 20:01:01 sbeta crond[9336]: (root) CMD (run-parts /etc/cron.hourly)
Feb 27 21:01:01 sbeta crond[19002]: (root) CMD (run-parts /etc/cron.hourly)
Feb 27 22:01:01 sbeta crond[28663]: (root) CMD (run-parts /etc/cron.hourly)
Feb 27 23:01:01 sbeta crond[5928]: (root) CMD (run-parts /etc/cron.hourly)
Feb 28 00:01:01 sbeta crond[15495]: (root) CMD (run-parts /etc/cron.hourly)
Feb 28 01:01:01 sbeta crond[25160]: (root) CMD (run-parts /etc/cron.hourly)
Feb 28 02:01:01 sbeta crond[2394]: (root) CMD (run-parts /etc/cron.hourly)
Feb 28 03:01:01 sbeta crond[12087]: (root) CMD (run-parts /etc/cron.hourly)
Feb 28 04:01:01 sbeta crond[21753]: (root) CMD (run-parts /etc/cron.hourly)
Feb 28 05:01:01 sbeta crond[31573]: (root) CMD (run-parts /etc/cron.hourly)
Feb 28 06:01:01 sbeta crond[8840]: (root) CMD (run-parts /etc/cron.hourly)
Feb 28 07:01:01 sbeta crond[18506]: (root) CMD (run-parts /etc/cron.hourly)
Feb 28 08:01:01 sbeta crond[28166]: (root) CMD (run-parts /etc/cron.hourly)
Feb 28 09:01:01 sbeta crond[5433]: (root) CMD (run-parts /etc/cron.hourly)
However, checking the remote cron logs for that machine for that whole of
February shows
this:
Feb 21 12:01:01 sbeta crond[23754]: (root) CMD (run-parts /etc/cron.hourly)
Feb 24 05:01:01 sbeta crond[5012]: (root) CMD (run-parts /etc/cron.hourly)
Feb 24 17:01:01 sbeta crond[26525]: (root) CMD (run-parts /etc/cron.hourly)
Feb 25 03:01:01 sbeta crond[28212]: (root) CMD (run-parts /etc/cron.hourly)
Feb 25 13:01:01 sbeta crond[30890]: (root) CMD (run-parts /etc/cron.hourly)
Feb 26 01:01:01 sbeta crond[20620]: (root) CMD (run-parts /etc/cron.hourly)
Feb 16 19:01:01 sbeta crond[6299]: (root) CMD (run-parts /etc/cron.hourly)
Feb 17 00:01:01 sbeta crond[22138]: (root) CMD (run-parts /etc/cron.hourly)
Feb 17 04:01:01 sbeta crond[28292]: (root) CMD (run-parts /etc/cron.hourly)
Feb 17 08:01:01 sbeta crond[1917]: (root) CMD (run-parts /etc/cron.hourly)
Feb 17 12:01:01 sbeta crond[7812]: (root) CMD (run-parts /etc/cron.hourly)
Feb 17 16:01:01 sbeta crond[14136]: (root) CMD (run-parts /etc/cron.hourly)
Feb 17 19:01:01 sbeta crond[10669]: (root) CMD (run-parts /etc/cron.hourly)
Feb 18 00:01:01 sbeta crond[26489]: (root) CMD (run-parts /etc/cron.hourly)
Feb 18 04:01:01 sbeta crond[32742]: (root) CMD (run-parts /etc/cron.hourly)
Feb 18 08:01:01 sbeta crond[6852]: (root) CMD (run-parts /etc/cron.hourly)
Feb 18 12:01:01 sbeta crond[13106]: (root) CMD (run-parts /etc/cron.hourly)
Feb 18 15:01:01 sbeta crond[9768]: (root) CMD (run-parts /etc/cron.hourly)
Feb 18 20:01:01 sbeta crond[25588]: (root) CMD (run-parts /etc/cron.hourly)
Feb 12 08:01:01 sbeta crond[22554]: (root) CMD (run-parts /etc/cron.hourly)
Feb 12 12:01:01 sbeta crond[28808]: (root) CMD (run-parts /etc/cron.hourly)
Feb 12 16:01:01 sbeta crond[2735]: (root) CMD (run-parts /etc/cron.hourly)
Feb 12 20:01:01 sbeta crond[9063]: (root) CMD (run-parts /etc/cron.hourly)
Feb 13 00:01:01 sbeta crond[15268]: (root) CMD (run-parts /etc/cron.hourly)
Feb 14 17:01:01 sbeta crond[22845]: (root) CMD (run-parts /etc/cron.hourly)
Feb 14 21:01:02 sbeta crond[29177]: (root) CMD (run-parts /etc/cron.hourly)
Feb 15 22:01:01 sbeta crond[12945]: (root) CMD (run-parts /etc/cron.hourly)
Feb 16 05:01:01 sbeta crond[17917]: (root) CMD (run-parts /etc/cron.hourly)
Feb 16 16:01:01 sbeta crond[29758]: (root) CMD (run-parts /etc/cron.hourly)
Feb 16 18:01:01 sbeta crond[16687]: (root) CMD (run-parts /etc/cron.hourly)
As you can see it seemingly randomly logs this stuff now and then. So this
machine hasn't been
sending logs remotely since Feb 21. ? As I said all clients run the same
version of syslog-ng and the
exact same config - yet some of then seem to log cron stuff every hour and
others don't.
I get a log ot those 'Error connecting to remote host AF_INET' messages BUT
i get those for EVERY host so
why is there no consistency ?
I know what everbody is gonna this - your network is hosed but here is the
problem with that theory.
I can log on to that box and I will immediately see this on the loghost in
/var/log/auth
Feb 28 10:43:06 sbeta sshd[21815]: Accepted publickey for stucky from
******** port 47427 ssh2
Feb 28 10:43:06 sbeta sshd(pam_unix)[21819]: session opened for user stucky
by (uid=0)
This will work every time for all other machines as well. I'd understand if
it didn't log at all or with a slight delay
like say 1 minute. Besides it appears cron stuff really gets lost but then
why doesn't any of the auth/authpriv
stuff get lost ? I can't help feeling it has something to do with logging
cron but looking at the config I clearly told
it to log to the local file first and then again to the remote host just
like auth/authpriv stuff.
Here is my logservers config:
options { chain_hostnames(0);
time_reopen(10);
time_reap(360);
log_fifo_size(2048);
use_fqdn(no);
use_dns(yes);
dns_cache(yes);
keep_hostname(yes);
long_hostnames(off);
sync(1);
stats(0);
create_dirs(yes);
perm(0640);
dir_perm(0750);
};
source src { internal();
unix-stream("/dev/log");
file("/proc/kmsg");
udp (ip("*********"));
tcp (ip("*********")
port(5000)
max-connections(1000)
keep-alive(yes)); };
destination messages { file("/var/log/messages"); };
destination auth { file("/var/log/auth"); };
destination cron { file("/var/log/cron"); };
destination mail { file("/var/log/mail"); };
destination arch {
file("/usr/local/var/log_archive/$HOST/$YEAR/$MONTH/$DAY/archive"); };
filter auth { facility(auth, authpriv); };
filter cron { facility(cron); };
filter mail { facility(mail); };
filter messages { not facility(auth, authpriv, cron, mail); };
filter nagios { not ( match("Accepted publickey for nagios from *******") or
match("COMMAND=/usr/local/nagios/home/check_duplex")
or
match("session opened for user nagios") or
match("session closed for user nagios")); };
filter junk { not ( match("Accepted publickey for oracle from ***********")
or
match("Accepted publickey for oracle from ***********")
or
match("Accepted publickey for stucky from ***********")
or
match("COMMAND=/usr/local/nagios/home/check_duplex") or
match(".+reconnecting to LDAP server.+sleeping") or
match("session opened for user root") or
match(".+AF_INET client connected from .+") or
match(".+AF_INET client dropped connection from.+") or
match(".+Connection broken to AF_INET.+")); };
log { source(src); filter(cron); destination(cron); destination(arch);
flags(final); };
log { source(src); filter(mail); destination(mail); destination(arch);
flags(final); };
log { source(src); filter(auth); filter(nagios); destination(auth);
destination(arch); flags(final); };
log { source(src); filter(junk); filter(nagios); destination(messages);
destination(arch); };
And here's the really messed up thing. EVERY time I set up a tcpdump on the
loghost at the full hour
to see whether the box actually sends stuff to it - it does ! Yet, if I wait
again for a while and check the logs
I can see again that a log of hourly cronlogs are skipped.
Can anybody please take a look at my configs and at least tell me that they
are not completely wrong ?
I'm running out of ideas.
--
stucky
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.balabit.hu/pipermail/syslog-ng/attachments/20060228/2eb7b1ca/attachment-0001.html
More information about the syslog-ng
mailing list