[syslog-ng] Message loss (probably) within syslog-ng
Vincent Haverlant
vincent at haverlant.org
Sun Mar 5 19:45:56 CET 2006
Hi,
I am seeing message loss similar to what was described in an earlier
thread with the subject "remote logging not reliable". In that case,
however, the remote logging was done over tcp, and that was pointed out
as the probable cause of the loss. In my case only udp is involved.
Description of the infrastructure:
About 2500 unix hosts (Solaris or RedHat) send logs via their original
syslog daemon.
They are set up this way:
=====
kern.info @logserver
*.error;user.none @logserver
auth.info @logserver
=====
The central syslog-ng (1.9.9) server is a Solaris 8 host configured this
way:
=====
options {
time_reopen (1);
time_reap(600);
stats_freq(60);
log_fifo_size (25000000);
keep_hostname (yes);
long_hostnames (no);
use_dns (yes);
dns_cache (yes);
dns_cache_size(3000);
use_fqdn (no); # use the short host name
owner("root"); # Logs owner
group("sys"); # Logs group owner
perm(0755);
dir_owner("sysexplo"); # Directory Owner
dir_group("sys"); # Directory Group
dir_perm(0775); # Directory Perm
create_dirs (yes);
use_time_recvd(yes);
# gc_idle_threshold(1000);
# gc_busy_threshold(100000);
};
#
# Configuration directives for remote logs
#
source lan {
udp (port(514));
};
destination hostfiles {
file("/projets/SYS/sysexplo/syslogdata/$YEAR$MONTH$DAY.logremote/$HOST"
owner("sysexplo")
group("sys")
perm(0755)
template("$ISODATE $HOST $MSG\n")
);
};
log {
source(lan);
destination(hostfiles);
flags(final);
};
#
#local logs
#
source localmsg {
sun-stream("/dev/log" door("/etc/.syslog_door"));
internal();
};
destination syslog {
file("/var/log/syslog");
};
destination authlog {
file("/var/adm/authlog");
};
destination messages {
file("/var/adm/messages");
};
# filters to mimic traditional Solaris logging
filter f_mail {
facility(mail);
};
filter f_auth {
level(info) and facility(auth, authpriv);
};
filter f_not_mail {
not facility(mail);
};
log {
source(localmsg);
filter(f_auth);
destination(authlog);
};
log {
source(localmsg);
filter(f_mail);
destination(syslog);
};
log {
source(localmsg);
filter(f_not_mail);
destination(messages);
};
log {
source(localmsg);
destination(hostfiles);
flags(final);
}; # Also save logs from local host
#
=====
This generates between 3 and 8 GiB of logs per day on the logserver.
Everything went fine until we wanted to check a specific event on a
specific host. That event generates something like 10 lines of auth.info
logs, of which only 5 lines were saved on the log server. It appeared
that some packets were randomly dropped. Since we were not sure whether
this was a network issue (udp) or a syslog-ng issue, we wrote the
following tool to dump what arrives on the udp socket:
====
syslog-dump.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#define BUFFLEN 2047
int main (void) {
    int syslog_socket;
    int syslog_port=514;
    struct sockaddr_in syslog_addr;
    char syslog_buffer[BUFFLEN+1];
    ssize_t len;
    int ret;
    /* create socket */
    syslog_socket=socket(PF_INET, SOCK_DGRAM, 0);
    if (syslog_socket==-1) {
        perror("Error creating socket");
        exit(1);
    }
    /* bind to the syslog port on all local addresses */
    memset(&syslog_addr, 0, sizeof(syslog_addr));
    syslog_addr.sin_family = AF_INET;
    syslog_addr.sin_addr.s_addr = htonl(INADDR_ANY);
    syslog_addr.sin_port = htons(syslog_port);
    ret=bind(syslog_socket, (struct sockaddr *) &syslog_addr,
             sizeof(syslog_addr));
    if (ret==-1) {
        perror("Error binding to socket");
        exit(1);
    }
    /* print each datagram on its own line; stop only on a read error */
    while ((len=read(syslog_socket, syslog_buffer, BUFFLEN))>=0) {
        syslog_buffer[len]='\0';
        printf("%s\n", syslog_buffer);
    }
    perror("Error reading from socket");
    return 1;
}
====
We ran this instead of syslog-ng for a few minutes and, using logger, produced some 20000 lines of log on 2 hosts in 20 seconds. During the tests I also kept receiving the normal messages from my 2500 hosts in addition to my test messages.
I tested it for more than half an hour and never lost any message using my syslog-dump program (something like 20 consecutive tests). The next half hour I repeated the test with syslog-ng. Unfortunately I lost around 10% to 20% of my test messages every time.
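To put an exact number on the loss, the test messages can carry a sequence number that is checked against the saved output afterwards. What follows is only a minimal sketch of that idea, not the procedure used in the tests above: the tag `losstest`, the `seq=N` payload and the three helper functions are hypothetical names chosen for illustration. The priority value 38 is auth.info (facility 4 * 8 + severity 6).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical test-message format: "<PRI>losstest: seq=N".
 * PRI 38 = facility auth (4) * 8 + severity info (6). */
#define TEST_PRI 38
#define TEST_MARK "losstest: seq="

/* Write one numbered test message into buf.
 * Returns the value of snprintf (length the full message needs). */
int format_test_message(char *buf, size_t size, unsigned long seq)
{
    return snprintf(buf, size, "<%d>%s%lu", TEST_PRI, TEST_MARK, seq);
}

/* Extract the sequence number from a received or saved line.
 * Returns -1 if the line is not one of the test messages. */
long parse_seq(const char *line)
{
    const char *p = strstr(line, TEST_MARK);

    if (p == NULL)
        return -1;
    p += strlen(TEST_MARK);
    if (*p < '0' || *p > '9')
        return -1;
    return (long) strtoul(p, NULL, 10);
}

/* Count how many of the sequence numbers 0..total-1 were never seen.
 * seen[i] must be non-zero if a line with seq=i was found. */
size_t count_missing(const unsigned char *seen, size_t total)
{
    size_t i, missing = 0;

    for (i = 0; i < total; i++)
        if (!seen[i])
            missing++;
    return missing;
}
```

The sender would pass each formatted message to sendto() on a udp socket; the checker would run parse_seq() over every line of the saved file (or of the syslog-dump output), mark seen[] accordingly, and report count_missing().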
I started with syslog-ng 1.6.9 but upgraded to 1.9.9 to be sure I had all the improvements. Does anybody have an idea of where my packets get lost? Is there a tuning-only solution, or should I start looking at the code to find my lost packets? I'd appreciate a short explanation of syslog-ng internals if that is the case :)
Aside from that, I would like to say that I'm quite happy with syslog-ng's features.
Regards,
Vincent.