Losing too many remotely sent logs
Hello there,

I started playing around with syslog-ng 3.3.4 OSE a few days ago, but I'm still experiencing some trouble. First of all, we want to use syslog-ng to send all of our logs via UDP to a central syslog server. This of course includes syslog messages, Apache logs and custom-generated application logs. These logs are generated by 400 clients, which produce a minimum of 300 million log lines a day.

The problem is really simple: I'm losing log lines :P Most of the time everything goes well, but when the log volume peaks, 1-5% of the logs get lost.

Last night the stats of the server and of a client reported 0 drops, but when I counted the lines I found that some were missing. The server has 24 GB of RAM and 8 cores, and I can rule out a network problem for sure.

So now to my questions: does anyone have an idea where I can tweak my config, or where I should look for more clues? Is TCP the only way around it?

I've attached my syslog server config. The so_rcvbuf buffer is the same size as the OS net.core.rmem settings. And as described in the various BalaBit blog posts, I have already played around with log_fetch_limit and flush_lines.
syslog-ng.conf:

@version: 3.3

options {
    threaded(yes);
    owner("root");
    group("root");
    perm(0660);
    dir_owner("root");
    dir_group("root");
    dir_perm(0770);
    create_dirs(yes);
    stats_freq(600);
    stats_level(2);
    chain_hostnames(yes);
    normalize_hostnames(yes);
    check_hostname(yes);
    dns_cache(yes);
    dns_cache_size(16384);
    dns_cache_expire(3600);
    dns_cache_expire_failed(60);
    log_msg_size(16384);
    log_fifo_size(100000);
    use_fqdn(yes);
    #disabled 4 debugging
    # flush_lines(200);
};

source s_src {
    unix-dgram("/dev/log");
    internal();
    file("/proc/kmsg" program_override("kernel"));
};

source s_net {
    udp(
        log_fetch_limit(400)
        so_rcvbuf(51200000)
        keep_hostname(yes)
        keep_timestamp(no)
        ip("10.8.4.10")
        port(514)
    );
    tcp(
        so_rcvbuf(51200000)
        so_keepalive(yes)
        keep_hostname(no)
        keep_timestamp(no)
        ip("10.8.4.10")
        port(514)
    );
    syslog();
};

filter f_syslog {
    not program(access.log) and
    not program(error.log) and
    not program(beetle.log) and
    not program(edge.log);
};

filter f_apache {
    program(access.log) or
    program(error.log);
};

filter f_applogs {
    program(beetle.log) or
    program(edge.log);
};

template t_plain {
    template("$MSG\n");
    template_escape(no);
};

destination d_messages { file("/var/log/messages"); };
destination d_remote { file("/log/syslog/${R_YEAR}/${R_MONTH}/${R_DAY}/$HOST"); };
destination d_apache { file("/log/apache/${R_YEAR}/${R_MONTH}/${R_DAY}/$HOST/$PROGRAM" template(t_plain)); };
destination d_applogs { file("/log/applogs/${R_YEAR}/${R_MONTH}/${R_DAY}/$HOST/$PROGRAM" template(t_plain)); };

log {
    source(s_src);
    destination(d_messages);
};

log {
    source(s_net);
    filter(f_syslog);
    destination(d_remote);
};

log {
    source(s_net);
    filter(f_apache);
    destination(d_apache);
};

log {
    source(s_net);
    filter(f_applogs);
    destination(d_applogs);
};

Thanks
Daniel Neubacher
If possible, I would try swapping the $HOST macro for $SOURCEIP to avoid doing any DNS lookups, cached or not. It's unlikely to help, but it sounds like you've already tried the basic tuning things. I will say that I'm very surprised you're losing log lines. What is your peak logs per second, and how long are the peaks?

On Fri, Mar 2, 2012 at 3:40 AM, Daniel Neubacher <daniel.neubacher@xing.com> wrote:
[...]
______________________________________________________________________________ Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng FAQ: http://www.balabit.com/wiki/syslog-ng-faq
Thanks for the answer. Disabling DNS would be really painful. I will play around some more today and try it as a last resort. The baseline for a web server is 146k logs per hour; the minimum is 22k and the maximum 365k. The peaks only happen at night, for 3-4 hours, because of the local mail traffic. Today I will roll out my TCP logging config, but I'm not too happy about that.

-----Original Message-----
From: syslog-ng-bounces@lists.balabit.hu [mailto:syslog-ng-bounces@lists.balabit.hu] On Behalf Of Martin Holste
Sent: Friday, March 2, 2012 16:00
To: Syslog-ng users' and developers' mailing list
Subject: Re: [syslog-ng] Losing to much remote sent logs
[...]
Given that you have only file() destinations, performance should not be an issue, so something is definitely wrong. One other shot in the dark: is your log server a VM, and if so, is there any chance that it's not getting enough resources because of the extra mail traffic?

On Mon, Mar 5, 2012 at 4:30 AM, Daniel Neubacher <daniel.neubacher@xing.com> wrote:
[...]
No, it's not a VM. But TCP syslog is performing pretty well. I just wonder what happens if the syslog server isn't reachable for a long time. I tested it with another syslog server and it went rogue. Today it didn't lose a line, so TCP is pretty much the answer :P

-----Original Message-----
From: syslog-ng-bounces@lists.balabit.hu [mailto:syslog-ng-bounces@lists.balabit.hu] On Behalf Of Martin Holste
Sent: Monday, March 5, 2012 17:59
To: Syslog-ng users' and developers' mailing list
Subject: Re: [syslog-ng] Losing to much remote sent logs
[...]
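On the question of what happens when the central server is unreachable for a long time: as far as I know, a syslog-ng client holds messages for a tcp() destination in an in-memory output queue whose length is set by log_fifo_size(), and retries the connection every time_reopen() seconds; once the queue is full, further messages are dropped. The following is only a client-side sketch under those assumptions. The server address is taken from the config earlier in the thread, while the names s_local and d_central and both numeric values are made up for illustration.

```
@version: 3.3

options {
    # Seconds to wait before retrying a broken connection (illustrative value).
    time_reopen(10);
};

source s_local {
    unix-dgram("/dev/log");
    internal();
};

destination d_central {
    tcp("10.8.4.10" port(514)
        # Number of messages kept in memory while the server is
        # unreachable; size it to the outage you want to survive.
        log_fifo_size(1000000)
    );
};

log { source(s_local); destination(d_central); };
```

If the queue does overflow during an outage, the client's stats output should report dropped messages for d_central, so the loss is at least visible rather than silent.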
On 06/03/2012 11:49, Daniel Neubacher wrote:
[...]

Hello,

For a benchmark, I stressed (10,000 to 20,000 msg/sec) a syslogd server that forwards all the logs it receives to a syslog-ng server over UDP. I reached a loss rate of 90% of the messages.

I think UDP is a very good way to have problems with your log management solution.

++
Christophe
That doesn't sound right at all. We get much better performance with UDP: zero drops at around 15k/sec with a lot of bursting to over 20k.
Do you have any tuning magic in your syslog-ng config? Or any other configuration changes in the OS? Or do most of your logs come from a single server? My setup here is pretty much normal: a big server, gigabit Ethernet and 400 log hosts. I don't know where to look for the problem.

-----Original Message-----
From: syslog-ng-bounces@lists.balabit.hu [mailto:syslog-ng-bounces@lists.balabit.hu] On Behalf Of Martin Holste
Sent: Tuesday, March 6, 2012 15:44
To: Syslog-ng users' and developers' mailing list
Subject: Re: [syslog-ng] Losing to much remote sent logs
[...]
On 06/03/2012 15:43, Martin Holste wrote:
[...]

I was not as clear as I should have been.

In my benchmark, which is a year old, the server sending the logs ran a standard syslogd daemon and was hosted on a Red Hat VM. The sending server sent its logs over UDP. The receiving server was also a Red Hat VM, but running a syslog-ng 3.0.x daemon.

[syslogd | RH on VM] ---- UDP ------> [syslog-ng 3.0.x | RH on VM]

I reached a 90% loss rate for a total of 100,000 messages sent at a rate of 6,600 msg/sec.

Everything went well after:
- installing syslog-ng 3.0.x on the sending server
- and using TCP to send logs

Hope it helps
Christophe

--
Christophe Brocas
CNAMTS/DDSI/MRSSI
12, allées Haussmann
33300 Bordeaux
christophe.brocas@cnamts.fr
3072R/0x0661CBBA
landline +33(0)5.57.85.53.55
mobile +33(0)6.77.05.19.01
On Tue, Mar 6, 2012 at 5:20 PM, Christophe Brocas <christophe.brocas@cnamts.fr> wrote:
[...]
I believe that is what was mentioned earlier in the thread: the VMs might be a concern, especially for UDP. UDP is an *un*reliable protocol, which means that the receive buffers need to be serviced in time, and there are no buffer/window adjustments happening on the fly as with TCP.
On Fri, 2012-03-02 at 08:59 -0600, Martin Holste wrote:
[...]
syslog-ng _always_ resolves names if use_dns() is enabled, regardless of the macros used later. This is because it is one of the first things that syslog-ng does after receiving a message, much earlier than actually producing an output, which possibly includes $HOST. Anyway, DNS lookups are cached, and that should cover the most obvious performance problems with DNS.

--
Bazsi
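A sketch of what turning off name resolution for the UDP source alone could look like, based on the description above. It assumes the clients put a usable hostname into each message, so that keep_hostname(yes) can still fill $HOST; whether that holds for all 400 hosts is not known from this thread. The other values are copied from the config posted earlier.

```
source s_net {
    udp(
        ip("10.8.4.10") port(514)
        so_rcvbuf(51200000)
        log_fetch_limit(400)
        # Skip name resolution for incoming datagrams on this source.
        use_dns(no)
        # Take $HOST from the hostname field of the message itself.
        keep_hostname(yes)
        keep_timestamp(no)
    );
};
```

Because the option is set on the source rather than globally, other sources would keep resolving names as before.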
On Fri, 2012-03-02 at 10:40 +0100, Daniel Neubacher wrote:
[...]
I've received reports that syslog-ng's latency on UDP sources became worse with the threaded architecture when threaded(yes) is specified. One possible workaround, until I get around to looking into the issue a bit deeper, is to disable the global threaded() option and enable the "threaded" flag on all non-UDP sources and all destinations. This should basically be the same, except that UDP traffic will be processed in the main thread, which effectively reduces the latency for the UDP source.

I saw that you rolled out tcp() already, but this information could help someone else trying to tackle the same issue in the future.

--
Bazsi
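The workaround described above could look roughly like the following. This is a trimmed sketch rather than a tested configuration: it reuses the addresses and paths from the config posted at the start of the thread, shows only one destination and one log path, and assumes that flags(threaded) is accepted by the tcp() source and the file() destination in the version in use.

```
@version: 3.3

options {
    # Global threading off, so the udp() source is handled in the main thread.
    threaded(no);
};

source s_net {
    # UDP source: deliberately no "threaded" flag.
    udp(
        ip("10.8.4.10") port(514)
        so_rcvbuf(51200000)
        log_fetch_limit(400)
    );
    # Non-UDP source: threading enabled per source.
    tcp(
        ip("10.8.4.10") port(514)
        flags(threaded)
    );
};

destination d_remote {
    # Destinations get the flag as well.
    file("/log/syslog/${R_YEAR}/${R_MONTH}/${R_DAY}/$HOST" flags(threaded));
};

log { source(s_net); destination(d_remote); };
```

The same flag would have to be added to every other non-UDP source and every destination in the real config for the result to match what the global threaded(yes) did before.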
participants (5)
- Balazs Scheidler
- Christophe Brocas
- Daniel Neubacher
- Hendrik Visage
- Martin Holste