Debugging tools for syslog-ng
Hi,
I wanted to know if the syslog-ng developers have any tools like mysqltuner, or just shell scripts, to check a syslog-ng configuration and get some recommendations on tuning.
There is a lot of information about all the buffers, window sizes, etc., and it's very complicated to set proper values when I have a bunch of different sources and destinations.
For example, if I'm using flow-control with multiple destinations, it can stop reading the source at any time, and I have no idea when and why it's happening or which value I should tune.
-- Best regards, Koldaev Anton
----- Original message -----
Hi,
I wanted to know if syslog-ng developers has some tools like mysqltuner or just a shell scripts to check syslog-ng configuration and get some recommendations on tuning?
There is a lot of information about all the buffers/window sizes, etc. and it's very complicated to set the proper values if I have a bunch of different sources/destinations.
For example if I'm using flow-control+multiple destinations it can stop reading the source at any time and I have no idea when and why it's happening and which value should I tune.
I agree that it's too complicated, and I also pondered the idea of configuring those automatically; however, I didn't have enough time to pursue it myself (hint: I'd love a patch in this area!). In general, performance-wise you want to increase things (log-fetch-limit, log-iw-size, flush-lines for file destinations); memory-use and reliability-wise you want to decrease them. Also, you have to make sure that sum(log-iw-size) < log-fifo-size.
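The sum(log-iw-size) < log-fifo-size rule is plain arithmetic, so it can be checked by hand or with a few lines of Python. This is only a sketch: it assumes the window sizes listed are those of the sources feeding one destination, and the function name and numbers are hypothetical. It does not read a real configuration.

```python
def fifo_covers_windows(source_iw_sizes, log_fifo_size):
    """Return True when sum(log-iw-size) < log-fifo-size, the rule quoted above."""
    return sum(source_iw_sizes) < log_fifo_size


# Hypothetical values, chosen only to illustrate the arithmetic.
iw_sizes = [1000, 1000, 20000]  # log-iw-size() of each source feeding one destination

print(fifo_covers_windows(iw_sizes, 10000))  # False: 22000 is not below 10000
print(fifo_covers_windows(iw_sizes, 30000))  # True: 22000 < 30000
```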
Thanks for your reply.
How can I tell when I have increased things enough? Is there any manual way to get the current values of each buffer, etc.?
Also, since I'm logging a lot of things, I'd love to know if there are other ways to lose messages without seeing them in "dropped".
In general, performance wise you want to increase stuff (log-fetch-limit, log-iw-size, flush-lines for file destinations), memory-use and reliability wise you want to decrease them. Also, you have to make sure that sum(log-iw-size) < log-fifo-size.
So you propose just randomly tuning those params? I just don't understand how I should check whether it helped. I need to see the current state of each buffer (to be able to get some statistics data) to see if it helps.
And one more specific question:
If flow-control is in use and one of the destinations cannot accept the messages, the other destinations do not receive any messages either, because syslog-ng stops reading the source.
Why are there no messages about it in the syslog-ng logs? It should be an error, don't you think? And what if I don't have flow-control enabled?
----- Original message -----
Thanks for your reply How can I understand when it's enough to increase things? Is there any manual way to get current values of each buffer, etc?
well, I tend to use loggen for performance tests, also you can query syslog-ng internal statistics using 'syslog-ng-ctl stats'
Also since I'm logging a lot of things I'd love to know if there are some other ways to lose messages without seeing them in "dropped"?
syslog-ng counts everything it dropped using the dropped counters for destinations (which is a log-fifo overflow btw)
messages can be lost outside syslog-ng because of transport reasons:
* udp shouldn't be used for anything serious.
* connection breaks can cause message loss
In general, performance wise you want to increase stuff (log-fetch-limit, log-iw-size, flush-lines for file destinations), memory-use and reliability wise you want to decrease them. Also, you have to make sure that sum(log-iw-size) < log-fifo-size. So you propose just randomly tune those params? I just don't understand how should I get check if it helped.
no :) random tuning would be slow to converge to the ideal values.
I need to see the current state of each buffer(to be able to get some statistics data) to see if it helps.
syslog-ng-ctl stats displays the current values of statistics as a csv file. also you can ask syslog-ng to measure more stats by increasing stats-level (at the cost of some performance)
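Since the statistics come out as semicolon-separated rows, they are easy to load into a script. The sketch below is one way to do that in Python. The sample rows are copied from output posted later in this thread; the dictionary keys are my own labels for the six columns, not names taken from syslog-ng.

```python
def parse_stats(text):
    """Parse semicolon-separated stats rows into a list of dicts."""
    rows = []
    for line in text.splitlines():
        fields = line.strip().split(";")
        # Skip blank lines and rows whose last field is not a number (e.g. a header).
        if len(fields) != 6 or not fields[5].isdigit():
            continue
        name, ident, instance, state, kind, number = fields
        rows.append({"source": name, "id": ident, "instance": instance,
                     "state": state, "type": kind, "value": int(number)})
    return rows


SAMPLE = """\
dst.file;d_app_example_custom#0;/logs/example/custom.log;o;stored;0
dst.file;d_app_example_custom#0;/logs/example/custom.log;o;processed;351305
dst.file;d_app_example_custom#0;/logs/example/custom.log;o;dropped;0
destination;d_mysql_example_custom;;a;processed;331953
"""

# Show only the counters that matter for tuning queues.
for row in parse_stats(SAMPLE):
    if row["type"] in ("dropped", "stored"):
        print(row["id"], row["type"], row["value"])
```

On a live system the text could come from running `syslog-ng-ctl stats` through `subprocess.run(..., capture_output=True, text=True)`, which may need the same privileges as running the command by hand.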
And one more specific question:
If flow-control is in use and one of the destinations cannot accept the messages, the other destinations do not receive any messages either, because syslog-ng stops reading the source.
this is not true. syslog-ng stops sources individually when their window is full.
Why there is no messages about it in syslog-ng logs? It must be error, don't you think so? And what if I don't have flow-control enabled?
If flow-control is in use and one of the destinations cannot accept the messages, the other destinations do not receive any messages either, because syslog-ng stops reading the source.
this is not true. syslog-ng stops sources individually when their window is full.
But it's a quote from http://www.balabit.com/sites/default/files/documents/syslog-ng-ose-3.3-guide...
On Mon, Nov 5, 2012 at 9:34 AM, Balazs Scheidler <bazsi77@gmail.com> wrote:
-- Best regards, Koldaev Anton
----- Original message -----
If flow-control is in use and one of the destinations cannot accept the messages, the other destinations do not receive any messages either, because syslog-ng stops reading the source.
I misunderstood what you wrote here. The docs are correct. For a single source, throttling the source side means that none of the destinations receive messages.
this seems to be a bug in the sql destination. 1000 seems to be the window size for your source, the queue becomes filled, but then the sql destination doesn't flush messages. or does it?
Looks like it doesn't write anything after filling up to 1000. I waited a few hours for it to be flushed but got no results. log_iw_size for my source should be 20000:
syslog(ip(0.0.0.0) transport("tcp") port(5141) max-connections(200) log_iw_size(20000) flags("threaded") log_fetch_limit(100));
it might also happen that it's slow. syslog-ng maxes out the queue, then stops until messages are emptied. once there are free slots it starts again: fills it up, stalls.
Looks like it never gets any free slots in my configuration, since I don't see any logging for a few hours after the queue is maxed out.
Could you give me any recommendations? Looks like log_iw_size just doesn't work for my source and I have no idea how to fix it.
Thanks
On Wed, Nov 7, 2012 at 9:38 AM, Balazs Scheidler <bazsi77@gmail.com> wrote:
-- Best regards, Koldaev Anton
----- Original message -----
this seems to be a bug in the sql destination. 1000 seems to be the window size for your source, the queue becomes filled, but then the sql destination doesn't flush messages. or does it? Looks like it doesn't write anything after filling up to 1000. I spent a few hours on waiting for it to be flushed but got no results. log_iw_size for my source should be 20000:
syslog(ip(0.0.0.0) transport("tcp") port(5141) max-connections(200) log_iw_size(20000) flags("threaded") log_fetch_limit(100));
the syslog driver divides the log-iw-size() evenly across all permitted connections (max-connections() option), so you get a window size of 100 for each connection.
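The arithmetic behind that figure, as a small Python sketch. The function names are hypothetical, and integer division is an assumption about how an uneven split would be rounded.

```python
def per_connection_window(log_iw_size, max_connections):
    """Window each connection gets when log-iw-size() is split evenly."""
    return log_iw_size // max_connections


def iw_size_for(window_per_connection, max_connections):
    """log-iw-size() needed to give every permitted connection the desired window."""
    return window_per_connection * max_connections


# Values from the source definition quoted above.
print(per_connection_window(20000, 200))  # 100

# Hypothetical target: a 1000-message window for each of the 200 connections.
print(iw_size_for(1000, 200))  # 200000
```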
it might also happen that it's slow. syslog-ng maxes out the queue, then stops until messages are emptied. once there are free slots it starts again: fills it up, stalls. Looks like it never gets any free slots in my configuration since I don't see any logging after the queue is maxed out for a few hours.
but you do see logging in the sql table for a while, then it stalls completely? that's interesting, and has no connection with window sizes, and seems to be a bug in sql.
Could you give me any recommendations? Looks like log_iw_size just doesn't work for my source and I have no idea how to fix it.
like I said above, it doesn't seem to be window or tuning related. the sql driver stalls, which shouldn't happen if your sql server is available.
Algernon, can you have a look? I don't know when I can get there :(
but you do see logging in the sql table for a while, then it stalls completely? that's interesting, and has no connection with window sizes, and seems to be a bug in sql.
Yes, I see logging in the sql table for a while, then it stalls completely.
like I said above, it doesn't seem to be window or tuning related. the sql driver stalls, which shouldn't happen if your sql server is available.
I'm able to write manually to my sql server when syslog-ng is already stalled. I've tried different mysql installations and it doesn't look like the problem is on the mysql side.
On Fri, Nov 9, 2012 at 12:06 AM, Balazs Scheidler <bazsi77@gmail.com> wrote:
-- Best regards, Koldaev Anton
Anton Koldaev <koldaevav@gmail.com> writes:
like I said above, it doesn't seem to be window or tuning related. the sql driver stalls, which shouldn't happen if your sql server is available.
I'm able to write manually to my sql server when syslog-ng is already stalled. I've tried different mysql installations and it doesn't look like the problem is on the mysql side.
Can you strace syslog-ng, to see what it is doing?
-- |8]
Anton Koldaev <koldaevav@gmail.com> writes:
I wanted to know if syslog-ng developers has some tools like mysqltuner or just a shell scripts to check syslog-ng configuration and get some recommendations on tuning?
My bottleneck is usually not syslog-ng, so I use perf/tuning tools to whatever is on the other end (be that a database, filesystem or network). To see how much I need to tune the various syslog-ng buffers, I do load testing in a simulated environment, and base my settings on the number of dropped messages, and tune both the receiving end and syslog-ng until the drop count gets to zero during peak-like loads.
So far, this method worked remarkably well, but most of my setups have reasonably low incoming log volume, most time is spent post-processing them, which I usually do outside of syslog-ng.
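The "tune until the drop count gets to zero" loop needs a way to compare the dropped counters before and after a load test. A minimal Python sketch, assuming two snapshots of semicolon-separated stats rows with six fields each; the rows and destination names below are made up for illustration.

```python
def dropped_counters(stats_text):
    """Map the id column to its dropped count, ignoring all other rows."""
    counters = {}
    for line in stats_text.splitlines():
        fields = line.strip().split(";")
        if len(fields) == 6 and fields[4] == "dropped" and fields[5].isdigit():
            counters[fields[1]] = int(fields[5])
    return counters


def new_drops(before_text, after_text):
    """Return destinations whose dropped counter grew between two snapshots."""
    before = dropped_counters(before_text)
    after = dropped_counters(after_text)
    growth = {dest: count - before.get(dest, 0) for dest, count in after.items()}
    return {dest: delta for dest, delta in growth.items() if delta > 0}


# Hypothetical snapshots taken before and after a load test.
BEFORE = "dst.file;d_app#0;/logs/app.log;o;dropped;0\ndst.sql;d_db#0;mysql;a;dropped;10\n"
AFTER = "dst.file;d_app#0;/logs/app.log;o;dropped;0\ndst.sql;d_db#0;mysql;a;dropped;250\n"

print(new_drops(BEFORE, AFTER))  # {'d_db#0': 240}
```

An empty result after a peak-like load run would correspond to the stopping condition described above.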
For example if I'm using flow-control+multiple destinations it can stop reading the source at any time and I have no idea when and why it's happening and which value should I tune.
It would be nice if syslog-ng would log an info (so that I don't need to enable debug logging on a live system) level message when flow-control kicks in (and when it stops). For bonus points, if it could tell what triggered it, and which source it applies to, that'd be great.
I don't think we can do this yet, though.
-- |8]
My current issue:
syslog ~ % watch -d 'sudo syslog-ng-ctl stats | sort -rnk2 -t ";" | grep "_custom"'
dst.sql;d_mysql_example_custom#0;mysql,10.0.0.1,3306,syslog_production,custom_example_${HO;a;stored; *1000*
dst.sql;d_mysql_example_custom#0;mysql,10.0.0.1,3306,syslog_production,custom_example_${HO;a;dropped;0
dst.file;d_app_example_custom#0;/logs/example/custom.log;o;stored;0
dst.file;d_app_example_custom#0;/logs/example/custom.log;o;processed;351305
dst.file;d_app_example_custom#0;/logs/example/custom.log;o;dropped;0
destination;d_mysql_example_custom;;a;processed;331953
destination;d_app_example_custom;;a;processed;351305
It just stops reading the source after a random time (1 to 3 hours) with 1000 stored statements. There are no problems at the mysql destination.
My current configuration: https://gist.github.com/9f5619573d2f3e9f071c
I've already tried to tune all the values; it doesn't seem to help.
Also, I'm not able to enable debug logs due to https://bugzilla.balabit.com/show_bug.cgi?id=208
On Mon, Nov 5, 2012 at 2:32 PM, Gergely Nagy <algernon@balabit.hu> wrote:
______________________________________________________________________________ Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng FAQ: http://www.balabit.com/wiki/syslog-ng-faq
-- Best regards, Koldaev Anton
hi,
----- Original message -----
My current issue:
syslog ~ % watch -d 'sudo syslog-ng-ctl stats | sort -rnk2 -t ";" | grep "_custom"'
dst.sql;d_mysql_example_custom#0;mysql,10.0.0.1,3306,syslog_production,custom_example_${HO;a;stored; *1000*
dst.sql;d_mysql_example_custom#0;mysql,10.0.0.1,3306,syslog_production,custom_example_${HO;a;dropped;0
dst.file;d_app_example_custom#0;/logs/example/custom.log;o;stored;0
dst.file;d_app_example_custom#0;/logs/example/custom.log;o;processed;351305
dst.file;d_app_example_custom#0;/logs/example/custom.log;o;dropped;0
destination;d_mysql_example_custom;;a;processed;331953
destination;d_app_example_custom;;a;processed;351305
It just stops to read the source after a random time(1-2-3hours) with 1000 stored statements. There are no problems at mysql destination. My current configuration: https://gist.github.com/9f5619573d2f3e9f071c
for some reason the sql destination has stalled, it seems, though I've not seen such issues recently.
I've already tried to tune all the values, it doesn't seem to help.
this seems to be a bug in the sql destination. 1000 seems to be the window size for your source, the queue becomes filled, but then the sql destination doesn't flush messages. or does it? it might also happen that it's slow. syslog-ng maxes out the queue, then stops until messages are emptied. once there are free slots it starts again: fills it up, stalls.
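The two possibilities above (a destination that never flushes versus one that is merely slow) can be told apart by sampling the counters twice. The Python sketch below does that under one loud assumption: the processed counter only advances when slots have been freed and the source is read again. The function name and thresholds are hypothetical.

```python
def classify(stored_a, processed_a, stored_b, processed_b, window):
    """Classify a destination from two samples taken some time apart."""
    queue_full = stored_a >= window and stored_b >= window
    if not queue_full:
        return "ok"
    # Queue is full in both samples: any progress means slow, none means stalled.
    return "slow" if processed_b > processed_a else "stalled"


# Numbers from the stats posted in this thread, assumed unchanged between samples.
print(classify(1000, 331953, 1000, 331953, window=1000))  # stalled

# Hypothetical slow destination: the queue stays full but the counter moves.
print(classify(1000, 331953, 1000, 332400, window=1000))  # slow
```

A slow destination may also be sampled while its queue is briefly below the window, in which case this sketch reports "ok"; more than two samples would make the result more reliable.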
Also I'm not able to enable debug logs due to https://bugzilla.balabit.com/show_bug.cgi?id=208
participants (3)
- Anton Koldaev
- Balazs Scheidler
- Gergely Nagy