log statistics level
When Syslog-ng logs its hourly statistics, it uses level NOTICE. This seems like a classic case for INFO instead. NOTICE is an event of which someone might want to take notice, e.g. "nightly backup has started" or "RAID rebuild is complete." INFO is just information someone might want to refer to later.

--
Bryan Henderson
San Jose, California
On Sun, 2007-03-04 at 17:38 +0000, Bryan Henderson wrote:
When Syslog-ng logs its hourly statistics, it uses level NOTICE. This seems like a classic case for INFO instead.
NOTICE is an event of which someone might want to take notice, e.g. "nightly backup has started" or "RAID rebuild is complete." INFO is just information someone might want to refer to later.
I see no point in changing this separately; one would need to go through all internal messages and recheck that their priorities are properly set. There are currently about 111 messages in syslog-ng (probably a bit fewer, my grep pattern was not perfect). I did this audit and committed a change, and I also changed the "Log statistics" priority to info.

You can find the results in tomorrow's snapshot, however I'd appreciate if you could check my judgement and see if I did something wrong. Thanks in advance.

--
Bazsi
You can find the results in tomorrow's snapshot, however I'd appreciate if you could check my judgement and see if I did something wrong.
In main.c, there are four messages generated at NOTICE level that I think are really INFO:

  syslog-ng starting up
  SIGHUP received, reloading configuration
  Termination requested via signal, terminating
  syslog-ng shutting down

The reason these aren't worthy of notice is that the system administrator already knows about these events; he ordered them. Something worthy of notice is something that happens independently.

While almost all the error reports are issued at ERROR level, I found these that seem to be the same kind of errors, but have the lower NOTICE level:

  cfg_lex.c: unknown parse flag
  cfg_grammar.c: The value specified for time_sleep is too large.

And here's one that's CRITICAL:

  macros.c: Internal error, unknown macro referenced

Isn't this just another error? If a result of this is that Syslog fails completely, it would be worth an ALERT, but then the message should mention that Syslog failed completely. CRITICAL is just for when an entire major system is in imminent danger of collapse and needs immediate attention. I think a system could continue useful work for quite a while without Syslog service.

In afsocket.c, "Number of allowed concurrent connections exceeded" is an ERROR. If this is what I think it is, it's not a case of Syslog being broken, but nonetheless something that might need attention, so I would call it WARNING or NOTICE.

And in affile.c, "Destination file is too old, removing" is to me business as usual, not something someone is likely to want to respond to, so I would make it INFO.

--
Bryan Henderson
San Jose, California
Bryan Henderson wrote:
You can find the results in tomorrow's snapshot, however I'd appreciate if you could check my judgement and see if I did something wrong.
In main.c, there are four messages generated at NOTICE level that I think are really INFO:
  syslog-ng starting up
  SIGHUP received, reloading configuration
  Termination requested via signal, terminating
  syslog-ng shutting down
The reason these aren't worthy of notice is that the system administrator already knows about these events; he ordered them.
That's why I need to notice them, in case I didn't order them. If I were to break into a box, one of the first things I would do is compromise the logging mechanism :-)
Something worthy of notice is something that happens independently.
While almost all the error reports are issued at ERROR level, I found these that seem to be the same kind of errors, but have the lower NOTICE level:
  cfg_lex.c: unknown parse flag
  cfg_grammar.c: The value specified for time_sleep is too large.
And here's one that's CRITICAL:
macros.c: Internal error, unknown macro referenced
Isn't this just another error? If a result of this is that Syslog fails completely, it would be worth an ALERT, but then the message should mention that Syslog failed completely. CRITICAL is just for when an entire major system is in imminent danger of collapse and needs immediate attention. I think a system could continue useful work for quite a while without Syslog service.
Again, I would think that the logging system is more important than other applications. System logging needs to be a guaranteed thing; after all, isn't that why we're using syslog-ng in the first place? :-)
In afsocket.c, "Number of allowed concurrent connections exceeded" is an ERROR. If this is what I think it is, it's not a case of Syslog being broken, but nonetheless something that might need attention, so I would call it WARNING or NOTICE.
I would say this should be an alert because something isn't working until a human intercedes. I don't want mere notices that one of 500 hosts can't log to my central server. I want that fixed ASAP.
And in affile.c, "Destination file is too old, removing" is to me business as usual, not something someone is likely to want to respond to, so I would make it INFO.
Any time data is overwritten/removed/lost I want to know about it. Perhaps a warning, so that if I need to I can retrieve the log file from backup and place it into the log archive stream.

INFO is something that should only be looked at during an incident follow-up. This could be troubleshooting, an internal audit, legislated laws or a subpoena.

WARNING is something that I should know about, but isn't reporting that something is broken.

I guess it is fairly obvious that many people would like different logging levels for different types of events. The safest approach is to use the descriptions from the syslog RFC documentation. I did not take any care to fit my descriptions of the logging levels to the RFC document.

Evan.
And here's one that's CRITICAL:
macros.c: Internal error, unknown macro referenced
Isn't this just another error? If a result of this is that Syslog fails completely, it would be worth an ALERT, but then the message should mention that Syslog failed completely. CRITICAL is just for when an entire major system is in imminent danger of collapse and needs immediate attention. I think a system could continue useful work for quite a while without Syslog service.

Again, I would think that the logging system is more important than other applications. System logging needs to be a guaranteed thing; after all, isn't that why we're using syslog-ng in the first place? :-)
That still leaves the issue that this message doesn't say Syslog service is stopping. I don't know if Syslog does in fact terminate as a consequence of this error, but I do know that there are about a hundred other messages that indicate Syslog isn't logging what it's supposed to, and this is the only message I could find in all of syslog-ng that is CRITICAL.
And in affile.c, "Destination file is too old, removing" is to me business as usual, not something someone is likely to want to respond to, so I would make it INFO.
Any time data is overwritten/removed/lost I want to know about it. Perhaps a warning so that if I need to I can retrieve the log file from backup and place it into the log archive stream.
Would you want a warning when you type an 'rm' command that data has been lost? How about when your log rotater purges data according to your retention policy? What does this message mean, anyway? I thought it just indicated that Syslog did what it's supposed to do.
WARNING is something that I should know about, but isn't reporting that something is broken.
That's NOTICE too. WARNING is something that might be broken, but the message issuer can't tell. Or might be broken soon.

--
Bryan Henderson
San Jose, California
FYI, here is the definition from RFC 3164:

  0  Emergency: system is unusable
  1  Alert: action must be taken immediately
  2  Critical: critical conditions
  3  Error: error conditions
  4  Warning: warning conditions
  5  Notice: normal but significant condition
  6  Informational: informational messages
  7  Debug: debug-level messages

The explanations for critical, error and warning are not exactly stellar. At least it clarifies that critical does *not* require immediate action, as an alert does. Other systems (and the English language?) would have it the other way around. Anyway, within the syslog system we should stick to syslog-speak.

And yes, it makes sense to think carefully about assigning severities. And no, they should not change once rolled out.

Just my 2c,
Peter Klausner
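For reference, the numbering in the RFC 3164 list corresponds to the LOG_* constants in the standard <syslog.h> header on common Unix systems. The following is a minimal sketch, not syslog-ng code: severity_name() and at_least_as_severe() are hypothetical helpers introduced only to show that a lower number means a more severe message, which is what makes "alert" outrank "critical".

```c
#include <stddef.h>
#include <syslog.h>

/* Map an RFC 3164 severity number (0..7) to its name.
 * Hypothetical helper for illustration, not a syslog-ng function. */
static const char *severity_name(int severity)
{
    static const char *const names[] = {
        "emergency", "alert", "critical", "error",
        "warning", "notice", "informational", "debug"
    };
    if (severity < 0 || severity > 7)
        return NULL;
    return names[severity];
}

/* Lower numbers are more severe, so "at least as severe as the
 * threshold" is a <= comparison, not >=. */
static int at_least_as_severe(int severity, int threshold)
{
    return severity <= threshold;
}
```

With these definitions, at_least_as_severe(LOG_ALERT, LOG_CRIT) is true, while at_least_as_severe(LOG_INFO, LOG_NOTICE) is false, so a filter set to "notice and above" would drop the hourly statistics once they are logged at INFO.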
And no, they should not change once rolled out.
I can't see the logic in that. If the levels mean something, then if a message is issued with the wrong level, that's a bug. Like any bug, there are rare situations where the fact that existing users have adjusted to the bug means the bug has to be redefined as correct. But when you do that, you're trading the needs of these users for the needs of all the users who have not made such adjustments, including all new users forever.

As this discussion has shown, the levels are pretty ambiguous in this program, so there's a lot of leeway in choosing them. I agree with giving history a lot of weight in making such arbitrary selections.

--
Bryan Henderson
San Jose, California
On Sun, 2007-03-11 at 03:04 +0000, Bryan Henderson wrote:
You can find the results in tomorrow's snapshot, however I'd appreciate if you could check my judgement and see if I did something wrong.
In main.c, there are four messages generated at NOTICE level that I think are really INFO:
  syslog-ng starting up
  SIGHUP received, reloading configuration
  syslog-ng shutting down
I would disagree here; startup/shutdown/config reload messages are quite significant.
Termination requested via signal, terminating
That's OK: the "shutting down" message is still generated, and the reason why syslog-ng got terminated is less important.
The reason these aren't worthy of notice is that the system administrator already knows about these events; he ordered them. Something worthy of notice is something that happens independently.
While almost all the error reports are issued at ERROR level, I found these that seem to be the same kind of errors, but have the lower NOTICE level:
cfg_lex.c: unknown parse flag
Fixed.
cfg_grammar.c: The value specified for time_sleep is too large.
time_sleep is capped at 500 msec in this case, but syslog-ng goes on. It is worth fixing up the configuration, but it does not qualify as an error.
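The behaviour described here (cap the value, keep running, report it below ERROR level) can be sketched as follows. This is a rough illustration under assumptions, not the code in cfg_grammar.c: effective_time_sleep() and TIME_SLEEP_MAX_MS are hypothetical names, and fprintf() stands in for the real msg_notice() call.

```c
#include <stdio.h>

#define TIME_SLEEP_MAX_MS 500  /* the cap mentioned in the thread */

/* Hypothetical helper: returns the value that would actually be used and
 * sets *clamped to 1 when the configured value had to be reduced. */
static int effective_time_sleep(int configured_ms, int *clamped)
{
    *clamped = configured_ms > TIME_SLEEP_MAX_MS;
    if (*clamped) {
        /* Processing continues with a usable value, so this is reported
         * as a notice rather than an error. */
        fprintf(stderr,
                "notice: time_sleep %d is too large, using %d\n",
                configured_ms, TIME_SLEEP_MAX_MS);
        return TIME_SLEEP_MAX_MS;
    }
    return configured_ms;
}
```

The point of the sketch is only the severity choice: nothing is lost and nothing stops working, so the administrator is told about it without it being treated as a failure.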
And here's one that's CRITICAL:
macros.c: Internal error, unknown macro referenced
This is an internal error, and might just as well be a failed assertion. It should really never happen. I've converted it to g_assert_not_reached().
Isn't this just another error? If a result of this is that Syslog fails completely, it would be worth an ALERT, but then the message should mention that Syslog failed completely. CRITICAL is just for when an entire major system is in imminent danger of collapse and needs immediate attention. I think a system could continue useful work for quite a while without Syslog service.
I've completely removed the message; it is a situation that should never be triggered. It is worth a crash if it is (there are other cases with asserts like this in the code).
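The "assert instead of log" pattern can be illustrated without GLib using the standard assert() macro. This is only a sketch: the enum and macro_label() are hypothetical stand-ins for the much longer switch in macros.c, and assert() here plays the role that g_assert_not_reached() plays in the actual change.

```c
#include <assert.h>

/* Hypothetical macro ids; the real list in macros.c is much longer. */
enum macro_id { M_HOST, M_PROGRAM, M_MESSAGE };

static const char *macro_label(enum macro_id id)
{
    switch (id) {
    case M_HOST:    return "HOST";
    case M_PROGRAM: return "PROGRAM";
    case M_MESSAGE: return "MESSAGE";
    default:
        /* Reaching this branch means a caller passed an id that was
         * never defined: a programming bug, so abort instead of
         * emitting a log message that nobody can act on. */
        assert(!"unknown macro referenced");
        return "";  /* only reached when built with NDEBUG */
    }
}
```

One consequence of this design is that the question of which severity the message deserves disappears: there is no message left, only a crash that points at the bug.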
In afsocket.c, "Number of allowed concurrent connections exceeded" is an ERROR. If this is what I think it is, it's not a case of Syslog being broken, but nonetheless something that might need attention, so I would call it WARNING or NOTICE.
I think that's worth an error, as it means that an application trying to send a log message failed to do so. It means that the system is losing messages.
And in affile.c, "Destination file is too old, removing" is to me business as usual, not something someone is likely to want to respond to, so I would make it INFO.
Sure, this is not very important; syslog-ng removes the file as requested via overwrite_if_older(). I made this INFO and clarified the text.

Here's the patch I've just committed:

--- orig/src/affile.c
+++ mod/src/affile.c
@@ -242,8 +242,9 @@ affile_dw_init(LogPipe *s, GlobalConfig
       stat(self->filename->str, &st) == 0 &&
       st.st_mtime < time(NULL) - self->owner->overwrite_if_older)
     {
-      msg_notice("Destination file is too old, removing",
+      msg_info("Destination file is older than overwrite_if_older(), overwriting",
                  evt_tag_str("filename", self->filename->str),
+                 evt_tag_int("overwrite_if_older", self->owner->overwrite_if_older),
                  NULL);
       unlink(self->filename->str);
     }
--- orig/src/cfg-lex.l
+++ mod/src/cfg-lex.l
@@ -284,7 +284,7 @@ lookup_parse_flag(char *flag)
     return LRO_NOPARSE;
   if (strcmp(flag, "kernel") == 0)
     return LRO_KERNEL;
-  msg_notice("Unknown parse flag", evt_tag_str("flag", flag), NULL);
+  msg_error("Unknown parse flag", evt_tag_str("flag", flag), NULL);
   return 0;
 }
--- orig/src/macros.c
+++ mod/src/macros.c
@@ -441,8 +441,8 @@ log_macro_expand(GString *result, gint i
       break;
     }
     default:
-      msg_fatal("Internal error, unknown macro referenced;", NULL);
-      return FALSE;
+      g_assert_not_reached();
+      break;
     }
   return TRUE;
 }
--- orig/src/messages.h
+++ mod/src/messages.h
@@ -33,6 +33,7 @@ extern int verbose_flag;
 #define msg_fatal(desc, tag1, tags...) msg_event_send(msg_event_create(EVT_PRI_CRIT, desc, tag1, ##tags ))
 #define msg_error(desc, tag1, tags...) msg_event_send(msg_event_create(EVT_PRI_ERR, desc, tag1, ##tags ))
 #define msg_notice(desc, tag1, tags...) msg_event_send(msg_event_create(EVT_PRI_NOTICE, desc, tag1, ##tags ))
+#define msg_info(desc, tag1, tags...) msg_event_send(msg_event_create(EVT_PRI_INFO, desc, tag1, ##tags ))
 #define msg_verbose(desc, tag1, tags...) \
 	do { \

--
Bazsi
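For readers who want to see the age test from the affile.c hunk in isolation, it might be written as a small pure function. This is a sketch under assumptions: file_is_too_old() is a hypothetical name, the current time is passed in so the check can be tested, and treating a limit of zero or less as "disabled" is an assumption of this sketch rather than something shown in the patch.

```c
#include <time.h>

/* Hypothetical helper mirroring the condition in the hunk:
 *   st.st_mtime < time(NULL) - overwrite_if_older
 * Assumption: a limit <= 0 means the option is not set. */
static int file_is_too_old(time_t mtime, time_t now, long overwrite_if_older)
{
    if (overwrite_if_older <= 0)
        return 0;
    return mtime < now - (time_t)overwrite_if_older;
}
```

When this returns true, the file is unlinked exactly as the administrator configured, which is the argument for logging it at INFO rather than NOTICE.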
participants (4)
- Balazs Scheidler
- bryanh@giraffe-data.com
- Evan Rempel
- Peter Klausner