Hi!

The idea of a generic framework to set up value pairs came up first during the afmongodb discussion (see Bazsi's ideas at http://article.gmane.org/gmane.comp.syslog-ng/10432), and while working on the tfjson driver, it came up again.

I've learnt a lot since the dynamic-variables implementation in afmongodb, and following Bazsi's comments on the list earlier, we managed to talk a little about the value-pairs() feature too. That, and another quick chat with Balint, followed by working on tfjson helped form the proposal below.

I'll divide this into two sections: the first part, mostly aimed at people using syslog-ng in production, is a request for comments regarding the syntax of the feature. The second part is a few notes on the planned implementation.

The Syntax
==========

    value-pairs (
        glob-select ("usracct.*")
        glob-exclude ("*.*id")
        builtins (no)
        $HOST
        $MESSAGE
        "program_n_pid" = "$PROGRAM[$PID]"
    )

What we have here is basically three possible ways to define - or control - value pairs:

* Using a function, like glob-select and glob-exclude or builtins (more about these later)
* Macro names, with the '$' sign.
* "key" = "value" pairs.

The way this is supposed to work is that the base set is all the various key=value pairs syslog-ng collected about a particular log message: the built-in macros ($HOST, $MESSAGE, etc), sdata, patterndb - you name it. This is the set we can select or exclude from, and the set we can add new pairs to (different values for each value-pairs() instance, of course).

With glob-select(), we can mark a subset of the original set for inclusion. Subsequent glob-exclude()s can override this, and remove parts of it from the set.

Specifying builtins(no) removes every built-in key from the set.

Listing known keys, unquoted and '$'-prefixed, will add that value to the set, with the key set to the macro name without the dollar sign. For example, $HOST is exactly the same as "HOST" = "$HOST".
Finally, "key" = "value" pairs do the obvious thing: add custom pairs to our set.

The order of functions does matter, so

    glob-select("*") glob-exclude("*id") glob-select("usracct.id")

will select usracct.id, but no other key ending with "id". This is partly because I find it logical, and partly because if/when we add other functions later on (say, ltrim() or add-prefix() or similar that modify the keys), the order becomes essential.

But, back to the purpose of this all: this would be templating on steroids, so to say. Drivers should be able to easily implement support for value-pairs(), which would allow system administrators to tweak the way data discovered by syslog-ng appears in the output.

Particularly, this functionality is what will make the afmongodb driver shine. This is a much more advanced version of the dynamic-variables() functionality.

This will also come in handy for the JSON template function.

I leave the rest up to your imagination. Feedback on the syntax would be most appreciated.

The Design
==========

Value pairs would be stored in a struct like the following:

    typedef struct
    {
      GPatternSpec *pattern;  /* compiled glob pattern */
      gboolean exclude;       /* TRUE: exclude pattern, FALSE: select pattern */
    } ValuePairPattern;

    typedef struct
    {
      ValuePairPattern **patterns;  /* NULL terminated array */
      gboolean builtins;            /* include builtin variables, defaults to TRUE */
      NVRegistry extras_registry;
      NVTable extras;               /* explicitly specified extra keys */
    } ValuePairs;

ValuePairs->patterns would be a NULL terminated array (we can allow for a lot of reallocs during init, imo - otherwise we can make it a GPtrArray or something similar to save time on that), where each pattern is either an exclude or a select pattern.

The ->builtins boolean would flag whether we should include the builtin variables ($HOST, etc), and would default to TRUE.

And ->extras are the extra keys specified explicitly: both in $MACRO and "key"="value" form (the former translated to "MACRO"="$MACRO" form).
We'd have the following support API:

    /* Callback invoked by value_pairs_foreach() for each selected pair. */
    typedef gboolean (*VPForeachFunc) (const gchar *name, const gchar *value,
                                       gssize value_len, gpointer user_data);

    ValuePairs *value_pairs_new (void);
    void value_pairs_add_pattern (ValuePairs *vp, const gchar *pattern,
                                  gboolean exclude);
    void value_pairs_set_builtins (ValuePairs *vp, gboolean state);
    void value_pairs_add_extra (ValuePairs *vp, const gchar *key,
                                const gchar *value);
    void value_pairs_foreach (ValuePairs *vp, VPForeachFunc func,
                              gpointer user_data);

We'd also have a VALUE_PAIRS keyword in the parser (if it is possible, which I think it is), which drivers could use. The basic VALUE_PAIRS stuff would set up a ValuePairs structure, and store it in a global variable, which the driver could then either copy, or just assign to its own (since the main parser won't free or touch it anymore anyway).

Then, all the drivers have to do is call value_pairs_foreach(), which would first iterate over logmsg_registry's keys, taking the specified patterns into account. Then it would iterate over its own extras, and pass the results to the foreach function one by one, similar to how nv_table_foreach does.

That's about it. It's a bit vague, mostly because I have not yet researched whether this is possible the way I imagine it.

Hopefully this all made at least some sense.

-- |8]
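To make the ordering rule from the syntax section concrete, here is a minimal Python sketch of order-dependent glob selection. It is only an illustration, not the proposed C implementation: the function name apply_patterns, the sample keys, and the rule that keys matched by no pattern are left out are all assumptions, and Python's fnmatch stands in for GPatternSpec.

```python
from fnmatch import fnmatchcase


def apply_patterns(keys, patterns):
    """Return the keys selected by an ordered list of (glob, exclude) rules.

    Rules are applied in configuration order, so a later rule overrides an
    earlier one for every key it matches. Keys matched by no rule are left
    out (an assumption of this sketch).
    """
    selected = {}
    for key in keys:
        for glob, exclude in patterns:
            if fnmatchcase(key, glob):
                selected[key] = not exclude
    return [key for key in keys if selected.get(key, False)]


# Hypothetical key names, chosen to resemble the ones used in the thread.
keys = ["HOST", "usracct.id", "usracct.username",
        "usracct.sessionid", "classifier.rule_id"]

# glob-select("*") glob-exclude("*id") glob-select("usracct.id")
rules = [("*", False), ("*id", True), ("usracct.id", False)]

print(apply_patterns(keys, rules))
# ['HOST', 'usracct.id', 'usracct.username']
```

The last matching rule wins for each key, which is why usracct.id survives while the other keys ending in "id" do not.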
Gergely Nagy wrote:
Hi!
The idea of a generic framework to set up value pairs came up first during the afmongodb discussion (see Bazsi's ideas at http://article.gmane.org/gmane.comp.syslog-ng/10432), and while working on the tfjson driver, it came up again.
I've learnt a lot since the dynamic-variables implementation in afmongodb, and following Bazsi's comments on the list earlier, we managed to talk a little about the value-pairs() feature too. That, and another quick chat with Balint, followed by working on tfjson helped form the proposal below.
I'll divide this into two sections: the first part, mostly aimed at people using syslog-ng in production would be a request for comments regarding the syntax of the feature.
The Syntax
==========
value-pairs ( glob-select ("usracct.*") glob-exclude ("*.*id") builtins (no) $HOST $MESSAGE "program_n_pid" = "$PROGRAM[$PID]" )
I would like to have the select/exclude take a style and an expression so that gnu regex (if supported by syslog-ng), perl regex or glob could be used. Perhaps only glob and perl regex are supported now, but a faster regular expression tool may become available in the future that could be added without breaking backwards compatibility. Something of the format

    value-pairs (
        select ( style="pcre" pattern="^usracct\." )
        select ( style="glob" pattern="usracct.*" )
    )

The use of $HOST or any other macro should always refer to the content of the macro, so for the purpose of identifying macros by name, a syntax

    macro (HOST MESSAGE)

could be used. I think this is more intuitive than using the $HOST names.

Finally, each of these value-pairs definitions is of the form

    keyword ( arguments )

so for consistency I would suggest that defining custom keys should be done with something of the format

    define ( "program_n_pid", "$PROGRAM[$PID]")

for complete orthogonal consistency this should be

    define ( macro="program_n_pid", value="$PROGRAM[$PID]")

but that may be a little bit of overkill.

Evan.
value-pairs ( glob-select ("usracct.*") glob-exclude ("*.*id") builtins (no) $HOST $MESSAGE "program_n_pid" = "$PROGRAM[$PID]" )
I would like to have the select/exclude take a style and an expression so that the use of gnu regex could be used (if supported by syslog-ng) or perl regex or glob. Perhaps only glob and perl regex are supported now but there may be a faster regular expression tool that becomes available in the future that could be added without breaking backwards compatibility.
Part of the reason for the glob- prefix is precisely this: so that other -select/-exclude styles can be added. Instead of what you propose, we'd have pcre-select() or xpath-select() or whatever else there is need for.
The use of $HOST or any other macro should always refer to the content of the macro, so for the purpose of identifying macros by name, a syntax
macro (HOST MESSAGE)
could be used. I think this is more intuitive than using the $HOST names.
That makes sense, thank you!
Finally, each of these value-pairs definitions is of the form
keyword ( arguments )
so for consistency I would suggest that defining custom keys should be done with something of the format
define ( "program_n_pid", "$PROGRAM[$PID]")
for complete orthogonal consistency this should be
define ( macro="program_n_pid", value="$PROGRAM[$PID]")
but that may be a little bit of overkill.
Yeah, it would be. :)

I'd rather have something that might be a little bit inconsistent (e.g., the glob-* stuff being the exception), yet expressive and not overly long, than something that's consistent, but too verbose.

I do see your point, though, and thanks a lot for the ideas!

-- |8]
Gergely Nagy wrote:
value-pairs ( glob-select ("usracct.*") glob-exclude ("*.*id") builtins (no) $HOST $MESSAGE "program_n_pid" = "$PROGRAM[$PID]" )
I would like to have the select/exclude take a style and an expression so that the use of gnu regex could be used (if supported by syslog-ng) or perl regex or glob. Perhaps only glob and perl regex are supported now but there may be a faster regular expression tool that becomes available in the future that could be added without breaking backwards compatibility.
Part of the reason for the glob- is precisely due to this reason: so that other -select/-exclude styles can be added.
Instead of what you propose, we'd have pcre-select() or xpath-select() or whatever else there is need for.
The use of $HOST or any other macro should always refer to the content of the macro, so for the purpose of identifying macros by name, a syntax
macro (HOST MESSAGE)
could be used. I think this is more intuitive than using the $HOST names.
That makes sense, thank you!
Finally, each of these value-pairs definitions is of the form
keyword ( arguments )
so for consistency I would suggest that defining custom keys should be done with something of the format
define ( "program_n_pid", "$PROGRAM[$PID]")
for complete orthogonal consistency this should be
define ( macro="program_n_pid", value="$PROGRAM[$PID]")
but that may be a little bit of overkill.
Yeah, it would be. :)
I'd rather have something that might be a little bit inconsistent (eg, the glob-* stuff being the exception), yet expressive and not overly long, than something that's consistent, but too verbose.
Currently syslog-ng uses a type() option to specify glob/pcre/posix and so on, with posix being the default. Could we have a single select function and use the type option, just for the sake of consistency?

Robert
I do see your point, though, and thanks a lot for the ideas!
On Tue, 2011-01-25 at 09:56 +0100, Fekete Robert wrote:
Currently syslog-ng uses a type() option to specify glob/pcre/posix and so on, with posix being the default. Could we have a single select function and use the type option, just for the sake of consistency?
Oh, I see. In this case, yes, select(".*" type(glob)) (or however syslog-ng handles this at the moment) it will be. -- |8]
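The single-select-with-type() idea can be pictured as a dispatch table keyed by matcher type. The following Python sketch is an illustration only: the names MATCHERS and select and the sample keys are made up, only two types are modelled, and Python's re module merely stands in for PCRE (the two dialects differ in details).

```python
import re
from fnmatch import fnmatchcase

# One matcher per type() value. Adding a new matching style means adding
# an entry here, without touching the select() call sites.
MATCHERS = {
    "glob": lambda pattern, key: fnmatchcase(key, pattern),
    "pcre": lambda pattern, key: re.search(pattern, key) is not None,
}


def select(keys, pattern, match_type):
    """Return the keys that match pattern under the given matcher type."""
    matcher = MATCHERS[match_type]
    return [key for key in keys if matcher(pattern, key)]


# Hypothetical key names.
keys = ["HOST", "usracct.username", "usracct.type"]

print(select(keys, "usracct.*", "glob"))
print(select(keys, r"^usracct\.", "pcre"))
```

Both calls pick the same two usracct keys, which is the point of the proposal: the statement stays the same and only the type changes.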
I realize that this is not specific to syslog-ng, but was not sure where to turn. Starting with a fairly recent linux kernel (perhaps 2.6.30) the messages sent via /dev/kmsg have the format [167007.168017] ... some text ... where these numbers seem to be seconds.microseconds since the last boot. Is there any way to configure the linux kernel to NOT put this detail in messages? Thanks.
On Wed, Jan 26, 2011 at 01:11:42PM -0800, Evan Rempel wrote:
I realize that this is not specific to syslog-ng, but was not sure where to turn.
Starting with a fairly recent linux kernel (perhaps 2.6.30) the messages sent via /dev/kmsg have the format
[167007.168017] ... some text ...
where these numbers seem to be seconds.microseconds since the last boot.
Is there any way to configure the linux kernel to NOT put this detail in messages?
Thanks.
At least it used to be possible to do so. Try looking here in the kernel configuration utilities. Perhaps if you read the code you might find a sysctl to do it without rebuilding.

    Kernel hacking
      [*] Show timing information on printks

Matthew.
Add printk.time=0 as an option in your grub.conf file (assuming you are using grub).

-----Original Message-----
From: Evan Rempel [mailto:erempel@uvic.ca]
Sent: Wednesday, January 26, 2011 4:12 PM
To: Syslog-ng users' and developers' mailing list
Subject: [syslog-ng] New Linux Kernel message hi resolution timestamps.

I realize that this is not specific to syslog-ng, but was not sure where to turn.

Starting with a fairly recent linux kernel (perhaps 2.6.30) the messages sent via /dev/kmsg have the format

    [167007.168017] ... some text ...

where these numbers seem to be seconds.microseconds since the last boot.

Is there any way to configure the linux kernel to NOT put this detail in messages?

Thanks.
Hi again!

Last time (http://article.gmane.org/gmane.comp.syslog-ng/10542) I described the way I imagine value_pairs(), both the syntax, and the implementation. Today, I post here because there's a few bits and pieces of code to show: the generic value_pairs() framework implemented, a modified afmongodb() driver that uses it, and a modified tfjson template function as well!

There's a few rough edges to sort out still, but it's already usable (see the end of this post for links).

Let's have a look at the afmongodb driver!

Example: afmongodb
==================

    destination d_mongo {
      mongodb();
    };

The default settings will include every key-value pair syslog-ng knows about:

    {
      "HOST" : "localhost",
      "HOST_FROM" : "localhost",
      "LEGACY_MSGHDR" : "sshd[19868]: ",
      "MESSAGE" : "Accepted publickey for algernon from ::1 port 42248 ssh2",
      "PID" : "19868",
      "PROGRAM" : "sshd",
      "SOURCE" : "s_network",
      "_id" : ObjectId("4d4418efaef570564d000013"),
      "classifier" : {
        "class" : "system",
        "rule_id" : "4dd5a329-da83-4876-a431-ddcb59c2858c"
      },
      "secevt" : {
        "verdict" : "ACCEPT"
      },
      "usracct" : {
        "application" : "sshd",
        "authmethod" : "publickey",
        "device" : "::1",
        "service" : "ssh2",
        "sessionid" : "19868",
        "type" : "login",
        "username" : "algernon"
      }
    }

All keys are equal: the dynamic keys appear on the same level as every other one (unlike the previous solution in mongodb, which put dynamic keys into their own namespace).

However, there's a few things here we're not all that interested in, and we'd like to limit the keys to the dynamic keys only, plus some builtins we explicitly specify. We can do that!
    destination d_mongo {
      mongodb(
        value_pairs(builtins(no)
                    select("*")
                    "$HOST" "$MESSAGE"
                    ("PROGRAM" "$PROGRAM[$PID]")
                    ("TIMESTAMP" "$UNIXTIME"))
      );
    };

This will include all dynamic values, along with the $HOST and $MESSAGE builtins, and two extra keys, PROGRAM and TIMESTAMP:

    {
      "HOST" : "localhost",
      "MESSAGE" : "Accepted publickey for algernon from ::1 port 47932 ssh2",
      "PROGRAM" : "sshd[20839]",
      "TIMESTAMP" : "1296308990",
      "_id" : ObjectId("4d441afea7423bc050000013"),
      "classifier" : {
        "class" : "system",
        "rule_id" : "4dd5a329-da83-4876-a431-ddcb59c2858c"
      },
      "secevt" : {
        "verdict" : "ACCEPT"
      },
      "usracct" : {
        "application" : "sshd",
        "authmethod" : "publickey",
        "device" : "::1",
        "service" : "ssh2",
        "sessionid" : "20839",
        "type" : "login",
        "username" : "algernon"
      }
    }

That's all nice and good, but we don't really care about classifier.rule_id, do we?

    destination d_mongo {
      mongodb(
        value_pairs(builtins(no)
                    select("*")
                    exclude(".classifier.rule_id")
                    "$HOST" "$MESSAGE"
                    ("PROGRAM" "$PROGRAM[$PID]")
                    ("TIMESTAMP" "$UNIXTIME"))
      );
    };

And this will do exactly what it says: skip builtins, select everything that is left, exclude ".classifier.rule_id" from that, and then add a few extras of our own.

Of course, one can have multiple selects and excludes, any number of macros listed, and any number of (key value) pairs.

In the future, if and when new types of selects (for example, POSIX or PCRE regexps) are implemented, the select and exclude statements will grow a type() option as well.

TODO
====

There's a few things still in limbo regarding the syntax, however:

* Evan Rempel's comment that "$HOST" and the like should always refer to the value of the macro is convincing, so that part of the syntax will most probably change. My current favourite idea is to list the macro name as-is, without quotes, without the dollar sign:

      value-pairs(HOST MESSAGE ("foo" "bar") ...)

  But maybe his macro(HOST MESSAGE) suggestion would be better... We'll see, I guess!
* The way key-value pairs can be specified looks iffy. Something like pair("key", "$VALUE") would look better.

* The parser needs to be usable by - for example - template functions as well. It's trivial to add value_pairs() support for any destination driver (or source driver - anything that has a grammar file in the sources can easily use this functionality), but at the moment template functions cannot, and tfjson has its own, GOption-based parser, which isn't 100% compatible with the bison/flex parser.

  For example, the bison/flex parser keeps the order of select() and exclude() statements, so one can do exclude("*") select("usracct.*") and that will select everything under "usracct", but nothing else. With tfjson's GOption-based parser, all selects are evaluated first, and then the excludes, so the above example would lead to no keys selected at all.

The Links
=========

My git tree: git://git.madhouse-project.org/syslog-ng/syslog-ng-3.3.git (or on the web: http://git.madhouse-project.org/syslog-ng/syslog-ng-3.3/)

The interesting branches are:

* work/value-pairs/base: The basic value-pairs implementation, without parser support.
* work/value-pairs/grammar: bison/flex parser built upon the ValuePairs base.
* work/tfjson/value-pairs: tfjson ported to ValuePairs. The syntax is the same as before (see my previous mail about tfjson).
* work/afmongodb-vp: afmongodb ported to ValuePairs. Example syntax can be seen above.

-- |8]
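The difference between the two parsers described in the TODO list can be reproduced with a small Python model. This is a sketch under assumptions, not the actual parser code: the function names ordered and selects_then_excludes, the sample keys, and the choice to leave out keys matched by no rule are all hypothetical, and Python's fnmatch stands in for the glob matching.

```python
from fnmatch import fnmatchcase


def ordered(keys, rules):
    """Apply (kind, glob) rules in configuration order; the last match wins."""
    result = []
    for key in keys:
        keep = False  # assumption: keys matched by no rule are left out
        for kind, glob in rules:
            if fnmatchcase(key, glob):
                keep = kind == "select"
        if keep:
            result.append(key)
    return result


def selects_then_excludes(keys, rules):
    """Evaluate every select first, then every exclude, ignoring rule order."""
    selects = [glob for kind, glob in rules if kind == "select"]
    excludes = [glob for kind, glob in rules if kind == "exclude"]
    chosen = [k for k in keys if any(fnmatchcase(k, g) for g in selects)]
    return [k for k in chosen if not any(fnmatchcase(k, g) for g in excludes)]


# Hypothetical key names.
keys = ["HOST", "MESSAGE", "usracct.username", "usracct.type"]

# exclude("*") select("usracct.*")
rules = [("exclude", "*"), ("select", "usracct.*")]

print(ordered(keys, rules))                # ['usracct.username', 'usracct.type']
print(selects_then_excludes(keys, rules))  # []
```

With order preserved the trailing select() re-admits the usracct keys; with selects evaluated first, the exclude("*") removes everything afterwards.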
From: syslog-ng-bounces@lists.balabit.hu [syslog-ng-bounces@lists.balabit.hu] On Behalf Of Gergely Nagy [algernon@balabit.hu] Sent: Saturday, January 29, 2011 6:14 AM To: Syslog-ng users' and developers' mailing list Subject: [syslog-ng] [RFC]: value_pairs() demo
[...snip...]
destination d_mongo { mongodb( value_pairs(builtins(no) select("*") exclude(".classifier.rule_id") "$HOST" "$MESSAGE" ("PROGRAM" "$PROGRAM[$PID]") ("TIMESTAMP" "$UNIXTIME")) ); };
And this will do exactly what it says: skip builtins, select everything that is left, and exclude ".classifier.rule_id" from that, and then add a few extra stuff on our own.
I think that the "builtin(no)" option should be abandoned in favour of something else. It is really nothing more than a power-select or power-exclude but it does not honour the order requirement of the select/exclude options.

In the above example you have excluded the built in macros but then used a select("*") which implies adding everything back in. If you had done these in the opposite order, what semantics would be intended?

It is unclear to me what is defined as a builtin macro and which ones are not. It is also unclear where the $UNIXTIME came from since it was not shown at all in the example that apparently included everything.

Perhaps just relying on the select/exclude (which should probably be renamed to include/exclude) would be sufficient, since in most cases at least some of the builtin macros will be desired. In your example, where you included $HOST and $MESSAGE, it would have been almost as easy to merely exclude the others by name and not use the builtin option at all.

Just my $0.02

Evan Rempel.
On Sat, 2011-01-29 at 07:40 -0800, Evan Rempel wrote:
destination d_mongo { mongodb( value_pairs(builtins(no) select("*") exclude(".classifier.rule_id") "$HOST" "$MESSAGE" ("PROGRAM" "$PROGRAM[$PID]") ("TIMESTAMP" "$UNIXTIME")) ); };
And this will do exactly what it says: skip builtins, select everything that is left, and exclude ".classifier.rule_id" from that, and then add a few extra stuff on our own.
I think that the "builtin(no)" option should be abandoned in favour of something else.
In my opinion, it'd be better to clarify what builtin() is for. At the moment, there's a short list of builtin macros:

    HOST, HOST_FROM, MESSAGE, PROGRAM, PID, MSGID, SOURCE, LEGACY_MSGHDR

(defined in lib/logmsg.c), and there's a few standard macros, like $UNIXTIME.

By default, the standard macros that are not part of the builtins will not be included unless explicitly requested, which is a shame, and that's what makes builtins() confusing, imo.

If builtins() dealt with the standard macros, it'd be much easier - and I plan to figure out how to do just that. That will also affect select() and exclude().

Perhaps it can be renamed to builtin-macros() then?
It is really nothing more than a power-select or power-exclude but it does not honour the order requirement of the select/exclude options.
Yep, and that's by design. There's a priority among the selectors: explicit selects ("$HOST", "$MESSAGE" and key-value pairs) are the highest, followed by builtins(), with select()/exclude() at the lowest priority.

Thus, if one turns builtins() off, one can still explicitly add key-value pairs that use builtin stuff. Likewise, if any builtins are excluded, they can still be explicitly added. However, since builtins() has higher priority than select()/exclude(), if they're turned off, select()/exclude() will not see them at all.
In the above example you have excluded the built in macros but then used a select("*") which implies adding everything back in. If you had done these in the opposite order, what semantics would be intended?
That's due to the explicit > builtins() > select/exclude priority order.
It is unclear to me what is defined as a builtin macro and which ones are not.
Indeed, it is unclear - even to me. I plan to fix that, though (see above).
It is also unclear where the $UNIXTIME came from since it was not shown at all in the example that apparently included everything.
Yep, unfortunately the way macros and builtins are handled in syslog-ng is a bit... unclear, and chaotic. I'm trying to figure out an easy way to fix this, and make builtins() include all of the built-in macros, including $UNIXTIME and the rest.
Perhaps just relying on the select/exclude (which should probably be renamed to include/exclude) would be sufficient since in most cases at least some of the builtin macros will be desired and like in your example where you included the $HOST and $MESSAGE it would have been almost as easy to merely exclude the others by name and not use the builtin option at all.
The problem with that, is that there's no other easy way to exclude all of the builtin macros, which might be preferable in some cases. Thanks a lot for the detailed feedback by the way, it's most appreciated! -- |8]
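The priority order described in this reply (explicit pairs, then builtins(), then select()/exclude()) can be modelled in a few lines of Python. This is a sketch of the described behaviour only, not of the actual code: the resolve function, its parameters and the sample message are hypothetical, explicit values are passed in already expanded rather than as templates, and keys matched by no rule are assumed to be left out. The builtin list is the one quoted in the reply.

```python
from fnmatch import fnmatchcase

# The builtin macro names listed in the reply (defined in lib/logmsg.c).
BUILTINS = {"HOST", "HOST_FROM", "MESSAGE", "PROGRAM", "PID",
            "MSGID", "SOURCE", "LEGACY_MSGHDR"}


def resolve(message, rules, builtins=True, explicit=None):
    """Model the priority: explicit pairs > builtins() > select()/exclude()."""
    # builtins(no) hides the builtin keys before any glob gets to see them,
    # regardless of where it appears among the other statements.
    visible = {k: v for k, v in message.items()
               if builtins or k not in BUILTINS}
    result = {}
    for key, value in visible.items():
        keep = False  # assumption: keys matched by no rule are left out
        for kind, glob in rules:
            if fnmatchcase(key, glob):
                keep = kind == "select"
        if keep:
            result[key] = value
    # Explicit pairs are added last, so they win even for builtin keys.
    result.update(explicit or {})
    return result


# Hypothetical message, loosely based on the demo output.
message = {
    "HOST": "localhost",
    "PID": "19868",
    "PROGRAM": "sshd",
    "usracct.username": "algernon",
    ".classifier.rule_id": "4dd5a329",
}
rules = [("select", "*"), ("exclude", ".classifier.rule_id")]
explicit = {"HOST": message["HOST"], "PROGRAM": "sshd[19868]"}

print(resolve(message, rules, builtins=False, explicit=explicit))
```

Because builtins is a flag rather than an entry in the rule list, select("*") never brings PID back, while HOST and PROGRAM reappear only through the explicit pairs.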
Gergely Nagy wrote:
On Sat, 2011-01-29 at 07:40 -0800, Evan Rempel wrote:
destination d_mongo { mongodb( value_pairs(builtins(no) select("*") exclude(".classifier.rule_id") "$HOST" "$MESSAGE" ("PROGRAM" "$PROGRAM[$PID]") ("TIMESTAMP" "$UNIXTIME")) ); };
And this will do exactly what it says: skip builtins, select everything that is left, and exclude ".classifier.rule_id" from that, and then add a few extra stuff on our own.
I think that the "builtin(no)" option should be abandoned in favour of something else.
In my opinion, it'd be better to clarify what builtin() is for. At the moment, there's a short list of builtin macros:
HOST, HOST_FROM, MESSAGE, PROGRAM, PID, MSGID, SOURCE, LEGACY_MSGHDR (defined in lib/logmsg.c), and there's a few standard macros, like $UNIXTIME.
By default, the standard macros that are not part of the builtins, will not be included unless explicitly requested, which is a shame, and that's what makes builtins() confusing, imo.
If builtins() dealt with the standard macros, it'd be much easier - and I plan to figure out how to do just that. That will also affect select() and exclude() too.
Perhaps it can be renamed to builtin-macros() then?
It is really nothing more than a power-select or power-exclude but it does not honour the order requirement of the select/exclude options.
Yep, and that's by design. There's a priority among the selectors: explicit selects ("$HOST", "$MESSAGE" and key-value pairs) are the highest, followed by builtins() and select()/exclude() on the lowest priority.
Thus, if one turns builtins() off, one can still explicitly add key-value pairs that use builtin stuff. Likewise, if any builtins are excluded, they can still be explicitly added, however, since builtins() has higher priority than select()/exclude(), if they're turned off, select()/exclude() will not see them at all.
This is a good example of why things are so confusing. The paragraph above is contradictory within itself.
On one hand you state that if "one turns builtins() off, one can still explicitly add key-value pairs" and on the other hand you state "if they're turned off, select()/exclude() will not see them at all".
I realize that you need to add the builtins using something other than select, but that seems confusing too.
In the above example you have excluded the built in macros but then used a select("*") which implies adding everything back in. If you had done these in the opposite order, what semantics would be intended?
That's due to the explicit > builtins() > select/exclude priority order.
It is unclear to me what is defined as a builtin macro and which ones are not.
Indeed, it is unclear - even to me. I plan to fix that, though (see above).
It is also unclear where the $UNIXTIME came from since it was not shown at all in the example that apparently included everything.
Yep, unfortunately the way macros and builtins are handled in syslog-ng is a bit... unclear, and chaotic. I'm trying to figure out an easy way to fix this, and make builtins() include all of the built-in macros, including $UNIXTIME and the rest.
Perhaps just relying on the select/exclude (which should probably be renamed to include/exclude) would be sufficient since in most cases at least some of the builtin macros will be desired and like in your example where you included the $HOST and $MESSAGE it would have been almost as easy to merely exclude the others by name and not use the builtin option at all.
The problem with that, is that there's no other easy way to exclude all of the builtin macros, which might be preferable in some cases.
I am not so much concerned with making the process of excluding all of the built in macros easy, but am much more concerned with making the syntax consistent, deterministic and obvious, both when writing and, probably more importantly for troubleshooting, when reading the configuration file.

Creating a system where there is a priority of options, and then having the order define the priority for some of the options makes a system that is more easily misunderstood.

    destination d_mongo {
      mongodb(
        value_pairs(builtins(no)
                    select("*")
                    exclude(".classifier.rule_id")
                    "$HOST" "$MESSAGE"
                    ("PROGRAM" "$PROGRAM[$PID]")
                    ("TIMESTAMP" "$UNIXTIME"))
      );
    };

would seem to be very different from

    destination d_mongo {
      mongodb(
        value_pairs(select("*")
                    exclude(".classifier.rule_id")
                    "$HOST" "$MESSAGE"
                    ("PROGRAM" "$PROGRAM[$PID]")
                    ("TIMESTAMP" "$UNIXTIME")
                    builtins(no))
      );
    };

but given the proposal that "builtins(no)" has highest priority, these two examples actually have the same semantics.

And this just won't do what you expect either.

    destination d_mongo {
      mongodb(
        value_pairs(select("*")
                    exclude(".classifier.rule_id")
                    select("HOST")
                    select("MESSAGE")
                    ("PROGRAM" "$PROGRAM[$PID]")
                    ("TIMESTAMP" "$UNIXTIME")
                    builtins(no))
      );
    };
Thanks a lot for the detailed feedback by the way, it's most appreciated!
That's nice. Sometimes I am told I am argumentative :-( -- Evan Rempel
On Tue, 2011-02-01 at 08:29 -0800, Evan Rempel wrote:
It is really nothing more than a power-select or power-exclude but it does not honour the order requirement of the select/exclude options.
Yep, and that's by design. There's a priority among the selectors: explicit selects ("$HOST", "$MESSAGE" and key-value pairs) are the highest, followed by builtins() and select()/exclude() on the lowest priority.
Thus, if one turns builtins() off, one can still explicitly add key-value pairs that use builtin stuff. Likewise, if any builtins are excluded, they can still be explicitly added, however, since builtins() has higher priority than select()/exclude(), if they're turned off, select()/exclude() will not see them at all.
This is a good example of why things are so confusing. The paragraph above is contradictory within itself.
On one hand you state that if "one turns builtins() off, one can still explicitly add key-value pairs" and on the other hand you state "if they're turned off, select()/exclude() will not see them at all".
Yep, it is a little confusing if you look at it that way. But I don't see the contradiction: select()/exclude() are not the same as explicit specification (either in the (current) form of "$HOST" or ("host" "$HOST")). The former will not see them, but one can add them back explicitly. Even if we find a way to get rid of builtins(), this priority will remain nevertheless, because I do need a way to include value pairs that were formerly excluded - if for nothing else, for convenience's sake. (That, and the implementation is a lot simpler this way :P)
I realize that you need to add the builtins using something other than select, but that seems confusing too.
Hopefully documentation will be able to help with that :) (Myself, I'm terrible at documentation, sadly)
It is also unclear where the $UNIXTIME came from since it was not shown at all in the example that apparently included everything.
Yep, unfortunately the way macros and builtins are handled in syslog-ng is a bit... unclear, and chaotic. I'm trying to figure out an easy way to fix this, and make builtins() include all of the built-in macros, including $UNIXTIME and the rest.
Perhaps just relying on the select/exclude (which should probably be renamed to include/exclude) would be sufficient since in most cases at least some of the builtin macros will be desired and like in your example where you included the $HOST and $MESSAGE it would have been almost as easy to merely exclude the others by name and not use the builtin option at all.
The problem with that, is that there's no other easy way to exclude all of the builtin macros, which might be preferable in some cases.
I am not so much concerned with making the process of excluding all of the built in macros easy, but am much more concerned with making the syntax consistent, deterministic and obvious, both when writing and, probably more importantly for troubleshooting, when reading the configuration file.
Creating a system where there is a priority of options, and then having the order define the priority for some of the options makes a system that is more easily misunderstood.
True enough, but at the moment, I don't really see a better solution.
destination d_mongo { mongodb( value_pairs(builtins(no) select("*") exclude(".classifier.rule_id") "$HOST" "$MESSAGE" ("PROGRAM" "$PROGRAM[$PID]") ("TIMESTAMP" "$UNIXTIME")) ); };
would seem to be very different from
destination d_mongo { mongodb( value_pairs(select("*") exclude(".classifier.rule_id") "$HOST" "$MESSAGE" ("PROGRAM" "$PROGRAM[$PID]") ("TIMESTAMP" "$UNIXTIME") builtins(no)) ); };
but given the proposal that "builtins(no)" has highest priority, these two examples actually have the same semantics.
Yeah, I see your point. Hmm... perhaps something like: exclude(builtins) (and similarly, select(builtins), for what it's worth) could be added, that has the same priority as select()/exclude()? Or, come to think of it, I might be able to beat some sense into builtins() and have it on the same priority as select() and exclude()... We'd still have the "$HOST" and ("key" "value") stuff on another level, though, and mixing those in would be quite a challenge to implement.
Thanks a lot for the detailed feedback by the way, it's most appreciated!
That's nice. Sometimes I am told I am argumentative :-(
Sometimes it's good to be argumentative >;) -- |8]
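The exclude(builtins) idea floated in the last reply, where builtins take part in the same ordered evaluation as select() and exclude(), could behave roughly like the Python sketch below. Nothing here is confirmed by the thread beyond the idea itself: the resolve_ordered function, the use of a frozenset to stand for the builtins keyword, and the sample keys are assumptions made for illustration.

```python
from fnmatch import fnmatchcase

# The builtin macro names listed earlier in the thread (lib/logmsg.c).
BUILTINS = frozenset({"HOST", "HOST_FROM", "MESSAGE", "PROGRAM", "PID",
                      "MSGID", "SOURCE", "LEGACY_MSGHDR"})


def resolve_ordered(keys, rules):
    """Apply rules in order; a target is a glob string or the BUILTINS set."""
    result = []
    for key in keys:
        keep = False  # assumption: keys matched by no rule are left out
        for kind, target in rules:
            if isinstance(target, frozenset):
                matched = key in target
            else:
                matched = fnmatchcase(key, target)
            if matched:
                keep = kind == "select"
        if keep:
            result.append(key)
    return result


# Hypothetical key names.
keys = ["HOST", "MESSAGE", "usracct.username"]

# exclude(builtins) select("*"): the later select brings the builtins back.
print(resolve_ordered(keys, [("exclude", BUILTINS), ("select", "*")]))
# ['HOST', 'MESSAGE', 'usracct.username']

# select("*") exclude(builtins): the later exclude removes them.
print(resolve_ordered(keys, [("select", "*"), ("exclude", BUILTINS)]))
# ['usracct.username']
```

Under this model the two orderings give different results, which is the property asked for in the discussion; explicit "$HOST" and ("key" "value") pairs would still need separate handling.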
participants (5)
- Evan Rempel
- Fekete Robert
- Gergely Nagy
- Matthew Hall
- w3euu