Hi Bazsi,
I've started to document the grouping-by parser, and have a few questions/comments about it:
* It seems that some of the grouping-by options are the same (or very similar) to the correlation-related attributes of the pattern database, but have different names. Could we name them consistently where they are the same? (I haven't checked the correlation module from Rust, but maybe we could align that as well.)
For example:
grouping-by | patterndb
scope       | context-scope
timeout     | context-timeout
aggregate   | message or action
* In the original commit message, you mention three possible values for the 'scope' option, whereas the context-scope in the patterndb has four (program). Are these deliberately different, or do they use the same code?
* grouping-by doesn't look to me as an actual parser. From the existing objects, it resembles a filter more (IMHO), but I'd rather categorize it as something else that transforms/processes the incoming data, and should be therefore in a separate configuration object (along with the geoip parser).
Robert
Hi,
On Jun 6, 2016 11:17 AM, "Fekete, Róbert" <robert.fekete@balabit.com> wrote:
Hi Bazsi,
I've started to document the grouping-by parser, and have a few
questions/comments about it:
* It seems that some of the grouping-by options are the same (or very
similar) to the correlation-related attributes of the pattern database, but have different names. Could we name them consistently where they are the same? (I haven't checked the correlation module from Rust, but maybe we could align that as well.)
For example:
grouping-by | patterndb
scope       | context-scope
timeout     | context-timeout
aggregate   | message or action
I omitted the "context" prefix on purpose. The prefixes are important in the patterndb context, as rules have both correlation-related and non-correlation-related groups of options. With grouping-by it would be kind of redundant.
I thought about aggregate() a lot, and decided to use something that is closer to the "grouping-by" term: GROUP BY in SQL works with aggregate functions, and in a sense they produce aggregates over various dimensions. In patterndb, you can generate multiple actions for a rule.
Anyway, naming should probably be discussed in person.
* In the original commit message, you mention three possible values for
the 'scope' option, whereas the context-scope in the patterndb has four (program). Are these deliberately different, or do they use the same code?
They use the same code, so it should be the same
* grouping-by doesn't look to me as an actual parser. From the existing
objects, it resembles a filter more (IMHO), but I'd rather categorize it as something else that transforms/processes the incoming data, and should be therefore in a separate configuration object (along with the geoip parser).
I agree, we currently have only parser/rewrite/filter objects. It might make sense to create a more generalized concept though. I am not sure it is worth it, but we have already had similar use cases where we couldn't categorize some kind of functionality. But let's test whether we can find a descriptive name for it. What would you call "generic processing" in the config?
Btw, I stand by my decision that it is not a filter: it never drops messages, whereas the primary function of filters is to drop messages.
Robert
Hi,
On Mon, Jun 6, 2016 at 10:30 PM, Scheidler, Balázs <balazs.scheidler@balabit.com> wrote:
Hi,
On Jun 6, 2016 11:17 AM, "Fekete, Róbert" <robert.fekete@balabit.com> wrote:
Hi Bazsi,
I've started to document the grouping-by parser, and have a few
questions/comments about it:
* It seems that some of the grouping-by options are the same (or very
similar) to the correlation-related attributes of the pattern database, but have different names. Could we name them consistently where they are the same? (I haven't checked the correlation module from Rust, but maybe we could align that as well.)
For example:
grouping-by | patterndb
scope       | context-scope
timeout     | context-timeout
aggregate   | message or action
I omitted the "context" prefix on purpose. The prefixes are important in the patterndb context, as rules have both correlation-related and non-correlation-related groups of options. With grouping-by it would be kind of redundant.
I think that the concept is the same, even though it is heavily based on a similar functionality.
I thought about aggregate() a lot, and decided to use something that is closer to the "grouping-by" term: GROUP BY in SQL works with aggregate functions, and in a sense they produce aggregates over various dimensions. In patterndb, you can generate multiple actions for a rule.
Anyway, naming should probably be discussed in person.
* In the original commit message, you mention three possible values
for the 'scope' option, whereas the context-scope in the patterndb has four (program). Are these deliberately different, or do they use the same code?
They use the same code, so it should be the same
Ok.
* grouping-by doesn't look to me as an actual parser. From the existing
objects, it resembles a filter more (IMHO), but I'd rather categorize it as something else that transforms/processes the incoming data, and should be therefore in a separate configuration object (along with the geoip parser).
I agree, we currently have only parser/rewrite/filter objects. It might make sense to create a more generalized concept though. I am not sure it is worth it, but we have already had similar use cases where we couldn't categorize some kind of functionality. But let's test whether we can find a descriptive name for it. What would you call "generic processing" in the config?
Well, it might not be generic enough, but both the grouping-by and the geoip parsers add auxiliary data to a message, so in a sense they 'enrich()' the existing data. (My first idea was 'transform', but we do not transform anything directly.) Anyway, I'll try to come up with some other ideas.
Btw, I stand by my decision that it is not a filter, it never drops messages, whereas the primary function of filters is to drop messages.
Robert
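For readers following the SQL analogy and the scope question in this thread, here is a minimal Python sketch of the idea, not the actual syslog-ng implementation. It assumes the four scope names used by patterndb's context-scope (global, host, program, process) and assumes that each scope narrows a context by HOST, PROGRAM and PID in turn. The function names, the "user" key and the sample messages are hypothetical, and timeout handling is left out: aggregate() here simply stands in for what would be emitted when a context expires.

```python
from collections import defaultdict

# Which message fields separate contexts for each scope value.
# Assumption of this sketch, modelled on patterndb's context-scope.
SCOPE_FIELDS = {
    "global": (),
    "host": ("HOST",),
    "program": ("HOST", "PROGRAM"),
    "process": ("HOST", "PROGRAM", "PID"),
}


def group_messages(messages, key, scope="global"):
    """Collect messages into contexts, much like GROUP BY collects rows."""
    contexts = defaultdict(list)
    for msg in messages:
        scope_part = tuple(msg.get(field, "") for field in SCOPE_FIELDS[scope])
        context_id = scope_part + (msg.get(key, ""),)
        contexts[context_id].append(msg)
    return dict(contexts)


def aggregate(contexts):
    """Build one summary message per context (an aggregate over the group)."""
    return [
        {"context": cid, "count": len(msgs), "last_message": msgs[-1]["MESSAGE"]}
        for cid, msgs in contexts.items()
    ]


# Hypothetical sample messages.
messages = [
    {"HOST": "web1", "PROGRAM": "sshd", "PID": "100", "user": "alice", "MESSAGE": "login"},
    {"HOST": "web1", "PROGRAM": "sshd", "PID": "101", "user": "alice", "MESSAGE": "logout"},
    {"HOST": "web2", "PROGRAM": "sshd", "PID": "7", "user": "alice", "MESSAGE": "login"},
]

by_host = aggregate(group_messages(messages, key="user", scope="host"))
by_process = aggregate(group_messages(messages, key="user", scope="process"))
```

With the host scope the two web1 messages land in one context, while the process scope keeps all three apart because the PIDs differ. The input messages are only read, never removed, which matches the point made above that this kind of processing does not drop messages the way a filter does.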
participants (2)
- Fekete, Róbert
- Scheidler, Balázs