grouping-by parser questions

Fekete, Róbert

6 Jun 2016

11:16 a.m.

Hi Bazsi,

I've started to document the grouping-by parser, and have a few questions/comments about it:

* It seems that some of the grouping-by options are the same as (or very similar to) the correlation-related attributes of the pattern database, but have different names. Could we name them consistently where they are the same? (I haven't checked the correlation module from Rust, but maybe we could align that as well.) For example:

  grouping-by | patterndb
  scope       | context-scope
  timeout     | context-timeout
  aggregate   | message or action

* In the original commit message, you mention three possible values for the 'scope' option, whereas the context-scope in the patterndb has four (program). Are these deliberately different, or do they use the same code?

* grouping-by doesn't look to me like an actual parser. From the existing objects, it resembles a filter more (IMHO), but I'd rather categorize it as something else that transforms/processes the incoming data, and should therefore be in a separate configuration object (along with the geoip parser).

Robert


Scheidler, Balázs

6 Jun 2016

10:30 p.m.

Hi,

On Jun 6, 2016 11:17 AM, "Fekete, Róbert" <robert.fekete@balabit.com> wrote:

> Hi Bazsi,
>
> I've started to document the grouping-by parser, and have a few questions/comments about it:
>
> * It seems that some of the grouping-by options are the same as (or very similar to) the correlation-related attributes of the pattern database, but have different names. Could we name them consistently where they are the same? (I haven't checked the correlation module from Rust, but maybe we could align that as well.) For example:
>
>   grouping-by | patterndb
>   scope       | context-scope
>   timeout     | context-timeout
>   aggregate   | message or action

I omitted the "context" prefix on purpose: it is important in the patterndb context, as rules have both correlation-related and non-correlation-related groups of options. With grouping-by it would be kind of redundant.

I was thinking about aggregate() a lot, and decided to use something that is closer to the "grouping-by" term: GROUP BY in SQL works with aggregate functions, and in a sense they produce aggregates over various dimensions. In patterndb, you can generate multiple actions for a rule.

Anyway, naming should probably be discussed in person.
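To make the GROUP BY analogy concrete, here is a minimal Python sketch. It is not syslog-ng code: the function name `grouping_by`, the `ts` and `session` fields, and the choice to close a context after `timeout` seconds without a new message are all assumptions made for the illustration.

```python
def grouping_by(messages, key_field, timeout):
    """Group messages by key_field and emit one aggregate per closed context.

    Hypothetical sketch: a context is closed once `timeout` seconds pass
    without a new message for its key (an assumption, not taken from the
    syslog-ng source). Original messages are always passed through.
    """
    contexts = {}    # key value -> messages collected in the open context
    aggregates = []  # one summary record per closed context
    passed = []      # every incoming message, unchanged

    def close(key_value):
        group = contexts.pop(key_value)
        aggregates.append({
            "key": key_value,
            "count": len(group),          # like COUNT(*) in SQL
            "first_ts": group[0]["ts"],   # like MIN(ts)
            "last_ts": group[-1]["ts"],   # like MAX(ts)
        })

    for msg in sorted(messages, key=lambda m: m["ts"]):
        # Close every context that has been idle for longer than the timeout.
        expired = [k for k, g in contexts.items()
                   if msg["ts"] - g[-1]["ts"] > timeout]
        for k in expired:
            close(k)
        contexts.setdefault(msg[key_field], []).append(msg)
        passed.append(msg)

    # Flush whatever is still open at the end of the input.
    for k in list(contexts):
        close(k)
    return passed, aggregates


sample = [
    {"ts": 0, "session": "a"},
    {"ts": 1, "session": "b"},
    {"ts": 2, "session": "a"},
    {"ts": 20, "session": "a"},
]
passed, aggregates = grouping_by(sample, "session", timeout=10)
```

In this sketch all four input messages come back in `passed`, and the aggregates are produced in addition to them, which mirrors the point that a single option describing "the aggregate" fits grouping-by better than the patterndb notion of several actions per rule.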

> * In the original commit message, you mention three possible values for the 'scope' option, whereas the context-scope in the patterndb has four (program). Are these deliberately different, or do they use the same code?

They use the same code, so it should be the same.
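As a rough illustration of what sharing the scope code implies, the following Python sketch builds a correlation key from a scope value. The four scope names follow the patterndb context-scope documentation as I understand it; the thread itself only confirms that patterndb has a fourth value, "program". The function name `context_key` and the `context_id` argument are hypothetical.

```python
# Which message fields narrow the context for each scope value.
# Assumption: mirrors the documented patterndb context-scope semantics.
SCOPE_FIELDS = {
    "global": (),
    "host": ("HOST",),
    "program": ("HOST", "PROGRAM"),
    "process": ("HOST", "PROGRAM", "PID"),
}


def context_key(msg, scope, context_id):
    """Return the tuple that identifies the correlation context of msg."""
    fields = SCOPE_FIELDS[scope]  # raises KeyError for an unknown scope
    return tuple(msg[f] for f in fields) + (context_id,)


msg = {"HOST": "web1", "PROGRAM": "sshd", "PID": "4242"}
key_program = context_key(msg, "program", "login")
key_global = context_key(msg, "global", "login")
```

With a shared table like this, grouping-by and patterndb would accept exactly the same set of scope values, so the documentation could list them once.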

> * grouping-by doesn't look to me like an actual parser. From the existing objects, it resembles a filter more (IMHO), but I'd rather categorize it as something else that transforms/processes the incoming data, and should therefore be in a separate configuration object (along with the geoip parser).

I agree, we currently have only parsers/rewrite/filter stuff. It might make sense to create a more generalized concept, though. I am not sure it is worth it, but we have already had similar use cases where we couldn't categorize some kind of functionality. But let's test whether we can find a descriptive name for it. What would you call "generic processing" in the config?

Btw, I stand by my decision that it is not a filter: it never drops messages, whereas the primary function of filters is to drop messages.

> Robert

Fekete, Róbert

7 Jun 2016

7:56 p.m.

Hi,

On Mon, Jun 6, 2016 at 10:30 PM, Scheidler, Balázs <balazs.scheidler@balabit.com> wrote:

> Hi,
>
> On Jun 6, 2016 11:17 AM, "Fekete, Róbert" <robert.fekete@balabit.com> wrote:
>
>> Hi Bazsi,
>>
>> I've started to document the grouping-by parser, and have a few questions/comments about it:
>>
>> * It seems that some of the grouping-by options are the same as (or very similar to) the correlation-related attributes of the pattern database, but have different names. Could we name them consistently where they are the same? (I haven't checked the correlation module from Rust, but maybe we could align that as well.) For example:
>>
>>   grouping-by | patterndb
>>   scope       | context-scope
>>   timeout     | context-timeout
>>   aggregate   | message or action
>
> I omitted the "context" prefix on purpose: it is important in the patterndb context, as rules have both correlation-related and non-correlation-related groups of options. With grouping-by it would be kind of redundant.

I think that the concept is the same, even though it is heavily based on similar functionality.

> I was thinking about aggregate() a lot, and decided to use something that is closer to the "grouping-by" term: GROUP BY in SQL works with aggregate functions, and in a sense they produce aggregates over various dimensions. In patterndb, you can generate multiple actions for a rule.
>
> Anyway, naming should probably be discussed in person.
>
>> * In the original commit message, you mention three possible values for the 'scope' option, whereas the context-scope in the patterndb has four (program). Are these deliberately different, or do they use the same code?
>
> They use the same code, so it should be the same.

Ok.

>> * grouping-by doesn't look to me like an actual parser. From the existing objects, it resembles a filter more (IMHO), but I'd rather categorize it as something else that transforms/processes the incoming data, and should therefore be in a separate configuration object (along with the geoip parser).
>
> I agree, we currently have only parsers/rewrite/filter stuff. It might make sense to create a more generalized concept, though. I am not sure it is worth it, but we have already had similar use cases where we couldn't categorize some kind of functionality. But let's test whether we can find a descriptive name for it. What would you call "generic processing" in the config?

Well, it might not be generic enough, but both the grouping-by and the geoip parsers add auxiliary data to a message, so in a sense they 'enrich()' the existing data. (My first idea was 'transform', but we do not transform anything directly.) Anyway, I'll try to come up with some other ideas.
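To show the distinction I have in mind, here is a small Python sketch contrasting a filter with an enrichment step. None of this is syslog-ng code: `apply_filter`, `apply_enrich`, the lookup table, and the `geoip.country` field name are hypothetical names chosen for the example.

```python
def apply_filter(messages, predicate):
    """Filter semantics: messages that fail the predicate are dropped."""
    return [m for m in messages if predicate(m)]


def apply_enrich(messages, lookup):
    """Enrichment semantics: every message passes, with extra fields added."""
    return [{**m, **lookup(m)} for m in messages]


# Hypothetical stand-in for a GeoIP database.
COUNTRY_BY_IP = {"192.0.2.10": "HU"}


def geoip_lookup(msg):
    """Return the auxiliary fields to add to msg."""
    return {"geoip.country": COUNTRY_BY_IP.get(msg["SOURCEIP"], "unknown")}


logs = [{"SOURCEIP": "192.0.2.10"}, {"SOURCEIP": "198.51.100.7"}]
filtered = apply_filter(logs, lambda m: m["SOURCEIP"] in COUNTRY_BY_IP)
enriched = apply_enrich(logs, geoip_lookup)
```

The filter can return fewer messages than it received, while the enrichment step always returns the same number of messages with additional fields, which is the property grouping-by and the geoip parser share in this sketch.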

> Btw, I stand by my decision that it is not a filter: it never drops messages, whereas the primary function of filters is to drop messages.
>
>> Robert
