logic and duplicate suppression

John Kristoff

28 Jul 2010

5:04 p.m.

I have a couple of scenarios where I'm looking to enhance how I handle and process some logs. I'm looking for suggestions on what my options are, but maybe these are potential feature requests?

1. In using a parser (csv or the patterndb), I'd like to use some conditionals based on a resultant macro value. So for example, if I have an sshd authentication log message with a source address in a macro and that address is contained w/in a specific prefix, I'd like to handle that message differently. Perhaps not log it at all or set another MACRO to a certain value.

2. I'd like to be able to suppress duplicate messages even if they are not necessarily contiguous at the destination. So for example, if I have a SSH client that generates a log of its SSH client protocol and software, I don't need to see that over and over again (e.g. as you might commonly see today in SSH brute force attacks).

John


Martin Holste

28 Jul

9:47 p.m.

There are a number of high-level ways of handling this kind of task. Here is my philosophy: Disk is cheap. Log everything and become efficient at querying/grepping/reporting instead of pre-filtering. This is especially important for security because even the most mundane logs can be critical later.

The way I handle your presented tasks is to normalize incoming logs as much as possible with Syslog-NG and dump them into SQL. I can then run periodic queries against the SQL with very fine-grained control for alerting, retention, or whatever higher-level task you're looking to do. So, for your example of handling ssh messages differently depending on the source address, I have a SQL column for source address and then I can do

    WHERE INET_ATON(source_ip) NOT BETWEEN INET_ATON('x.x.x.x') AND INET_ATON('y.y.y.y')

in my query. For reporting, I can do

    GROUP BY INET_ATON(source_ip)-MOD(INET_ATON(source_ip), 256)

to group by a class C subnet.

Maybe this is more than you want to do in your case, but it sounds to me like maybe you're ready for some functionality beyond manually reading through the log files. There are plenty of ready-made log collectors out there: Balabit makes a nice solution in their Store Box, Clayton has his Logzilla (php-syslog-ng) project, or if you're under 500 MB per day of logs, I highly recommend the free Splunk Personal Edition which is phenomenal. My belief is that your time would be better spent setting up a solid apparatus for querying and reporting than on trying to get Syslog-NG to filter in the specific ways you want it to.

On Wed, Jul 28, 2010 at 10:04 AM, John Kristoff <jtk@cymru.com> wrote:

...

I have a couple of scenarios where I'm looking to enhance how I handle and process some logs. I'm looking for suggestions on what my options are, but maybe these are potential feature requests?

1. In using a parser (csv or the patterndb), I'd like to use some conditionals based on a resultant macro value. So for example, if I have an sshd authentication log message with a source address in a macro and that address is contained w/in a specific prefix, I'd like to handle that message differently. Perhaps not log it at all or set another MACRO to a certain value.

2. I'd like to be able to suppress duplicate messages even if they are not necessarily contiguous at the destination. So for example, if I have a SSH client that generates a log of its SSH client protocol and software, I don't need to see that over and over again (e.g. as you might commonly see today in SSH brute force attacks).

John
______________________________________________________________________________
Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng
FAQ: http://www.campin.net/syslog-ng/faq.html
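For readers without a MySQL instance at hand, the two SQL expressions in Martin's reply (the NOT BETWEEN range test and the GROUP BY on the address minus its last octet) can be sketched in Python. This is only an illustration under stated assumptions: `inet_aton`, `outside_range` and `group_by_slash24` are hypothetical helper names, and the addresses in the usage note are documentation-range samples, not anyone's real logs.

```python
import ipaddress
from collections import Counter


def inet_aton(addr: str) -> int:
    """Return an IPv4 address as an unsigned 32-bit integer, like MySQL's INET_ATON()."""
    return int(ipaddress.IPv4Address(addr))


def outside_range(addr: str, low: str, high: str) -> bool:
    """Mirror of: INET_ATON(addr) NOT BETWEEN INET_ATON(low) AND INET_ATON(high)."""
    return not (inet_aton(low) <= inet_aton(addr) <= inet_aton(high))


def group_by_slash24(addrs):
    """Mirror of: GROUP BY INET_ATON(ip) - MOD(INET_ATON(ip), 256).

    Subtracting the value modulo 256 clears the last octet, so every
    address in the same /24 ("class C") maps to the same key.
    """
    counts = Counter()
    for addr in addrs:
        n = inet_aton(addr)
        network = n - (n % 256)
        counts[str(ipaddress.IPv4Address(network))] += 1
    return dict(counts)
```

Calling `group_by_slash24(["192.0.2.1", "192.0.2.200", "198.51.100.7"])` puts the first two addresses under the key `192.0.2.0` and the third under `198.51.100.0`.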

Balazs Scheidler

29 Jul

10:03 a.m.

On Wed, 2010-07-28 at 10:04 -0500, John Kristoff wrote:

...

I have a couple of scenarios where I'm looking to enhance how I handle and process some logs. I'm looking for suggestions on what my options are, but maybe these are potential feature requests?

1. In using a parser (csv or the patterndb), I'd like to use some conditionals based on a resultant macro value. So for example, if I have an sshd authentication log message with a source address in a macro and that address is contained w/in a specific prefix, I'd like to handle that message differently. Perhaps not log it at all or set another MACRO to a certain value.

I had a similar idea for a while and as an incentive for you to try the latest-greatest stuff, I've implemented it in OSE 3.2:

commit b3f4c03473a0f77bf7d87abf3f00b46e035bbbe8
Author: Balazs Scheidler <bazsi@balabit.hu>
Date: Thu Jul 29 09:59:53 2010 +0200

    rewrite: implement condition() option for rewrite expressions

    This patch implements condition() option for rewrite expression, which makes it possible to only apply a given rewrite rule if the message matches the filter.

    For example:

    set("something new" condition(facility(auth)));

...

2. I'd like to be able to suppress duplicate messages even if they are not necessarily contiguous at the destination. So for example, if I have a SSH client that generates a log of its SSH client protocol and software, I don't need to see that over and over again (e.g. as you might commonly see today in SSH brute force attacks).

This is more difficult. The sane way of doing this is to keep state on a per-host basis, which is the area of correlation. Of course this is on the radar for syslog-ng, but we're not there yet. Simply doing it on the source side is not going to work, as multiple "source" hosts can appear on the same connection.

-- Bazsi
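To make the "keep state on a per-host basis" point concrete, here is a minimal Python sketch of non-contiguous duplicate suppression done outside syslog-ng, for example in a post-processing script. It is not a syslog-ng feature: the class name `DuplicateSuppressor`, the one-hour default window and the injectable clock are all assumptions made for illustration.

```python
import time


class DuplicateSuppressor:
    """Suppress repeats of the same (host, message) pair within a time window.

    State is kept per host, so the same text arriving from two different
    hosts over one connection is still logged once for each host.
    """

    def __init__(self, window_seconds=3600.0, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock
        # (host, message) -> time the pair was last let through.
        # Note: this dict grows without bound; a real tool would expire entries.
        self._last_logged = {}

    def should_log(self, host, message):
        now = self.clock()
        key = (host, message)
        last = self._last_logged.get(key)
        if last is not None and now - last < self.window:
            return False  # duplicate inside the window, even if not contiguous
        self._last_logged[key] = now
        return True
```

The window is measured from the last time the message was actually let through, so a client that keeps repeating itself still shows up once per window rather than disappearing forever.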

John Kristoff

4:02 p.m.

On Thu, 29 Jul 2010 10:03:37 +0200 Balazs Scheidler <bazsi@balabit.hu> wrote:

...

I had a similar idea for a while and as an incentive for you to try the latest-greatest stuff, I've implemented it in OSE 3.2:

Thank you, though I can't promise I'll try it in the immediate future. By way of another example for the sort of thing I was thinking of, here are a couple of patterndb rules for BIND 9.x messages:

<ruleset id='c7a73d172adcd7c5' name='named'>
  <pattern>named</pattern>
  <rules>
    <rule id='3409bd184b4dd59c' provider='drg.lamer' class='info'>
      <description>BIND 9.x lame server - see lib/dns/resolver.c</description>
      <patterns>
        <pattern>lame-servers: info: @ESTRING:LAMER.REASON: resolving@ '@ESTRING:LAMER.QNAME:/@@STRING:LAMER.TYPE@/@STRING:LAMER.CLASS@': @IPvANY:LAMER.SADDR@#@NUMBER@</pattern>
        <pattern>lame-servers: info: lame server resolving @QSTRING:LAMER.QNAME:'@ (in @QSTRING:LAMER.ZONE:'@?): @IPvANY:LAMER.SADDR@#@NUMBER@</pattern>
      </patterns>
      <examples>
        <example>
          <test_message>lame-servers: info: unexpected RCODE (REFUSED) resolving 'ns1.example.org/AAAA/IN': 192.0.2.1#53</test_message>
          <test_message>lame-servers: info: lame server resolving '1.2.0.192.in-addr.arpa' (in '2.0.192.in-addr.arpa'?): 192.2.0.1#53</test_message>
        </example>
      </examples>
    </rule>
    <rule id='2531e2f1a9304259' provider='drg.query' class='info'>
      <description>BIND 9.x query</description>
      <patterns>
        <pattern>queries: info: client @IPvANY:QUERY.SADDR@#@NUMBER:QUERY.SPORT@: view @STRING:QUERY.VIEW@: query: @ESTRING:QUERY.QNAME: @@STRING:QUERY.CLASS@ @STRING:QUERY.TYPE@ @ANYSTRING:QUERY.FLAGS@</pattern>
      </patterns>
      <examples>
        <example>
          <test_message>queries: info: client 192.0.2.1#1024: view external: query: . IN NS -</test_message>
          <test_message>queries: info: client 127.0.0.1#1024: view loopback: query: 1.0.0.127.in-addr.arpa IN PTR +</test_message>
        </example>
      </examples>
    </rule>
  </rules>
</ruleset>

I'll happily take improvements to the above patterns. Do note, the prefix strings (e.g. 'queries: info: ' and 'lame-servers: info: ') will only be present if BIND print-severity and print-category options are set. A better set of named rules to account for various combinations of those options may be desirable. Likewise, to account for the existence or lack of using views for queries.

In the case of the query pattern, being able to set a MACRO based on the presence of a flag (e.g. if FLAGS =~ /\+/ then RD=1 else RD=0).

...

This is more difficult. The sane way of doing this is to keep state on a per-host basis, which is the area of correlation. Of course this is on the radar for syslog-ng, but we're not there yet.

I'm already doing as Martin suggested (thanks for your reply Martin) and will just log everything and optimize queries and reporting. If really needed, I could post process entries in the database to keep only the first and last seen log of interest. Though Thanks again, John

Balazs Scheidler

2 Aug

4:29 p.m.

On Thu, 2010-07-29 at 09:02 -0500, John Kristoff wrote:

...

On Thu, 29 Jul 2010 10:03:37 +0200 Balazs Scheidler <bazsi@balabit.hu> wrote:

...
I had a similar idea for a while and as an incentive for you to try the latest-greatest stuff, I've implemented it in OSE 3.2:

Thank you, though I can't promise I'll try it in the immediate future. By way of another example for the sort of thing I was thinking of, here are a couple of patterndb rules for BIND 9.x messages:

<ruleset id='c7a73d172adcd7c5' name='named'>
  <pattern>named</pattern>
  <rules>
    <rule id='3409bd184b4dd59c' provider='drg.lamer' class='info'>
      <description>BIND 9.x lame server - see lib/dns/resolver.c</description>
      <patterns>
        <pattern>lame-servers: info: @ESTRING:LAMER.REASON: resolving@ '@ESTRING:LAMER.QNAME:/@@STRING:LAMER.TYPE@/@STRING:LAMER.CLASS@': @IPvANY:LAMER.SADDR@#@NUMBER@</pattern>
        <pattern>lame-servers: info: lame server resolving @QSTRING:LAMER.QNAME:'@ (in @QSTRING:LAMER.ZONE:'@?): @IPvANY:LAMER.SADDR@#@NUMBER@</pattern>
      </patterns>
      <examples>
        <example>
          <test_message>lame-servers: info: unexpected RCODE (REFUSED) resolving 'ns1.example.org/AAAA/IN': 192.0.2.1#53</test_message>
          <test_message>lame-servers: info: lame server resolving '1.2.0.192.in-addr.arpa' (in '2.0.192.in-addr.arpa'?): 192.2.0.1#53</test_message>
        </example>
      </examples>
    </rule>

    <rule id='2531e2f1a9304259' provider='drg.query' class='info'>
      <description>BIND 9.x query</description>
      <patterns>
        <pattern>queries: info: client @IPvANY:QUERY.SADDR@#@NUMBER:QUERY.SPORT@: view @STRING:QUERY.VIEW@: query: @ESTRING:QUERY.QNAME: @@STRING:QUERY.CLASS@ @STRING:QUERY.TYPE@ @ANYSTRING:QUERY.FLAGS@</pattern>
      </patterns>
      <examples>
        <example>
          <test_message>queries: info: client 192.0.2.1#1024: view external: query: . IN NS -</test_message>
          <test_message>queries: info: client 127.0.0.1#1024: view loopback: query: 1.0.0.127.in-addr.arpa IN PTR +</test_message>
        </example>
      </examples>
    </rule>

  </rules>
</ruleset>

I'll happily take improvements to the above patterns. Do note, the prefix strings (e.g. 'queries: info: ' and 'lame-servers: info: ') will only be present if BIND print-severity and print-category options are set. A better set of named rules to account for various combinations of those options may be desirable. Likewise, to account for the existence or lack of using views for queries.

You can simply duplicate the pattern rules: one including the "queries: info" part, the other without it. They can even trigger the same rule.

Btw, the issue with integrating these into the patterndb ruleset is that there's no schema for this yet. Can you describe what 'generic' application level events these describe? For example, user login/logout are described using the "usracct" schema, which defines which name-value pairs need to be marked in the incoming log message. Does this idea apply here as well?

...

In the case of the query pattern, being able to set a MACRO based on the presence of a flag (e.g. if FLAGS =~ /\+/ then RD=1 else RD=0).

I don't understand this, can you elaborate please?

...

...
This is more difficult. The sane way of doing this is to keep state on a per-host basis, which is the area of correlation. Of course this is on the radar for syslog-ng, but we're not there yet.

I'm already doing as Martin suggested (thanks for your reply Martin) and will just log everything and optimize queries and reporting. If really needed, I could post process entries in the database to keep only the first and last seen log of interest. Though

ok. -- Bazsi

John Kristoff

5 Aug

11:44 p.m.

On Mon, 02 Aug 2010 16:29:39 +0200 Balazs Scheidler <bazsi@balabit.hu> wrote:

...

can you describe what 'generic' application level events these do describe? For example, user login/logout are described using the "usracct" schema, which defines which name-value pairs need to be marked in the incoming log message. Does this idea apply to here as well?

Without knowing what the choices are and what the goals are, maybe they are both under a DNS or more generic netinfo schema? The drg.lamer pattern identifies a lame delegation. They are both informational as the prefix tags suggest. In the generic sense, maybe renaming the LAMER part of the name to a generic DNS tag would be appropriate?

...

...
In the case of the query pattern, being able to set a MACRO based on the presence of a flag (e.g. if FLAGS =~ /\+/ then RD=1 else RD=0).

I don't understand this, can you elaborate please?

An ISC BIND query log message may contain the following flags appended onto the log message:

flag | description
--------------------------
+    | recursion desired
-    | recursion not requested
S    | signed query
E    | EDNS options in use
T    | TCP in use
D    | DNSSEC OK set
C    | checking disabled

I was thinking of a way to set a macro based on the presence of a particular flag. For instance, if the following logs appear:

client 127.0.0.1#49152: query: www.example.org IN A +
client 192.0.2.1#49152: query: www.example.org IN A +E
client 2001:DB8::1#49152: query: www.example.org IN A +SE

In any case, I was thinking if I could set ${DNS.RECURSION} = 1 that would be nice unless there is a better, more efficient way within the existing capabilities.

John

Balazs Scheidler

15 Aug

7:55 a.m.

Hi, Sorry for taking such an awful long time to respond, but better late than never, so here it comes. On Thu, 2010-08-05 at 16:44 -0500, John Kristoff wrote:

...

On Mon, 02 Aug 2010 16:29:39 +0200 Balazs Scheidler <bazsi@balabit.hu> wrote:

...
can you describe what 'generic' application level events these do describe? For example, user login/logout are described using the "usracct" schema, which defines which name-value pairs need to be marked in the incoming log message. Does this idea apply to here as well?

Without knowing what the choices are and what the goals are, maybe they are both under a DNS or more generic netinfo schema? The drg.lamer pattern identifies a lame delegation. They are both informational as the prefix tags suggest. In the generic sense, maybe renaming the LAMER part of the name to a generic DNS tag would be appropriate?

Now that I think of it, the DNS query portion is quite simple: it logs the contents of the DNS query, and the same parameters would probably be present in all DNS server logs, thus I just have to decide the naming policy to be used on "transaction logs in general".

If we had a mail server and we'd want to normalize SMTP transactions, how would they be called? I guess "smtptxn" for SMTP transaction would be a good name, right? In that way your DNS transactions (= query logs) would need to be called "dnstxn", how does that sound to you?

Also, lame delegation is not a query, right? (I'd really need to refresh my memories about lame delegation, it's been a while since I last ran my own DNS server) If I understand the pattern/log message correctly, lame delegation would happen if my bind instance would try to resolve because recursion was requested, and the response received from the upstream DNS server was bogus. Is this right? If I'm right, it should go under "dnslame".

...

...
...
In the case of the query pattern, being able to set a MACRO based on the presence of a flag (e.g. if FLAGS =~ /\+/ then RD=1 else RD=0).

I don't understand this, can you elaborate please?

An ISC BIND query log message may contain the following flags appended onto the log message:

flag | description
--------------------------
+    | recursion desired
-    | recursion not requested
S    | signed query
E    | EDNS options in use
T    | TCP in use
D    | DNSSEC OK set
C    | checking disabled

I was thinking of a way to set a macro based on the presence of a particular flag. For instance, if the following logs appear:

client 127.0.0.1#49152: query: www.example.org IN A +
client 192.0.2.1#49152: query: www.example.org IN A +E
client 2001:DB8::1#49152: query: www.example.org IN A +SE

In any case, I was thinking if I could set ${DNS.RECURSION} = 1 that would be nice unless there is a better, more efficient way within the existing capabilities.

Hmm.. flag-like stuff is not easy to parse using patterndb; it is not very convenient. Creating all the possible patterns will not cause it to be parsed much slower, however if all flags are independent that'd mean 2^7 = 128 rules.

One way to do that, I'd suggest to parse the whole field as text, put it in a database and filter on that at query time.

An alternative is to use rewrite() rules with conditions:

rewrite r_dns_flags {
    set("1", value("dnstxn.recursion") condition(match("+" type(string) value("dnstxn.flags"))));
};

Another alternative in the future would be to use the "template-functions" idea and extend syslog-ng with a specific function. However I haven't finished that yet: in a template you'll be able to call functions, like this:

template("$(function $value)");

For example:

<value name="dnstxn.recursion">$(substring-present-p $value +)</value>

where "substring-present-p" would be a function that returns either "0" or "1" depending on whether the $value contains a substring or not.

-- Bazsi
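The first suggestion above (keep the flags field as text and interpret it later, at query or reporting time) avoids the 128-rule explosion, because each flag is tested on its own. A minimal Python sketch of that interpretation step follows; `decode_flags` and the key names in `FLAG_NAMES` are hypothetical, chosen only to mirror the flag table John posted, and the "0"/"1" strings follow the convention Bazsi describes for the proposed template function.

```python
# Letter flags from the BIND query log table; the key names are assumptions.
FLAG_NAMES = {
    "S": "signed",
    "E": "edns",
    "T": "tcp",
    "D": "dnssec_ok",
    "C": "checking_disabled",
}


def decode_flags(flags: str) -> dict:
    """Turn a raw flags string such as "+SE" into independent "0"/"1" values."""
    # "+" means recursion desired; "-" (or anything else) means it was not requested.
    result = {"recursion": "1" if "+" in flags else "0"}
    for letter, name in FLAG_NAMES.items():
        result[name] = "1" if letter in flags else "0"
    return result
```

For the third sample log line, `decode_flags("+SE")` reports recursion, signed and EDNS as "1" and the remaining flags as "0".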

John Kristoff

20 Aug

1:17 a.m.

On Sun, 15 Aug 2010 07:55:58 +0200 Balazs Scheidler <bazsi@balabit.hu> wrote:

...

Now that I think of it, the DNS query portion is quite simple: it logs the contents of the DNS query, and the same parameters would probably be present in all DNS server logs, thus I just have to decide the naming policy to be used on "transaction logs in general".

There are various types of logs a DNS server could generate depending on how granular you want your parser to be. The lame delegation logs for example are reasonably different than the query log and a zone transfer log message in turn would be different from each of those.

...

I guess "smtptxn" for SMTP transaction would be a good name, right? In that way your DNS transactions (= query logs) would need to be called "dnstxn", how does that sound to you?

Doesn't really matter to me. Some purists might not like referring to them as transactions, but I could care less. :-) If you want an alternative, I would suggest dnsquery.

...

Also, lame delegation is not a query, right? (I'd really need to

Correct, but the log message is only generated as a result of a query that probably didn't go so well.

...

refresh my memories about lame delegation, it's been a while since I last ran my own DNS server) If I understand the pattern/log message correctly, lame delegation would happen if my bind instance would try to resolve because recursion was requested, and the response received from the upstream DNS server was bogus. Is this right?

Mostly right. Not bogus so much as the answer was either not marked authoritative (aa bit is not set) or you simply didn't get an answer at all from the nameserver you were following in a delegation chain, perhaps due to a DNS server configuration error, fault, packet filter, outage, etc.

...

One way to do that, I'd suggest to parse the whole field as text, put it in a database and filter on that at query time.

That's the current plan and is acceptable to me, thanks for looking it over. John

Balazs Scheidler

3 Sep

1:35 p.m.

Hi, On Thu, 2010-08-19 at 18:17 -0500, John Kristoff wrote:

...

On Sun, 15 Aug 2010 07:55:58 +0200 Balazs Scheidler <bazsi@balabit.hu> wrote:

...
Now that I think of it, the DNS query portion is quite simple: it logs the contents of the DNS query, and the same parameters would probably be present in all DNS server logs, thus I just have to decide the naming policy to be used on "transaction logs in general".

There are various types of logs a DNS server could generate depending on how granular you want your parser to be. The lame delegation logs for example are reasonably different than the query log and a zone transfer log message in turn would be different from each of those.

...
I guess "smtptxn" for SMTP transaction would be a good name, right? In that way your DNS transactions (= query logs) would need to be called "dnstxn", how does that sound to you?

Doesn't really matter to me. Some purists might not like referring to them as transactions, but I could care less. :-) If you want an alternative, I would suggest dnsquery.

Agreed, I don't mind dnsquery. :)

...

...
Also, lame delegation is not a query, right? (I'd really need to

Correct, but the log message is only generated as a result of a query that probably didn't go so well.

I'm adding your patterns then, and will create a schema for DNS-related stuff.

-- Bazsi

Balazs Scheidler

2:37 p.m.

New subject: [patterndb] bind9 patterns and DNS schema (was: Re: logic and duplicate suppression)


And here it comes. I have added two schemas:

Schema: dnsqry
Status: experimental
Description: DNS query logs

This schema is describing DNS query logs. Strongly bind inspired.

Attributes:

NV pair name        | Mandatory | Description
--------------------+-----------+--------------------------------------------
dnsqry.client_ip    | N         | Source IP address of the DNS request.
dnsqry.client_port  | N         | Source port
dnsqry.view         | N         | DNS view
dnsqry.query        | Y         | DNS query.
dnsqry.class        | Y         | DNS class (IN for internet)
dnsqry.type         | Y         | DNS record type to query (e.g. A, PTR, etc)
dnsqry.flags        | N         | DNS Request flags.

And:

Schema: dnslame
Status: experimental
Description: DNS logs for lame delegation.

This schema is for DNS lame logs, strongly bind inspired.

Attributes:

NV pair name        | Mandatory | Description
--------------------+-----------+--------------------------------------------------
dnslame.reason      | N         | The reason the DNS request couldn't be fulfilled.
dnslame.zone        | N         | The lame zone.

These two schemas describe DNS events. I've also cleaned up your patterns and added them to dns/bind.pdb. I'd appreciate review from both you and anyone else running DNS servers if I did it right. The patterns themselves match and extract the NV pairs properly; this is tested by "pdbtool test".

-- Bazsi
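As a rough way to sanity-check the dnsqry attribute names against sample log lines without patterndb, the same extraction can be sketched with a regular expression in Python. This is an assumption-laden illustration, not the contents of dns/bind.pdb: `QUERY_RE` and `parse_query` are hypothetical names, and the regex makes both the "queries: info: " prefix and the "view" part optional, as discussed earlier in the thread.

```python
import re

# Hypothetical regex equivalent of the BIND query pattern; not taken from bind.pdb.
QUERY_RE = re.compile(
    r"^(?:queries: info: )?"                                       # optional category/severity prefix
    r"client (?P<client_ip>[0-9A-Fa-f:.]+)#(?P<client_port>\d+): "
    r"(?:view (?P<view>\S+): )?"                                   # optional view name
    r"query: (?P<query>\S+) (?P<class>\S+) (?P<type>\S+) (?P<flags>\S*)$"
)


def parse_query(line: str):
    """Return a dict of dnsqry.* name-value pairs, or None if the line does not match."""
    m = QUERY_RE.match(line)
    if m is None:
        return None
    # Optional groups that did not take part in the match (e.g. view) are left out.
    return {f"dnsqry.{k}": v for k, v in m.groupdict().items() if v is not None}
```

Feeding it the test message "queries: info: client 192.0.2.1#1024: view external: query: . IN NS -" yields the seven dnsqry.* pairs, while a line logged without views simply comes back without dnsqry.view.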

Fekete Robert

29 Jul

10:34 a.m.

Hi John, John Kristoff wrote:

...

I have a couple of scenarios where I'm looking to enhance how I handle and process some logs. I'm looking for suggestions on what my options are, but maybe these are potential feature requests?

1. In using a parser (csv or the patterndb), I'd like to use some conditionals based on a resultant macro value. So for example, if I have an sshd authentication log message with a source address in a macro and that address is contained w/in a specific prefix, I'd like to handle that message differently. Perhaps not log it at all or set another MACRO to a certain value.

You can filter on the results of your message parsing and use embedded log statements to handle messages differently based on the values of the parsers. You need a filter that selects program(sshd), netmask(), and tag(how-you-tag-sshd-auth-messages). For embedded logpaths, see http://www.balabit.com/dl/html/syslog-ng-ose-v3.1-guide-admin-en.html/config... and http://www.balabit.com/dl/html/syslog-ng-ose-v3.1-guide-admin-en.html/refere... for the various filters. HTH Robert
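The decision such a filter makes (is this an sshd message, and does its source address fall inside a given prefix?) can be sketched in Python for testing the logic offline. This is not syslog-ng configuration and not how netmask() is implemented; `in_prefix`, `route`, the "drop"/"log" return values and the 192.0.2.0/24 default are all assumptions for illustration.

```python
import ipaddress


def in_prefix(addr: str, prefix: str) -> bool:
    """True if addr lies inside prefix; works for IPv4 and IPv6 alike."""
    ip = ipaddress.ip_address(addr)
    net = ipaddress.ip_network(prefix, strict=False)
    # An IPv4 address is never inside an IPv6 network, and vice versa.
    return ip.version == net.version and ip in net


def route(program: str, source_addr: str, trusted_prefix: str = "192.0.2.0/24") -> str:
    """Decide how to handle a parsed message: "drop" it or "log" it."""
    if program == "sshd" and in_prefix(source_addr, trusted_prefix):
        return "drop"
    return "log"
```

With the default prefix, `route("sshd", "192.0.2.17")` gives "drop", while the same program with a source of 198.51.100.7 gives "log".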

...

2. I'd like to be able to suppress duplicate messages even if they are not necessarily contiguous at the destination. So for example, if I have a SSH client that generates a log of its SSH client protocol and software, I don't need to see that over and over again (e.g. as you might commonly see today in SSH brute force attacks).

AFAIK,

...



participants (4)

Balazs Scheidler
Fekete Robert
John Kristoff
Martin Holste