How does regex work with HOST definitions?
Hi there

I have a subset of syslog-ng hosts that use a specific DNS-formatted naming convention, and I want to ensure all their data is caught by a particular syslog-ng filter.

I have

filter f_process_Test { host("^...\-..\-ids\-[0-9]+\...\.our\.net$") and not host("abc-xy-ids-02\.our\.net"); };

i.e. I want abc-12-ids-01.aa.our.net and xyz-12-ids-01.aa.our.net to be caught by this filter, but abc-xy-ids-02.our.net not to be.

I could explicitly name them all I suppose - but there are 12+ of them and they are growing in number. A regex would be much more efficient.

Anyway, it doesn't work. That filter never triggers. I know the hostnames are correct as I have a general catch-all rule that logs to filenames containing the hostname - and those hostnames show up in there.

Can anyone explain what I've got wrong? REGEX works fine in my "match" calls...

This is syslog-ng-1.6.7-2 under CentOS 4.1.

Thanks!

--
Cheers

Jason Haar
Information Security Manager, Trimble Navigation Ltd.
Phone: +64 3 9635 377 Fax: +64 3 9635 417
PGP Fingerprint: 7A2E 0407 C9A6 CAF6 2B9F 8422 C063 5EBB FE1D 66D1
Hi there

Has anyone any idea about this? It looks to me that regex don't work on the "host()" options at all. I have mine set to a regex, and it's capturing all sorts of traffic from other syslog clients that don't match :-(

Jason
On 9/28/05, Jason Haar <Jason.Haar@trimble.co.nz> wrote:
> Hi there
> Has anyone any idea about this? It looks to me that regex don't work on the "host()" options at all. I have mine set to a regex, and it's capturing all sorts of traffic from other syslog clients that don't match :-(
Remove the backslashes before the hyphens - you'd only need to do that inside a character class, e.g. [a-z\-] to match any of a through z and hyphen. Outside a character class it means itself (or if it's the first character in a character class and not escaped, like this [-a-z]).
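The advice above can be checked offline before touching the syslog-ng configuration. The following is a minimal sketch that uses Python's `re` module as a stand-in; it is not the regex library that syslog-ng 1.6 links against, so it only shows what the pattern itself accepts. The hostname `mailhost.aa.our.net` is a made-up non-matching example, not one taken from the thread.

```python
import re

# Pattern from the original post, with the backslashes before the hyphens removed.
UNESCAPED = re.compile(r"^...-..-ids-[0-9]+\...\.our\.net$")
# The pattern exactly as originally posted, for comparison.
ESCAPED = re.compile(r"^...\-..\-ids\-[0-9]+\...\.our\.net$")

hosts = [
    "abc-12-ids-01.aa.our.net",
    "xyz-12-ids-01.aa.our.net",
    "abc-xy-ids-02.our.net",
    "mailhost.aa.our.net",  # hypothetical host that should not match
]

# Record whether each hostname matches each variant of the pattern.
matches = {h: UNESCAPED.search(h) is not None for h in hosts}
matches_escaped = {h: ESCAPED.search(h) is not None for h in hosts}

for h in hosts:
    print(h, matches[h], matches_escaped[h])

# Inside a character class the hyphen is literal when escaped or placed first.
class_escaped = re.fullmatch(r"[a-z\-]+", "abc-xy") is not None
class_leading = re.fullmatch(r"[-a-z]+", "abc-xy") is not None
```

In Python both variants of the pattern accept and reject the same hostnames, so at least in this engine the escaped hyphens are harmless; whether the same holds for the library syslog-ng uses is not something this sketch can confirm.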
catenate wrote:
>> Has anyone any idea about this? It looks to me that regex don't work on the "host()" options at all. I have mine set to a regex, and it's capturing all sorts of traffic from other syslog clients that don't match :-(
> Remove the backslashes before the hyphens - you'd only need to do that inside a character class, e.g. [a-z\-] to match any of a through z and hyphen. Outside a character class it means itself (or if it's the first character in a character class and not escaped, like this [-a-z]).
Didn't help I'm afraid. I've got

host ("-ids-")

and it's still picking up data from boxes that don't contain "-ids-" in their hostname.

One thing I didn't mention is that all the incorrect hosts being picked up have their syslogs "routed" through another syslog-ng server running on a host that does match "-ids-". Could that be a cause?

i.e.

hostname.my.network -- syslog-ng --> host-ids-01.my.network -- syslog-ng --> my.central.syslog.server

and my.central.syslog.server is logging entries from hostname.my.network as if it matches host("-ids-").

This is a bit of an issue, as it means I'm ending up with records being recorded incorrectly 2-4 times - I'm running out of disk space! (around 15G a week now when it should be 5G)

--
Cheers

Jason Haar
On Mon, 03 Oct 2005 10:41:22 +1300, Jason Haar said:
> One thing I didn't mention is that all the incorrect hosts being picked up have their syslogs "routed" through another syslog-ng server running on a host that does match "-ids-", could that be a cause?
This is the exact reason that 'options { keep_hostname(yes); }' exists.
On Mon, Oct 03, 2005 at 10:41:22AM +1300, Jason Haar wrote:
> catenate wrote:
>>> Has anyone any idea about this? It looks to me that regex don't work on the "host()" options at all. I have mine set to a regex, and it's capturing all sorts of traffic from other syslog clients that don't match :-(
>> Remove the backslashes before the hyphens - you'd only need to do that inside a character class, e.g. [a-z\-] to match any of a through z and hyphen. Outside a character class it means itself (or if it's the first character in a character class and not escaped, like this [-a-z]).
> Didn't help I'm afraid. I've got

But it was still an incorrect regexp.

> host ("-ids-")
> and it's still picking up data from boxes who don't contain "-ids-" in their hostname.
> One thing I didn't mention is that all the incorrect hosts being picked up have their syslogs "routed" through another syslog-ng server running on a host that does match "-ids-", could that be a cause?

So what do the log entries look like, do you have chained hostnames or is it replaced with the relaying host? Paste in a couple entries that are logged incorrectly.

--
Nate

"Man is the only animal that blushes. Or needs to." - Samuel Clemens
Nate Campi wrote:
>> host ("-ids-")
>> and it's still picking up data from boxes who don't contain "-ids-" in their hostname.
>> One thing I didn't mention is that all the incorrect hosts being picked up have their syslogs "routed" through another syslog-ng server running on a host that does match "-ids-", could that be a cause?
> So what do the log entries look like, do you have chained hostnames or is it replaced with the relaying host?
> Paste in a couple entries that are logged incorrectly.
OK, but I don't have the hostnames in the content - I have them in the directory name instead - see below.

I have "keep_hostname (no)" set (and yes, I know... - but wait).

I have (1)

destination d_dir_messages { file("/var/log/syslog/$HOST/$YEAR/$MONTH/$DAY/raw" template("$R_ISODATE $HOST $FACILITY $PRIORITY $MSG\n")); };
log { source(s_local); destination(d_local_messages); };

And in the directories created, $HOST is converted into the hostname of the original syslog client - irrespective of whether or not it was "gatewayed" via an intermediary syslog-ng server (exclusively from syslog-ng over TCP, if that makes a difference).

I also have (2)

destination d_dir_IDS { file("/var/log/syslog/$HOST/$YEAR/$MONTH/$DAY/IDS-logs" template("$R_ISODATE $MESSAGE\n")); };
filter f_process_IDS { host("-ids-") and not host("xx-ids-02.my.net"); };
log { source(s_local); filter(f_process_IDS); destination(d_dir_IDS); };

In the case of (2), I am seeing IDS-logs files from hosts that don't match the f_process_IDS filter.

It has been mentioned that "keep_hostname" could be the cause, but I have tried it set to both "no" and "yes" and it has made no difference - I still see the wrong hosts being matched.

Your comment about chained hostnames makes me wonder if the HOST variable is different when used in a directory/file context than when it's part of a template definition?

--
Cheers

Jason Haar
Hi there

I brought this up a couple of weeks ago ("How does regex work with HOST definitions?") and I now think it's a bug.

Basically if you call HOST as part of a template call such as:

template("$R_ISODATE $HOST $FACILITY $PRIORITY $MSG\n")

or

file("/var/log/syslog/$HOST/$YEAR/$MONTH/$DAY")

then HOST is *the first syslog client* sending the syslog record (assuming keep_hostname is set). i.e. HOST might be the actual client that physically sent the record - or it might be the client gatewayed through a previous syslog server.

However, if you are referring to the remote syslog client via a regex in a filter, such as

filter f_process_TIBS { host("-ids-"); };

then it appears that "host" is literally *the last syslog client* - instead of *the first syslog client*. e.g. if you have a syslog client (clientA) that forwards to serverB, and serverB forwards to serverC, then for a particular clientA record, HOST on serverC is "clientA", but "host" refers to "serverB".

I can see this by using lsof. I can see that the likes of /var/log/syslog/clientA/2005/10/17/filename is open for write, although the clientA hostname doesn't match the filter associated with that path - but the serverB that clientA gateways through does...

Can someone check if this is true? My problem is that the above filter on "serverC" basically matches all syslog clients, whereas running the same config on serverB only matches the appropriate clientA hosts - as I want.

Thanks

--
Cheers

Jason Haar
On Mon, 2005-10-17 at 12:40 +1300, Jason Haar wrote:
> However, if you are referring to the remote syslog client via a regex in a filter, such as
> filter f_process_TIBS { host("-ids-") }
> then it appears that "host" is literally *the last syslog client* - instead of *the first syslog client*. e.g. if you have a syslog client (clientA) that forwards to serverB, and serverB forwards to serverC, then for a particular clientA record, HOST on serverC is "clientA", but "host" refers to "serverB".
I don't see how this could be the case. $HOST is expanded to the same value as is used for host() filtering, more specifically "struct log_info->host".

Filtering:

    static int
    do_filter_host(struct filter_expr_node *c,
                   struct log_filter *rule UNUSED,
                   struct log_info *log)
    {
      CAST(filter_expr_re, self, c);

      return (!regexec(&self->regex, (char *) log->host->data, 0, NULL, 0)) ^ c->comp;
    }

Macro expansion:

    case M_HOST:
      {
        /* host */
        struct ol_string *host = (id == M_HOST ? msg->host : msg->host_from);
        UINT8 *p1;
        UINT8 *p2;
        int remaining;

        p1 = memchr(host->data, '@', host->length);
        if (p1)
          p1++;
        else
          p1 = host->data;
        remaining = host->length - (p1 - host->data);
        p2 = memchr(p1, '/', remaining);
        if (p2)
          {
            length = LIBOL_MIN((unsigned int) (p2 - p1), *left);
          }
        else
          {
            length = LIBOL_MIN(*left, (unsigned int) (host->length - (p1 - host->data)));
          }
        length = append_string(dest, left, (char *) p1, length, escape);
        break;
      }

The long code in the macro expansion does nothing but strip off everything before '@' and after the first '/' (but there's $FULLHOST, which does not do this).

--
Bazsi
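One possible reading of the two code paths quoted above is that they can disagree whenever the stored host string contains an '@' or a '/': the filter runs its regex over the whole string, while the macro keeps only the part between them. The sketch below models that in Python rather than C. The function names are made up for illustration, and the stored value `clientA/host-ids-01.my.network` is an assumption about what a chained hostname might look like on the central server; nothing in the thread confirms that exact format.

```python
import re

def host_macro(host: str) -> str:
    """Model of the quoted M_HOST expansion: drop everything up to the
    first '@' and everything from the first '/' after it."""
    start = host.find("@") + 1          # find() gives -1 if absent, so start is 0
    end = host.find("/", start)
    return host[start:] if end == -1 else host[start:end]

def host_filter(pattern: str, host: str) -> bool:
    """Model of the quoted do_filter_host: an unanchored regex search
    over the whole stored host string."""
    return re.search(pattern, host) is not None

# Hypothetical stored value for a message relayed through an "-ids-" host.
stored = "clientA/host-ids-01.my.network"

print(host_macro(stored))                   # what $HOST would expand to in this model
print(host_filter("-ids-", stored))         # what host("-ids-") would decide in this model
print(host_filter("^[^/]*-ids-", stored))   # pattern restricted to the part before '/'
```

If that assumption holds, it would account for files being created under the original client's name while the filter still matches on the relay's name. The last pattern is only an illustration of restricting the match to the text before the first '/'; it has not been tried against syslog-ng itself.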
Hello,

Try Regex Coach, it helps a lot ^_^
http://www.weitz.de/regex-coach/

JF
participants (6): Balazs Scheidler, catenate, Jason Haar, jf, Nate Campi, Valdis.Kletnieks@vt.edu