[syslog-ng] Anyone got a well performing search interface for syslog data?

Roberto Nibali ratz at drugphish.ch
Tue Sep 6 21:38:49 CEST 2005


I don't want to spoil the party ...

> We generate around 1GB syslog data per hour (peak),
> and still haven't found a good interface to search archived data.

We basically use grep, which we patched a bit to speed up the search ;).

> I am seriously considering "PHPsyslogNG", but we are concerned
> about the security risks of installing mysql and php on our
> otherwise very locked down OpenBSD loggers.

mysql shouldn't be a problem, and for php you can google for one of the 
PHP hardening projects.

Over the years we have been doing centralised log file analysis, we've 
come to realise that DBs just don't cut it, as strange as that may 
sound. We make heavy use of macro expansion and build up a hierarchy of 
logfiles through simple filesystem directories. We simply had problems 
extracting important information from GBytes of log entries stored in a 
DB, whether postgres or mysql. The current key to success is to write 
appropriate filters that dissect incoming log data in an intelligent 
way and store it in a directory structure using macro expansion.
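To illustrate, here is a minimal syslog-ng.conf sketch of such a 
macro-expanded directory tree. The source and filter names (s_net, 
f_firewall) are made up for the example; the $HOST/$YEAR/$MONTH/$DAY 
macros and create_dirs() are standard syslog-ng features:

```conf
# Fan incoming logs out into a per-host, per-day directory tree.
# create_dirs(yes) lets syslog-ng build missing path components.
destination d_tree {
    file("/var/log/hosts/$HOST/$YEAR/$MONTH/$DAY/$FACILITY.log"
         create_dirs(yes));
};

# Hypothetical filter: keep only firewall traffic (facility is an
# assumption, adjust to whatever your devices actually send).
filter f_firewall { facility(local4); };

log { source(s_net); filter(f_firewall); destination(d_tree); };
```

With a layout like this, "searching" often reduces to grepping a 
handful of small per-host, per-day files instead of one huge archive.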

> I am primarily worried about the integrity of the original log archives,

Then a DBMS is the way to go.

> so I may end up deploying a new server with either PHPsyslogNG or MARS,
> and feeding a copy of the log stream to that new host.

What kind of information exactly do you need to extract? That's maybe 
the question most people need to ask themselves when deploying syslog 
servers. Do you simply want to browse through some logfiles to 
cherry-pick suspicious lines, or are you after correlated data for 
security information and event management?

> Another trick is to do this:
> 
> fgrep -h "192.168.1." * |tee /tmp/temp192-168-1.log |egrep "(ftp|http|deny)" 
> 
> If you need to "re-run the command with another grep on the end",
> you can use /tmp/temp192-168-1.log as the source, instead of the
> complete logs.  Just make sure /tmp has room to spare :)

As a short sidenote: grep recently changed maintainership, and one 
development that strikes me as strange is a change in how fgrep and 
egrep are dealt with. Basically egrep is grep -E and fgrep is grep -F, 
and egrep resp. fgrep are normally symlinks to /bin/grep. This will 
change in the future: those symlinks will become real files, which 
means you'll lose some time if you use egrep|fgrep instead of grep -E 
or grep -F ;). With your pipe orgy I reckon this doesn't really matter, 
though. We had to patch grep heavily to reduce our '|' orgy.
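For what it's worth, the aliases map directly onto grep flags, and the 
two-stage pipe can often be collapsed into one pass. A small sketch 
(the sample log lines are made up for illustration):

```shell
# A hypothetical sample log; the host prefix and keywords mirror the
# quoted example above.
printf '%s\n' \
  '192.168.1.5 http GET /index.html' \
  '10.0.0.1 ftp RETR file' \
  '192.168.1.9 deny tcp 22' > /tmp/sample.log

# The quoted pipe, spelled out with the long-lived flag forms:
grep -F "192.168.1." /tmp/sample.log | grep -E "(ftp|http|deny)"

# Collapsed into a single grep -E pass; this assumes the address
# precedes the keyword on each line, as it does in this sample.
grep -E "192\.168\.1\..*(ftp|http|deny)" /tmp/sample.log
```

The single-pass form saves a process and a pipe at the cost of that 
field-order assumption, which may or may not hold for your log format.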

> We've considered appliances from companies such as LogLogic,
> at one time had a budget to purchase a syslog appliance..
> As it turns out most "appliances" are LAMP with a nice GUI,
> and usually either have limitations on the types and formats of the
> log source data they will accept, or charge a license fee for modules
> to process different event sources, or even a fee per source host!

Interesting...ly strange business model.

> Around 2001 NFR offered their "SLR" syslog appliance, they
> no longer sell this but SLR may be available to existing customers.
> Another appliance option to consider is Cisco's MARS product,
> (formerly Protego), which includes its own Oracle backend

Thanks for this input. Regards,
Roberto Nibali, ratz
-- 
echo 
'[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc
