On 9/5/05, Jason Haar <Jason.Haar@trimble.co.nz> wrote:
We're generating around 4Gb syslog data per week, and I'm looking for a good search interface into it.
We generate around 1GB of syslog data per hour (peak), and still haven't found a good interface for searching archived data. I am seriously considering "PHPsyslogNG", but we are concerned about the security risks of installing MySQL and PHP on our otherwise very locked-down OpenBSD loggers. I am primarily worried about the integrity of the original log archives, so I may end up deploying a new server with either PHPsyslogNG or MARS and feeding a copy of the log stream to that new host.
I can cut my way through it with egrep/etc, but waiting 10-15min for a result really isn't going to break any speed records. Especially when I then need to re-run it with another "grep" on the end of it! ;-)
I've found that the fastest way to search large text files from the command line is to start with an 'fgrep' to get a broad match, then use egrep to look for specific information.
Another trick is to do this:

    fgrep -h "192.168.1." * | tee /tmp/temp192-168-1.log | egrep "(ftp|http|deny)"

If you need to "re-run the command with another grep on the end", you can use /tmp/temp192-168-1.log as the source instead of the complete logs. Just make sure /tmp has room to spare :)
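The same two-stage idea can be wrapped in a pair of small shell functions. This is only a rough sketch: broad_search and narrow_search are made-up helper names, and it assumes a grep that accepts -F, -E and -h (both the GNU and BSD versions do).

```shell
#!/bin/sh
# Two-stage search: a cheap fixed-string pass over everything, then
# regex passes over the much smaller intermediate file.
# broad_search and narrow_search are hypothetical helper names.

# Stage 1: literal-string match across all given log files, saved for reuse.
# Usage: broad_search FIXED_STRING CACHE_FILE LOGFILE...
broad_search() {
    pattern=$1
    cache=$2
    shift 2
    # -F: treat the pattern as a literal string; -h: omit file names
    grep -F -h -- "$pattern" "$@" > "$cache"
}

# Stage 2: extended-regex match against the cached subset only.
# Usage: narrow_search REGEX CACHE_FILE
narrow_search() {
    grep -E -- "$1" "$2"
}
```

Run broad_search once with the subnet string, a scratch file, and the log files, then call narrow_search against the scratch file as many times as needed. Only the first pass has to read the full archive. Creating the scratch file with mktemp rather than a fixed name in /tmp is probably safer on a shared host.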
Has anyone come up with a good speedy way of coping with Gbytes of syslog data?
I get "acceptable" search times when looking at short time ranges (usually just a couple of hours at a time) by coding common queries as Perl scripts. This also makes it easy to generate histograms and summary reports, in text, HTML, or both.
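For illustration, here is roughly what a per-hour histogram looks like when done with awk instead of Perl. This is not one of my actual scripts, just a sketch: hourly_histogram is a made-up name, and it assumes the traditional syslog timestamp ("Mon DD HH:MM:SS") at the start of every line.

```shell
#!/bin/sh
# Count syslog lines per hour. Assumes each line starts with the
# traditional BSD syslog timestamp: "Mon DD HH:MM:SS host ...".
# hourly_histogram is a hypothetical helper name.
# Usage: hourly_histogram LOGFILE...   (or pipe data in on stdin)
hourly_histogram() {
    awk '
        {
            split($3, t, ":")                        # t[1] is the hour
            key = sprintf("%s %2d %s:00", $1, $2, t[1])
            count[key]++
        }
        END {
            for (k in count) printf "%s %d\n", k, count[k]
        }
    ' "$@" | sort
}
```

Something like "hourly_histogram /tmp/temp192-168-1.log" would then print one line per hour with a match count. The sort is plain text order, so it is only chronological within a single month.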
Or is it time to invest in some Appliance or the like?
We've considered appliances from companies such as LogLogic, and at one time had a budget to purchase a syslog appliance. As it turns out, most "appliances" are LAMP with a nice GUI, and usually either have limitations on the types and formats of log source data they will accept, or charge a license fee for modules to process different event sources, or even a fee per source host! Around 2001 NFR offered their "SLR" syslog appliance; they no longer sell it, but SLR may be available to existing customers. Another appliance option to consider is Cisco's MARS product (formerly Protego), which includes its own Oracle backend.
Kevin Kadow