Jason,

I've been looking into php-syslog-ng (http://freshmeat.net/projects/phpsyslogng/), which, as the name might suggest, is a PHP/MySQL frontend for syslog-ng. For large amounts of data you can use the "logrotate" function it provides to create a new database every day/week/whatever. That way, as long as you know the date of what you're looking for, the search stays small (there's a rough sketch of the idea below). If you're not sure of the date you can still search across all the databases, but be prepared to wait! The databases are indexed and optimized, which makes them a lot faster than grep.

Another alternative is to leave the data in text files and index them with something like "beagle" (http://beaglewiki.org/Main_Page) or "penetrator" (http://freshmeat.net/projects/penetrator/). Then you only need to search the index, which tells you exactly where to look.
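To make the per-day idea concrete, here's a minimal sketch. It uses SQLite instead of MySQL just to stay self-contained, and the file naming, table layout and sample query are my own assumptions for illustration, not php-syslog-ng's actual schema. The point is simply that each day lives in its own small, indexed database, so a dated search never has to scan the whole pile:

import sqlite3
from datetime import date

def open_day_db(day: date) -> sqlite3.Connection:
    """Open (or create) the database holding one day's worth of syslog lines."""
    conn = sqlite3.connect(f"syslog-{day.isoformat()}.db")
    conn.execute(
        """CREATE TABLE IF NOT EXISTS logs (
               logged_at TEXT,   -- timestamp from the syslog header
               host      TEXT,   -- originating host
               program   TEXT,   -- e.g. sshd, postfix
               message   TEXT)"""   -- free-text part of the line
    )
    # Indexes on host/program keep the common "which box / which daemon"
    # searches fast; the date scoping comes for free from the per-day file.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_host ON logs (host)")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_prog ON logs (program)")
    return conn

def search_day(day: date, program: str, pattern: str):
    """Search one day's database: the index narrows by program, LIKE does
    the grep-style free-text match on whatever is left."""
    conn = open_day_db(day)
    cur = conn.execute(
        "SELECT logged_at, host, message FROM logs "
        "WHERE program = ? AND message LIKE ?",
        (program, f"%{pattern}%"),
    )
    return cur.fetchall()

if __name__ == "__main__":
    # e.g. all sshd lines mentioning 'Failed password' for a date you know
    for row in search_day(date.today(), "sshd", "Failed password"):
        print(row)

Searching across all the databases is then just a loop over the per-day files, which is where the "be prepared to wait" part comes in.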
Regards,

Jim

Jason Haar wrote:
We're generating around 4Gb of syslog data per week, and I'm looking for a good search interface into it.
I can cut my way through it with egrep etc., but waiting 10-15 minutes for a result really isn't going to break any speed records, especially when I then need to re-run it with another "grep" on the end of it! ;-)
I have tried injecting it into a MySQL database using some schemas I found on the Internet, but the performance didn't seem much better to me, and you lose the "free-text" nature of grep (or more specifically, the sorts of searches I find I want to do aren't SQL-friendly).
Has anyone come up with a good, speedy way of coping with gigabytes of syslog data? Or is it time to invest in an appliance or the like?