Hi Guys,

I am storing logs on a central server with a 3T SAN, using the following template:

destination indexlog {
    file("/logs/log01/indexlog/$YEAR/$MONTH/$DAY/$HOST"
        template("$HOUR:$MIN:$SEC,$PROGRAM,$FACILITY,$PRIORITY,$MSGONLY\n")
        template-escape(yes) owner(root) group(root)
        perm(0644) dir_perm(0755) create_dirs(yes));
};

My logging works perfectly :) e.g. /logs/log01/indexlog/2006/05/11/hostnames

I want to have a GUI to view the logs with the following facilities:
- search logs by date/time, text patterns in messages, and hostnames
- provide filters associated with user authentication/authorization
- do parallel searches to improve search response time

Regards,
-Manish
On May 11, 2006, at 6:51 AM, Arya, Manish Kumar wrote:
Hi Guys,
I am storing logs on a central server with a 3T SAN, using the following template:
destination indexlog {
file("/logs/log01/indexlog/$YEAR/$MONTH/$DAY/$HOST"
template("$HOUR:$MIN:$SEC,$PROGRAM,$FACILITY,$PRIORITY,$MSGONLY\n") template-escape(yes) owner(root) group(root) perm(0644) dir_perm(0755) create_dirs(yes)); };
my logging is done perfectly :)
like /logs/log01/indexlog/2006/05/11/hostnames
I want to have a GUI to view the logs with the following facilities:
- search logs by date/time, text patterns in messages, and hostnames.
http://www.cs.sandia.gov/sisyphus/ mines patterns, but does not have a production GUI (yet). It is more of a research tool at this point, but I would be happy to help you give it a try. Recent emphasis has been on the functionality described in .../detection.pdf.

Please let me know if you are interested; as I said, I'd be happy to help, and I am in fact looking for additional datasets to analyze. I find my approach effective for supercomputer logs, but have not yet explored its effectiveness for other log sets (e.g. enterprise). I've been waiting to implement a production GUI until I am confident that the underlying functionality is general and excellent. My current leaning is towards adding sisyphus functionality to splunk's interface (and I have contacted splunk about this).

G'day!

--
+--------------------------------------------------------------+
| Jon Stearley (505) 845-7571 (FAX 844-9297)                    |
| Sandia National Laboratories  Scalable Systems Integration    |
+--------------------------------------------------------------+
On May 11, 2006, at 9:49 AM, Jon Stearley wrote:
My current leaning is towards adding sisyphus functionality to splunk's interface (and have contacted splunk about this).
Or, something lighter like the attached concept manpage. Anyone interested in such a cli tool? -jon
If you are splitting all logs up into subdirs like that, you will have quite a fun time doing any parsing. I use php-syslog-ng, which reads from MySQL; syslog-ng feeds MySQL through a named pipe, set up as follows in the conf:

source s_tcp { tcp(); };
source s_udp { udp(); };
source s_local { unix-stream("/dev/log"); internal(); };

destination d_mysql {
    pipe("/var/log/mysql.pipe"
        template("INSERT INTO logs (host, facility, priority, level, tag, datetime, program, msg) VALUES ( '$HOST', '$FACILITY', '$PRIORITY', '$LEVEL', '$TAG', '$YEAR-$MONTH-$DAY $HOUR:$MIN:$SEC', '$PROGRAM', '$MSG' );\n")
        template-escape(yes));
};

log { source(s_tcp); destination(d_mysql); };
log { source(s_udp); destination(d_mysql); };

I have filters and other log facilities set up, but this is the basic layout; the docs on the php-syslog-ng site are very simple to follow. I was going to try splunk, but much of the time I find that I'm in the shell doing my reports and searches against the log file.
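For illustration, a minimal sketch of a MySQL table that would accept the INSERT template above; the column names come from the template, but the types, indexes, and storage engine here are assumptions, and the actual php-syslog-ng schema may differ:

-- hypothetical table matching the pipe template above
CREATE TABLE logs (
    host     VARCHAR(128) NOT NULL DEFAULT '',
    facility VARCHAR(16),
    priority VARCHAR(16),
    level    VARCHAR(16),
    tag      VARCHAR(16),
    datetime DATETIME,
    program  VARCHAR(64),
    msg      TEXT,
    -- index the columns most searches filter on
    KEY idx_host (host),
    KEY idx_datetime (datetime),
    KEY idx_program (program)
) ENGINE=MyISAM;

With a table like this in place, the pipe template above produces one INSERT, and therefore one row, per log message.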
On May 11, 2006, at 12:09 PM, Ken Garland wrote:
file("/logs/log01/indexlog/$YEAR/$MONTH/$DAY/$HOST" ... - should be able to do parallel searches to improve search response time.
If you decide to go with SQL and have $$, netezza.com will almost certainly overcome your speed issues (parallel hardware SQL!). Having gotten utterly bogged down with MySQL on Linux (stripes, chunks, huge indexes), I just went back to files because they are simple and sufficient for my purposes.
if you are splitting all logs up into subdirs like that you will have quite a fun time doing any parsing.
If dirs/logs are arranged according to the factors used for subset selection (year/month/day/host) and the dirs/logs are listed in a (periodically updated) file (e.g. "corpus.docs" in sisyphus), subset selection can be done by simply grepping the file and concatenating the resulting dirs/logs. This is one implementation option underlying the clog.man page I sent earlier. Further subset selection by facility and priority could then be done by grepping the resulting log content (further dirs/logs splitting by facility/priority presents multiple bad side effects). $0.02 -jon
Hi Guys,

Thanks for your valuable suggestions for a syslog-ng UI.

I have seen that most of the UIs available use databases. I have a syslog-ng + Oracle setup too, but I am not happy with the performance. We have a central log server with a 3000G SAN and 15 GB RAM, and 20,000 devices are supposed to pump logs 24x7 :) With Oracle we faced two serious issues; that's why I also started pumping logs into files along with the db.

- inserts: I am using a named pipe to insert logs into the db, but Oracle somehow drops inserts, because the "rate of arrival of events" is much larger than the "rate of insert operations". I have noticed about 80-90% event drops in the db.
- selects: when we search logs, performance was really bad; it took too long to return results. We then indexed on hostname and partitioned the table on time (a new range partition is created every 6 hrs), which improved performance to some extent (a rough sketch of this layout follows after this message).

Can you guys suggest whether MySQL or PostgreSQL would be better for the above two problems? (But remember our db is huge :), so I am not sure if MySQL or PostgreSQL can handle such a big db.)

Regards,
-Manish
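As a rough illustration of the layout Manish describes (a hostname index plus time-based range partitions), an Oracle sketch might look like the following; the table, columns, and partition boundaries are made up for illustration, and in practice a scheduled job would keep adding a new partition every 6 hours:

-- hypothetical partitioned log table; not the actual schema from this thread
CREATE TABLE logs (
    datetime DATE,
    host     VARCHAR2(128),
    facility VARCHAR2(16),
    priority VARCHAR2(16),
    program  VARCHAR2(64),
    msg      VARCHAR2(4000)
)
PARTITION BY RANGE (datetime) (
    -- one partition per 6-hour window, created ahead of time by a maintenance job
    PARTITION p20060514_00 VALUES LESS THAN (TO_DATE('2006-05-14 06:00', 'YYYY-MM-DD HH24:MI')),
    PARTITION p20060514_06 VALUES LESS THAN (TO_DATE('2006-05-14 12:00', 'YYYY-MM-DD HH24:MI')),
    PARTITION p_future     VALUES LESS THAN (MAXVALUE)
);

-- local (per-partition) index on hostname, as described above
CREATE INDEX logs_host_idx ON logs (host) LOCAL;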
Just like Arya, we too have a performance issue with this syslog-ng setup. We have only 20 servers sending logs to one server, and the queries are taking a long time to display results. Is there any other way of optimising the db, or any special tweaking that can be done to the MySQL server?

Thanks in advance for any suggestions,
Deva
Hi,

Disclaimer: this advice all comes from the 'abyss' that is my brain and has never gone into practice, but it does come from issues I have had to deal with myself in the past.

Dev Anand <deva.security@gmail.com> [20060514 17:19:27 +0530]:
Just like Arya, we too have a performance issue with this syslog-ng setup.
A lot of problems pile up because people seem to love SQL a little too much; however, as an alternative to flat text files it actually is a pretty neat thing.
We have only 20 servers sending logs to one server, and the queries are taking a long time to display results.
That's probably the result of 'sub-optimal' SQL queries, table layout and indexing. SQL can never be counted on for realtime, or even 'semi' realtime, applications, regardless of what people say. A database should always be thought of as a slow hard disk; there is always going to be a bottleneck due to it. The solution is to devise a method that removes SQL latency from the equation. This might seem difficult, but in practice it is rather straightforward. There are two methods; for the OP I would recommend the latter.
Is there any other way of optimising the db, or any special tweaking that can be done to the MySQL server?
Writing to the DB:
------------------
Have a second buffer which chains up the SQL queries to create a single multi-row INSERT (a sketch follows below). You have probably seen the difference in speed when you mysqldump a database as single INSERTs versus multi-row INSERTs and then restore it; I would say it is at least ten times faster, if not more. This will reduce the SQL latency issue, but it will not remove it.

The second approach removes SQL latency altogether. Data is dumped to local log files with something like a five to sixty minute log rotation. A cron job picks up these files and pushes them into the database. This part can be run at slightly lower priority (if it lives on the same machine as the SQL server), nice 19 or something, so that the rest of the machine is always available to receive events. Use a maildir-like approach to processing the logs, where you have:

logs---+-recorded
       +-processing
       `-finished

Log files start in 'recorded'. Your script picks the oldest file in 'recorded' (if there is only one log file it skips this turn) and moves it into 'processing'. Then it works on the file. When finished, it drops it into 'finished'. The cron script runs once a minute. If there is already a file in 'processing' it does nothing; if the file in 'processing' is older than 'x minutes', an alert could be sent to the administrator. This detects when updates are taking too long or a script has stalled, which points to a problem. It is important that only one script runs at a time, as the DB is going to be in a locked state anyway while the recorded logs are pushed into it; running a second instance is only going to slow things down further, or go nowhere until the first script has unlocked the database. Now SQL latency is no longer an issue.

Reading from the DB:
--------------------
It is worth checking your SQL statements to make sure they do not clobber the entire database pointlessly and do not soak up huge amounts of RAM during execution. Indexing can help, but do not get carried away: only index things like the date or the machine the data came from; I doubt there is much else that is useful to index.

Whilst INSERTing the rows you might want to consider some pre-processing. If the log entry comes from a mail server and it is the SMTP daemon, flag that entry in an ENUM column as being part of an SMTP daemon. You are then effectively creating an index based on the data contained in the log messages, and that column you can then index on.

You will probably find that executing your queries on the live recording server makes things abysmal. This is bad practice, especially on a heavily used logging server: every time a query is made it probably has to wait for a stack of INSERTs to complete, and then when you do execute your query you overflow the buffer on syslog-ng (if you are not using my second writing method) and lose a lot of data. You should probably consider replicating the data off at intervals to a second 'read-only' MySQL setup. I would consider 'mysqldump' as it is damn fast, but you will probably find you have to introduce table rotation as well to make that effective. Of course, if you go with my second DB writing method you can make the queries directly on the DB without losing any data.
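To make the batching idea under "Writing to the DB" concrete, here is the kind of multi-row INSERT a buffering script could emit, using the logs table layout from the php-syslog-ng example earlier in the thread; the rows themselves are made-up examples:

-- one statement per event: per-statement overhead dominates at high message rates
INSERT INTO logs (host, facility, priority, datetime, program, msg)
VALUES ('web01', 'daemon', 'info', '2006-05-14 10:00:01', 'sshd', 'Accepted publickey for root');

-- buffered and flushed as a single multi-row statement: far fewer round trips
INSERT INTO logs (host, facility, priority, datetime, program, msg) VALUES
('web01', 'daemon', 'info',   '2006-05-14 10:00:01', 'sshd',    'Accepted publickey for root'),
('mx01',  'mail',   'notice', '2006-05-14 10:00:01', 'postfix', 'connect from unknown'),
('db01',  'auth',   'err',    '2006-05-14 10:00:02', 'su',      'FAILED su for root on pts/0');

The buffering script simply accumulates rows for a few seconds (or up to some row count) and emits one such statement per flush.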
Thanks in advance for any suggestions,
I just hope my advice does not cause your pants to explode... that's my final disclaimer :)

Cheers,
Alex
On Sun, May 14, 2006 at 10:08:56PM +0100, Alexander Clouter wrote:
Whilst INSERTing the rows you might want to consider some pre-processing. If the log entry comes from a mail server and it is the SMTP daemon, flag that entry in an ENUM column as being part of an SMTP daemon. You are then effectively creating an index based on the data contained in the log messages, and that column you can then index on.
In particular, beware that if you are doing queries like

    select * from logs where msg like '%mail%';

then they will almost certainly be unable to use an index, even if you have one (unless your DB supports some very fancy full-text indexing). That is, typically:

    like 'foo%';   -- fast, uses index
    like '%foo';   -- slow, won't use index, forces full table scan
    like '%foo%';  -- slow, won't use index, forces full table scan

So, it's a lot better to pre-parse the log lines into the fields of interest, and put those fields into separate database columns, suitably indexed, if you intend searching on them.

Regards, Brian.
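A small illustration of the point, assuming the log lines have been pre-parsed into program and datetime columns (the column and index names are assumptions, not an existing schema):

-- leading wildcard: no ordinary index can help, so this forces a full table scan
SELECT * FROM logs WHERE msg LIKE '%postfix%';

-- pre-parsed, indexed columns: the planner can use the indexes
CREATE INDEX idx_logs_program  ON logs (program);
CREATE INDEX idx_logs_datetime ON logs (datetime);

SELECT *
FROM logs
WHERE program = 'postfix'
  AND datetime >= '2006-05-14 00:00:00'
  AND datetime <  '2006-05-15 00:00:00';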
Hi,

I'm experiencing problems too, but not in that way. I maintain a loghost for hundreds of servers, and every piece of log goes into a PostgreSQL DB. One week of activity is about 90GB of data.

syslog-ng writes into a named pipe; a Perl script reads it and executes the INSERT queries. The loghost itself is a P4 2.8GHz with 2GB of memory and has no problem doing the inserts.

The problems I have come from the SELECTs done on the DB for reporting. I've resolved most of the performance issues with specific indexes, especially for full-text searches.

From my point of view, your problem comes from the Oracle DB tuning, and probably from the listener itself. But my Oracle knowledge is poor.

Cheers,

Didier
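Didier does not say exactly which indexes he used, but one common way to get indexed full-text search over the message column in PostgreSQL is sketched below; this uses the full-text functions that became built in with 8.3, while the PostgreSQL releases of that era offered equivalents through the tsearch2 contrib module, and the table and column names here are assumptions:

-- expression index over the message text
CREATE INDEX logs_msg_fts_idx ON logs USING gin (to_tsvector('english', msg));

-- indexed full-text search instead of LIKE '%...%'
SELECT datetime, host, program, msg
FROM logs
WHERE to_tsvector('english', msg) @@ to_tsquery('english', 'failed & password');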
Have you noticed whether events are being dropped by the db? Try pushing logs into both files and the db, and then count the number of events in the db and in the file for a certain time period :) In my case the files have 8 to 10 times more events than the db, which means SQL is dropping many events.
I've not tested that. But since you say so, I will probably do it ;-)

Didier
Hello,
I have seen that most of the UIs available use databases. I have a syslog-ng + Oracle setup too, but I am not happy with the performance.
Any numbers, performance figures?
we have a central log server with a 3000G SAN and 15 GB RAM, and 20,000 devices are supposed to pump logs 24x7 :)
What's your expected/measured:
 o rate of arrival, avg/peak, in lines/s and socket connections/s
 o data volume/s
of this central log server?

What kind of hardware/OS/Oracle version and configuration are you running this central log server with?
With Oracle we faced two serious issues; that's why I also started pumping logs into files along with the db.
It's almost always faster to write log files to flat files.
- inserts: I am using a named pipe to insert logs into the db, but Oracle somehow drops inserts, because the "rate of arrival of events" is much larger than the "rate of insert operations". I have noticed about 80-90% event drops in the db.
Parallel inserts or one single pipe? Do you purge old data from your DB and, if so, at what interval?
- selects: when we search logs, performance was really bad; it took too long to return results. We then indexed on hostname and partitioned the table on time (a new range partition is created every 6 hrs), which improved performance to some extent.
What's your time frame expectation regarding your select statements?
Can you guys suggest whether MySQL or PostgreSQL would be better for the above two problems? (But remember our db is huge :), so I am not sure if MySQL or PostgreSQL can handle such a big db.)
Cheers,
Roberto Nibali, ratz
--
echo '[q]sa[ln0=aln256%Pln256/snlbx]sb3135071790101768542287578439snlbxq' | dc
Hi,
I have seen that most of the UIs available use databases. I have a syslog-ng + Oracle setup too, but I am not happy with the performance.
Any numbers, performance figures?
[Manish] The database drops 80% of events compared to the files.
we have a central log server with a 3000G SAN and 15 GB RAM, and 20,000 devices are supposed to pump logs 24x7 :)
What's your expected/measured:
 o rate of arrival, avg/peak, in lines/s and socket connections/s
[Manish] avg 300, peak 1000
 o data volume/s
[Manish] 1~2 GB per day, but this will go to 15 GB when we will add more devices very soon
What kind of hardware/OS/Oracle version and configuration are you running this central log server with?
[Manish] Four 1281 MHz SUNW,UltraSPARC-IIIi processors, 16 GB RAM, Oracle 10g and Solaris 10
With Oracle we faced two serious issues; that's why I also started pumping logs into files along with the db.
It's almost always faster to write log files to flat files.
[Manish] Yes, that's why I am logging events to files too, along with the db, for redundancy.
- inserts: I am using a named pipe to insert logs into the db, but Oracle somehow drops inserts, because the "rate of arrival of events" is much larger than the "rate of insert operations". I have noticed about 80-90% event drops in the db.
Parallel inserts or one single pipe? Do you purge old data from your DB and, if so, at what interval?
[Manish] Serial inserts; yes, but only after months, since we have a 3000 GB SAN.
- selects: when we search logs, performance was really bad; it took too long to return results. We then indexed on hostname and partitioned the table on time (a new range partition is created every 6 hrs), which improved performance to some extent.
What's your time frame expectation regarding your select statements?
[Manish] It should return results within 5 sec at most, though after the range partitioning this has improved somewhat (see the sketch after this message).
Can you guys suggest whether MySQL or PostgreSQL would be better to overcome the above two problems? (But remember our db is huge :), so I am not sure if MySQL or PostgreSQL can handle such a big db.)
Regards,
-Manish
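As a follow-up to the select-time discussion above: against a table range-partitioned on datetime, like the hypothetical sketch earlier in the thread, keeping a datetime restriction in every search lets the optimizer prune down to a single 6-hour partition, which is usually what makes a response-time target of a few seconds plausible. The names and dates below are illustrative only:

-- the datetime predicate limits the scan to one 6-hour partition;
-- the host predicate can then use the local index on that partition
SELECT datetime, host, program, msg
FROM logs
WHERE datetime >= TO_DATE('2006-05-15 06:00', 'YYYY-MM-DD HH24:MI')
  AND datetime <  TO_DATE('2006-05-15 12:00', 'YYYY-MM-DD HH24:MI')
  AND host = 'web01'
ORDER BY datetime;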
participants (8)
- Alexander Clouter
- Arya, Manish Kumar
- Brian Candler
- Dev Anand
- Didier Conchaudron
- Jon Stearley
- Ken Garland
- Roberto Nibali