[syslog-ng] Overzealous syslog-ng problem

Aaron Jackson Aaron.Jackson@dc.gov
Tue, 31 Dec 2002 15:11:45 -0500


Ben Russo wrote:

>There are a few ways to look at this problem...
>
>1. The box sending the messages..
>	Do the 16,000,000 messages all have the same facility.priority?
>	traditional syslog on solaris can only decide what to send based
>	on facility and priority (and maybe the "tag" IIRC).
>	So you may or may not be able to filter them at the sending side
>	depending on whether the facility.priority of the messages is 
>	unique to what you want to filter.
>  
>

They are actually 16 million copies of the same message.  I would like
one to be recorded, but not all 16 million.  If I get one, I can
trigger an alarm (actually, the network monitoring people could do
something when that message appears).  The sending machine is running
syslog-ng, so I was hoping I could stop it from writing all of the
messages to local disk and sending them across the network.  I suppose
I could use a match rule to trigger an alarm and filter out the
messages, but the NOC people may not like that.
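
Something along those lines would look roughly like this on the sending
box (a sketch only, untested; the filter and destination names are made
up, the match string is just an example, and flags(final) assumes a
syslog-ng version that understands log-statement flags):

    filter f_accept_err { match("ERROR - error on accept"); };

    log {
            source(src);
            filter(f_accept_err);
            destination(d_alarm);   # a pipe()/program() the NOC could watch
            flags(final);           # keep the flood out of the normal log paths
    };

That would still deliver every copy to d_alarm, though, so the alarm
side would have to cope with duplicates on its own.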

>2.  The syslog-ng receiving the messages...
>	Have your syslog-ng use the "match (regexp)" rule to filter
>	out certain messages, but not others.  Maybe that will work?
>
>3.  Have your perl program decide what to insert and what not to..
>
>As far as losing the messages...  Syslog-ng doesn't buffer, so if your
>mysql database isn't able to keep up with the flood of messages that are
>coming in to the pipe and from there to your perl program then syslog-ng
>drops them. (AFAIK)
>  
>
>The way that I have handled this in my situation is documented at
>http://www.muppethouse.com/~ben/
>  
>

Thanks, I'll take a look.

>I had syslog-ng format my incoming messages into SQL insert statements
>in batches by second.  Then I have a program come by and pick up each
>batch to be inserted and delete the batch file when it finishes.
>This way if there is a flood of messages, they queue up in the directory
>and get pushed into the database ASAP until the queue is empty.
>  
>
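
If I understand that right, it means a template()'d file destination
keyed on the timestamp -- roughly something like the following (my
guess, assuming a syslog-ng build with template() support, and making
up the table layout):

    destination d_sqlbatch {
            file("/var/spool/sqlbatch/$YEAR$MONTH$DAY$HOUR$MIN$SEC.sql"
                 template("INSERT INTO logs (host, facility, priority, msg) VALUES ('$HOST', '$FACILITY', '$PRIORITY', '$MSG');\n"));
    };

The obvious catch is quoting: a message containing a single quote would
break the INSERT, so whatever picks up the batch files has to be
defensive about that.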

My perl script does the same thing.  It basically sits asleep and checks
every few seconds to see whether something has been written to the pipe.
If so, it reads one line at a time until nothing else is there, then
goes back to sleep.  The pipe entries are also preformatted SQL
statements.  What is strange to me is that the sending machine seems to
have no problem writing 16 million entries to disk, and the receiving
machine has the same syslog-ng binary and, for the most part, the same
syslog-ng.conf file.  So either the messages are getting lost in
transport, or the perl SQL inserts are not blocking and happen too fast
for mysql to deal with.  To me, the latter is more troubling.  Either
way, my setup needs to be refined.
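
For reference, the shape of the reader is roughly this (a trimmed
sketch with a made-up pipe path, database name and credentials, not the
actual script); since DBI's do() waits for mysql's reply, the inserts
themselves should already be blocking:

    #!/usr/bin/perl -w
    use strict;
    use DBI;

    my $pipe = "/var/run/syslog-sql.pipe";   # made-up path to the named pipe
    my $dbh  = DBI->connect("DBI:mysql:database=syslog;host=localhost",
                            "loguser", "logpass",
                            { RaiseError => 1, AutoCommit => 1 });

    while (1) {
        unless (open(PIPE, "< $pipe")) {     # bail out and retry if the open fails
            warn "can't open $pipe: $!";
            sleep 5;
            next;
        }
        while (my $sql = <PIPE>) {           # blocks until a line arrives
            chomp $sql;
            next unless $sql =~ /^INSERT INTO /;   # skip mangled lines
            eval { $dbh->do($sql) };         # do() waits for mysql to answer
            warn "insert failed: $@" if $@;
        }
        close(PIPE);                         # writer went away; loop and reopen
        sleep 5;
    }

If the reader falls behind, the pipe can only buffer so much, which
fits Ben's point about syslog-ng dropping what it can't deliver.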

Aaron

>-Ben.
>
>On Tue, 2002-12-31 at 11:44, Aaron Jackson wrote:
>  
>
>>My Setup:
>>I have syslog-ng running on several Solaris 8 machines.  Each machine
>>writes log messages to its local disk and also forwards the messages
>>via a UDP connection to a central log server, also a Solaris 8
>>machine.  The central log server stores everything in a mysql database
>>via a perl script I wrote.
>>
>>My Problem:
>>I am running the UNIX version of Cisco Secure on one of the Solaris 
>>boxes.  A couple of times it has blown up.  When this happens, it 
>>generates millions of log messages in a very short period.  The problem 
>>is that syslog-ng logs most of these messages (I also get the mangled 
>>message problem during these heavy loads).  The most recent episode 
>>generated 1,930,974 messages that made it into the mysql database and 
>>49,573 mangled messages on the central log server, but 16,040,886 
>>messages were written to disk on the local machine (see below).
>>
>>My Questions:
>>Is there any way to throttle syslog-ng, or make syslog-ng not accept all 
>>  log messages when an app goes crazy?  I want to log some of these 
>>messages, so I know when to restart the service, but I don't want all 16 
>>million.  Also, it seems that around 15 million log messages didn't make 
>>it to my central server.  Where were they lost?  Is this a problem with 
>>the UDP transport?
>>
>>Aaron
>>
>># cat local0.log | grep -c 'ERROR - error on accept'
>>16040886
>>
>>jackson@auth:/tmp {5} cat sql_errors | grep -c 'INSERT INTO'
>>49573
>>
>>mysql> delete from logs where host='acs' and facility='local0' and 
>>priority='err' and msg like '%ERROR - error on accept%';
>>Query OK, 1943387 rows affected (1 hour 40.16 sec)
>>
>>
>>_______________________________________________
>>syslog-ng maillist  -  syslog-ng@lists.balabit.hu
>>https://lists.balabit.hu/mailman/listinfo/syslog-ng
>>Frequently asked questions at http://www.campin.net/syslog-ng/faq.html
>>    
>>