I think you have to pay for splunk if you are getting more than 1GB of logs per day. ________________________________________ From: syslog-ng-bounces@lists.balabit.hu [syslog-ng-bounces@lists.balabit.hu] On Behalf Of Brian Long [Brian.Long@CBC-Companies.com] Sent: Thursday, October 11, 2007 2:31 PM To: syslog-ng@lists.balabit.hu Subject: Re: [syslog-ng] Logs out of control You are trying to do this the hard way. Just install splunk and let it read all your logs from a fifo. ------------------------------ Message: 2 Date: Wed, 10 Oct 2007 13:22:04 -0400 From: John Hala <john.hala@villanova.edu> Subject: Re: [syslog-ng] Logs out of control To: Syslog-ng users' and developers' mailing list <syslog-ng@lists.balabit.hu> Message-ID: <294949F5411DFC418D1DEC84BE825AE11DB300AD6C@VUEX2.vuad.villanova.edu> Content-Type: text/plain; charset="us-ascii" I have 170 hosts right now and plan on going up to 250. Currently getting 1.5 GB of logs per day. I think I should try to partition by day. Using php-syslog-ng, I believe most searches will either be by date or by host, or a combination of both. What type of partitioning do you recommend? I was thinking HASH partitioning. With HASH, I won't have to specify the range or exact days to partition by, whereas with HASH, I just specify the column date. In the table for the date column, I went with 'datetime'. I don't quite understand how to partition that out. Here's my newbie idea of how it should be done, however, I don't quite understand how to implement... Meaning, is it possible to partition a table that already has data in it, and if so, how to specify that particular table to partition because in the examples it just shows creating a table and following it by the partition code... ie | host | varchar(128) | YES | MUL | NULL | | | facility | varchar(10) | YES | MUL | NULL | | | priority | varchar(10) | YES | MUL | NULL | | | level | varchar(10) | YES | | NULL | | | tag | varchar(10) | YES | | NULL | | | datetime | datetime | YES | MUL | NULL | | | program | varchar(15) | YES | MUL | NULL | | | msg | text | YES | MUL | NULL | | | seq | bigint(20) unsigned | NO | PRI | NULL | auto_increment | logs | CREATE TABLE `logs` ( `host` varchar(128) DEFAULT NULL, `facility` varchar(10) DEFAULT NULL, `priority` varchar(10) DEFAULT NULL, `level` varchar(10) DEFAULT NULL, `tag` varchar(10) DEFAULT NULL, `datetime` datetime DEFAULT NULL, `program` varchar(15) DEFAULT NULL, `msg` text, `seq` bigint(20) unsigned NOT NULL AUTO_INCREMENT, PRIMARY KEY (`seq`), KEY `host` (`host`), KEY `program` (`program`), KEY `datetime` (`datetime`), KEY `priority` (`priority`), KEY `facility` (`facility`), FULLTEXT KEY `msg` (`msg`) ) ENGINE=MyISAM AUTO_INCREMENT=44728025 DEFAULT CHARSET=latin1 MAX_ROWS=4294967295 AVG_ROW_LENGTH=50 so the partition code would go like... PARTITION BY HASH(DAY(timedate)) PARTITIONS 365; PARTITION BY HASH(HOST(host)) PARTITIONS 500; --> (I am just giving a large number that I don't think the number of hosts will reach) I'm pretty certain, I'm way off, lol. Any help would be greatly appreciated! Thanks, John -----Original Message----- From: syslog-ng-bounces@lists.balabit.hu [mailto:syslog-ng-bounces@lists.balabit.hu] On Behalf Of Valdis.Kletnieks@vt.edu Sent: Saturday, October 06, 2007 12:27 AM To: Syslog-ng users' and developers' mailing list Subject: Re: [syslog-ng] Logs out of control On Fri, 05 Oct 2007 15:33:30 EDT, John Hala said:
how would you cut up the partition... by day?
This would of course depend on your local requirements - but units such as "by day", "by week", "by month" certainly come to mind. You will also want to consider how easy it is to implement local policy/legal requirements such as "logs of XYZ events *must* be kept 180 days", "logs of ABC *must* be discarded after 90 days", and so on. Also add in things like "how many records per day", and "how many machines", and so on. You may want to consider using slightly larger time units, such as "month" or "last 30 days", to simplify your life when you're asked to produce log entries for "ABC for the last week", or consider some other partition such as "per machine per month" if that makes sense in your network. If it helps any, I also manage a server that tracks IDS incidents, and the useful quantities there are "last 2 hours", "last 24 hours", "this month", and "forever". ------------------------------ _______________________________________________ syslog-ng maillist - syslog-ng@lists.balabit.hu https://lists.balabit.hu/mailman/listinfo/syslog-ng End of syslog-ng Digest, Vol 30, Issue 8 **************************************** _______________________________________________ syslog-ng maillist - syslog-ng@lists.balabit.hu https://lists.balabit.hu/mailman/listinfo/syslog-ng Frequently asked questions at http://www.campin.net/syslog-ng/faq.html