Daniel Neubacher <daniel.neubacher@xing.com> writes:
my syslog-ng setup has gotten quite big, with 50k logs per second, and the server seems to hit its IO limit at night. While a few months ago I could run gzip with ionice over all the old logs, the server doesn't like that any more, and quite a lot of logs pile up while the compression runs.
I'm using the OSE, so I've got no logstore. For a second I thought about writing the logs to a compressed FUSE filesystem, but... FUSE :P So how are you guys doing it?
I've used several different approaches over the years; I'll list some
of them, with pros and cons. Rough, untested sketches of each are
appended at the end of this mail.

Rotate & compress
=================

The first approach I used was to simply rotate log files and compress
them. This quickly killed my CPU and disks.

Pros:
 - Simple as a brick.

Cons:
 - CPU and IO intensive, bogs down the computer.

Runtime, external compression
=============================

Another option I played with was to write a very small program that
accepts data on stdin and compresses it on the fly, then send my logs
to that destination. I also kept the most recent logs in uncompressed
files.

Pros:
 - Fairly simple.
 - The CPU/IO load is spread out better.
 - Uncompressed logs are still available: I used
   /var/log/FILENAME-${YEAR}${MONTH}${DAY}.log, and simply deleted
   old ones.
 - You don't need to re-read old logs to archive them; archival
   happens on the fly.

Cons:
 - Requires an external program, which one will have to write
   carefully so as not to lose data.
 - It's much harder to reliably rotate the compressed files. My
   program closed the current file on SIGHUP, and opened a new one.
   Not too elegant, and not really configurable, but it got the job
   done.
 - Still bogs down the CPU and IO. This can be partially addressed
   by writing the compressed files to a different disk than the one
   the uncompressed logs go to.

Runtime archival to external services
=====================================

Since I didn't have the resources to put any more disks into my log
server at the time, IO became a problem. So I moved the archival to a
different server, by sending uncompressed logs over the network, and
moving the runtime compression to the other box. This is pretty much
the same solution as the one above, but instead of a local pipe,
stuff is sent over the network.

Pros:
 - Still simple.
 - IO is done on another box, so it doesn't disturb the local
   uncompressed log storage.

Cons:
 - Needs a separate server.
 - Increased network bandwidth.
 - If archiving is slow, it can still bog down both machines due to
   flow control.
 - Needs potentially large queues on the sending side, and without a
   disk queue, that's not the most reliable thing.

Database
========

This is my current solution. I still have my local logs in files for
easy access, but the archive is stored in a MongoDB cluster.

Pros:
 - IO is spread across a number of machines.
 - Does not bog down the central server, ever.
 - Structured logs, better queryability.

Cons:
 - Data is not compressed.
 - Needs a larger amount of resources to work reliably and
   efficiently.
 - More complicated to set up.
 - MongoDB does have an overhead over simply emitting text &
   compressing it.

A variant of this would be to use AMQP to transfer logs; then you can
attach any number of archival servers to the publisher and spread out
the work nicely. But AMQP adds its own overhead too.

Other solutions
===============

There's a whole lot of other ways to achieve the same thing; the
above are only the few I've personally used in the not too distant
past.
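Rough sketches
==============

As promised above, a few rough illustrations. Paths, host names and
destination names are all made up, so treat these as sketches, not as
configs lifted from production.

The rotate & compress approach is, in the end, just a cron job. Note
that the idle IO class below only has an effect under the CFQ
scheduler, and as said above, even that did not save my disks:

  # Compress rotated logs older than a day, at idle CPU/IO priority.
  find /var/log/archive -name '*.log' -mtime +0 \
      -exec nice -n 19 ionice -c 3 gzip {} +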
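I no longer have the small compressor program, but its core looked
something like the following sketch (Python here for brevity; a real
version needs much more careful error handling and flushing before
you trust it with your logs):

  #!/usr/bin/env python3
  # Sketch: read log lines on stdin, append them to a gzip archive,
  # and reopen the archive when we receive a SIGHUP.
  import gzip
  import signal
  import sys
  import time

  ARCHIVE_DIR = "/var/log/archive"    # hypothetical path

  rotate = False

  def on_hup(signum, frame):
      # Only set a flag; the actual reopen happens in the main
      # loop, outside of signal-handler context.
      global rotate
      rotate = True

  def open_archive():
      name = "%s/messages-%d.log.gz" % (ARCHIVE_DIR, int(time.time()))
      return gzip.open(name, "ab")

  signal.signal(signal.SIGHUP, on_hup)
  out = open_archive()
  for line in sys.stdin.buffer:       # raw bytes, line by line
      if rotate:
          out.close()
          out = open_archive()
          rotate = False
      out.write(line)
  out.close()

Wired into syslog-ng with a program() destination, along these lines
(s_local and the program path are placeholders):

  destination d_archive {
      program("/usr/local/bin/log-compress");
  };
  log { source(s_local); destination(d_archive); };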
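The network variant only changes the sending side to a plain tcp
destination; the receiving box then runs the same kind of compressing
destination locally. The flow-control flag is what can bog down the
sender when the archiver is slow:

  destination d_remote_archive {
      tcp("archive.example.com" port(5140));
  };
  log { source(s_local); destination(d_remote_archive);
        flags(flow-control); };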
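And the MongoDB setup is, in essence, just a mongodb() destination
next to the local file ones (exact option names differ a bit between
syslog-ng versions, so check the admin guide for yours):

  destination d_mongo {
      mongodb(
          servers("mongo1.example.com:27017")
          database("logs")
          collection("messages")
      );
  };
  log { source(s_local); destination(d_mongo); };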
-- 
|8]