Hi,

On Tue, Aug 19, 2014 at 08:33:26AM +0200, Gergely Nagy wrote:
> -> ElasticSearch -> Kibana for visualisation and shorter-term (a few months, maybe, depending on the amount of logs you have) storage. For archival purposes, I'd use text files with rotation and compression, alongside ES+Kibana.
The ELK developers claim that the storage overhead of ES over text files is a factor of 3. In my experience, if you don't add compression underneath (e.g. using ZFS), it's more like tenfold, but then again we do add some structure to the events.
> I found that text files are much more efficiently compressed than databases, so if your concern is size, then by all means, use files for archival. Nothing stops you from using a DB alongside it for other purposes.
I second that: use a Kibana-like interface with appropriate storage for search, and text for archiving. We used to have a very small footprint using text files on a compressed+deduplicated ZFS, but then again, grep sucks. The additional benefit of having the "raw" text files is to be able to do a rerun of your analysis/indexing that feeds your search backend.

Just to give you some figures, here's a table of a few of our Elasticsearch indices:

index              pri  rep  docs.count  store.size  pri.store.size
syslog-2014.08.06   24    1    65347459      53.2gb          26.6gb
syslog-2014.08.19   24    1    16801481      12.9gb           6.5gb
syslog-2014.08.05   24    1    63663738      49.8gb          24.9gb

As you can see, one event takes roughly 512 bytes of storage, doubled up because we have one replica.

Cheers
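
P.S. If you want to redo the per-event arithmetic for your own indices, here is a minimal sketch in Python. It only uses the numbers from the table above. Treating "gb" as 1024^3 bytes is an assumption on my part, and the helper name bytes_per_event is made up for this example. For these three indices it lands somewhat below 512 bytes per event on the primaries, so the same ballpark.

```python
# Rough per-event storage estimate from the index table above.
# Assumption: the "gb" unit in the table means GiB (1024**3 bytes).
GIB = 1024 ** 3

# (index name, docs.count, store.size in gb, pri.store.size in gb)
indices = [
    ("syslog-2014.08.06", 65347459, 53.2, 26.6),
    ("syslog-2014.08.19", 16801481, 12.9, 6.5),
    ("syslog-2014.08.05", 63663738, 49.8, 24.9),
]


def bytes_per_event(size_gb, docs):
    """Average on-disk bytes per document for an index of the given size."""
    return size_gb * GIB / docs


for name, docs, store_gb, pri_gb in indices:
    primary = bytes_per_event(pri_gb, docs)
    total = bytes_per_event(store_gb, docs)  # primaries plus one replica
    print(f"{name}: {primary:.0f} B/event primary, {total:.0f} B/event incl. replica")
```

The same calculation against the size of the matching raw text files would give you the ES-versus-text overhead factor for your own data.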