Hi,
Currently we're logging everything to text files for a few LAN clients. We're considering using a database instead and have a few questions to help us decide:
- Would a database be a good option to replace existing text files for long term storage considering storage space?
- Would mongo OR mysql be better suited for storing system logs?
I understand answers to these questions can vary depending on specific use case but seeking a general recommendation to see what's typically being used and what the most stable/supported options would be.
--
*Nullius In Verba*
On 18.08.2014 23:06, VMI X wrote:
Hi, Currently we're logging everything to text files for a few LAN clients. We're considering using a database instead and have a few questions to help us decide:
* Would a database be a good option to replace existing text files for long term storage considering storage space?
* Would mongo OR mysql be better suited for storing system logs?
I understand answers to these questions can vary depending on specific use case but seeking a general recommendation to see what's typically being used and what the most stable/supported options would be.
Why don't you test it yourself by logging to *both* text files (as you do now) and a database? That would get you some data to compare, and what's best about it - that would be your data, your use-cases, etc.
HTH, Jakub.
--
Jakub Jankowski|shasta@toxcorp.com|http://toxcorp.com/
GPG: FCBF F03D 9ADB B768 8B92 BB52 0341 9037 A875 942D
Thanks for replying.
Yes, of course we plan on testing it out for ourselves. I just wanted to put the questions out there to see what 'proven recipes' others might have discovered that work for them, so we can benefit from their experience and perhaps learn about something we aren't even considering yet.
On Mon, Aug 18, 2014 at 2:14 PM, Jakub Jankowski <shasta@toxcorp.com> wrote:
On 18.08.2014 23:06, VMI X wrote:
Hi, Currently we're logging everything to text files for a few LAN clients. We're considering using a database instead and have a few questions to help us decide:
* Would a database be a good option to replace existing text files for long term storage considering storage space?
* Would mongo OR mysql be better suited for storing system logs?
I understand answers to these questions can vary depending on specific use case but seeking a general recommendation to see what's typically being used and what the most stable/supported options would be.
Why don't you test it yourself by logging to *both* text files (as you do now) and a database? That would get you some data to compare, and what's best about it - that would be your data, your use-cases, etc.
HTH, Jakub.
-- Jakub Jankowski|shasta@toxcorp.com|http://toxcorp.com/ GPG: FCBF F03D 9ADB B768 8B92 BB52 0341 9037 A875 942D
______________________________________________________________________________ Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng FAQ: http://www.balabit.com/wiki/syslog-ng-faq
-- *Nullius In Verba*
Hmmm - I have done some testing but have not settled on the perfect solution (yet).
- I have multi-line logs being parsed by a program destination (a Perl parser) and then written to MongoDB
- I have started testing straight mongo, but getting the right data into the right fields seems important
- I have done very basic testing with syslog-ng --> redis, which I am planning to then feed --> elasticsearch (I may need to stick Logstash in between redis and elasticsearch)
In general, a relational (SQL) database wins when you can represent the relationships in the schema. "NoSQL" stores like mongo typically win with unstructured data, but at a space penalty (needing to store JSON format).
If I had to pick one *right now* I would probably use
syslog-ng --> redis --> logstash --> elasticsearch --> kibana
The (R)ELK stack has a lot of support and development, and is pretty close to a free Splunk. I can see a sharded/replicated mongodb having some basic advantages, but I have not (yet) found the perfect way to do ad-hoc queries against the store.
Good luck (and report back!)
Thanks,
Jim
On 08/18/2014 05:06 PM, VMI X wrote:
Hi, Currently we're logging everything to text files for a few LAN clients. We're considering using a database instead and have a few questions to help us decide:
* Would a database be a good option to replace existing text files for long term storage considering storage space?
* Would mongo OR mysql be better suited for storing system logs?
I understand answers to these questions can vary depending on specific use case but seeking a general recommendation to see what's typically being used and what the most stable/supported options would be.
-- /Nullius In Verba/
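To make the space penalty mentioned above a bit more concrete, here is a minimal sketch that compares one event stored as a self-describing JSON document with the same values stored as a plain row whose field names live once in a schema. The event and its field names are hypothetical, and MongoDB actually stores BSON rather than JSON text, so treat the numbers as an illustration of per-document field names, not as a measurement of any real store.

```python
import json

# Hypothetical parsed syslog event; the field names are illustrative only.
EVENT = {
    "timestamp": "2014-08-18T17:06:00",
    "host": "client01",
    "program": "sshd",
    "pid": 4242,
    "message": "Accepted publickey for admin from 192.0.2.10 port 51234",
}


def json_size(event):
    """Bytes needed when every record carries its own field names."""
    return len(json.dumps(event, separators=(",", ":")).encode("utf-8"))


def row_size(event):
    """Bytes needed when only the values are stored (tab-separated row)."""
    return len("\t".join(str(v) for v in event.values()).encode("utf-8"))


if __name__ == "__main__":
    j, r = json_size(EVENT), row_size(EVENT)
    print(f"JSON document: {j} bytes")
    print(f"Schema row:    {r} bytes")
    print(f"Overhead:      {j - r} bytes per event ({j / r:.2f}x)")
```

The shorter the messages and the more fields per event, the larger the relative overhead is likely to be, so it is worth running this kind of check against a sample of your own events.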
Jim Hendrick <jrhendri@roadrunner.com> writes:
- I have done very basic testing with syslog-ng --> redis which I am planning on then --> elasticsearch (I may need to stick Logstash in between redis and elasticsearch)
FWIW, with the syslog-ng Incubator[1], you have multiple options to log directly to ElasticSearch. We even have an elasticsearch() destination there. (At the moment, it uses a very dumb python program, but in the future, it will be vastly improved. The syntax will remain the same, though).
[1]: https://github.com/balabit/syslog-ng-incubator
--
|8]
VMI X <vmixus@gmail.com> writes:
Currently we're logging everything to text files for a few LAN clients. We're considering using a database instead and have a few questions to help us decide:
- Would a database be a good option to replace existing text files for long term storage considering storage space?
- Would mongo OR mysql be better suited for storing system logs?
I understand answers to these questions can vary depending on specific use case but seeking a general recommendation to see what's typically being used and what the most stable/supported options would be.
As I recommended on IRC, I would suggest using syslog-ng (+ incubator) -> ElasticSearch -> Kibana for visualisation and shorter-term (a few months, maybe, depending on the amount of logs you have) storage. For archival purposes, I'd use text files with rotation and compression, alongside ES+Kibana.
I found that text files are much more efficiently compressed than databases, so if your concern is size, then by all means, use files for archival. Nothing stops you from using a DB alongside it for other purposes.
Which DB? That depends on a lot of things, such as which DB your tools are prepared for. If you use Kibana, that's going to be ElasticSearch. MongoDB has a fair amount of good tools that can help you work with your log data, but then, so does SQL (and when it comes to SQL, I always recommend Postgres over MySQL).
Hope that helps!
--
|8]
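For anyone who wants to check the compression point on their own data, the following is a rough sketch of how the ratio could be measured with gzip. The log lines are synthetic (made-up hosts, addresses and messages), so the ratio it prints says nothing about real logs; `make_sample_log` and `compression_ratio` are names invented for this example.

```python
import gzip
import random


def make_sample_log(lines=5000, seed=0):
    """Build synthetic syslog-style text; hosts and messages are made up."""
    rng = random.Random(seed)
    hosts = ["client01", "client02", "client03"]
    templates = [
        "Accepted publickey for admin from 192.0.2.{ip} port {port}",
        "Connection closed by 192.0.2.{ip} port {port}",
        "session opened for user admin",
    ]
    out = []
    for i in range(lines):
        msg = rng.choice(templates).format(
            ip=rng.randint(1, 254), port=rng.randint(1024, 65535)
        )
        stamp = f"Aug 18 17:{i // 60 % 60:02d}:{i % 60:02d}"
        out.append(f"{stamp} {rng.choice(hosts)} sshd[{rng.randint(100, 9999)}]: {msg}")
    return ("\n".join(out) + "\n").encode("utf-8")


def compression_ratio(raw: bytes) -> float:
    """Return uncompressed size divided by gzip-compressed size."""
    return len(raw) / len(gzip.compress(raw, compresslevel=9))


if __name__ == "__main__":
    raw = make_sample_log()
    print(f"{len(raw)} bytes raw, ratio {compression_ratio(raw):.1f}x")
```

To try it on a real file, pass the file's bytes instead, for example `compression_ratio(open(path, "rb").read())`, and compare the result with the on-disk size of whatever database you are evaluating.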
Hi,
On Tue, Aug 19, 2014 at 08:33:26AM +0200, Gergely Nagy wrote:
-> ElasticSearch -> Kibana for visualisation and shorter-term (a few months, maybe, depending on the amount of logs you have) storage. For archival purposes, I'd use text files with rotation and compression, alongside ES+Kibana.
The ELK developers claim that the storage overhead of ES over text files is a factor of 3. In my experience, if you don't compress additionally (e.g. using ZFS) it's more like tenfold, but then again we do add some structure to the events.
I found that text files are much more efficiently compressed than databases, so if your concern is size, then by all means, use files for archival. Nothing stops you from using a DB alongside it for other purposes.
I second that: use a Kibana-like interface with appropriate storage for search, and text for archiving. We used to have a very small footprint using text files on a compressed+deduplicated ZFS, but then again, grep sucks. The additional benefit of having the "raw" text files is to be able to do a rerun of your analysis/indexing that feeds your search backend.
Just to give you some figures, here's a table of a few of our Elasticsearch indices:
index              pri rep docs.count store.size pri.store.size
syslog-2014.08.06   24   1   65347459     53.2gb         26.6gb
syslog-2014.08.19   24   1   16801481     12.9gb          6.5gb
syslog-2014.08.05   24   1   63663738     49.8gb         24.9gb
As you can see, one event takes roughly 512 bytes of storage, doubled up because we have one replica.
Cheers
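The per-event figure can be checked directly from the index numbers above. This is a small sketch, assuming that "gb" means GiB (1024^3 bytes) and that docs.count counts each event once regardless of replicas. Under those assumptions the primary storage works out to a little over 400 bytes per event, in the same ballpark as the rough 512-byte estimate, and about twice that once the replica is included.

```python
GIB = 1024 ** 3  # assumption: "gb" in the index listing means GiB

# (index, docs.count, store.size in gb, pri.store.size in gb),
# copied from the figures quoted in the message.
INDICES = [
    ("syslog-2014.08.06", 65347459, 53.2, 26.6),
    ("syslog-2014.08.19", 16801481, 12.9, 6.5),
    ("syslog-2014.08.05", 63663738, 49.8, 24.9),
]


def bytes_per_event(size_gb, docs):
    """Average on-disk bytes per indexed event."""
    return size_gb * GIB / docs


if __name__ == "__main__":
    for name, docs, store, pri in INDICES:
        print(
            f"{name}: {bytes_per_event(pri, docs):.0f} B/event primary, "
            f"{bytes_per_event(store, docs):.0f} B/event with replica"
        )
```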
participants (5)
- Fabien Wernli
- Gergely Nagy
- Jakub Jankowski
- Jim Hendrick
- VMI X