[syslog-ng] filtering vs. keeping all logs

Czanik, Péter peter.czanik at balabit.com
Fri Apr 29 10:33:29 CEST 2016


Hi,

First of all: thank you for your feedback.

This is very interesting, as it is pretty much the opposite of what I hear /
read in most discussions. I am often asked how to throw away cron / dhcp /
dns / kernel / debug / etc. messages to save bandwidth / disk space and
sometimes even to narrow down what is saved from authentication logs (which
sounds crazy to my security-minded ears...).

I wonder what the reason for this contradiction is. Is it the size of the
organization? (Assumption: a larger org has more resources to save
everything.) Or is it compliance? (PCI, etc.) Or both?

Bye,

Peter Czanik (CzP) <peter.czanik at balabit.com>
Balabit / syslog-ng upstream
http://czanik.blogs.balabit.com/
https://twitter.com/PCzanik

On Thu, Apr 28, 2016 at 6:59 PM, Evan Rempel <erempel at uvic.ca> wrote:

> Logs are used for so many things. Auditing, security, post incident
> analysis, live alerting (SIEM) and others. It is for this reason that I
> believe that all raw log data should be saved.
>
> Adding to the discussion about metadata...
>
> We add metadata from a variety of sources.
>
> 1. The syslog line itself. We parse EVERY log message to identify specific
> data and context. For example, a login identifier is often used in an email
> address, but in the context of an e-mail address, it is NOT a login
> identifier. This enables data mining on login identifiers without having to
> further filter out e-mail messages. We populate hundreds of metadata
> elements this way: tape volumes, database instances, login, uid, gid, disk
> drive names, logical volume names, FRU components in hardware monitoring.
> The list is huge.
>
> 2. Incident details. During the parsing of EVERY log message, specific
> messages are identified as messages that should be alerted on. Metadata is
> added that contains incident description, URL to resolution documentation,
> severity of the incident and details on minimizing false positives. For
> example, a repeating log message may only be an incident if it repeats at a
> defined rate over a defined duration. All of this data is used to produce
> alerts to SMS, email, ticketing system.
>
> 3. Inventory management system. We add metadata for tiers of service. We
> have test, dev, preprod and prod. We also add business application names
> such as database instance (SID), Facilities management, workflow,
> MSExchange, listserver etc.
>
> 4. Business responsibility matrix. For each host/application there is a
> group that is responsible for the service. This metadata is added so that
> when alerts need to be sent the alerting subsystem can determine where to
> send the alert. It does this based on this responsibility matrix and data
> from #2.
>
>
> All of this metadata gets placed into elasticsearch so we can start to
> mine the data by asking questions like:
>
> - show all of the activity by user XXX in service Y in the preproduction
> tier on linux hosts.
> - show all of the incidents for host HHH that group GGG is responsible for
> fixing.
> - which service is responsible for the large increase in error class
> syslog lines, and in which tier of service did they occur.
>
> The metadata is the power that drives this, and without real-time,
> high-performance pattern matching it just can't be done.
>
> Evan.
>
>
>
> On 04/28/2016 06:23 AM, Scot Needy wrote:
>
> We save all log data and compress/dedup hourly.  For an enterprise of
> about 5000 servers this averages about 200GB.
> Some PCI compartments are special and have backup and retention policies
> for compliance.
>
> Archiving raw log data also gives us data to re-parse should the patterns
> need to be updated.
>
>
>
> On Apr 28, 2016, at 7:23 AM, Czanik, Péter <peter.czanik at balabit.com> wrote:
>
> Hi,
>
> I was asking because up until now I recall only a single syslog-ng user who
> told me that he saves all log messages. On the other hand, I keep receiving
> (marketing) e-mails saying that no logs should be discarded, everything should
> be saved. And sometimes I receive the same feedback from the Big Data world:
> we have enough disk space, so why do any filtering? So I'd be interested to
> learn from real-world experience whether filtering is really old-fashioned, or
> whether there are situations (compliance requirement, endless storage, etc.)
> where you really save all log messages.
>
> Bye,
>
> Peter Czanik (CzP) <peter.czanik at balabit.com>
> Balabit / syslog-ng upstream
> http://czanik.blogs.balabit.com/
> https://twitter.com/PCzanik
>
> On Thu, Apr 28, 2016 at 11:11 AM, Fabien Wernli <wernli at in2p3.fr> wrote:
>
>> On Thu, Apr 28, 2016 at 11:06:07AM +0200, Czanik, Péter wrote:
>> > One of the major strengths of syslog-ng is message filtering, which
>> > facilitates message routing and discarding useless log messages. OTOH I
>> > often read that we now have all the technologies and storage to keep
>> > all logs. What do you think?
>>
>> I would go further: we now have the means to add relevant metadata to all
>> the events, which in turn allows us to do targeted archiving.
>>
>>
>>
>> ______________________________________________________________________________
>> Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
>> Documentation:
>> http://www.balabit.com/support/documentation/?product=syslog-ng
>> FAQ: http://www.balabit.com/wiki/syslog-ng-faq
>>
>>
>>
>
>
> --
> Evan Rempel                                      erempel at uvic.ca
> Senior Systems Administrator                        250.721.7691
> Data Centre Services, University Systems, University of Victoria
>