[syslog-ng] Syslog-ng as basic realtime reliable logfile replicator. Possible?

Declan White declanw at is.bbc.co.uk
Wed Jan 31 17:43:21 UTC 2018


On Tue, Jan 30, 2018 at 09:58:55AM +0100, Fekete, Róbert wrote:
> Hi,
> 
> Just a few quick notes:
>  * recent versions of syslog-ng OSE also support disk-based buffering to
> avoid network and server outages.

Yup. Already using. It also deals with the case where logfile rotations still
need to happen even if you can't send data home for two days. I suspect you'd
miss a whole day from the stream otherwise.

> syslog-ng also has a commercial Premium Edition that can acknowledge the
> receiving of messages, and resend messages that got lost.

Aye, I'm aware of it, but won't get access to it. I'll have to solve that bit later.
Is there a list of output modules that do ACKs?

>  * syslog-ng can handle very long messages, so long lines shouldn't be a
> problem (adjust log-message-size() if needed). If a message is longer than
> that, then it will be truncated.

Yeah, I just can't trust every log file I'll meet, and I don't want to
have to. A couple of tags like "truncated" and "continuation" and a way of
encoding potentially binary data (e.g. mmencode, or urlencode) would be handy
there. I'm converting it all into a JSON structure with metadata before sending anyway.
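
As a minimal sketch of that kind of envelope (Python, not anything
syslog-ng does itself): the field names "continuation", "truncated",
"encoding" and "data", the helper names, and the 8192-byte chunk limit are
all assumptions made up for illustration. Here "truncated" is taken to mean
"this piece was cut short and more of the same line follows".

```python
import base64
import json

MAX_CHUNK = 8192  # assumed per-message payload limit, in bytes


def encode_line(raw: bytes, max_chunk: int = MAX_CHUNK):
    """Split one raw log line into JSON records that are safe to relay.

    Each record carries the payload as base64, so NULs and other binary
    bytes survive, plus 'truncated' / 'continuation' flags (hypothetical
    field names) so the receiver can reassemble the original line.
    """
    chunks = [raw[i:i + max_chunk] for i in range(0, len(raw), max_chunk)] or [b""]
    records = []
    for index, chunk in enumerate(chunks):
        records.append(json.dumps({
            "continuation": index > 0,             # not the start of the line
            "truncated": index < len(chunks) - 1,  # more of this line follows
            "encoding": "base64",
            "data": base64.b64encode(chunk).decode("ascii"),
        }))
    return records


def decode_records(records):
    """Reassemble the original bytes from the records of a single line."""
    return b"".join(base64.b64decode(json.loads(r)["data"]) for r in records)
```

Base64 costs about a third more bytes on the wire, but it means the
receiver never has to guess at character sets or line-length limits.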

>  * AFAIK you cannot find inodes, but you can transfer the FILE_NAME macro
> that includes the path+filename

That's not enough to uniquely identify the original position of a jigsaw piece.
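
What I'd want attached to each piece looks roughly like the sketch below
(Python, outside syslog-ng): device and inode number, the byte offset, and
the length. The function names and dictionary keys are hypothetical. The
second function shows how a receiver could use those offsets to spot missed
or duplicated ranges, assuming all records passed in belong to one
(dev, inode) pair.

```python
import os


def read_with_positions(path, start_offset=0):
    """Yield (metadata, line) pairs for each complete line in a file.

    The metadata identifies where the line came from: device and inode
    number of the open file, byte offset of the line's first byte, and
    its length in bytes.
    """
    with open(path, "rb") as handle:
        info = os.fstat(handle.fileno())  # stat the file we actually opened
        handle.seek(start_offset)
        offset = start_offset
        for line in handle:
            if not line.endswith(b"\n"):
                break  # partial last line: leave it for the next pass
            yield {
                "path": path,
                "dev": info.st_dev,
                "inode": info.st_ino,
                "offset": offset,
                "length": len(line),
            }, line
            offset += len(line)


def find_gaps_and_overlaps(records):
    """Receiver side: report missing and duplicated byte ranges.

    'records' are metadata dicts for a single (dev, inode) pair.
    Returns (gaps, overlaps) as lists of (start, end) byte ranges.
    """
    gaps, overlaps = [], []
    expected = 0
    for rec in sorted(records, key=lambda r: r["offset"]):
        start, end = rec["offset"], rec["offset"] + rec["length"]
        if start > expected:
            gaps.append((expected, start))
        elif start < expected:
            overlaps.append((start, min(end, expected)))
        expected = max(expected, end)
    return gaps, overlaps
```

Inode numbers can be reused after a file is deleted, so this only narrows
things down; a timestamp alongside would help tell two generations apart.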

>  * AFAIK syslog-ng holds the source files until it is reading from them, so
> file truncation should not be a problem, but I'm not entirely sure about
> that

I kinda want to keep an eye on the case where someone tries to delete their tracks from the logs. 
I'd like to be able to spot a log shrinking/being replaced out-of-schedule. Just a little extra metadata for the paranoid.
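
A rough sketch of the check I have in mind (Python, polling with stat; the
function names and the labels "replaced", "shrunk", "grew" and "unchanged"
are made up for illustration, and nothing here is a syslog-ng feature):

```python
import os


def snapshot(path):
    """Record identity and size of the file currently at 'path'."""
    info = os.stat(path)
    return {"dev": info.st_dev, "inode": info.st_ino, "size": info.st_size}


def classify_change(previous, current):
    """Compare two snapshots of the same path and name what happened.

    Returns 'replaced' if the path now points at a different file,
    'shrunk' if the same file got smaller, 'grew' or 'unchanged' otherwise.
    """
    if (previous["dev"], previous["inode"]) != (current["dev"], current["inode"]):
        return "replaced"
    if current["size"] < previous["size"]:
        return "shrunk"
    if current["size"] > previous["size"]:
        return "grew"
    return "unchanged"
```

A normal rotation also shows up as "replaced", so whatever consumes this
would still have to compare it against the expected rotation schedule. A
file that is truncated and then regrows past its old size between two polls
would slip through, so this is a hint rather than proof.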

> Regards,
> 
> Robert
> 
> On Tue, Jan 30, 2018 at 2:26 AM, Scot <scotrn at gmail.com> wrote:
> > - by default, there is nothing you can make with syslog-ng alone that will
> > not lose data during a network or endpoint outage.
> 
> I use rsyslog on clients and relays with TCP disk buffering including
> relays.  Properly measured you should know when you are buffering.

I watched rsyslog loop on the CPU trying to throw data out a failed RELP network connection. 
I don't think it's safe to use outside of Linux.

> > - transporting metadata can tell you which file the data is from, but not
> > where in the file it's from, so you can't really tell if you have duplicate
> > data, or missed data. (The inode number might be handy too)
> > - behaviour around input file truncation is fuzzy. That a truncation has
> > occurred might be useful metadata to send (if you're looking for people
> > fiddling logs).
>
> Any mature log reader should handle those use cases. If you have no
> control over the rotation, is it possible to load the data after rotation?
> Logrotate has pre- and post-rotation functions.

I'm using syslog-ng as the mature log reader.
I have rsync for yesterday's logs. I want today's.

> > - It doesn't seem to be able to encode binary/NULs in the logs, so it
> > cannot relay data from 'untrusted' application logs?
> > - Not sure what it does with very long lines. Loses data?
> >   Have not seen those cases.

Not all applications expect their logfiles to be sent via syslog, with character-range and line length limits.
That doesn't stop people needing those logfiles centralised.

> > Hope it helps a little.
> > Scot
> >
> > On Mon, Jan 29, 2018 at 9:34 AM, Declan White <declanw at is.bbc.co.uk>
> > wrote:
> >
> >> Hullo.
> >>
> >> I'm trying to fit syslog-ng around a basic problem and looking for tips.
> >>
> >> I have log files growing on one machine that I want to follow and
> >> reliably replicate to a central machine, so it's effectively a basic 'tail
> >> -f' job.
> >> It seems simple, but as I try and close out the possible error conditions
> >> it's getting hairier and hairier.
> >>
> >> e.g.
> >> - by default, there is nothing you can make with syslog-ng alone that
> >> will not lose data during a network or endpoint outage.
> >> - transporting metadata can tell you which file the data is from, but not
> >> where in the file it's from, so you can't really tell if you have duplicate
> >> data, or missed data. (The inode number might be handy too)
> >> - behaviour around input file truncation is fuzzy. That a truncation has
> >> occurred might be useful metadata to send (if you're looking for people
> >> fiddling logs).
> >> - It doesn't seem to be able to encode binary/NULs in the logs, so it
> >> cannot relay data from 'untrusted' application logs?
> >> - Not sure what it does with very long lines. Loses data?
> >>
> >> I'm not necessarily looking to get syslog-ng to recreate the file
> >> exactly, just to send enough information to allow something else to work
> >> out the full order of events.
> >> Googling around to see how others solve this problem, I see people doing
> >> infinite rsync loops, or installing large Java beasties, or paying someone
> >> else to make it all go away.
> >>
> >> I tried using rsyslog, but it melted down into a screaming puddle of
> >> nondeterministic threading.
> >>
> >> Is what I'm attempting really as hard as it seems?
> >>
> >> - D

-- 
Declan White

