Syslog-ng as basic realtime reliable logfile replicator. Possible?
Hullo.
I'm trying to fit syslog-ng around a basic problem and looking for tips.
I have log files growing on one machine that I want to follow and reliably replicate to a central machine, so it's effectively a basic 'tail -f' job. It seems simple, but as I try and close out the possible error conditions it's getting hairier and hairier, e.g.:
- by default, there is nothing you can make with syslog-ng alone that will not lose data during a network or endpoint outage.
- transporting metadata can tell you which file the data is from, but not where in the file it's from, so you can't really tell if you have duplicate data, or missed data. (The inode number might be handy too)
- behaviour around input file truncation is fuzzy. That a truncation has occurred might be useful metadata to send (if you're looking for people fiddling logs).
- It doesn't seem to be able to encode binary/NULs in the logs, so it cannot relay data from 'untrusted' application logs?
- Not sure what it does with very long lines. Loses data?
I'm not necessarily looking to get syslog-ng to recreate the file exactly, just to send enough information to allow something else to work out the full order of events. Googling around to see how others solve this problem, I see people doing infinite rsync loops, or installing large Java beasties, or paying someone else to make it all go away.
I tried using rsyslog, but it melted down into a screaming puddle of nondeterministic threading.
Is what I'm attempting really as hard as it seems?
- D
On Mon, Jan 29, 2018 at 9:34 AM, Declan White <declanw@is.bbc.co.uk> wrote:
- by default, there is nothing you can make with syslog-ng alone that will not lose data during a network or endpoint outage.
I use rsyslog on clients and relays with TCP and disk buffering, including on the relays. With proper monitoring you should know when you are buffering.
- transporting metadata can tell you which file the data is from, but not where in the file it's from, so you can't really tell if you have duplicate data, or missed data. (The inode number might be handy too)
- behaviour around input file truncation is fuzzy. That a truncation has occurred might be useful metadata to send (if you're looking for people fiddling logs).
Any mature log reader should handle those use cases. If you have no control over the rotation, is it possible to load the data after rotation? logrotate has pre- and post-rotation functions.
- It doesn't seem to be able to encode binary/NULs in the logs, so it cannot relay data from 'untrusted' application logs?
- Not sure what it does with very long lines. Loses data?
Have not seen those cases.
Hope it helps a little.
Scot
Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng
FAQ: http://www.balabit.com/wiki/syslog-ng-faq
Hi,
Just a few quick notes:
* Recent versions of syslog-ng OSE also support disk-based buffering, which helps ride out network and server outages. syslog-ng also has a commercial Premium Edition that can acknowledge the receipt of messages and resend messages that got lost.
* syslog-ng can handle very long messages, so long lines shouldn't be a problem (adjust log-msg-size() if needed). A message longer than that limit will be truncated.
* AFAIK you cannot get inodes, but you can transfer the FILE_NAME macro, which includes the path and filename.
* AFAIK syslog-ng keeps the source files open while it is reading from them, so file truncation should not be a problem, but I'm not entirely sure about that.
Regards,
Robert
On Tue, Jan 30, 2018 at 09:58:55AM +0100, Fekete, Róbert wrote:
Hi,
Just a few quick notes:
* Recent versions of syslog-ng OSE also support disk-based buffering, which helps ride out network and server outages.
Yup. Already using. It also deals with the case where logfile rotations still need to happen even if you can't send data home for two days. I suspect you'd miss a whole day from the stream otherwise.
syslog-ng also has a commercial Premium Edition that can acknowledge the receiving of messages, and resend messages that got lost.
Aye, I'm aware of it, but won't get access to it. I'll have to solve that bit later. Is there a list of output modules that do ACKs?
* syslog-ng can handle very long messages, so long lines shouldn't be a problem (adjust log-msg-size() if needed). A message longer than that limit will be truncated.
Yeah, I just can't trust every log file I'll meet, and I don't want to have to. A couple of tags like "truncated" and "continuation" and a way of encoding potential binary (e.g. mmencode, or urlencode) would be handy there. I'm converting it all into a JSON structure with metadata before sending anyway.
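As an illustration of the "truncated"/"continuation" idea, here is a minimal Python sketch, not anything syslog-ng provides. It splits an over-long raw line into chunks and base64-encodes each one so that NULs and other binary bytes survive a JSON transport. The field names (`data_b64`, `continuation`, `truncated`) and the `MAX_CHUNK` limit are assumptions made up for the example.

```python
import base64
import json

MAX_CHUNK = 8192  # hypothetical per-record payload limit, in bytes


def encode_line(raw, max_chunk=MAX_CHUNK):
    """Split one raw log line (bytes) into a list of JSON record strings."""
    # An empty line still produces one (empty) record.
    chunks = [raw[i:i + max_chunk] for i in range(0, len(raw), max_chunk)] or [b""]
    records = []
    for index, chunk in enumerate(chunks):
        records.append(json.dumps({
            # base64 keeps NULs and non-UTF-8 bytes intact inside JSON
            "data_b64": base64.b64encode(chunk).decode("ascii"),
            # True when this record continues the previous one
            "continuation": index > 0,
            # True when this record was cut short and more follows
            "truncated": index < len(chunks) - 1,
        }))
    return records


def decode_records(records):
    """Reassemble the original bytes from records produced by encode_line."""
    return b"".join(base64.b64decode(json.loads(r)["data_b64"]) for r in records)
```

The receiver can rebuild the original line by concatenating payloads until it sees a record with `truncated` set to false. Base64 is used here in place of mmencode or urlencode only because it is in the Python standard library.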
* AFAIK you cannot find inodes, but you can transfer the FILE_NAME macro that includes the path+filename
That's not enough to uniquely identify the original position of a jigsaw piece.
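To show why a byte position helps, here is a rough receiver-side sketch in Python. It assumes each record carries a path, an inode, a byte offset and a length (none of which syslog-ng is confirmed in this thread to send), that each stream starts at offset 0, and that records for one file arrive in order. The `OffsetTracker` class is hypothetical.

```python
from collections import defaultdict


class OffsetTracker:
    """Track the next expected byte offset for each (path, inode) stream."""

    def __init__(self):
        # Unknown streams are assumed to start at offset 0.
        self.next_offset = defaultdict(int)

    def check(self, path, inode, offset, length):
        """Classify one record as 'ok', 'duplicate', 'gap' or 'overlap'."""
        key = (path, inode)
        expected = self.next_offset[key]
        if offset + length <= expected:
            return "duplicate"  # every byte of this record was already seen
        if offset > expected:
            status = "gap"      # bytes between expected and offset are missing
        elif offset < expected:
            status = "overlap"  # record partly repeats bytes already seen
        else:
            status = "ok"
        self.next_offset[key] = offset + length
        return status
```

Keying on the inode as well as the path means a rotated file starts a fresh stream instead of being mistaken for a rewind of the old one.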
* AFAIK syslog-ng keeps the source files open while it is reading from them, so file truncation should not be a problem, but I'm not entirely sure about that
I kinda want to keep an eye on the case where someone tries to delete their tracks from the logs. I'd like to be able to spot a log shrinking/being replaced out-of-schedule. Just a little extra metadata for the paranoid.
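One way to spot a shrinking or replaced log, sketched in Python under the assumption of a POSIX filesystem and a simple polling loop: compare the device, inode and size from successive `os.stat` calls. The helper names `file_state` and `compare` are made up for this example, and deciding whether a change is "out-of-schedule" is left to whatever consumes the result.

```python
import os


def file_state(path):
    """Return (device, inode, size) for the file currently at path."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino, st.st_size)


def compare(previous, current):
    """Classify the change between two (device, inode, size) snapshots."""
    if previous[:2] != current[:2]:
        return "replaced"   # a different file now lives at this path
    if current[2] < previous[2]:
        return "truncated"  # same file, but it got shorter
    if current[2] > previous[2]:
        return "grew"
    return "unchanged"
```

Polling has a blind spot: a file that is truncated and then grows past its old size between two polls looks like ordinary growth, so this is a hint for the paranoid rather than proof.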
Regards,
Robert
On Tue, Jan 30, 2018 at 2:26 AM, Scot <scotrn@gmail.com> wrote:
- by default, there is nothing you can make with syslog-ng alone that will not lose data during a network or endpoint outage.
I use rsyslog on clients and relays with TCP disk buffering including relays. Properly measured you should know when you are buffering.
I watched rsyslog loop on the CPU trying to throw data out a failed RELP network connection. I don't think it's safe to use outside of Linux.
- transporting metadata can tell you which file the data is from, but not where in the file it's from, so you can't really tell if you have duplicate data, or missed data. (The inode number might be handy too)
- behaviour around input file truncation is fuzzy. That a truncation has occurred might be useful metadata to send (if you're looking for people fiddling logs).
Any mature log reader should handle those use cases. If you have no control over the rotation, is it possible to load the data after rotation? logrotate has pre- and post-rotation functions.
I'm using syslog-ng as the mature log reader. I have rsync for yesterday's logs. I want today's.
- It doesn't seem to be able to encode binary/NULs in the logs, so it cannot relay data from 'untrusted' application logs?
- Not sure what it does with very long lines. Loses data?
Have not seen those cases.
Not all applications expect their logfiles to be sent via syslog, with character-range and line length limits. That doesn't stop people needing those logfiles centralised.
Hope it helps a little.
Scot
-- Declan White
Hmmm - a few questions:
- Where are the files coming from that you want to follow? Is there another syslog daemon running? Or are you capturing application logs that cannot use kernel logging?
Syslog-ng can easily use files as sources if that is all you have.
If you can use syslog-ng as the system log daemon, you can easily write local files *and* forward to a central logger.
It's pretty easy to have the central syslog server that receives the logs separate them by sending server. I have used HOST_FROM pretty often, since it doesn't need name resolution (better for performance) and it deals with non-RFC logs fairly well.
As for metadata, I typically put some of it in the filename: date, host_from, facility, severity, etc.
Things like file parsing can usually be dealt with using an appropriate mix of flags and parsing/rewrite rules (if necessary).
Does this help?
Jim
Hi,
"Declan White" <declanw@is.bbc.co.uk> wrote on 2018-01-29 at 14:34:
I'm trying to fit syslog-ng around a basic problem and looking for tips.
I don't fully understand every point of what you want, or how you want it. But if you use disk buffering and the RFC 5424 syslog protocol, and enrich your log messages with some useful metadata, I think your problem can be solved with syslog-ng.
E.g. if you want to replicate a file source, you can add the file name and position to every message in the key-value pair section, while the message itself remains intact.
If you have multi-line messages (like a Java stack trace), that's also doable. And with RFC 5424 the message size is also not a problem. You just need enough RAM for the buffers, and maybe some fine-tuning if the EPS (events per second) rate is high enough.
Cheers,
Gyu
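The idea of attaching a file name and position to every message can be sketched outside syslog-ng as follows. This is a hedged Python illustration, not a description of how syslog-ng does it. The `read_new_lines` helper and the record field names are hypothetical, and the caller is assumed to persist the last offset between polls.

```python
import os


def read_new_lines(path, start_offset=0):
    """Yield one record per complete line found at or after start_offset.

    The next poll should start at record["offset"] + record["length"]
    of the last record yielded.
    """
    with open(path, "rb") as f:
        inode = os.fstat(f.fileno()).st_ino
        f.seek(start_offset)
        offset = start_offset
        for raw in f:
            if not raw.endswith(b"\n"):
                break  # partial line: the writer has not finished it yet
            yield {
                "file": path,
                "inode": inode,
                "offset": offset,    # byte position where this line starts
                "length": len(raw),  # includes the trailing newline
                "message": raw.rstrip(b"\n"),  # payload left untouched
            }
            offset += len(raw)
```

Reading in binary mode keeps the offsets exact in bytes and leaves the message itself intact, whatever its encoding.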
participants (5)
- Declan White
- Fekete, Róbert
- Jim Hendrick
- PÁSZTOR György
- Scot