Reliable syslog and network outages
I'm trying to work out whether syslog-ng can deliver syslog messages reliably in the face of client reboots during network partitions.
From reading the documentation, it is not clear where log_fifo stores its data (disk or memory), and therefore whether log entries that queue up on a remote syslog client during a network partition separating that client from the loghost will remain queued if the client host reboots. If my approach is unworkable, pointers or aid on how to solve the general problem another way would be greatly appreciated.
Here are the details:
1. Several administered machines are located at diverse locations, generally where they are used primarily/entirely by local users.
2. Local users can ordinarily continue to do useful work even if they lose external network connectivity for a while; consequently HA network connectivity is not a user requirement.
3. The machines do not generally require direct intervention very often, so from an administration standpoint, 100% reachability is also not a requirement.
4. As a result of points 2 and 3, highly reliable connectivity is not available; outages of minutes or hours really do occur.
5. Pre-syslog-ng implementations of syslog using UDP will simply lose all log entries during the time that a network is partitioned. This is definitely not OK.
6. Syslog-ng appears to offer TCP-based delivery, which certainly solves one issue: ordinary network packet loss is no longer a problem.
7. Syslog-ng appears to maintain an internal FIFO which, if sized large enough on a syslog relay host at the remote location(s), could collect all log information generated during a partition and then successfully deliver it when connectivity is restored.
8. If, however, syslog-ng's internal FIFO is memory-based, then if the remote location's syslog relay host is rebooted during the network partition, all that it has accumulated and not yet transferred will be lost. (Similarly if the death of the TCP connection is treated as a reason to give up, rather than to retry continually.)
9. What information I have found on dealing with this appears to deal with loghost unreliability rather than network unreliability, and thus recommends providing HA through additional, redundant loghosts. Setting aside the problems in reconciliation of logs that this implies, it entirely takes away the reason for having syslog deliver via IP directly (rather than, say, email).
Does syslog-ng provide a means to deal with this problem? Does some other solution exist?
Thanks in advance.
- Raz
Syslog-ng keeps undeliverable messages in an internal queue, in memory. (Naturally, when you reboot your syslog-ng machine you lose the whole queue.) As long as there is space in the queue, messages aren't lost and are sent as soon as the receiver comes back. When the queue is full, new messages are lost.
I've done extensive testing with syslog-ng, a parsing engine of my own and MySQL. I can assure you that syslog-ng handles loads of thousands of messages per second (I tried up to 15000), and the queue works really well when you have program destinations that are slower than a burst of messages. Also, if you have several destinations, syslog-ng doesn't wait for all of them to become available before emptying its queue; it keeps in the queue only the messages for the unreachable destinations.
That works fine when you have an external parser that stores messages in a database. When the connection to the database is broken, the parser blocks on the connection and stops reading its stdin. Syslog-ng then queues the messages and feeds them to the parser when the database connection is available again.
Amodiovalerio Verde
Security Project Manager
----- Original Message -----
From: "Roland Turner" <raz.flfybt.at@countersnipe.com>
To: <syslog-ng@lists.balabit.hu>
Sent: Friday, August 15, 2003 6:27 PM
Subject: [syslog-ng]Reliable syslog and network outages
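The queue behaviour described in this reply (bounded, held in memory, new messages lost once it is full, backlog delivered when the receiver returns) can be modelled in a few lines of Python. This is only an illustrative sketch and not syslog-ng's code; the class name `BoundedFifo`, its methods and the `send` callback are all hypothetical.

```python
from collections import deque


class BoundedFifo:
    """Illustrative in-memory output queue; NOT syslog-ng's actual code."""

    def __init__(self, max_messages):
        self.max_messages = max_messages
        self.pending = deque()   # lives in RAM only, so a reboot empties it
        self.dropped = 0

    def enqueue(self, message):
        """Queue a message; once the queue is full the NEW message is lost."""
        if len(self.pending) >= self.max_messages:
            self.dropped += 1
            return False
        self.pending.append(message)
        return True

    def flush(self, send):
        """Deliver queued messages in order, stopping at the first failure.

        `send` is an assumed callback that returns True on success.
        """
        delivered = 0
        while self.pending:
            if not send(self.pending[0]):
                break            # receiver still down: keep the backlog
            self.pending.popleft()
            delivered += 1
        return delivered
```

Because `pending` exists only in process memory, nothing in this model survives a restart of the relay, which is exactly the gap the original question is about.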
Amodiovalerio Verde wrote:
Syslog-ng keeps undeliverable messages in an internal queue, in memory.
(Naturally, when you reboot your syslog-ng machine you lose the whole queue.)
This is pretty much what I feared. <snip>
I can assure you that syslog-ng handles loads of thousands of messages per second (I tried up to 15000), and the queue works really well when you have program destinations that are slower than a burst of messages.
Ah, it sounds as though you've implemented the FIFO/buffer as a performance feature rather than as a high-availability one.
Is there any straightforward way to build a relay out of syslog-ng that offers reliable[1] forwarding of syslog information?
If not, would syslog-ng be interested in accepting a patch which added an option to make the FIFO buffer persistent?
- Raz
1: Clearly, perfect reliability is impossible; if a machine containing queued logs is physically destroyed before it gets the opportunity to deliver its logs, then they really will be lost. I am interested in reliability with respect to a temporary loss of connectivity between syslog relay and syslog collector, and a reboot of the relay during that loss of connectivity. The degree of reliability that I have in mind is comparable to that which I would expect of a mail relay; I certainly wouldn't want it throwing data away because of a reboot, but I'd accept data loss caused by physical destruction of the machine.
Well, you could disable the syslog-ng queue and use a program destination, so that syslog-ng gets a message and sends it immediately to the destination. With the syslog-ng queue disabled, it is up to you to make sure you catch and handle that message.
Your program tries to send it to the remote machine; if it cannot, it appends it to a file. As soon as the host comes back, your program (without blocking on stdin, or you will lose the new messages syslog-ng is feeding it) reads that file and sends the messages to the loghost. Naturally, after a reboot your program will find a non-empty file, so if the host is up it will send all the messages.
This might not work if you have to handle big loads of messages or peaks: your program could simply be slower than syslog-ng and lose some messages while doing something else.
I have never given this problem much thought, because I use only UDP logging, and the client is well aware that messages can be lost. But I will probably have to change policy, so I will need TCP, reliability, compression and a few other things. If you're interested I'll let you know.
Amodiovalerio Verde
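The spool-to-file program destination suggested in this reply could look roughly like the Python sketch below. It is a sketch under stated assumptions, not a tested relay: the function names, the spool path argument and the `send` callback (assumed to return True only when the loghost accepted the message) are all hypothetical, and messages are assumed to be single lines.

```python
import os


def append_to_spool(spool_path, message):
    """Append one message and force it to disk so it survives a reboot."""
    with open(spool_path, "a", encoding="utf-8") as spool:
        spool.write(message.rstrip("\n") + "\n")
        spool.flush()
        os.fsync(spool.fileno())


def drain_spool(spool_path, send):
    """Resend spooled messages in order; return True if the spool is empty."""
    if not os.path.exists(spool_path):
        return True
    with open(spool_path, encoding="utf-8") as spool:
        lines = spool.read().splitlines()
    sent = 0
    for line in lines:
        if not send(line):
            break
        sent += 1
    remaining = lines[sent:]
    if remaining:
        # Write the leftover backlog to a temporary file and rename it,
        # so a crash never leaves a half-written spool behind.
        tmp_path = spool_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as tmp:
            tmp.write("\n".join(remaining) + "\n")
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, spool_path)
        return False
    os.remove(spool_path)
    return True


def forward(message, send, spool_path):
    """Deliver one message, spooling it if the loghost is unreachable."""
    # Older spooled messages go first so that ordering is preserved.
    if drain_spool(spool_path, send) and send(message):
        return True
    append_to_spool(spool_path, message)
    return False
```

A driver would call `forward()` once per line read from stdin. As the reply warns, `send` needs a short timeout so that the program keeps reading stdin while the loghost is down. A crash between a successful send and the spool rewrite would resend some messages after restart, so this gives at-least-once rather than exactly-once delivery.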
On Mon, Aug 18, 2003 at 11:58:55AM +0100, Roland Turner wrote:
Is there any straightforward way to build a relay out of syslog-ng that offers reliable[1] forwarding of syslog information?
If not, would syslog-ng be interested in accepting a patch which added an option to make the FIFO buffer persistent?
We would definitely be interested in such a thing, but please base your work on the syslog-ng 2 tree, as I want the next stable release to be based on that one.
--
Bazsi
PGP info: KeyID 9AF8D0A9 Fingerprint CD27 CFB0 802C 0944 9CFD 804E C82C 8EB1
On Fri, Aug 15, 2003 at 05:27:14PM +0100, Roland Turner wrote:
Does syslog-ng provide a means to deal with this problem? Does some other solution exist?
Not yet, though disk buffering has been on my todo list for a while.
--
Bazsi
PGP info: KeyID 9AF8D0A9 Fingerprint CD27 CFB0 802C 0944 9CFD 804E C82C 8EB1
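To make the idea of a persistent FIFO (disk buffering) concrete, here is one possible shape for it in Python: every message is appended and fsynced before it counts as queued, and a separate offset file records how far delivery has progressed. This is a hypothetical sketch and not a syslog-ng feature or patch; `DiskFifo`, the file names and the method names are invented for illustration. It never compacts the data file and does not repair a write torn by a crash.

```python
import os


class DiskFifo:
    """Hypothetical append-only disk queue; not a syslog-ng API."""

    def __init__(self, directory):
        os.makedirs(directory, exist_ok=True)
        self.data_path = os.path.join(directory, "queue.log")
        self.offset_path = os.path.join(directory, "queue.offset")
        # Make sure the data file exists so peek() can always open it.
        open(self.data_path, "ab").close()

    def _read_offset(self):
        try:
            with open(self.offset_path, encoding="ascii") as f:
                return int(f.read().strip() or 0)
        except FileNotFoundError:
            return 0

    def put(self, message):
        """Append one single-line message and force it to disk."""
        with open(self.data_path, "ab") as f:
            f.write(message.encode("utf-8").rstrip(b"\n") + b"\n")
            f.flush()
            os.fsync(f.fileno())

    def peek(self):
        """Return (message, next_offset) for the oldest undelivered message,
        or None if no complete message is waiting."""
        with open(self.data_path, "rb") as f:
            f.seek(self._read_offset())
            line = f.readline()
            if not line.endswith(b"\n"):
                return None
            return line[:-1].decode("utf-8"), f.tell()

    def ack(self, next_offset):
        """Record that everything before next_offset has been delivered."""
        tmp = self.offset_path + ".tmp"
        with open(tmp, "w", encoding="ascii") as f:
            f.write(str(next_offset))
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.offset_path)
```

A relay built on this would call `peek()`, try to deliver the message, and call `ack()` only after the loghost has accepted it. A reboot between delivery and `ack()` then causes a duplicate rather than a loss, which is the same trade-off a mail relay makes.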
Participants (3): Amodiovalerio Verde, Balazs Scheidler, Roland Turner