[syslog-ng] disk-buffer in elasticsearch2 destination loses messages if docker container is killed

Jose Angel Santiago jasantiago at stratio.com
Fri Sep 14 12:05:04 UTC 2018


Hi,

I now have more precise information about where my messages are being lost.

I'm now using disk-buffer on both the syslog-agent and the syslog-relay, and
I've verified that the lost messages are the ones sent by the syslog-agent
while the syslog-relay docker container is being killed. I can see those
messages in the syslog-agent log (both agent and relay are running in debug
mode), each with its corresponding "Outgoing message" line, but they never
reach the relay.
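
For context, the agent side of such a setup can be sketched roughly as
follows. This is only a minimal sketch and not the actual agent
configuration: the log directory, file pattern, relay hostname, port and
buffer directory are all placeholder assumptions.

```
# Hypothetical agent-side sketch; every path and hostname is a placeholder.
source s_service_logs {
    wildcard-file(
        base-dir("/var/log/myservice")      # assumed location of the service logs
        filename-pattern("*.log")           # assumed file pattern
    );
};

destination d_syslog_relay {
    network(
        "syslog-relay.example"              # assumed relay FQDN
        port(514)
        transport("tcp")
        disk-buffer(
            mem-buf-size(16M)
            disk-buf-size(16M)
            reliable(yes)
            dir("/var/lib/syslog-ng/buffer")  # assumed buffer directory
        )
    );
};

log {
    source(s_service_logs);
    destination(d_syslog_relay);
};
```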

Could it be that the relay docker container is still returning ACKs to the
agent (the agent resolves the relay FQDN through a custom DNS) while the
syslog-ng process inside the container is being stopped? I'm about to test
again with tcpdump to confirm this theory.

BTW, the disk-buffers work fine. I sometimes get a few duplicated messages
when restarting the relay, but that's not a problem for me. Forget about my
.persist file re-creation theory; it doesn't happen.

Regards.



2018-09-13 18:07 GMT+02:00 Péter, Kókai <peter.kokai at oneidentity.com>:

> Hello,
>
> It would not make sense to replace the persist file after a restart, so
> syslog-ng does not do that. The only exception is when the file is
> corrupted; in that case there should be a log message about it at startup.
> Have you checked the syslog-ng logs? (It would be better to enable debug
> and/or verbose logging and, if possible, share the logs with us.)
>
> Could you reproduce the same behavior without docker, if possible?
>
>
>
> Best Regards,
> Peter Kokai
>
> On Thu, Sep 13, 2018 at 4:44 PM Jose Angel Santiago <
> jasantiago at stratio.com> wrote:
>
>> Hi,
>>
>> I think I know what is happening: when I start the docker container from
>> scratch, syslog-ng recreates the persist file and the buffer file even
>> though I provide both within the mapped volume, so all messages in the
>> buffer file that were not yet processed by the relay are lost.
>>
>> Is there any way to tell syslog-ng to use an already existing .persist
>> file so that it doesn't recreate the .rqf file?
>>
>> Regards.
>>
>>
>>
>> 2018-09-13 16:23 GMT+02:00 Budai, László <laszlo.budai at oneidentity.com>:
>>
>>> Hi,
>>>
>>> One problem could be a flush-limit greater than 1: in that case
>>> syslog-ng would use an HttpBulkMessageProcessor. syslog-ng passes each
>>> message to the HttpBulkMessageProcessor and sends a positive ACK back to
>>> the LogSource (so the message is removed from the disk-buffer); if the
>>> docker container is then killed, all the messages still held in the
>>> HttpBulkMessageProcessor are lost.
>>> In your case, however, syslog-ng should use the
>>> HttpSingleMessageProcessor, which means that the messages are sent one
>>> by one.
>>> Could you check the disk-buffer with dqtool?
>>>
>>>
>>> L.
>>>
>>> On Thu, Sep 13, 2018 at 3:50 PM, Jose Angel Santiago <
>>> jasantiago at stratio.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm running syslog-ng (with an elasticsearch2 destination configured)
>>>> within a docker container, and I'm trying to avoid losing messages when
>>>> I kill the docker container and start it again.
>>>>
>>>> This is my scenario:
>>>>
>>>> - A service which produces 20 lines of log per second.
>>>> - A syslog-ng instance (let's call it syslog-agent) reading from a
>>>> wildcard-file source (although it actually only reads logs from the
>>>> above service), which sends all logs through a network destination to
>>>> another syslog-ng instance (the one running in a docker container, let's
>>>> call it syslog-relay).
>>>> - The syslog-relay sends messages to an elasticsearch instance, with the
>>>> following configuration:
>>>>
>>>> options {
>>>>     chain-hostnames(no);
>>>>     use-dns(no);
>>>>     keep-hostname(yes);
>>>>     owner("syslog-ng");
>>>>     group("stratio");
>>>>     perm(0640);
>>>>     time-reap(30);
>>>>     mark-freq(10);
>>>>     stats-freq(0);
>>>>     bad-hostname("^gconfd$");
>>>>     flush-lines(100);
>>>>     log-fifo-size(1000);
>>>>     };
>>>>
>>>> destination d_elastic_default_0 {
>>>>     elasticsearch2(
>>>>         cluster("myelastic")
>>>>         cluster-url("https://myelastic.logs:9200")
>>>>         client_mode("https")
>>>>         index("default")
>>>>         type("log")
>>>>         flush-limit(1)
>>>>         disk-buffer(
>>>>             mem-buf-size(16M)
>>>>             disk-buf-size(16M)
>>>>             reliable(yes)
>>>>             dir("/syslog-ng/log")
>>>>         )
>>>>         http-auth-type("clientcert")
>>>>         java-keystore-filepath("/etc/syslog-ng/certificates/syslog-relay.jks")
>>>>         java-keystore-password("XXXXXX")
>>>>         java-truststore-filepath("/etc/syslog-ng/certificates/ca-bundle.jks")
>>>>         java-truststore-password("XXXXXXXXXX")
>>>>     );
>>>> };
>>>>
>>>> - The dir "/syslog-ng/log" is mapped to the path "/tmp/buffer" on the
>>>> host where the docker container is running, so the buffer file is not
>>>> lost when I kill the docker container.
>>>> - I've set flush-limit to 1 because I thought that way I would lose at
>>>> most 1 message.
>>>>
>>>> This architecture works fine (flush-limit=1 makes it very slow, but
>>>> that is OK for this test). However, if I kill the syslog-relay docker
>>>> container, wait 5 to 10 seconds and start it again from scratch, I can
>>>> see that several hundred logs are missing in elasticsearch. I check this
>>>> by stopping the logger service and letting the syslog-ng agent and relay
>>>> finish processing the enqueued messages.
>>>>
>>>> I can see in the syslog-agent stats that all log messages have been
>>>> processed, so the problem seems to be on the syslog-relay.
>>>>
>>>> Is this behaviour expected? If so, how can I protect against message
>>>> loss if the syslog-relay docker container is killed unexpectedly?
>>>>
>>>> Thanks in advance.
>>>>
>>>>
>>>> --
>>>>
>>>> | Jose Angel Santiago
>>>>
>>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>>>>
>>>> 28224 Pozuelo de Alarcón, Madrid, Spain
>>>>
>>>> +34 918 286 473 | www.stratio.com
>>>>
>>>> ______________________________________________________________________________
>>>> Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
>>>> Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng
>>>> FAQ: http://www.balabit.com/wiki/syslog-ng-faq
>>>>
>>>>
>>>>
>>>
>>
>>

