<div dir="ltr"><div dir="ltr">Hi Evan,<div><br></div><div>After we fork the process, if initialization fails in the new daemon, we first wait for it to exit. If it has not exited within 3 seconds, we send it SIGTERM, then SIGKILL.</div><div>If even those signals cannot terminate the process, we leave it alone and start a new one. In this case we log the following:</div><div>"Initialization failed but the daemon did not exit, even when forced to, trying to recover"</div><div><br></div><div>Could you please check whether you have seen this log message anywhere (journal, etc.)?</div><div><br></div><div>Best regards,</div><div>Attila</div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Jan 28, 2019 at 10:11 AM Szakacs, Attila &lt;<a href="mailto:attila.szakacs@balabit.com">attila.szakacs@balabit.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr"><div dir="ltr">Hi Evan,<div><br></div><div>Thank you for the heads-up!</div><div><br></div><div>This issue might be related to this PR: <a href="https://github.com/balabit/syslog-ng/pull/2099" target="_blank">https://github.com/balabit/syslog-ng/pull/2099</a></div><div>I will try to reproduce it.</div><div><br></div><div>Best regards,</div><div>Attila</div></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail-m_-8828739522549133361gmail_attr">On Mon, Jan 21, 2019 at 9:51 PM Evan Rempel &lt;<a href="mailto:erempel@uvic.ca" target="_blank">erempel@uvic.ca</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">We just had a network event in our environment that resulted in many hosts being isolated on the network in some very odd ways.<br>

The hosts had some connections inbound, but no outbound and the network seemed to come and go very quickly.<br>

<br>

<br>

syslog-ng detected that it had lost the connection to its downstream syslog collectors and did the normal connection reaping. We have this set to 5 seconds.<br>

At some point the syslog-ng process terminated and the supervisor process launched a new child. This worked well, but in<br>

this odd scenario this happened quickly, and at one point the supervisor process failed to reap one of its defunct children but still launched a new child<br>

process. The defunct process seemed to hold onto the /dev/log handle so the new child could not get access to it and the result is that<br>

we lost hours of logs that flow through the /dev/log socket.<br>

When we discovered the issue, there were two children of the supervisor process. One was in a "defunct" state while the other was running,<br>

but only logging kernel messages (from /dev/kmsg) and from the internal syslog-ng source.<br>

I suspect that the child reaping process on SIGCHLD does not use a while loop and misses some children.<br>
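<br>
A minimal sketch of what such a loop could look like, assuming a plain POSIX supervisor. This is not syslog-ng's actual code, and reap_exited_children is a hypothetical name. Standard signals are not queued, so several child exits can arrive as a single SIGCHLD, and one waitpid() call per signal can leave zombies behind:<br>

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper, not taken from syslog-ng.
 * Reaps every child that has already exited, without blocking.
 * Returns the number of children reaped. Safe to call from a
 * SIGCHLD handler: waitpid() is async-signal-safe and errno is
 * saved and restored. */
int reap_exited_children(void)
{
    int saved_errno = errno;
    int reaped = 0;
    int status;

    for (;;) {
        pid_t pid = waitpid(-1, &status, WNOHANG);
        if (pid > 0) {
            reaped++;          /* one zombie collected, look for more */
            continue;
        }
        if (pid == -1 && errno == EINTR)
            continue;          /* interrupted, retry */
        break;                 /* 0: no exited child left; -1: no children at all */
    }

    errno = saved_errno;
    return reaped;
}
```

Calling this once per SIGCHLD collects all children that have exited so far, however many exits were merged into that one signal.<br>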

<br>

Evan.<br>

<br>


<br>

</blockquote></div>

</blockquote></div>
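<div>For reference, the escalation described at the top of this message (wait, then SIGTERM, then SIGKILL) can be sketched roughly as follows. This is an illustration only, not the syslog-ng source: stop_child and wait_for_exit are hypothetical names, and using the same grace period after each signal is an assumption, since only the initial 3-second wait is stated above.</div>

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>

/* Hypothetical helper: poll for up to timeout_ms for the child to exit.
 * Returns 1 if the child is gone (reaped, or no longer our child),
 * 0 if it is still present when the timeout expires. */
static int wait_for_exit(pid_t pid, int timeout_ms)
{
    const struct timespec step = { 0, 10 * 1000 * 1000 }; /* 10 ms */
    int waited_ms = 0;
    int status;

    for (;;) {
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            return 1;
        if (r == -1 && errno != EINTR)
            return 1;          /* e.g. ECHILD: nothing left to wait for */
        if (waited_ms >= timeout_ms)
            return 0;
        nanosleep(&step, NULL);
        waited_ms += 10;
    }
}

/* Hypothetical helper: wait, then SIGTERM, then SIGKILL.
 * Returns 0 if the child exited on its own, SIGTERM or SIGKILL if
 * that signal was needed, and -1 if the child is still present, in
 * which case the caller leaves it alone and starts a new one. */
int stop_child(pid_t pid, int grace_ms)
{
    if (wait_for_exit(pid, grace_ms))
        return 0;

    kill(pid, SIGTERM);
    if (wait_for_exit(pid, grace_ms))
        return SIGTERM;

    kill(pid, SIGKILL);
    if (wait_for_exit(pid, grace_ms))
        return SIGKILL;

    return -1;
}
```

<div>With a grace period of 3000 ms, a return value of -1 corresponds to the case in which the log message quoted above is written.</div>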