[syslog-ng] multiple file sources, worked - some have now gone silent

Declan White declanw at is.bbc.co.uk
Fri Feb 16 17:22:45 UTC 2018


On Fri, Feb 16, 2018 at 09:44:22AM +0100, Nagy, Gábor wrote:
> Hello!
> 
> From the description you gave it is hard to find out what is happening
> without some more information.

Thanks for replying. 

> Can you share your configuration, please?
> Do you use the internal source, wildcard file-source?

Attaching config. 
 
> We also have a tool that can be used to collect environment and information
> called 'syslog-ng-debun'.
> It can collect sensitive data, so be sure to replace/remove those before
> sending the debun, e.g. IP addresses and passwords from config, etc.
> https://github.com/balabit/syslog-ng/blob/master/contrib/syslog-ng-debun
> https://github.com/balabit/syslog-ng/blob/master/contrib/README.syslog-ng-debun

I'll have a snuffle, but the box it blew up on is sensitive, replicating the issue will be non-trivial, and I don't have much time, at all.

Solaris 10 x86, stripped-down base build, plus the Sun Studio 12 compiler. Packages compiled for syslog-ng, followed by the syslog-ng configure options:
pkg-config-0.29
coreutils-8.29 
binutils-2.29.1
gmp-6.1.2 
mpfr-3.1.6
mpc-1.1.0 
gcc-7.2.0 
chrpath-0.16
glib-2.50.3-gcc
pcre-8.41-gcc  
json-c-0.13-gcc
--enable-java=no 
--with-mongoc=no 
--with-jsonc=system
 
Yes, Solaris is a sinking ship, but the evacuation will take some time. I need the applications on either side of the evacuation route to match first.

The log contains messages like:
Feb 12 22:28:24.07 host1 syslog-ng[12121]: Destination reliable queue full, dropping message; filename='/var/syslog-ng/syslog-ng-00000.rqf', queue_len='3929', mem_buf_size='10000', disk_buf_size='2000000', persist_name='afsocket_dd_qfile(stream,localhost.afunix:/var/syslog-ng/logserver.socket)'

I don't know why it needs to drop messages when the source is a file and the flow-control is on.
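
For counting or alerting on these drops, the key='value' pairs in that message can be pulled apart mechanically. A minimal sketch in Python, assuming the internal log keeps the format shown above; the name parse_fields is made up for illustration and is not part of syslog-ng:

```python
import re

# Sample line copied from the log excerpt above.
LINE = (
    "Feb 12 22:28:24.07 host1 syslog-ng[12121]: Destination reliable queue full, "
    "dropping message; filename='/var/syslog-ng/syslog-ng-00000.rqf', "
    "queue_len='3929', mem_buf_size='10000', disk_buf_size='2000000', "
    "persist_name='afsocket_dd_qfile(stream,localhost.afunix:/var/syslog-ng/logserver.socket)'"
)

# Matches name='value' where the value contains no single quote.
PAIR = re.compile(r"(\w+)='([^']*)'")


def parse_fields(line):
    """Return the key='value' pairs of one internal log message as a dict."""
    return dict(PAIR.findall(line))


fields = parse_fields(LINE)
print(fields["queue_len"], fields["disk_buf_size"])  # prints: 3929 2000000
```

All values come back as strings, so convert with int() before comparing queue_len against a threshold.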

> As you are using Solaris as syslog-ng 3.12.1 have you built it from source
> or used a package?

Source. Had to patch it to get it to compile. Patches attached. Some of them are already included in releases later than 3.12.1.

Tried to remove GCCisms but failed. Had to compile GCC and many other things too. (Fun fact - GCC now contains GCCisms in libcpp, so can only be compiled with GCC)

Tried a later syslog-ng version but the tarball was missing 'configure'. One of them was missing the bundled json-c.
Needed an empty "json_object_private.h" in the include path (should be another patch, but it was easier just to touch the file).
OpenSSL is now compulsory in syslog-ng, but syslog-ng doesn't compile against the Solaris OpenSSL, as it assumes some optional OpenSSL features are present (EC algorithms - patentfoo/geopolitics..). I found an old 0.9.8 just to get it to compile, but I don't want to use SSL anyway (it's dangerous to leave 'custom' SSL libs around to age).

I'm building it to install in an isolated directory, so it can be tested in there, as a different user, independently of any already installed/running version on the same machine.
To make it a deployable I'm running 'chrpath' on all the ELF dynamic objects to replace the RPATH/RUNPATH so it uses its own personal library directory for its own libraries. It's more permanent and effective than the equivalent LD_LIBRARY_PATH wrapper script, and guarantees the deployable is self-contained.
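
The chrpath pass described above could be scripted roughly as follows. This is only a sketch under assumptions: the install prefix and library directory in the usage line are hypothetical, files are recognised purely by the 4-byte ELF magic number, and the only chrpath feature used is its `-r <path>` replace option.

```python
import os
import subprocess

ELF_MAGIC = b"\x7fELF"


def is_elf(path):
    """True if the file starts with the 4-byte ELF magic number."""
    try:
        with open(path, "rb") as fh:
            return fh.read(4) == ELF_MAGIC
    except OSError:
        return False


def find_elf_files(root):
    """Yield every regular, non-symlink ELF file below root."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path) and is_elf(path):
                yield path


def set_rpath(root, new_rpath, dry_run=True):
    """Build (and optionally run) one 'chrpath -r' command per ELF file."""
    commands = []
    for path in sorted(find_elf_files(root)):
        cmd = ["chrpath", "-r", new_rpath, path]
        commands.append(cmd)
        if not dry_run:
            # Failures are not treated as fatal here: some ELF files
            # (for example static objects) may have no path to replace.
            subprocess.run(cmd, check=False)
    return commands


if __name__ == "__main__":
    # Hypothetical prefix; dry_run=True only prints what would be executed.
    for cmd in set_rpath("/opt/syslog-ng-test", "/opt/syslog-ng-test/lib"):
        print(" ".join(cmd))
```

Running it with dry_run=True first gives a list to review before any binary is modified.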

This causes fun, as syslog-ng mostly relies on the compiled-in install paths being the same as the config runpaths, and most of these paths cannot be controlled in the config file. 
This makes the command line veeeerrrryyyy looooooong, and there's no way to stop it trying to chdir to the wrong place on startup (so core dumps probably end up in whatever directory you were in when you started it?).
There may be more than one syslog-ng being run by more than one non-root user.
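
To give an idea of what that command line ends up looking like, here is a sketch that assembles it for one isolated prefix. The directory layout under the prefix and the helper name syslog_ng_argv are assumptions for illustration; the options used (--cfgfile, --persist-file, --pidfile, --control, --module-path, --foreground) are the ones listed in the syslog-ng(8) manual page, and this is not necessarily the full set a given build needs.

```python
import posixpath


def syslog_ng_argv(prefix, foreground=True):
    """Build an argv that keeps every runtime path inside one prefix."""
    var = posixpath.join(prefix, "var")
    argv = [
        posixpath.join(prefix, "sbin", "syslog-ng"),
        "--cfgfile", posixpath.join(prefix, "etc", "syslog-ng.conf"),
        "--persist-file", posixpath.join(var, "syslog-ng.persist"),
        "--pidfile", posixpath.join(var, "syslog-ng.pid"),
        "--control", posixpath.join(var, "syslog-ng.ctl"),
        "--module-path", posixpath.join(prefix, "lib", "syslog-ng"),
    ]
    if foreground:
        argv.append("--foreground")
    return argv


if __name__ == "__main__":
    # Hypothetical prefix for a private test instance.
    print(" ".join(syslog_ng_argv("/opt/syslog-ng-test")))
```

Generating the argv from a single prefix at least keeps two instances run by different users from sharing a persist file or control socket by accident.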

You've already seen me abandon unix-stream as a source in previous listmails - it breaks when a C library-call tracer is *not* attached, probably for threading reasons (the tracer will be serialising the calls), at which point I turned and ran. You've also seen me fix the SGID dir usage case: syslog-ng does not trust the user's umask and overrides it with something too restrictive, so it really needs a way to set its own umask. And you've seen me hit the framing differences between unix-stream() and network()/syslog() (maybe frame/no-frame could just be added as flags?)

My use case is strangely simple. I want changes to a list of files on one host replicated to another host, reliably. Reliably means accounting for any network and host disruption, file truncation or rotation.
This may seem straightforward but there is no such software. People I've tracked down in the same situation are just running rsync in while(1) loops, which doesn't scale. (Also, I've seen rsync protocol-deadlock on big-v-little-endian + 32v64 + differing-raw-directory-order weirdness before).
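
The file-side half of that requirement (noticing truncation and rotation) can be sketched in a few lines. This is an illustration of the idea only, not how syslog-ng implements its file source: the class name FileFollower is made up, rotation is assumed to mean rename-and-recreate (so the inode changes), and neither network delivery nor saving the offset across restarts is covered.

```python
import os


class FileFollower:
    """Read new complete lines from a file, surviving truncation and rotation."""

    def __init__(self, path):
        self.path = path
        self.inode = None  # inode of the file we last read from
        self.offset = 0    # byte offset of the next unread data

    def poll(self):
        """Return the complete lines appended since the previous poll."""
        try:
            st = os.stat(self.path)
        except FileNotFoundError:
            return []  # mid-rotation: the new file does not exist yet
        if st.st_ino != self.inode:
            # Different inode: the file was rotated (renamed and recreated).
            self.inode, self.offset = st.st_ino, 0
        elif st.st_size < self.offset:
            # Same inode but shorter than our offset: truncated in place.
            self.offset = 0
        with open(self.path, "rb") as fh:
            fh.seek(self.offset)
            data = fh.read()
        # Consume only up to the last newline so a partial line is retried.
        end = data.rfind(b"\n") + 1
        self.offset += end
        lines = data[:end].split(b"\n")[:-1]
        return [line.decode("utf-8", "replace") for line in lines]
```

Even this small version has gaps that matter for "reliably": anything written to the old file between the last poll and the rotation is lost, and a truncation followed by a rewrite longer than the old offset goes unnoticed.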

I tried rsyslog (which required configuration in env vars as well as command line options) and watched its 'reliable protocol' module go insane, flinging messages at a failed connection socket, spinning on the CPU flinging the same bytes, then timing out and declaring success.

So as you can see I've been having fun. I can only logically conclude I've run over Murphy's cat.

If this is all blowing up because the patches I applied to get it to compile weren't thread safe, that would be appropriately ironic.

- Declan

> Best Regards,
> Gabor
> 
> 
> On Thu, Feb 15, 2018 at 6:13 PM, Declan White <declanw at is.bbc.co.uk> wrote:
> 
> > The data in the core dump would need to stay in my hands, so that's no
> > good.
> >
> > I'm going to have to toss syslog-ng out :(
> > Silencing sources without logging anything, when things went wrong in the
> > most common way, is a complete deal breaker.
> >
> > - Declan
> >
> > On Wed, Feb 14, 2018 at 09:52:37PM +0100, Fabien Wernli wrote:
> > > Hi,
> > >
> > > this is very unfortunate. I'm sure a core dump of the process would be
> > > helpful to the developers. Not sure if gcore or similar is available on
> > > Solaris though.
> > >
> > > cheers
> > >
> > > ____________________________________________________________
> > __________________
> > > Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
> > > Documentation: http://www.balabit.com/support/documentation/?product=
> > syslog-ng
> > > FAQ: http://www.balabit.com/wiki/syslog-ng-faq
> > >
> > >
> >
> > --
> > Declan White



-- 
Declan White
IT Services - Unix Engineer
BBC Service Operations

T: +44 (0)2036 181487 
E: declan.white at atos.net
W: uk.atos.net

-------------- next part --------------
@version: 3.12
#@include "/usr/local/syslog-ng/etc/scl/*/*.conf"

options{ 
        use_dns(no) ;
        use_uniqid(yes) ;
        time_reopen(5) ;
        time_reap(90000) ;
        frac-digits(2) ;
} ;

source in_internal {
        internal() ;
} ;

source in_test {
        file( "/var/syslog-ng/test.log"
                flags(no-parse)
        ) ;
} ;
source in_var_log_daemon { file( "/var/log/daemon.log" flags(no-parse) ) ; } ;
source in_var_log_rsyncd { file( "/var/log/rsyncd.log" flags(no-parse) ) ; } ;
source in_var_log_authlog { file( "/var/log/authlog" flags(no-parse) ) ; } ;
source in_var_log_syslog { file( "/var/log/syslog" flags(no-parse) ) ; } ;
source in_var_adm_messages { file( "/var/adm/messages" flags(no-parse) ) ; } ;
source in_var_cron_log { file( "/var/cron/log" flags(no-parse) ) ; } ;
source in_var_svc_log { 
        wildcard-file( 
                base-dir("/var/svc/log") 
                filename-pattern("*.log")
                recursive(no)
                follow-freq(2)
                flags(no-parse) 
        ) ; 
} ;

destination out_internal {
        file( "/var/syslog-ng/syslog-ng.log"
             create-dirs(yes)
        ) ;
} ;
destination out_socket {
        unix-stream( "/var/syslog-ng/logserver.socket" 
                template("JSON: $(format-json --key LOGHOST,FILE_NAME,ISODATE,RCPTID,RUNID,SEQNUM,TAGS,MSG)\n")
                disk-buffer(
                        mem-buf-size(10000)
                        disk-buf-size(2000000)
                        reliable(yes)
                        dir("/var/syslog-ng")
                )
                flags(syslog-protocol)
        ) ;
} ;

log {
        source(in_internal) ;
        destination(out_internal) ;
} ;

log {
        source(in_test) ; 
        source(in_var_log_daemon) ; 
        source(in_var_log_rsyncd) ; 
        source(in_var_log_authlog) ; 
        source(in_var_log_syslog) ; 
        source(in_var_adm_messages) ; 
        source(in_var_cron_log) ; 
        source(in_var_svc_log) ; 
        destination(out_socket) ;
        flags(flow-control) ;
} ;

@include "/etc/syslog-ng.d/[0-9]*.conf"

-------------- next part --------------
# These are huge files that are very busy
source in_var_app2_file1_log { file( "/var/app2/file1.log" flags(no-parse) ) ; } ;
source in_var_app2_file1_err { file( "/var/app2/file1.err" flags(no-parse) ) ; } ;
source in_var_app2_file2_log { file( "/var/app2/file2.log" flags(no-parse) ) ; } ;
source in_var_app2_file2_err { file( "/var/app2/file2.err" flags(no-parse) ) ; } ;

log {
        source(in_var_app2_file1_log) ;
        source(in_var_app2_file1_err) ;
        source(in_var_app2_file2_log) ;
        source(in_var_app2_file2_err) ;
        destination(out_socket) ;
        flags(flow-control) ;
} ;

-------------- next part --------------
source in_var_app1_log { file( "/var/app1/app1.log" flags(no-parse) ) ; } ;
source in_var_app1_err { file( "/var/app1/app1.err" flags(no-parse) ) ; } ;

log {
        source(in_var_app1_log) ;
        source(in_var_app1_err) ;
        destination(out_socket) ;
        flags(flow-control) ;
} ;

-------------- next part --------------
A non-text attachment was scrubbed...
Name: syslog-ng-3.12.1-1.diff.gz
Type: application/x-gunzip
Size: 603 bytes
Desc: not available
URL: <http://lists.balabit.hu/pipermail/syslog-ng/attachments/20180216/00ebd510/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: syslog-ng-3.12.1-2.diff.gz
Type: application/x-gunzip
Size: 243 bytes
Desc: not available
URL: <http://lists.balabit.hu/pipermail/syslog-ng/attachments/20180216/00ebd510/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: syslog-ng-3.12.1-3.diff.gz
Type: application/x-gunzip
Size: 334 bytes
Desc: not available
URL: <http://lists.balabit.hu/pipermail/syslog-ng/attachments/20180216/00ebd510/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: syslog-ng-3.12.1-4.diff.gz
Type: application/x-gunzip
Size: 2217 bytes
Desc: not available
URL: <http://lists.balabit.hu/pipermail/syslog-ng/attachments/20180216/00ebd510/attachment-0003.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: syslog-ng-3.12.1-5.diff.gz
Type: application/x-gunzip
Size: 345 bytes
Desc: not available
URL: <http://lists.balabit.hu/pipermail/syslog-ng/attachments/20180216/00ebd510/attachment-0004.bin>

