[syslog-ng] syslog-ng anon patch

Roberto Nibali ratz at tac.ch
Thu Jun 2 08:36:07 CEST 2005


Hi,

I'll just throw in my 2 cents ...

> The attached patch comes from http://dev.riseup.net/patches/syslog-ng

That URL gives you a 404 at first, until you click on login.

> what it does is provide a simple filter to strip out unwanted regular
> expressions from logs, as well as an IP alias that enables you to strip
> out IP addresses from your logs. This patch has been applied to the
> Debian package of syslog-ng, I am writing here to let people know about
> it and submit it for consideration into syslog-ng.

Despite all the data retention documents written for ISPs and OSPs, I basically
think this is a bad idea. That said, I also see that this patch is rather
non-intrusive and might indeed help a few people working in these business
fields. It is a bad idea not least because the logic for hiding data belongs in
the frontend and/or the extraction process (ETL), not in the data storage. On a
central syslog server, for example, you may want to apply data mining
techniques, which need the whole set of raw, unfiltered data. Well, only
partially unfiltered, since one will certainly apply filters in the log
statements.

But as I said above, the patch is non-intrusive and has some merit.

>  Data retention has become a hot legal topic for ISPs and other Online
>  Service Providers (OSPs). There are many instances where it is preferable
>  to keep less information on users than is collected by default on many
>  systems. In the United States it is not currently required to retain
>  data on users of a server, but you may be required to provide all data
>  on a user which you have retained.

This is a hot topic in Switzerland, for example, where legislative reforms have
taken place which might demand precisely that stripping not be allowed.

>   Rather than scrubbing the information you don't want in logs, this patch
>   ensures that the information is never written to disk. Also, for those
>   daemons which log through syslog facilities, this patch provides a
>   convenient single configuration to limit what you wish to log.

This is not entirely true. With your patch you add a third method of dealing
with information, but it is not on the same level as the other two.

Method 1: have log statements which omit certain log lines, and don't set a
          catchall log statement

Method 2: build a filter for lines you'd like to match and forget. Add a
          destination statement with /dev/null as file destination.

Method 3: strip the lines.

Methods 1 and 2 drop information but basically preserve the truth value of what
remains. Method 3 alters the information itself and thus, strictly speaking,
dilutes the truth. Dealing with the legal aspects of information gain/loss with
regard to dilution is a delicate matter.

>   This patch adds the filter "strip". For example:
> 
> 
>         filter f_strip {strip(<regexp>);};

I don't see the need for a separate strip keyword when it is just a subset of
replace. Please drop it; see the equivalent lines below, which you wrote
yourself.

>   replace(ips,"0.0.0.0") <--- this is the same as strip(ips)
>   replace(<regex>,"----") <--- this is the same as strip(regex)


> We provide a debian package of 1.6.7 with this patch added
> (the repository is http://deb.riseup.net/debian unstable main), or you
> can retrieve the patch yourself from
> http://dev.riseup.net/websvn/listing.php?repname=syslog-ng-anon&path=%2F&sc=0
> and apply it with:
> 
> # patch -p1 < syslog-ng-anon.diff
> 
> 
> ------------------------------------------------------------------------
> 
> diff -uNr orig/syslog-ng-1.6.7/doc/Makefile.am new/syslog-ng-1.6.7/doc/Makefile.am
> --- orig/syslog-ng-1.6.7/doc/Makefile.am	2005-03-04 09:58:08.000000000 -0600
> +++ new/syslog-ng-1.6.7/doc/Makefile.am	2005-05-30 18:26:29.986769706 -0500
> @@ -4,7 +4,7 @@
>  
>  EXTRA_DIST = $(man_MANS) stresstest.sh syslog-ng.old.txt	\
>  	syslog-ng.conf.demo syslog-ng.conf.sample \
> -	syslog-ng.conf.solaris 
> -
> +	syslog-ng.conf.solaris README.syslog-ng-anon \
> +	syslog-ng-anon.conf
>  
>  
> diff -uNr orig/syslog-ng-1.6.7/doc/Makefile.in new/syslog-ng-1.6.7/doc/Makefile.in
> --- orig/syslog-ng-1.6.7/doc/Makefile.in	2005-04-09 05:50:58.000000000 -0500
> +++ new/syslog-ng-1.6.7/doc/Makefile.in	2005-05-30 18:29:45.194741054 -0500
> @@ -116,7 +116,9 @@
>  
>  EXTRA_DIST = $(man_MANS) stresstest.sh syslog-ng.old.txt	\
>  	syslog-ng.conf.demo syslog-ng.conf.sample \
> -	syslog-ng.conf.solaris 
> +	syslog-ng.conf.solaris README.syslog-ng-anon \
> +	syslog-ng-anon.conf
> +
>  
>  subdir = doc
>  ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
> diff -uNr orig/syslog-ng-1.6.7/doc/README.syslog-ng-anon new/syslog-ng-1.6.7/doc/README.syslog-ng-anon
> --- orig/syslog-ng-1.6.7/doc/README.syslog-ng-anon	1969-12-31 18:00:00.000000000 -0600
> +++ new/syslog-ng-1.6.7/doc/README.syslog-ng-anon	2005-05-30 18:25:40.828858265 -0500
> @@ -0,0 +1,93 @@
> +syslog-ng-anon
> +
> + This patch adds the capability to syslog-ng that allows you to strip
> + out any given regexp or all IP addresses from log messages before
> + they are written to disk. The goal is to give the system administrator
> + the means to implement site logging policies, by allowing them easy
> + control over exactly what data they retain in their logfiles,
> + regardless of what a particular daemon might think is best.
> +
> +Background:
> +
> + Data retention has become a hot legal topic for ISPs and other Online
> + Service Providers (OSPs). There are many instances where it is preferable
> + to keep less information on users than is collected by default on many
> + systems. In the United States it is not currently required to retain
> + data on users of a server, but you may be required to provide all data
> + on a user which you have retained. OSPs can protect themselves from legal
> + hassles and added work by choosing what data they wish to retain.
> +
> + From "Best Practices for Online Service Providers"
> + (http://www.eff.org/osp):
> +
> +  As an intermediary, the OSP [Online Service Provider] finds itself in
> +  a position to collect and store detailed information about its users
> +  and their online activities that may be of great interest to third
> +  parties. The USA PATRIOT Act also provides the government with
> +  expanded powers to request this information. As a result, OSP owners
> +  must deal with requests from law enforcement and lawyers to hand over
> +  private user information and logs. Yet, compliance with these demands
> +  takes away from an OSP's goal of providing users with reliable,
> +  secure network services. In this paper, EFF offers some suggestions,
> +  both legal and technical, for best practices that balance the needs
> +  of OSPs and their users' privacy and civil liberties.
> + 
> +  Rather than scrubbing the information you don't want in logs, this patch
> +  ensures that the information is never written to disk. Also, for those 
> +  daemons which log through syslog facilities, this patch provides a 
> +  convenient single configuration to limit what you wish to log.
> +  
> +  Here are some related links:
> +  
> +  Best Practices for Online Service Providers
> +  http://www.eff.org/osp
> +  http://www.eff.org/osp/20040819_OSPBestPractices.pdf
> +  
> +  EPIC International Data Retention Page
> +  http://www.epic.org/privacy/intl/data_retention.html
> +  
> +  Working Paper on Usage Log Data Management (from Computer, Freedom, and 
> +  Privacy conference) http://cryptome.org/usage-logs.htm
> +  
> +
> +Installing syslog-ng-anon 
> +  
> + Applying the patch
> +
> +  This patch has been tested against the following versions of syslog-ng:
> + 	. version 1.6.7
> + 	. Debian package syslog-ng_1.6.7-2
> +
> +
> +  To use this patch, obtain the source for syslog-ng 
> +  (http://www.balabit.com/downloads/syslog-ng/1.6/src/) and the latest
> +  syslog-ng-anon patch (http://dev.riseup.net/patches/syslog-ng/). 
> +  Uncompress the syslog-ng source and then apply the patch:
> +
> +  % tar -zxvf syslog-ng.tar.gz
> +  % cd syslog-ng
> +  % patch -p1 < syslog-ng-anon.diff
> + 
> +  Then compile and install syslog-ng as normal.
> +
> + Debian package
> +
> +  Alternately, you can install syslog-ng-anon from this repository:
> +  deb http://deb.riseup.net/debian unstable main
> +
> + How to use it
> +
> +  This patch adds the filter "strip". For example:
> +
> + 	filter f_strip {strip(<regexp>);};
> +
> +  This will strip out all matches of the regular expression on logs to
> +  which the filter is applied and replaces all matches with the fixed length
> +  four dashes ("----").
> +
> +  In place of a regular expression, you can put "ips", which will replace all
> +  internet addresses with 0.0.0.0. For example:
> +
> + 	filter f_strip {strip(ips);};
> +
> +  You can alter what the replacement strings are by using replace:
> diff -uNr orig/syslog-ng-1.6.7/doc/syslog-ng-anon.conf new/syslog-ng-1.6.7/doc/syslog-ng-anon.conf
> --- orig/syslog-ng-1.6.7/doc/syslog-ng-anon.conf	1969-12-31 18:00:00.000000000 -0600
> +++ new/syslog-ng-1.6.7/doc/syslog-ng-anon.conf	2005-05-30 18:25:40.828858265 -0500
> @@ -0,0 +1,243 @@
> +#
> +# Configuration file for syslog-ng under Debian.
> +# Customized for riseup.net using syslog-ng-anon patch
> +# (http://dev.riseup.net/patches/syslog-ng/)
> +#
> +# see http://www.campin.net/syslog-ng/expanded-syslog-ng.conf
> +# for examples.
> +#
> +# levels: emerg alert crit err warning notice info debug
> +#
> +
> +############################################################
> +## global options
> +
> +options {
> +    chain_hostnames(0);
> +    time_reopen(10);
> +    time_reap(360);
> +    sync(0);
> +    log_fifo_size(2048);
> +    create_dirs(yes);
> +    group(adm);
> +    perm(0640);
> +    dir_perm(0755);
> +    use_dns(no);
> +};
> +
> +############################################################
> +## universal source
> +
> +source s_all {
> +    internal();
> +    unix-stream("/dev/log");
> +    file("/proc/kmsg" log_prefix("kernel: "));
> +};
> +
> +############################################################
> +## generic destinations
> +
> +destination df_facility_dot_info   { file("/var/log/$FACILITY.info");   };
> +destination df_facility_dot_notice { file("/var/log/$FACILITY.notice"); };
> +destination df_facility_dot_warn   { file("/var/log/$FACILITY.warn");   };
> +destination df_facility_dot_err    { file("/var/log/$FACILITY.err");    };
> +destination df_facility_dot_crit   { file("/var/log/$FACILITY.crit");   };
> +
> +############################################################
> +## generic filters
> +
> +filter f_strip { strip(ips); };
> +filter f_at_least_info   { level(info..emerg);   };
> +filter f_at_least_notice { level(notice..emerg); };
> +filter f_at_least_warn   { level(warn..emerg);   };
> +filter f_at_least_err    { level(err..emerg);    };
> +filter f_at_least_crit   { level(crit..emerg);   };
> +
> +############################################################
> +## auth.log
> +
> +filter f_auth { facility(auth, authpriv); };
> +destination df_auth { file("/var/log/auth.log"); };
> +log {
> +    source(s_all);
> +    filter(f_auth);
> +    destination(df_auth);
> +};
> +
> +############################################################
> +## daemon.log
> +
> +filter f_daemon { facility(daemon); };
> +destination df_daemon { file("/var/log/daemon.log"); };
> +log {
> +    source(s_all);
> +    filter(f_daemon);
> +    destination(df_daemon);
> +};
> +
> +############################################################
> +## kern.log
> +
> +filter f_kern { facility(kern); };
> +destination df_kern { file("/var/log/kern.log"); };
> +log {
> +    source(s_all);
> +    filter(f_kern);
> +    destination(df_kern);
> +};
> +
> +############################################################
> +## user.log
> +
> +filter f_user { facility(user); };
> +destination df_user { file("/var/log/user.log"); };
> +log {
> +    source(s_all);
> +    filter(f_user);
> +    destination(df_user);
> +};
> +
> +############################################################
> +## sympa.log
> +
> +filter f_sympa { program("^(sympa|bounced|archived|task_manager)"); };
> +destination d_sympa { file("/var/log/sympa.log"); };
> +log {
> +	source(s_all);
> +	filter(f_sympa);
> +	destination(d_sympa);
> +	flags(final);
> +};
> +
> +############################################################
> +## wwsympa.log
> +
> +filter f_wwsympa { program("^wwsympa"); };
> +destination d_wwsympa { file("/var/log/wwsympa.log"); };
> +log {
> +	source(s_all);
> +	filter(f_wwsympa);
> +	filter(f_strip);
> +	destination(d_wwsympa);
> +	flags(final);
> +};
> +
> +############################################################
> +## ldap.log
> +
> +filter f_ldap { program("slapd"); };
> +destination d_ldap { file("/var/log/ldap.log"); };
> +log {
> +	source(s_all);
> +	filter(f_ldap);
> +	destination(d_ldap);
> +	flags(final);
> +};
> +
> +############################################################
> +## postfix.log
> +
> +# special source because of chroot jail
> +#source s_postfix { unix-stream("/var/spool/postfix/dev/log" keep-alive(yes)); }; 
> +filter f_postfix { program("^postfix/"); };
> +destination d_postfix { file("/var/log/postfix.log"); };
> +log {
> +	source(s_all);
> +	filter(f_postfix);
> +	filter(f_strip);
> +	destination(d_postfix);
> +	flags(final);
> +};
> +
> +############################################################
> +## courier.log
> +
> +filter f_courier { program("courier|imap|pop"); };
> +destination d_courier { file("/var/log/courier.log"); };
> +log {
> +	source(s_all);
> +	filter(f_courier);
> +	filter(f_strip);
> +	destination(d_courier);
> +	flags(final);
> +};
> +
> +############################################################
> +## maildrop.log
> +
> +filter f_maildrop { program("^maildrop"); };
> +destination d_maildrop { file("/var/log/maildrop.log"); };
> +log {
> +	source(s_all);
> +	filter(f_maildrop);
> +	destination(d_courier);
> +	flags(final);
> +};
> +
> +############################################################
> +## mail.log
> +
> +filter f_mail { facility(mail); };
> +destination df_mail { file("/var/log/mail.log"); };
> +
> +log {
> +    source(s_all);
> +    filter(f_mail);
> +    destination(df_mail);
> +};
> +
> +############################################################
> +## messages.log
> +
> +filter f_messages {
> +	level(debug,info,notice)
> +	and not facility(auth,authpriv,daemon,mail,user,kern);
> +};
> +destination df_messages { file("/var/log/messages.log"); };
> +log {
> +    source(s_all);
> +    filter(f_messages);
> +    destination(df_messages);
> +};
> +
> +############################################################
> +## errors.log
> +
> +filter f_errors {
> +	level(warn,err,crit,alert,emerg)
> +	and not facility(auth,authpriv,daemon,mail,user,kern);
> +};
> +destination df_errors { file("/var/log/errors.log"); };
> +log {
> +	source(s_all);
> +	filter(f_errors);
> +	destination(df_errors);
> +};
> +
> +############################################################
> +## emergencies
> +
> +filter f_emerg { level(emerg); };
> +destination du_all { usertty("*"); };
> +log {
> +	source(s_all);
> +	filter(f_emerg);
> +	destination(du_all);
> +};
> +
> +############################################################
> +## console messages
> +
> +filter f_xconsole {
> +    facility(daemon,mail)
> +    or level(debug,info,notice,warn)
> +    or (facility(news)
> +    and level(crit,err,notice));
> +};
> +destination dp_xconsole { pipe("/dev/xconsole"); };
> +log {
> +    source(s_all);
> +    filter(f_xconsole);
> +    destination(dp_xconsole);
> +};
> +
> diff -uNr orig/syslog-ng-1.6.7/src/cfg-grammar.y new/syslog-ng-1.6.7/src/cfg-grammar.y
> --- orig/syslog-ng-1.6.7/src/cfg-grammar.y	2004-09-17 04:21:06.000000000 -0500
> +++ new/syslog-ng-1.6.7/src/cfg-grammar.y	2005-05-30 18:25:40.826858634 -0500
> @@ -89,7 +89,7 @@
>  %token KW_REMOVE_IF_OLDER KW_LOG_PREFIX KW_PAD_SIZE
>  
>  /* filter items*/
> -%token KW_FACILITY KW_LEVEL KW_NETMASK KW_HOST KW_MATCH
> +%token KW_FACILITY KW_LEVEL KW_NETMASK KW_HOST KW_MATCH KW_STRIP KW_REPLACE
>  
>  /* yes/no switches */
>  %token KW_YES KW_NO
> @@ -669,6 +669,8 @@
>  	| KW_NETMASK '(' string ')'             { $$ = make_filter_netmask($3); free($3); }
>  	| KW_HOST '(' string ')'		{ $$ = make_filter_host($3); free($3); }	
>  	| KW_MATCH '(' string ')'		{ $$ = make_filter_match($3); free($3); }
> +	| KW_STRIP '(' string ')'		{ $$ = make_filter_strip($3); free($3); }
> +	| KW_REPLACE '(' string string ')'		{ $$ = make_filter_replace($3,$4); free($3); free($4); }
>  	| KW_FILTER '(' string ')'		{ $$ = make_filter_call($3); free($3); }
>  	;
>  
> diff -uNr orig/syslog-ng-1.6.7/src/cfg-lex.l new/syslog-ng-1.6.7/src/cfg-lex.l
> --- orig/syslog-ng-1.6.7/src/cfg-lex.l	2005-05-30 18:27:50.829842715 -0500
> +++ new/syslog-ng-1.6.7/src/cfg-lex.l	2005-05-30 18:25:40.827858450 -0500
> @@ -140,6 +140,8 @@
>  	{ "netmask",            KW_NETMASK },
>          { "host",               KW_HOST },
>          { "match",		KW_MATCH },
> +        { "strip",		KW_STRIP },
> +        { "replace",	KW_REPLACE },
>  
>  	/* on/off switches */
>  	{ "yes",		KW_YES },
> diff -uNr orig/syslog-ng-1.6.7/src/filters.c new/syslog-ng-1.6.7/src/filters.c
> --- orig/syslog-ng-1.6.7/src/filters.c	2004-01-13 12:08:02.000000000 -0600
> +++ new/syslog-ng-1.6.7/src/filters.c	2005-05-30 18:25:40.827858450 -0500
> @@ -163,6 +163,7 @@
>       (name filter_expr_re)
>       (super filter_expr_node)
>       (vars
> +       (replace string)
>         (regex special-struct regex_t #f free_regexp)))
>  */
>  
> @@ -226,6 +227,78 @@
>  	return &self->super;
>  }
>  
> +struct filter_expr_node *make_filter_strip(const char *re)
> +{
> +	if (strcasecmp(re,"ips") == 0)
> +		return make_filter_replace(re,"0.0.0.0");
> +	else
> +		return make_filter_replace(re,"----");
> +}
> +
> +#define FMIN(a,b) (a)<(b) ? (a):(b)
> +
> +static int do_filter_replace(struct filter_expr_node *c, 
> +			   struct log_filter *rule UNUSED,
> +			   struct log_info *log)
> +{
> +	CAST(filter_expr_re, self, c);
> +	char * buffer = log->msg->data;
> +	int snippet_size;
> +	regmatch_t pmatch;
> +	char new_msg[2048];
> +	char * new_msg_max = new_msg+2048;
> +	char * new_msg_ptr = new_msg;
> +	int replace_length = strlen(self->replace->data);
> +	
> +	int error = regexec(&self->regex, buffer, 1, &pmatch, 0);
> +	if (error != 0) return 1;
> +	while (error==0) {
> +		/* copy string snippet which preceeds matched text */
> +		snippet_size = FMIN(pmatch.rm_so, new_msg_max-new_msg_ptr);
> +		memcpy(new_msg_ptr, buffer, snippet_size);
> +		new_msg_ptr += snippet_size;
> +
> +		/* copy replacement string */
> +		snippet_size = FMIN(replace_length, new_msg_max-new_msg_ptr);
> +		memcpy(new_msg_ptr, self->replace->data, snippet_size);
> +		new_msg_ptr += snippet_size;
> +
> +		/* search for next match */
> +		buffer += pmatch.rm_eo;
> +		error = regexec (&self->regex, buffer, 1, &pmatch, REG_NOTBOL);
> +	}
> +	/* copy the rest of the old msg */
> +	snippet_size = FMIN(strlen(buffer),new_msg_max-new_msg_ptr);
> +	memcpy(new_msg_ptr, buffer, snippet_size); 
> +	new_msg_ptr += snippet_size;
> +
> +	ol_string_free(log->msg);
> +	log->msg = c_format_cstring("%s", new_msg_ptr-new_msg,new_msg);
> +	return 1;
> +}
> +
> +struct filter_expr_node *make_filter_replace(const char *re, const char *replacement)
> +{
> +	int regerr;
> +	NEW(filter_expr_re, self);
> +	self->super.eval = do_filter_replace;
> +	self->replace = format_cstring(replacement);
> +	
> +	if (strcasecmp(re,"ips") == 0) {
> +		re = "(25[0-5]|2[0-4][0-9]|[0-1]?[0-9]?[0-9])([\\.\\-](25[0-5]|2[0-4][0-9]|[0-1]?[0-9]?[0-9])){3}";
> +	}
> +	regerr = regcomp(&self->regex, re, REG_ICASE | REG_EXTENDED);
> +	if (regerr) {
> +		char errorbuf[256];
> +		regerror(regerr, &self->regex, errorbuf, sizeof(errorbuf));
> +		werror("Error compiling regular expression: \"%z\" (%z)\n", re, errorbuf);
> +		KILL(self);
> +		return NULL;
> +	}
> +
> +	return &self->super;
> +}
> +
>  static int do_filter_prog(struct filter_expr_node *c, 
>  			  struct log_filter *rule UNUSED,
>  			  struct log_info *log)
> diff -uNr orig/syslog-ng-1.6.7/src/filters.h new/syslog-ng-1.6.7/src/filters.h
> --- orig/syslog-ng-1.6.7/src/filters.h	2002-02-04 10:07:50.000000000 -0600
> +++ new/syslog-ng-1.6.7/src/filters.h	2005-05-30 18:25:40.827858450 -0500
> @@ -66,6 +66,8 @@
>  struct filter_expr_node *make_filter_netmask(const char *nm);
>  struct filter_expr_node *make_filter_host(const char *re);
>  struct filter_expr_node *make_filter_match(const char *re);
> +struct filter_expr_node *make_filter_strip(const char *re);
> +struct filter_expr_node *make_filter_replace(const char *re, const char *replacement);
>  struct filter_expr_node *make_filter_call(const char *name);
>  
>  #endif
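
For anyone who wants to try the replacement loop from do_filter_replace()
outside syslog-ng, here is a minimal standalone sketch. It assumes POSIX
<regex.h>. The names replace_all() and append() are hypothetical and not part
of the patch. Unlike the patch, it stops at an empty match, since as far as I
can tell the loop in the patch would not advance past one. Like the patch, it
truncates silently when the output buffer is full.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Append up to n bytes of s to dst, never writing past cap bytes in total.
 * cap excludes the terminating NUL. Returns the new fill level. */
static size_t append(char *dst, size_t used, size_t cap,
                     const char *s, size_t n)
{
    size_t room = cap - used;
    if (n > room)
        n = room;               /* silent truncation, as in the patch */
    memcpy(dst + used, s, n);
    return used + n;
}

/* Hypothetical helper: replace every match of `pattern` (POSIX extended,
 * case-insensitive) in `src` with `rep`, writing a NUL-terminated result
 * of at most dstsize-1 bytes into `dst`.
 * Returns 0 on success, -1 if dstsize is 0 or the pattern does not compile. */
int replace_all(const char *pattern, const char *rep,
                const char *src, char *dst, size_t dstsize)
{
    regex_t re;
    regmatch_t m;
    size_t used = 0, cap;
    int eflags = 0;

    if (dstsize == 0)
        return -1;
    cap = dstsize - 1;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_ICASE) != 0)
        return -1;

    while (regexec(&re, src, 1, &m, eflags) == 0) {
        if (m.rm_eo == m.rm_so)
            break;              /* empty match: stop instead of spinning */

        /* text preceding the match, then the replacement string */
        used = append(dst, used, cap, src, (size_t)m.rm_so);
        used = append(dst, used, cap, rep, strlen(rep));

        /* continue after the match; we are no longer at line start */
        src += m.rm_eo;
        eflags = REG_NOTBOL;
    }

    /* whatever is left of the original message */
    used = append(dst, used, cap, src, strlen(src));
    dst[used] = '\0';

    regfree(&re);
    return 0;
}
```

Called with the simplified pattern "([0-9]{1,3}\\.){3}[0-9]{1,3}" (a stand-in,
not the expression used in make_filter_replace()) and the replacement
"0.0.0.0", the message "from 192.168.1.10 to 10.0.0.1 ok" should come out as
"from 0.0.0.0 to 0.0.0.0 ok".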


-- 
-------------------------------------------------------------
addr://Rathausgasse 31, CH-5001 Aarau  tel://++41 62 823 9355
http://www.terreactive.com             fax://++41 62 823 9356
-------------------------------------------------------------
terreActive AG                       Wir sichern Ihren Erfolg
-------------------------------------------------------------

