[syslog-ng]AIX and repeated message formats - fix?

Jon Marks j-marks@uiuc.edu
Fri, 27 Apr 2001 16:39:32 -0500


--u3/rZRmxL6MmkK24
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Earlier I posted a message asking about whether syslog-ng had standard
equipment for dealing with a couple of strange message types:

1) Messages forwarded by AIX syslogd which contain the string

   "Message forwarded from [hostname]: " ...

and 

2) Messages of the form:

   "last message repeated _n times"

I haven't received any responses. I've read traffic in the mailing list
archives (e.g. December 2000) regarding the topic, but I haven't found
any solutions. If I've missed something vital, please help me and let me
know about it.. I certainly apologize if I haven't been observant enough
to skim the answers from the available resources.

I've taken a stab at solving the problem myself, and I thought I'd share it
with all of you. Those of you who are so much more familiar with the source
than I, please comment about whether you find I might have broken other
functionality or adversely affected performance. I'm glad for you suggestions,
or if this can help somebody else. For my installation, being able to handle
AIX messages is particularly important.

My solution is just a simple and straightforward refactoring in
log.c/parse_log_msg(). A patch against the 1.4.11 source distribution
is attached to this message. I've tested it only on AIX, but I think
it ought to work elsewhere since it only adds standard library stuff.
(Basically, a couple of memcmp's). Some commented-out debugging output
(to stderr) is there, too, if you'd want to enable it.

I'll describe the intended effects of this patch:

1) Messages forwarded by AIX systems are recognized, and the source hostname
   is taken from the string appended by AIX syslogd. For example,

   MMM DD HH:MM:SS Message forwarded from <hostname>: ...

   will use <hostname> for the source address. "Message" isn't taken as the
   source host.

2) If the message is forwarded by AIX syslogd but originally of syslog-ng
   origin, it'll have the form

   MMM DD HH:MM:SS Message forwarded from <hostname>: source@hostname ...

   In this case, "source@hostname" will be used as the source hostname (so
   that syslog-ng's source info is preserved).

3) If the message doesn't contain a hostname because it's of the form

   MMM DD HH:MM:SS last message repeated *n* times

   then it won't try to extract one (so it won't improperly use 'last').
   The hostname will just come later when in sources.c it's gotten from the
   message source socket or wherever.

4) If the message is a "last message repeated ..." which was *also*
   forwarded by an AIX syslogd, it'll get the hostname extracted from AIX's
   "Message forwarded from ..." message.

I'm no expert in the syslog protocol/format, and I can see there is variablilty
in implementation. So I'm certain there are plenty of cases that are left
unaddressed. If you don't find this patch helpful in general, perhaps its
useful for some common cases.

One thing that's not addressed: 

When a "last message repeated ..." line is received, it's certainly
helpful to know *which* was the last line-- the one that got repeated. If
you're outputting to files by PROGRAM, then no program name is inherently
available from this kind of message itself. And syslog-ng isn't keeping
track of what PROGRAM logged last from a particular source host. So these
"last message repeated ..." messages won't fall into the same logs as
the lines for the PROGRAMs which produced the repeated messages. No simple
solution is readily apparent to me...

Also, please note that "keep_hostname(yes)" is important, otherwise the
AIX sender of forwarded messages is used anyway-- despite that the real
origin is available. If you use chain_hostnames(yes), then this info
ought to be preserved anyway.

Again, please comment and critique! Thanks very much.

-- 
Jonathan Marks

Systems Administrator, Production Systems Group
Computing and Communication Services Office
University of Illinois at Urbana-Champaign



--u3/rZRmxL6MmkK24
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="syslog-ng_log.c_20010427.diff"

--- syslog-ng-1.4.11-orig/src/log.c	Sun Feb 25 06:30:22 2001
+++ syslog-ng-1.4.11/src/log.c	Fri Apr 27 16:28:32 2001
@@ -38,6 +38,9 @@
 #include "log.h.x"
 #undef CLASS_DEFINE
 
+static char aix_fwd_string[] = "Message forwarded from ";
+static char repeat_msg_string[] = "last message repeated";
+
 static void parse_log_msg(struct log_info *lm, UINT32 length, UINT8 *data)
 {
 	char *src;
@@ -49,8 +52,11 @@
 	
 	src = data;
 	left = length;
-	
+
+	/* Determine the priority of the message. Assume buffer starts
+	   with '<' if there's one to read. */
 	if (left && src[0] == '<') {
+	        /* Expected buffer format: <digits> .... */
 		src++;
 		left--;
 		pri = 0;
@@ -72,21 +78,32 @@
 			left--;
 		}
 	}
+	/* No priority info in the buffer? Just assign a default. */
 	else {
-		lm->pri = LOG_USER | LOG_NOTICE;
+	        lm->pri = LOG_USER | LOG_NOTICE;
 	}
-	while (left && *src == ' ') {
+
+
+	while (left && *src == ' ') {  /* Move past whitespace */
 		src++;
 		left--;
 	}
+
+	/* If the next chars look like a date, then read them as a date. */
 	if (left >= 15) {
-		if (src[3] == ' ' && src[6] == ' ' && src[9] == ':' && src[12] == ':') {
-			struct tm tm, *nowtm;
+  	        /* Expected buffer format: MMM DD HH:MM:SS ... */
+
+	        if (src[3] == ' ' && src[6] == ' ' &&
+		    src[9] == ':' && src[12] == ':') {
+	                struct tm tm, *nowtm;
 
+			/* Just read the buffer data into a textual
+			   datestamp. */
 			lm->date = c_format_cstring("%s", 15, src);
 			src += 15;
 			left -= 15;
 
+			/* And also make struct time timestamp for the msg */
 			nowtm = localtime(&now);
 			memset(&tm, 0, sizeof(tm));
 			strptime(lm->date->data, "%b %e %H:%M:%S", &tm);
@@ -97,54 +114,123 @@
 			lm->stamp = mktime(&tm);
 		}
 	}
-
-	if (lm->date) {
+ 
+        if (lm->date) {
+	        /* Expected format: hostname program[pid]: */
+	        /* Possibly: Message forwarded from hostname: ... */
+	        char *hostname_start = NULL;
+		int hostname_len = 0;
 		while (left && *src == ' ') {
 			src++; /* skip whitespace */
 			left--;
 		}
-		oldsrc = src;	
-		oldleft = left;
-		while (left && *src != ' ' && *src != ':' && *src != '[') {
-			src++;
-			left--;
+
+/* fprintf(stderr, "DETECTING AIX FWD STRING. BUFFER:\n%s\n", src); */
+
+		/* Detect funny AIX syslogd forwarded message. */
+		if (left >= (sizeof(aix_fwd_string) - 1) &&
+		    !memcmp(src, aix_fwd_string, sizeof(aix_fwd_string) - 1)) {
+
+		        oldsrc = src;
+			oldleft = left;
+			src += sizeof(aix_fwd_string) - 1;
+			left -= sizeof(aix_fwd_string) - 1;
+			hostname_start = src;
+			hostname_len   = 0;
+			while (left && *src != ':') {
+			  src++;
+			  left--;
+			  hostname_len++;
+			}
+			while (left && (*src == ' ' || *src == ':')) {
+				src++; /* skip whitespace */
+				left--;
+			}
 		}
-		if (left && *src == ' ') {
-			/* this was a hostname */
-			lm->host = c_format_cstring("%s", oldleft - left, oldsrc);
+
+/* fprintf(stderr, "DETECTING LAST MSG REPEATED STRING. BUFFER:\n%s\n", src); */
+
+		/* Now, try to tell if it's a "last message repeated" line */
+		if (left >= sizeof(repeat_msg_string) &&
+		    !memcmp(src, repeat_msg_string, 
+			    sizeof(repeat_msg_string) - 1)) {
+			; /* It is. Do nothing since there's no hostname or
+			     program name coming. */
 		}
+		/* It's a regular ol' message. */
 		else {
+			/* If we haven't already found the original hostname,
+			   look for it now. */
+
+/* fprintf(stderr, "DETECTING HOSTNAME. BUFFER:\n%s\n", src); */
+
+		        oldsrc = src;	
+			oldleft = left;
+
+			while (left && *src != ' ' && *src != ':' 
+			       && *src != '[') {
+		 		src++;
+				left--;
+			}
+
+			if (left && *src == ' ') {
+		  		/* This was a hostname. It came from a
+				   syslog-ng, since syslogd doesn't send
+				   hostnames. It's even better then the one
+				   we got from the AIX fwd message, if we
+				   did. */
+				hostname_start = oldsrc;
+				hostname_len   = oldleft - left;
+			} else {
+		  		src = oldsrc;
+		  		left = oldleft;
+			}
+
+			/* Skip whitespace. */
+			while (left && *src == ' ') {
+				src++;
+				left--;
+			}
+
+
+/* fprintf(stderr, "DETECTING PROGRAM. BUFFER:\n%s\n", src); */
+
+			/* Try to extract a program name */
+			oldsrc = src;
+			oldleft = left;
+			while (left && *src != ':' && *src != '[') {
+				src++;
+				left--;
+			}
+			if (left) {
+				lm->program = c_format_cstring("%s",
+						oldleft - left, oldsrc);
+			} 
+			
 			src = oldsrc;
 			left = oldleft;
 		}
 
-		while (left && *src == ' ') {
-			src++; /* skip whitespace */
-			left--;
-		}
-		
-		oldsrc = src;
-		oldleft = left;
-
-		while (left && *src != ':' && *src != '[') {
-			src++;
-			left--;
-		}
-		if (left) {
-			lm->program = c_format_cstring("%s", oldleft - left, oldsrc);
-		}
-		src = oldsrc;
-		left = oldleft;
+		/* If we did manage to find a hostname, store it. */
+		if (hostname_start)
+			lm->host = c_format_cstring("%s", hostname_len,
+						    hostname_start);
 	}
 	else {
+	        /* Different format */
+
 		oldsrc = src;
 		oldleft = left;
+		/* A kernel message? Use 'kernel' as the program name. */
 		if ((lm->pri & LOG_FACMASK) == LOG_KERN) {
 			lm->program = c_format_cstring("kernel");
 		}
+		/* No, not a kernel message. */
 		else {
-			while (left && *src != ' ' && *src != '[' && *src != ':' && 
-			       *src != '/' && *src != ',' && *src != '<') {
+		        /* Capture the program name */ 
+		        while (left && *src != ' ' && *src != '['
+			       && *src != ':' && *src != '/' &&
+			       *src != ',' && *src != '<') {
 				src++;
 				left--;
 			}
@@ -157,10 +243,20 @@
 		lm->stamp = now;
 	}
 
+	/* The rest of the buffer is the message body. But first, squash any
+	   LFs or CRs! */
 	for (oldsrc = src, oldleft = left; oldleft >= 0; oldleft--, oldsrc++) {
 		if (*oldsrc == '\n' || *oldsrc == '\r') *oldsrc = ' ';
 	}
 	lm->msg = c_format_cstring("%s", left, src);
+
+/* DEBUG */
+/* fprintf(stderr, "MESSAGE CONSTRUCTED:\nHOST(%s) PROG(%s) MSG(%s)\n",
+	lm->host ? lm->host->data : "NONE",
+	lm->program ? lm->program->data : "NONE",
+	lm->msg ? lm->msg->data : "<No message>"); */
+/* DEBUG -- zero out buffer so its easy to read during debugging... */
+/* bzero(data, length); */
 }
 
 struct log_info *log_info_use(struct log_info *msg)

--u3/rZRmxL6MmkK24--