patterndb ESTRING delimiter char
I need to find a valid char to use as the ESTRING delimiter that won't appear in a URI. I had been using a comma, which usually works, but lots of sites have commas in their URIs, which messes up the parsing of these logs. Tab is outlawed as per the documentation, or I'd have used that. I've seen pipes used in URIs, so that's no better. Keep in mind that what I'm parsing has been URI-unescaped. Is there a way to use a null byte or something?

Thanks,
Martin
How about a space? If your URIs are escaped, then any internal spaces are encoded as '%20'.
No, they are unescaped, not escaped. On Tue, Nov 16, 2010 at 10:35 AM, Lars Kellogg-Stedman <lars@oddbit.com> wrote:
How about a space? If your URIs are escaped, then any internal spaces are encoded as '%20'.
______________________________________________________________________________
Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng
FAQ: http://www.campin.net/syslog-ng/faq.html
Sorry, I misread that. Where exactly are you getting these logs from? Remember that ESTRING can accept multi-character sequences:

"As of syslog-ng 3.1, it is possible to specify a stopstring instead of a single character, e.g., @ESTRING::stop_here.@. The @ character cannot be a stopcharacter, nor can line-breaks or tabs."

...so if you're building the log messages yourself you could (as a simple example) embed the URIs inside of |BEGINURI|...|ENDURI| pairs, and then use |ENDURI| as your match.
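A minimal Python sketch of the marker idea above, assuming the wrapper script is the one building the messages. Going by the quoted documentation syntax, the matching pattern would then look something like `@ESTRING:uri:|ENDURI|@`, where the field name `uri` is a hypothetical choice. The function names here are illustrative and not part of syslog-ng; `extract_uri` only imitates "capture everything up to the stop string" so the behaviour can be checked outside syslog-ng.

```python
START = "|BEGINURI|"  # marker pair taken from the example above
STOP = "|ENDURI|"     # this would be the ESTRING stop string


def wrap_uri(uri: str) -> str:
    """Wrap a URI in the marker pair before it is written to the log."""
    if STOP in uri:
        # Unlikely, but an unescaped URI could in theory contain the marker.
        raise ValueError("URI contains the stop string; choose another marker")
    return f"{START}{uri}{STOP}"


def extract_uri(message: str) -> str:
    """Return the text between START and the first following STOP."""
    begin = message.index(START) + len(START)
    end = message.index(STOP, begin)
    return message[begin:end]


# Commas, pipes and spaces inside the URI no longer matter.
line = "GET example.com " + wrap_uri("/a,b|c?x=1 2") + " 200"
print(extract_uri(line))  # prints: /a,b|c?x=1 2
```

The guard in `wrap_uri` turns the "theoretically possible, however unlikely" collision into an explicit error rather than a silently misparsed message.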
The logs are coming from httpry via a wrapper script. I believe I have solved this by (expensively) regexp-swapping any pipe chars into backslashes, though I'm paying a small CPU toll to do so and my data is now modified.

I know that I could use any char sequence as an ESTRING delimiter, but I was looking for something that could not exist in a URI but could still be used as a delimiter. It's theoretically possible, however unlikely, that something like BEGINURI would be in the stream to be parsed. It also adds a fair amount of overhead to the messages. I was hoping there would be a silver-bullet solution, such as a null byte or some other special char, but this swapping at the source should suffice.

On Tue, Nov 16, 2010 at 11:13 AM, Lars Kellogg-Stedman <lars@oddbit.com> wrote:
Sorry, I misread that. Where exactly are you getting these logs from? Remember that ESTRING can accept multi-character sequences:
"As of syslog-ng 3.1, it is possible to specify a stopstring instead of a single character, e.g., @ESTRING::stop_here.@. The @ character cannot be a stopcharacter, nor can line-breaks or tabs."
...so if you're building the log messages yourself you could (as a simple example) embed the URIs inside of |BEGINURI|...|ENDURI| pairs, and then use |ENDURI| as your match.
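The source-side swap described in this message (replacing pipes before the fields are joined) could be sketched in Python roughly as follows. This is an assumption-laden illustration, not the actual wrapper script: the field layout is hypothetical, and `str.replace` is used in place of a regexp because a single-character substitution does not need one.

```python
DELIM = "|"        # delimiter the ESTRING parsers would stop on
SUBSTITUTE = "\\"  # pipes inside a field are swapped for a backslash


def sanitize(field: str) -> str:
    """Replace any delimiter character inside a field (lossy)."""
    return field.replace(DELIM, SUBSTITUTE)


def build_record(fields) -> str:
    """Join sanitized fields, ending each one with the delimiter."""
    return "".join(sanitize(f) + DELIM for f in fields)


# Hypothetical field layout: method, host, unescaped URI.
record = build_record(["GET", "example.com", "/a|b,c"])
print(record)  # prints: GET|example.com|/a\b,c|
```

The substitution cannot be undone, since a backslash that was already in the URI is indistinguishable from a swapped pipe. That is the "data is now modified" trade-off mentioned above.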
participants (2)
- Lars Kellogg-Stedman
- Martin Holste