[syslog-ng] GeoIP template function

Gergely Nagy algernon at balabit.hu
Wed May 9 20:47:56 CEST 2012


Balazs Scheidler <bazsi77 at gmail.com> writes:

>> > diff --git a/modules/tfgeoip/tfgeoip.c b/modules/tfgeoip/tfgeoip.c
>> > new file mode 100644
>> > index 0000000..eeea330
>> > --- /dev/null
>> > +++ b/modules/tfgeoip/tfgeoip.c
>> [...]
>> > +static void
>> > +tf_geoip(LogMessage *msg, gint argc, GString *argv[], GString *result)
>> > +{
>> > +   GeoIP * gi;
>> > +   const char * returnedCountry;
>> > +
>> > +   if (argc != 1)
>> > +       return;
>> > +   gi = GeoIP_open("/usr/share/GeoIP/GeoIP.dat", GEOIP_STANDARD);
>
> probably some caching would make sense here. perhaps even enhancing
> the dns cache code with the ability to store such
> information. template functions can be invoked for every message,
> opening and parsing the geoip database is probably very expensive.

A quick look at the GeoIP library suggest that it can do caching, and it
can even keep the file open, and even supports figuring out itself where
the file is.

So if I turn the template function into one that can keep state, all of
this can be taken care of.

>> > +
>> > +   returnedCountry = GeoIP_country_code_by_addr(gi, argv[0]->str);
>> > +   g_string_append_printf (result, "%s", returnedCountry);
>> > +}
>> 
>> As you mentioned in person, it would be nice if the template function
>> could be configured, preferably globally, so that one wouldn't have to
>> repeat the same parameters over and over again in every template where
>> it would be used.
>
> I'm not sure, templates can be declared as objects, referenced from
> multiple locations in the config. it is true though that this would
> probably be used in the naming of files which can't be declared in
> advance. what kind of settings would this be about?

Settings could control whether we want the country code (hu) or country
name (Hungary), whether the ip is ipv4 or ipv6 (though, I'm not entirely
sure this is needed - the geoip api is a bit confusing in this regard).

>> Things like returning country vs country code, the location of the
>> database and so on and so forth should be configurable.
>> 
>
> hmmm. isn't the database location fixed?

There appears to be a separate database for IPv6 addresses, so no, the
location is probably not fixed.

> I think when naming files, the country code should be enough.

I'd love to store the country name in my database too, now that I know
it's possible =)

> what about using global @define based variables for this purpose? that would probably be simple to access from within the template function and is global.
>
> the con of this is that nothing else uses @define yet.

Ick. I don't like this idea, I'm afraid.

>
>> The best would be to have a geoip() option in the global option space,
>> but I'm not sure a plugin can hook into that.
>> 
>
> The syslog-ng team added some mechanisms to hook into the global
> options block though, but last time I looked I wasn't very happy about
> the implementation.

That sounds something I just might want to port and pretty-up for OSE.

>> I suppose the first thing would be to add commandline handling to
>> tf_geoip first, so at least it's configurable. Then figure out a way to
>> make global configuration possible. I have no idea whether this latter
>> is possible with current syslog-ng 3.4, but this is a great opportunity
>> to make it possible to do this.
>> 
>> Unless you want to work on this yourself further, I'll spice it up with
>> error handling and basic configurability in the next few days.
>> 
>> Come to think of it, perhaps it would make sense to introduce a new
>> rewrite command, something along the lines of:
>> 
>> rewrite r_geoip {
>>     apply(geoip("SRCIP"), "GEOIP");
>> };
>> 
>> This would apply the "geoip" function to the value of the "SRCIP" key,
>> and place the result in ${GEOIP}. The advantage of this is that it's
>> perhaps faster to do this than parsing a template function and using
>> g_string_append_printf to reassemble another string.
>> 
>
> I'm not sure why this would be faster than the set() rewrite op.

For GeoIP, it probably wouldn't. But it was a trivial example for
something that has an output, as opposed to value-pairs() which modifies
things in-place.

>> In essence, apply() would have a syntax like
>>   apply(FUNCTION(...)[, OUTPUT_VARIABLE])
>> 
>> The function could either modify the LogMessage object itself, or return
>> a value. Functions would need to signal which one they do, and the
>> config parser would not allow using a function with an output value
>> without specifying a destination, etc.
>> 
>> ...but I guess I'll draft up an RFC about this instead with a few more
>> possible use cases, advantages & disadvantages and the rest.
>> 
>
> good ideas, but I would be careful to introduce another programmable
> mechanism in this syntax. so the usecases are interesting, I'm not
> sure about the proposed solutions.

There's probably a better and/or easier syntax, perhaps it should not be
done within the context of rewrite()... I do like the
apply(function,...) idea though.

I'll prepare a few more use cases for it, and we'll see what can be made
out of them.

-- 
|8]



More information about the syslog-ng mailing list