couple questions - geoip and also list archives
Hi,

I would like to understand how the geoip module works, but cannot find any documentation (or really - any relevant source code so far).

I assume it must use some downloaded database (or else it would seem like it would kill a local resolver) but cannot find anything describing it.

Is anyone using it in reasonably high-performance environments? (like 5000+ events per second)

Also, as I was looking for info, I cannot access the lists.balabit.hu archives (maybe I am *way* behind time, is this still maintained anywhere?)

Thanks!
Jim
Hi,

It's documented under template functions at http://www.balabit.com/sites/default/files/documents/syslog-ng-ose-3.6-guide... It's usually available in a separate subpackage to reduce the number of external dependencies of the syslog-ng core package.

List archives are available at https://lists.balabit.hu/pipermail/syslog-ng/ (actually it's linked from the info page under each e-mail...).

Bye,

Peter Czanik (CzP) <peter.czanik@balabit.com>
BalaBit IT Security / syslog-ng upstream
http://czanik.blogs.balabit.com/
https://twitter.com/PCzanik

On Fri, Feb 20, 2015 at 10:52 AM, <jrhendri@roadrunner.com> wrote:
> I would like to understand how the geoip module works, but cannot find any documentation (or really - any relevant source code so far).
> Also, as I was looking for info, I cannot access the lists.balabit.hu archives (maybe I am *way* behind time, is this still maintained anywhere?)
______________________________________________________________________________
Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng
FAQ: http://www.balabit.com/wiki/syslog-ng-faq
Thanks - the archive site is timing out for me.

Jim
Hi Jim,

On Fri, Feb 20, 2015 at 01:52:19PM -0500, jrhendri@roadrunner.com wrote:
> Is anyone using it in reasonably high-performance environments? (like 5000+ events per second)

We're using the module in a 3k EPS (events per second) environment with very good performance. We have had some issues in the past in threaded mode, with some segfaults. The geoip library documentation has a few sentences about thread safety. I'd be curious to hear some feedback about your future experience.

cheers
Hi Fabien,

I have done just some preliminary testing (maybe 1500 EPS for a few minutes) and was seeing a lot of DNS traffic (~1MB/s).

Obviously, if the field is a hostname, to do a geoip lookup there needs to be name resolution before the IP can be mapped to a geo database.

I will be looking for ways to minimize this.

Current use-cases are for parsing proxy, email and FireEye logs.

Recall, my base architecture is:
- syslog-ng using patterndb, sending format-json to a local redis destination (lpush)
- redis, run with no local disk storage, acting as an in-memory buffer between syslog-ng and logstash
- logstash (also running locally on the same box) pulling (blpop) and feeding an elasticsearch cluster (4 nodes right now)

Currently taking live proxy logs at ~7 - 10 K EPS, running very well. Looking to add the email and FireEye logs soon and starting to enhance the data (with user and host metadata).

Thoughts right now are:
- only resolve location for addresses (not hostnames)
- run a caching nameserver locally on the syslog-ng box and deal with the "ramp up" period (initially the names would clearly not be in cache - just not sure how long it would take to get to a steady state, how big to make the cache, etc.)

I'll keep you posted.

Thanks again!
Jim

On 02/20/2015 03:24 PM, Fabien Wernli wrote:
> we're using the module in a 3keps environment with very good performance.
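The first idea in Jim's list (only look up locations for fields that are already IP addresses, never hostnames) can be sketched roughly as below. This is an illustration in Python, not syslog-ng code; the function name `geoip_candidate` is made up for the example, and it assumes the field value has already been extracted from the log message.

```python
import ipaddress
from typing import Optional


def geoip_candidate(field: str) -> Optional[str]:
    """Return the normalized address if `field` is an IP literal, else None.

    Hostnames yield None, so the caller can skip the geo lookup for them
    instead of triggering a DNS query per message.
    (Hypothetical helper, named for this example only.)
    """
    try:
        # ip_address() accepts IPv4 and IPv6 literals and raises ValueError
        # for anything else, including hostnames and empty strings.
        return str(ipaddress.ip_address(field.strip()))
    except ValueError:
        return None
```

With a filter like this in front of the lookup, a value such as `proxy.example.com` is skipped without any name resolution, while `192.0.2.10` is passed on unchanged.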
Hi,

I would think that adding forward DNS lookups to the syslog-ng dns cache code (or ripping out that code entirely and rewriting it from scratch while adding this feature) would produce _much_ better results than a locally running DNS server. That's why the DNS cache code was added in the first place: a caching-only name server is still too slow for name lookups for every message posted.

The geoip code uses libgeoip1.

The database is:

$ apt-cache show geoip-database
Package: geoip-database
Priority: standard
Section: net
Installed-Size: 3881
Version: 20140313-1
Recommends: libgeoip1
Breaks: libgeoip1 (<< 1.4.5.dfsg)
Filename: pool/main/g/geoip-database/geoip-database_20140313-1_all.deb
Size: 1195894
MD5sum: ab4d4f6bc0e04b25cad2fbe1479f44bc
SHA1: 06d38aee4084124f86351dfa6f1c404a8ae3e83b
SHA256: 30dc5a2c3296180ed0740fb4ec70eb1ea5b49efc5e48a091913a8106f6895c7e
Description-en: IP lookup command line tools that use the GeoIP library (country database)
 GeoIP is a C library that enables the user to find the country that any IP
 address or hostname originates from. It uses a file based database.
 .
 This database simply contains IP blocks as keys, and countries as values and
 it should be more complete and accurate than using reverse DNS lookups.
 .
 This package contains the free GeoLiteCountry database.
Description-md5: 3bfa5b4c9f973261799fb4d9355f3b6c
Homepage: http://www.maxmind.com/
Bugs: https://bugs.launchpad.net/ubuntu/+filebug
Origin: Ubuntu
Supported: 5y
Task: standard, kubuntu-active, kubuntu-active, mythbuntu-frontend, mythbuntu-frontend, mythbuntu-desktop, mythbuntu-backend-slave, mythbuntu-backend-slave, mythbuntu-backend-master, mythbuntu-backend-master

So it is about a year old, but quite probably the version in Debian sid can be installed on top without problems, and that's pretty fresh, being dated 9th February.

https://packages.debian.org/sid/geoip-database

On Sat, Feb 21, 2015 at 1:24 PM, Jim Hendrick <jrhendri@roadrunner.com> wrote:
> Thoughts right now are:
> - only resolve location for addresses (not hostnames)
> - run a caching nameserver locally on the syslog-ng box and dealing with the "ramp up" period

-- Bazsi
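To illustrate the point about an in-process cache versus a local caching name server, here is a minimal Python sketch of a forward (name to address) lookup cache. It is not the syslog-ng DNS cache code; the class name `ForwardDnsCache`, the TTL default, and the choice to cache failed lookups too are all assumptions made for the example.

```python
import socket
import time
from typing import Callable, Dict, Optional, Tuple


def system_resolve(hostname: str) -> Optional[str]:
    """Resolve a hostname to one IPv4 address via the system resolver."""
    try:
        infos = socket.getaddrinfo(hostname, None, socket.AF_INET)
    except socket.gaierror:
        return None
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET, sockaddr is (address, port).
    return infos[0][4][0] if infos else None


class ForwardDnsCache:
    """Tiny in-process cache for forward lookups (illustrative only)."""

    def __init__(
        self,
        resolve: Callable[[str], Optional[str]] = system_resolve,
        ttl: float = 3600.0,  # assumed expiry in seconds, not a syslog-ng default
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._resolve = resolve
        self._ttl = ttl
        self._clock = clock
        # hostname -> (expiry time, address or None for a failed lookup)
        self._entries: Dict[str, Tuple[float, Optional[str]]] = {}

    def lookup(self, hostname: str) -> Optional[str]:
        key = hostname.lower()
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and cached[0] > now:
            # Cache hit: a dictionary access, no resolver round trip.
            return cached[1]
        address = self._resolve(key)
        # Failures are cached as well, so unresolvable names are not retried
        # for every message until the entry expires.
        self._entries[key] = (now + self._ttl, address)
        return address
```

Once a name has been seen, later messages carrying the same name are served from the dictionary, which is the saving described above; the "ramp up" period Jim mentions corresponds to the first lookup of each distinct name.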
Balazs Scheidler wrote:
> I would think that adding forward DNS lookups to the syslog-ng dns cache code (or ripping out that code entirely and rewrite it from scratch while adding this feature) would produce _much_ better results than a locally running DNS server. That's why the DNS cache code was added in the first place, a caching only name server is still too slow for name lookups for every message posted.

Doesn't nscd take care of a lot of that load?

Also, using rbldnsd for serving the geoip data might make sense too.

--
Per Jessen, Zürich (3.1°C)
http://www.dns24.ch/ - your free DNS host, made in Switzerland.
Thanks Balazs,

I will try some more "controlled" testing using different settings for syslog-ng resolving and caching.

I think I installed libgeoip-dev (not sure right now - it's installed on a system at work) so I'll check that package also.

One question on code paths: if I use an IP address pattern in patterndb (within the message - e.g. proxy or email logs) where a ${GEO} macro was assigned, will those be the only things that get resolved? Or will setting a cache within the syslog-ng config enable resolution for ${HOST} as well?

I am (mostly) interested in things like user access to sites by IP address through the proxy, and wanting to enhance the logs with geoip data for elasticsearch. (Obviously if it were fast enough, I would add the data for all sites - but initially I think IP only would be more interesting.)

Thanks again!
Jim

On 02/22/2015 02:35 PM, Balazs Scheidler wrote:
> The geoip code uses libgeoip1.
Hi,

On Wed, Feb 25, 2015 at 06:25:09AM -0500, Jim Hendrick wrote:
> I am (mostly) interested in things like user access to sites by IP address through the proxy, and wanting to enhance the logs with geoip data for elasticsearch.

On a sidenote, the syslog-ng geoip() template function currently only supports getting the country code. While this is currently enough to cover our needs, I think if somebody has the resources it would be awesome to get other fields too.

cheers
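For readers wondering what a country-code-only lookup involves: the package description quoted earlier in the thread says the database "simply contains IP blocks as keys, and countries as values". The Python sketch below shows that idea with a sorted table and a binary search. It is not the libgeoip file format or API; the blocks are RFC 5737 documentation ranges and the two-letter codes are placeholders invented for the example.

```python
import bisect
import ipaddress
from typing import List, Optional, Tuple

# Made-up data for illustration: documentation address ranges mapped to
# placeholder codes, not real country assignments.
BLOCKS = [
    ("192.0.2.0/24", "AA"),
    ("198.51.100.0/24", "BB"),
    ("203.0.113.0/24", "CC"),
]

Index = Tuple[List[int], List[Tuple[int, int, str]]]


def build_index(blocks: List[Tuple[str, str]]) -> Index:
    """Turn CIDR blocks into sorted (first, last, code) integer ranges."""
    ranges = []
    for cidr, code in blocks:
        net = ipaddress.ip_network(cidr)
        ranges.append((int(net.network_address), int(net.broadcast_address), code))
    ranges.sort()
    starts = [first for first, _last, _code in ranges]
    return starts, ranges


def country_code(address: str, index: Index) -> Optional[str]:
    """Return the code of the block containing `address`, or None."""
    starts, ranges = index
    value = int(ipaddress.IPv4Address(address))
    # Find the last block whose first address is <= value.
    pos = bisect.bisect_right(starts, value) - 1
    if pos >= 0 and ranges[pos][0] <= value <= ranges[pos][1]:
        return ranges[pos][2]
    return None
```

Returning other fields, as suggested above, would in this sketch only mean storing a richer value per block; whether the real database and library make that equally easy is not something the thread settles.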
participants (6)
- Balazs Scheidler
- Czanik, Péter
- Fabien Wernli
- Jim Hendrick
- jrhendri@roadrunner.com
- Per Jessen