Thanks Balazs,

I will try some more "controlled" testing using different settings for syslog-ng resolving and caching. I think I installed libgeoip-dev (not sure right now, it's installed on a system at work), so I'll check that package as well.

One question on code paths: if I use an IP address pattern in patterndb (within the message, e.g. proxy or email logs) where a ${GEO} macro was assigned, will those be the only things that get resolved? Or will setting a cache within the syslog-ng config enable resolution for ${HOST} as well?

I am (mostly) interested in things like user access to sites by IP address through the proxy, and I want to enhance the logs with geoip data for elasticsearch. (Obviously, if it were fast enough, I would add the data for all sites, but initially I think IP-only would be more interesting.)

Thanks again!
Jim

On 02/22/2015 02:35 PM, Balazs Scheidler wrote:
Hi,
I would think that adding forward DNS lookups to the syslog-ng DNS cache code (or ripping out that code entirely and rewriting it from scratch while adding this feature) would produce _much_ better results than a locally running DNS server. That's why the DNS cache code was added in the first place: a caching-only name server is still too slow when a name lookup is needed for every message.
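To make the idea of an in-process cache for forward lookups concrete, here is a minimal Python sketch. It is not syslog-ng's actual (C) DNS cache code; the class name, the one-hour TTL and the choice to cache failed lookups are all assumptions made for illustration.

```python
import socket
import time


class ForwardDnsCache:
    """Hypothetical in-process cache for forward (name -> address) lookups."""

    def __init__(self, resolver=socket.gethostbyname, ttl=3600.0,
                 clock=time.monotonic):
        self._resolver = resolver   # called only on a cache miss
        self._ttl = ttl             # seconds an entry stays valid (assumed value)
        self._clock = clock         # injectable so expiry can be tested
        self._entries = {}          # hostname -> (expiry time, address or None)

    def lookup(self, hostname):
        now = self._clock()
        entry = self._entries.get(hostname)
        if entry is not None and entry[0] > now:
            return entry[1]         # served from memory, no DNS traffic
        try:
            address = self._resolver(hostname)
        except OSError:
            address = None          # remember failures too, to avoid retry storms
        self._entries[hostname] = (now + self._ttl, address)
        return address
```

With something like this in the message path, each distinct hostname costs one resolver call per TTL period rather than one per message, which is the effect a separate caching name server cannot quite match because every query still leaves the process.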
The geoip code uses libgeoip1.
The database is:
$ apt-cache show geoip-database
Package: geoip-database
Priority: standard
Section: net
Installed-Size: 3881
Version: 20140313-1
Recommends: libgeoip1
Breaks: libgeoip1 (<< 1.4.5.dfsg)
Filename: pool/main/g/geoip-database/geoip-database_20140313-1_all.deb
Size: 1195894
MD5sum: ab4d4f6bc0e04b25cad2fbe1479f44bc
SHA1: 06d38aee4084124f86351dfa6f1c404a8ae3e83b
SHA256: 30dc5a2c3296180ed0740fb4ec70eb1ea5b49efc5e48a091913a8106f6895c7e
Description-en: IP lookup command line tools that use the GeoIP library (country database)
 GeoIP is a C library that enables the user to find the country that any IP
 address or hostname originates from. It uses a file based database.
 .
 This database simply contains IP blocks as keys, and countries as values and
 it should be more complete and accurate than using reverse DNS lookups.
 .
 This package contains the free GeoLiteCountry database.
Description-md5: 3bfa5b4c9f973261799fb4d9355f3b6c
Homepage: http://www.maxmind.com/
Bugs: https://bugs.launchpad.net/ubuntu/+filebug
Origin: Ubuntu
Supported: 5y
Task: standard, kubuntu-active, kubuntu-active, mythbuntu-frontend, mythbuntu-frontend, mythbuntu-desktop, mythbuntu-backend-slave, mythbuntu-backend-slave, mythbuntu-backend-master, mythbuntu-backend-master
So it is about a year old, but the version in Debian sid is pretty fresh (dated 9th February) and can quite probably be installed on top without problems.
https://packages.debian.org/sid/geoip-database
On Sat, Feb 21, 2015 at 1:24 PM, Jim Hendrick <jrhendri@roadrunner.com> wrote:
Hi Fabien,

I have done just some preliminary testing (maybe 1500 EPS for a few minutes) and was seeing a lot of DNS traffic (~1MB/s).
Obviously, if the field is a hostname, to do a geoip lookup there needs to be name resolution before the IP can be mapped to a geo database.
I will be looking for ways to minimize this.
Current use-cases are for parsing proxy, email and fire-eye logs.
Recall, my base architecture is:
- syslog-ng using patterndb, sending format-json to a local redis destination (lpush)
- redis, run with no local disk storage, acting as an in-memory buffer between syslog-ng and logstash
- logstash (also running locally on the same box) pulling (blpop) and feeding an elasticsearch cluster (4 nodes right now)
Currently taking live proxy logs at ~7-10K EPS, and it is running very well. I am looking to add the email and fireeye logs soon and to start enhancing the data (with user and host metadata).
Thoughts right now are:
- only resolve location for addresses (not hostnames)
- run a caching nameserver locally on the syslog-ng box and deal with the "ramp up" period (initially the names would clearly not be in cache; I am just not sure how long it would take to get to a steady state, how big to make the cache, etc.)
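As a rough illustration of the first point (geo-locating only fields that are already IP addresses, so no DNS query is ever issued), a Python sketch might look like the following. The function name is hypothetical, and in syslog-ng itself this decision would more likely be made by the patterndb parser matching an IP address pattern than by code like this.

```python
import ipaddress


def geoip_candidate(value):
    """Return the parsed address if value is an IPv4/IPv6 literal, else None.

    Hostnames are rejected rather than resolved, so calling this never
    triggers a DNS lookup.
    """
    try:
        return ipaddress.ip_address(value.strip())
    except ValueError:
        return None
```

Only values for which this returns an address would be handed to the geoip lookup; hostnames pass through unenriched.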
I'll keep you posted.
Thanks again! Jim
On 02/20/2015 03:24 PM, Fabien Wernli wrote:
> Hi Jim,
>
> On Fri, Feb 20, 2015 at 01:52:19PM -0500, jrhendri@roadrunner.com wrote:
>> Is anyone using it in reasonably high-performance environments? (like 5000+ events per second)
>>
> we're using the module in a 3keps environment with very good performance. we
> have had some issues in the past in threaded mode with some segfaults. The
> geoip library documentation mentions a few sentences about thread safety.
> I'd be curious to hear some feedback about your future
> experience.
>
> cheers
> ______________________________________________________________________________
> Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
> Documentation: http://www.balabit.com/support/documentation/?product=syslog-ng
> FAQ: http://www.balabit.com/wiki/syslog-ng-faq
-- Bazsi