Hi Jorge,

This seems to be a somewhat tricky message. If I read it correctly, after the syslog header you have:

 * some additional info that ends with [captcha] (is that literally [captcha], or does it change with every message?)
 * then you have some JSON
 * the "while logging request," string
 * and finally some key: value pairs.

As a first try, I would use csv-parser() to split the message into the four blocks listed above. Use syslog-ng OSE 3.7 or later, because from that version on you can use strings as delimiters (see https://www.balabit.com/sites/default/files/documents/syslog-ng-ose-latest-guides/en/syslog-ng-ose-guide-admin/html/reference-parsers-csv.html#csv-parser-delimiter). For example, you can use the ] character and the "while logging request," string as delimiters.
Then you can run json-parser() on the macro that contains the JSON part of the message, and kv-parser() with the : delimiter to parse the last block if needed. (Note that the kv-parser() part requires a recent development version of syslog-ng.)
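
As a rough, untested sketch of that chain: the parser names and the nginx.*, j. and kv. names below are placeholders I made up, and I am assuming that "[captcha] " is a literal string in every message, so I use it as a string delimiter instead of the ] character. That also means only three columns are needed, because both delimiter strings are consumed by the split.

    # Split the message into: nginx header, JSON payload, key: value tail.
    # chars("") switches off the default delimiter characters, so that
    # only the two strings split the message.
    parser p_nginx_split {
        csv-parser(
            columns("nginx.header", "nginx.json", "nginx.kv")
            delimiters(chars(""), strings("[captcha] ", " while logging request, "))
            template("${MESSAGE}")
        );
    };

    # Parse only the JSON column, not the whole message.
    parser p_nginx_json {
        json-parser(
            template("${nginx.json}")
            prefix("j.")
        );
    };

    # Only works on a version that ships kv-parser() with value-separator().
    parser p_nginx_kv {
        kv-parser(
            template("${nginx.kv}")
            value-separator(":")
            prefix("kv.")
        );
    };

I have not run this against your sample, so check what actually ends up in each column before relying on it. In particular, the values in the last block follow a colon and a space, so they may need some trimming.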

Make sure to use the prefix() option in json-parser(), because the key: value block seems to share some keys with the JSON block.
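
For example, with a "j." prefix on the JSON fields (and, as an assumption of mine, a "kv." prefix on the key: value ones), the two host values end up as ${j.host} and ${kv.host} and cannot overwrite each other. A per-host destination like the one you describe could then look roughly like this. It is untested, and s_nginx, p_nginx_split and p_nginx_json are placeholders for your own source and for the parsers that split the message and parse the JSON part:

    # One file per value of the "host" key from the JSON payload;
    # messages where that key is missing go to unknown.log.
    destination d_nginx_per_host {
        file("/var/log/syslog-ng/${j.host:-unknown}.log"
             create_dirs(yes)
             template("${MESSAGE}\n")
        );
    };

    log {
        source(s_nginx);
        parser(p_nginx_split);
        parser(p_nginx_json);
        destination(d_nginx_per_host);
    };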

BTW, is the application that emits this log message publicly available? It seems ideal for showcasing the wide range of available syslog-ng parsers :)

Regards,
Robert

On Sun, Jul 3, 2016 at 3:32 PM, Jorge Pereira <jpereiran@gmail.com> wrote:
Hi,

    I am not sure about the best way to fix my problem; more information below.

1) I receive the packet below, sent from an nginx/openresty instance.

2016/07/02 01:17:04 [emerg] 19081#0: *13163 [lua] init.lua:115: [captcha] {"fail_count":"","response_code":200,"client_ip":"192.168.1.22","hostname":"server-lab01","request_id":"2016-07-02T01:17:03Z|9175f93c0c||i0Xb3BuBWV","host":"www.mytest.com","http_request":{"verb":"GET","url":"\/","user-agent":"Mozilla\/5.0 (pc-x86_64-linux-gnu) Siege\/3.0.8","http_version":"1.1","all":"{\"host\":\"www.mytest.com\",\"x-country-code\":\"US\",\"connection\":\"close\",\"accept\":\"*\\\/*\",\"x-client-ip\":\"192.168.1.22\",\"user-agent\":\"Mozilla\\\/5.0 (pc-x86_64-linux-gnu) Siege\\\/3.0.8\",\"accept-encoding\":\"gzip\"}"},"geoip":{"location":"-90.5334,38.6500","city_name":"Chesterfield","country_name":"United States","longitude":-90.5334,"area_code":314,"latitude":38.65,"country_code2":"US","country_code3":"USA"},"got":"","action":"show","expected":"h1szmM","webapp_domain":"www.mytest.com"} while logging request, client: 192.168.1.22, server: www.mytest.com, request: "GET / HTTP/1.1", host: "www.mytest.com"

2) On my server side, I need to save the logs to a file named after the value of host: "www.mytest.com", like:

/var/log/syslog-ng/www.mytest.com.log

3) The problem is that only part of the received packet is JSON, so I can't use json-parser() on it directly.

4) What is the best approach? I have used:

# Extracting only the JSON payload
rewrite p_nginx_wb_error_log_clean {
    subst(".*captcha] ", "", value("MESSAGE"), flags("global"));
    subst(" while logging request.*$", "", value("MESSAGE"), flags("global"));
};

parser p_nginx_wb_error_log_json {
    json-parser(
        marker("")
        prefix("j.")
    );  
};

destination d_nginx_wb_error_log {
    file("/var/log/syslog-ng/nginx/${j.webapp_domain:-unknow-payload}_error.log"
         create_dirs(yes)
         owner("root")
         group("root")
         perm(0644)
         dir_perm(0755)
         template("${MSG}\n")
    );  
};

--
Jorge Pereira
