[syslog-ng] syslog-ng Digest, Vol 206, Issue 2

Balazs Scheidler bazsi77 at gmail.com
Fri Jun 10 18:35:02 UTC 2022


Hi Александр,

The problem is that the "answers" element of this entry is a list of dicts.
Previously lists were not even properly supported. Recent versions of
syslog-ng (that is very recent we merged quite a few PRs in this area, e.g.
https://github.com/syslog-ng/syslog-ng/pull/3885) do support lists embedded
in JSON objects, and turn them into a list of strings (in case they have
simple types) or keep them a literal "JSON" typed value, which is
reproduced verbatim.

I have configuration:

@version: 3.37

log {
        source { tcp(port(2000) flags(no-parse)); };
        parser { json-parser(prefix(".json.")); };
        destination { file("/tmp/json.out" template("$(format-flat-json
--subkeys .json.)\n")); };
};

With a recent version of syslog-ng, the $(format-flat-json) would look like
this:

{
  "timestamp-rfc3339": "2022-06-06T08:47:58.797332215Z",
  "response-port": "53",
  "response-ip": "192.168.yy.zz",
  "rcode": "NOERROR",
  "query-port": "51000",
  "query-ip": "192.168.xx.zz",
  "qtype": "TXT",
  "qname": "_dnsaddr.bootstrap.libp2p.io",
  "protocol": "TCP",
  "operation": "CLIENT_RESPONSE",
  "length": "691",
  "latency": "0.000000",
  "identity": "ns-server.example.com",
  "family": "INET",
  "country-isocode": "-",
  "answers": "[{\"name\":\"_dnsaddr.bootstrap.libp2p.io
\",\"rdatatype\":\"TXT\",\"ttl\":600,\"rdata\":\"dnsaddr=\\/dnsaddr\\/
ams-2.bootstrap.libp2p.io
\\/p2p\\/QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb\"},{\"name\":\"_
dnsaddr.bootstrap.libp2p.io
\",\"rdatatype\":\"TXT\",\"ttl\":600,\"rdata\":\"dnsaddr=\\/dnsaddr\\/
sjc-1.bootstrap.libp2p.io
\\/p2p\\/QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN\"},{\"name\":\"_
dnsaddr.bootstrap.libp2p.io
\",\"rdatatype\":\"TXT\",\"ttl\":600,\"rdata\":\"dnsaddr=\\/dnsaddr\\/
ewr-1.bootstrap.libp2p.io
\\/p2p\\/QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa\"},{\"name\":\"_
dnsaddr.bootstrap.libp2p.io
\",\"rdatatype\":\"TXT\",\"ttl\":600,\"rdata\":\"dnsaddr=\\/dnsaddr\\/
ams-rust.bootstrap.libp2p.io
\\/p2p\\/12D3KooWEZXjE41uU4EL2gpkAQeDXYok6wghN7wwNVPF5bwkaNfS\"},{\"name\":\"_
dnsaddr.bootstrap.libp2p.io
\",\"rdatatype\":\"TXT\",\"ttl\":600,\"rdata\":\"dnsaddr=\\/dnsaddr\\/
sjc-2.bootstrap.libp2p.io
\\/p2p\\/QmZa1sAxajnQjVM8WjWXoMbmPd7NsWhfKsPkErzpm9wGkp\"},{\"name\":\"_
dnsaddr.bootstrap.libp2p.io
\",\"rdatatype\":\"TXT\",\"ttl\":600,\"rdata\":\"dnsaddr=\\/dnsaddr\\/
nrt-1.bootstrap.libp2p.io
\\/p2p\\/QmcZf59bWwK5XFi76CZX8cbJ4BhTzzA3gU1ZjYZcYW3dwt\"}]"
}

"anwers" is now a string, containing a JSON expression that contains a list
of objects.

If I set the syslog-ng config version to 4.0, I get the new typing
behaviour, which means that ${.json.answers} is now a JSON literal, so the
output becomes this:

{
  "timestamp-rfc3339": "2022-06-06T08:47:58.797332215Z",
  "response-port": "53",
  "response-ip": "192.168.yy.zz",
  "rcode": "NOERROR",
  "query-port": "51000",
  "query-ip": "192.168.xx.zz",
  "qtype": "TXT",
  "qname": "_dnsaddr.bootstrap.libp2p.io",
  "protocol": "TCP",
  "operation": "CLIENT_RESPONSE",
  "length": 691,
  "latency": "0.000000",
  "identity": "ns-server.example.com",
  "family": "INET",
  "country-isocode": "-",
  "answers": [
    {
      "name": "_dnsaddr.bootstrap.libp2p.io",
      "rdatatype": "TXT",
      "ttl": 600,
      "rdata": "dnsaddr=/dnsaddr/
ams-2.bootstrap.libp2p.io/p2p/QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb
"
    },
    {
      "name": "_dnsaddr.bootstrap.libp2p.io",
      "rdatatype": "TXT",
      "ttl": 600,
      "rdata": "dnsaddr=/dnsaddr/
sjc-1.bootstrap.libp2p.io/p2p/QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN
"
    },
    {
      "name": "_dnsaddr.bootstrap.libp2p.io",
      "rdatatype": "TXT",
      "ttl": 600,
      "rdata": "dnsaddr=/dnsaddr/
ewr-1.bootstrap.libp2p.io/p2p/QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa
"
    },
    {
      "name": "_dnsaddr.bootstrap.libp2p.io",
      "rdatatype": "TXT",
      "ttl": 600,
      "rdata": "dnsaddr=/dnsaddr/
ams-rust.bootstrap.libp2p.io/p2p/12D3KooWEZXjE41uU4EL2gpkAQeDXYok6wghN7wwNVPF5bwkaNfS
"
    },
    {
      "name": "_dnsaddr.bootstrap.libp2p.io",
      "rdatatype": "TXT",
      "ttl": 600,
      "rdata": "dnsaddr=/dnsaddr/
sjc-2.bootstrap.libp2p.io/p2p/QmZa1sAxajnQjVM8WjWXoMbmPd7NsWhfKsPkErzpm9wGkp
"
    },
    {
      "name": "_dnsaddr.bootstrap.libp2p.io",
      "rdatatype": "TXT",
      "ttl": 600,
      "rdata": "dnsaddr=/dnsaddr/
nrt-1.bootstrap.libp2p.io/p2p/QmcZf59bWwK5XFi76CZX8cbJ4BhTzzA3gU1ZjYZcYW3dwt
"
    }
  ]
}

e.g. at least "answers" becomes an array and not a string. That's only
slightly better. Let me parse the embedded list a second time. I am adding
this parser to the config:

@version: 4.0

log {
        source { tcp(port(2000) flags(no-parse)); };

        parser { json-parser(prefix(".json.")); };

        parser { json-parser(prefix(".json.answers")
template("${.json.answers}")); };

        destination { file("/tmp/json.out" template("$(format-flat-json
--subkeys .json.)\n")); };
};

The 2nd parser finds that the input to be parsed is a list. The new
syslog-ng 4.0 behaviour is to parse elements into $1, $2, etc. This is the
trace output of the 2nd JSON parser:

[2022-06-10T18:20:12.927168] Setting value; name='1', value='{"name":"_
dnsaddr.bootstrap.libp2p.io
","rdatatype":"TXT","ttl":600,"rdata":"dnsaddr=\/dnsaddr\/
ams-2.bootstrap.libp2p.io\/p2p\/QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb"}',
type='json', msg='0x7ffff0014190', rcptid='102'
[2022-06-10T18:20:12.927179] Setting value; name='2', value='{"name":"_
dnsaddr.bootstrap.libp2p.io
","rdatatype":"TXT","ttl":600,"rdata":"dnsaddr=\/dnsaddr\/
sjc-1.bootstrap.libp2p.io\/p2p\/QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN"}',
type='json', msg='0x7ffff0014190', rcptid='102'
[2022-06-10T18:20:12.927190] Setting value; name='3', value='{"name":"_
dnsaddr.bootstrap.libp2p.io
","rdatatype":"TXT","ttl":600,"rdata":"dnsaddr=\/dnsaddr\/
ewr-1.bootstrap.libp2p.io\/p2p\/QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa"}',
type='json', msg='0x7ffff0014190', rcptid='102'
[2022-06-10T18:20:12.927199] Setting value; name='4', value='{"name":"_
dnsaddr.bootstrap.libp2p.io
","rdatatype":"TXT","ttl":600,"rdata":"dnsaddr=\/dnsaddr\/
ams-rust.bootstrap.libp2p.io\/p2p\/12D3KooWEZXjE41uU4EL2gpkAQeDXYok6wghN7wwNVPF5bwkaNfS"}',
type='json', msg='0x7ffff0014190', rcptid='102'
[2022-06-10T18:20:12.927207] Setting value; name='5', value='{"name":"_
dnsaddr.bootstrap.libp2p.io
","rdatatype":"TXT","ttl":600,"rdata":"dnsaddr=\/dnsaddr\/
sjc-2.bootstrap.libp2p.io\/p2p\/QmZa1sAxajnQjVM8WjWXoMbmPd7NsWhfKsPkErzpm9wGkp"}',
type='json', msg='0x7ffff0014190', rcptid='102'
[2022-06-10T18:20:12.927216] Setting value; name='6', value='{"name":"_
dnsaddr.bootstrap.libp2p.io
","rdatatype":"TXT","ttl":600,"rdata":"dnsaddr=\/dnsaddr\/
nrt-1.bootstrap.libp2p.io\/p2p\/QmcZf59bWwK5XFi76CZX8cbJ4BhTzzA3gU1ZjYZcYW3dwt"}',
type='json', msg='0x7ffff0014190', rcptid='102'

e.g. $1 would be the first element of the JSON array, $2 being the 2nd and
so on. We can turn these matches into a syslog-ng list, using the special
macro "$*" and manipulate it using the list related template functions
$(list-*).

We can reparse these elements back into the original message using a 3rd
invocation of the json-parser.

Config:
@version: 4.0

log {
        source { tcp(port(2000) flags(no-parse)); };

        parser { json-parser(prefix(".json.")); };

parser { json-parser(prefix(".json.answers") template("${.json.answers}"));
};

parser { json-parser(prefix(".json.answers_0_") template("$1")); };

        destination { file("/tmp/json.out" template("$(format-flat-json
--subkeys .json.)\n")); };
};


Trace:
[2022-06-10T18:30:42.820943] Setting value; name='.json.answers_0_name',
value='_dnsaddr.bootstrap.libp2p.io', type='string', msg='0x7ffff0014190',
rcptid='104'
[2022-06-10T18:30:42.820970] Setting value;
name='.json.answers_0_rdatatype', value='TXT', type='string',
msg='0x7ffff0014190', rcptid='104'
[2022-06-10T18:30:42.820994] Setting value; name='.json.answers_0_ttl',
value='600', type='int64', msg='0x7ffff0014190', rcptid='104'
[2022-06-10T18:30:42.821021] Setting value; name='.json.answers_0_rdata',
value='dnsaddr=/dnsaddr/
ams-2.bootstrap.libp2p.io/p2p/QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb',
type='string', msg='0x7ffff0014190', rcptid='104'

Getting somewhere. You can do this for each of your elements. The only
issue is that you can't loop over the array. Yet.

So if you know there's a limited number of these elements, you can do this,
by checking if $N is set, and do this parsing if it is.

I am afraid that's what we have at the moment. It is probably faster than
doing it all in Python, but hey syslog-ng is not a programming language, is
it? :)

I am giving a thought how we could do some level of iteration, but I don't
have a very good idea at the moment.

Hope this helps,
Balazs

On Fri, Jun 10, 2022 at 5:07 PM Александр Масленников <
alexander.a.maslennikov at gmail.com> wrote:

> i'm not sure that *format-flat-json* works with lists same as with nested
> dicts.
> My example uses format-flat-json, but without an additional filter on
> pothon, I was unable to flatten it
> There is original message
>
> {
> 	"operation": "CLIENT_RESPONSE",
> 	"identity": "ns-server.example.com",
> 	"family": "INET",
> 	"protocol": "TCP",
> 	"query-ip": "192.168.xx.zz",
> 	"query-port": "51000",
> 	"response-ip": "192.168.yy.zz",
> 	"response-port": "53",
> 	"length": 691,
> 	"rcode": "NOERROR",
> 	"qname": "_dnsaddr.bootstrap.libp2p.io",
> 	"qtype": "TXT",
> 	"latency": "0.000000",
> 	"timestamp-rfc3339": "2022-06-06T08:47:58.797332215Z",
> 	"answers": [{
> 		"name": "_dnsaddr.bootstrap.libp2p.io",
> 		"rdatatype": "TXT",
> 		"ttl": 600,
> 		"rdata": "dnsaddr=/dnsaddr/ams-2.bootstrap.libp2p.io/p2p/QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb"
> 	}, {
> 		"name": "_dnsaddr.bootstrap.libp2p.io",
> 		"rdatatype": "TXT",
> 		"ttl": 600,
> 		"rdata": "dnsaddr=/dnsaddr/sjc-1.bootstrap.libp2p.io/p2p/QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN"
> 	}, {
> 		"name": "_dnsaddr.bootstrap.libp2p.io",
> 		"rdatatype": "TXT",
> 		"ttl": 600,
> 		"rdata": "dnsaddr=/dnsaddr/ewr-1.bootstrap.libp2p.io/p2p/QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa"
> 	}, {
> 		"name": "_dnsaddr.bootstrap.libp2p.io",
> 		"rdatatype": "TXT",
> 		"ttl": 600,
> 		"rdata": "dnsaddr=/dnsaddr/ams-rust.bootstrap.libp2p.io/p2p/12D3KooWEZXjE41uU4EL2gpkAQeDXYok6wghN7wwNVPF5bwkaNfS"
> 	}, {
> 		"name": "_dnsaddr.bootstrap.libp2p.io",
> 		"rdatatype": "TXT",
> 		"ttl": 600,
> 		"rdata": "dnsaddr=/dnsaddr/sjc-2.bootstrap.libp2p.io/p2p/QmZa1sAxajnQjVM8WjWXoMbmPd7NsWhfKsPkErzpm9wGkp"
> 	}, {
> 		"name": "_dnsaddr.bootstrap.libp2p.io",
> 		"rdatatype": "TXT",
> 		"ttl": 600,
> 		"rdata": "dnsaddr=/dnsaddr/nrt-1.bootstrap.libp2p.io/p2p/QmcZf59bWwK5XFi76CZX8cbJ4BhTzzA3gU1ZjYZcYW3dwt"
> 	}],
> 	"country-isocode": "-"
> }
>
> I would be very grateful to you if you have a solution using the built-in
> functions of syslog-ng.
>
> пт, 10 июн. 2022 г. в 15:00, <syslog-ng-request at lists.balabit.hu>:
>
>> Send syslog-ng mailing list submissions to
>>         syslog-ng at lists.balabit.hu
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>>         https://lists.balabit.hu/mailman/listinfo/syslog-ng
>> or, via email, send a message with subject or body 'help' to
>>         syslog-ng-request at lists.balabit.hu
>>
>> You can reach the person managing the list at
>>         syslog-ng-owner at lists.balabit.hu
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of syslog-ng digest..."
>>
>>
>> Today's Topics:
>>
>>    1.  need help with parser to make flat nested json list of
>>       dictionaries (????????? ???????????)
>>    2. Re:  need help with parser to make flat nested json list of
>>       dictionaries (Peter Kokai (pkokai))
>>
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Fri, 10 Jun 2022 11:02:55 +0300
>> From: ????????? ???????????  <alexander.a.maslennikov at gmail.com>
>> To: syslog-ng at lists.balabit.hu
>> Subject: [syslog-ng] need help with parser to make flat nested json
>>         list of dictionaries
>> Message-ID:
>>         <CA+G0nAjp1b6_50LbCROVPje1_B4R_AzNYiZ-_dT0m=
>> fXcqwmHA at mail.gmail.com>
>> Content-Type: text/plain; charset="utf-8"
>>
>> hi all
>> i have a json message that contains a nested json list of dicts
>>
>> {"a":1,"b":[{"c":1},{"c":2},{"c":3}]}
>>
>> i want to flat that message, so expected result looks like {
>> "a": 1,
>> "b_0_c": 1,
>> "b_1_c": 2,
>> "b_2_c": 3
>> }
>>
>> My approach is a python implemented parser.
>> Is it possible to achieve the same result using the built-in syslog-ng
>> tools?
>> My solution below
>>
>> @define kafka-implementation kafka-c
>>
>> python {
>>
>> import collections
>> import json
>>
>> class FlattenedJson(object):
>>
>>     def parse(self, log_message, flat_message=None):
>>         def flatten(d, parent_key='', sep='_'):
>>             items = []
>>             for k, v in d.items():
>>                 new_key = parent_key + sep + k if parent_key else k
>>                 if isinstance(v, collections.MutableMapping):
>>                     items.extend(flatten(v, new_key, sep=sep).items())
>>                 elif isinstance(v, list):
>>                     for idx, value in enumerate(v):
>>                         items.extend(flatten(value, new_key + sep +
>> str(idx), sep).items())
>>                 else:
>>                     items.append((new_key, v))
>>             return dict(items)
>>         try:
>>             decoded_msg =
>> json.loads(log_message['MESSAGE'].decode('utf-8'))
>>             flat_message = flatten(decoded_msg)
>>             final_message =
>> str(json.dumps(flat_message)).encode(encoding='utf-8')
>>             log_message['MESSAGE'] = final_message
>>         except Exception as error:
>>             log_message['python_error'] = 'An exception occurred:
>> {}'.format(error)
>>         return True
>> };
>>
>> destination d_kafka_dnstap {
>>   kafka(
>>     topic("mytopic")
>>     bootstrap-servers("localhost:9092")
>>     message("$(format-flat-json  --scope all-nv-pairs
>> application_name=myapp @timestamp=${ISODATE} )")
>>   );
>> };
>>
>> source s_net_dnstap { network( transport(udp) port(514) flags(no-parse)
>> ); };
>>
>> parser p_dnstap { channel {
>>     parser { python(class("FlattenedJson")); };
>>     parser { json-parser(prefix("dnstap.")); };
>>   };
>> };
>>
>> log { source(s_net_dnstap); parser(p_dnstap);
>> destination(d_kafka_dnstap); };
>> -------------- next part --------------
>> An HTML attachment was scrubbed...
>> URL: <
>> http://lists.balabit.hu/pipermail/syslog-ng/attachments/20220610/0fb59c2a/attachment-0001.htm
>> >
>>
>> ------------------------------
>>
>> Message: 2
>> Date: Fri, 10 Jun 2022 08:09:24 +0000
>> From: "Peter Kokai (pkokai)" <Peter.Kokai at oneidentity.com>
>> To: "syslog-ng at lists.balabit.hu" <syslog-ng at lists.balabit.hu>
>> Subject: Re: [syslog-ng] need help with parser to make flat nested
>>         json list of dictionaries
>> Message-ID:
>>         <
>> SA1PR19MB5641CF9A4E2AB1502C5AE348F8A69 at SA1PR19MB5641.namprd19.prod.outlook.com
>> >
>>
>> Content-Type: text/plain; charset="koi8-r"
>>
>> Hello,
>>
>> If the underlines are not a must in the key, yes you can use
>> *format-flat-json* (it uses dot instead of underscore). It uses the same
>> syntax as format-json.
>>
>> --
>> Kokan
>>
>> ________________________________________
>> From: syslog-ng <syslog-ng-bounces at lists.balabit.hu> on behalf of
>> ????????? ??????????? <alexander.a.maslennikov at gmail.com>
>> Sent: 10 June 2022 10:02
>> To: syslog-ng at lists.balabit.hu
>> Subject: [syslog-ng] need help with parser to make flat nested json list
>> of dictionaries
>>
>> CAUTION: This email originated from outside of the organization. Do not
>> follow guidance, click links, or open attachments unless you recognize the
>> sender and know the content is safe.
>>
>> hi all
>> i have a json message that contains a nested json list of dicts
>>
>> {"a":1,"b":[{"c":1},{"c":2},{"c":3}]}
>>
>> i want to flat that message, so expected result looks like {
>> "a": 1,
>> "b_0_c": 1,
>> "b_1_c": 2,
>> "b_2_c": 3
>> }
>>
>> My approach is a python implemented parser.
>> Is it possible to achieve the same result using the built-in syslog-ng
>> tools?
>> My solution below
>>
>>
>> @define kafka-implementation kafka-c
>>
>> python {
>>
>> import collections
>> import json
>>
>> class FlattenedJson(object):
>>
>>     def parse(self, log_message, flat_message=None):
>>         def flatten(d, parent_key='', sep='_'):
>>             items = []
>>             for k, v in d.items():
>>                 new_key = parent_key + sep + k if parent_key else k
>>                 if isinstance(v, collections.MutableMapping):
>>                     items.extend(flatten(v, new_key, sep=sep).items())
>>                 elif isinstance(v, list):
>>                     for idx, value in enumerate(v):
>>                         items.extend(flatten(value, new_key + sep +
>> str(idx), sep).items())
>>                 else:
>>                     items.append((new_key, v))
>>             return dict(items)
>>         try:
>>             decoded_msg =
>> json.loads(log_message['MESSAGE'].decode('utf-8'))
>>             flat_message = flatten(decoded_msg)
>>             final_message =
>> str(json.dumps(flat_message)).encode(encoding='utf-8')
>>             log_message['MESSAGE'] = final_message
>>         except Exception as error:
>>             log_message['python_error'] = 'An exception occurred:
>> {}'.format(error)
>>         return True
>> };
>>
>> destination d_kafka_dnstap {
>>   kafka(
>>     topic("mytopic")
>>     bootstrap-servers("localhost:9092")
>>     message("$(format-flat-json  --scope all-nv-pairs
>> application_name=myapp @timestamp=${ISODATE} )")
>>   );
>> };
>>
>> source s_net_dnstap { network( transport(udp) port(514) flags(no-parse)
>> ); };
>>
>> parser p_dnstap { channel {
>>     parser { python(class("FlattenedJson")); };
>>     parser { json-parser(prefix("dnstap.")); };
>>   };
>> };
>>
>> log { source(s_net_dnstap); parser(p_dnstap);
>> destination(d_kafka_dnstap); };
>>
>>
>> ------------------------------
>>
>> Subject: Digest Footer
>>
>> _______________________________________________
>> syslog-ng maillist  -  syslog-ng at lists.balabit.hu
>> https://lists.balabit.hu/mailman/listinfo/syslog-ng
>>
>>
>> ------------------------------
>>
>> End of syslog-ng Digest, Vol 206, Issue 2
>> *****************************************
>>
>
> ______________________________________________________________________________
> Member info: https://lists.balabit.hu/mailman/listinfo/syslog-ng
> Documentation:
> http://www.balabit.com/support/documentation/?product=syslog-ng
> FAQ: http://www.balabit.com/wiki/syslog-ng-faq
>
>

-- 
Bazsi
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.balabit.hu/pipermail/syslog-ng/attachments/20220610/bcde0aae/attachment-0001.htm>


More information about the syslog-ng mailing list