Encoding problem when writing to postgres
Hello List,

I get the following error when I try to write to postgres:

Error running SQL query; type='pgsql', host='XXX', port='', user='XXX', database='syslogng', error='ERROR: invalid byte sequence for encoding "UTF8": 0xe96461\x0aHINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".\x0a', query='INSERT INTO msg_table (msg_rcv_time, msg_sent_time, hostname, msg_facility, msg_priority, msg_text) VALUES (\'2008-10-14T19:59:46+02:00\', \'2008-10-14T19:59:46+02:00\', \'foo\', \'mail\', \'warning\', \'mimedefang.pl[3497]: Message contains more than one Subject: header: Bulletin d\\\'\'info d\\\'\'Enseigner.TV et la fiche p\xffdagogique de l\\\'\'\xffmission du mois \xff 14 octobre 208 --> Bulletin d\\\'\'info d\\\'\'Enseigner.TV et la fiche p\xffdagogique de l\\\'\'\xffmission du mois \xff 14 octobre 208\')'

How can I fix this? I am running postgresql-8.1 on Debian 4.0.

Thanks,
Mario
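The byte sequence named in the error, 0xe96461, can be checked by hand. The following is only a minimal Python sketch. It assumes the sender used ISO-8859-1, which is a guess based on the French text in the message and not something the log confirms.

```python
# Bytes reported by PostgreSQL in the error message.
raw = bytes.fromhex("e96461")

# Strict UTF-8 decoding fails: 0xE9 announces a three-byte sequence,
# but the next byte, 0x64, is not a valid continuation byte.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# Assumption: the sender used ISO-8859-1 (Latin-1), where 0xE9 is "é".
as_latin1 = raw.decode("iso-8859-1")

print(valid_utf8)  # False
print(as_latin1)   # éda
```

Under that assumption the bytes read "éda", which would fit the word "pédagogique" in the logged Subject header.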
On Wed, 2008-10-15 at 09:53 +0200, ml@bortal.de wrote:
Hello List,
I get the following error when I try to write to postgres:
Error running SQL query; type='pgsql', host='XXX', port='', user='XXX', database='syslogng', error='ERROR: invalid byte sequence for encoding "UTF8": 0xe96461\x0aHINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".\x0a', query='INSERT INTO msg_table (msg_rcv_time, msg_sent_time, hostname, msg_facility, msg_priority, msg_text) VALUES (\'2008-10-14T19:59:46+02:00\', \'2008-10-14T19:59:46+02:00\', \'foo\', \'mail\', \'warning\', \'mimedefang.pl[3497]: Message contains more than one Subject: header: Bulletin d\\\'\'info d\\\'\'Enseigner.TV et la fiche p\xffdagogique de l\\\'\'\xffmission du mois \xff 14 octobre 208 --> Bulletin d\\\'\'info d\\\'\'Enseigner.TV et la fiche p\xffdagogique de l\\\'\'\xffmission du mois \xff 14 octobre 208\')'
How can I fix this? I am running postgresql-8.1 on Debian 4.0.
Probably the syslog client sends non-utf8 characters in its message. syslog-ng 2.0 and 2.1 do not really care about the message contents. 3.0 has support for various encodings and utf8 validation, like this.

This assumes that the input is latin1:

    source s_net { udp(encoding("iso-8859-1")); };

And this one enforces valid utf8 messages:

    source s_net { udp(flags(validate-utf8)); };

I was planning to add another flag, but this is not yet implemented:

    source s_net { udp(flags(force-utf8)); };

This would enforce valid utf8 sequences by changing the input.

As an alternative, don't use utf8 as the character encoding of your database; use latin1, which will permit any kind of data in the database.

--
Bazsi
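The three approaches mentioned in the reply (convert from an assumed encoding, validate, or force valid UTF-8 by changing the input) can be illustrated outside syslog-ng. The Python sketch below is not syslog-ng's implementation. The function names are hypothetical, and replacing bad bytes with U+FFFD is only one possible reading of "changing the input".

```python
def convert_from_latin1(raw: bytes) -> bytes:
    """Treat the input as ISO-8859-1 and re-encode it as UTF-8."""
    return raw.decode("iso-8859-1").encode("utf-8")


def is_valid_utf8(raw: bytes) -> bool:
    """Return True only if the input is well-formed UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def force_utf8(raw: bytes) -> bytes:
    """Replace each invalid sequence with U+FFFD (an assumed policy)."""
    return raw.decode("utf-8", errors="replace").encode("utf-8")


# Hypothetical sample: part of the logged header, as Latin-1 bytes.
sample = b"fiche p\xe9dagogique"

print(is_valid_utf8(sample))                        # False
print(convert_from_latin1(sample).decode("utf-8"))  # fiche pédagogique
print(is_valid_utf8(force_utf8(sample)))            # True
```

Conversion preserves the text only if the guess about the source encoding is right. Forcing always yields valid UTF-8, but the original character is lost.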
Participants (2): Balazs Scheidler, ml@bortal.de