Balazs Scheidler <bazsi@balabit.hu> writes:
Where, when, how??
==================
There are multiple goals to accomplish, which I believe are not mutually exclusive: we want to keep the sql() statement in the configuration file and maintain backwards compatibility as well. On the other hand, we also want to improve performance, and that is best accomplished by using the native database libraries. To this end, I propose we introduce new database types: "mysql-native", "postgresql-native", "sqlite-native", and so on.
I would probably not create separate types for these; is there a point in specifying which API to use from a user's point of view? I would guess we have the responsibility to decide which one to use, and if there's a notable, real difference between methods (apart from performance), add some knobs to adjust those, but only those.
Good point. So if syslog-ng was compiled with libmysqlclient present, then sql(type(mysql)) would automatically use the native driver instead of libdbi? HandlerSocket would/will be a little different, as it's not really SQL: it just happens to use MySQL for storage, but that's about it. I'm still pondering how (if it is possible at all) to use that as part of afsql.
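For reference, the user-visible side would stay the familiar sql() destination; under this scheme a config like the sketch below (host, credentials and column names are made-up examples) would silently pick the native libmysqlclient driver when it was compiled in, and fall back to libdbi otherwise:

```
destination d_sql {
  sql(type(mysql)
      host("localhost") username("syslog") password("secret")
      database("logs") table("messages")
      columns("seqid", "host", "message")
      values("${SEQNUM}", "${HOST}", "${MESSAGE}"));
};
```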
mysql_list_tables()
-------------------
Instead of doing a full SELECT to verify that a table exists, we can just try to list the table. This is probably not a significant gain, but every bit counts.
I'm not sure this is worth the effort; a table is only checked for existence when we first want to insert a message, and when syslog-ng is restarted or an error occurs.
Indeed, in that case, it's not worth it.
I'd recommend making it possible to reuse RDBMS-independent code blocks (like checking whether a table exists), because if we do everything in an RDBMS-specific way, it could become a maintenance hell.
Yep, reusing as much code as possible is part of my plans. But I'd rather do the native drivers first, possibly reimplementing a lot of stuff, and then - in a next iteration - move the RDBMS-independent code into a common place.
Insertion plans
===============
There are a couple of alternative ways to improve insert speed by changing how we actually do the inserts. All have their pros and cons, which I will try to enumerate below.
LOAD DATA INFILE
----------------
The fastest way by far is LOAD DATA INFILE, which is what people use in the various perl scripts, as far as I remember. The downside is that - as far as I have seen so far - this needs a temporary file, which has its own drawbacks, like writing to disk (unless it is put on tmpfs or similar). An option I haven't tried yet would be to use a pipe(), so that the data never hits the disk.
Nevertheless, the mysql-native db type will have an option to use LOAD DATA INFILE.
* Pros:
  + The fastest way to insert
* Cons:
  - Needs a temporary file, with all its drawbacks, some of which can be worked around.
Well, at least on Linux this could be worked around by using POSIX shared memory, which has an underlying file on Linux (under /dev/shm), but also on Solaris (/tmp/.SHMD<shmname>). Possibly on other platforms too.
Yeah. And on other platforms, there are still named pipes. I'll probably start with named pipes and add platform hacks later.
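The named-pipe idea can be sketched like this: the producer streams serialised rows into a fifo, and the consumer reads them as if from an ordinary file, so the data never lands on disk. This is an illustrative stdlib-Python sketch, not driver code; the reader thread stands in for mysqld executing LOAD DATA INFILE on the fifo path.

```python
# Sketch: stream rows to a consumer through a named pipe so bulk-load
# data never touches the disk. The reader thread stands in for the
# MySQL server side of LOAD DATA INFILE; names/paths are made up.
import os
import tempfile
import threading

fifo_path = os.path.join(tempfile.mkdtemp(), "bulkload.fifo")
os.mkfifo(fifo_path)

received = []

def consumer():
    # mysqld would open and read the fifo just like a regular file
    with open(fifo_path, "r") as f:
        for line in f:
            received.append(line.rstrip("\n"))

t = threading.Thread(target=consumer)
t.start()

# The producer (syslog-ng's role) writes the serialised rows into the pipe
with open(fifo_path, "w") as f:
    for row in ["1\thost1\thello", "2\thost2\tworld"]:
        f.write(row + "\n")

t.join()
print(received)
```

Note the classic fifo handshake: the reader's open() blocks until a writer opens the other end, and closing the writer delivers EOF to the reader, which is what lets the load terminate cleanly.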
- Serialisation is costlier than simple INSERTs.
I wouldn't say that. Even INSERTs need escaping in order to avoid SQL injection, and the escaping needed for the LOAD format is not really different (if at all).
There's a difference, though: with INSERTs, assuming that we're using prepared statements, there's not much extra processing beyond resolving the templates; the rest is handled by the native client library. With prepared statements, we don't need to construct INSERT commands: we already did that once, and we just bind our new values to the statement. For example, assuming we have a statement template like this:

  INSERT INTO syslog (seqid, host, message) VALUES (?, ?, ?)

Then whenever we want to insert a message, we do something along these lines:

  rdbms_prep_statement_bind (stmnt_handle, 0, seqid);
  rdbms_prep_statement_bind (stmnt_handle, 1, host);
  rdbms_prep_statement_bind (stmnt_handle, 2, message);
  rdbms_prep_statement_execute (stmnt_handle);

There's no extra escaping needed, since the server does not need to parse the whole statement: we already prepared it ages ago. We just send two parts, the template and the variables, and the server does the rest. In LOAD DATA INFILE's case, however, we'd need to do the escaping ourselves (which is not needed when using INSERT with prepared statements) and format the data into a suitable format. On the other hand, the extra overhead of formatting is far smaller than the overhead of using INSERTs instead of LOAD DATA INFILE, so as it turns out, this is a non-issue as well :)
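The bind-and-execute pattern can be demonstrated end to end with Python's stdlib sqlite3 standing in for a native MySQL client (the rdbms_prep_statement_* calls above are this thread's pseudocode, not a real API). The driver parses the statement once; afterwards only the bound values travel, so no escaping is done on our side:

```python
# Prepared-statement sketch: one template, many bind-and-execute
# calls, no manual escaping. sqlite3 is used here purely for
# illustration; a native driver would do the equivalent in C.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE syslog (seqid INTEGER, host TEXT, message TEXT)")

# The statement template with placeholders -- prepared once, reused
stmt = "INSERT INTO syslog (seqid, host, message) VALUES (?, ?, ?)"

# Per-message we only bind values; even a hostile message needs no escaping
rows = [
    (1, "host1", "plain message"),
    (2, "host2", "tricky ' OR '1'='1 message"),
]
conn.executemany(stmt, rows)

count = conn.execute("SELECT COUNT(*) FROM syslog").fetchone()[0]
print(count)  # 2 -- both rows inserted intact
```

The second row would break a naively concatenated INSERT string; with binding it is stored verbatim, which is exactly the "server does the rest" point above.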
- No fine-grained error checking: if something goes wrong, the whole LOAD will have to be dropped (unless we add code that splits bulks up into smaller chunks, and retries - but that's something I want to avoid if possible)
What we do with transactions currently, is that we put items being added to the transactions to the "backlog" queue, which is then dropped when the transaction succeeds, or rewound when it doesn't.
I think the same can work with LOAD DATA.
Yep, but that still doesn't tell us which of the 1000 messages triggered the error, only that one of them did. Nevertheless, this is a very, very minor inconvenience, and if there are no bugs in our code, it doesn't even matter.
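The backlog scheme described above can be sketched in a few lines (illustrative only, not syslog-ng's actual queue code): messages sent within a transaction are parked on a backlog; a successful commit drops them, a failure rewinds them to the front of the pending queue for a retry.

```python
# Backlog sketch: drop on commit, rewind (in order) on failure.
from collections import deque

pending = deque(["msg1", "msg2", "msg3"])
backlog = deque()

def send_batch(n, commit_ok):
    # Move up to n messages to the backlog as they are "sent"
    while n and pending:
        backlog.append(pending.popleft())
        n -= 1
    if commit_ok:
        backlog.clear()          # transaction succeeded: forget them
    else:
        while backlog:           # transaction failed: rewind, keeping order
            pending.appendleft(backlog.pop())

send_batch(2, commit_ok=False)   # failed bulk: msg1, msg2 back in front
print(list(pending))             # ['msg1', 'msg2', 'msg3']
send_batch(2, commit_ok=True)    # retry succeeds
print(list(pending))             # ['msg3']
```

As the thread notes, this tells us only that the batch failed, not which message caused it; the rewind just guarantees nothing is lost across the retry.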
For LOAD DATA to work, we will have to format the data to insert into a suitable format, and if there's an error in the serialisation somewhere, we'll have a hard time figuring out where the problem is, and syslog-ng will just give up.
Also, beating our data into a suitable format isn't exactly cheap, either (think escaping: processing large amounts of string data, possibly changing strings - there's both processing and memory ramifications involved).
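To make the "beating into a suitable format" concrete: with MySQL's LOAD DATA defaults (tab-separated fields, newline-terminated lines, backslash escapes), every field has to be scanned and possibly rewritten. This is an illustrative sketch, not driver code; the function name is made up and a real driver would also handle NULLs (\N) and the like.

```python
# Minimal per-field escaper for the LOAD DATA text format
# (MySQL defaults). Every character is inspected, and strings
# containing separators get rewritten -- this scan is the CPU and
# memory cost the paragraph above is talking about.
def escape_load_data_field(value: str) -> str:
    out = []
    for ch in value:
        if ch == "\\":
            out.append("\\\\")
        elif ch == "\t":
            out.append("\\t")
        elif ch == "\n":
            out.append("\\n")
        elif ch == "\0":
            out.append("\\0")
        else:
            out.append(ch)
    return "".join(out)

row = [str(42), "host1", "multi\nline\tmessage"]
line = "\t".join(escape_load_data_field(f) for f in row) + "\n"
print(repr(line))
```

Free-form syslog message text routinely contains tabs and newlines, so this rewrite happens on real traffic, not just in corner cases.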
The need for a file is also a (minor) issue. I suppose named pipes would work; that way the data never hits the disk between syslog-ng and mysql. However, that still involves copying data around. We'll see if these affect LOAD DATA's performance badly enough (they probably don't).
I was considering this option earlier, when I was thinking about implementing bulk loads.
There's a "LOAD DATA LOCAL INFILE" option, can't that be used to generate the file without having to write to disk? Or is that completely hidden by the mysql client lib?
LOAD DATA LOCAL INFILE is what the driver will use, and it still needs a temporary file. The difference between LOAD DATA INFILE and LOAD DATA LOCAL INFILE is that in the latter case the file is on the client side, while in the former it's on the server side. All forms of LOAD DATA need an input file (it's really LOAD DATA INFILE, but that's just too long to write every time O:).
The Plan
========
The plan is to implement all the strategies above and measure them against each other and against external scripts. This will take a little time, because I want to get the architecture right and figure out whether we want a single writer thread or multiple writer threads (per destination, of course), or whether I want to make that a configurable option as well.
I would recommend not implementing all loading options in parallel, but rather choosing the one which offers the best pros/cons ratio and sticking with it.
After thinking about it more, and based on your and Martin's suggestions, I'll stick with LOAD DATA INFILE & HandlerSocket for MySQL. For Postgres, I'll go with COPY, and we'll see about the rest when I get there.
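For reference, the Postgres counterpart meant here is the COPY ... FROM STDIN form, which libpq exposes natively (PQputCopyData/PQputCopyEnd); table and column names below are illustrative only:

```
COPY syslog (seqid, host, message) FROM STDIN;
```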
Completely independent methods for writing databases can again become a maintenance burden.
Yep, but at least for me, it makes initial prototyping easier. Once that's done, I will collect the common stuff into a shared core, and leave only those things in the database-specific drivers that do need to be there (basically, the calls into the native client libraries in most cases). At least, that's how I see it now. Once I sit down and start coding, I'll see if this idea is flawed or not. The end result will be the same in both cases, though.
I really doubt that multiple feeding threads would increase the loading speed. I'd think this is really disk-bound, and/or internal locking would inhibit scaling to multiple threads anyway. Although it'd probably make sense to try it :)
In the meantime, I tried a few experiments, and concluded that multiple feeders are not worth the hassle. -- |8]