On Sat, 2011-01-08 at 20:41 -0600, Martin Holste wrote:
Ahahaha that is awesome! Search will actually be really easy since you can index on anything in there. I think what would work best for full-text search in mojology (doesn't roll off my tongue, but whatever fuels your passion...)
Think of it as a "monology", with a j instead of n.
is to have an optional second process that goes through newly inserted logs and does an in-place update. So if a log entry starts with:
{
  _id: ...
  "timestamp": ...
  "dyn": { "classifier": { "class": "some class" } },
  "msg": "hello, world, this is a test"
}
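Then do something like this to update it:

db.getCollection("logs").find({ "timestamp": { $gt: <date last fulltext indexed>, $lt: <now> } }).forEach(function (doc) {
    // A plain $set cannot read doc.msg, so split on the client; \W+ also drops the commas.
    db.getCollection("logs").update({ _id: doc._id }, { $set: { "fulltext": doc.msg.split(/\W+/).filter(Boolean) } });
});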
Which adds the fulltext field to yield:

{
  _id: ...
  "timestamp": ...
  "dyn": { "classifier": { "class": "some class" } },
  "msg": "hello, world, this is a test",
  "fulltext": [ "hello", "world", "this", "is", "a", "test" ]
}
I'm a little shaky on the Mongo update code there, but you get the idea. The point is that since it would be an optional second-pass, it would be easy to tune or eliminate for performance. If you do ensureIndex("dyn") and ensureIndex("fulltext") then you have pretty much all of your searching-bases covered. You could of course add this as an option to your Mongo Syslog-NG driver to do the split when the original insert occurs for better overall performance and less database fragmentation, but there would be a significantly higher insert time.
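To make the second-pass idea concrete, here is a minimal sketch that runs on plain in-memory objects rather than a live database. The function names tokenize and secondPass are hypothetical and not part of mojology, syslog-ng or MongoDB. It assumes timestamps are Date values and that words are separated by runs of non-word characters, which is what removes the commas that a split on whitespace alone would keep.

```javascript
// Hypothetical helpers for illustration only; not taken from any real driver.

// Split a message into words on runs of non-word characters.
// Note: \W is ASCII-based in JavaScript, so non-ASCII letters act as separators.
function tokenize(msg) {
  return msg.split(/\W+/).filter(function (word) {
    return word.length > 0; // drop empty strings from leading/trailing punctuation
  });
}

// Simulate the optional second pass: add a "fulltext" array to every log
// whose timestamp lies strictly between lastIndexed and now.
// Returns the number of documents that were updated.
function secondPass(logs, lastIndexed, now) {
  var updated = 0;
  logs.forEach(function (doc) {
    if (doc.timestamp > lastIndexed && doc.timestamp < now) {
      doc.fulltext = tokenize(doc.msg);
      updated += 1;
    }
  });
  return updated;
}

var sampleLogs = [
  { timestamp: new Date(2000), msg: "hello, world, this is a test" },
  { timestamp: new Date(9000), msg: "outside the window" }
];
secondPass(sampleLogs, new Date(1000), new Date(5000));
console.log(JSON.stringify(sampleLogs[0].fulltext));
```

Because the pass only looks at a time window, it can be run on a schedule, tuned, or skipped entirely without affecting inserts.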
I was considering something like that (and a few other things that would involve updating the db), and my current idea is to use a separate collection instead, so that if the original collection is, say, a capped collection, we don't unnecessarily add extra burden to it. That, and updating has a reasonable chance of fragmenting the document on-disk... So instead, I'll see if I can use a $Docref (or whatever that is called). That would make mojology a little slower, but it wouldn't need to touch the source collection at all. I didn't think about fulltext search though, so thanks for the suggestion!

--
|8]
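As a rough illustration of the separate-collection idea, the sketch below builds index entries that point back at the original log through a DBRef-style reference (the { $ref, $id } convention MongoDB documents). Everything else is an assumption made for the example: the field name source, the function names indexEntry and search, and the in-memory lookup table standing in for the source collection.

```javascript
// Hypothetical sketch; field and function names are illustrative assumptions.

// Build an entry for a separate index collection. The source log is never
// modified; the entry only carries a DBRef-style pointer to it.
function indexEntry(doc, sourceCollection) {
  return {
    source: { $ref: sourceCollection, $id: doc._id },
    fulltext: doc.msg.split(/\W+/).filter(Boolean)
  };
}

// Two-step lookup: first match the word in the index entries, then resolve
// each reference against the source logs (here a plain object keyed by _id).
// The extra resolve step is the cost of leaving the source collection alone.
function search(indexEntries, logsById, word) {
  return indexEntries
    .filter(function (entry) { return entry.fulltext.indexOf(word) !== -1; })
    .map(function (entry) { return logsById[entry.source.$id]; });
}

var log = { _id: "abc123", msg: "hello, world, this is a test" };
var entries = [indexEntry(log, "logs")];
console.log(search(entries, { abc123: log }, "world").length);
```

The second lookup is where the "a little slower" cost shows up, in exchange for never writing to the source collection.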