Hi,

thank you all for the feedback. I've created a separate branch in my repository for this feature and updated it according to your comments. It can be viewed online at https://github.com/gyp/syslog-ng-3.4/commits/feature/3.4/tfhash

See my comments in-line.

On 08/06/2012 11:47 AM, Balazs Scheidler wrote:
On Sun, 2012-08-05 at 17:37 +0200, Gergely Nagy wrote:
Gergely Nagy <algernon@madhouse-project.org> writes:

1) template functions should rather use command-line parsing than
positional arguments. (e.g. I'd prefer --method, --length, etc as
arguments instead of hardcoding the order)
Well, in general, I would agree, but not in this case: for $(hash), the
method is mandatory - at least in the version I made from Peter's
original -, so is the string, length is the only optional argument, and
$(hash --method SHA1 --length 8 $BLAH) is longer than $(hash SHA1 $BLAH
8) and is not easier to understand, either, as far as I'm concerned.
Nevermind that. If we want to support multiple positional arguments,
then --length needs to be a dashed-option anyway. I'd keep the rest (ie,
method) as positional though.

I'll push the updated code to the same branch in a couple of hours.

Thanks, another review, another set of comments (seems I see things
iteratively, larger stuff first, smaller stuff later, sorry about that).

0) argument parsing is still using positional arguments

let's discuss this IRL which form to use.

I've updated it to have a "--length 8"-style argument and made it easier and less copy-paste-y to add new algorithms:

commit 576a2d4b99ca72ca48013080c573f2fd7c8737d1
Author: Peter Gyongyosi <gyp@balabit.hu>
Date:   Mon Aug 6 16:15:52 2012 +0200

    tfhash: followed-up changes in unit tests and fixed a regression they revealed
   
    Signed-off-by: Peter Gyongyosi

commit 9883210491ebbf4481f8ab6c6d14d975e465d12d
Author: Peter Gyongyosi <gyp@balabit.hu>
Date:   Mon Aug 6 19:04:13 2012 +0200

    tfhash: argument handling reworked and added support for more digest algorithms
   
    Length can now be specified using --length or -l instead of a positional
    argument and all arguments are concatenated when calculating the hash.
    Also, it now takes almost no effort and adds no redundancy to add support
    for a new digest algorithm.
   
    Signed-off-by: Peter Gyongyosi <gyp@balabit.hu>

The final way to use it can be seen well from these examples from the tests:

  testcase(msg, "$(sha1 foo)", "0beec7b5ea3f0fdbc95d0dd47f3c5bc275da8a33");
  testcase(msg, "$(md5 foo)", "acbd18db4cc2f85cedef654fccc4a4d8");
  testcase(msg, "$(sha1 --length 5 foo)", "0beec");
  testcase(msg, "$(sha1 foo bar)", "8843d7f92416211de9ebb963ff4ce28125932878");
  testcase(msg, "$(sha1 \"foo bar\")", "3773dea65156909838fa6c22825cafe090ff8030");



1) parse_int()

we do have a scan_int() function in str-format.c already, isn't that
enough?

I'd love to get rid of duplicated functions if possible.

For now, I've reverted this change and moved back parse_int() into basicfuncs in this patch where it was originally:

commit 98d19f26269207b48ddc30c5bc48a6cdae7e044a
Author: Peter Gyongyosi <gyp@balabit.hu>
Date:   Mon Aug 6 19:54:58 2012 +0200

    tfhash: moved back parse_int() to the basicfuncs module as tfhash no longer needs it
   
    Signed-off-by: Peter Gyongyosi <gyp@balabit.hu>


I know it's not the nicest solution, but scan_int() in str-format.c is does bit too much magic for me to understand now... if possible, I'd leave this to you.


2) right now cryptofuncs is linked against OPENSSL, so there's really no
point in using #ifdef ENABLE_SSL there, is there?

we might as well compile cryptofuncs conditionally, if openssl is
present do it, otherwise don't. The drawback seems to be that the
$(uuid) was available until now with libuuid too.

this is just a decision, I'm fine by a "thin" cryptofuncs in case
openssl was not found, but the original naming came because I figured
these would depend on openssl. I simply forgot the libuuid based
implementation.

I vote for a thin-without-SSL-cryptofuncs as well. I've merged in Gergely's cryptofuncs branch and solved the remaining problems here:

commit d3511efbbacfa78ee0f56ccbce791e9f4c3e287e
Merge: 9883210 00c5df4
Author: Peter Gyongyosi <gyp@balabit.hu>
Date:   Mon Aug 6 19:45:20 2012 +0200

    Merge remote-tracking branch 'algernon/feature/3.4/cryptofuncs' into feature/3.4/tfhash
   
    Conflicts:
        modules/basicfuncs/str-funcs.c
        tests/unit/test_template.c


3) digest naming and hash length

by using the EVP_MD API in openssl, you might get rid of the if-else-if
construct on the hash name, you'd get all openssl supported algorithms
for free.

Also, by using argv[0] to help parsing, we might have a single function
that implements all $(hash) and the actual hashing names. Certainly this
function would have to be registered under a number of names, just like
symlinks that point to the same executable in the filesystem.

Along with the changes around the arguments, I've changed it to use EVP_MD_API, too. Now the only place a new digest is mentioned is in cryptofuncs.c, here:

static Plugin cryptofuncs_plugins[] =
{
  TEMPLATE_FUNCTION_PLUGIN(tf_uuid, "uuid"),
#if ENABLE_SSL
  TEMPLATE_FUNCTION_PLUGIN(tf_hash, "hash"),
  TEMPLATE_FUNCTION_PLUGIN(tf_hash, "sha1"),
  TEMPLATE_FUNCTION_PLUGIN(tf_hash, "sha256"),
  TEMPLATE_FUNCTION_PLUGIN(tf_hash, "sha512"),
  TEMPLATE_FUNCTION_PLUGIN(tf_hash, "md4"),
  TEMPLATE_FUNCTION_PLUGIN(tf_hash, "md5"),
#endif
};

...and of course the unit tests.

("hash" is a default when you do not care and which currently defaults to md5).



4) longer term I'd split source code into separate files if there'd be
more topics than there's now, just like it is done with the basicfuncs
module to aid maintenance.

there not all modules are separate compilation units, but still make it
simple to maintain the sources.

I'd do it when it'll be necessary, not right now.


Let me know if you want me to change these or you can finish this up.
Timewise it'd be best if you could do it, but I know I can be pesky at
times. :)

    No, you are not :)

greets,
Peter