[syslog-ng] [PATCH] value-pairs: Add value_pairs_foreach_sorted().

Balazs Scheidler bazsi at balabit.hu
Thu Dec 22 15:08:57 CET 2011


Hi,

On Thu, 2011-12-22 at 12:06 +0100, Gergely Nagy wrote:
> The new function can be used to sort the keys before iterating over
> them.
> 
> Signed-off-by: Gergely Nagy <algernon at balabit.hu>
> ---
>  lib/value-pairs.c |   46 ++++++++++++++++++++++++++++++++++++++++------
>  lib/value-pairs.h |    3 +++
>  2 files changed, 43 insertions(+), 6 deletions(-)
> 
> diff --git a/lib/value-pairs.c b/lib/value-pairs.c
> index 6ecd810..b68296a 100644
> --- a/lib/value-pairs.c
> +++ b/lib/value-pairs.c
> @@ -279,11 +279,10 @@ vp_merge_set(ValuePairs *vp, LogMessage *msg, gint32 seq_num, ValuePairSpec *set
>    scratch_buffer_release(sb);
>  }
>  
> -void
> -value_pairs_foreach (ValuePairs *vp, VPForeachFunc func,
> -		     LogMessage *msg, gint32 seq_num, gpointer user_data)
> +static GHashTable *
> +value_pairs_foreach_prepare (ValuePairs *vp, LogMessage *msg, gint32 seq_num)
>  {
> -  gpointer args[] = { vp, func, msg, GINT_TO_POINTER (seq_num), user_data, NULL };
> +  gpointer args[] = { vp, NULL, msg, GINT_TO_POINTER (seq_num), NULL, NULL };
>    GHashTable *scope_set;
>  
>    scope_set = g_hash_table_new_full(g_str_hash, g_str_equal,
> @@ -315,12 +314,47 @@ value_pairs_foreach (ValuePairs *vp, VPForeachFunc func,
>    /* Merge the explicit key-value pairs too */
>    g_hash_table_foreach(vp->vpairs, (GHFunc) vp_pairs_foreach, args);
>  
> -  /* Aaand we run it through the callback! */
> -  g_hash_table_foreach(scope_set, (GHFunc)func, user_data);
> +  return scope_set;
> +}
>  
> +static void
> +value_pairs_foreach_teardown (ValuePairs *vp, GHashTable *scope_set)
> +{
>    g_hash_table_destroy(scope_set);
>  }
>  
> +void
> +value_pairs_foreach (ValuePairs *vp, VPForeachFunc func,
> +		     LogMessage *msg, gint32 seq_num, gpointer user_data)
> +{
> +  GHashTable *scope_set = value_pairs_foreach_prepare (vp, msg, seq_num);
> +
> +  g_hash_table_foreach(scope_set, (GHFunc)func, user_data);
> +
> +  value_pairs_foreach_teardown (vp, scope_set);
> +}
> +
> +void
> +value_pairs_foreach_sorted(ValuePairs *vp, GCompareFunc cmpf, VPForeachFunc func,
> +                           LogMessage *msg, gint32 seq_num,
> +                           gpointer user_data)
> +{
> +  GHashTable *scope_set = value_pairs_foreach_prepare (vp, msg, seq_num);
> +  GList *keys;
> +  gint i;
> +
> +  keys = g_list_sort (g_hash_table_get_keys (scope_set), cmpf);
> +
> +  for (i = 0; i < g_list_length (keys); i++)
> +    {
> +      const gchar *k = (const gchar *)(g_list_nth (keys, i)->data);
> +
> +      func (k, (const gchar *)g_hash_table_lookup (scope_set, k), user_data);
> +    }
> +  g_list_free (keys);
> +
> +  value_pairs_foreach_teardown (vp, scope_set);
> +}

This is uberslow. 

- iterates over the hash table: O(n), plus a lot of allocations (GList)
- sorts the list (O(logN)
- iterates the list and then at each iteration it looks up the Nth
element of the list: O(N*N)
- for each iteration does a hash lookup: O(logN)

I would love to use a different data structure that keeps keys/data in
sorted form instead of a hash table, that might be faster than a
hashtable in the first place and could be sorted regardless of the
value-pairs invocation.

What do you think?

-- 
Bazsi




More information about the syslog-ng mailing list