[PATCH (3.4) 0/1]: value-pairs: Add value_pairs_foreach_sorted().
The patch that will follow shortly implements a value_pairs_foreach_sorted() function, that can be used for multiple purposes in the future: - Template functions that want to format their output with keys always in a predictable order. - Various other parts of the code, that work better when operating on a sorted collection. I'm submitting it separately, so that it'll be easier for me to build upon it in the not too distant future. (And because I almost forgot to send it, even though it was origianlly made in early november...)
The new function can be used to sort the keys before iterating over them. Signed-off-by: Gergely Nagy <algernon@balabit.hu> --- lib/value-pairs.c | 46 ++++++++++++++++++++++++++++++++++++++++------ lib/value-pairs.h | 3 +++ 2 files changed, 43 insertions(+), 6 deletions(-) diff --git a/lib/value-pairs.c b/lib/value-pairs.c index 6ecd810..b68296a 100644 --- a/lib/value-pairs.c +++ b/lib/value-pairs.c @@ -279,11 +279,10 @@ vp_merge_set(ValuePairs *vp, LogMessage *msg, gint32 seq_num, ValuePairSpec *set scratch_buffer_release(sb); } -void -value_pairs_foreach (ValuePairs *vp, VPForeachFunc func, - LogMessage *msg, gint32 seq_num, gpointer user_data) +static GHashTable * +value_pairs_foreach_prepare (ValuePairs *vp, LogMessage *msg, gint32 seq_num) { - gpointer args[] = { vp, func, msg, GINT_TO_POINTER (seq_num), user_data, NULL }; + gpointer args[] = { vp, NULL, msg, GINT_TO_POINTER (seq_num), NULL, NULL }; GHashTable *scope_set; scope_set = g_hash_table_new_full(g_str_hash, g_str_equal, @@ -315,12 +314,47 @@ value_pairs_foreach (ValuePairs *vp, VPForeachFunc func, /* Merge the explicit key-value pairs too */ g_hash_table_foreach(vp->vpairs, (GHFunc) vp_pairs_foreach, args); - /* Aaand we run it through the callback! */ - g_hash_table_foreach(scope_set, (GHFunc)func, user_data); + return scope_set; +} +static void +value_pairs_foreach_teardown (ValuePairs *vp, GHashTable *scope_set) +{ g_hash_table_destroy(scope_set); } +void +value_pairs_foreach (ValuePairs *vp, VPForeachFunc func, + LogMessage *msg, gint32 seq_num, gpointer user_data) +{ + GHashTable *scope_set = value_pairs_foreach_prepare (vp, msg, seq_num); + + g_hash_table_foreach(scope_set, (GHFunc)func, user_data); + + value_pairs_foreach_teardown (vp, scope_set); +} + +void +value_pairs_foreach_sorted(ValuePairs *vp, GCompareFunc cmpf, VPForeachFunc func, + LogMessage *msg, gint32 seq_num, + gpointer user_data) +{ + GHashTable *scope_set = value_pairs_foreach_prepare (vp, msg, seq_num); + GList *keys; + gint i; + + keys = g_list_sort (g_hash_table_get_keys (scope_set), cmpf); + + for (i = 0; i < g_list_length (keys); i++) + { + const gchar *k = (const gchar *)(g_list_nth (keys, i)->data); + + func (k, (const gchar *)g_hash_table_lookup (scope_set, k), user_data); + } + g_list_free (keys); + + value_pairs_foreach_teardown (vp, scope_set); +} static void value_pairs_init_set(ValuePairSpec *set) diff --git a/lib/value-pairs.h b/lib/value-pairs.h index 9b77266..30aa443 100644 --- a/lib/value-pairs.h +++ b/lib/value-pairs.h @@ -40,6 +40,9 @@ void value_pairs_add_transforms(ValuePairs *vp, gpointer *vpts); void value_pairs_foreach(ValuePairs *vp, VPForeachFunc func, LogMessage *msg, gint32 seq_num, gpointer user_data); +void value_pairs_foreach_sorted(ValuePairs *vp, GCompareFunc cmpf, VPForeachFunc func, + LogMessage *msg, gint32 seq_num, + gpointer user_data); ValuePairs *value_pairs_new(void); void value_pairs_free(ValuePairs *vp); -- 1.7.7.3
Hi, On Thu, 2011-12-22 at 12:06 +0100, Gergely Nagy wrote:
The new function can be used to sort the keys before iterating over them.
Signed-off-by: Gergely Nagy <algernon@balabit.hu> --- lib/value-pairs.c | 46 ++++++++++++++++++++++++++++++++++++++++------ lib/value-pairs.h | 3 +++ 2 files changed, 43 insertions(+), 6 deletions(-)
diff --git a/lib/value-pairs.c b/lib/value-pairs.c index 6ecd810..b68296a 100644 --- a/lib/value-pairs.c +++ b/lib/value-pairs.c @@ -279,11 +279,10 @@ vp_merge_set(ValuePairs *vp, LogMessage *msg, gint32 seq_num, ValuePairSpec *set scratch_buffer_release(sb); }
-void -value_pairs_foreach (ValuePairs *vp, VPForeachFunc func, - LogMessage *msg, gint32 seq_num, gpointer user_data) +static GHashTable * +value_pairs_foreach_prepare (ValuePairs *vp, LogMessage *msg, gint32 seq_num) { - gpointer args[] = { vp, func, msg, GINT_TO_POINTER (seq_num), user_data, NULL }; + gpointer args[] = { vp, NULL, msg, GINT_TO_POINTER (seq_num), NULL, NULL }; GHashTable *scope_set;
scope_set = g_hash_table_new_full(g_str_hash, g_str_equal, @@ -315,12 +314,47 @@ value_pairs_foreach (ValuePairs *vp, VPForeachFunc func, /* Merge the explicit key-value pairs too */ g_hash_table_foreach(vp->vpairs, (GHFunc) vp_pairs_foreach, args);
- /* Aaand we run it through the callback! */ - g_hash_table_foreach(scope_set, (GHFunc)func, user_data); + return scope_set; +}
+static void +value_pairs_foreach_teardown (ValuePairs *vp, GHashTable *scope_set) +{ g_hash_table_destroy(scope_set); }
+void +value_pairs_foreach (ValuePairs *vp, VPForeachFunc func, + LogMessage *msg, gint32 seq_num, gpointer user_data) +{ + GHashTable *scope_set = value_pairs_foreach_prepare (vp, msg, seq_num); + + g_hash_table_foreach(scope_set, (GHFunc)func, user_data); + + value_pairs_foreach_teardown (vp, scope_set); +} + +void +value_pairs_foreach_sorted(ValuePairs *vp, GCompareFunc cmpf, VPForeachFunc func, + LogMessage *msg, gint32 seq_num, + gpointer user_data) +{ + GHashTable *scope_set = value_pairs_foreach_prepare (vp, msg, seq_num); + GList *keys; + gint i; + + keys = g_list_sort (g_hash_table_get_keys (scope_set), cmpf); + + for (i = 0; i < g_list_length (keys); i++) + { + const gchar *k = (const gchar *)(g_list_nth (keys, i)->data); + + func (k, (const gchar *)g_hash_table_lookup (scope_set, k), user_data); + } + g_list_free (keys); + + value_pairs_foreach_teardown (vp, scope_set); +}
This is uberslow. - iterates over the hash table: O(n), plus a lot of allocations (GList) - sorts the list (O(logN) - iterates the list and then at each iteration it looks up the Nth element of the list: O(N*N) - for each iteration does a hash lookup: O(logN) I would love to use a different data structure that keeps keys/data in sorted form instead of a hash table, that might be faster than a hashtable in the first place and could be sorted regardless of the value-pairs invocation. What do you think? -- Bazsi
Balazs Scheidler <bazsi@balabit.hu> writes: [...]
This is uberslow.
Indeed, it is.
- iterates over the hash table: O(n), plus a lot of allocations (GList) - sorts the list (O(logN) - iterates the list and then at each iteration it looks up the Nth element of the list: O(N*N) - for each iteration does a hash lookup: O(logN)
I would love to use a different data structure that keeps keys/data in sorted form instead of a hash table, that might be faster than a hashtable in the first place and could be sorted regardless of the value-pairs invocation.
What do you think?
I agree, will try to think of something sensible during the holidays. -- |8]
participants (2)
-
Balazs Scheidler
-
Gergely Nagy