aggregate hash function

From: "Matthew Dennis" <mdennis(at)merfer(dot)net>
To: PGSQL <pgsql-general(at)postgresql(dot)org>
Subject: aggregate hash function
Date: 2008-01-30 19:53:52
Message-ID: e94d85500801301153u6b976e31m89e311c7134a0160@mail.gmail.com
Lists: pgsql-general

I need an aggregate hash function, something like "select
md5_agg(someTextColumn) from (select someTextColumn from someTable order by
someOrderingColumn)". I know that there is an existing MD5 function, but it
is not an aggregate. I have thought about writing a "concat" aggregate
function that would concatenate the input into one long string and then
calling MD5() on that, but that seems likely to have bad performance
implications (memory consumption, possibly spilling to disk, many large
memory copies, etc.), as it would build up the entire concatenated string
before hashing it.
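
To make the trade-off concrete, here is a minimal sketch outside the
database, in Python with the standard hashlib module. The function names
md5_concat and md5_streaming are made up for illustration and are not
PostgreSQL functions. The first builds the whole string before hashing, as
the "concat" aggregate would; the second feeds each row into a running MD5
context, so only the fixed-size hash state is carried between rows. Both
should produce the same digest.

```python
import hashlib


def md5_concat(rows):
    """Concatenate every row first, then hash the one big string."""
    # The full concatenated string lives in memory before hashing starts.
    joined = "".join(rows)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def md5_streaming(rows):
    """Hash row by row, keeping only the MD5 context between rows."""
    ctx = hashlib.md5()
    for row in rows:
        # update() is equivalent to hashing the concatenation of all inputs.
        ctx.update(row.encode("utf-8"))
    return ctx.hexdigest()


if __name__ == "__main__":
    rows = ["first row", "second row", "third row"]
    print(md5_concat(rows) == md5_streaming(rows))
```

An md5_agg along these lines would presumably need the aggregate state to
hold the internal MD5 context rather than a text value, which the SQL-level
md5() function does not expose.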

I also thought about making an aggregate function that keeps the MD5 result
as a string in the state, concatenates each new input with the current
state, hashes that, and uses the result as the new state. This avoids
building up a giant string only to traverse it at the end to get the MD5
sum. This approach would actually work for me, but it doesn't give me the
actual MD5 sum of the data, which is what I really want.
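
The chained approach can be sketched the same way, again in Python and
again with made-up names (md5_hex, md5_chained), assuming the initial state
is the empty string. It shows why the result is order-sensitive and cheap
to carry around, but in general differs from the MD5 of the concatenated
data once there is more than one row.

```python
import hashlib


def md5_hex(text):
    """Hex MD5 digest of a string, similar in spirit to SQL md5(text)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def md5_chained(rows):
    """Fold rows into a state: new_state = md5(old_state || row)."""
    state = ""  # assumed initial state: empty string
    for row in rows:
        # The state is always a 32-character hex digest after the first row.
        state = md5_hex(state + row)
    return state


if __name__ == "__main__":
    rows = ["first row", "second row"]
    chained = md5_chained(rows)
    plain = md5_hex("".join(rows))
    # The chained value is a valid fingerprint, but not the MD5 of the data.
    print(chained == plain)
```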

comments/ideas/suggestions?
