Re: Is there an md5sum for tables?

From: Steve Atkins <steve(at)blighty(dot)com>
To: PostgreSQL General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Is there an md5sum for tables?
Date: 2008-04-02 16:43:49
Message-ID: 818D0EEB-0F3C-4C08-B096-B0539528A4E2@blighty.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


On Apr 1, 2008, at 8:56 PM, Michael Enke wrote:
> Hi all,
> I need to know if multiple tables have (may have most probably)
> identical content.
> Since I want a fast solution (which means not comparing tables row
> by row),
> I thought it would be a good idea to have an sql function operating
> on a table or view
> similar to md5sum on a file and only compare the generated hashes.
> I did not find such a function in the documentation.
>
> Is such a function already available?

No, but you could probably build it fairly easily from parts that
are already there.

>
> If not, does it make no sense to use such an approach?

Possibly, but generating the hash might be time consuming,
as you'd need to iterate through the whole table, probably
in a specific order, to generate the hash. So every time you
modified the table you'd have to do something like
"select * from table order by <something unique>" to
maintain a current hash. (If you used a weak hash, something
like XOR of the md5 of each row, then you could do it cheaply
incrementally - and that might even be a usable hash if there's
a unique constraint on some field).

Overall, though, this is a very non-database thing to want to
do. Iit sounds like you may have made some bad
decisions mapping your data into the database. Usually
you'd abstract out common data and refer to it via a foreign
key, or somesuch.

What's the actual application?

Cheers,
Steve

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Ian Sillitoe 2008-04-02 16:49:37 (FAQ?) JOIN condition - 'WHERE NULL = NULL'
Previous Message Steve Crawford 2008-04-02 16:43:20 Re: [GENERAL] SHA1 on postgres 8.3