Experimenting with hash tables inside pg_dump

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Experimenting with hash tables inside pg_dump
Date: 2021-10-21 22:27:25
Message-ID: 2595220.1634855245@sss.pgh.pa.us
Lists: pgsql-hackers

Today, pg_dump does a lot of internal lookups via binary search
in presorted arrays. I thought it might improve matters
to replace those binary searches with hash tables, theoretically
converting O(log N) searches into O(1) searches. So I tried making
a hash table indexed by CatalogId (tableoid+oid) with simplehash.h,
and replacing as many data structures as I could with that.

This makes the code shorter and (IMO anyway) cleaner, but

(a) the executable size increases by a few KB --- apparently, even
the minimum subset of simplehash.h's functionality is code-wasteful.

(b) I couldn't measure any change in performance at all. I tried
it on the regression database and on a toy DB with 10000 simple
tables. Maybe on a really large DB you'd notice some difference,
but I'm not very optimistic now.

So this experiment feels like a failure, but I thought I'd post
the patch and results for the archives' sake. Maybe somebody
will think of a way to improve matters. Or maybe it's worth
doing just to shorten the code?

regards, tom lane

Attachment Content-Type Size
use-simplehash-in-pg-dump-1.patch text/x-diff 24.6 KB
