pgsql: Fix an O(N^2) performance issue for sessions modifying many rela

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-committers(at)postgresql(dot)org
Subject: pgsql: Fix an O(N^2) performance issue for sessions modifying many rela
Date: 2013-01-20 18:45:55
Message-ID: E1Twztz-0003MV-2I@gemulon.postgresql.org
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-committers

Fix an O(N^2) performance issue for sessions modifying many relations.

AtEOXact_RelationCache() scanned the entire relation cache at the end of
any transaction that created a new relation or assigned a new relfilenode.
Thus, clients such as pg_restore had an O(N^2) performance problem that
would start to be noticeable after creating 10000 or so tables. Since
typically only a small number of relcache entries need any cleanup, we
can fix this by keeping a small list of their OIDs and doing hash_searches
for them. We fall back to the full-table scan if the list overflows.

Ideally, the maximum list length would be set at the point where N
hash_searches would cost just less than the full-table scan. Some quick
experimentation says that point might be around 50-100; I (tgl)
conservatively set MAX_EOXACT_LIST = 32. For the case that we're worried
about here, which is short single-statement transactions, it's unlikely
there would ever be more than about a dozen list entries anyway; so it's
probably not worth being too tense about the value.

We could avoid the hash_searches by instead keeping the target relcache
entries linked into a list, but that would be noticeably more complicated
and bug-prone because of the need to maintain such a list in the face of
relcache entry drops. Since a relcache entry can only need such cleanup
after a somewhat-heavyweight filesystem operation, trying to save a
hash_search per cleanup doesn't seem very useful anyway --- it's the scan
over all the not-needing-cleanup entries that we wish to avoid here.

Jeff Janes, reviewed and tweaked a bit by Tom Lane

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/d5b31cc32b0762fa8920a9c1f70af2f82fb0aaa5

Modified Files
--------------
src/backend/utils/cache/relcache.c | 174 ++++++++++++++++++++++++++----------
1 files changed, 128 insertions(+), 46 deletions(-)

Browse pgsql-committers by date

  From Date Subject
Next Message Peter Eisentraut 2013-01-21 00:49:44 pgsql: doc: Fix syntax of a URL
Previous Message Magnus Hagander 2013-01-20 15:13:37 pgsql: Clarify that streaming replication can be both async and sync