Re: foreign key locks, 2nd attempt

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: foreign key locks, 2nd attempt
Date: 2012-03-13 17:46:24
Message-ID: CA+TgmobONEodK-M4CF5Ov0UONxb-5d5ru4QDnL77OqzBcEnLNw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 12, 2012 at 3:28 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> I agree with you that some worst case performance tests should be
> done. Could you please say what you think the worst cases would be, so
> those can be tested? That would avoid wasting time or getting anything
> backwards.

I've thought about this some and here's what I've come up with so far:

1. SELECT FOR SHARE on a large table on a system with no write cache.

2. A small parent table (say 30 rows or so) and a larger child table
with a many-to-one FK relationship to the parent (say 100 child rows
per parent row), with heavy update activity on the child table, on a
system where fsyncs are very slow. This should generate lots of mxid
consumption, and every 1600 or so mxids (I think) we've got to fsync;
does that generate a noticeable performance hit?

3. It would be nice to test the impact of increased mxid lookups in
the parent, but I've realized that the visibility map will probably
mask a good chunk of that effect, which is a good thing. Still, maybe
something like this: a fairly large parent table, say a million rows,
but narrow rows, so that many of them fit on a page, with frequent
reads and occasional updates (if there are only reads, autovacuum
might end with all the visibility map bits set); plus a child table
with one or a few rows per parent which is heavily updated. In theory
this ought to be good for the patch, since the the more fine-grained
locking will avoid blocking, but in this case the parent table is
large enough that you shouldn't get much blocking anyway, yet you'll
still pay the cost of mxid lookups because the occasional updates on
the parent will clear VM bits. This might not be the exactly right
workload to measure this effect, but if it's not maybe someone can
devote a little time to thinking about what would be.

4. A plain old pgbench run or two, to see whether there's any
regression when none of this matters at all...

This isn't exactly a test case, but from Noah's previous comments I
gather that there is a theoretical risk of mxid consumption running
ahead of xid consumption. We should try to think about whether there
are any realistic workloads where that might actually happen. I'm
willing to believe that there aren't, but not just because somebody
asserts it. The reason I'm concerned about this is because, if it
should happen, the result will be more frequent anti-wraparound
vacuums on every table in the cluster. Those are already quite
painful for some users.

It would be nice if Noah or someone else who has reviewed this patch
in detail could comment further. I am shooting from the hip here, a
bit.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2012-03-13 17:53:34 Re: patch for parallel pg_dump
Previous Message Greg Stark 2012-03-13 17:46:06 Re: pg_upgrade and statistics