Re: pg_dump --snapshot

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump --snapshot
Date: 2013-05-07 01:27:50
Message-ID: 20130507012750.GB24957@awork2.anarazel.de
Lists: pgsql-hackers

On 2013-05-06 21:07:36 -0400, Stephen Frost wrote:
> * Andres Freund (andres(at)2ndquadrant(dot)com) wrote:
> > On 2013-05-06 20:18:26 -0400, Stephen Frost wrote:
> > > Because parallel pg_dump didn't make the problem any *worse*..? This
> > > does. The problem existed before parallel pg_dump.
> >
> > Yes, it did.
>
> That's not entirely clear- are you agreeing with my statements, or
> not?

I am agreeing that it's a very old problem that existed before parallel
pg_dump.

> > > The API exposes it, yes, but *pg_dump* isn't any worse than it was
> > > before.
> >
> > No, but it's still broken. pg_dump without the parameter being passed
> > isn't any worse off after the patch has been applied. With the parameter
> > the window gets a bit bigger, sure...
>
> I'm not entirely following the distinction you're making here. What I
> think you're saying is that "pg_dump is still busted" and "pg_dump when
> the parameter isn't passed is busted" and "pg_dump creates a bigger
> window where it can break if the parameter is passed".

Yes, that's what I was trying to say.

> All of which I
> think I agree with, but I don't agree with the conclusion that this
> larger window is somehow acceptable because there's a very small window
> (one which can't be made any smaller, today..) which exists today.

The window isn't that small currently:

a) If one of our lock statements has to wait for a preexisting
conflicting lock we have to wait, possibly for a very long
time. All the while, some other objects are not locked by any backend.
b) Locking all relations in a big database can take a second or more,
even if there are no conflicting locks.
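
To make (a) and (b) concrete, here is a minimal sketch in Python. It is
only a model, not pg_dump's code: it assumes the locks are taken one after
another, and the table names and wait times are made up for illustration.

```python
from typing import Dict, List, Tuple


def unlocked_durations(lock_waits: List[Tuple[str, float]]) -> Dict[str, float]:
    """Return, per table, how long it stays unlocked after the object list is read.

    lock_waits is an ordered list of (table_name, seconds_spent_acquiring_lock).
    Locks are assumed to be taken sequentially, one statement after another.
    """
    elapsed = 0.0
    exposed: Dict[str, float] = {}
    for table, wait in lock_waits:
        elapsed += wait            # the lock is granted only after its wait
        exposed[table] = elapsed   # until then the table is not locked by us
    return exposed


# Hypothetical numbers: the lock on "b" waits 30 s for a conflicting lock.
waits = [("a", 0.001), ("b", 30.0), ("c", 0.001)]
result = unlocked_durations(waits)
```

In this model "c" stays unprotected for the whole time we wait on "b",
although nothing conflicts with "c" itself.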

> > Given that we don't have all that many types of objects we can lock,
> > that task isn't all that complicated.
>
> Alright, then let's provide a function which will do that and tell
> people to use it instead of just using pg_export_snapshot(), which
> clearly doesn't do that.

If it were clear-cut what to lock and we had locks for everything,
maybe. But we don't have locks for everything. So we would need to take
locks preventing any modification of any of the system catalogs, which
doesn't really seem like a good thing, especially as we can't release
them from SQL during the dump, where we could otherwise allow creation
of temp tables and everything else without problems.

Also, as explained above, the problem already exists in larger
timeframes than referenced in this thread, so I really don't see how
anything that's only based on plain locks on user objects can solve the
issue in a relevant enough way.
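
For reference, a rough sketch of what a "snapshot plus plain locks"
approach amounts to, written as Python that only builds the SQL text. The
helper names, the snapshot id and the table names are hypothetical; the
statements themselves (pg_export_snapshot(), SET TRANSACTION SNAPSHOT,
LOCK TABLE ... IN ACCESS SHARE MODE) are existing PostgreSQL commands.
Note that only tables can be locked this way, which is the limitation
described above.

```python
from typing import List


def quote_ident(name: str) -> str:
    """Double-quote an identifier, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def exporter_statements() -> List[str]:
    """Statements for the session that exports the snapshot."""
    return [
        "BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ",
        "SELECT pg_export_snapshot()",
    ]


def importer_statements(snapshot_id: str, tables: List[str]) -> List[str]:
    """Statements for a session that imports the snapshot and locks tables.

    Only tables are covered; there is no equivalent plain lock for, e.g.,
    functions or types, so those remain unprotected in this sketch.
    """
    stmts = [
        "BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ",
        "SET TRANSACTION SNAPSHOT '" + snapshot_id.replace("'", "''") + "'",
    ]
    for table in tables:
        stmts.append("LOCK TABLE " + quote_ident(table) + " IN ACCESS SHARE MODE")
    return stmts
```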

> > But I'd guess a very common usage
> > is to start the snapshot and immediately fork pg_dump. In that case the
> > window between snapshot acquisition and reading the object list is
> > probably smaller than the one between reading the object list and
> > locking.
>
> How would it be smaller..? I agree that it may only be a few seconds
> larger, but you're adding things to the front which the current code
> doesn't run, yet running everything the current code runs, so it'd have
> to be larger..

I am comparing the time between 'snapshot acquisition' and 'getting
the object list' with the time between 'getting the object list' and
'locking the object list'. What I am saying is that in many scenarios
the second part will be the bigger part.
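
Spelled out as a small Python sketch, with made-up timestamps (the class
and field names are hypothetical, nothing here is measured):

```python
from dataclasses import dataclass


@dataclass
class DumpTimeline:
    """Timestamps, in seconds, of the three events being compared."""
    snapshot_taken: float
    object_list_read: float
    all_locks_held: float

    def pre_list_window(self) -> float:
        # Time between snapshot acquisition and getting the object list.
        return self.object_list_read - self.snapshot_taken

    def lock_window(self) -> float:
        # Time between getting the object list and locking all of it.
        return self.all_locks_held - self.object_list_read


# Hypothetical run: pg_dump is forked right after the snapshot is exported.
timeline = DumpTimeline(snapshot_taken=0.0, object_list_read=0.2, all_locks_held=1.5)
```

With these numbers the first window is 0.2 s and the second about 1.3 s,
so the part that exists today would be the larger of the two.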

> > This all reads like a textbook case of "perfect is the enemy of good" to
> > me.
>
> I believe the main argument here is really around "you should think
> about these issues before just throwing this in" and not "it must be
> perfect before it goes in". Perhaps "it shouldn't make things *worse*
> than they are now" would also be apt..

That's not how I read 8465(dot)1367860037(at)sss(dot)pgh(dot)pa(dot)us :(

> > A rather useful feature has to fix a bug in pg_dump which a) has existed
> > for ages b) has yet to be reported to the lists c) is rather complicated
> > to fix and quite possibly requires proper snapshots for internals?

> I've not seen anyone calling for this to be fixed in pg_dump first,
> though I did suggest how that might be done.

I think there is no point in fixing it somewhere else. The problem is in
pg_dump, not the snapshot import/export.

You did suggest how it can be fixed? You mean
20130506214515(dot)GL4361(at)tamriel(dot)snowman(dot)net?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
