Re: pg_dump --snapshot

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump --snapshot
Date: 2013-05-07 12:54:54
Message-ID: 20130507125453.GP4361@tamriel.snowman.net
Lists: pgsql-hackers

* Andres Freund (andres(at)2ndquadrant(dot)com) wrote:
> > All of which I
> > think I agree with, but I don't agree with the conclusion that this
> > larger window is somehow acceptable because there's a very small window
> > (one which can't be made any smaller, today..) which exists today.
>
> The window isn't that small currently:

Agreed, but it also isn't currently possible to make it any smaller.

> b) Locking all relations in a big database can take a second or some,
> even if there are no conflicting locks.

Yes, I've noticed.. :( You can also run out of locks, which is quite
painful.
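(For anyone who runs into that: the shared lock table is sized by
max_locks_per_transaction, roughly max_locks_per_transaction *
(max_connections + max_prepared_transactions) entries, so dumping a
relation-heavy database may need something like:

```
# postgresql.conf -- the default of 64 is often too low for pg_dump
# against databases with many thousands of relations; requires restart
max_locks_per_transaction = 256
```

The exact value obviously depends on the installation.)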

> > Alright, then let's provide a function which will do that and tell
> > people to use it instead of just using pg_export_snapshot(), which
> > clearly doesn't do that.
>
> If it were clear cut what to lock and we had locks for
> everything. Maybe. But we don't have locks for everything.

My suggestion was to lock everything that pg_dump locks, which we
clearly have locks for since pg_dump is acquiring them. Also, I don't
believe it'd be that difficult to identify what pg_dump would lock, at
least in a 'default' whole-database run. This is more of a stop-gap
than a complete solution.
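Roughly, an untested sketch of what I mean, to be run immediately after
transaction start and before exporting the snapshot (this only covers
plain tables outside the system schemas; a real version would have to
mirror pg_dump's full object selection):

```sql
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;

DO $$
DECLARE
    rel regclass;
BEGIN
    -- Take the same ACCESS SHARE locks pg_dump would take on user tables
    FOR rel IN
        SELECT c.oid::regclass
        FROM pg_class c
        JOIN pg_namespace n ON n.oid = c.relnamespace
        WHERE c.relkind = 'r'
          AND n.nspname NOT IN ('pg_catalog', 'information_schema')
    LOOP
        EXECUTE format('LOCK TABLE %s IN ACCESS SHARE MODE', rel);
    END LOOP;
END $$;

-- Only now hand the snapshot out
SELECT pg_export_snapshot();
```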

> So we would
> need to take locks preventing any modification on any of system catalogs
> which doesn't really seem like a good thing, especially as we can't
> release them from sql during the dump were we can allow creation of
> temp tables and everything without problems.

That's already an issue when pg_dump runs, no? Not sure why this is
different.

> Also, as explained above, the problem already exists in larger
> timeframes than referenced in this thread, so I really don't see how
> anything thats only based on plain locks on user objects can solve the
> issue in a relevant enough way.

The point is to try to avoid making the problem worse..

> I am comparing the time between 'snapshot acquiration' and 'getting
> the object list' with the time between 'getting the object list' and
> 'locking the object list'. What I am saying is that in many scenarios
> the second part will be the bigger part.

I can see how that can happen, sure.

> > I believe the main argument here is really around "you should think
> > about these issues before just throwing this in" and not "it must be
> > perfect before it goes in". Perhaps "it shouldn't make things *worse*
> > than they are now" would also be apt..
>
> That's not how I read 8465(dot)1367860037(at)sss(dot)pgh(dot)pa(dot)us :(

I believe the point that Tom is making is that we shouldn't paint
ourselves into a corner by letting users provide old snapshots to
pg_dump which haven't acquired any of the necessary locks. The goal, at
least as I read it, is to come up with a workable design (and I don't
know that we have, but still) which provides a way for the locks to be
taken at least as quickly as what pg_dump does today and which we could
modify down the road to take the locks pre-snapshot (presuming we can
figure out a way to make that work).

The proposed patch certainly doesn't make any attempt to address that
issue and would encourage users to expose themselves to this risk more
than they are exposed today w/ pg_dump.

> I think there is no point in fixing it somewhere else. The problem is in
> pg_dump, not the snapshot import/export.

It's really a problem for just about everything that uses transactions
and locking, isn't it? pg_dump just happens to have it worst since it
wants to go and touch every object in the database. It's certainly
possible for people to connect to the DB, look at pg_class and then try
to access some object which no longer exists (even though it's in
pg_class). This will be an interesting thing to consider when
implementing MVCC for the catalog.
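A trivial illustration of that race (table name made up):

```sql
-- Session A:
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT relname FROM pg_class WHERE relname = 'victim';
-- row is visible under A's MVCC snapshot

-- Session B, meanwhile:
--   DROP TABLE victim;

-- Back in session A: the relation lookup at open time doesn't use A's
-- snapshot, so this fails with "relation does not exist" even though
-- pg_class just showed the row:
SELECT * FROM victim;
```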

> You did suggest how it can be fixed? You mean
> 20130506214515(dot)GL4361(at)tamriel(dot)snowman(dot)net?

I suggested how it might be done. :) There are undoubtedly issues with an
all-database-objects lock, but it would certainly reduce the time
between transaction start and getting all the locks acquired and shrink
the window that much more. If we did implement such a beast, how could
we ensure that the locks were taken immediately after transaction start
if the snapshot is being passed to pg_dump? Basically, if we *did*
solve this issue for pg_dump in some way in the future, how would we use
it if pg_dump can accept an outside snapshot?

One other thought did occur to me- we could simply have pg_dump export
the snapshot that it gets to stdout, a file, whatever, and systems which
are trying to do this magic "everyone gets the same view" could glom
onto the snapshot created by pg_dump, after all the locks have been
acquired..
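i.e., once pg_dump has its locks and has published its snapshot, any
other session could do something like (the snapshot identifier here is
just the example from the 9.2 docs):

```sql
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SET TRANSACTION SNAPSHOT '000003A1-1';
-- queries here see exactly the data pg_dump is dumping
COMMIT;
```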

Thanks,

Stephen
