Re: pg_dump restore time and Foreign Keys

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump restore time and Foreign Keys
Date: 2008-06-05 13:27:50
Message-ID: 1212672470.19964.71.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On Thu, 2008-06-05 at 16:01 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Thu, 2008-06-05 at 10:19 +0300, Heikki Linnakangas wrote:
> >> Simon Riggs wrote:
> >>> I'm guessing that the WITHOUT CHECK option would not be acceptable as an
> >>> unprotected trap for our lazy and wicked users. :-)
> >> Yes, that sounds scary.
> >>
> >> Instead, I'd suggest finding ways to speed up the ALTER TABLE ADD
> >> FOREIGN KEY.
> >
> > I managed a suggestion for improving it for integers only, but if
> > anybody has any other ideas, I'm all ears.
>
> Well, one idea would be to allow adding multiple foreign keys in one
> command, and checking them all at once with one SQL query instead of one
> per foreign key. Right now we need one seq scan over the table per
> foreign key, by checking all references at once we would only need one
> seq scan to check them all.

No need. Just parallelise the restore with concurrent psql. Which would
speed up the index creation also. Does Greg have plans for further work?

> >> Or speeding up COPY into a table with foreign keys already
> >> defined. For example, you might want to build an in-memory hash table of
> >> the keys in the target table, instead of issuing a query on each INSERT,
> >> if the target table isn't huge.
> >
> > No, that's not the problem, but I agree that is a problem also.
>
> It is related, because if we can make COPY into a table with foreign
> keys fast enough, we could rearrange dumps so that foreign keys are
> created before loading data. That would save the seqscan over the table
> altogether.

True.

> Thinking about this idea a bit more, instead of loading the whole target
> table into memory, it would probably make more sense to keep a hash
> table as just a cache of the most recent keys that have been referenced.

If you can think of a way of improving hash joins generally, then it
will work for this specific case also.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2008-06-05 13:35:10 Re: pg_dump restore time and Foreign Keys
Previous Message Mario Weilguni 2008-06-05 13:17:46 Re: Logging conflicted queries on deadlocks