Re: pg_dump: Sorted output, referential integrity

From: Christof Petig <christof(at)petig-baender(dot)de>
To: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>
Cc: Philip Warner <pjw(at)rhyme(dot)com(dot)au>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump: Sorted output, referential integrity
Date: 2001-12-10 08:21:40
Message-ID: 3C147094.DB58BBF4@petig-baender.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Christopher Kings-Lynne wrote:

> > > > but to manage the data in a version
> > > >control system you need it consistently sorted. So a flag to sort by
> > > >either primary key or left to right would be of great value. (--sorted
> > > >?)
> > >
> > > Not really very generalizable when you consider user defined types,
> > > triggers etc.
> >
> > Hmmm. But if we have a primary key on columns (A,B,C) and request the data
> > 'order by A,B,C' this should be portable, shouldn't it?
> > If we don't have a primary key simply ordering by 1,2,3,...n
> > should also work.
> > Or am I missing something?
>
> I can see how ordering a dump by the primary key would be a neat way of
> 'clustering' your data after a restore, however I have qualms about the
> scalability of such a scheme. What if someone has a 100GB table? They may
> have arranged things so that they never get a sort from it or something, or
> it might take ages. However I guess if it's an optional parameter it might
> be neat.
>
> My feeling is that it won't happen unless you actually code it into a patch
> that makes it a parameter to pg_dump. Having an actual patch is a great way
> of getting something you want done ;)
>
> Alternatively, have you tried just writing a PERL script (or some clever sed
> script) that will just sort the COPY FROM sections...?

That's beyond my perl skills. And I believe sed to be not the right tool. (hmm,
perhaps split (at 'COPY FROM' and at '\.'), then sort, then cat ... many
(perhaps big) temporary files, let the db do the hard work)

But making a patch to pg_dump is a matter of (say) up to 4 hours.
I'll do it since you seem to like it and nobody started doing it so far.

Christof

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Christopher Kings-Lynne 2001-12-10 08:53:49 Re: pg_dump: Sorted output, referential integrity
Previous Message Hiroshi Inoue 2001-12-10 07:38:57 Re: Bug #533: BLOB (lo type) objects could not be restored