Re: pg_dump additional options for performance

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: daveg <daveg(at)sonic(dot)net>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-patches(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-07-21 03:34:35
Message-ID: 18415.1216611275@sss.pgh.pa.us
Lists: pgsql-hackers, pgsql-patches

Stephen Frost <sfrost(at)snowman(dot)net> writes:
> * daveg (daveg(at)sonic(dot)net) wrote:
>> One observation, indexes should be built right after the table data
>> is loaded for each table, this way, the index build gets a hot cache
>> for the table data instead of having to re-read it later as we do now.

> That's not how pg_dump has traditionally worked, and the point of this
> patch is to add options to easily segregate the main pieces of the
> existing pg_dump output (main schema definition, data dump, key/index
> building).  Your suggestion brings up an interesting point that should
> pg_dump's traditional output structure change the "--schema-post-load"
> set of objects wouldn't be as clear to newcomers since the load and the
> indexes would be interleaved in the regular output.

Yeah.  Also, that is pushing into an entirely different line of
development, which is to enable multithreaded pg_restore.  The patch
at hand is necessarily incompatible with that type of operation, and
wouldn't be used together with it.

As far as the documentation/definition aspect goes, I think it should
just say the parts are
	* stuff needed before you can load the data
	* the data
	* stuff needed after loading the data
and not try to be any more specific than that.  There are corner cases
that will turn any simple breakdown into a lie, and I doubt that it's
worth trying to explain them all.  (Take a close look at the dependency
loop breaking logic in pg_dump if you doubt this.)
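
To make the three-way breakdown above concrete, here is a rough Python
sketch of sorting dump entries into those parts.  It is only an
illustration under stated assumptions: the object-type labels, the
section names, and the helpers section_of() and split_dump() are
hypothetical and are not pg_dump's actual code, and the sketch
deliberately ignores the corner cases mentioned above.

```python
# Hypothetical object-type labels, chosen for illustration only.
# Anything not listed as data or post-data is treated as "needed before
# you can load the data".
DATA_TYPES = {"TABLE DATA", "BLOBS", "SEQUENCE SET"}
POST_DATA_TYPES = {"INDEX", "CONSTRAINT", "FK CONSTRAINT", "TRIGGER", "RULE"}

SECTIONS = ("before-data", "data", "after-data")


def section_of(object_type):
    """Return which of the three parts an object type falls into."""
    if object_type in DATA_TYPES:
        return "data"
    if object_type in POST_DATA_TYPES:
        return "after-data"
    return "before-data"


def split_dump(entries):
    """Group (name, object_type) pairs by part, keeping their order."""
    parts = {section: [] for section in SECTIONS}
    for name, object_type in entries:
        parts[section_of(object_type)].append(name)
    return parts


if __name__ == "__main__":
    example = [
        ("public.orders", "TABLE"),
        ("public.orders", "TABLE DATA"),
        ("orders_pkey", "CONSTRAINT"),
        ("orders_customer_idx", "INDEX"),
    ]
    for section, names in split_dump(example).items():
        print(section, names)
```

A fixed lookup table like this is exactly the kind of simple breakdown
that the corner cases turn into a lie, which is the argument for
keeping the documented definition loose.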

I hadn't realized that Simon was using "pre-schema" and "post-schema"
to name the first and third parts.  I'd agree that this is confusing
nomenclature: it looks like it's trying to say that the data is the
schema, and the schema is not!  How about "pre-data" and "post-data"?

			regards, tom lane
