Re: Idea for improving speed of pg_restore

From: cbbrowne(at)acm(dot)org
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Idea for improving speed of pg_restore
Date: 2003-09-16 20:49:54
Message-ID: 60isnsh30t.fsf@dev6.int.libertyrms.info
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

tgl(at)sss(dot)pgh(dot)pa(dot)us (Tom Lane) writes:
> Christopher Browne <cbbrowne(at)acm(dot)org> writes:
>> Restoring a database involves, for each table:
>> 1. Reading table data from the source file;
>> 2. Writing data to the database file for the table;
>> 3. After that, reading the database file data, and
>> 4. Writing the sorted bits to the index file.
>> 5. Along with all of this, HEFTY amounts of updates to WAL.
>
> An idea that Marc and Jan and I were kicking around last night was
> to offer a GUC option to disable writes to WAL. During initial data
> load, you might as well go back to initdb if you have any failure,
> so why bother with full ACID compliance? I'm not sure if the
> performance benefit would be great enough to make it worth equipping
> the system with such a large-caliber foot-gun, but it's something to
> think about.

That is a good thought.

To make it _marginally_ less unsafe, it might be an idea to only
enable this in some variation on single user mode so that it is made
manifestly clear that this is a "recovery" process and that nobody
else should be poking their head in on.

(Or, in keeping with the metaphor, it involves having a large flashing
light that says "Stay off the range - live fire artillery exercise
under way!" "Make sure this isn't pointed at Mabel or any other
highly trained monkeys or databases you want to keep around!"
<http://cbbrowne.com/info/images/shooting.jpg> Similarly, if my
brother gets behind the wheel of Certain Large Vehicles, it is good to
keep any lurking trees out of the way...
<http://www3.ns.sympatico.ca/coffee/pics/brdtnk2.gif> :-))

> I tend to agree with your doubts about parallelizing index builds,
> but there may be scenarios where it's a win; it'd depend on your
> relative CPU and disk horsepower. (Consider fast disk and multiple
> not-so-fast CPUs; serial index builds can only use one of the CPUs.)
> Question is, is it a big enough win for enough people to make it
> worth supporting?

That one would take some benchmarking to see if/where it would be a
win. And "fast disk and multiple cheezy CPUs" sounds pretty atypical
in these days of Opterons and multiple GHz Intel hardware. That being
said, I could point to a _perfect_ server to try it out on, with the
problem that it isn't representative of the sort of 'production'
environment where you'd actually care about tuning the builds.
Certainly doesn't fit into the set of things I can afford to kludge
into my schedule :-(.
--
let name="aa454" and tld="freenet.carleton.ca" in String.concat "@" [name;tld];;
http://www.ntlug.org/~cbbrowne/x.html
If nothing ever sticks to Teflon, how do they make Teflon stick to the
pan?

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Andrew Rawnsley 2003-09-16 20:51:22 Re: State of Beta 2
Previous Message scott.marlowe 2003-09-16 20:44:17 Re: Idea for improving speed of pg_restore