Re: pg_dump additional options for performance

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Jeff Davis" <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump additional options for performance
Date: 2008-02-26 22:36:29
Message-ID: 9194.1204065389@sss.pgh.pa.us
Lists: pgsql-hackers

Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>> We can easily, and backwards-compatibly, improve pg_restore to do
>> concurrent restores. Trying to make psql do something like this will
>> require a complete rewrite, and there is no prospect that it will work
>> for any input that didn't come from (an updated version of) pg_dump
>> anyway.

> The "complete rewrite" in this case would be the "concurrent psql" patch I
> submitted a while back.

Uh, no, that's not even the tip of the iceberg.

The problem with trying to manhandle psql for this purpose is that psql
is purely a reactive engine: it does what you tell it, when you tell it
to do it, and it knows nothing worth mentioning about the semantics of
the specific SQL commands you're passing through it. The
concurrent-sessions feature is cool but it does not alter that
fundamental property of the program. To make psql do the sort of things
being spoken of here would be completely outside its realm; it would
need to decide *on its own* when to issue what, and it would have to
acquire a whole lot of knowledge it doesn't now have in order to make
those decisions. That would be bolting on a ton of code that is
unrelated to psql's normal purposes, and would likely even interfere with
using psql for its normal purposes. (Would you like psql to suddenly
start making its own decisions about whether to submit a command you've
given it?)

I think a sane way to frame what Simon would like to accomplish
is not "turn psql into a parallel job scheduler" but "teach pg_restore
how to do parallel scheduling, and then see if it can be made to do
anything useful with plain-text instead of archive-dump input".
At least that way you're talking about something that's within the scope
of the program's purpose.
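
To make the distinction concrete, here is a minimal sketch of what "deciding
on its own when to issue what" involves. It is not pg_restore's actual code.
The item names, the ITEMS dependency map, and the schedule() and run names are
all hypothetical, chosen only for illustration. The point is that the
scheduler needs to know what each item depends on, which is exactly the
knowledge a purely reactive program does not have.

```python
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait

# Hypothetical restore items: name -> names that must finish first.
# (Illustrative only; not the real archive table-of-contents format.)
ITEMS = {
    "table_a": [],
    "table_b": [],
    "data_a": ["table_a"],
    "data_b": ["table_b"],
    "index_a": ["data_a"],
    "fk_b_to_a": ["data_a", "data_b"],
}


def schedule(items, run, workers=2):
    """Run every item, starting each only after its dependencies finish.

    Returns the item names in the order they were observed to complete.
    """
    remaining = {name: set(deps) for name, deps in items.items()}
    finished = []
    running = {}  # future -> item name
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while remaining or running:
            # Launch every item whose dependencies are all satisfied.
            ready = [n for n, deps in remaining.items() if not deps]
            for name in ready:
                del remaining[name]
                running[pool.submit(run, name)] = name
            if not running:
                # Nothing is runnable and nothing is in flight.
                raise ValueError("dependency cycle or unknown dependency")
            # Block until at least one in-flight item completes.
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for fut in done:
                name = running.pop(fut)
                fut.result()  # re-raise any error from the worker
                finished.append(name)
                # Completed items no longer block their dependents.
                for deps in remaining.values():
                    deps.discard(name)
    return finished
```

Calling schedule(ITEMS, run=print) would print each item name as it is
"restored"; the exact interleaving depends on thread timing, but no item
starts before the items it depends on have completed. A program fed a plain
stream of SQL text has no such dependency map to consult, which is why this
kind of logic fits an archive-aware tool better than a command pass-through.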

regards, tom lane
