From: | "Florian G(dot) Pflug" <fgp(at)phlo(dot)org> |
---|---|
To: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
Cc: | Brian Hurt <bhurt(at)janestcapital(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Postgresql-Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: An idea for parallelizing COPY within one backend |
Date: | 2008-02-27 17:03:34 |
Message-ID: | 47C597E6.5060609@phlo.org |
Lists: | pgsql-hackers |
Andrew Dunstan wrote:
> Florian G. Pflug wrote:
>>> Would it be possible to determine when the copy is starting that this
>>> case holds, and not use the parallel parsing idea in those cases?
>>
>> In theory, yes. In practice, I don't want to be the one who has to
>> answer to an angry user who just suffered a major drop in COPY
>> performance after adding an ENUM column to his table.
>>
> I am yet to be convinced that this is even theoretically a good path to
> follow. Any sufficiently large table could probably be partitioned and
> then we could use the parallelism that is being discussed for pg_restore
> without any modification to the backend at all. Similar tricks could be
> played by an external bulk loader for third party data sources.
That assumes that some specific bulk loader like pg_restore, pgloader
or similar is used to perform the load. Plain libpq users would either
need to duplicate the logic these loaders contain, or wouldn't be able
to take advantage of fast loads.

Plus, I'd see this as a kind of testbed for gently introducing
parallelism into postgres backends (especially thinking about sorting
here). CPUs gain more and more cores, so in the long run I fear that we
will have to find ways to utilize more than one of those to execute a
single query.

But of course the architectural details need to be sorted out before any
credible judgement about the feasibility of this idea can be made...

regards, Florian Pflug