Re: multiple table scan performance

From: Marti Raudsepp <marti(at)juffo(dot)org>
To: Samuel Gendler <sgendler(at)ideasculptor(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: multiple table scan performance
Date: 2011-03-30 00:05:08
Message-ID: AANLkTi=KVHc1+f8td00U-d0KAJXuZ95JL0773-whYP4d@mail.gmail.com
Lists: pgsql-performance

On Wed, Mar 30, 2011 at 01:16, Samuel Gendler <sgendler(at)ideasculptor(dot)com> wrote:
> I've got some functionality that necessarily must scan a relatively large table

> Is there any performance benefit to revamping the workload such that it issues
> a single:
> insert into (...) select ... UNION select ... UNION select
> as opposed to 3 separate "insert into () select ..." statements.

Apparently not, as explained by Claudio Freire. This seems like a missed
opportunity for the planner, however. If it scanned all three UNION
subqueries in parallel, the synchronized seqscans feature would kick
in and the physical table would be read only once instead of 3 times.

(I'm assuming that seqscan disk access is your bottleneck)

You can trick Postgres (8.3.x and newer) into doing it in parallel
anyway: open 3 separate database connections and issue one of the
'INSERT INTO ... SELECT' parts on each, all at the same time. Because
the concurrent scans can then share a single pass over the table, all
the queries should finish in about 1/3 the time, compared to running
them one after another in one session or with UNION ALL.
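
A rough sketch of the three-connection approach in Python, in case it
helps. Everything named here is my own assumption, not something from
your setup: `connect` is any function that returns a new DB-API
connection, and the statements are whatever three 'INSERT INTO ...
SELECT' parts you already have. Each statement gets its own connection
and its own thread, so the three seqscans start at roughly the same time.

```python
from concurrent.futures import ThreadPoolExecutor


def run_statement(connect, sql):
    """Open a dedicated connection, run one statement, commit, close."""
    conn = connect()  # hypothetical factory returning a DB-API connection
    try:
        cur = conn.cursor()
        cur.execute(sql)
        conn.commit()
    finally:
        conn.close()
    return sql


def run_in_parallel(connect, statements):
    """Run every statement concurrently, one connection per statement."""
    with ThreadPoolExecutor(max_workers=len(statements)) as pool:
        futures = [pool.submit(run_statement, connect, sql)
                   for sql in statements]
        # result() re-raises any exception from the worker thread
        return [f.result() for f in futures]
```

With psycopg2, for example, `connect` could be something like
`lambda: psycopg2.connect(dsn)` where `dsn` is your connection string.
Note that the three inserts commit independently, so this is not
equivalent to a single statement if you need all-or-nothing behaviour.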

Regards,
Marti
