Re: is parallel union all possible over dblink?

From: Svetlin Manavski <svetlin(dot)manavski(at)gmail(dot)com>
To: Greg Spiegelberg <gspiegelberg(at)gmail(dot)com>
Cc: Marinos Yannikos <mjy(at)geizhals(dot)at>, pgsql-performance(at)postgresql(dot)org
Subject: Re: is parallel union all possible over dblink?
Date: 2011-06-30 09:02:07
Message-ID: BANLkTim7gViO93cRhBPCjd-+n96mTTrORw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I am now a bit puzzled after the initial satisfaction by Marinos' reply.

1. what do you mean exactly by "to ensure your UNION succeeds". The dblink
docs do not mention anything about issues using directly the suggested
dblink_send_query() + dblink_get_results(). What problems should I expect in
using them as suggested by Marinos?

2. If I understand correctly your method, it is not applicable from inside a
stored procedure, is it? I need to keep all the business logic within
PostgreSQL and provide just a clean interface to a simple GUI layer

3. Unfortunately GridSQL and Pgpool-II do not seem mature and stable
products to be used in commercial software. Neither one provides clear
documentation. GridSQL has been discontinued and it is not clear what kind
of future it will have. I have not tried GridSQL but I did try Pgpool-II. It
is disappointing that it may stop working correctly even just because of the
way you write the query (e.g. using uppercase in a field or using named
field in group by, ecc.). Even worse, when it does not recognize something
in the parallel query, it just provides incorrect result (from only the
local DB) rather than raising an exception. So I guess Pgpool-II in its
current state is good only for very simple applications, which are not
supposed to be reliable at all.

Thank you,
Svetlin Manavski

On Wed, Jun 29, 2011 at 8:14 PM, Greg Spiegelberg <gspiegelberg(at)gmail(dot)com>wrote:

>
> This does work however you'll need to add a little more to it to ensure
> your UNION succeeds. In pseudo...
>
> connection #1:
> CREATE TABLE target_1 ...
> BEGIN;
> LOCK TABLE target_1 IN ACCESS EXCLUSIVE MODE;
> INSERT INTO target_1 SELECT ...
> COMMIT;
>
> connection #2:
> CREATE TABLE target_2 ...
> BEGIN;
> LOCK TABLE target_2 IN ACCESS EXCLUSIVE MODE;
> INSERT INTO target_2 SELECT ...
> COMMIT;
>
> connection #3:
> SELECT * FROM target_1 UNION SELECT * FROM target_2;
>
> Connections 1 and 2 can be done in simultaneously and after both have
> reached the LOCK statement then the SELECT on connection 3 can be executed.
> Same fundamentals if all three connections are to different databases and
> connection 3 uses dblink to pull the data.
>
> Another alternative is to use GridSQL. I haven't used it myself but seen
> it in action on a large install with 4 backend databases. Pretty slick.
>
> Greg
>
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Samuel Gendler 2011-06-30 09:40:10 Re: near identical queries have vastly different plans
Previous Message Samuel Gendler 2011-06-30 08:53:14 near identical queries have vastly different plans