[Fwd: dblink patch - Asynchronous queries and parallel execution]

From: Joe Conway <mail(at)joeconway(dot)com>
To: "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>
Cc: Kai Londenberg <K(dot)Londenberg(at)librics(dot)de>
Subject: [Fwd: dblink patch - Asynchronous queries and parallel execution]
Date: 2006-07-24 18:19:54
Message-ID: 44C50F4A.7050708@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches

I just received this (offlist), and have not had a chance to review it
myself yet, but figured I should post it now in case others want to have
a look and comment or discuss before feature freeze.

If there are no major objections to the concept, I'll take
responsibility to review and commit once I'm through with the "Values
list-of-targetlists" stuff.

(I'm not sure where we finished off with the discussion of PATCHES vs
HACKERS list for this kind of stuff, so I'm going to send another copy
of this to HACKERS without the attachement)

Thanks,

Joe

-------- Original Message --------
Subject: dblink patch - Asynchronous queries and parallel execution
Date: Mon, 24 Jul 2006 12:47:51 +0200
From: Kai Londenberg <K(dot)Londenberg(at)librics(dot)de>
To: mail(at)joeconway(dot)com

Hello,

I needed parallel query execution features for a project, and so I
modified the dblink module to add support
for asynchronous query execution.

I thought others might find these features useful as well, therefore I'd
like to contribute this to the
current Postgresql contrib/dblink codebase.

The code is based on the contrib/dblink code included with the current
8.1.4 version of PostgreSQL.
I'm including the entire modified contrib/dblink directory in archived form.

I modified dblink.c and dblink.sql.in, and created the file README.async

Hope you like it, and include it in a possible next version of dblink.

The code still needs some testing and code review. I made it work for
me, but I don't have any
experience writing Postgresql Extensions, and haven't touched C for a while.

The most important thing about this code is that it allows parallel
execution of queries on several
backend databases, and re-joining of their results. This solves a lot of
scalability problems.

This is my corresponding README.async file which describes my additions.

-----
dblink-async patch by Kai Londenberg (K(dot)Londenberg(at)librics(dot)de)
24.7.2006

All code is licensed under the same terms as the rest of the dblink
code.

SQL Function declarations have been added at the bottom of dblink.sql

Added functions:

int dblink_send_query(connstr text, sql text)
Sends a query to a remote server for asynchronous execution.

returns immediately without waiting for results.

returns 1 on success, or 1 on failure.
results *must* be fetched by dblink_get_result(connstr)
a running query may be cancelled by dblink_cancel_query(connstr)

dblink_get_result(connstr text[,bool fail_on_error])
retrieves the result of a query started by dblink_send_query.

Blocks until a result gets available.

This function *must* be called if dblink_send_query returned
a 1, even on cancelled queries - otherwise the connection
can't be used anymore.

dblink_get_connections()
List all open dblink connections by name.
Returns a comma separated string of all connection names.
Takes no params

Example: SELECT string_to_array(dblink_get_connections(), ',');

int dblink_is_busy(connstr)
returns 1 if connection is busy, 0 if it is not busy.
If this function returns 0, it is guaranteed that dblink_get_result
will not block.

text dblink_cancel_query(connstr)
Cancels a running query on a given connection.
returns "OK" on success, or an error message on failure.

Examples:

---- Example 1 - Union over parallel executed remote queries --

SELECT dblink_connect('dtest1', 'host=server1 port=5432 dbname=dtest_1
user=duser password=pass');
SELECT * from
dblink_send_query('dtest1', 'SELECT country_code, city from
world_cities where city like \'fe%\'') as t1;

SELECT dblink_connect('dtest2', 'host=server2 port=5432 dbname=dtest_2
user=duser password=pass');
SELECT * from
dblink_send_query('dtest2', 'SELECT country_code, city from
world_cities where city like \'fe%\'') as t1;

SELECT dblink_connect('dtest3', 'host=server3 port=5432 dbname=dtest_3
user=duser password=pass');
SELECT * from
dblink_send_query('dtest3', 'SELECT country_code, city from
world_cities where city like \'fe%\'') as t1;

CREATE TEMPORARY TABLE result AS
(SELECT * from dblink_get_result('dtest1') as t1(country_code text, city
text))
UNION
(SELECT * from dblink_get_result('dtest2') as t2(country_code text, city
text))
UNION
(SELECT * from dblink_get_result('dtest3') as t3(country_code text, city
text))
ORDER by city DESC LIMIT 100;

SELECT dblink_disconnect('dtest1');
SELECT dblink_disconnect('dtest2');
SELECT dblink_disconnect('dtest3');
SELECT * from result;

--- End of Example 1 ---

best regards,

Kai Londenberg

Attachment Content-Type Size
dblink.tar.gz application/x-gzip 21.5 KB

Responses

Browse pgsql-patches by date

  From Date Subject
Next Message Tom Lane 2006-07-24 18:39:12 Re: The vacuum-ignore-vacuum patch
Previous Message Alvaro Herrera 2006-07-24 18:14:08 Re: The vacuum-ignore-vacuum patch