Re: Speed dblink using alternate libpq tuple storage

From:	Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To:	markokr(at)gmail(dot)com
Cc:	mmoncure(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com
Subject:	Re: Speed dblink using alternate libpq tuple storage
Date:	2012-01-30 09:06:57
Message-ID:	20120130.180657.220412574.horiguchi.kyotaro@oss.ntt.co.jp
Lists:	pgsql-hackers

Thank you for the comments. This is the revised version of the patch.

The performance gain is larger than expected. The measurement script
now runs the query via dblink ten times to get more stable timings, so
the figures are about ten times larger than the previous ones.

                       sec    % to Original
Original             : 31.5   100.0%
RowProcessor patch   : 31.3    99.4%
dblink patch         : 24.6    78.1%

The RowProcessor patch alone causes no loss, or gives a very small
gain, and the full patch gives about a 22% gain on the benchmark(*1).
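
For reference, the percentages in the table can be reproduced with a few lines of C. This is only a sketch of the arithmetic; the helper name relative_percent is made up for illustration and is not part of any patch.

```c
/* Elapsed time expressed as a percentage of the baseline time. */
static double
relative_percent(double seconds, double baseline_seconds)
{
    return 100.0 * seconds / baseline_seconds;
}

/* Gain is whatever is left of the baseline after the patch. */
static double
gain_percent(double seconds, double baseline_seconds)
{
    return 100.0 - relative_percent(seconds, baseline_seconds);
}
```

With the figures above, relative_percent(24.6, 31.5) is about 78.1 and gain_percent(24.6, 31.5) is about 21.9, which is where the 22% comes from.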

The modifications are listed below.

- PGresAttValue is no longer used for this mechanism; PGrowValue
has been added instead. PGresAttValue has been moved back to
libpq-int.h.

- pqAddTuple() is restored to its original form, and a new function
pqAddRow() is added for use as the RowProcessor. (The previous
pqAddTuple() implementation buggily mixed the two usages of
PGresAttValue.)

- PQgetRowProcessorParam has been dropped. The context parameter
is now passed as one of the arguments of RowProcessor().

- RowProcessor() returns int (used as a bool; is that the libpq
convention?) instead of void *. (Actually, void * had already become
useless as of the previous patch.)

- PQsetRowProcessorErrMes() is changed to do strdup internally.

- The callers of RowProcessor() no longer set null_field in
PGrowValue.value. In addition, the PGrowValue[] that RowProcessor()
receives has nfields + 1 elements, so that a rough size estimate
can be made as cols->value[nfields].value - cols->value[0].value -
something. The something here is 4 * nfields for protocol3 and
4 * (non-null fields) for protocol2. I fear that this applies
only to textual transfer...

- PQregisterRowProcessor() sets the default handler when given
NULL. (pg_conn|pg_result).rowProcessor is never NULL during its
lifetime.

- initStoreInfo() and storeHandler() have been given malloc error
handling.
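
To make the callback shape and the size estimate from the list above concrete, here is a minimal self-contained sketch. It does not use the real libpq headers: RowValue, Conn, register_row_processor() and estimate_row_size() are hypothetical stand-ins, not the patch's actual definitions. The estimate assumes protocol 3, where each field is preceded by a 4-byte length word, and assumes the extra trailing element points one length word past the last field's data, which is what would make the 4 * nfields figure work out.

```c
#include <stddef.h>

/* Hypothetical stand-in for the patch's PGrowValue: one column of a row. */
typedef struct
{
    int         len;    /* length in bytes (assumed field) */
    const char *value;  /* points into the receive buffer */
} RowValue;

/* Callback shape: context parameter passed in, int result used as bool. */
typedef int (*RowProcessor) (void *param, const RowValue *cols, int nfields);

/* Hypothetical connection object holding the registered callback. */
typedef struct
{
    RowProcessor proc;
    void        *param;
} Conn;

/* Default handler: accepts every row and does nothing with it. */
static int
default_processor(void *param, const RowValue *cols, int nfields)
{
    (void) param;
    (void) cols;
    (void) nfields;
    return 1;
}

/* Passing NULL installs the default, so conn->proc is never NULL. */
static void
register_row_processor(Conn *conn, RowProcessor proc, void *param)
{
    conn->proc = proc ? proc : default_processor;
    conn->param = proc ? param : NULL;
}

/*
 * Rough payload size of one row under the layout assumed above.
 * cols has nfields + 1 elements; the last one only marks the end.
 */
static size_t
estimate_row_size(const RowValue *cols, int nfields)
{
    ptrdiff_t   span = cols[nfields].value - cols[0].value;
    ptrdiff_t   overhead = 4 * (ptrdiff_t) nfields;

    return span > overhead ? (size_t) (span - overhead) : 0;
}
```

A receiver could call estimate_row_size() once per row to size its storage before copying the individual columns. As noted above, the number is only meaningful when the values arrive in a form whose wire length matches the stored length.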

Some more notes:

- getAnotherTuple()@fe-protocol2.c has not been tested at all.

- The uniform column size in the test data prevents realloc from
ever being executed in dblink... More testing should be done.
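
The realloc concern can be illustrated with a small sketch of a grow-on-demand column buffer. This is not the dblink patch's code: ColBuf and colbuf_store() are hypothetical, and the doubling policy with a 16-byte starting size is an assumption made only for the example. The point is that once the buffer fits the first value, equally sized values never reach the growth path again.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-column scratch buffer. */
typedef struct
{
    char   *buf;
    size_t  cap;
    int     nrealloc;   /* growth events after the first allocation */
} ColBuf;

/* Copies data into the buffer; returns 1 on success, 0 on out-of-memory. */
static int
colbuf_store(ColBuf *cb, const char *data, size_t len)
{
    if (len + 1 > cb->cap)
    {
        size_t  newcap = cb->cap ? cb->cap : 16;
        char   *p;

        while (newcap < len + 1)
            newcap *= 2;

        p = realloc(cb->buf, newcap);
        if (p == NULL)
            return 0;       /* old buffer is still valid; caller reports error */

        if (cb->cap != 0)
            cb->nrealloc++; /* count only real growth, not the first allocation */
        cb->buf = p;
        cb->cap = newcap;
    }

    memcpy(cb->buf, data, len);
    cb->buf[len] = '\0';
    return 1;
}
```

Feeding this buffer only 29-byte values, as in the benchmark data, allocates once and leaves nrealloc at 0, so the growth branch and its error path are never exercised. Test data with varying column lengths would be needed to cover them.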

regards,

=====
(*1) The benchmark was done as follows:

==test.sql
select dblink_connect('c', 'host=localhost dbname=test');
select * from dblink('c', 'select a,c from foo limit 2000000') as (a text, b bytea) limit 1;
...(repeat 9 times more)
select dblink_disconnect('c');
==

$ for i in $(seq 1 10); do time psql test -f test.sql; done

test=# select count(*),
min(length(a)) as a_min, max(length(a)) as a_max,
min(length(c)) as c_min, max(length(c)) as c_max from foo;

  count  | a_min | a_max | c_min | c_max
---------+-------+-------+-------+-------
 2000000 |    29 |    29 |    29 |    29
(1 row)

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment	Content-Type	Size
libpq_rowproc_20120130.patch	text/x-patch	19.1 KB
libpq_rowproc_doc_20120130.patch	text/x-patch	5.5 KB
dblink_use_rowproc_20120130.patch	text/x-patch	11.7 KB

In response to

Re: Speed dblink using alternate libpq tuple storage at 2012-01-27 15:48:11 from Marko Kreen

Responses

Re: Speed dblink using alternate libpq tuple storage at 2012-01-30 18:15:39 from Marko Kreen
