Re: Speed dblink using alternate libpq tuple storage

From: Marko Kreen <markokr(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org, greg(at)2ndQuadrant(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-01-27 15:42:14
Message-ID: 20120127154214.GB4107@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jan 27, 2012 at 05:57:01PM +0900, Kyotaro HORIGUCHI wrote:
> Hello, This is a new version of the patch formerly known as
> 'alternative storage for libpq'.
>
> - Changed the concept to 'Alternative Row Processor' from
> 'Storage handler'. Symbol names are also changed.
>
> - Callback function is modified following to the comment.
>
> - From the restriction of time, I did minimum check for this
> patch. The purpose of this patch is to show the new implement.
>
> - Proformance is not measured for this patch for the same
> reason. I will do that on next monday.
>
> - The meaning of PGresAttValue is changed. The field 'value' now
> contains a value withOUT terminating zero. This change seems to
> have no effect on any other portion within the whole source
> tree of postgresql from what I've seen.

I think we have general structure in place. Good.

Minor notes:

= rowhandler api =

* It returns bool, so void* is wrong. Instead libpq style is to use int,
with 1=OK, 0=Failure. Seems that was also old pqAddTuple() convention.

* Drop PQgetRowProcessorParam(), instead give param as argument.

* PQsetRowProcessorErrMes() should strdup() the message. That gets rid of
allocator requirements in API. This also makes safe to pass static
strings there. If strdup() fails, fall back to generic no-mem message.

* Create new struct to replace PGresAttValue for rowhandler usage.
RowHandler API is pretty unique and self-contained. It should have
it's own struct. Main reason is that it allows to properly document it.
Otherwise the minor details get lost as they are different from
libpq-internal usage. Also this allows two structs to be
improved separately. (PGresRawValue?)

* Stop storing null_value into ->value. It's libpq internal detail.
Instead the ->value should always point into buffer where the value
info is located, even for NULL. This makes safe to simply subtract
pointers to get row size estimate. Seems pqAddTuple() already
does null_value logic, so no need to do it in rowhandler api.

= libpq =

Currently its confusing whether rowProcessor can be NULL, and what
should be done if so. I think its better to fix usage so that
it is always set.

* PQregisterRowProcessor() should use default func if func==NULL.
and set default handler if so.
* Never set rowProcessor directly, always via PQregisterRowProcessor()
* Drop all if(rowProcessor) checks.

= dblink =

* There are malloc failure checks missing in initStoreInfo() & storeHandler().

--
marko

PS. You did not hear it from me, but most raw values are actually
nul-terminated in protocol. Think big-endian. And those which
are not, you can make so, as the data is not touched anymore.
You cannot do it for last value, as next byte may not be allocated.
But you could memmove() it lower address so you can null-terminate.

I'm not suggesting it for official patch, but it would be fun to know
if such hack is benchmarkable, and benchmarkable on realistic load.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message MauMau 2012-01-27 15:45:19 Unreliable "pg_ctl -w start" again
Previous Message Merlin Moncure 2012-01-27 15:35:04 Re: Speed dblink using alternate libpq tuple storage