Skip site navigation (1) Skip section navigation (2)

Re: Speed dblink using alternate libpq tuple storage

From: Marko Kreen <markokr(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org, greg(at)2ndQuadrant(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-01-27 15:42:14
Message-ID: 20120127154214.GB4107@gmail.com (view raw or flat)
Thread:
Lists: pgsql-hackers
On Fri, Jan 27, 2012 at 05:57:01PM +0900, Kyotaro HORIGUCHI wrote:
> Hello, This is a new version of the patch formerly known as
> 'alternative storage for libpq'.
> 
> - Changed the concept to 'Alternative Row Processor' from
>   'Storage handler'. Symbol names are also changed.
> 
> - Callback function is modified following to the comment.
> 
> - From the restriction of time, I did minimum check for this
>   patch. The purpose of this patch is to show the new implement.
> 
> - Proformance is not measured for this patch for the same
>   reason. I will do that on next monday.
> 
> - The meaning of PGresAttValue is changed. The field 'value' now
>   contains a value withOUT terminating zero. This change seems to
>   have no effect on any other portion within the whole source
>   tree of postgresql from what I've seen.


I think we have general structure in place.  Good.

Minor notes:

= rowhandler api =

* It returns bool, so void* is wrong.  Instead libpq style is to use int,
  with 1=OK, 0=Failure.  Seems that was also old pqAddTuple() convention.

* Drop PQgetRowProcessorParam(), instead give param as argument.

* PQsetRowProcessorErrMes() should strdup() the message.  That gets rid of
  allocator requirements in API.  This also makes safe to pass static
  strings there.  If strdup() fails, fall back to generic no-mem message.

* Create new struct to replace PGresAttValue for rowhandler usage.
  RowHandler API is pretty unique and self-contained.  It should have
  it's own struct.  Main reason is that it allows to properly document it.
  Otherwise the minor details get lost as they are different from
  libpq-internal usage.  Also this allows two structs to be
  improved separately.  (PGresRawValue?)

* Stop storing null_value into ->value.  It's libpq internal detail.
  Instead the ->value should always point into buffer where the value
  info is located, even for NULL.  This makes safe to simply subtract
  pointers to get row size estimate. Seems pqAddTuple() already
  does null_value logic, so no need to do it in rowhandler api.

= libpq =

Currently its confusing whether rowProcessor can be NULL, and what
should be done if so.  I think its better to fix usage so that
it is always set.

* PQregisterRowProcessor() should use default func if func==NULL.
  and set default handler if so.
* Never set rowProcessor directly, always via PQregisterRowProcessor()
* Drop all if(rowProcessor) checks.

= dblink =

* There are malloc failure checks missing in initStoreInfo() & storeHandler().


-- 
marko


PS.  You did not hear it from me, but most raw values are actually
nul-terminated in protocol.  Think big-endian.  And those which
are not, you can make so, as the data is not touched anymore.
You cannot do it for last value, as next byte may not be allocated.
But you could memmove() it lower address so you can null-terminate.

I'm not suggesting it for official patch, but it would be fun to know
if such hack is benchmarkable, and benchmarkable on realistic load.


In response to

pgsql-hackers by date

Next:From: MauMauDate: 2012-01-27 15:45:19
Subject: Unreliable "pg_ctl -w start" again
Previous:From: Merlin MoncureDate: 2012-01-27 15:35:04
Subject: Re: Speed dblink using alternate libpq tuple storage

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group