
Re: Speed dblink using alternate libpq tuple storage

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>
To: markokr(at)gmail(dot)com
Cc: mmoncure(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, greg(at)2ndquadrant(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-01-30 09:06:57
Message-ID: 20120130.180657.220412574.horiguchi.kyotaro@oss.ntt.co.jp
Lists: pgsql-hackers
Thank you for the comments. This is a revised version of the patch.

The performance gain is larger than expected. The measurement
script now runs the query via dblink ten times to stabilize the
results, so the timings are about ten times larger than the
previous ones.

                       sec    % of Original
Original             : 31.5     100.0%
RowProcessor patch   : 31.3      99.4%
dblink patch         : 24.6      78.1%

The RowProcessor patch alone causes no loss, or a very small gain,
and the full patch gives about a 22% gain in this benchmark (*1).


The modifications are listed below.


- PGresAttValue is no longer used for this mechanism; PGrowValue
  has been added instead. PGresAttValue has been moved back to
  libpq-int.h.

- pqAddTuple() is restored to its original form, and a new
  function, pqAddRow(), is used as the RowProcessor. (The previous
  pqAddTuple implementation incorrectly mixed the two usages of
  PGresAttValue.)

- PQgetRowProcessorParam has been dropped. Contextual parameter
  is passed as one of the parameters of RowProcessor().

- RowProcessor() returns int (used as bool; is that the libpq
  convention?) instead of void *. (Actually, void * had already
  become useless as of the previous patch.)

- PQsetRowProcessorErrMes() now does strdup internally.

- The callers of RowProcessor() no longer set null_field in
  PGrowValue.value. In addition, the PGrowValue[] array that
  RowProcessor() receives has nfields + 1 elements, so that a
  rough size estimate can be made as
  cols->value[nfields].value - cols->value[0].value - something.
  The "something" here is 4 * nfields for protocol 3 and
  4 * (non-null fields) for protocol 2. I fear that this applies
  only to text-format transfer...

- PQregisterRowProcessor() sets the default handler when given
  NULL, so (pg_conn|pg_result).rowProcessor is never NULL during
  its lifetime.

- initStoreInfo() and storeHandler() now handle malloc
  failures.
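
To make the size estimate and the int-returning callback concrete, here is a minimal sketch in C. It is not code from the patch. RowValue is a hypothetical stand-in for PGrowValue, and count_bytes_handler() and demo_total() are made-up names. It assumes protocol 3 only, indexes the array as cols[nfields].value, and assumes the extra element points 4 bytes past the end of the last field's data, so that the span between the first and the extra pointer contains exactly nfields 4-byte length words.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PGrowValue; the real layout may differ. */
typedef struct
{
    int   len;      /* field length in bytes */
    char *value;    /* pointer into the raw row message */
} RowValue;

/*
 * Rough data size of one row, protocol 3 only (assumption):
 * pointer span between the first value and the extra element,
 * minus one 4-byte length word per field.
 */
static size_t
estimate_row_bytes(const RowValue *cols, int nfields)
{
    ptrdiff_t span;
    ptrdiff_t overhead;

    assert(cols != NULL && nfields > 0);
    span = cols[nfields].value - cols[0].value;
    overhead = 4 * (ptrdiff_t) nfields;
    return span > overhead ? (size_t) (span - overhead) : 0;
}

/*
 * Hypothetical row processor: the context arrives as a plain
 * parameter, and the int result is used as a bool (1 = ok, 0 = error).
 */
static int
count_bytes_handler(void *param, const RowValue *cols, int nfields)
{
    size_t *total = (size_t *) param;

    if (total == NULL)
        return 0;
    *total += estimate_row_bytes(cols, nfields);
    return 1;
}

/* Two fields of 3 and 5 bytes laid out as in a protocol 3 row. */
static size_t
demo_total(void)
{
    static char buf[32];
    RowValue    cols[3];            /* nfields + 1 elements */
    size_t      total = 0;

    cols[0].len = 3; cols[0].value = buf + 4;   /* after 1st length word */
    cols[1].len = 5; cols[1].value = buf + 11;  /* 4 + 3 + 4 */
    cols[2].len = 0; cols[2].value = buf + 20;  /* 11 + 5 + 4 (assumed) */

    if (!count_bytes_handler(&total, cols, 2))
        return 0;
    return total;
}
```

Under these assumptions demo_total() gives 3 + 5 bytes for the sample row. For protocol 2 the subtraction would use 4 * (non-null fields) instead, which this sketch does not cover.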


Remaining issues:

- getAnotherTuple() in fe-protocol2.c has not been tested at all.

- Because all columns in the test data have the same size, the
  realloc path in dblink is never exercised. More testing is
  needed.
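
To illustrate why uniformly sized test data can hide the realloc path, here is a minimal sketch in C. It is not the dblink code. ColBuf, colbuf_store() and count_regrowths() are hypothetical names, and the grow-by-doubling policy is an assumption chosen only for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reusable per-column buffer. */
typedef struct
{
    char  *data;
    size_t cap;
    int    grow_count;  /* number of times an existing buffer was regrown */
} ColBuf;

/* Copy len bytes into the buffer, growing it only when it is too small. */
static int
colbuf_store(ColBuf *b, const char *src, size_t len)
{
    if (len + 1 > b->cap)
    {
        size_t newcap = b->cap ? b->cap : 16;
        char  *p;

        while (newcap < len + 1)
            newcap *= 2;
        p = realloc(b->data, newcap);
        if (p == NULL)
            return 0;               /* malloc failure reported to caller */
        if (b->cap != 0)
            b->grow_count++;        /* the first allocation is not a regrowth */
        b->data = p;
        b->cap = newcap;
    }
    memcpy(b->data, src, len);
    b->data[len] = '\0';
    return 1;
}

/* Store n values of the given lengths; return regrowth count, -1 on error. */
static int
count_regrowths(const size_t *lens, int n)
{
    ColBuf b = {NULL, 0, 0};
    char   src[256];
    int    i;

    memset(src, 'x', sizeof(src));
    for (i = 0; i < n; i++)
    {
        assert(lens[i] <= sizeof(src));
        if (!colbuf_store(&b, src, lens[i]))
        {
            free(b.data);
            return -1;
        }
    }
    free(b.data);
    return b.grow_count;
}
```

With lengths that are all 29, as in the test table, the buffer is sized once and never regrown. Lengths such as 29, 60, 200 force regrowth, so test data with varying column sizes would be needed to cover that path.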


 regards,

=====
(*1) The benchmark was run as follows:

==test.sql
select dblink_connect('c', 'host=localhost dbname=test');
select * from dblink('c', 'select a,c from foo limit 2000000') as (a text, b bytea) limit 1;
...(repeat 9 times more)
select dblink_disconnect('c');
==

$ for i in $(seq 1 10); do time psql test -f test.sql; done

The environment is
  CentOS 6.2 on VirtualBox on Core i7 965 3.2GHz
  # of processor  1
  Allocated mem   2GB
  
Test DB schema is
   Column | Type  | Modifiers 
  --------+-------+-----------
   a      | text  | 
   b      | text  | 
   c      | bytea | 
  Indexes:
      "foo_a_bt" btree (a)
      "foo_c_bt" btree (c)

test=# select count(*),
               min(length(a)) as a_min, max(length(a)) as a_max,
               min(length(c)) as c_min, max(length(c)) as c_max from foo;

  count  | a_min | a_max | c_min | c_max 
---------+-------+-------+-------+-------
 2000000 |    29 |    29 |    29 |    29
(1 row)

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment: dblink_use_rowproc_20120130.patch
Description: text/x-patch (11.7 KB)
Attachment: libpq_rowproc_doc_20120130.patch
Description: text/x-patch (5.5 KB)
Attachment: libpq_rowproc_20120130.patch
Description: text/x-patch (19.1 KB)
