Re: Speed dblink using alternate libpq tuple storage

From: Marko Kreen <markokr(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp>, greg(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org, mmoncure(at)gmail(dot)com, shigeru(dot)hanada(at)gmail(dot)com
Subject: Re: Speed dblink using alternate libpq tuple storage
Date: 2012-03-27 20:41:18
Message-ID: 20120327204118.GA3439@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Mar 24, 2012 at 02:22:24AM +0200, Marko Kreen wrote:
> Main advantage of including PQgetRow() together with low-level
> rowproc API is that it allows de-emphasizing more complex parts of
> rowproc API (exceptions, early exit, skipresult, custom error msg).
> And drop/undocument them or simply call them postgres-internal-only.

I thought more about exceptions and PQgetRow and found
interesting pattern:

- Exceptions are problematic if always allowed. Although PQexec() is
easy to fix in current code, trying to keep to promise of "exceptions
are allowed from everywhere" adds non-trivial maintainability overhead
to future libpq changes, so instead we should simply fix documentation.
Especially as I cannot see any usage scenario that would benefit from
such promise.

- Multiple SELECTs from PQexec() are problematic even without
exceptions: additional documentation is needed how to detect
that rows are coming from new statement.

Now the interesting one:

- PQregisterProcessor() API allows changing the callback permanently.
Thus breaking any user code which simply calls PQexec()
and expects regular PGresult back. Again, nothing to fix
code-wise, need to document that callback should be set
only for current query, later changed back.

My conclusion is that row-processor API is low-level expert API and
quite easy to misuse. It would be preferable to have something more
robust as end-user API, the PQgetRow() is my suggestion for that.
Thus I see 3 choices:

1) Push row-processor as main API anyway and describe all dangerous
scenarios in documentation.
2) Have both PQgetRow() and row-processor available in <libpq-fe.h>,
PQgetRow() as preferred API and row-processor for expert usage,
with proper documentation what works and what does not.
3) Have PQgetRow() in <libpq-fe.h>, move row-processor to <libpq-int.h>.

I guess this needs committer decision which way to go?

Second conclusion is that current dblink row-processor usage is broken
when user uses multiple SELECTs in SQL as dblink uses plain PQexec().
Simplest fix would be to use PQexecParams() instead, but if keeping old
behaviour is important, then dblink needs to emulate PQexec() resultset
behaviour with row-processor or PQgetRow().

--
marko

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2012-03-27 21:05:30 Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)
Previous Message Robert Haas 2012-03-27 20:38:57 Re: Command Triggers patch v18