Re: [v9.5] Custom Plan API

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-07 06:18:51
Message-ID: CA+U5nMKZua6TyNL29RCN0X+isdUiZ9o5BrYW9xaNd3FvWy9bhA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 7 May 2014 02:05, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:
> Prior to the development cycle towards v9.5, I'd like to reopen
> the discussion of custom-plan interface. Even though we had lots
> of discussion during the last three commit-fests, several issues
> are still under discussion. So, I'd like to clarify direction of
> the implementation, prior to the first commit-fest.
>
> (1) DDL support and system catalog
>
> Simon suggested that DDL command should be supported to track custom-
> plan providers being installed, and to avoid nonsense hook calls
> if it is an obvious case that custom-plan provider can help. It also
> makes sense to give a chance to load extensions once installed.
> (In the previous design, I assumed modules are loaded by LOAD command
> or *_preload_libraries parameters).
>
> I tried to implement the following syntax:
>
> CREATE CUSTOM PLAN <name> FOR (scan|join|any) HANDLER <func_name>;

Thank you for exploring that thought and leading the way on this
research. I've been thinking about this also.

What I think we need is a declarative form that expresses the linkage
between base table(s) and a related data structures that can be used
to optimize a query, while still providing accurate results.

In other DBMS, we have concepts such as a JoinIndex or a MatView which
allow some kind of lookaside behaviour. Just for clarity, a concrete
example is Oracle's Materialized Views which can be set using ENABLE
QUERY REWRITE so that the MatView can be used as an alternative path
for a query. We do already have this concept in PostgreSQL, where an
index can be used to perform an IndexOnlyScan rather than accessing
the heap itself.

We have considerable evidence that the idea of alternate data
structures results in performance gains.
* KaiGai's work - https://wiki.postgresql.org/wiki/PGStrom
* http://www.postgresql.org/message-id/52C59858.9090500@garret.ru
* http://citusdata.github.io/cstore_fdw/
* University of Manchester - exploring GPUs as part of the AXLE project
* Barcelona SuperComputer Centre - exploring FPGAs, as part of the AXLE project
* Some other authors have also cited gains using GPU technology in databases

So I would like to have a mechanism that provides a *generic*
Lookaside for a table or foreign table.

Tom and Kevin have previously expressed that MatViews would represent
a planning problem, in the general case. One way to solve that
planning issue is to link structures directly together, in the same
way that an index and a table are linked. We can then process the
lookaside in the same way we handle a partial index - check
prerequisites and if usable, calculate a cost for the alternate path.
We need not add planning time other than to the tables that might
benefit from that.

Roughly, I'm thinking of this...

CREATE LOOKASIDE ON foo
TO foo_mat_view;

and also this...

CREATE LOOKASIDE ON foo
TO foo_as_a_foreign_table /* e.g. PGStrom */

This would allow the planner to consider alternate plans for foo_mv
during set_plain_rel_pathlist() similarly to the way it considers
index paths, in one of the common cases that the mat view covers just
one table.

This concept is similar to ENABLE QUERY REWRITE in Oracle, but this
thought goes much further, to include any generic user-defined data
structure or foreign table.

Do we need this? For MVs, we *might* be able to deduce that the MV is
rewritable for "foo", but that is not deducible for Foreign Tables, by
current definition, so I prefer the explicit definition of objects
that are linked - since doing this for indexes is already familiar to
people.

Having an explicit linkage between data structures allows us to
enhance an existing application by transaparently adding new
structures, just as we already do with indexes. Specifically, that we
allow more than one lookaside structure on any one table.

Forget the exact name, thats not important. But I think the
requirements here are...

* Explicit definition that we are attaching an alternate path onto a
table (conceptually similar to adding an index)

* Ability to check that the alternate path is viable (similar to the
way we validate use of partial indexes prior to usage)
Checks on columns(SELECT), rows(WHERE), aggregations(GROUP)

* Ability to consider access cost for both normal table and alternate
path (like an index) - this allows the alternate path to *not* be
chosen when we are performing some operation that is sub-optimal (for
whatever reason).

* There may be some need to define operator classes that are
implemented via the alternate path

which works for single tables, but a later requirement would then be

* allows the join of one or more tables to be replaced with a single lookaside

Hopefully, we won't need a "Custom Plan" at all, just the ability to
lookaside when useful.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2014-05-07 06:27:54 Re: New pg_lsn type doesn't have hash/btree opclasses
Previous Message Mark Kirkwood 2014-05-07 05:55:40 Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers