Re: ctidscan as an example of custom-scan (Re: [v9.5] Custom Plan API)

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Cc: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, PgHacker <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ctidscan as an example of custom-scan (Re: [v9.5] Custom Plan API)
Date: 2015-02-13 07:40:40
Message-ID: CAB7nPqQN-xBYnzx-mO-_=D+0fcdQXVVmWUUpLJ8qCPUc5wwu1g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jan 6, 2015 at 10:51 PM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:

> Jim, Thanks for your reviewing the patch.
>
> The attached patch is revised one according to your suggestion,
> and also includes bug fix I could found.
>
> * Definitions of TIDXXXXOperator was moved to pg_operator.h
> as other operator doing.
> * Support the case of "ctid (operator) Param" expression.
> * Add checks if commutator of TID was not found.
> * Qualifiers gets extracted on plan stage, not path stage.
> * Adjust cost estimation logic to fit SeqScan manner.
> * Add some new test cases for regression test.
>
> > On 12/24/14, 7:54 PM, Kouhei Kaigai wrote:
> > >>> On Mon, Dec 15, 2014 at 4:55 PM, Kouhei Kaigai
> > >>> <kaigai(at)ak(dot)jp(dot)nec(dot)com>
> > >> wrote:
> > >>>> I'm not certain whether we should have this functionality in
> > >>>> contrib from the perspective of workload that can help, but its
> > >>>> major worth is for an example of custom-scan interface.
> > >>> worker_spi is now in src/test/modules. We may add it there as well,
> > no?
> > >>>
> > >> Hmm, it makes sense for me. Does it also make sense to add a
> > >> test-case to the core regression test cases?
> > >>
> > > The attached patch adds ctidscan module at test/module instead of
> contrib.
> > > Basic portion is not changed from the previous post, but file
> > > locations and test cases in regression test are changed.
> >
> > First, I'm glad for this. It will be VERY valuable for anyone trying to
> > clean up the end of a majorly bloated table.
> >
> > Here's a partial review...
> >
> > +++ b/src/test/modules/ctidscan/ctidscan.c
> >
> > +/* missing declaration in pg_proc.h */
> > +#ifndef TIDGreaterOperator
> > +#define TIDGreaterOperator 2800
> > ...
> > If we're calling that out here, should we have a corresponding comment in
> > pg_proc.h, in case these ever get renumbered?
> >
> It was a quick hack when I moved this module out of the tree.
> Yep, I should revert when I submit this patch again.
>
> > +CTidQualFromExpr(Node *expr, int varno)
> > ...
> > + if (IsCTIDVar(arg1, varno))
> > + other = arg2;
> > + else if (IsCTIDVar(arg2, varno))
> > + other = arg1;
> > + else
> > + return NULL;
> > + if (exprType(other) != TIDOID)
> > + return NULL; /* should not happen */
> > + /* The other argument must be a pseudoconstant */
> > + if (!is_pseudo_constant_clause(other))
> > + return NULL;
> >
> > I think this needs some additional blank lines...
> >
> OK. And, I also noticed the coding style around this function is
> different from other built-in plans, so I redefined the role of
> this function just to check whether the supplied RestrictInfo is
> OpExpr that involves TID inequality operator, or not.
>
> Expression node shall be extracted when Plan node is created
> from Path node.
>
> > + if (IsCTIDVar(arg1, varno))
> > + other = arg2;
> > + else if (IsCTIDVar(arg2, varno))
> > + other = arg1;
> > + else
> > + return NULL;
> > +
> > + if (exprType(other) != TIDOID)
> > + return NULL; /* should not happen */
> > +
> > + /* The other argument must be a pseudoconstant */
> > + if (!is_pseudo_constant_clause(other))
> > + return NULL;
> >
> > + * CTidEstimateCosts
> > + *
> > + * It estimates cost to scan the target relation according to the given
> > + * restriction clauses. Its logic to scan relations are almost same as
> > + * SeqScan doing, because it uses regular heap_getnext(), except for
> > + * the number of tuples to be scanned if restriction clauses work well.
> >
> > <grammar>That should read "same as what SeqScan is doing"... however,
> what
> > actual function are you talking about? I couldn't find
> SeqScanEstimateCosts
> > (or anything ending EstimateCosts).
> >
> It is cost_seqscan(). But I don't put a raw function name in the source
> code comments in other portion, because nobody can guarantee it is up to
> date in the future...
>
> > BTW, there's other grammar issues but it'd be best to handle those all at
> > once after all the code stuff is done.
> >
> Yep. Help by native English speaker is very helpful for us.
>
> > + opno = get_commutator(op->opno);
> > What happens if there's no commutator? Perhaps best to Assert(opno !=
> > InvalidOid) at the end of that if block.
> >
> Usually, commutator operator of TID is defined on initdb time, however,
> nobody can guarantee a mad superuser doesn't drop it.
> So, I added elog(ERROR,...) here.
>
>
> > Though, it seems things will fall apart anyway if ctid_quals isn't
> exactly
> > what we're expecting; I don't know if that's OK or not.
> >
> No worry, it was already checked on planning stage.
>
> > + /* disk costs --- assume each tuple on a different page */
> > + run_cost += spc_random_page_cost * ntuples;
> > Isn't that extremely pessimistic?
> >
> Probably. I follow the manner of SeqScan.
>
> > I'm not familiar enough with the custom-scan stuff to really comment past
> > this point, and I could certainly be missing some details about planning
> > and execution.
> >
> > I do have some concerns about the regression test, but perhaps I'm just
> > being paranoid:
> >
> > - The EXPLAIN tests don't cover >. They do cover <= and >= via BETWEEN.
> > - Nothing tests the case of '...'::tid op ctid; only lvalue cases are
> tested.
> >
> Both are added.
>
> > Also, it seems that we don't handle joining on a ctid qual... is that
> > intentional?
> >
> Yep. This module does not intend to handle joining, because custom-/
> foreign-join interface is still under the discussion.
> https://commitfest.postgresql.org/action/patch_view?id=1653
>
> Hanada-san makes an enhancement of postgres_fdw on this enhancement.
> https://commitfest.postgresql.org/action/patch_view?id=1674
>
> > I know that sounds silly, but you'd probably want to do that
> > if you're trying to move tuples off the end of a bloated table. You could
> > work around it by constructing a dynamic SQL string, but it'd be easier
> > to do something like:
> >
> > UPDATE table1 SET ...
> > WHERE ctid >= (SELECT '(' || relpages || ',0)' FROM pg_class WHERE oid
> > = 'table1'::regclass) ;
> >
> This example noticed me that the previous version didn't support the case
> of "ctid (operator) Param".
> So, I enhanced to support above case, and update the regression test also.
>

Moved this patch to next CF 2015-02 because of lack of review(ers).
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2015-02-13 07:41:40 Re: Doing better at HINTing an appropriate column within errorMissingColumn()
Previous Message Michael Paquier 2015-02-13 07:37:42 Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)