Re: ctidscan as an example of custom-scan (Re: [v9.5] Custom Plan API)

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, PgHacker <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ctidscan as an example of custom-scan (Re: [v9.5] Custom Plan API)
Date: 2014-12-30 03:10:56
Message-ID: 54A217C0.8050904@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 12/24/14, 7:54 PM, Kouhei Kaigai wrote:
>>> On Mon, Dec 15, 2014 at 4:55 PM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
>> wrote:
>>>> I'm not certain whether we should have this functionality in contrib
>>>> from the perspective of workload that can help, but its major worth
>>>> is for an example of custom-scan interface.
>>> worker_spi is now in src/test/modules. We may add it there as well, no?
>>>
>> Hmm, it makes sense for me. Does it also make sense to add a test-case to
>> the core regression test cases?
>>
> The attached patch adds ctidscan module at test/module instead of contrib.
> Basic portion is not changed from the previous post, but file locations
> and test cases in regression test are changed.

First, I'm glad for this. It will be VERY valuable for anyone trying to clean up the end of a majorly bloated table.

Here's a partial review...

+++ b/src/test/modules/ctidscan/ctidscan.c

+/* missing declaration in pg_proc.h */
+#ifndef TIDGreaterOperator
+#define TIDGreaterOperator 2800
...
If we're calling that out here, should we have a corresponding comment in pg_proc.h, in case these ever get renumbered?

+CTidQualFromExpr(Node *expr, int varno)
...
+ if (IsCTIDVar(arg1, varno))
+ other = arg2;
+ else if (IsCTIDVar(arg2, varno))
+ other = arg1;
+ else
+ return NULL;
+ if (exprType(other) != TIDOID)
+ return NULL; /* should not happen */
+ /* The other argument must be a pseudoconstant */
+ if (!is_pseudo_constant_clause(other))
+ return NULL;

I think this needs some additional blank lines...

+ if (IsCTIDVar(arg1, varno))
+ other = arg2;
+ else if (IsCTIDVar(arg2, varno))
+ other = arg1;
+ else
+ return NULL;
+
+ if (exprType(other) != TIDOID)
+ return NULL; /* should not happen */
+
+ /* The other argument must be a pseudoconstant */
+ if (!is_pseudo_constant_clause(other))
+ return NULL;

+ * CTidEstimateCosts
+ *
+ * It estimates cost to scan the target relation according to the given
+ * restriction clauses. Its logic to scan relations are almost same as
+ * SeqScan doing, because it uses regular heap_getnext(), except for
+ * the number of tuples to be scanned if restriction clauses work well.

<grammar>That should read "same as what SeqScan is doing"... however, what actual function are you talking about? I couldn't find SeqScanEstimateCosts (or anything ending EstimateCosts).

BTW, there's other grammar issues but it'd be best to handle those all at once after all the code stuff is done.

+ opno = get_commutator(op->opno);
What happens if there's no commutator? Perhaps best to Assert(opno != InvalidOid) at the end of that if block.

Though, it seems things will fall apart anyway if ctid_quals isn't exactly what we're expecting; I don't know if that's OK or not.

+ /* disk costs --- assume each tuple on a different page */
+ run_cost += spc_random_page_cost * ntuples;
Isn't that extremely pessimistic?

I'm not familiar enough with the custom-scan stuff to really comment past this point, and I could certainly be missing some details about planning and execution.

I do have some concerns about the regression test, but perhaps I'm just being paranoid:

- The EXPLAIN tests don't cover >. They do cover <= and >= via BETWEEN.
- Nothing tests the case of '...'::tid op ctid; only lvalue cases are tested.

Also, it seems that we don't handle joining on a ctid qual... is that intentional? I know that sounds silly, but you'd probably want to do that if you're trying to move tuples off the end of a bloated table. You could work around it by constructing a dynamic SQL string, but it'd be easier to do something like:

UPDATE table1 SET ...
WHERE ctid >= (SELECT '(' || relpages || ',0)' FROM pg_class WHERE oid = 'table1'::regclass)
;

in some kind of loop.

Obviously better to only handle what you already are then not get this in at all though... :)
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2014-12-30 05:12:50 Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}
Previous Message Kevin Grittner 2014-12-30 03:00:43 Re: BUG #12330: ACID is broken for unique constraints