Re: Tid scan improvements

From: Edmund Horner <ejrh00(at)gmail(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Tid scan improvements
Date: 2018-11-09 02:01:33
Message-ID: CAMyN-kCLxCo=rbysFRM=Ouarp+xd_b=oP26+tr1yXe3C9k89OQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 6 Nov 2018 at 16:40, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> I've been looking over 0001 to 0003. I ran out of steam before 0004.

Hi David, thanks for another big review with lots of improvements.

> I like the design of the new patch. From what I threw so far at the
> selectivity estimation code, it seems pretty good. I also quite like
> the design in nodeTidscan.c for range scans.

> I didn't quite manage to wrap my head around the code that removes
> redundant quals from the tidquals. For example, with:
>
> postgres=# explain select * from t1 where ctid <= '(0,10)' and a = 0;
> QUERY PLAN
> --------------------------------------------------
> Tid Scan on t1 (cost=0.00..3.19 rows=1 width=4)
> TID Cond: (ctid <= '(0,10)'::tid)
> Filter: (a = 0)
> (3 rows)
>
> and:
>
> postgres=# explain select * from t1 where ctid <= '(0,10)' or a = 20
> and ctid >= '(0,0)';
> QUERY PLAN
> ------------------------------------------------------------------------------
> Tid Scan on t1 (cost=0.01..176.18 rows=12 width=4)
> TID Cond: ((ctid <= '(0,10)'::tid) OR (ctid >= '(0,0)'::tid))
> Filter: ((ctid <= '(0,10)'::tid) OR ((a = 20) AND (ctid >= '(0,0)'::tid)))
> (3 rows)
>
> I understand why the 2nd query didn't remove the ctid quals from the
> filter, and I understand why the first query could. I just didn't
> manage to convince myself that the code behaves correctly for all
> cases.

I agree it's not obvious.

1. We extract a set of tidquals that can be directly implemented by
the Tid scan. This set is of the form: "(CTID op ? AND ...) OR
(...)" (with some limitations).
2. If they happened to come verbatim from the original RestrictInfos,
then they will be found in scan_clauses, and we can remove them.
3. If they're not verbatim, i.e. the original RestrictInfos have
additional criteria that the Tid scan can't use, then tidquals won't
match anything in scan_clauses, and hence scan_clauses will be
unchanged.
4. We do a bit of extra work for the common and useful case of "(CTID
op ? AND ...)". Since the top-level operation of the input quals is
an AND, it will typically be split into multiple RestrictInfo items.
We remove each part from scan_clauses.

> 1. I see a few instances of:
>
> #define DatumGetItemPointer(X) ((ItemPointer) DatumGetPointer(X))
> #define ItemPointerGetDatum(X) PointerGetDatum(X)
>
> in both tid.c and ginfuncs.c, and I see you have:
>
> + itemptr = (ItemPointer) DatumGetPointer(constval);
>
> Do you think it would be worth moving the macros out of tid.c and
> ginfuncs.c into postgres.h and use that macro instead?
>
> (I see the code in this file already did this, so it might not matter
> about this)

I'm not sure about this one - - I think it's better as a separate
patch, since we'd also change ginfuncs.c. I have left it alone for
now.

> 2. In TidCompoundRangeQualFromExpr() rlst is not really needed. You
> can just return MakeTidRangeQuals(found_quals); or return NIL.

Yup, gone.

> 3. Can you explain why this only needs to take place when list_length() == 1?
>
> /*
> * In the case of a compound qual such as "ctid > ? AND ctid < ? AND ...",
> * the various parts will have come from different RestrictInfos. So
> * remove each part separately.
> */
> ...

I've tried to improve the comment.

> 4. Accidental change?
>
> - tidquals);
> + tidquals
> + );
>
> 5. Shouldn't this comment get changed?
>
> - * NumTids number of tids in this scan
> + * NumRanges number of tids in this scan
>
> 6. There's no longer a field named NumTids
>
> - * TidList evaluated item pointers (array of size NumTids)
> + * TidRanges evaluated item pointers (array of size NumTids)
>
> 7. The following field is not documented in TidScanState:
>
> + bool tss_inScan;
>
> 8. Can you name this exprtype instead?
>
> + TidExprType type; /* type of op */
>
> "type" is used by Node types to indicate their type.

Yup, yup, yup, yup, yup.

> 9. It would be neater this:
>
> if (expr->opno == TIDLessOperator || expr->opno == TIDLessEqOperator)
> tidopexpr->type = invert ? TIDEXPR_LOWER_BOUND : TIDEXPR_UPPER_BOUND;
> else if (expr->opno == TIDGreaterOperator || expr->opno == TIDGreaterEqOperator)
> tidopexpr->type = invert ? TIDEXPR_UPPER_BOUND : TIDEXPR_LOWER_BOUND;
> else
> tidopexpr->type = TIDEXPR_EQ;
>
> tidopexpr->exprstate = exprstate;
>
> tidopexpr->inclusive = expr->opno == TIDLessEqOperator || expr->opno
> == TIDGreaterEqOperator;
>
> as a switch: ...

Yup, I think the switch is a bit nicer.

> 10. I don't quite understand this comment:
>
> + * Create an ExprState corresponding to the value part of a TID comparison,
> + * and wrap it in a TidOpExpr. Set the type and inclusivity of the TidOpExpr
> + * appropriately, depending on the operator and position of the its arguments.
>
> I don't quite see how the code sets the inclusivity depending on the
> position of the arguments.
>
> Maybe the comment should be:
>
> + * For the given 'expr' build and return an appropriate TidOpExpr taking into
> + * account the expr's operator and operand order.

I'll go with your wording.

> 11. ScalarArrayOpExpr are commonly named "saop": ...

Yup.

> 12. You need to code SetTidLowerBound() with similar wraparound logic
> you have in SetTidUpperBound().
>
> It's perhaps unlikely, but the following shows incorrect results.
>
> postgres=# select ctid from t1 where ctid > '(0,65535)' limit 1;
> ctid
> -------
> (0,1)
> (1 row)
>
> -- the following is fine.
>
> Time: 1.652 ms
> postgres=# select ctid from t1 where ctid >= '(0,65535)' limit 1;
> ctid
> -------
> (1,1)
> (1 row)
>
> Likely you can just upgrade to the next block when the offset is >
> MaxOffsetNumber.

This is important, thanks for spotting it.

I've tried to add some code to handle this case (and also that of
"ctid < '(0,0)'") with a couple of tests too.

> 13. It looks like the previous code didn't make the assumption you're making in:
>
> + * A current-of TidExpr only exists by itself, and we should
> + * already have allocated a tidList entry for it. We don't
> + * need to check whether the tidList array needs to be
> + * resized.
>
> I'm not sure if it's a good idea to lock the executor code into what
> the grammar currently says is possible. The previous code didn't
> assume that.

Fair enough, I've restored the previous code without the assumption.

> 14. we pass 'false' to what?

Obsolete comment (see reply to Alvaro).

I've applied most of these, and I'll post a new patch soon.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2018-11-09 02:04:39 Re: Copy data to DSA area
Previous Message Michael Paquier 2018-11-09 01:27:29 Re: Adding a TAP test checking data consistency on standby with minRecoveryPoint