Re: Tid scan improvements

From: Edmund Horner <ejrh00(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Tid scan improvements
Date: 2019-01-18 04:14:48
Message-ID: CAMyN-kA-cdPN5+W8Nx5tC=pyHpxhXOngO_rQ0+PPzGgf6ZGghw@mail.gmail.com
Lists: pgsql-hackers

Hi all,

I am a bit stuck and I think it's best to try to explain where.

I'm still rebasing the patches for the changes Tom made to support
parameterised TID paths for joins. While the addition of join support
itself does not really touch the same code, the modernisation -- in
particular, returning a list of RestrictInfos rather than raw quals -- does
rewrite quite a bit of tidpath.c.

The original code returned:

List (with OR semantics)
  CTID = ?   or   CTID = ANY (...)   or   IS CURRENT OF
  (more items)

That changed recently to return:

List (with OR semantics)
  RestrictInfo
    CTID = ?   or ...
  (more items)

My last set of patches extended the tidqual extraction to pull out lists
(with AND semantics) of range quals of the form CTID < ?, etc. Each list
of more than one item was converted into an AND clause before being added
to the tidqual list; a single range qual can be added to tidquals as is.

This required looking across multiple RestrictInfos at the top level, for
example:

- "WHERE ctid > ? AND ctid < ?" would arrive at tidpath as a list of two
RestrictInfos, from which we extract a single tidqual in the form of an AND
clause.
- "WHERE ctid = ? OR (ctid > ? AND ctid < ?)" arrives as only one
RestrictInfo, but we extract two tidquals (an OpExpr, and an AND clause).

The code could also ignore additional unusable quals from a list of
top-level RestrictInfos, or from a list of quals from an AND clause, for
example:

- "WHERE foo = ? AND ctid > ? AND ctid < ?" gives us the single tidqual
"ctid > ? AND ctid < ?".
- "WHERE (ctid = ? AND bar = ?) OR (foo = ? AND ctid > ? AND ctid < ?)"
gives us the two tidquals "ctid = ?" and "ctid > ? AND ctid < ?".
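
The extraction rules described above can be illustrated with a toy model in Python. This is only a sketch, not the tidpath.c code from the patch: the classes Cmp, And and Or and the function extract_tidquals are hypothetical names invented for illustration, and the order of preference (an equality qual wins over an OR clause, which wins over range quals) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cmp:
    """A leaf comparison such as ctid > ? (hypothetical toy type)."""
    column: str
    op: str


@dataclass(frozen=True)
class And:
    args: tuple


@dataclass(frozen=True)
class Or:
    args: tuple


RANGE_OPS = {"<", "<=", ">", ">="}


def is_tid_eq(q):
    return isinstance(q, Cmp) and q.column == "ctid" and q.op == "="


def is_tid_range(q):
    return isinstance(q, Cmp) and q.column == "ctid" and q.op in RANGE_OPS


def extract_tidquals(quals):
    """Take implicitly ANDed quals; return tidquals with OR semantics."""
    # Assumption: a single equality qual is preferred over anything else.
    for q in quals:
        if is_tid_eq(q):
            return [q]
    # An OR clause is usable only if every arm yields at least one tidqual.
    for q in quals:
        if isinstance(q, Or):
            found = []
            for arm in q.args:
                arm_quals = list(arm.args) if isinstance(arm, And) else [arm]
                sub = extract_tidquals(arm_quals)
                if not sub:
                    found = []
                    break
                found.extend(sub)
            if found:
                return found
    # Range quals: one is returned as is, several are wrapped in an AND.
    ranges = [q for q in quals if is_tid_range(q)]
    if len(ranges) > 1:
        return [And(tuple(ranges))]
    return ranges


EQ, GT, LT = Cmp("ctid", "="), Cmp("ctid", ">"), Cmp("ctid", "<")
FOO, BAR = Cmp("foo", "="), Cmp("bar", "=")

# WHERE foo = ? AND ctid > ? AND ctid < ?
print(extract_tidquals([FOO, GT, LT]))
# WHERE (ctid = ? AND bar = ?) OR (foo = ? AND ctid > ? AND ctid < ?)
print(extract_tidquals([Or((And((EQ, BAR)), And((FOO, GT, LT))))]))
```

In this model, unusable quals such as "foo = ?" are simply skipped, and an OR clause with an arm that yields no tidqual is rejected as a whole, since a TID scan could not cover that arm.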

As the extracted tidquals no longer match the original query quals, they
aren't removed from scan_clauses in createplan.c, and hence are correctly
checked by the filter.
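
A toy illustration of that matching step, assuming quals are compared by structural equality. The function quals_left_for_filter and the tuple representation are hypothetical, and the real createplan.c logic takes more care over the OR semantics of the tidquals list.

```python
def quals_left_for_filter(scan_clauses, tidquals):
    """Drop only the scan clauses that are structurally equal to a tidqual."""
    return [c for c in scan_clauses if c not in tidquals]


FOO, EQ = ("foo", "="), ("ctid", "=")
GT, LT = ("ctid", ">"), ("ctid", "<")

# WHERE foo = ? AND ctid = ?
# The tidqual is one of the original quals, so it is dropped from the filter.
exact = quals_left_for_filter([FOO, EQ], [EQ])

# WHERE foo = ? AND ctid > ? AND ctid < ?
# The tidqual is a newly built AND clause that matches no original qual,
# so every original qual stays in the filter and is rechecked.
compound = quals_left_for_filter([FOO, GT, LT], [("AND", GT, LT)])

print(exact)
print(compound)
```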

Aside: The analogous situation with an indexed user attribute "x" behaves a
bit differently:
- "WHERE x = ? OR (x > ? AND x < ?)" won't use a regular index scan, but
might use a bitmap index scan.

My patch uses the same path type and executor for all extractable tidquals.

This worked pretty well, but I am finding it difficult to reimplement it in
the new tidpath.c code.

In the query information given to the path generator, there is no existing
RestrictInfo relating to the whole expression "ctid > ? AND ctid < ?". I
am still learning about RestrictInfos, but my understanding is it doesn't
make sense to have a RestrictInfo for an AND clause, anyway; you're
supposed to have them for the sub-expressions of it.

And it doesn't seem a good idea to try to create new RestrictInfos in the
path generation just to pass the tidquals back to plan creation. They're
complicated objects.

There's also the generation of scan_clauses in create_tidscan_plan
(createplan.c:3107). This now uses RestrictInfos -- I'd imagine we'd need
each AND clause to be wrapped in a RestrictInfo to be able to check it
properly.

To summarise, I'm not sure what kind of structure I should add to the
tidquals list to represent a compound range expression. Maybe it's better
to create a different path (either a new path type, or a flag in TidPath to
say what kind of quals are attached)?

Edmund
