RE: Index Skip Scan (new UniqueKeys)

From: Floris Van Nee <florisvannee(at)Optiver(dot)com>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "Dilip Kumar" <dilipbalaut(at)gmail(dot)com>
Subject: RE: Index Skip Scan (new UniqueKeys)
Date: 2020-07-14 18:18:50
Message-ID: 536657c24bcf49eb8215e84969f2b345@opammb0561.comp.optiver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
> > - the uniquekeys that is built, still contains some redundant keys, that are
> normally eliminated from the path keys lists.
>
> I guess you're talking about:
>
> + if (EC_MUST_BE_REDUNDANT(ec))
> + continue;
>
> Can you add some test cases to your changes to show the effect of it? It
> seem to me redundant keys are already eliminated at this point by either
> make_pathkeys_for_uniquekeys or even earlier for distinct on, but could be
> I've missed something.
>

The build_uniquekeys function calls make_pathkeys_for_uniquekeys, which checks for uniqueness using pathkey_is_unique, but not for constantness. Consider a query like:
SELECT DISTINCT ON (a,b) * FROM t1 WHERE a=1 ORDER BY a,b,c

All the regular path keys filter out 'a' for constantness here - so they would end up with a distinct pathkeys of (b) and sort path keys of (b,c).
The unique keys would end up with (a,b) though. In the previous patch, it'd compared in create_distinct_paths, the pathkeys to the unique keys, so it wouldn't consider the skip scan.
Due to the other changes I made in create_distinct_paths/query_has_uniquekeys_for, it will choose a correct plan now, even without the EC_MUST_BE_REDUNDANT check though, so it's difficult to give an actual failing test case now. However, since all code filters out constant keys, I think uniqueness should do it too - otherwise you could get into problems later on. It's also more consistent. If you already know something is unique by just (b), it doesn't make sense to store that it's unique by (a,b). Now that I think of it, the best place to do this EC_MUST_BE_REDUNDANT check is probably inside make_pathkeys_for_uniquekeys, rather than build_uniquekeys though. It's probably good to move it there.

> Along the lines I'm also curious about this part:
>
> - ListCell *k;
> - List *exprs = NIL;
> -
> - foreach(k, ec->ec_members)
> - {
> - EquivalenceMember *mem = (EquivalenceMember *)
> lfirst(k);
> - exprs = lappend(exprs, mem->em_expr);
> - }
> -
> - result = lappend(result, makeUniqueKey(exprs, false, false));
> + EquivalenceMember *mem = (EquivalenceMember*)
> +lfirst(list_head(ec->ec_members));
>
> I'm curious about this myself, maybe someone can clarify. It looks like
> generaly speaking there could be more than one member (if not
> ec_has_volatile), which "representing knowledge that multiple items are
> effectively equal". Is this information is not interesting enough to preserve it
> in unique keys?

Yeah, that's a good question. Hence my question about the choice for Expr rather than EquivalenceClass for the Unique Keys patch to Andy/David. When storing just Expr, it is rather difficult to check equivalence in joins for example. Suppose, later on we decide to add join support to the distinct skip scan. Consider a query like this:
SELECT DISTINCT t1.a FROM t1 JOIN t2 ON t1.a=t2.a
As far as my understanding goes (I didn't look into it in detail though), I think here the distinct_pathkey will have an EqClass {t1.a, t2.a}. That results in a UniqueKey with expr (t1.a) (because currently we only take the first Expr in the list to construct the UniqueKey). We could also construct *two* unique keys, one with Expr (t1.a) and one with Expr (t2.a), but I don't think that's the correct approach either, as it will explode when you have multiple pathkeys, each having multiple Expr inside their EqClasses.
That makes it difficult to check if we can perform the DISTINCT skip scan on table t2 as well (theoretically we could, but for that we need to check equivalence classes rather than expressions).

>
> > - the distinct_pathkeys may be NULL, even though there's a possibility for
> skipping. But it wouldn't create the uniquekeys in this case. This makes the
> planner not choose skip scans even though it could. For example in queries
> that do SELECT DISTINCT ON (a) * FROM t1 WHERE a=1 ORDER BY a,b; Since a
> is constant, it's eliminated from regular pathkeys.
>
> What would be the point of skipping if it's a constant?

For the query: SELECT DISTINCT ON (a) * FROM t1 WHERE a=1 ORDER BY a,b;
There may be 1000s of records with a=1. We're only interested in the first one though. The traditional non-skip approach would still scan all records with a=1. Skip would just fetch the first one with a=1 and then skip to the next prefix and stop the scan.

>
> > - to combat the issues mentioned earlier, there's now a check in
> build_index_paths that checks if the query_pathkeys matches the
> useful_pathkeys. Note that we have to use the path keys here rather than
> any of the unique keys. The unique keys are only Expr nodes - they do not
> contain the necessary information about ordering. Due to elimination of
> some constant path keys, we have to search the attributes of the index to
> find the correct prefix to use in skipping.
>
> IIUC here you mean this function, right?
>
> + prefix = find_index_prefix_for_pathkey(root,
> +
> index,
> +
> BackwardScanDirection,
> +
> llast_node(PathKey,
> +
> root->distinct_pathkeys));
>
> Doesn't it duplicate the job already done in build_index_pathkeys by building
> those pathkeys again? If yes, probably it's possible to reuse useful_pathkeys.
> Not sure about unordered indexes, but looks like query_pathkeys should
> also match in this case.
>

Yeah, there's definitely some double work there, but the actual impact may be limited - it doesn't actually allocate a new path key, but it looks it up in root->canon_pathkeys and returns that path key.
I wrote it like this, because I couldn't find a way to identify from a certain PathKey the actual location in the index of that column. The constructed path keys list filters out all redundant path keys. An index on (a,a,b,a,b) becomes path keys (a,b). Now if we skip on (a,b) we actually need to use prefix=3. But how to get from PathKey=b to that number 3, I didn't find a solid way except doing this. Maybe there is though?

-Floris

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-07-14 18:36:37 Re: Binary support for pgoutput plugin
Previous Message Dave Cramer 2020-07-14 18:08:53 Re: Binary support for pgoutput plugin