Re: Index Skip Scan (new UniqueKeys)

From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To: Floris Van Nee <florisvannee(at)Optiver(dot)com>
Cc: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: Index Skip Scan (new UniqueKeys)
Date: 2020-07-23 09:53:51
Message-ID: 20200723095351.dbboo3fymuzwqmic@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Tue, Jul 14, 2020 at 06:18:50PM +0000, Floris Van Nee wrote:
>
> Due to the other changes I made in create_distinct_paths/query_has_uniquekeys_for, it will choose a correct plan now, even without the EC_MUST_BE_REDUNDANT check though, so it's difficult to give an actual failing test case now. However, since all code filters out constant keys, I think uniqueness should do it too - otherwise you could get into problems later on. It's also more consistent. If you already know something is unique by just (b), it doesn't make sense to store that it's unique by (a,b). Now that I think of it, the best place to do this EC_MUST_BE_REDUNDANT check is probably inside make_pathkeys_for_uniquekeys, rather than build_uniquekeys though. It's probably good to move it there.

That would be my suggestion as well.

> > Along the lines I'm also curious about this part:
> >
> > - ListCell *k;
> > - List *exprs = NIL;
> > -
> > - foreach(k, ec->ec_members)
> > - {
> > - EquivalenceMember *mem = (EquivalenceMember *)
> > lfirst(k);
> > - exprs = lappend(exprs, mem->em_expr);
> > - }
> > -
> > - result = lappend(result, makeUniqueKey(exprs, false, false));
> > + EquivalenceMember *mem = (EquivalenceMember*)
> > +lfirst(list_head(ec->ec_members));
> >
> > I'm curious about this myself, maybe someone can clarify. It looks like
> > generaly speaking there could be more than one member (if not
> > ec_has_volatile), which "representing knowledge that multiple items are
> > effectively equal". Is this information is not interesting enough to preserve it
> > in unique keys?
>
> Yeah, that's a good question. Hence my question about the choice for Expr rather than EquivalenceClass for the Unique Keys patch to Andy/David. When storing just Expr, it is rather difficult to check equivalence in joins for example. Suppose, later on we decide to add join support to the distinct skip scan. Consider a query like this:
> SELECT DISTINCT t1.a FROM t1 JOIN t2 ON t1.a=t2.a
> As far as my understanding goes (I didn't look into it in detail though), I think here the distinct_pathkey will have an EqClass {t1.a, t2.a}. That results in a UniqueKey with expr (t1.a) (because currently we only take the first Expr in the list to construct the UniqueKey). We could also construct *two* unique keys, one with Expr (t1.a) and one with Expr (t2.a), but I don't think that's the correct approach either, as it will explode when you have multiple pathkeys, each having multiple Expr inside their EqClasses.

One UniqueKey can have multiple corresponding expressions, which gives
us also possibility of having one unique key with (t1.a, t2.a) and it
looks now similar to EquivalenceClass.

> > > - the distinct_pathkeys may be NULL, even though there's a possibility for
> > skipping. But it wouldn't create the uniquekeys in this case. This makes the
> > planner not choose skip scans even though it could. For example in queries
> > that do SELECT DISTINCT ON (a) * FROM t1 WHERE a=1 ORDER BY a,b; Since a
> > is constant, it's eliminated from regular pathkeys.
> >
> > What would be the point of skipping if it's a constant?
>
> For the query: SELECT DISTINCT ON (a) * FROM t1 WHERE a=1 ORDER BY a,b;
> There may be 1000s of records with a=1. We're only interested in the first one though. The traditional non-skip approach would still scan all records with a=1. Skip would just fetch the first one with a=1 and then skip to the next prefix and stop the scan.

The idea behind this query sounds questionable to me, more transparent
would be to do this without distinct, skipping will actually do exactly
the same stuff just under another name. But if allowing skipping on
constants do not bring significant changes in the code probably it's
fine.

> > > - to combat the issues mentioned earlier, there's now a check in
> > build_index_paths that checks if the query_pathkeys matches the
> > useful_pathkeys. Note that we have to use the path keys here rather than
> > any of the unique keys. The unique keys are only Expr nodes - they do not
> > contain the necessary information about ordering. Due to elimination of
> > some constant path keys, we have to search the attributes of the index to
> > find the correct prefix to use in skipping.
> >
> > IIUC here you mean this function, right?
> >
> > + prefix = find_index_prefix_for_pathkey(root,
> > +
> > index,
> > +
> > BackwardScanDirection,
> > +
> > llast_node(PathKey,
> > +
> > root->distinct_pathkeys));
> >
> > Doesn't it duplicate the job already done in build_index_pathkeys by building
> > those pathkeys again? If yes, probably it's possible to reuse useful_pathkeys.
> > Not sure about unordered indexes, but looks like query_pathkeys should
> > also match in this case.
> >
>
> Yeah, there's definitely some double work there, but the actual impact may be limited - it doesn't actually allocate a new path key, but it looks it up in root->canon_pathkeys and returns that path key.
> I wrote it like this, because I couldn't find a way to identify from a certain PathKey the actual location in the index of that column. The constructed path keys list filters out all redundant path keys. An index on (a,a,b,a,b) becomes path keys (a,b). Now if we skip on (a,b) we actually need to use prefix=3. But how to get from PathKey=b to that number 3, I didn't find a solid way except doing this. Maybe there is though?

I don't think there is a direct way, but why not modify
build_index_paths to also provide this information, or compare
index_pathkeys expressions with indextlist without actually create those
pathkeys again?

And couple of words about this thread [1]. It looks to me like a strange
way of interacting with the community. Are you going to duplicate there
everything, or what are your plans? At the very least you could try to
include everyone involved in the recipients list, not exclude some of
the authors.

[1]: https://www.postgresql.org/message-id/flat/e4b623692a1447d4a13ac80fa271c8e6%40opammb0561.comp.optiver.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Paul Förster 2020-07-23 10:01:36 Building 12.3 from source on Mac
Previous Message Ahsan Hadi 2020-07-23 07:34:23 Re: Transactions involving multiple postgres foreign servers, take 2