Re: Index Skip Scan

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Floris Van Nee <florisvannee(at)optiver(dot)com>, James Coleman <jtc331(at)gmail(dot)com>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Bhushan Uparkar <bhushan(dot)uparkar(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Subject: Re: Index Skip Scan
Date: 2019-07-10 02:14:06
Message-ID: CA+hUKGKmdMABJ9k1BbSDgeVsDM3_nLx091mVGA1NzyT06G6GXQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jul 10, 2019 at 1:32 AM Jesper Pedersen
<jesper(dot)pedersen(at)redhat(dot)com> wrote:
> > While updating the Loose Index Scan wiki page with links to other
> > products' terminology on this subject, I noticed that MySQL can
> > skip-scan MIN() and MAX() in the same query. Hmm. That seems quite
> > desirable. I think it requires a new kind of skipping: I think you
> > have to be able to skip to the first AND last key that has each
> > distinct prefix, and then stick a regular agg on top to collapse them
> > into one row. Such a path would not be so neatly describable by
> > UniqueKeys, or indeed by the amskip() interface in the current patch.
> > I mention all this stuff not because I want us to run before we can
> > walk, but because to be ready to commit the basic distinct skip scan
> > feature, I think we should know approximately how it'll handle the
> > future stuff we'll need.
>
> Thomas, do you have any ideas for this ? I can see that MySQL did the
> functionality in two change sets (base and function support), but like
> you said we shouldn't paint ourselves into a corner.

I think amskip() could be augmented by later patches to take a
parameter 'skip mode' that can take values SKIP_FIRST, SKIP_LAST and
SKIP_FIRST | SKIP_LAST (meaning please give me both). What we have in
the current patch is SKIP_FIRST behaviour. Being able to choose
SKIP_FIRST or SKIP_LAST allows you do handle i, MAX(j) GROUP BY (i)
ORDER BY i (ie forward scan for the order, but wanting the highest key
for each distinct prefix), and being able to choose both allows for i,
MIN(j), MAX(j) GROUP BY i. Earlier I thought that this scheme that
allows you to ask for both might be incompatible with David's
suggestion of tracking UniqueKeys in paths, but now I see that it
isn't: it's just that SKIP_FIRST | SKIP_LAST would have no UniqueKeys
and therefore require a regular agg on top. I suspect that's OK: this
min/max transformation stuff is highly specialised and can have
whatever magic path-building logic is required in
preprocess_minmax_aggregates().

--
Thomas Munro
https://enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Takuma Hoshiai 2019-07-10 02:29:38 Re: Implementing Incremental View Maintenance
Previous Message Tom Lane 2019-07-10 02:09:39 Re: coypu: "FATAL: sorry, too many clients already"