Re: pgsql: Support partition pruning at execution time

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pgsql: Support partition pruning at execution time
Date: 2018-04-10 16:29:12
Message-ID: 20180410162912.ungqwaqk53whnd33@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

Robert Haas wrote:
> On Tue, Apr 10, 2018 at 11:14 AM, Alvaro Herrera
> <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> > Questions:
> > 1. Do we want to back-patch this to 10? I suppose (without checking)
> > that EXPLAIN ANALYZE is already reporting bogus numbers for parallel
> > index-only scans, so I think we should do that.
>
> I haven't looked at this closely, but have you considered adding
> bespoke code for IndexOnlyScan that works like
> ExecSortRetrieveInstrumentation and ExecHashRetrieveInstrumentation
> already do rather than jamming this into struct Instrumentation?

Thanks for pointing these out -- I hadn't come across these.

My initial impression is that those two are about transferring
instrumentation state that's quite a bit more complicated than what my
patch proposes for indexonly scan, which is just a single integer
counter. For example, in the case of Sort, for each worker we display
separately the method and type and amount of memory. In the hash case,
we aggregate all of them together for some reason (though I'm not clear
about this one:
/*
* In a parallel-aware hash join, for now we report the
* maximum peak memory reported by any worker.
*/
hinstrument.space_peak =
Max(hinstrument.space_peak, worker_hi->space_peak);
-- why shouldn't we sum these values?)

In contrast, in an indexonly scan you have a single counter and it
doesn't really matter the distribution of fetches done by workers, so it
seems okay to aggregate them all in a single counter. And it being so
simple, it seems reasonable to me to put it in Instrumentation rather
than have a dedicated struct.

> I'm inclined to view any node-specific instrumentation that's not
> being pulled back to the leader as a rough edge to be filed down when
> it bothers somebody more than an outright bug, but perhaps that is an
> unduly lenient view. Still, if we take the view that it's an outright
> bug, I suspect we find that there may be at least a few more of those.

OK. As Tom indicated, it's not possible to backpatch this anyway.
Given that nobody has complained to date, it seems okay to be lenient
about this. Seeking a backpatchable solution seems more trouble than is
worth.

> I was pretty much oblivious to this problem during the initial
> parallel query development and mistakenly assumed that bringing over
> struct Instrumentation was good enough. It emerged late in the game
> that this wasn't really the case, but holding up the whole feature
> because some nodes might have details not reported on a per-worker
> basis didn't really seem to make sense. Whether that was the right
> call is obviously arguable.

I certainly don't blame you for that.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Robert Haas 2018-04-10 16:38:05 Re: pgsql: Support partition pruning at execution time
Previous Message Julien Rouhaud 2018-04-10 16:13:01 Re: pgsql: Merge catalog/pg_foo_fn.h headers back into pg_foo.h headers.

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2018-04-10 16:30:21 Re: Boolean partitions syntax
Previous Message Robert Haas 2018-04-10 16:18:46 Re: Online enabling of checksums