Re: [PATCH] Incremental sort (was: PoC: Partial sort)

From: James Coleman <jtc331(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Shaun Thomas <shaun(dot)thomas(at)2ndquadrant(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Subject: Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Date: 2020-03-08 02:07:55
Message-ID: CAAaqYe-BV=kuMmwHCnOfdAG_mKjCo=41MeFrdtBjB+iXd6-FJg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Mar 7, 2020 at 5:47 PM James Coleman <jtc331(at)gmail(dot)com> wrote:

> On Tue, Jan 21, 2020 at 9:37 AM James Coleman <jtc331(at)gmail(dot)com> wrote:
>
>> That being said, the patch also needs some more work on improving
>> EXPLAIN ANALYZE output (perhaps min/max/mean or median of
>> memory usage number of groups in each sort mode), and I think it's far
>> more feasible that I can tackle that piecemeal before the next CF.
>>
>> James
>>
>
> I'm attaching a rebased patch revision + a new commit that reworks EXPLAIN
> output. I left that patch as separate for now so that it was easy enough to
> see the difference, and so that as Tomas is working on stuff in parallel I
> don't unnecessarily cause merge conflicts for now, but next patch revision
> (assuming the EXPLAIN change looks good) can just incorporate it into the
> base patch.
>
> Here's what I've changed:
>
> - The stats necessary for ANALYZE are now only kept if the PlanState has a
> non-null instrument field (thanks to Tom for pointing out this as the
> correct way to check that ANALYZE is in flight). I did leave lines like
> `node->incsort_info.fullsortGroupInfo.groupCount++;` unguarded by that `if`
> since it seems like practically zero overhead (and almost equal to check
> the condition), but if anyone disagrees, I'm happy to change it.
> Additionally those lines (if ANALYZE is not in flight) are technically
> operating on variables that haven't explicitly been initialized in the Init
> function; please tell me if that's actually an issue given the are counters
> and we won't be using them in that case.
>

And...I discovered that I need to do this anyway. Basically, the originally
patch stored per-worker instrumentation information on every tuple fetch,
which is unnecessary, and in my haste refactoring I'd replaced that spot
with my code to better record stats. The original patch just looked at the
last tuple sort state, as I'd mentioned previously, so didn't have any
special instrumentation in non-parallel workers other than incrementing
group counters.

But obviously we don't want to record stats from every tuple, we want to
record sort info every time we finalize a sort. And so I've replaced the
group counter increment lines with calls to a newly broken out function to
record stats for the appropriate fullsort/prefixsort group info.

I came across while adding tests for EXPLAIN ANALYZE and saw a result with
the reported average memory usage higher than the max--this happened since
I was adding the memory used each time through the loop rather than once
when finalizing the sort.

> - A good bit of cleanup on how parallel workers are output (I believe
> there was some duplicative group opening and also inconsistent text output
> with other multi-worker explain nodes). I haven't had a chance to test this
> yet, thought, so there could be bugs.
>

Note: I still haven't had time to test parallel plans with the updated
EXPLAIN, so there aren't tests for that either.

- I also left a TODO wondering if we should break out the instrumentation
into a separate function; it seems like a decent sized chunk of cleanly
extractable code; I suppose that's always a bit of personal preference, so
anyone who wants to weigh in gets a vote :)

I ended up having to do this anyway, for reasons described above.

See new version attached (still with EXPLAIN changes as a separate patch
file).

James

Attachment Content-Type Size
v34-0001-Consider-low-startup-cost-when-adding-partial-pa.patch text/x-patch 3.2 KB
v34-0003-Rework-EXPLAIN-for-incremental-sort.patch text/x-patch 31.9 KB
v34-0002-Implement-incremental-sort.patch text/x-patch 137.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message James Coleman 2020-03-08 02:19:06 Re: Allow to_date() and to_timestamp() to accept localized names
Previous Message Peter Geoghegan 2020-03-08 00:45:22 pgsql: pageinspect: Fix types used for bt_metap() columns.