Re: HashAgg's batching counter starts at 0, but Hash's starts at 1. (now: incremental sort)

From: James Coleman <jtc331(at)gmail(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, David Rowley <dgrowleyml(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: HashAgg's batching counter starts at 0, but Hash's starts at 1. (now: incremental sort)
Date: 2020-07-31 01:39:35
Message-ID: CAAaqYe97a3xf=5Q0ZjPcQQGX-rAPL9CNFkfupqzCt7g4TrVLrA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jul 30, 2020 at 8:22 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:

> On Wed, Jul 29, 2020 at 09:18:44PM -0700, Peter Geoghegan wrote:
> > On Wed, Jul 29, 2020 at 9:05 PM Justin Pryzby <pryzby(at)telsasoft(dot)com>
> wrote:
> > > So my 2ndary suggestion is to conditionalize based on the method
> rather than
> > > value of space used.
> >
> > What's the advantage of doing it that way?
>
> Because filtering out zero values is exactly what's intended to be avoided
> for
> nontext output.
>
> I think checking whether the method was used should result in the same
> output,
> without the literal check for zero value (which itself sets a bad example).
>
> --- a/src/backend/commands/explain.c
> +++ b/src/backend/commands/explain.c
> @@ -2824,13 +2824,13 @@
> show_incremental_sort_group_info(IncrementalSortGroupInfo *groupInfo,
> appendStringInfo(&groupName, "%s Groups", groupLabel);
> ExplainOpenGroup("Incremental Sort Groups",
> groupName.data, true, es);
> ExplainPropertyInteger("Group Count", NULL,
> groupInfo->groupCount, es);
>
> ExplainPropertyList("Sort Methods Used", methodNames, es);
>
> - if (groupInfo->maxMemorySpaceUsed > 0)
> + if (groupInfo->sortMethods & SORT_TYPE_QUICKSORT)
> {
> long avgSpace =
> groupInfo->totalMemorySpaceUsed / groupInfo->groupCount;
> const char *spaceTypeName;
> StringInfoData memoryName;
>
> spaceTypeName =
> tuplesort_space_type_name(SORT_SPACE_TYPE_MEMORY);
> @@ -2841,13 +2841,13 @@
> show_incremental_sort_group_info(IncrementalSortGroupInfo *groupInfo,
> ExplainPropertyInteger("Average Sort Space Used",
> "kB", avgSpace, es);
> ExplainPropertyInteger("Peak Sort Space Used",
> "kB",
>
> groupInfo->maxMemorySpaceUsed, es);
>
> ExplainCloseGroup("Sort Spaces", memoryName.data,
> true, es);
> }
> - if (groupInfo->maxDiskSpaceUsed > 0)
> + if (groupInfo->sortMethods & SORT_TYPE_EXTERNAL_SORT)
> {
> long avgSpace =
> groupInfo->totalDiskSpaceUsed / groupInfo->groupCount;
> const char *spaceTypeName;
> StringInfoData diskName;
>
> spaceTypeName =
> tuplesort_space_type_name(SORT_SPACE_TYPE_DISK);
>

I very much do not like this approach, and I think it's actually
fundamentally wrong, at least for the memory check. Quicksort is not the
only option that uses memory. For now, there's only one option that spills
to disk (external merge sort), but there's no reason it has to remain that
way. And in the future we might accurately report memory consumed even when
we've eventually spilled to disk also, so memory used would be relevant
potentially even if no in-memory sort was ever performed.

So I'm pretty confident checking the space used is the correct way to do
this.

James

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2020-07-31 01:40:27 Re: HashAgg's batching counter starts at 0, but Hash's starts at 1. (now: incremental sort)
Previous Message Mark Dilger 2020-07-31 01:38:43 Re: new heapcheck contrib module