Re: Some revises in adding sorting path

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Richard Guo <guofenglinux(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Some revises in adding sorting path
Date: 2023-02-14 02:53:40
Message-ID: CAApHDvotJVLhEycX_tfqw8grJBnqXpfirwxQa64R7bCYXAEuLw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I looked at the three patches and have some thoughts:

0001:

Does the newly added test have to be this complex? I think it might
be better to just come up with the most simple test you can that uses
an incremental sort. I really can't think why the test requires a FOR
UPDATE, to test incremental sort, for example. The danger with making
a test more complex than it needs to be is that it frequently gets
broken by unrelated changes. The complexity makes it harder for
people to understand the test's intentions and that increases the risk
that the test eventually does not test what it was originally meant to
test as the underlying code changes and the expected output is
updated.

0002:

I think the following existing comment does not seem to be true any longer:

> However, there's one more
> * possibility: it may make sense to sort the cheapest partial path
> * according to the required output order and then use Gather Merge.

You've removed the comment that talks about trying Incremental Sort ->
Gather Merge paths yet the code is still doing that, the two are just
more consolidated now, so perhaps you need to come up with a new
comment to explain what we're trying to achieve.

> * already (no need to deal with paths which have presorted
> * keys when incremental sort is disabled unless it's the
> * cheapest input path).

I think "input path" should be "partial path". (I maybe didn't get
that right in all places in 4a29eabd1).

0003:

Looking at gather_grouping_paths(), I see it calls
generate_useful_gather_paths() which generates a bunch of Gather Merge
paths after sorting the cheapest path and incrementally sorting any
partially sorted paths. Then back in gather_grouping_paths(), we go
and create Gather Merge paths again, but this time according to the
group_pathkeys instead of the query_pathkeys. I know you're not really
changing anything here, but as I'm struggling to understand why
exactly we're creating two sets of Gather Merge paths, it makes it a
bit scary to consider changing anything in this area. I've not really
found any comments that can explain to me sufficiently well enough so
that I understand why this needs to be done. Do you know?

David

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message wangw.fnst@fujitsu.com 2023-02-14 03:36:52 RE: [PATCH] Reuse Workers and Replication Slots during Logical Replication
Previous Message Andres Freund 2023-02-14 02:43:35 Re: Buffer usage detailed by RelKind in EXPLAIN ANALYZE BUFFERS