Re: Removing useless DISTINCT clauses

From: "Finnerty, Jim" <jfinnert(at)amazon(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Removing useless DISTINCT clauses
Date: 2018-08-24 16:05:21
Message-ID: 4E453212-55C9-41DF-9977-840FDFB4CD31@amazon.com
Lists: pgsql-hackers

I feel strongly that eliminating the entire DISTINCT or GROUP BY clause (when there are no aggregates) is an important optimization, especially since the incremental cost of testing for it is so small. I'm happy to submit that as a separate thread.

My goal here was to move the original proposal along and contribute a little something back to the community in the process. DISTINCT optimization is currently quite poor compared to the leading commercial RDBMS alternatives, and performing an unnecessary DISTINCT in the single-table case is one example of that. There are other missing DISTINCT optimizations.

I'll explore a proper way to test for the single-relation case, and will post a separate thread for the 'remove unnecessary DISTINCT' optimization.

Cheers,

/Jim

On 8/23/18, 11:12 PM, "David Rowley" <david(dot)rowley(at)2ndquadrant(dot)com> wrote:

> You might be confusing #1 and #2. My concern is with #2. The existing
> GROUP BY clause optimisation is almost identical to #1. I just wanted
> to also apply it to the DISTINCT clause.
>
> --
> David Rowley http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
