Re: DISTINCT/Optimizer question

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Beth Jen <raelys(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: DISTINCT/Optimizer question
Date: 2006-07-07 21:18:45
Message-ID: 20060707211845.GH7485@svana.org
Lists: pgsql-hackers

On Fri, Jul 07, 2006 at 01:25:53PM -0400, Beth Jen wrote:
> Right now, the distinct clause adds its targets to the sort clause list when
> it is parsed. This causes an automatic insertion of the sort node into the
> query plan before the application of the unique node. The hash-based
> implementation however is meant to bypass the need to sort. I could just
> remove this action, but the optimizer should only consider using the

<snip>

My layman's opinion is that this needs a new, specific "distinct
clause" which looks a lot like a sort clause but isn't one. Then, in
the planner, this clause would be converted either to your new node
type or to the traditional sort node.
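
To make that concrete, here is a rough sketch in Python rather than
PostgreSQL source. Every name in it (DistinctClause, hash_distinct,
sort_unique, plan_distinct, prefer_hash) is made up for illustration
and does not correspond to anything in the tree. It only shows the
shape of the idea: the parser records a distinct clause on its own,
and the planner later decides which strategy implements it.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import Callable, List, Tuple

Row = Tuple


@dataclass(frozen=True)
class DistinctClause:
    # Hypothetical structure: positions of the columns DISTINCT applies to,
    # kept separate from any ORDER BY sort clause.
    columns: Tuple[int, ...]


def _key(row: Row, clause: DistinctClause) -> Tuple:
    return tuple(row[c] for c in clause.columns)


def hash_distinct(rows: List[Row], clause: DistinctClause) -> List[Row]:
    # Hash-based strategy: no sort, output follows input order.
    seen = set()
    out = []
    for row in rows:
        k = _key(row, clause)
        if k not in seen:
            seen.add(k)
            out.append(row)
    return out


def sort_unique(rows: List[Row], clause: DistinctClause) -> List[Row]:
    # Traditional strategy: sort first, then drop adjacent duplicates.
    ordered = sorted(rows, key=lambda r: _key(r, clause))
    return [next(group) for _, group in groupby(ordered, key=lambda r: _key(r, clause))]


def plan_distinct(clause: DistinctClause, prefer_hash: bool) -> Callable:
    # Stand-in for the planner's choice; a real planner would compare costs.
    return hash_distinct if prefer_hash else sort_unique
```

Both strategies return the same set of rows; only the output order
differs, which is why the sort should not be forced at parse time.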

> What are your suggestions for going about this? Are these approaches
> feasible without a significant restructuring of the code? Are there any
> other approaches I should consider?

I think it should be possible without too many changes, since much
would be shared. For example, you could have the distinct node look
exactly like the sort node, so they could share code. Or perhaps just
a flag to distinguish them. I admit I haven't looked carefully though...

Have you considered how your code interacts with DISTINCT ON ()?
Perhaps a clue lies there...
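
One way to read that hint: DISTINCT ON keeps the first row of each
group, and which row counts as first depends on the ORDER BY, so a
hash-based node cannot simply drop the sort in that case. The Python
sketch below contrasts a sort-based DISTINCT ON with a naive hash-based
one that keeps whichever row happens to arrive first. The function
names are invented for illustration and are not PostgreSQL internals.

```python
from itertools import groupby
from typing import List, Tuple

Row = Tuple


def distinct_on_sorted(rows: List[Row], on: Tuple[int, ...],
                       order_by: Tuple[int, ...]) -> List[Row]:
    # Sort by the DISTINCT ON columns, then by the ORDER BY columns,
    # and keep the first row of each group.
    def group_key(r: Row) -> Tuple:
        return tuple(r[c] for c in on)

    def full_key(r: Row) -> Tuple:
        return (group_key(r), tuple(r[c] for c in order_by))

    ordered = sorted(rows, key=full_key)
    return [next(g) for _, g in groupby(ordered, key=group_key)]


def distinct_on_hashed(rows: List[Row], on: Tuple[int, ...]) -> List[Row]:
    # Naive hash-based version: keeps the first row seen per key,
    # ignoring any requested ordering within the group.
    first_seen = {}
    for r in rows:
        first_seen.setdefault(tuple(r[c] for c in on), r)
    return list(first_seen.values())
```

If the input does not already arrive in the requested order, the two
can pick different rows for the same key, so a hash-based node would
need some extra handling for DISTINCT ON.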

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
