Skip site navigation (1) Skip section navigation (2)

Problem with ORDER BY and DISTINCT ON

From: Steve Midgley <public(at)misuse(dot)org>
To: pgsql-sql(at)postgresql(dot)org
Subject: Problem with ORDER BY and DISTINCT ON
Date: 2008-07-16 07:39:47
Message-ID: 20080716073955.EB1DF64FCBD@postgresql.org (view raw or flat)
Thread:
Lists: pgsql-sql
Hi,

I'm a little baffled. I'm trying to generate a SQL statement that 
issues a DISTINCT ON using the same values as my ORDER BY statement. 
I'm using a somewhat complex CASE statement in my ORDER BY clause. I'm 
on Pg 8.2. Here is some SQL to get you started at seeing my problem:

------------------

drop table if exists property;
create table property
( id serial,
   state varchar(255),
   search_rate_max decimal(8,2),
   data_priority_code varchar(255)
);

SELECT DISTINCT ON
("property"."state",
  CASE WHEN ("search_rate_max" IS NOT NULL) THEN 1 ELSE 2 
END,"search_rate_max",
  CASE WHEN ("search_rate_max" IS NOT NULL) THEN 1 ELSE 2 
END,"search_rate_max",
  property.id)
  property.id
FROM property WHERE (("property"."data_priority_code" IS NOT NULL))
ORDER BY
   "property"."state",
   CASE WHEN ("search_rate_max" IS NOT NULL) THEN 1 ELSE 2 
END,"search_rate_max",
   CASE WHEN ("search_rate_max" IS NOT NULL) THEN 1 ELSE 2 
END,"search_rate_max",
   property.id
LIMIT 10 OFFSET 0

----------------
RESULTS: ERROR: SELECT DISTINCT ON expressions must match initial ORDER 
BY expressions
SQL state: 42P10
----------------

Now if you run this statement it works

------------------

SELECT DISTINCT ON
("property"."state",
  property.id)
  property.id
FROM property WHERE (("property"."data_priority_code" IS NOT NULL))
ORDER BY
   "property"."state",
   property.id
LIMIT 10 OFFSET 0

------------------

However if you run this statement it ALSO works, which tells me it's 
not just my CASE statements that are messing things up (note in this 
example, I just removed the primary key "property.id" from the ORDER BY 
and DISTINCT ON clauses:

-------------------

SELECT DISTINCT ON
("property"."state",
  CASE WHEN ("search_rate_max" IS NOT NULL) THEN 1 ELSE 2 
END,"search_rate_max",
  CASE WHEN ("search_rate_max" IS NOT NULL) THEN 1 ELSE 2 
END,"search_rate_max"
  )
  property.id
FROM property WHERE (("property"."data_priority_code" IS NOT NULL))
ORDER BY
   "property"."state",
   CASE WHEN ("search_rate_max" IS NOT NULL) THEN 1 ELSE 2 
END,"search_rate_max",
   CASE WHEN ("search_rate_max" IS NOT NULL) THEN 1 ELSE 2 
END,"search_rate_max"
LIMIT 10 OFFSET 0

--------------------
RESULTS: ERROR: SELECT DISTINCT ON expressions must match initial ORDER 
BY expressions
SQL state: 42P10
--------------------

Finally, if you run this statement it works fine (removing one of the 
duplicate search_rate_max statements):

--------------------

SELECT DISTINCT ON
("property"."state",
  CASE WHEN ("search_rate_max" IS NOT NULL) THEN 1 ELSE 2 
END,"search_rate_max",
  property.id)
  property.id
FROM property WHERE (("property"."data_priority_code" IS NOT NULL))
ORDER BY
   "property"."state",
   CASE WHEN ("search_rate_max" IS NOT NULL) THEN 1 ELSE 2 
END,"search_rate_max",
   property.id LIMIT 10 OFFSET 0

--------------------

What's going on here? Am I doing something that isn't legitimate SQL? I 
can't see why having a duplicate CASE statement should foul things up 
like this? It's pretty clear (from additional testing not included in 
this email) that the duplicate "search_rate_max" CASE is causing the 
problem.

Thanks for any advice or suggestions on how to get this to run 
correctly. Is this a bug?

Basically I'm doing this as an optimization - I can get much better 
performance running the DISTINCT ON in some circumstances than using 
DISTINCT, but the edge case above is breaking my tests and preventing 
me from implementing the idea. The code is generated by an application 
layer which is not really paying attention to whether or not the two 
CASE statements apply to the same field or not (sometimes they do 
sometimes they don't)..

Thanks!

Steve


Responses

pgsql-sql by date

Next:From: Ivan Sergio BorgonovoDate: 2008-07-16 09:06:11
Subject: integrity check and visibility was: COPY equivalent for updates
Previous:From: A. KretschmerDate: 2008-07-16 05:39:11
Subject: Re: How to GROUP results BY month

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group