Re: How do I bump a row to the front of sort efficiently

From: BladeOfLight16 <bladeoflight16(at)gmail(dot)com>
To: Sam Saffron <sam(dot)saffron(at)gmail(dot)com>
Cc: PGSQL Mailing List <pgsql-general(at)postgresql(dot)org>
Subject: Re: How do I bump a row to the front of sort efficiently
Date: 2015-02-04 10:15:07
Message-ID: CA+=1U=X+nWxAPztwALPyr48QVhSwwCX8x0Gui9sCh6E=ezAVRA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On Tue, Feb 3, 2015 at 11:28 PM, Sam Saffron <sam(dot)saffron(at)gmail(dot)com> wrote:

> Note: I still consider this a bug/missing feature of sorts since the
> planner could do better here, and there is no real clean way of
> structuring a query to perform efficiently here, which is why I
> erroneously cross posted this to hacker initially:
>

No, it should not be considered a bug or a deficiency. You're telling the
system to use a *computed value* that uses an *arbitrary logic construct *for
*sorting*, and that's *before* you actually give it a filter to work with.
You're also trying to abuse ORDER BY to make LIMIT do filtering. How do you
expect that to go? I would expect the planner to do exactly what you told
it you wanted: sort the entire table by that computed value, and it would
have to do a sequential scan to compute the value for every row before it
knew which ones came first for the LIMIT.

CASE is well known to cause optimization problems; arbitrary conditional
logic isn't especially conducive to a planner optimizing things. In
general, the planner has *no* idea what the logic really means, so the
planner would have to have some kind of special logic trying to pick up on
this case. Your particular use case is uncommon; why should the planner
code be junked up with a thousand little optimizations for uncommon
situations like this (which would make it unmaintainable) when you already
have a reasonable alternative? PG is great for a reason: the devs have made
a lot of fantastic choices in designing it. Trying to keep the planner
relatively simple is one of them, if I recall correctly.

What you want is well represented by a UNION query: you want it to fetch
one particular row by ID, and then you want to tack on 29 other particular
rows based on a particular sort order and possibly an offset. These are two
completely disparate ways of fetching data; of course the most optimal way
is going to be to essentially write two queries and put the results
together. That's also going to be the *clearest* way of writing the query,
so the next person to work on it knows what you were doing.

And that aside, you don't even need to use CASE. You could've just used (id
= 1), which would give you a boolean result. You can most certainly sort by
a boolean. (I believe PG sorts TRUE first.) You should see if that gets
optimized at all. (If it doesn't, I won't be surprised and everything else
I said still holds.)

By the way, you can do better on your UNION query:

select * from topic
where id = 1000
union all
(select * from topic
where id <> 1000
order by bumped_at desc
limit 29)

I imagine your original would be at risk of LIMITing out the very row you
seek to get at the "top", since you don't have an ORDER BY to tell it which
ones to keep during the outer LIMIT.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Sam Saffron 2015-02-04 10:53:52 Re: How do I bump a row to the front of sort efficiently
Previous Message Sam Saffron 2015-02-04 04:28:51 Re: How do I bump a row to the front of sort efficiently

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2015-02-04 10:22:39 Re: pg_basebackup may fail to send feedbacks.
Previous Message Fujii Masao 2015-02-04 10:02:25 Re: Docs: CREATE TABLESPACE minor markup fix