Re: How do I bump a row to the front of sort efficiently

From: Sam Saffron <sam(dot)saffron(at)gmail(dot)com>
To: BladeOfLight16 <bladeoflight16(at)gmail(dot)com>
Cc: PGSQL Mailing List <pgsql-general(at)postgresql(dot)org>
Subject: Re: How do I bump a row to the front of sort efficiently
Date: 2015-02-04 10:53:52
Message-ID: CAAtdryN9qKaspTuGO4UTMRqiO38kgwgTOiQGR=3yqdDQyNFQ3g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

The union hack may be able to work, but what I want is slightly more complex

I want this to work efficiently, even without the case it chokes:

explain select * from testing order by id in (100,2,-1) desc, id limit 30;
QUERY PLAN
---------------------------------------------------------------------------
Limit (cost=4869.45..4869.52 rows=30 width=9)
-> Sort (cost=4869.45..5119.45 rows=100000 width=9)
Sort Key: ((id = ANY ('{100,2,-1}'::integer[]))), id
-> Seq Scan on testing (cost=0.00..1916.00 rows=100000 width=9)
(4 rows)

I need to be able to offset and limit the union hack in a view, which
is proving very tricky.

On Wed, Feb 4, 2015 at 9:15 PM, BladeOfLight16 <bladeoflight16(at)gmail(dot)com> wrote:
> On Tue, Feb 3, 2015 at 11:28 PM, Sam Saffron <sam(dot)saffron(at)gmail(dot)com> wrote:
>>
>> Note: I still consider this a bug/missing feature of sorts since the
>> planner could do better here, and there is no real clean way of
>> structuring a query to perform efficiently here, which is why I
>> erroneously cross posted this to hacker initially:
>
>
> No, it should not be considered a bug or a deficiency. You're telling the
> system to use a computed value that uses an arbitrary logic construct for
> sorting, and that's before you actually give it a filter to work with.
> You're also trying to abuse ORDER BY to make LIMIT do filtering. How do you
> expect that to go? I would expect the planner to do exactly what you told it
> you wanted: sort the entire table by that computed value, and it would have
> to do a sequential scan to compute the value for every row before it knew
> which ones came first for the LIMIT.
>
> CASE is well known to cause optimization problems; arbitrary conditional
> logic isn't especially conducive to a planner optimizing things. In general,
> the planner has no idea what the logic really means, so the planner would
> have to have some kind of special logic trying to pick up on this case. Your
> particular use case is uncommon; why should the planner code be junked up
> with a thousand little optimizations for uncommon situations like this
> (which would make it unmaintainable) when you already have a reasonable
> alternative? PG is great for a reason: the devs have made a lot of fantastic
> choices in designing it. Trying to keep the planner relatively simple is one
> of them, if I recall correctly.
>
> What you want is well represented by a UNION query: you want it to fetch one
> particular row by ID, and then you want to tack on 29 other particular rows
> based on a particular sort order and possibly an offset. These are two
> completely disparate ways of fetching data; of course the most optimal way
> is going to be to essentially write two queries and put the results
> together. That's also going to be the clearest way of writing the query, so
> the next person to work on it knows what you were doing.
>
> And that aside, you don't even need to use CASE. You could've just used (id
> = 1), which would give you a boolean result. You can most certainly sort by
> a boolean. (I believe PG sorts TRUE first.) You should see if that gets
> optimized at all. (If it doesn't, I won't be surprised and everything else I
> said still holds.)
>
> By the way, you can do better on your UNION query:
>
> select * from topic
> where id = 1000
> union all
> (select * from topic
> where id <> 1000
> order by bumped_at desc
> limit 29)
>
> I imagine your original would be at risk of LIMITing out the very row you
> seek to get at the "top", since you don't have an ORDER BY to tell it which
> ones to keep during the outer LIMIT.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Merlin Moncure 2015-02-04 16:25:42 Searching postgres soruces (was: Re: array in a store procedure in C)
Previous Message BladeOfLight16 2015-02-04 10:15:07 Re: How do I bump a row to the front of sort efficiently

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen R. van den Berg 2015-02-04 11:02:41 Re: Table description in the data file (Re: pg_rawdump)
Previous Message Fujii Masao 2015-02-04 10:22:39 Re: pg_basebackup may fail to send feedbacks.