Re: tricky query

From: John A Meinel <john(at)arbash-meinel(dot)com>
To: Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>
Cc: Bruno Wolff III <bruno(at)wolff(dot)to>, Postgresql Performance <pgsql-performance(at)postgresql(dot)org>, elein <elein(at)varlena(dot)com>
Subject: Re: tricky query
Date: 2005-06-28 19:42:21
Message-ID: 42C1A81D.1060507@arbash-meinel.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Merlin Moncure wrote:

>>On Tue, Jun 28, 2005 at 12:02:09 -0400,
>> Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com> wrote:
>>
>>
>>>Confirmed. Hats off to you, the above some really wicked querying.
>>>IIRC I posted the same question several months ago with no response
>>>
>>>
>and
>
>
>>>had given up on it. I think your solution (smallest X1 not in X) is
>>>
>>>
>a
>
>
>>>good candidate for general bits, so I'm passing this to varlena for
>>>review :)
>>>
>>>SELECT t1.id+1 as id_new FROM id_test t1
>>> WHERE NOT EXISTS
>>> (SELECT t2.id FROM id_test t2 WHERE t2.id = t1.id+1)
>>> ORDER BY t1.id LIMIT 1;
>>>
>>>
>>You need to rework this to check to see if row '1' is missing. The
>>above returns the start of the first gap after the first row that
>>isn't missing.
>>
>>
>
>Correct.
>
>In fact, I left out a detail in my original request in that I had a
>starting value (easily supplied with where clause)...so what I was
>really looking for was a query which started at a supplied value and
>looped forwards looking for an empty slot. John's supplied query is a
>drop in replacement for a plpgsql routine which does exactly this.
>
>The main problem with the generate_series approach is that there is no
>convenient way to determine a supplied upper bound. Also, in some
>corner cases of my problem domain the performance was not good.
>
>Merlin
>
>
Actually, if you already have a lower bound, then you can change it to:

SELECT t1.id+1 as id_new FROM id_test t1
WHERE t1.id > id_min
AND NOT EXISTS
(SELECT t2.id FROM id_test t2 WHERE t2.id = t1.id+1)
ORDER BY t1.id LIMIT 1;

This would actually really help performance if you have a large table
and then empty entries start late.

On my system, where the first entry is 64k, doing where id > 60000
speeds it up back to 80ms instead of 1000ms.
John
=:->

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Merlin Moncure 2005-06-28 19:45:53 Re: tricky query
Previous Message Merlin Moncure 2005-06-28 19:36:29 Re: tricky query