Re: tricky query

From: Cosimo Streppone <cosimo(at)streppone(dot)it>
To: Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>
Cc: Postgresql Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: tricky query
Date: 2005-06-28 15:20:13
Message-ID: 42C16AAD.4070903@streppone.it
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Merlin Moncure wrote:

> I need a fast way (sql only preferred) to solve the following problem:
> I need the smallest integer that is greater than zero that is not in the
> column of a table.
>
> I've already worked out a query using generate_series (not scalable) and
> pl/pgsql. An SQL only solution would be preferred, am I missing
> something obvious?

Probably not, but I thought about this "brute-force" approach... :-)
This should work well provided that:

- you have a finite number of integers. Your column should have a biggest
integer value with a reasonable maximum like 100,000 or 1,000,000.
#define YOUR_MAX 99999

- you can accept that query execution time depends on smallest integer found.
The bigger the found integer, the slower execution you get.

Ok, so:

Create a relation "integers" (or whatever) with every single integer from 1 to
YOUR_MAX:

CREATE TABLE integers (id integer primary key);
INSERT INTO integers (id) VALUES (1);
INSERT INTO integers (id) VALUES (2);
...
INSERT INTO integers (id) VALUES (YOUR_MAX);

Create your relation:

CREATE TABLE merlin (id integer primary key);
<and fill it with values>

Query is simple now:

SELECT a.id FROM integers a
LEFT JOIN merlin b ON a.id=b.id
WHERE b.id IS NULL
ORDER BY a.id LIMIT 1;

Execution times with 100k tuples in "integers" and
99,999 tuples in "merlin":

>\timing
Timing is on.
>select i.id from integers i left join merlin s on i.id=s.id where s.id is
null order by i.id limit 1;
99999

Time: 233.618 ms
>insert into merlin (id) values (99999);
INSERT 86266614 1
Time: 0.579 ms
>delete from merlin where id=241;
DELETE 1
Time: 0.726 ms
>select i.id from integers i left join merlin s on i.id=s.id where s.id is
null order by i.id limit 1;
241

Time: 1.336 ms
>

--
Cosimo

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Michael Stone 2005-06-28 15:27:50 read block size
Previous Message Bruno Wolff III 2005-06-28 15:12:46 Re: tricky query