Re: tricky query

From: Dawid Kuroczko <qnex42(at)gmail(dot)com>
To: John A Meinel <john(at)arbash-meinel(dot)com>
Cc: Merlin Moncure <merlin(dot)moncure(at)rcsonline(dot)com>, Bruno Wolff III <bruno(at)wolff(dot)to>, Postgresql Performance <pgsql-performance(at)postgresql(dot)org>, elein <elein(at)varlena(dot)com>
Subject: Re: tricky query
Date: 2005-06-29 06:36:52
Message-ID: 758d5e7f05062823361c8840f4@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 6/28/05, John A Meinel <john(at)arbash-meinel(dot)com> wrote:

> Actually, if you already have a lower bound, then you can change it to:
>
> SELECT t1.id+1 as id_new FROM id_test t1
> WHERE t1.id > id_min
> AND NOT EXISTS
> (SELECT t2.id FROM id_test t2 WHERE t2.id = t1.id+1)
> ORDER BY t1.id LIMIT 1;
>
> This would actually really help performance if you have a large table
> and then empty entries start late.

You can also boost performance by creating a functional index!

CREATE UNIQUE INDEX id_test_id1_index ON id_test ((id+1));

...and then joining two tables and filtering results. PostgreSQL (8.x)
will do Merge Full Join which will use both the indexes:

SELECT t2.id+1 FROM id_test t1 FULL OUTER JOIN id_test t2 ON (t1.id =
t2.id+1) WHERE t1.id IS NULL LIMIT 1;

Limit (cost=0.00..1.52 rows=1 width=4)
-> Merge Full Join (cost=0.00..1523122.73 rows=999974 width=4)
Merge Cond: ("outer".id = ("inner".id + 1))
Filter: ("outer".id IS NULL)
-> Index Scan using id_test_pkey on id_test t1
(cost=0.00..18455.71 rows=999974 width=4)
-> Index Scan using id_test_id1_index on id_test t2
(cost=0.00..1482167.60 rows=999974 width=4)
(6 rows)

...the only drawback is having to keep two indexes instead of just one.
But for large tables I think it is really worth it

For my test case, the times are (1-1000000 range with 26 missing
rows):
NOT EXISTS -- 670ms
NOT IN -- 1800ms
indexed FULL OUTER -- 267ms

Regards,
Dawid

PS: Does it qualify for General Bits? ;-)))

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Martin Fandel 2005-06-29 09:02:49 Re: could not receive data from client: Connection timed out Error
Previous Message Matthew Nuzum 2005-06-29 02:54:47 Re: optimized counting of web statistics