Re: How to speed up min/max(id) in 50M rows table?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: henk de wit <henk53602(at)hotmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: How to speed up min/max(id) in 50M rows table?
Date: 2007-10-12 22:03:46
Message-ID: 5754.1192226626@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

henk de wit <henk53602(at)hotmail(dot)com> writes:
> The plan looks like this:

> "Result (cost=3D0.37..0.38 rows=3D1 width=3D0) (actual time=3D184231.636..=
> 184231.638 rows=3D1 loops=3D1)"
> " InitPlan"
> " -> Limit (cost=3D0.00..0.37 rows=3D1 width=3D8) (actual time=3D18423=
> 1.620..184231.622 rows=3D1 loops=3D1)"
> " -> Index Scan Backward using trans_payment_id_index on transact=
> ions (cost=3D0.00..19144690.58 rows=3D51122691 width=3D8) (actual time=3D1=
> 84231.613..184231.613 rows=3D1 loops=3D1)"
> " Filter: (payment_id IS NOT NULL)"
> "Total runtime: 184231.755 ms"

The only way I can see for that to be so slow is if you have a very
large number of rows where payment_id is null --- is that the case?

There's not a lot you could do about that in existing releases :-(.
In 8.3 it'll be possible to declare the index as NULLS FIRST, which
moves the performance problem from the max end to the min end ...

> select min(time) from transactions where payment_id =3D 67

> There are indexes on both the time (a timestamp with time zone) and payment=
> _id (a bigint) columns.

Creating indexes at random with no thought about how the system could
use them is not a recipe for speeding up your queries. What you'd need
to make this query fast is a double-column index on (payment_id, time)
so that a forward scan on the items with payment_id = 67 would
immediately find the minimum time entry. Neither of the single-column
indexes offers any way to find the desired entry without scanning over
lots of unrelated entries.

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Alan Hodgson 2007-10-12 22:04:40 Re: How to speed up min/max(id) in 50M rows table?
Previous Message henk de wit 2007-10-12 21:48:59 Re: Huge amount of memory consumed during transaction