Re: speeding up big query lookup

From: "Silvela, Jaime \(Exchange\)" <JSilvela(at)Bear(dot)com>
To: "macgillivary" <macgillivary(at)gmail(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: speeding up big query lookup
Date: 2006-08-29 13:29:04
Message-ID: B0D2EF413B7344489985137E6DDB6C430BF701@whexchmb14.bsna.bsroot.bear.com
Lists: pgsql-general

Thanks for the tips, am

Actually, your suggestion is equivalent to JOINing the table with a
GROUP BY copy of itself, and EXPLAIN shows both versions using the same
index and aggregates. Just a matter of style. Your previous suggestion
from the book works well too, but I actually prefer the JOIN method,
since that allows me to set the object_id and/or object_val_type values
in only one place.
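
A minimal sketch of the JOIN method described above, run against an in-memory SQLite database through Python's sqlite3 module only so that it is self-contained; the SQL itself should carry over to PostgreSQL. The column name object_val_type_id, the val column, and the sample rows are assumptions, not the real schema. Note that the object and type filters appear only once, inside the GROUP BY subquery.

```python
import sqlite3

# Hypothetical schema and sample rows: only the table name and the three
# key columns are taken from the thread, the rest is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE object_val (
        object_id          INTEGER NOT NULL,
        object_val_type_id INTEGER NOT NULL,
        observation_date   TEXT    NOT NULL,  -- ISO dates sort correctly as text
        val                REAL
    );
    INSERT INTO object_val VALUES
        (1, 10, '2006-08-01', 1.0),
        (1, 10, '2006-08-15', 2.0),
        (1, 10, '2006-08-29', 3.0),
        (1, 20, '2006-08-10', 9.0),
        (2, 10, '2006-08-20', 7.0);
""")

# Join the table with a GROUP BY copy of itself. The subquery finds the
# latest observation date on or before the reference date; the join then
# pulls the full row for that date.
LATEST_VIA_JOIN = """
    SELECT v.object_id, v.object_val_type_id, v.observation_date, v.val
    FROM object_val AS v
    JOIN (
        SELECT object_id, object_val_type_id,
               max(observation_date) AS observation_date
        FROM object_val
        WHERE object_id = ?
          AND object_val_type_id = ?
          AND observation_date <= ?
        GROUP BY object_id, object_val_type_id
    ) AS latest
      ON  v.object_id          = latest.object_id
      AND v.object_val_type_id = latest.object_val_type_id
      AND v.observation_date   = latest.observation_date
"""

def latest_observation(object_id, type_id, reference_date):
    """Return the latest row at or before reference_date, or None."""
    params = (object_id, type_id, reference_date)
    return conn.execute(LATEST_VIA_JOIN, params).fetchone()

row = latest_observation(1, 10, "2006-08-20")
print(row)
```

If no observation exists on or before the reference date, the subquery returns no rows and latest_observation returns None.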

Tom's method is faster, but it is somewhat obscure: it is finely tuned
to a very specific behavior of DISTINCT ON and is harder to read than
the others.

I fully agree that it is annoying to keep another table with triggers.
And of course, that table needs to be indexed too, or it's worthless.
I'm wondering how much extra time the db spends running all those
indexes and triggers, and I'll probably dismantle that in favor of the
composite index and the queries suggested so far.
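
For reference, a rough sketch of what the composite index could look like, again using an in-memory SQLite database from Python purely so the example runs on its own. The index name object_val_latest_idx, the column name object_val_type_id, and the sample rows are hypothetical. The CREATE INDEX statement has the same form in PostgreSQL, but the query plan printed here is SQLite's and says nothing about what the PostgreSQL planner would choose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: only the key column names come from the thread.
    CREATE TABLE object_val (
        object_id          INTEGER NOT NULL,
        object_val_type_id INTEGER NOT NULL,
        observation_date   TEXT    NOT NULL,
        val                REAL
    );
    -- Composite index: equality columns first, the date column last, so a
    -- lookup for one object and type can range-scan on the date.
    CREATE INDEX object_val_latest_idx
        ON object_val (object_id, object_val_type_id, observation_date);
    INSERT INTO object_val VALUES
        (1, 10, '2006-08-01', 1.0),
        (1, 10, '2006-08-15', 2.0),
        (1, 10, '2006-08-29', 3.0),
        (2, 10, '2006-08-20', 7.0);
""")

LATEST_DATE = """
    SELECT max(observation_date)
    FROM object_val
    WHERE object_id = ?
      AND object_val_type_id = ?
      AND observation_date <= ?
"""
params = (1, 10, "2006-08-20")

# Latest observation date on or before the reference date.
latest_date = conn.execute(LATEST_DATE, params).fetchone()[0]
print(latest_date)

# Print SQLite's plan so you can check whether the index is picked up.
for step in conn.execute("EXPLAIN QUERY PLAN " + LATEST_DATE, params):
    print(step)

# Names of the indexes defined on the table.
index_names = [r[1] for r in conn.execute("PRAGMA index_list('object_val')")]
```

In PostgreSQL the equivalent check would be to run EXPLAIN on the real queries, as already done above for the two query styles.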

I'll definitely check that book, I've been looking for something like
that.

Thanks
Jaime

-----Original Message-----
From: pgsql-general-owner(at)postgresql(dot)org
[mailto:pgsql-general-owner(at)postgresql(dot)org] On Behalf Of macgillivary
Sent: Monday, August 28, 2006 10:14 PM
To: pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] speeding up big query lookup

Just for fun, another approach since I believe pg supports it:

select whateverFields
from object_val as outer_val
where (outer_val.object_id, outer_val.object_val_type_id,
       outer_val.observation_date) IN
      (select inner_val.object_id, inner_val.object_val_type_id,
              max(inner_val.observation_date)
       from object_val as inner_val
       where inner_val.object_id = somevalueForObjectX
         and inner_val.object_val_type_id = someValueForTypeA
         and inner_val.observation_date <= yourReferenceDate
       group by inner_val.object_id, inner_val.object_val_type_id)

The reason these subqueries should run quickly is that
object_id, object_val_type, observation_date make up a composite key,
so the subquery should execute extremely fast, thus eliminating the
majority of the data when you want to display or act on other fields
from object_val (the outer query). I suppose if you don't need any
further information from object_val, and you are happy with the speeds,
Tom's method is smooth.

Adding the order by clause will take you out of the 'relational world'
and thus slow you down. My fear with the triggers and the separate
snapshot is that the delays are spread out and add questionable
complexity and potentially unnecessary overhead to the application.
Something to consider (although admittedly it is arguably a weak
consideration in some circumstances) is the extra space, indexes, and
other factors such as additional time for backup routines (and
restoration) the extra table creates.

Best of luck,
am

"Silvela, Jaime (Exchange)" wrote:
> No, you can make this work just fine if you JOIN right.
> Your way is a more concise way of expressing it, though.
>
> Tom's trick
>
> SELECT DISTINCT ON (object_id, object_val_type_id) * from object_val
> ORDER BY object_id DESC, object_val_type_id DESC, observation_date
> DESC
>
> Runs about twice as fast as the GROUP BY ... HAVING, but definitely
> not as fast as keeping a separate table with only the latest
> observations, updated by triggers. I'll be testing out the differences
> in overall performance for my applications.
>
> Thanks for the suggestions,
> Jaime
>


