Re: Picking out the most recent row using a time stamp column

From: Dave Crooke <dcrooke(at)gmail(dot)com>
To: Dave Johansen <davejohansen(at)gmail(dot)com>
Cc: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Picking out the most recent row using a time stamp column
Date: 2011-02-25 20:45:23
Message-ID: AANLkTi=N0LJp43Zvsyfj=Kd8f0wbF+4-ex8jy1zddWzP@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi Dave

Yes, 100% the best solution .... I did the same thing a while back, I just
have a separate copy of the data in a "latest" table and the Java code just
runs a second SQL statement to update it when writing a new record (I've
never been a trigger fan).

I found myself looking at the "find the latest" query again though in the
process of building a "demo mode" into our application, which will replay a
finite set of data on a rolling loop by moving it forward in time, and also
has to simulate the continuous updating of the "latest" table so the the
business logic will be appropriately fooled.

My next tweak will be to cache the "latest" table in the Java layer ;-)

Cheers
Dave

On Fri, Feb 25, 2011 at 10:50 AM, Dave Johansen <davejohansen(at)gmail(dot)com>wrote:

> On Thu, Feb 24, 2011 at 4:38 PM, Dave Crooke <dcrooke(at)gmail(dot)com> wrote:
>
>> Thanks to all .... I had a tickling feeling at the back of my mind that
>> there was a neater answer here. For the record, times (all from in-memory
>> cached data, averaged over a bunch of runs):
>>
>> Dependent subquery = 117.9 seconds
>> Join to temp table = 2.7 sec
>> DISTINCT ON = 2.7 sec
>>
>> So the DISTINCT ON may not be quicker, but it sure is tidier.
>>
>> Cheers
>> Dave
>
>
> I'm using 8.3.3 and I have a similar sort of setup and just thought I'd add
> another point of reference, here's the timing from doing the same sort of
> queries on my dataset of ~700,000 records with ~10,000 unique "id_key"s.
>
> I also added a 4th version that uses a permanent table that's
> auto-populated by a trigger with the rid of the most recent entry from the
> main table, so it's a simple join to get the latest entries.
>
> Dependent subquery = (killed it after it ran for over 10 minutes)
> Join on temp table = 1.5 seconds
> DISTINCT ON = 2.9 seconds
> Join on auto-populated table = 0.8 seconds
>
> Dave
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Bhakti Ghatkar 2011-02-26 07:00:58 Vacuum problem due to temp tables
Previous Message Dave Johansen 2011-02-25 16:50:40 Re: Picking out the most recent row using a time stamp column