Re: Update with last known location?

From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: James David Smith <james(dot)david(dot)smith(at)gmail(dot)com>
Cc: Erik Darling <edarling80(at)gmail(dot)com>, PGSQL-Novice <pgsql-novice(at)postgresql(dot)org>
Subject: Re: Update with last known location?
Date: 2014-01-30 20:08:30
Message-ID: 1391112510.13646.YahooMailNeo@web122305.mail.ne1.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

James David Smith <james(dot)david(dot)smith(at)gmail(dot)com> wrote:
> On 30 January 2014 16:45, Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:

>> update test_data x
>>   set the_geom = y.the_geom
>>   from test_data y
>>   where x.the_geom is null
>>     and y.ppid = x.ppid
>>     and y.the_geom is not null
>>     and y.point_time < x.point_time
>>     and not exists
>>         (
>>           select * from test_data z
>>             where z.ppid = y.ppid
>>               and z.the_geom is not null
>>               and z.point_time > y.point_time
>>               and z.point_time < x.point_time
>>         )
>> ;

> I can't run it on my actual data at the moment, or rather I don't want
> too, as I set the below query going an hour or so ago and thought I
> should let it finish really.
>
> If it hasn't finished when I come into work tomorrow (I'm leaving the
> office shortly) then I'll cancel it and give yours a crack instead. To
> my novice eye, your code looks like it'll be quicker than the below
> anyway.

> SELECT
>   data.ppid,
>   data.point_time,
>   CASE
>     WHEN data.the_geom IS NULL
>     THEN (
>       --Get all locations with an earlier time stamp for that ppid
>       SELECT geom.the_geom
>       FROM hybrid_location geom
>       WHERE data.ppid = geom.ppid
>       AND geom.point_time < data.point_time
>       AND geom.the_geom IS NOT NULL
>       AND NOT EXISTS (
>         -- Cull all but the most recent one
>         SELECT *
>         FROM hybrid_location cull
>         WHERE cull.ppid = geom.ppid
>         AND geom.the_geom IS NOT NULL
>         AND cull.point_time < data.point_time
>         AND cull.point_time > geom.point_time
>         AND cull.the_geom IS NOT NULL
>         )
>     )
>   ELSE data.the_geom
>   end
> FROM hybrid_location data;

Yeah, that's basically the same approach, but it uses a subquery
which I don't think can get pulled up -- so I think it will need to
do a lot of the work once for each row with a NULL the_geom column
where my version can do it once, period.

If you haven't tuned your configuration, you can probably speed up
any of these versions with a few tweaks to memory allocation and
cost factors.  The most significant for this query would probably
be to set work_mem to something around 25% of machine RAM divided
by the number of active connections you can have.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-novice by date

  From Date Subject
Next Message James David Smith 2014-01-31 14:50:21 Re: Update with last known location?
Previous Message Jim Nasby 2014-01-30 19:51:57 Re: Fwd: Request for error explaination || Adding a new integer in indextupleData Structure