Re: Index impact on update?

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Index impact on update?
Date: 2017-01-05 00:18:57
Message-ID: dfbdbe34-9037-48cc-1623-ab7fdc8fcf29@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 01/04/2017 05:59 PM, Israel Brewster wrote:
> Short version:
> Do indexes impact the speed of an UPDATE, even when the indexed columns
> aren't changing?
>

They shouldn't, as long as the updated tuple can be updated on the same
page (8kB chunk of data). In that case we can do a HOT update for the
row, without updating the index(es).

But as you're updating the whole table, that would require about 50% of
all pages to be free, which is unlikely to be true. So perhaps some
updates can proceed without touching indexes, but most can't.

> Details:
> I have a table containing geographical data (Latitude, longitude, and
> elevation) with 406,833,705 records. The Latitude and Longitude columns
> are indexed. In order to better utilize the data, I've been looking into
> PostGIS, and decided I wanted to add a "Location" column with PostGIS
> type "GEOGRAPHY(point)". I then tried to populate it from the existing
> latitude/longitude data using the following query:
>
> UPDATE data SET
> location=ST_GeographyFromText('SRID=4326;POINT('||lng::text||'
> '||lat::text||')');
>
> I expected this update to take quite a while, since it has 406 million
> rows to update, but at this point it's been over 19 hours since I
> started the query, and it still hasn't completed.
>
> I'm wondering if the presence of the indexes could be slowing things
> down even though the indexed columns aren't being updated? Would I be
> better off canceling the update query, dropping the indexes, and trying
> again? Or is more likely that the update query is "almost" done, and it
> would be better to just let it run it's course? Or is there an even
> better option, such as perhaps exporting the data, adding the additional
> column in a text editor, and re-importing the data with a COPY command?
>

As explained above, it's likely that such full-table update has to
modify the indexes anyway, making it much more expensive. Without
additional information it's however impossible to confirm that's what's
causing the long update in this case - there may be other bits slowing
it down - e.g. foreign keys checks, triggers.

CREATE TABLE AS SELECT would not pay any of those costs, of course.
Also, if you're running with wal_level=minimal, it would not have to
write the changes into WAL, while the regular UPDATE has to do that.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Gwork 2017-01-05 01:23:43 Postgresql query HAVING do not work
Previous Message DrakoRod 2017-01-05 00:13:49 Re: could not load library "$libdir/sslutils": in pg_upgrade process