Re: Time to move table to new tablespace

From: Jason Buberel <jason(at)altosresearch(dot)com>
To: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
Cc: PostgreSQL <pgsql-general(at)postgresql(dot)org>, "Tom Sparks (Applied Operations)" <tsparks(at)appliedops(dot)net>
Subject: Re: Time to move table to new tablespace
Date: 2012-01-09 19:10:56
Message-ID: CAAPEApHmfdDN=mkJwzcv5R9e3y6At9T8YjpSbXZkkfTmD0rX5g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Craig,

Good suggestion - I have experimented with table partitioning in the past,
but the impact on performance was considerable.

In our case, we have one large table with real estate statistics arranged
in time (weekly for the last five years) and geography (every zip in the
US). You could imagine a date-range partition strategy (each year is a
separate partition, for example) or a spatial (10 partitioned tables based
on the modulus 10 of ZIP code, perhaps) approach.

However, our query patterns span both data dimensions:

"Select median price for every zip code as of 2012-01-06" (customer exports)
"Select median price for 94086 from 2005-01-01 through 2012-01-06"
(charting apps)

So by partitioning in one dimension we impact queries in the other.

If you have a brilliant solution to problems like this, we'd be happy to
hire you (not a joke).

Cheers,
Jason

On Sun, Jan 8, 2012 at 5:12 PM, Craig Ringer <ringerc(at)ringerc(dot)id(dot)au> wrote:

> On 7/01/2012 10:52 PM, Jason Buberel wrote:
>
> I'm considering the migration of an existing large (2.3TB) table to a new
> tablespace. The table size, according to the '\dt+' command:
>
> public | city_summary | table | altosresearch | 2345 GB |
>
> Are there any considerations - besides the usual disk and network IO
> constraints - that I need to take into account when estimating the amount
> of time that would be required for the following commands to complete?
>
> psql> create tablespace 'newstorage' location '/some/new/path';
> psql> alter table city_summary set tablespace = 'newstorage';
>
> Any other recommendations are welcome, such as "You are an idiot for not
> using partitioning in the first place" :)
>
>
> Maybe you should create a set of partitioned tables in the new tablespace,
> copy your data over, then drop the old table and rename the partition to
> the old table's name instead? If the big table keeps on accumulating data
> (via INSERTs) you can add a trigger that mirrors all updates to the
> partition while the copy runs.
>
> This won't work so well if the big table has UPDATEs and DELETEs too,
> since you can't delete or update records from a trigger before they've been
> copied to the new table and committed.
>
> --
> Craig Ringer
>

--
Jason L. Buberel
CTO, Altos Research
http://www.altosresearch.com/
650.603.0907

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message akp geek 2012-01-09 19:16:46 Re: queries timeout during backup postgres database
Previous Message rama 2012-01-09 18:45:40 Re: How do you change the size of the WAL files?