Re: Re: best practice for moving millions of rows to child table when setting up partitioning?

From: "ktm(at)rice(dot)edu" <ktm(at)rice(dot)edu>
To: Mark Stosberg <mark(at)summersault(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Re: best practice for moving millions of rows to child table when setting up partitioning?
Date: 2011-04-27 20:50:56
Message-ID: 20110427205056.GA5868@staff-mud-56-27.rice.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

On Wed, Apr 27, 2011 at 04:17:16PM -0400, Mark Stosberg wrote:
> On 04/27/2011 10:48 AM, Mark Stosberg wrote:
> >
> > Hello,
> >
> > I'm working on moving a table with over 30 million to rows to be
> > partitioned. The table seeing several inserts per second. It's
> > essentially an activity log that only sees insert activity and is
> > lightly used for reporting, such that queries against it can safely be
> > disabled during a transition.
> >
> > I'm looking for recommendations for a way to do this that will be least
> > disruptive to the flow of inserts statements that will continue to
> > stream in.
> >
> > Here's the plan which seems best to me at the moment. Is it is
> > reasonable?
>
> I revised my plan based on feedback and mentioned resources here, and
> also due to my own concerns about the resource and performance impact of
> using the INSERT .. SELECT pattern on millions of rows at a time.
>
> Here's my revised plan for the transition, which avoids using large
> INSERT..SELECT statements, and only requires a minimal amount of the
> transition to work to happen once the application has started to insert
> data into the child tables. Seem reasonable?
>
> New Plan
> ---------
>
> 1. Create the child tables targeted to contain data from the parent as
> standalone tables (including archive tables and the current month).
>
> 2. Take a full database backup from this morning and extract the COPY
> statement for the parent table. Manually split it up by date to create
> multiple copy statements, one for each partition.
>
> 3. Run the COPY statements to load the data into each child table. So
> far, everything has happened outside of application access.
>
> 4. Find the max ID that has been inserted in the current child table.
>
> 5. INSERT .. SELECT the missing rows from the last backup from the parent
> table to the current child table to be. Again, note the max ID.
>
> 6. Now, during a maintenance window:
> - alter the child tables to inherit the parent
> - Set up the trigger which starts redirecting inserts
> from the parent table to the child table.
> - INSERT .. SELECT the file the final few missing rows from
> the parent to the current child
> - TRUNCATE then CLUSTER the parent table.
>

Hi Mark,

I used a similar process to migrate to a partitioned table from a
non-partitioned table. However, I the future partitions first and
put them into place. Then I updated the trigger to push to the
child tables. Then once the in-use "daily" or "weekly" table rolled
I back-filled the existing tables from the big table. Anyway, my
two cents.

Regards,
Ken

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message Scott Marlowe 2011-04-28 00:26:14 Re: Re: best practice for moving millions of rows to child table when setting up partitioning?
Previous Message ramon cruel 2011-04-27 20:29:10 Please, i want exit here