Re: Switching Primary Keys to BigInt

From: Mohamed Wael Khobalatte <mkhobalatte(at)grubhub(dot)com>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Switching Primary Keys to BigInt
Date: 2020-07-26 18:15:18
Message-ID: CABZeWdy+9uKiW9DiC4tk=FyO8YpJX-+yOsfFvdWJRB9Egre_2w@mail.gmail.com
Lists: pgsql-general

On Tue, Jul 21, 2020 at 11:30 AM Mohamed Wael Khobalatte <
mkhobalatte(at)grubhub(dot)com> wrote:

> Hi all,
>
> We are running 9.6, and we are planning to move some primary keys from int
> to bigint because we are approaching the type limit. We understand this
> requires some downtime, but we want to know if there are things we can do
> to limit it.
>
> Here are our steps, with questions at the end.
>
> ALTER TABLE some_table ADD COLUMN new_id bigint;
> /* in batches, we update all the rows to new_id = id */
> CREATE UNIQUE INDEX CONCURRENTLY some_table_pkey_new ON
> some_table(new_id);
> /* take the apps down */
> BEGIN;
> LOCK TABLE some_table;
> UPDATE some_table SET new_id = id WHERE new_id IS NULL;
> ALTER SEQUENCE some_table_id_seq OWNED BY some_table.new_id;
> ALTER TABLE some_table ALTER COLUMN new_id SET DEFAULT
> nextval('some_table_id_seq'::regclass);
> ALTER TABLE some_table DROP CONSTRAINT some_table_pkey;
> ALTER TABLE some_table ADD CONSTRAINT some_table_pkey PRIMARY KEY USING
> INDEX some_table_pkey_new;
> ALTER TABLE some_table DROP COLUMN id;
> ALTER TABLE some_table RENAME COLUMN new_id to id;
> COMMIT;
>
> We are concerned with this step:
>
> > ALTER TABLE some_table ADD CONSTRAINT some_table_pkey PRIMARY KEY USING
> INDEX some_table_pkey_new;
>
> which requires a table scan. Is there a way to avoid that? Would a not
> null constraint on new_id, created as invalid first and validated later,
> help us? I tried it on a table with about 50 million records and saw the
> ALTER drop from 19 seconds to 8 seconds, which is inconclusive (both
> measured after restarts, with a cold cache). Is there another way to
> tell? Or does PG just have to do a sequential scan?
>
> If the constraint idea works, we would probably need to add a trigger to
> update new_id, but that's TBD.
>
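For readers wondering about the invalid-then-validated constraint idea from my question above: as far as I can tell, Postgres (9.6 included) has no `NOT NULL ... NOT VALID`, but the usual workaround is a CHECK constraint added as NOT VALID and validated separately. A sketch (constraint name is illustrative):

```sql
-- Added without scanning the table; needs only a brief lock.
ALTER TABLE some_table
  ADD CONSTRAINT some_table_new_id_not_null
  CHECK (new_id IS NOT NULL) NOT VALID;

-- Validation scans the table but takes only a weak lock,
-- so normal reads and writes continue meanwhile.
ALTER TABLE some_table
  VALIDATE CONSTRAINT some_table_new_id_not_null;
```

Note that on 9.6 the later ADD PRIMARY KEY step still performs its own scan regardless; only from PostgreSQL 12, if I read the docs correctly, can a validated CHECK constraint of this shape let SET NOT NULL skip its scan.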

The process I outlined above worked beautifully. Downtime was exactly what
I expected, i.e. equal to the time of a sequential scan of the table in
question (almost to the second). I am writing this in case someone out
there wants to adopt a similar mechanism.
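For anyone adopting it, the "in batches" backfill from step two can be driven by re-running a small UPDATE until it reports zero rows, keeping each transaction short (batch size is illustrative):

```sql
-- Backfill new_id a slice at a time; run repeatedly, e.g. from a
-- script, until the UPDATE affects 0 rows.
UPDATE some_table
SET new_id = id
WHERE id IN (
  SELECT id
  FROM some_table
  WHERE new_id IS NULL
  LIMIT 10000
);
```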

Thank you all for your valuable inputs.

On Wed, Jul 22, 2020 at 4:52 PM Mohamed Wael Khobalatte <
mkhobalatte(at)grubhub(dot)com> wrote:

>
>
> On Wed, Jul 22, 2020 at 4:45 PM Ireneusz Pluta/wp.pl <ipluta(at)wp(dot)pl> wrote:
>
>> On 2020-07-21 at 17:30, Mohamed Wael Khobalatte wrote:
>> > we are planning to move some primary keys from int to bigint because we
>> are approaching the type limit
>> If that does not break your business logic, you might arrange to use the
>> negative half of the ::int value range. Ugly, but this might at least buy
>> you some time before finding the definitive and elegant way, if you are
>> under some pressure. I do not recommend this, but it is what once saved
>> my life (or at least one night), after I realized that my PK had already
>> reached the limit :-).
>>
>
> Very clever. I think we are set with the current approach. The issue was
> more about how much downtime we would need, not how fast we are
> approaching the limit (which is also a real issue, but not the concern of
> this thread).
>
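For completeness, the negative-range stopgap Ireneusz describes can be done by pointing the sequence at the unused half of the int range, assuming nothing in the application chokes on negative ids (sequence name is illustrative):

```sql
-- Count down from -1 through the negative half of the int range.
ALTER SEQUENCE some_table_id_seq
  MINVALUE -2147483648
  INCREMENT BY -1
  RESTART WITH -1;
```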
