Re: ALTER TABLE speed (adding foreign keys)

From: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
To: Shaw Terwilliger <sterwill(at)sourcegear(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: ALTER TABLE speed (adding foreign keys)
Date: 2001-02-15 15:57:44
Message-ID: Pine.BSF.4.21.0102150749170.6101-100000@megazone23.bigpanda.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, 15 Feb 2001, Shaw Terwilliger wrote:

> I have a table, contacts, with two fields, id (int), contactid (int).
> I have another table, users, with a bunch of fields, but one important
> one, id (SERIAL), which is the table key.
>
> I needed to test my database setup (and software) with 1.5 million users,
> each of whom will have (on average), 10 contacts. So for each user in
> the users table, I'll need ten records in the contacts table.
>
> I wrote a little script to spew some sample data into a file for
> COPY into the tables. Since 1.5 million * 10 contacts is 15 million
> rows, I created my tables without foreign keys (both id and contactid
> will eventually be foreign keys into users, since I often select on both
> of them) and constraints (CHECK id <> contactid).
>
> The COPY goes much, much faster with these constraints absent from my tables.
> So I go to ALTER TABLE contacts, to add these foreign keys. One backend
> fires up, allocates a few MB of RAM, and takes a long, long time to check
> all the existing (15 million) rows. At least, that's what I think it's doing.
>
> postgres 394 98.1 1.6 7752 4300 pts/0 R< Feb14 632:41 /usr/lib/postgresql/bin/postgres localhost sterwill im ALTER
>
> The process has taken approximately 10 hours of CPU time so far.
> I understand placing a foreign key constraint on an existing table can be
> a very expensive operation, but I just wanted to make sure I won't be
> "waiting forever" on this process. I've got more tables to ALTER, and if
> they all take this long, I'll just wipe the database, load the schema
> again (_with_ the constraints), and just let the COPY take a day or two.

Currently the alter table implementation effectively means you're just
deferring all the trigger calls, so it's basically going to be doing the
15 million trigger calls. It should finish, but I'm really uncertain of
how long it might take. I've considered trying to do it as one query
which should be faster but requires a second copy of the logic to figure
out what the fk constraint means so I haven't done it since we're not
quite done with specifying the constraints fully yet.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2001-02-15 15:59:11 Re: regular expression substittion function?
Previous Message Michael Ansley 2001-02-15 15:54:07 RE: DSN-less connection to Postgres-database