Skip site navigation (1) Skip section navigation (2)

Alter/update large tables - VERRRY annoying behaviour!

From: Dmitry Tkach <dmitry(at)openratings(dot)com>
To: pgsql-admin(at)postgresql(dot)org(dot)pgsql-bugs(at)postgresql(dot)org(dot)pgsql-general(at)postgresql(dot)org(dot)pgsql-patches(at)postgresql(dot)org
Subject: Alter/update large tables - VERRRY annoying behaviour!
Date: 2002-04-15 17:07:20
Message-ID: 3CBB08C8.7000705@openratings.com (view raw or flat)
Thread:
Lists: pgsql-adminpgsql-bugspgsql-generalpgsql-patches
Hi, everybody!

I was wonderring if it is a known "feature" to begin with, and, if there are any plans to fix
it in future?
I had two very large tables in my database (about 30 million rows each), connected by a foreign key,
and wanted to merge them together...

Something like this:

create table a
(
    id int primary key,
    some_data int
);
create table b
(
     id int unique references a,
     other_data int
);

So, what I did was:

alter table a add other_data int;
update a set other_data = b.other_data from b where b.id=a.id;

This took me awfully long, but worked (I guess).
I say 'I guess', because I wasn't able so far to verify that - when I triued to do

select * from a limit 1;

It just hungs on me ... at least, it looks like it does.

Lucky me, I have compiled the backend from sources with full debug info, because if I hadn't done that,
(as most users), I would certainly had thought, that my database is hopelessly corrupted, and would have to
recreate it from scratch :-(
Instead, I loaded the whole thing into a debugger, because that seems to be the only way to figure out what
the hell it is thinking about...
So, what I found out was that it seems to have recreated my entire table when I updated it, and left all the
old tuples in it as well, so, my 'select *...limit 1' query was cycling through 30 million deleted tuples, trying
to find the first one that was still valid, and that's waht was taking that long time...

First of all, a question for you - is ANY update to a table equivalent (in this respect) to a delete+insert?
Or is my problem specific to the fact that I have altered the table and add new columns?

Now, I understand, that, if I vacuum'ed it, the problem would have been resolved... The problems I have with it
though are:

- As I said, the behaviour was so unexpected, I would have trashed the whole databse, if I wasn't able to debug
it... If there is no other possible solution, I think, at the very least, it should give the user some indication
that it's not hopelessly hung, when doing that query...

- Vacuum, isn't the speediest thing in the world too (it's been running for a hour now, and still has not finished).
I was hoping to complete modifying my schema first, and then just vacuum everything once. So, it would be
REALLY, REALLY helpful for situations like that, if PG was smart enough to keep track somehow of those deleted
tuples, to avoid having to scan through them all every time...
In my particular situation, the solution would be trivial (just remembering the address of the first valid
tuple would suffice - because the entire table was updated)... I am not familiar enough with internals to suggest
anything more general than this, but EVEN fixing only this particular scenario, would, I believe, be extremely
useful....

Do you agree?

Thanks a lot!

Dima.


Responses

pgsql-patches by date

Next:From: Marc G. FournierDate: 2002-04-15 17:42:02
Subject: Re: Commands/ directory reorganisation
Previous:From: Rod TaylorDate: 2002-04-15 14:56:25
Subject: Doc fix for INSERT ... (DEFAULT, ...)

pgsql-admin by date

Next:From: Tom LaneDate: 2002-04-15 17:32:10
Subject: Re: Extending table name max length
Previous:From: Tom LaneDate: 2002-04-15 15:19:44
Subject: Re: Max Number of Databases on a Server.

pgsql-bugs by date

Next:From: Neil ConwayDate: 2002-04-15 18:24:51
Subject: Re: Alter/update large tables - VERRRY annoying behaviour!
Previous:From: Michael LoftisDate: 2002-04-15 07:48:09
Subject: Re: Bug #630: date/time storage problem: timestamp parsed

pgsql-general by date

Next:From: Andrew GouldDate: 2002-04-15 17:25:09
Subject: Re: Export From Postresql to a DBF File Format
Previous:From: TarabasDate: 2002-04-15 17:02:16
Subject: Strange Update-Bug in postgres (is it a feature?) ?!

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group