Quick Links

Alter/update large tables - VERRRY annoying behaviour!

From:	Dmitry Tkach <dmitry(at)openratings(dot)com>
To:	pgsql-admin(at)postgresql(dot)org(dot)pgsql-bugs(at)postgresql(dot)org(dot)pgsql-general(at)postgresql(dot)org(dot)pgsql-patches(at)postgresql(dot)org
Subject:	Alter/update large tables - VERRRY annoying behaviour!
Date:	2002-04-15 17:07:20
Message-ID:	3CBB08C8.7000705@openratings.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-admin pgsql-bugs pgsql-general pgsql-patches

Hi, everybody!

I was wonderring if it is a known "feature" to begin with, and, if there are any plans to fix
it in future?
I had two very large tables in my database (about 30 million rows each), connected by a foreign key,
and wanted to merge them together...

Something like this:

create table a
(
id int primary key,
some_data int
);
create table b
(
id int unique references a,
other_data int
);

So, what I did was:

alter table a add other_data int;
update a set other_data = b.other_data from b where b.id=a.id;

This took me awfully long, but worked (I guess).
I say 'I guess', because I wasn't able so far to verify that - when I triued to do

select * from a limit 1;

It just hungs on me ... at least, it looks like it does.

Lucky me, I have compiled the backend from sources with full debug info, because if I hadn't done that,
(as most users), I would certainly had thought, that my database is hopelessly corrupted, and would have to
recreate it from scratch :-(
Instead, I loaded the whole thing into a debugger, because that seems to be the only way to figure out what
the hell it is thinking about...
So, what I found out was that it seems to have recreated my entire table when I updated it, and left all the
old tuples in it as well, so, my 'select *...limit 1' query was cycling through 30 million deleted tuples, trying
to find the first one that was still valid, and that's waht was taking that long time...

First of all, a question for you - is ANY update to a table equivalent (in this respect) to a delete+insert?
Or is my problem specific to the fact that I have altered the table and add new columns?

Now, I understand, that, if I vacuum'ed it, the problem would have been resolved... The problems I have with it
though are:

- As I said, the behaviour was so unexpected, I would have trashed the whole databse, if I wasn't able to debug
it... If there is no other possible solution, I think, at the very least, it should give the user some indication
that it's not hopelessly hung, when doing that query...

- Vacuum, isn't the speediest thing in the world too (it's been running for a hour now, and still has not finished).
I was hoping to complete modifying my schema first, and then just vacuum everything once. So, it would be
REALLY, REALLY helpful for situations like that, if PG was smart enough to keep track somehow of those deleted
tuples, to avoid having to scan through them all every time...
In my particular situation, the solution would be trivial (just remembering the address of the first valid
tuple would suffice - because the entire table was updated)... I am not familiar enough with internals to suggest
anything more general than this, but EVEN fixing only this particular scenario, would, I believe, be extremely
useful....

Do you agree?

Thanks a lot!

Dima.

Responses

Re: Alter/update large tables - VERRRY annoying behaviour! at 2002-04-15 18:24:51 from Neil Conway

Browse pgsql-admin by date

	From	Date	Subject
Next Message	Tom Lane	2002-04-15 17:32:10	Re: Extending table name max length
Previous Message	Tom Lane	2002-04-15 15:19:44	Re: Max Number of Databases on a Server.

Browse pgsql-bugs by date

	From	Date	Subject
Next Message	Neil Conway	2002-04-15 18:24:51	Re: Alter/update large tables - VERRRY annoying behaviour!
Previous Message	Michael Loftis	2002-04-15 07:48:09	Re: Bug #630: date/time storage problem: timestamp parsed

Browse pgsql-general by date

	From	Date	Subject
Next Message	Andrew Gould	2002-04-15 17:25:09	Re: Export From Postresql to a DBF File Format
Previous Message	Tarabas	2002-04-15 17:02:16	Strange Update-Bug in postgres (is it a feature?) ?!

Browse pgsql-patches by date

	From	Date	Subject
Next Message	Marc G. Fournier	2002-04-15 17:42:02	Re: Commands/ directory reorganisation
Previous Message	Rod Taylor	2002-04-15 14:56:25	Doc fix for INSERT ... (DEFAULT, ...)