Help, 400 million rows, query slow, I stopped after 5 days

From: "Christian Hofmann" <christian(dot)hofmann(at)gmx(dot)de>
To: <pgsql-novice(at)postgresql(dot)org>
Subject: Help, 400 million rows, query slow, I stopped after 5 days
Date: 2006-01-30 15:31:07
Message-ID: 00b801c625b2$30d04600$9000a8c0@taschenrechner
Lists: pgsql-novice
Hello,

I have a table with 400,000,000 rows of text (about 40 chars each).
I want to get rid of values that occur more than once, so that each value
is kept only a single time.

So I created a new table (table2), identical to the first but with a
unique index. Then I ran the following query:

INSERT INTO table2 (my_col) SELECT DISTINCT my_col FROM table1;

But this query was so slow that I stopped it after letting it run for five
days!

The server has only 1 GB of RAM, and my_col has a bitmap index.

Maybe I should create a trigger on table2 that checks, before inserting,
whether the value is already present in table2. Something like the sketch
below is what I have in mind.
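(This is only a rough sketch; the function and trigger names are made up,
and I assume the per-row EXISTS lookup can use the unique index on
table2.my_col, otherwise that check would be slow as well.)

CREATE FUNCTION skip_duplicates() RETURNS trigger AS $$
BEGIN
    -- If the value is already in table2, drop the incoming row silently.
    IF EXISTS (SELECT 1 FROM table2 WHERE my_col = NEW.my_col) THEN
        RETURN NULL;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER table2_skip_duplicates
    BEFORE INSERT ON table2
    FOR EACH ROW EXECUTE PROCEDURE skip_duplicates();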


Or is it possible to just rely on the unique index on table2 and do
something like:

INSERT INTO table2 (my_col) SELECT my_col FROM table1;

and ignore the errors that are thrown because of the duplicate values?
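
(As far as I understand, a single duplicate would abort the whole INSERT
statement, so maybe I would have to trap the error row by row myself. A
rough sketch of what I mean, assuming PostgreSQL 8.0 or later for the
EXCEPTION clause; the function name is invented:

CREATE FUNCTION copy_unique_rows() RETURNS void AS $$
DECLARE
    r record;
BEGIN
    FOR r IN SELECT my_col FROM table1 LOOP
        BEGIN
            INSERT INTO table2 (my_col) VALUES (r.my_col);
        EXCEPTION WHEN unique_violation THEN
            NULL;  -- duplicate value: skip this row and keep going
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

SELECT copy_unique_rows();

But I suspect a per-row subtransaction over 400 million rows would also be
very slow.)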

I hope you can help me,

Thank you,

Christian


