Quick Links

Storage/Performance and splitting a table

From:	"Craig A(dot) James" <cjames(at)modgraph-usa(dot)com>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Storage/Performance and splitting a table
Date:	2005-11-19 17:54:23
Message-ID:	437F66CF.7080402@modgraph-usa.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

In a recent thread, several people pointed out that UPDATE = DELETE+INSERT. This got me to wondering.

I have a table that, roughly, looks like this:

create table doc (
id integer primary key,
document text,
keywords tsvector
);

where "keywords" has a GIST index. There are about 10 million rows in the table, and an average of 20 keywords per document. I have two questions.

First, I occasionally rebuild the keywords, after which the VACUUM FULL ANALYZE takes a LONG time - like 24 hours. Given the UPDATE = DELETE+INSERT, it sounds like I'd be better off with something like this:

create table doc (
id integer primary key,
document text,
);
create table keywords (
id integer primary key,
keywords tsvector
);

Then I could just drop the GIST index, truncate the keywords table, rebuild the keywords, and reindex. My suspicion is that VACUUM FULL ANALYZE would be quick -- there would be no garbage to collect, so all it would to do is the ANALYZE part.

My second question: With the doc and keywords split into two tables, would the tsearch2/GIST performance be faster? The second schema's "keywords" table has just pure keywords (no documents); does that translate to fewer blocks being read during a tsearch2/GIST query? Or are the "document" and "keywords" columns of the first schema already stored separately on disk so that the size of the "document" data doesn't affect the "keywords" search performance?

Thanks,
Craig

In response to

Re: Perl DBD and an alarming problem at 2005-11-18 03:46:31 from Michael Fuhr

Responses

Re: Storage/Performance and splitting a table at 2005-11-19 19:03:04 from Michael Stone

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Michael Stone	2005-11-19 19:03:04	Re: Storage/Performance and splitting a table
Previous Message	Luke Lonergan	2005-11-19 16:15:29	Re: Hardware/OS recommendations for large databases (