Re: Thoughts on maintaining 7.3

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Thoughts on maintaining 7.3
Date: 2003-10-05 21:41:59
Message-ID: 200310052141.h95LfxA08210@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Alvaro Herrera wrote:
> > Yeah, I think that's exactly the important point. These days there's
> > not a lot of reason to do VACUUM FULL unless you have a major amount of
> > restructuring to do. I would once have favored maintaining two code
> > paths with two strategies, but now I doubt it's worth the trouble.
> > (Or I should say, we have two code paths, the other being lazy VACUUM
> > --- do we need three?)
>
> There are two points that could be made here:
>
> 1. We do not want users having to think too hard about what kind of
> VACUUM they want. This probably botches Bruce's idea of an additional
> VACUUM DATA command.
>
> 2. We do not want to expose the VACUUM command family at all. The
> decisions about what code paths should be taken are best left to the
> backend-integrated vacuum daemon, which has probably much better
> information than users.

Agreed. We need to head in a direction where vacuum is automatic. I
guess the question is whether an automatic method would ever user VACUUM
DATA?

I just did a simple test. I did:

test=> CREATE TABLE test (x INT, y TEXT);
CREATE TABLE
test=> INSERT INTO test VALUES (1, 'lk;jasdflkjlkjawsiopfjqwerfokjasdflkj');
INSERT 17147 1

test=> INSERT INTO test SELECT * FROM test;

{ repeat until 65k rows are inserted, so there are 131k rows}

test=> INSERT INTO test SELECT 2, y FROM test;
INSERT 0 131072
test=> DELETE FROM test WHERE x=1;
DELETE 131072
test=> \timing
Timing is on.
test=> VACUUM FULL;
VACUUM
Time: 4661.82 ms
test=> INSERT INTO test SELECT 3, y FROM test;
INSERT 0 131072
Time: 7925.57 ms
test=> CREATE INDEX i ON test(x);
CREATE INDEX
Time: 3337.96 ms
test=> DELETE FROM test WHERE x=2;
DELETE 131072
Time: 3204.18 ms
test=> VACUUM FULL;
VACUUM
Time: 10523.69 ms
test=> REINDEX TABLE test;
REINDEX
Time: 2193.14 ms

Now, as I understand it, this is the worst-case for VACUUM FULL. What
we have here is 4661.82 for VACUUM FULL without an index, and 10523.69
for VACUUM FULL with an index, and REINDEX takes 2193.14. If we assume
VACUUM FULL with REINDEX will equal the time of VACUUM without the index
plus the REINDEX time, we have 4661.82 + 2193.14, or 6854.96 vs.
10523.69, so clearly VACUUM REINDEX is a win for this case. What I
don't know is what percentage of a table has to be expired for REINDEX
to be a win. I assume if only one row is expired, you get 4661.82 +
2193.14 vs. just 4661.82, roughly.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabrizio Mazzoni 2003-10-05 21:44:41 psql italian translation
Previous Message Neil Conway 2003-10-05 21:37:29 Re: Learning PostgreSQL