Re: 9.2.1 & index-only scans : abnormal heap fetches after VACUUM FULL

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Kevin Grittner <kgrittn(at)mail(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 9.2.1 & index-only scans : abnormal heap fetches after VACUUM FULL
Date: 2013-11-28 22:17:07
Message-ID: CA+TgmoYBnQfS4ACVmAq9i4KQ+EOJo053oHhsfBkWEnLNCbsu5Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Nov 27, 2013 at 4:33 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> On Sat, Jan 12, 2013 at 02:14:03PM -0500, Kevin Grittner wrote:
>> Amit Kapila wrote:
>> > On Thursday, January 10, 2013 6:09 AM Josh Berkus wrote:
>>
>> >> Surely VACUUM FULL should rebuild the visibility map, and make
>> >> tuples in the new relation all-visible, no?
>>
>> Certainly it seems odd to me that VACUUM FULL leaves the the table
>> in a less-well maintained state in terms of visibility than a
>> "normal" vacuum. VACUUM FULL should not need to be followed by
>> another VACUUM.
>>
>> > I think it cannot made all visible.
>>
>> I don't think all tuples in the relation are necessarily visible to
>> all transactions, but the ones which are should probably be flagged
>> that way.
>
> I have developed the attached proof-of-concept patch to fix the problem
> of having no visibility map after CLUSTER or VACUUM FULL. I tested with
> these queries:
>
> CREATE TABLE test(x INT PRIMARY KEY);
> INSERT INTO test VALUES (1);
> VACUUM FULL test; -- or CLUSTER
> SELECT relfilenode FROM pg_class WHERE relname = 'test';
> relfilenode
> -------------
> 16399
>
> Then 'ls -l data/base/16384/16399*' to see the *_vm file. I am not sure
> how to test that the vm contents are valid.
>
> This patch is fairly tricky because our CLUSTER/VACUUM FULL behavior
> does not do writes through the shared buffer cache, as outlined in this
> C comment block:
>
> * We can't use the normal heap_insert function to insert into the new
> * heap, because heap_insert overwrites the visibility information.
> * We use a special-purpose raw_heap_insert function instead, which
> * is optimized for bulk inserting a lot of tuples, knowing that we have
> * exclusive access to the heap. raw_heap_insert builds new pages in
> * local storage. When a page is full, or at the end of the process,
> * we insert it to WAL as a single record and then write it to disk
> * directly through smgr. Note, however, that any data sent to the new
> * heap's TOAST table will go through the normal bufmgr.
>
> I originally tried to do this higher up in the stack but ran into
> problems because I couldn't access the new heap page so I had to do it
> at the non-shared-buffer page level. I reused the lazy vacuum routines.
>
> I need to know this is the right approach, and need to know what things
> are wrong or missing.

The fact that you've needed to modify SetHintBits() to make this work
is pretty nasty. I'm not exactly sure what to do about that, but it
doesn't seem good.

This patch also has the desirable but surprising consequence that hint
bits will be set as a side effect of update_page_vm(), which means
that you'd better do that BEFORE checksumming the page.

I wonder if we ought to mark each page as all-visible in
raw_heap_insert() when we first initialize it, and then clear the flag
when we come across a tuple that isn't all-visible. We could try to
set hint bits on the tuple before placing it on the page, too, though
I'm not sure of the details.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2013-11-28 22:18:28 Re: Suggestion: Issue warning when calling SET TRANSACTION outside transaction block
Previous Message Robert Haas 2013-11-28 22:00:05 Re: new unicode table border styles for psql