| From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> | 
|---|---|
| To: | David Steele <david(at)pgmasters(dot)net> | 
| Cc: | Peter Geoghegan <pg(at)bowt(dot)ie>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, "Ideriha, Takeshi" <ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com> | 
| Subject: | Re: GUC for cleanup indexes threshold. | 
| Date: | 2017-03-29 06:23:41 | 
| Message-ID: | CAD21AoAXVn57qo=yXghWEa+a2zX0SgZft2et2LNjvyDZvcYMHQ@mail.gmail.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
On Wed, Mar 29, 2017 at 12:23 AM, David Steele <david(at)pgmasters(dot)net> wrote:
> On 3/23/17 1:54 AM, Masahiko Sawada wrote:
>>
>> On Wed, Mar 15, 2017 at 7:51 AM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>>>
>>> On Tue, Mar 14, 2017 at 3:10 PM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>>>>
>>>> We already have BTPageOpaqueData.btpo, a union whose contained type
>>>> varies based on the page being dead. We could just do the same with
>>>> some other field in that struct, and then store epoch there. Clearly
>>>> nobody really cares about most data that remains on the page. Index
>>>> scans just need to be able to land on it to determine that it's dead,
>>>> and VACUUM needs to be able to determine whether or not there could
>>>> possibly be such an index scan at the time it considers recycling..
>>>
>>>
>>> ISTM that we need all of the fields within BTPageOpaqueData even for
>>> dead pages, actually. The left links and right links still need to be
>>> sane, and the flag bits are needed. Plus, the field that stores an XID
>>> already is clearly necessary. Even if they weren't needed, it would
>>> probably still be a good idea to keep them around for forensic
>>> purposes. However, the page header field pd_prune_xid is currently
>>> unused for indexes, and is the same width as CheckPoint.nextXidEpoch
>>> (the extra thing we might want to store -- the epoch).
>>>
>>> Maybe you could store the epoch within that field when B-Tree VACUUM
>>> deletes a page, and then compare that within _bt_page_recyclable(). It
>>> would come before the existing XID comparison in that function. One
>>> nice thing about this idea is that pd_prune_xid will be all-zero for
>>> index pages from the current format, so there is no need to take
>>> special care to make sure that databases that have undergone
>>> pg_upgrade don't break.
>>>
>>
>> Thank you for the suggestion!
>> If we store the poch within union field, I think we will not be able
>> to use BTPageOpaqueData.btpo.xact at the same time. Since comparing
>> btpo.xact is still necessary to determine if that page is recyclable
>> we cannot store the epoch into the same union field. And if we store
>> it into BTPageOpaqueData, it would break disk compatibility.
>
>
> I have marked this patch "Waiting for Author".
>
> This thread has been idle for five days.  Please respond with a new patch by
> 2017-03-30 00:00 AoE (UTC-12) or this submission will be marked "Returned
> with Feedback".
>
I was thinking that the status of this patch is still "Needs review"
because I sent latest version patch[1]. We're still under discussion
how safely we can skip to the cleanup vacuum on btree index. That
patch has some restrictions but it would be good for first step.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Masahiko Sawada | 2017-03-29 06:31:32 | Re: pgsql: Allow vacuums to report oldestxmin | 
| Previous Message | Michael Paquier | 2017-03-29 06:22:32 | Re: [patch] reorder tablespaces in basebackup tar stream for backup_label |