Re: pageinspect and hash indexes

From: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pageinspect and hash indexes
Date: 2017-03-20 13:23:40
Message-ID: CAE9k0P=vtL4+30b5E=OkgRay-T+0LidRTR4p1jSKhPTd2y3BTQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 20, 2017 at 9:31 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Sat, Mar 18, 2017 at 5:13 PM, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote:
>> On Sat, Mar 18, 2017 at 1:34 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>> On Sat, Mar 18, 2017 at 12:12 AM, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote:
>>>> On Fri, Mar 17, 2017 at 10:54 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>>>>> While trying to figure out some bloating in the newly logged hash indexes,
>>>>> I'm looking into the type of each page in the index. But I get an error:
>>>>>
>>>>> psql -p 9876 -c "select hash_page_type(get_raw_page('foo_index_idx',x)) from
>>>>> generate_series(1650,1650) f(x)"
>>>>>
>>>>> ERROR: page is not a hash page
>>>>> DETAIL: Expected 0000ff80, got 00000000.
>>>>>
>>>>> The contents of the page are:
>>>>>
>>>>> \xa4000000d8f203bf65c900001800f01ff01f0420...
>>>>>
>>>>> (where the elided characters at the end are all zero)
>>>>>
>>>>> What kind of page is that actually?
>>>>
>>>> it is basically either a newly allocated bucket page or a freed overflow page.
>>>>
>>>
>>> What makes you think that it can be a newly allocated page?
>>> Basically, we always initialize the special space of newly allocated
>>> page, so not sure what makes you deduce that it can be newly allocated
>>> page.
>>
>> I came to know this from the following experiment.
>>
>> I created a hash index and kept on inserting data in it till the split happens.
>>
>> When split happened, I could see following values for firstblock and
>> lastblock in _hash_alloc_buckets()
>>
>> Breakpoint 1, _hash_alloc_buckets (rel=0x7f6ac951ee30, firstblock=34,
>> nblocks=32) at hashpage.c:993
>> (gdb) n
>> (gdb) p firstblock
>> $15 = 34
>> (gdb) p nblocks
>> $16 = 32
>> (gdb) n
>> (gdb) p lastblock
>> $17 = 65
>>
>> AFAIU, this bucket split resulted in creation of new bucket pages from
>> block number 34 to 65.
>>
>> The contents for metap are as follows,
>>
>> (gdb) p *metap
>> $18 = {hashm_magic = 105121344, hashm_version = 2, hashm_ntuples =
>> 2593, hashm_ffactor = 81, hashm_bsize = 8152, hashm_bmsize = 4096,
>> hashm_bmshift = 15,
>> hashm_maxbucket = 32, hashm_highmask = 63, hashm_lowmask = 31,
>> hashm_ovflpoint = 6, hashm_firstfree = 0, hashm_nmaps = 1,
>> hashm_procid = 450,
>> hashm_spares = {0, 0, 0, 0, 0, 1, 1, 0 <repeats 25 times>},
>> hashm_mapp = {33, 0 <repeats 127 times>}}
>>
>> Now, if i try to check the page type for block number 65, this is what i see,
>>
>> test=# select * from hash_page_type(get_raw_page('con_hash_index', 65));
>> ERROR: page is not a hash page
>> DETAIL: Expected 0000ff80, got 00000000.
>> test=#
>>
>
> The contents of such a page should be zero and Jeff has reported some
> valid-looking contents of the page. If you see this page contents as
> zero, then we can conclude what Jeff is seeing was an freed overflow
> page.

As shown in the mail thread-[1], the contents of metapage are,

(gdb) p *metap
$18 = {hashm_magic = 105121344, hashm_version = 2, hashm_ntuples
=2593, hashm_ffactor = 81, hashm_bsize = 8152, hashm_bmsize = 4096,
hashm_bmshift = 15, hashm_maxbucket = 32, hashm_highmask = 63,
hashm_lowmask = 31, hashm_ovflpoint = 6, hashm_firstfree = 0,
hashm_nmaps = 1,
hashm_procid = 450, hashm_spares = {0, 0, 0, 0, 0, 1, 1, 0 <repeats
25 times>}, hashm_mapp = {33, 0 <repeats 127 times>}}

postgres=# select spares from
hash_metapage_info(get_raw_page('con_hash_index', 0));
spares
-------------------------------------------------------------------
{0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}
(1 row)

Here, if you see the spare page count is just 1 which corresponds to
bitmap page. So, that means there is no overflow page in hash index
table and neither I have ran any DELETE statements in my experiment
that would result in freeing an overflow page.

Also, the page header content for such a page is,

$9 = {pd_lsn = {xlogid = 0, xrecoff = 23638120}, pd_checksum = 0,
pd_flags = 0, pd_lower = 24, pd_upper = 8176, pd_special = 8176,
pd_pagesize_version = 8196, pd_prune_xid = 0, pd_linp = 0x1f3aa88}

From values of pd_lower and pd_upper it is clear that it is an empty page.

The content of this page is,

\x00000000b0308a01000000001800f01ff01f042000.....

I think it is not just happening for freed overflow but also for newly
allocated bucket page. It would be good if we could mark freed
overflow page as UNUSED page rather than just initialising it's header
portion and leaving the page type in special area as it is. Attached
is the patch '0001-mark_freed_ovflpage_as_UNUSED_pagetype.patch' that
marks a freed overflow page as an unused page.

Also, I have now changed pageinspect module to handle unused and empty
hash index page. Attached is the patch
(0002-allow_pageinspect_handle_UNUSED_OR_EMPTY_hash_pages.patch) for
the same.

[1] - https://www.postgresql.org/message-id/CAE9k0P%3DN%2BJjzqnHqrURE7ZQMgySRpv%3DBkjafbz%3DpeD4cbCgbhA%40mail.gmail.com

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

Attachment Content-Type Size
0001-mark_freed_ovflpage_as_UNUSED_pagetype.patch application/x-download 1.4 KB
0002-allow_pageinspect_handle_UNUSED_OR_EMPTY_hash_pages.patch application/x-download 774 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2017-03-20 13:30:46 Re: [COMMITTERS] pgsql: Improve pg_dump regression tests and code coverage
Previous Message Stephen Frost 2017-03-20 13:16:56 Re: Inadequate traces in TAP tests