Re: pageinspect: Hash index support

From: Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>
To: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pageinspect: Hash index support
Date: 2016-09-21 12:21:55
Message-ID: 5d64a536-cc90-ce96-b68f-5b6bd3206476@redhat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 09/21/2016 07:24 AM, Ashutosh Sharma wrote:
>> The development branch is @
>> https://github.com/jesperpedersen/postgres/tree/pageinspect_hash
>
> I am working on centOS 7. I am still facing the issue when applying
> your patch using "git apply" command but if i use 'patch' command it
> works fine however i am seeing some warning messages with 'patch'
> command as well.
>

git apply w/ v4 works here, so you will have to investigate what happens
on your side.

> I continued reviewing your v4 patch and here are some more comments:
>
> 1). Got below error if i pass meta page to hash_page_items(). Does
> hash_page_items() accept only bucket or bitmap page as input?
>
> postgres=# SELECT * FROM hash_page_items(get_raw_page('hash_i4_index', 0));
> ERROR: invalid memory alloc request size 18446744073709551593
>

page_stats / page_items should not be used on the metadata page.

As these functions are marked as superuser only it is expected that
people provides the correct input, especially since the raw page
structure is used as the input.

> 2). Server crashed when below query was executed on a code that was
> build with cassert-enabled.
>
> postgres=# SELECT * FROM hash_page_stats(get_raw_page('hash_i4_index', 0));
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> !> \q
>
> The callstack is as follows:
>
> (gdb) bt
> #0 0x00007fef794f15f7 in raise () from /lib64/libc.so.6
> #1 0x00007fef794f2ce8 in abort () from /lib64/libc.so.6
> #2 0x0000000000967381 in ExceptionalCondition
> (conditionName=0x7fef699c942d "!(((id)->lp_len != 0))",
> errorType=0x7fef699c92d4 "FailedAssertion",
> fileName=0x7fef699c9397 "hashfuncs.c", lineNumber=123) at assert.c:54
> #3 0x00007fef699c6ed6 in GetHashPageStatistics (page=0x1060c2c "",
> stat=0x7ffd76846230) at hashfuncs.c:123
> #4 0x00007fef699c70a4 in hash_page_stats (fcinfo=0x7ffd768463e0) at
> hashfuncs.c:169
> #5 0x0000000000682e6b in ExecMakeTableFunctionResult
> (funcexpr=0x10457f0, econtext=0x1044e28, argContext=0x104ee38,
> expectedDesc=0x1047640,
> randomAccess=0 '\000') at execQual.c:2216
> #6 0x00000000006a88e2 in FunctionNext (node=0x1044d10) at nodeFunctionscan.c:94
> #7 0x000000000068a3d9 in ExecScanFetch (node=0x1044d10,
> accessMtd=0x6a882b <FunctionNext>, recheckMtd=0x6a8c10
> <FunctionRecheck>) at execScan.c:95
> #8 0x000000000068a448 in ExecScan (node=0x1044d10, accessMtd=0x6a882b
> <FunctionNext>, recheckMtd=0x6a8c10 <FunctionRecheck>) at
> execScan.c:145
> #9 0x00000000006a8c45 in ExecFunctionScan (node=0x1044d10) at
> nodeFunctionscan.c:268
> #10 0x000000000067f0cf in ExecProcNode (node=0x1044d10) at execProcnode.c:449
> #11 0x000000000067b40b in ExecutePlan (estate=0x1044bf8,
> planstate=0x1044d10, use_parallel_mode=0 '\000', operation=CMD_SELECT,
> sendTuples=1 '\001',
> numberTuples=0, direction=ForwardScanDirection, dest=0x10444c8) at
> execMain.c:1567
> #12 0x0000000000679527 in standard_ExecutorRun (queryDesc=0xfac778,
> direction=ForwardScanDirection, count=0) at execMain.c:338
> #13 0x00000000006793ab in ExecutorRun (queryDesc=0xfac778,
> direction=ForwardScanDirection, count=0) at execMain.c:286
> #14 0x0000000000816b3e in PortalRunSelect (portal=0xfa49e8, forward=1
> '\001', count=0, dest=0x10444c8) at pquery.c:948
> #15 0x00000000008167d1 in PortalRun (portal=0xfa49e8,
> count=9223372036854775807, isTopLevel=1 '\001', dest=0x10444c8,
> altdest=0x10444c8,
> completionTag=0x7ffd76846c60 "") at pquery.c:789
> #16 0x0000000000810a27 in exec_simple_query (query_string=0x1007018
> "SELECT * FROM hash_page_stats(get_raw_page('hash_i4_index', 0));") at
> postgres.c:1094
> #17 0x0000000000814b5e in PostgresMain (argc=1, argv=0xfb1d08,
> dbname=0xf873d8 "postgres", username=0xfb1b70 "edb") at
> postgres.c:4070
> #18 0x000000000078990c in BackendRun (port=0xfacb50) at postmaster.c:4260
>
>

Same deal here.

>
> 3). Could you please let me know, what does the hard coded values '*5'
> and '+1' represents in the below lines of code. You have used them
> when allocating memory for storing spare pages allocated before
> certain split point and block number of bitmap pages inside
> hash_metap().
>
> spares = palloc0(HASH_MAX_SPLITPOINTS * 5 + 1);
>
> mapp = palloc0(HASH_MAX_BITMAPS * 5 + 1);
>
> I guess it would be better to add some comments if using any hard coded values.
>

It is the space needed to output the values.

> 4). Inside hash_page_stats(), you are first calling verify_hash_page()
> to verify if it is a hash page or not and

No, we assume that the page is a valid hash page structure, since there
is an explicit cast w/o any further checks.

> then calling
> GetHashPageStatistics() to get the page stats wherein you are trying
> to check what is the type of hash page. Below is the code snippet for
> this,
>
> + /* page type (flags) */
> + if (opaque->hasho_flag & LH_META_PAGE)
> + stat->type = 'm';
> + else if (opaque->hasho_flag & LH_OVERFLOW_PAGE)
> + stat->type = 'v';
> + else if (opaque->hasho_flag & LH_BUCKET_PAGE)
> + stat->type = 'b';
> + else if (opaque->hasho_flag & LH_BITMAP_PAGE)
> + stat->type = 'i';
> + else
> + stat->type = 'u';
>
> My question is, can we have a hash page that does not fall under the
> category of Metapage, overflow page, bitmap page, bucket page? I guess
> we can't have such type of hash page and incase if we found such type
> of undefined page i guess we should error out instead of reading the
> page because it is quite possible that the page would be corrupted.
> Please let me know your thoughts on this.
>

The "if" statement will need updating once the CHI patch goes in, as it
defines new constants for page types.

However, as the other pageinspect function it assume that the operator
is passing valid data.

> 5). I think we have added enough functions to show the page level
> statistics but not the index level statistics like the total number of
> overflow pages , bucket pages, number of free overflow pages, number
> of bitmap pages etc. in the hash index. How about adding a function
> that would store the index level statistics?
>

Sure, additional functions could be of benefit later on. Feel free to
investigate that possibility for a future CommitFest.

Thanks for your feedback !

Best regards,
Jesper

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jesper Pedersen 2016-09-21 12:27:06 Re: pageinspect: Hash index support
Previous Message Robert Haas 2016-09-21 12:12:20 Re: increasing the default WAL segment size