Re: pageinspect: Hash index support

From: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pageinspect: Hash index support
Date: 2017-01-11 20:16:07
Message-ID: CAE9k0P=MVtGPJzdSNGy0hwVZp=VE2ghf7cQWhvgqLVv0ANcahw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

> + values[j++] = UInt16GetDatum(uargs->offset);
> + values[j++] = CStringGetTextDatum(psprintf("(%u,%u)",
> +
> BlockIdGetBlockNumber(&(itup->t_tid.ip_blkid)),
> + itup->t_tid.ip_posid));
> +
> + ptr = (char *) itup + IndexInfoFindDataOffset(itup->t_info);
> + dlen = IndexTupleSize(itup) - IndexInfoFindDataOffset(itup->t_info);
>
> It seems like this could be used to index off the end of the page, if
> you feed it invalid data.

okay, I have handled it in the attached patch.

>
> + dump = palloc0(dlen * 3 + 1);
>
> This is wasteful. Just use palloc and install a terminating NUL byte instead.
>

fixed. Please check the attached patch.

> + sprintf(dump, "%02x", *(ptr + off) & 0xff);
>
> *(ptr + off) is normally written ptr[off].
>

corrected.

> + if (pageopaque->hasho_flag != LH_OVERFLOW_PAGE)
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("page is not an overflow page"),
> + errdetail("Expected %08x, got %08x.",
> + LH_OVERFLOW_PAGE, pageopaque->hasho_flag)));
>
> I think this is an unnecessary test given that you've already called
> verify_hash_page().
>

Yes, it is not required. I have removed it in the attached patch.

> + if (bitmappage >= metap->hashm_nmaps)
> + elog(ERROR, "invalid overflow bit number %u", ovflbitno);
>
> I think this should be an ereport(), because it's reachable given a
> bogus page which a user might construct (or a corrupted page).
>

okay, I have corrected it.

> +test=# SELECT * FROM hash_page_items(get_raw_page('con_hash_index', 1));
> + itemoffset | ctid | data
> +------------+-----------------+-------------------------
> + 1 | (3145728,14376) | 00 c0 ca 3e 00 00 00 00
> + 2 | (3145728,14376) | 00 c0 ca 3e 00 00 00 00
> + 3 | (3407872,14376) | 00 c0 ca 3e 00 00 00 00
>
> Won't the first 4 bytes always be a hash code and the second 4 bytes
> always 0? Should we just return the hash code as an int4 or int8
> instead of pretending it's a bunch of uninterpretable binary data?
>

Yes, the first 4 bytes represents a hash code and the second 4 bytes
is always zero. Now, returning the hash code as int4.

> +test=# SELECT * FROM hash_bitmap_info('con_hash_index', 2050);
> + bitmapblkno | bitmapbit
> +-------------+-----------
> + 65 | 1
> +</screen>
>
> I find this hard to understand. This says that it returns
> information, but the nature of the returned information is unspecified
> and in my opinion unclear.
>

I have rephrased it to make it more clear.

Attachment Content-Type Size
0001-Add-support-for-hash-index-in-pageinspect-contrib-mo.patch invalid/octet-stream 26.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2017-01-11 20:22:50 Re: Packages: Again
Previous Message Pavel Stehule 2017-01-11 20:09:20 Re: Packages: Again