Re: Doc chapter for Hash Indexes

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Doc chapter for Hash Indexes
Date: 2021-06-21 22:54:51
Message-ID: 20210621225451.GD29179@telsasoft.com
Lists: pgsql-hackers

On Mon, Jun 21, 2021 at 02:08:12PM +0100, Simon Riggs wrote:
> New chapter for Hash Indexes, designed to help users understand how
> they work and when to use them.
>
> Mostly newly written, but a few paras lifted from README when they were helpful.

+ <para>
+ PostgreSQL includes an implementation of persistent on-disk hash indexes,
+ which are now fully crash recoverable. Any data type can be indexed by a

I don't see any need to mention that they're "now" crash safe.

+ Each hash index tuple stores just the 4-byte hash value, not the actual
+ column value. As a result, hash indexes may be much smaller than B-trees
+ when indexing longer data items such as UUIDs, URLs etc.. The absence of

comma:
URLs, etc.

+ the column value also makes all hash index scans lossy. Hash indexes may
+ take part in bitmap index scans and backward scans.

Isn't it more correct to say that it must use a bitmap scan?
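
For readers following the "lossy" point, here is a rough sketch of why storing only the hash value forces a recheck against the table row. It is a toy model in Python, not PostgreSQL code: hash4, ToyHashIndex and lookup are hypothetical names, and CRC32 is only a stand-in for a 4-byte hash function.

```python
import zlib


def hash4(value: str) -> int:
    # Stand-in 4-byte hash (assumption: CRC32 is used only for illustration).
    return zlib.crc32(value.encode("utf-8")) & 0xFFFFFFFF


class ToyHashIndex:
    """Toy index that keeps only hash values, never the column values."""

    def __init__(self, hash_fn=hash4):
        self.hash_fn = hash_fn
        self.entries = {}  # hash value -> list of row ids

    def insert(self, value, row_id):
        self.entries.setdefault(self.hash_fn(value), []).append(row_id)

    def candidates(self, value):
        # May contain false positives: distinct values can share a hash.
        return list(self.entries.get(self.hash_fn(value), []))


def lookup(index, table, value):
    # The recheck step: compare the real column value stored in the table.
    return [rid for rid in index.candidates(value) if table[rid] == value]
```

With a deliberately weak hash_fn (for example, the string length), two different values land on the same hash, candidates() returns both row ids, and only the recheck in lookup() filters out the wrong one.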

+ through the tree until the leaf page is found. In tables with millions
+ of rows this descent can increase access time to data. The equivalent

Add a comma after "rows".

+ that hash value. When scanning a hash bucket during queries we need to

Add a comma after "queries".

+ <para>
+ As a result of the overflow cases, we can say that hash indexes are
+ most suitable for unique, nearly unique data or data with a low number
+ of rows per hash bucket will be suitable for hash indexes. One

The beginning and end of the sentence duplicate "suitable".

+ Each row in the table indexed is represented by a single index tuple in
+ the hash index. Hash index tuples are stored in the bucket pages, and if
+ they exist, the overflow pages.

"the overflow pages" didn't sound right, but I was confused by the comma.
I think it should say ".. in bucket pages and overflow pages, if any."
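
To make the bucket page / overflow page wording concrete, a minimal sketch of the layout as I read it from the patch text. Again a toy model in Python under stated assumptions: ToyBucket and the page capacity are hypothetical, and real pages are sized in bytes, not tuple counts.

```python
PAGE_CAPACITY = 4  # hypothetical tuples-per-page limit, for illustration only


class ToyBucket:
    """One hash bucket: a primary bucket page plus overflow pages, if any."""

    def __init__(self, capacity=PAGE_CAPACITY):
        self.capacity = capacity
        self.pages = [[]]  # pages[0] is the bucket page; the rest are overflow

    def insert(self, hash_value, row_id):
        # Use the first page with free space; otherwise chain a new overflow page.
        for page in self.pages:
            if len(page) < self.capacity:
                page.append((hash_value, row_id))
                return
        self.pages.append([(hash_value, row_id)])

    def scan(self, hash_value):
        # A scan has to visit the bucket page and every overflow page.
        matches = []
        pages_read = 0
        for page in self.pages:
            pages_read += 1
            matches.extend(rid for h, rid in page if h == hash_value)
        return matches, pages_read
```

The more rows share a bucket, the more overflow pages a scan has to read, which is the reason the patch text recommends hash indexes for unique or nearly unique data.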

--
Justin
