Re: [POC] A better way to expand hash indexes.

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [POC] A better way to expand hash indexes.
Date: 2017-03-29 10:39:49
Message-ID: CAA4eK1+cmLjKOtQX7Rg8p22e8MgPAy8DgfRcP=z5-cJ2C7w76A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 29, 2017 at 12:51 PM, Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com> wrote:
> On Wed, Mar 29, 2017 at 10:12 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> Few other comments:
>> +/*
>> + * This is just a trick to save a division operation. If you look into the
>> + * bitmap of 0-based bucket_num 2nd and 3rd most significant bit will indicate
>> + * which phase of allocation the bucket_num belongs to with in the group. This
>> + * is because at every splitpoint group we allocate (2 ^ x) buckets and we have
>> + * divided the allocation process into 4 equal phases. This macro returns value
>> + * from 0 to 3.
>> + */
>>
>> +#define SPLITPOINT_PHASES_WITHIN_GROUP(sp_g, bucket_num) \
>> + (((bucket_num) >> (sp_g - SPLITPOINT_GROUPS_WITH_ONLY_ONE_PHASE)) & \
>> + SPLITPOINT_PHASE_MASK)
>> This won't work if we change SPLITPOINT_GROUPS_WITH_ONLY_ONE_PHASE to
>> number other than 3. I think you should change it so that it can work
>> with any value of SPLITPOINT_GROUPS_WITH_ONLY_ONE_PHASE.
>
> Fixed, using SPLITPOINT_GROUPS_WITH_ONLY_ONE_PHASE was accidental. All
> I need is most significant 3 bits hence should be subtracted by 3
> always.
>

Okay, your current patch looks good to me apart from minor comments,
so marked as Read For Committer. Please either merge the
sort_hash_b_2.patch with main patch or submit it along with next
revision for easier reference.

Few minor comments:
1.
+splitpoint phase S. The hashm_spares[0] is always 0, so that buckets 0 and 1
+(which belong to splitpoint group 0's phase 1 and phase 2 respectively) always
+appear at block numbers 1 and 2, just after the meta page.

This explanation doesn't seem to be right as with current patch we
start phased allocation only after splitpoint group 9.

2.
-#define HASH_MAX_SPLITPOINTS 32
#define HASH_MAX_BITMAPS 128

+#define SPLITPOINT_PHASES_PER_GRP 4
+#define SPLITPOINT_PHASE_MASK (SPLITPOINT_PHASES_PER_GRP - 1)
+#define SPLITPOINT_GROUPS_WITH_ONLY_ONE_PHASE 10
+#define HASH_MAX_SPLITPOINTS \
+ ((32 - SPLITPOINT_GROUPS_WITH_ONLY_ONE_PHASE) * \
+ SPLITPOINT_PHASES_PER_GRP) + \
+ SPLITPOINT_GROUPS_WITH_ONLY_ONE_PHASE

You have changed the value of HASH_MAX_SPLITPOINTS, but the comments
explaining that value are still unchanged. Refer below text.
"The limitation on the size of spares[] comes from the fact that there's
* no point in having more than 2^32 buckets with only uint32 hashcodes."

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Banck 2017-03-29 11:05:54 Re: gitlab post-mortem: pg_basebackup waiting for checkpoint
Previous Message Karsten Hilbert 2017-03-29 09:37:15 Re: Postgres Permissions Article