From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com> |
Cc: | David Steele <david(at)pgmasters(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [POC] A better way to expand hash indexes. |
Date: | 2017-03-25 13:33:08 |
Message-ID: | CAA4eK1+ZTGywcWGFCXQFeZfQvfERpHECN64zM2dtCyunawOxog@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sat, Mar 25, 2017 at 10:13 AM, Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com> wrote:
> On Tue, Mar 21, 2017 at 8:16 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> Sure, I was telling you based on that. If you are implicitly treating
>> it as 2-dimensional array, it might be easier to compute the array
>>offsets.
>
> I think calculating spares offset will not become anyway much simpler
> we still need to calculate split group and split phase separately. I
> have added a patch to show how a 2-dimensional spares code looks like
> and where all we need changes.
>
I think one-dimensional patch has fewer places to touch, so that looks
better to me. However, I think there is still hard coding and
assumptions in code which we should try to improve.
1.
+ /*
+ * The first 4 bucket belongs to first splitpoint group 0. And since group
+ * 0 have 4 = 2^2 buckets, we double them in group 1. So total buckets
+ * after group 1 is 8 = 2^3. Then again at group 2 we add another 2^3
+ * buckets to double the total number of buckets, which will become 2^4. I
+ * think by this time we can see a pattern which say if num_bucket > 4
+ * splitpoint group = log2(num_bucket) - 2
+ */
+ if (num_bucket <= 4)
+ splitpoint_group = 0; /* converted to base 0. */
+ else
+ splitpoint_group = _hash_log2(num_bucket) - 2;
This patch defines split point group zero has four buckets and based
on that above calculation is done. I feel you can define it like
#define Buckets_First_Split_Group 4 and then use it in above code.
Also, in else part number 2 looks awkward, can we define it as
log2_buckets_first_group = _hash_log2(Buckets_First_Split_Group) or
some thing like that. I think that way code will look neat. I don't
like the way above comment is worded even though it is helpful to
understand the calculation. If you want, then you can add such a
comment in file header, here it looks out of place.
2.
+power-of-2 groups, called "split points" in the code. That means on every new
+split points we double the existing number of buckets.
split points/split point
3.
+ * which phase of allocation the bucket_num belogs to with in the group.
/belogs/belongs
I have still not completely reviewed the patch as I have ran out of
time for today.
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Simon Riggs | 2017-03-25 14:10:38 | Re: Logical decoding on standby |
Previous Message | Peter Eisentraut | 2017-03-25 13:31:55 | Re: crashes due to setting max_parallel_workers=0 |