Re: [HACKERS] UPDATE of partition key

From: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Subject: Re: [HACKERS] UPDATE of partition key
Date: 2018-01-14 11:57:12
Message-ID: CAJ3gD9ejff2oTorUz+VPeCmtcW3NCE2ni-5-JpzwM4tPybn89Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 13 January 2018 at 02:56, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, Jan 12, 2018 at 12:23 PM, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
>>> (1) if they need it by subplan index, first use
>>> subplan_partition_offsets to convert it to a per-leaf index
>>
>> Before that, we need to check if there *is* an offset array. If there
>> are no partitions, there is only going to be a per-subplan array,
>> there won't be an offsets array. But I guess, you are saying : "do the
>> on-demand allocation only for leaf partitions; if there are no
>> partitions, the per-subplan maps will always be allocated for each of
>> the subplans from the beginning" . So if there is no offset array,
>> just return mtstate->mt_per_subplan_tupconv_maps[subplan_index]
>> without any further checks.
>
> Oops. I forgot that there might not be partitions. I was assuming
> that mt_per_subplan_tupconv_maps wouldn't exist at all, and we'd
> always use subplan_partition_offsets. Both that won't work in the
> inheritance case.
>
>> But after that, I am not sure then why is mt_per_sub_plan_maps[] array
>> needed ? We are always going to convert the subplan index into leaf
>> index, so per-subplan map array will not come into picture. Or are you
>> saying, it will be allocated and used only when there are no
>> partitions ? From one of your earlier replies, you did mention about
>> trying to share the maps between the two arrays, that means you were
>> considering both arrays being used at the same time.
>
> We'd use them both at the same time if we didn't have, or didn't use,
> subplan_partition_offsets, but if we have subplan_partition_offsets
> and can use it then we don't need mt_per_sub_plan_maps.
>
> I guess I'm inclined to keep mt_per_sub_plan_maps for the case where
> there are no partitions, but not use it when partitions are present.
> What do you think about that?

Even where partitions are present, in the usual case where there are
no transition tables we won't require per-leaf map at all [1]. So I
think we should keep mt_per_sub_plan_maps only for the case where
per-leaf map is not allocated. And we will not allocate
mt_per_sub_plan_maps when mt_per_leaf_maps is needed. In other words,
exactly one of the two maps will be allocated.

This is turning out to be close to what's already there in the last
patch versions: use a single map array, and an offsets array. The
difference is : in the patch I am using the *same* variable for the
two maps. Where as, now we are talking about two different array
variables for maps, but only allocating one of them.

Are you ok with this ? I think the thing you were against was to have
a common *variable* for two purposes. But above, I am saying we have
two variables but assign a map array to only *one* of them and leave
the other unused.

---------

Regarding the on-demand map allocation ....
Where mt_per_sub_plan_maps is allocated, we won't have the on-demand
allocation: all the maps will be allocated initially. The reason is
becaues the map_is_required array is only per-leaf. Or else, again, we
need to keep another map_is_required array for per-subplan. May be we
can support the on-demand stuff for subplan maps also, but only as a
separate change after we are done with update-partition-key.

---------

Regarding mt_per_leaf_tupconv_required, I am thinking we can make it a
bool array, and name it : mt_per_leaf_map_not_required. When it is
true for a given index, it means, we have already called
convert_tuples_by_name() and it returned NULL; i.e. it means we are
sure that map is not required. A false value means we need to call
convert_tuples_by_name() if it is NULL, and then set
mt_per_leaf_map_not_required to (map == NULL).

Instead of a bool array, we can even make it a Bitmapset. But I think
access would become slower as compared to array, particularly because
it is going to be a heavily used function.

---------

[1] - For update-tuple-routing, only per-subplan access is required;
- For transition tables, per-subplan access is required,
and additionally per-leaf access is required when tuples are
update-routed
- So if both update-tuple-routing and transition tables are
required, both of the maps are needed.

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Stark 2018-01-14 12:15:54 Re: WIP: a way forward on bootstrap data
Previous Message Michael Paquier 2018-01-14 10:44:45 Re: PQHost() undefined behavior if connecting string contains both host and hostaddr types