Re: move PartitionBoundInfo creation code

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: move PartitionBoundInfo creation code
Date: 2018-11-07 07:37:26
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 07, 2018 at 03:34:38PM +0900, Amit Langote wrote:
> On 2018/11/05 16:21, Michael Paquier wrote:
>> On Thu, Nov 01, 2018 at 01:03:00PM +0900, Amit Langote wrote:
>>> Done a few moments ago. :)
>> From the file size this move is actually negative. From what I can see
>> partcache decreases to 400 lines, while partbounds increases to 3k
>> lines.
> Hmm, is that because partbounds.c contains more functionality?

The move you are doing here makes sense, now refactoring is better if we
could avoid transforming a large file into a larger one and preserve
some more balance in the force.

> I think we can design the interface of partition_bounds_create such that
> it returns information needed to re-arrange OIDs to be in the canonical
> partition bound order, so the actual re-ordering of OIDs can be done by
> the caller itself. (It may not always be OIDs, could be something else
> like a struct to represent fake partitions.)

Interesting point. Do we have already in the code a new case for fake
partitions or a case where the partition OIDs are not used? I was
thinking why the partitions OIDs are not actually directly embedded in
each PartitionBoundInfo, removing at the same time the index numbers
used for the mapping. It looks sensible to return the mapping based
on your argument.

>> The first phase building the bounds should switch to a switch/case like
>> the second phase.
> That bothered me too, so I looked at whether it'd be a good idea to have
> just one switch () block and put all the code for different partitioning
> methods into it. The resulting code seems more readable to me as one
> doesn't have to shuffle between looking at the first block that generates
> Datums from PartitionBoundSpecs and the second block that assigns them to
> the Datum array in PartitionBoundInfo, for each partitioning method
> respectively.

Yeah, merging the first and second parts makes the most sense.

> I've broken this down this into 3 patches for ease of review:
> 1. Local re-arrangement of the code in RelationBuildPartitionDesc such
> that the code that creates PartitionBoundInfo is clearly separated. I've
> also applied some of the comments above to that piece of code such as
> unifying all of that code in one switch() block. I also found an
> opportunity to simplify the code that maps canonical partition indexes to
> original partition indexes.
> 2. Re-arrangement local to partcache.c that moves out the separated code
> into partition_bounds_create() with the interface mentioned above.
> 3. Move partition_bounds_create to partbounds.c and make some functions
> that are not needed outside partbounds.c static to that file.

That actually helps a lot in seeing how you handle the move, thanks.
What do you think about splitting the code in partition_bounds_create
even a bit more by having one routine per strategy?

Thinking crazy, we could also create a subfolder partitioning/bounds/
which includes three files with this refactoring:x
- range.c
- values.c
- hash.c
Then we keep partbounds.c which is the entry point used by partcache.c,
and partbounds.c relies on the stuff in each other sub-file when
building a strategy. This move could be interesting in the long term as
there are comparison routines and structures for each strategy if more
strategies are added.

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Higuchi, Daisuke 2018-11-07 08:22:34 RE: [Bug Fix]ECPG: cancellation of significant digits on ECPG
Previous Message Andres Freund 2018-11-07 07:24:08 Re: Reduce maintenance burden of alternative output files with \if \quit