Re: Skip partition tuple routing with constant partition key

From: Zhihong Yu <zyu(at)yugabyte(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Subject: Re: Skip partition tuple routing with constant partition key
Date: 2022-07-26 23:22:22
Message-ID: CALNJ-vQa0ov_9T746rTSy+rQOBmBpq4VhCybie2H-=rG+Zz-7A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jul 26, 2022 at 3:28 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:

> Thank for looking at this.
>
> On Sat, 23 Jul 2022 at 01:23, Amit Langote <amitlangote09(at)gmail(dot)com>
> wrote:
> > + /*
> > + * The Datum has changed. Zero the number of times
> we've
> > + * found last_found_datum_index in a row.
> > + */
> > + partdesc->last_found_count = 0;
> >
> > + /* Zero the "winning streak" on the cache hit count
> */
> > + partdesc->last_found_count = 0;
> >
> > Might it be better for the two comments to say the same thing? Also,
> > I wonder which one do you intend as the resetting of last_found_count:
> > setting it to 0 or 1? I can see that the stanza at the end of the
> > function sets to 1 to start a new cycle.
>
> I think I've addressed all of your comments. The above one in
> particular caused me to make some larger changes.
>
> The reason I was zeroing the last_found_count in LIST partitioned
> tables when the Datum was not equal to the previous found Datum was
> due to the fact that the code at the end of the function was only
> checking the partition indexes matched rather than the bound_offset vs
> last_found_datum_index. The reason I wanted to zero this was that if
> you had a partition FOR VALUES IN(1,2), and you received rows with
> values alternating between 1 and 2 then we'd match to the same
> partition each time, however the equality test with the current
> 'values' and the Datum at last_found_datum_index would have been false
> each time. If we didn't zero the last_found_count we'd have kept
> using the cache path even though the Datum and last Datum wouldn't
> have been equal each time. That would have resulted in always doing
> the cache check and failing, then doing the binary search anyway.
>
> I've now changed the code so that instead of checking the last found
> partition is the same as the last one, I'm now checking if
> bound_offset is the same as last_found_datum_index. This will be
> false in the "values alternating between 1 and 2" case from above.
> This caused me to have to change how the caching works for LIST
> partitions with a NULL partition which is receiving NULL values. I've
> coded things now to just skip the cache for that case. Finding the
> correct LIST partition for a NULL value is cheap and no need to cache
> that. I've also moved all the code which updates the cache fields to
> the bottom of get_partition_for_tuple(). I'm only expecting to do that
> when bound_offset is set by the lookup code in the switch statement.
> Any paths, e.g. HASH partitioning lookup and LIST or RANGE with NULL
> values shouldn't reach the code which updates the partition fields.
> I've added an Assert(bound_offset >= 0) to ensure that stays true.
>
> There's probably a bit more to optimise here too, but not much. I
> don't think the partdesc->last_found_part_index = -1; is needed when
> we're in the code block that does return boundinfo->default_index;
> However, that only might very slightly speedup the case when we're
> inserting continuously into the DEFAULT partition. That code path is
> also used when we fail to find any matching partition. That's not one
> we need to worry about making go faster.
>
> I also ran the benchmarks again and saw that most of the use of
> likely() and unlikely() no longer did what I found them to do earlier.
> So the weirdness we saw there most likely was just down to random code
> layout changes. In this patch, I just dropped the use of either of
> those two macros.
>
> David
>
Hi,

+ return boundinfo->indexes[last_datum_offset + 1];
+
+ else if (cmpval < 0 && last_datum_offset + 1 <
boundinfo->ndatums)

nit: the `else` keyword is not needed.

Cheers

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-07-26 23:22:52 Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler?
Previous Message Justin Pryzby 2022-07-26 23:20:19 Re: [Commitfest 2022-07] Patch Triage: Waiting on Author