Re: [HACKERS] path toward faster partition pruning

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, amul sul <sulamul(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Beena Emerson <memissemerson(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] path toward faster partition pruning
Date: 2017-11-14 04:00:12
Message-ID: CAKJS1f-1Lc_b=Y1iicPQzvUgSn1keHSgmRqLuOGq_VR6M==zbw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On 13 November 2017 at 22:46, Amit Langote
<Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> On 2017/11/10 12:30, Kyotaro HORIGUCHI wrote:
>> In 0002, bms_add_range has a bit naive-looking loop
>>
>> + while (wordnum <= uwordnum)
>> + {
>> + bitmapword mask = (bitmapword) ~0;
>> +
>> + /* If working on the lower word, zero out bits below 'lower'. */
>> + if (wordnum == lwordnum)
>> + {
>> + int lbitnum = BITNUM(lower);
>> + mask >>= lbitnum;
>> + mask <<= lbitnum;
>> + }
>> +
>> + /* Likewise, if working on the upper word, zero bits above 'upper' */
>> + if (wordnum == uwordnum)
>> + {
>> + int ushiftbits = BITS_PER_BITMAPWORD - (BITNUM(upper) + 1);
>> + mask <<= ushiftbits;
>> + mask >>= ushiftbits;
>> + }
>> +
>> + a->words[wordnum++] |= mask;
>> + }
>>
>> Without some aggressive optimization, the loop takes most of the
>> time to check-and-jump for nothing especially with many
>> partitions and somewhat unintuitive.
>>
>> The following uses a bit tricky bitmap operation but
>> is straightforward as a whole.
>>
>> =====
>> /* fill the bits upper from BITNUM(lower) (0-based) of the first word */
>> a->workds[wordnum++] += ~(bitmapword)((1 << BITNUM(lower)) - 1);
>>
>> /* fill up intermediate words */
>> while (wordnum < uwordnum)
>> a->words[wordnum++] = ~(bitmapword) 0;
>>
>> /* fill up to BITNUM(upper) bit (0-based) of the last word */
>> a->workds[wordnum++] |=
>> (~(bitmapword) 0) >> (BITS_PER_BITMAPWORD - (BITNUM(upper) - 1));
>> =====
>
> Considering also the David's comment downthread, I will try to incorporate
> this into bms_add_range().

I've attached an implementation of the patch using this method.

I've also attached bitset.c which runs each through their paces. I'd
have expected Kyotaro's method to be faster, but gcc 7.2 with -O2
generates very slightly slower code. I didn't really check why. clang
seems to do a better job with it.

$ gcc -O2 bitset.c -o bitset && ./bitset
bms_add_range in 0.694254 (6.94254 ns per loop)
bms_add_range2 in 0.726643 (7.26643 ns per loop)
11111111111111111111111111111110
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
-------------
11111111111111111111111111111110
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
$ gcc --version
gcc (Ubuntu 7.2.0-8ubuntu3) 7.2.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ clang -O2 bitset.c -o bitset && ./bitset
bms_add_range in 0.866554 (8.66554 ns per loop)
bms_add_range2 in 0.467138 (4.67138 ns per loop)
11111111111111111111111111111110
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
-------------
11111111111111111111111111111110
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
11111111111111111111111111111111
$ clang --version
clang version 4.0.1-6 (tags/RELEASE_401/final)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

Probably just go with Kyotaro's idea (v2). I don't think this is worth
debating, I just wanted to show it's not that clear-cut.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
bms_add_range_v2.patch application/octet-stream 2.9 KB
bitset.c text/x-csrc 3.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2017-11-14 04:07:01 Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
Previous Message Connor Wolf 2017-11-14 03:08:28 Re: [HACKERS] How to implement a SP-GiST index as a extension module?