Re: DRAFT: Pass sk_attno to consistent function

From: Michał Kłeczek <michal(at)kleczek(dot)org>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: DRAFT: Pass sk_attno to consistent function
Date: 2024-03-18 14:17:54
Message-ID: 78D9054A-B4EC-475D-8118-C434C6D3261D@kleczek.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Wrong file in the previous message - sorry for the noise.

Attached is a fixed patch.

Thanks,
Michal

> On 18 Mar 2024, at 15:14, Michał Kłeczek <michal(at)kleczek(dot)org> wrote:
>
> I realised it might be enough to pass sk_attno to consistent function as that should be enough to lookup metadata if needed.
>
> Attached is a draft patch that does this.
>
> —
> Michal
>
> <0001-Pass-key-sk_attno-to-consistent-function.patch>
>
>> On 18 Mar 2024, at 14:31, Michał Kłeczek <michal(at)kleczek(dot)org> wrote:
>>
>> Hi All,
>>
>>> On 11 Mar 2024, at 18:58, Michał Kłeczek <michal(at)kleczek(dot)org> wrote:
>>>
>>> Hi All,
>>>
>>> During my journey of designing Pg based solution at my work I was severely hit by several shortcomings in GiST.
>>> The most severe one is the lack of support for SAOP filters as it makes it difficult to have partition pruning and index (only) scans working together.
>>>
>>> To overcome the difficulties I implemented a simple extension:
>>> https://github.com/mkleczek/btree_gist_extra
>>>
>>> Since it provides a separate operator class from btree_gist it requires re-indexing the data.
>>> So I thought maybe it would be a good idea to incorporate it into btree_gist.
>>>
>>
>> While working on supporting (sort of) SAOP support for Gist I was stuck with the interplay between Gist consistent function and partition pruning.
>> Not sure how it applies to SAOP handling in general though.
>>
>> I’ve implemented an operator class/family that supports Gist index scan for the following query:
>>
>> SELECT * FROM tbl WHERE col ||= ARRAY[‘a’, ‘b’, ‘c’];
>>
>> It all works well except cases where tbl is partitioned by “col” column.
>> In this case index scan unnecessarily scans pages for values that are not in the partition.
>>
>> I am not sure if it works as expected (ie. no unnecessary scans) in case of ANY = (ARRAY[]) and nbtree.
>> In case of Gist the only place where pre-processing of queries can be performed is consistent function.
>> But right now there is no possibility to access any scan related information as it is not passed to consistent function.
>> The only thing available is GISTENTRY and it does not contain any metadata.
>>
>> As a workaround I’ve added options to the op family that allows (redundantly) specifying MODULUS/REMAINDER for the index:
>>
>> CREATE INDEX idx ON tbl_partition_01 USING gist ( col gist_text_extra_ops (modulus=4, remainder=2) )
>>
>> and use the values to filter out query array passed to consistent function.
>>
>> This is of course not ideal:
>> - the information is redundant
>> - changing partitioning scheme requires re-indexing
>>
>> I am wondering if it is possible to extend consistent function API so that either ScanKey or even the whole IndexScanDesc is passed as additional parameter.
>>
>> —
>> Michal
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-03-18 14:23:32 Re: Regression tests fail with musl libc because libpq.so can't be loaded
Previous Message Michał Kłeczek 2024-03-18 14:14:03 DRAFT: Pass sk_attno to consistent function