Re: [PATCH] fix GIN index search sometimes losing results

From: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
To: Artur Zakirov <zaartur(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] fix GIN index search sometimes losing results
Date: 2020-07-02 16:23:02
Message-ID: CALT9ZEEODyQD2gp1e04EUuwU-ppCqz61Ld9wLOAVEHjer3kNFg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

чт, 2 июл. 2020 г. в 19:38, Artur Zakirov <zaartur(at)gmail(dot)com>:

> Hello,
>
> On Thu, Jul 2, 2020 at 8:23 PM Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
> wrote:
> >
> > ср, 1 июл. 2020 г. в 23:16, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> >>
> >> Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com> writes:
> >> > Below is my variant how to patch Gin-Gist weights issue:
> >>
> >> I looked at this patch, but I'm unimpressed, because it's buggy.
> >
> >
> > Thank you, i'd noticed and made minor corrections in the patch. Now it
> should work
> > correctly,
> >
> > As for preserving the option to use legacy bool-style calls, personally
> I see much
> > value of not changing API ad hoc to fix something. This may not harm
> vanilla reseases
> > but can break many possible side things like RUM index etc which I think
> are abundant
> > around there. Furthermore if we leave legacy bool callback along with
> newly proposed and
> > recommended for further use it will cost nothing.
> >
> > So I've attached a corrected patch. Also I wrote some comments to the
> code and added
> > your test as a part of apatch. Again thank you for sharing your thoughts
> and advice.
> >
> > As always I'd appreciate everyone's opinion on the bugfix.
>
> I haven't looked at any of the patches carefully yet. But I tried both of
> them.
>
> I tried Tom's patch. To compile the RUM extension I've made few
> changes to use new
> TS_execute(). Speaking about backward compatibility. I also think that
> it is not so important
> here. And RUM alreadyhas a number of "#if PG_VERSION_NUM" directives. API
> breaks
> from time to time and it seems inevitable.
>
> I also tried "gin-gist-weight-patch-v4.diff". And it didn't require
> changes into RUM. But as
> Tom said above TS_execute() is broken already. Here is the example with
> "gin-gist-weight-patch-v4.diff" and RUM:
>
> =# create extension rum;
> =# create table test (a tsvector);
> =# insert into test values ('wd:1A wr:2D'), ('wd:1A wr:2D');
> =# create index on test using rum (a);
> =# select a from test where a @@ '!wd:D';
> a
> ----------------
> 'wd':1A 'wr':2
> 'wd':1A 'wr':2
> (2 rows)
> =# set enable_seqscan to off;
> =# select a from test where a @@ '!wd:D';
> a
> ---
> (0 rows)
>
> So it seems we are losing some results with RUM as well.
>
> --
> Artur
>
For me it is 100% predictable that unmodified RUM is still losing results
as it is still using binary callback.
The main my goal of saving binary legacy callback is that side callers like
RUM will not break immediately but remain in
existing state (i.e. losing results in some queries). To fix the issue
completely it is needed to make ternary logic in
Postgres Tsearch AND engage this ternary logic in RUM and other side
modules.

Thank you for your consideration!

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com <http://www.postgrespro.com>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jaka Jančar 2020-07-02 16:30:51 Sync vs Flush
Previous Message James Coleman 2020-07-02 16:01:21 Re: Use of "long" in incremental sort code