Re: Speeding up GIST index creation for tsvectors

From: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
Cc: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Speeding up GIST index creation for tsvectors
Date: 2021-08-02 10:26:37
Message-ID: CAFBsxsHBJsuCqe1kd2OVm-M4i=0eEQ=h0ESRp3fvrU_S9FXsUA@mail.gmail.com
Lists: pgsql-hackers

On Sun, Aug 1, 2021 at 11:41 PM Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
>
> > FWIW, I anticipate some push back from the community because of the
> > fact that the optimization relies on statistical phenomena.
>
> I dug into this issue for tsvector type. Found out that it's the way
> in which the sign array elements are arranged that is causing the
> pointers to be misaligned:
[...]
> If siglen is not a multiple of 8 (say 700), cache[j].sign will in some
> cases point to non-8-byte-aligned addresses, as you can see in the
> above code snippet.
>
> Replacing siglen by MAXALIGN64(siglen) in the above snippet gets rid
> of the misalignment. This change applied over the 0001-v3 patch gives
> additional ~15% benefit. MAXALIGN64(siglen) will cause a bit more
> space, but for not-so-small siglens, this looks worth doing. Haven't
> yet checked into types other than tsvector.
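
For readers following along, the arithmetic in the quoted text can be sketched roughly as below. This is only an illustration, not code from the patch: ALIGN8 is a hypothetical stand-in for MAXALIGN64 that assumes an 8-byte maximum alignment, and count_misaligned is a made-up helper that assumes the signatures are packed back to back starting from an 8-byte-aligned base.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for MAXALIGN64, assuming the maximum alignment
 * is 8 bytes: round len up to the next multiple of 8.
 */
#define ALIGN8(len) (((size_t) (len) + 7) & ~((size_t) 7))

/*
 * Count how many of n signatures start at an offset that is not a
 * multiple of 8, when they are laid out back to back with the given
 * stride from an 8-byte-aligned base address (an assumption here).
 */
static size_t
count_misaligned(size_t n, size_t stride)
{
    size_t misaligned = 0;

    assert(stride > 0);

    for (size_t j = 0; j < n; j++)
    {
        size_t offset = j * stride; /* offset of the j-th signature */

        if (offset % 8 != 0)
            misaligned++;
    }
    return misaligned;
}
```

With a stride of 700, every odd-numbered signature lands on a 4-byte boundary only, since 700 leaves a remainder of 4 when divided by 8. Padding the stride to ALIGN8(700), which is 704, keeps every signature 8-byte aligned at a cost of 4 extra bytes per signature.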

Sounds good.

> Will get back with your other review comments. I thought, meanwhile, I
> can post the above update first.

Thinking some more, my discomfort with inline functions that call a global
function doesn't make logical sense, so feel free to do it that way if you
like.

--
John Naylor
EDB: http://www.enterprisedb.com
