Re: [PATCH] Opclass parameters

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>, pgsql-hackers(at)postgresql(dot)org, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Nikolay Shaplov <dhyan(at)nataraj(dot)su>, Oleg Bartunov <obartunov(at)gmail(dot)com>
Subject: Re: [PATCH] Opclass parameters
Date: 2020-03-31 02:44:19
Message-ID: 20200331024419.GB14618@telsasoft.com
Lists: pgsql-hackers

On Sat, Mar 28, 2020 at 06:05:51PM +0300, Alexander Korotkov wrote:
> On Wed, Mar 18, 2020 at 3:28 AM Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru> wrote:
> > Attached new version of reordered patches.
>
> I'm going to push this if no objections.

Please find attached a patch with editorial corrections to the docs for this commit.

--word-diff to follow.

commit d3f077b813efa90b25a162bf8d227f3e4218c248
Author: Justin Pryzby <pryzbyj(at)telsasoft(dot)com>
Date: Mon Mar 30 20:55:06 2020 -0500

Doc review: Implement operator class parameters

commit 911e70207703799605f5a0e8aad9f06cff067c63
Author: Alexander Korotkov <akorotkov(at)postgresql(dot)org>

diff --git a/doc/src/sgml/hstore.sgml b/doc/src/sgml/hstore.sgml
index f1f2b08cd7..b2e04d0815 100644
--- a/doc/src/sgml/hstore.sgml
+++ b/doc/src/sgml/hstore.sgml
@@ -468,13 +468,13 @@ CREATE INDEX hidx ON testhstore USING GIN (h);
</programlisting>

<para>
<literal>gist_hstore_ops</literal> GiST opclass approximates {+a+} set of
key/value pairs as a bitmap signature. [-Optional-]{+Its optional+} integer parameter
<literal>siglen</literal>[-of <literal>gist_hstore_ops</literal>-] determines {+the+}
signature length in bytes. [-Default signature-]{+The default+} length is 16 bytes.
Valid values of signature length are between 1 and 2024 bytes. Longer
signatures [-leads-]{+lead+} to {+a+} more precise search [-(scan less-]{+(scanning a smaller+} fraction of [-index, scan-]
[- less-]{+the index and+}
{+ fewer+} heap pages), [-but-]{+at the cost of a+} larger index.
</para>

<para>
diff --git a/doc/src/sgml/intarray.sgml b/doc/src/sgml/intarray.sgml
index 72b4b23c15..7956a746a6 100644
--- a/doc/src/sgml/intarray.sgml
+++ b/doc/src/sgml/intarray.sgml
@@ -265,7 +265,7 @@
</para>

<para>
Two [-parametrized-]{+parameterized+} GiST index operator classes are provided:
<literal>gist__int_ops</literal> (used by default) is suitable for
small- to medium-size data sets, while
<literal>gist__intbig_ops</literal> uses a larger signature and is more
@@ -276,22 +276,23 @@
</para>

<para>
<literal>gist__int_ops</literal> approximates {+an+} integer set as an array of
integer ranges. [-Optional-]{+Its optional+} integer parameter <literal>numranges</literal>[-of-]
[- <literal>gist__int_ops</literal>-]
determines {+the+} maximum number of ranges in
one index key. [-Default-]{+The default+} value of <literal>numranges</literal> is 100.
Valid values are between 1 and 253. Using larger arrays as GiST index
keys leads to {+a+} more precise search [-(scan less-]{+(scanning a smaller+} fraction of [-index, scan less-]{+the index and+}
{+ fewer+} heap pages), [-but-]{+at the cost of a+} larger index.
</para>

<para>
<literal>gist__intbig_ops</literal> approximates {+an+} integer set as a bitmap
[-signature. Optional-]{+signature XXX. Its optional+} integer parameter <literal>siglen</literal>[-of-]
[- <literal>gist__intbig_ops</literal>-]
determines {+the+} signature length in bytes.
[-Default-]{+The default+} signature length is 16 bytes. Valid values of signature length
are between 1 and 2024 bytes. Longer signatures [-leads-]{+lead+} to {+a+} more precise
search [-(scan less-]{+(scanning a smaller+} fraction of [-index, scan less-]{+the index and fewer+} heap pages), [-but-]{+at+}
{+ the cost of a+} larger index.
</para>

<para>
diff --git a/doc/src/sgml/ltree.sgml b/doc/src/sgml/ltree.sgml
index ae4b33ec85..4971b71524 100644
--- a/doc/src/sgml/ltree.sgml
+++ b/doc/src/sgml/ltree.sgml
@@ -506,16 +506,16 @@ Europe &amp; Russia*@ &amp; !Transportation
<literal>@</literal>, <literal>~</literal>, <literal>?</literal>
</para>
<para>
<literal>gist_ltree_ops</literal> GiST opclass approximates {+a+} set of
path labels as a bitmap signature. [-Optional-]{+Its optional+} integer parameter
<literal>siglen</literal>[-of <literal>gist_ltree_ops</literal>-] determines {+the+}
signature length in bytes. [-Default-]{+The default+} signature length is 8 bytes.
Valid values of signature length are between 1 and 2024 bytes. Longer
signatures [-leads-]{+lead+} to {+a+} more precise search [-(scan less-]{+(scanning a smaller+} fraction of [-index, scan-]
[- less-]{+the index and+}
{+ fewer+} heap pages), [-but-]{+at the cost of a+} larger index.
</para>
<para>
Example of creating such an index with [-a-]{+the+} default signature length of 8 bytes:
</para>
<programlisting>
CREATE INDEX path_gist_idx ON test USING GIST (path);
@@ -535,13 +535,13 @@ CREATE INDEX path_gist_idx ON test USING GIST (path gist_ltree_ops(siglen=100));
<literal>@</literal>, <literal>~</literal>, <literal>?</literal>
</para>
<para>
<literal>gist__ltree_ops</literal> GiST opclass works [-similar-]{+similarly+} to
<literal>gist_ltree_ops</literal> and also takes signature length as
a parameter. [-Default-]{+The default+} value of <literal>siglen</literal> in
<literal>gist__ltree_ops</literal> is 28 bytes.
</para>
<para>
Example of creating such an index with [-a-]{+the+} default signature length of 28 bytes:
</para>
<programlisting>
CREATE INDEX path_gist_idx ON test USING GIST (array_path);
diff --git a/doc/src/sgml/pgtrgm.sgml b/doc/src/sgml/pgtrgm.sgml
index dde02634ae..97b3d13a88 100644
--- a/doc/src/sgml/pgtrgm.sgml
+++ b/doc/src/sgml/pgtrgm.sgml
@@ -391,13 +391,13 @@ CREATE INDEX trgm_idx ON test_trgm USING GIN (t gin_trgm_ops);
</para>

<para>
<literal>gist_trgm_ops</literal> GiST opclass approximates {+a+} set of
trigrams as a bitmap signature. [-Optional-]{+Its optional+} integer parameter
<literal>siglen</literal>[-of <literal>gist_trgm_ops</literal>-] determines {+the+}
signature length in bytes. [-Default signature-]{+The default+} length is 12 bytes.
Valid values of signature length are between 1 and 2024 bytes. Longer
signatures [-leads-]{+lead+} to {+a+} more precise search [-(scan less-]{+(scanning a smaller+} fraction of [-index, scan-]
[- less-]{+the index and+}
{+ fewer+} heap pages), [-but-]{+at the cost of a+} larger index.
</para>

<para>
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 2217fcd6c2..0dc427289d 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3670,17 +3670,17 @@ SELECT plainto_tsquery('supernovae stars');
to check the actual table row to eliminate such false matches.
(<productname>PostgreSQL</productname> does this automatically when needed.)
GiST indexes are lossy because each document is represented in the
index by a fixed-length signature. [-Signature-]{+The signature+} length in bytes is determined
by the value of the optional integer parameter <literal>siglen</literal>.
[-Default-]{+The default+} signature length (when <literal>siglen</literal> is not [-specied)-]{+specified)+} is
124 bytes[-, maximal-]{+; the maximum signature+} length is 2024 bytes. The signature is generated by hashing
each word into a single bit in an n-bit string, with all these bits OR-ed
together to produce an n-bit document signature. When two words hash to
the same bit position there will be a false match. If all words in
the query have matches (real or false) then the table row must be
retrieved to see if the match is correct. Longer signatures [-leads-]{+lead+} to {+a+} more
precise search [-(scan less-]{+(scanning a smaller+} fraction of [-index, scan less-]{+the index and fewer+} heap
pages), [-but-]{+at the cost of a+} larger index.
</para>

<para>
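
For readers who want to see why a longer signature gives a more precise search, here is a minimal Python sketch of the bitmap-signature idea described in the textsearch.sgml hunk above. It is an illustration only and not part of the patch: `zlib.crc32` stands in for PostgreSQL's real hash function, and the names `signature`, `may_match`, and `false_match_count` are made up for this example.

```python
import zlib


def signature(words, siglen):
    """Build a siglen-byte bitmap signature, setting one bit per word."""
    nbits = siglen * 8
    sig = 0
    for word in words:
        # crc32 is a stand-in hash (assumption); PostgreSQL uses its own.
        sig |= 1 << (zlib.crc32(word.encode("utf-8")) % nbits)
    return sig


def may_match(doc_sig, query_words, siglen):
    """True if every query word's bit is set in doc_sig.

    A True result is either a real match or a false match, so the
    table row would still have to be rechecked.
    """
    query_sig = signature(query_words, siglen)
    return (doc_sig & query_sig) == query_sig


def false_match_count(doc_words, probe_words, siglen):
    """Count probe words that are absent from the document but still match."""
    doc_sig = signature(doc_words, siglen)
    present = set(doc_words)
    return sum(
        1
        for w in probe_words
        if w not in present and may_match(doc_sig, [w], siglen)
    )


doc = [f"word{i}" for i in range(50)]
probes = [f"other{i}" for i in range(2000)]
for siglen in (1, 16, 124, 2024):
    print(siglen, false_match_count(doc, probes, siglen))
```

Running it prints the number of false matches among 2000 probe words for several signature lengths. The count should generally drop as `siglen` grows, which is the trade-off the docs describe: fewer rechecks in exchange for a larger signature per document.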

Attachment: v1-0001-Doc-review-Implement-operator-class-parameters.patch (text/x-diff, 9.4 KB)
