Re: doc: BRIN indexes and autosummarize

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Roberto Mello <roberto(dot)mello(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>
Subject: Re: doc: BRIN indexes and autosummarize
Date: 2022-07-04 21:22:28
Message-ID: 20220704212227.GN13040@telsasoft.com
Lists: pgsql-hackers

On Mon, Jul 04, 2022 at 09:38:42PM +0200, Alvaro Herrera wrote:
> + There are several triggers for initial summarization of a page range
> + to occur. If the table is vacuumed, either because
> + <xref linkend="sql-vacuum" /> has been manually invoked or because
> + autovacuum causes it,
> + all existing unsummarized page ranges are summarized.

I'd say "If the table is vacuumed manually or by autovacuum, ..."
(Or "either manually or by autovacuum, ...")

> + Also, if the index has the
> + <xref linkend="index-reloption-autosummarize"/> parameter set to on,

Maybe say "If the autosummarize parameter is enabled" (this may avoid needing to
revise it later if we change the default).

> + then any run of autovacuum in the database will summarize all

I'd avoid saying "run" and instead say "then anytime autovacuum runs in that
database, all ..."

> + unsummarized page ranges that have been completely filled recently,
> + regardless of whether the table is processed by autovacuum for other
> + reasons; see below.

say "whether the table itself" and remove "for other reasons" ?

> <para>
> When autosummarization is enabled, each time a page range is filled a

Maybe add a comma: "filled,"

> - request is sent to autovacuum for it to execute a targeted summarization
> - for that range, to be fulfilled at the end of the next worker run on the
> - same database. If the request queue is full, the request is not recorded
> - and a message is sent to the server log:
> + request is sent to <literal>autovacuum</literal> for it to execute a targeted
> + summarization for that range, to be fulfilled at the end of the next
> + autovacuum worker run on the same database. If the request queue is full, the

"to be fulfilled the next time an autovacuum worker finishes running in that
database."

or

"to be fulfilled by an autovacuum worker the next time it finishes running in that
database."

> +++ b/doc/src/sgml/ref/create_index.sgml
> @@ -580,6 +580,8 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] <replaceable class=
> <para>
> Defines whether a summarization run is invoked for the previous page
> range whenever an insertion is detected on the next one.
> + See <xref linkend="brin-operation"/> for more details.
> + The default is <literal>off</literal>.

Maybe "invoked" should say "queued" ?

Also, a reminder that this was never addressed (I wish the project had a way to
keep track of known issues).

https://www.postgresql.org/message-id/20201113160007.GQ30691@telsasoft.com
|error_severity of brin work item
|left | could not open relation with OID 292103095
|left | processing work entry for relation "ts.child.alarms_202010_alarm_clear_time_idx"
|Those happen following a REINDEX job on that index.

This inline patch includes my changes as well as yours.
And the attached patch is my changes only.

diff --git a/doc/src/sgml/brin.sgml b/doc/src/sgml/brin.sgml
index caf1ea4cef1..90897a4af07 100644
--- a/doc/src/sgml/brin.sgml
+++ b/doc/src/sgml/brin.sgml
@@ -73,31 +73,55 @@
summarized range, that range does not automatically acquire a summary
tuple; those tuples remain unsummarized until a summarization run is
invoked later, creating initial summaries.
- This process can be invoked manually using the
- <function>brin_summarize_range(regclass, bigint)</function> or
- <function>brin_summarize_new_values(regclass)</function> functions;
- automatically when <command>VACUUM</command> processes the table;
- or by automatic summarization executed by autovacuum, as insertions
- occur. (This last trigger is disabled by default and can be enabled
- with the <literal>autosummarize</literal> parameter.)
- Conversely, a range can be de-summarized using the
- <function>brin_desummarize_range(regclass, bigint)</function> function,
- which is useful when the index tuple is no longer a very good
- representation because the existing values have changed.
</para>

<para>
- When autosummarization is enabled, each time a page range is filled a
- request is sent to autovacuum for it to execute a targeted summarization
- for that range, to be fulfilled at the end of the next worker run on the
- same database. If the request queue is full, the request is not recorded
- and a message is sent to the server log:
+ There are several ways to trigger the initial summarization of a page range.
+ If the table is vacuumed, either manually or by
+ <link linkend="autovacuum">autovacuum</link>,
+ all existing unsummarized page ranges are summarized.
+ Also, if the index's
+ <xref linkend="index-reloption-autosummarize"/> parameter is enabled,
+ whenever autovacuum runs in that database, summarization will
+ occur for all
+ unsummarized page ranges that have been filled,
+ regardless of whether the table itself is processed by autovacuum; see below.
+
+ Lastly, the following functions can be used:
+
+ <simplelist>
+ <member>
+ <function>brin_summarize_range(regclass, bigint)</function>
+ summarizes one specific range, if it is unsummarized
+ </member>
+ <member>
+ <function>brin_summarize_new_values(regclass)</function>
+ summarizes all unsummarized ranges
+ </member>
+ </simplelist>
+ </para>
+
+ <para>
+ When autosummarization is enabled, each time a page range is filled, a
+ request is sent to <literal>autovacuum</literal> to execute a targeted
+ summarization for that range, to be fulfilled the next time an autovacuum
+ worker finishes running in that database. If the request queue is full, the
+ request is not recorded and a message is sent to the server log:
<screen>
LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was not recorded
</screen>
When this happens, the range will be summarized normally during the next
regular vacuum of the table.
</para>
+
+ <para>
+ Conversely, a range can be de-summarized using the
+ <function>brin_desummarize_range(regclass, bigint)</function> function,
+ which is useful when the index tuple is no longer a very good
+ representation because the existing values have changed.
+ See <xref linkend="functions-admin-index"/> for details.
+ </para>
+
</sect2>
</sect1>

diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 9ffcdc629e6..a5bac9f7373 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -578,8 +578,10 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] <replaceable class=
</term>
<listitem>
<para>
- Defines whether a summarization run is invoked for the previous page
+ Defines whether a summarization run is queued for the previous page
range whenever an insertion is detected on the next one.
+ See <xref linkend="brin-operation"/> for more details.
+ The default is <literal>off</literal>.
</para>
</listitem>
</varlistentry>
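
For reference, a minimal sketch of the manual entry points and the reloption
that the text above describes. The table and index names (brin_demo,
brin_demo_ts_idx) and the pages_per_range value are made up for illustration
and are not part of the patch; the functions and the autosummarize storage
parameter are the existing ones named in the docs.

```sql
-- Hypothetical table and index, for illustration only.
CREATE TABLE brin_demo (ts timestamptz, val int);

-- autosummarize is off by default; enable it explicitly on the index.
CREATE INDEX brin_demo_ts_idx ON brin_demo USING brin (ts)
    WITH (pages_per_range = 32, autosummarize = on);

-- Summarize every page range that is currently unsummarized.
SELECT brin_summarize_new_values('brin_demo_ts_idx');

-- Summarize only the range covering heap block 0, if it is unsummarized.
SELECT brin_summarize_range('brin_demo_ts_idx', 0);

-- Remove the summary tuple for the range covering heap block 0.
SELECT brin_desummarize_range('brin_demo_ts_idx', 0);

-- The reloption can also be changed after the index exists.
ALTER INDEX brin_demo_ts_idx SET (autosummarize = off);
```

With autosummarize off, ranges filled after the index is built should stay
unsummarized until one of the functions above is called or the table is
vacuumed.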

Attachment: 0001-f.txt (text/x-diff, 3.2 KB)
