PostgreSQL Weekly News - August 29, 2021

From: PWN via PostgreSQL Announce <announce-noreply(at)postgresql(dot)org>
To: PostgreSQL Announce <pgsql-announce(at)lists(dot)postgresql(dot)org>
Subject: PostgreSQL Weekly News - August 29, 2021
Date: 2021-08-30 06:57:19
Lists: pgsql-announce

# PostgreSQL Weekly News - August 29, 2021

# PostgreSQL Product News

pg_dbms_job 1.0.1, an extension to create, manage, and use Oracle-style
DBMS_JOB scheduled jobs, released.

dbMigration .NET v14.4, a database migration and sync tool, released.

WAL-G 1.1, a backup management system for PostgreSQL and other databases
written in Go, released.

pglogical 2.4.0, a logical-WAL-based replication system for PostgreSQL,
released.

Crunchy PostgreSQL Operator 5.0.0, a system for deploying and managing open
source PostgreSQL clusters on Kubernetes, released.

`set_user` 2.0.1, an extension allowing privilege escalation with enhanced
logging and control, released.

AGE 0.5.0, a PostgreSQL extension that provides graph database functionality,
released.

pg_msvc_generator 1.0.0 beta, a tool for making Windows versions of
extensions, released.

# PostgreSQL Jobs for August


# PostgreSQL in the News

Planet PostgreSQL

PostgreSQL Weekly News is brought to you this week by David Fetter.

Submit news and announcements by Sunday at 3:00pm PST8PDT to david(at)fetter(dot)org(dot)

# Applied Patches

Michaël Paquier pushed:

- Fix backup manifests to generate correct WAL-Ranges across timelines. In a
backup manifest, WAL-Ranges stores the range of WAL that is required for the
backup to be valid. pg_verifybackup would then internally use pg_waldump for
the checks based on this data. When the backup started on a timeline later
than 1, with a history file consulted during manifest generation, the
calculation of the WAL range for the first timeline to check was incorrect:
the previous logic used the start position of the first timeline as the start
LSN, but it needs to use the start LSN of the backup. This would cause
failures with pg_verifybackup, or any tools making use of the backup
manifests. This commit adds a test using a self-promoted node, making it
rather cheap. Author: Kyotaro Horiguchi Discussion:
Backpatch-through: 13
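The manifest's WAL-Ranges section is a JSON array of per-timeline entries. A minimal Python sketch of reading it follows; the LSN values here are invented for illustration, not taken from a real backup:

```python
import json

# Illustrative excerpt of a backup_manifest (values are made up).
manifest_excerpt = json.loads("""
{
  "WAL-Ranges": [
    {"Timeline": 1, "Start-LSN": "0/2000028", "End-LSN": "0/3000000"},
    {"Timeline": 2, "Start-LSN": "0/3000000", "End-LSN": "0/3000138"}
  ]
}
""")

# The fix ensures the first range starts at the backup's start LSN,
# not at the start of the first timeline.
first = manifest_excerpt["WAL-Ranges"][0]
print(first["Timeline"], first["Start-LSN"])
```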

- Add tab completion for EXPLAIN .. EXECUTE in psql. Author: Dagfinn Ilmari
Mannsåker Discussion:

- Fix incorrect merge in ECPG code with DECLARE. The same condition was repeated
twice when comparing the connection used by existing declared statement with
the one coming from a fresh DECLARE statement. This had no consequences, but
let's keep the code clean. Oversight in f576de1. Author: Shenhao Wang
Backpatch-through: 14

Bruce Momjian pushed:

- Improve defaults shown in postgresql.conf.sample and pg_settings. Previously,
these showed unlikely default values. The new default value of 128MB (since PG
10) is not always accurate, since initdb tries several increasing values, but
it is likely to be accurate. Reported-by: Zhangjie <zhangjie2(at)fujitsu(dot)com>
Author: Zhangjie Backpatch-through: master

Álvaro Herrera pushed:

- Avoid creating archive status ".ready" files too early. WAL records may span
multiple segments, but XLogWrite() does not wait for the entire record to be
written out to disk before creating archive status files. Instead, as soon as
the last WAL page of the segment is written, the archive status file is
created, and the archiver may process it. If PostgreSQL crashes before it is
able to write and flush the rest of the record (in the next WAL segment), the
wrong version of the first segment file lingers in the archive, which causes
operations such as point-in-time restores to fail. To fix this, keep track of
records that span across segments and ensure that segments are only marked
ready-for-archival once such records have been completely written to disk.
This has always been wrong, so backpatch all the way back. Author: Nathan
Bossart <bossartn(at)amazon(dot)com> Reviewed-by: Kyotaro Horiguchi
<horikyota(dot)ntt(at)gmail(dot)com> Reviewed-by: Ryo Matsumura
<matsumura(dot)ryo(at)fujitsu(dot)com> Reviewed-by: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
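The invariant the fix enforces can be modeled in a few lines: a segment may be marked `.ready` only once every record beginning in it has been completely flushed. This is a toy model with simplified arithmetic, not the server's actual code:

```python
WAL_SEG_SIZE = 16 * 1024 * 1024  # default 16MB WAL segments

def segments_ready_for_archival(flushed_record_end, last_notified_seg):
    """Toy model of the fix: report segments that may be marked .ready,
    i.e. only segments lying entirely below the end of the last record
    flushed to disk."""
    ready = []
    seg = last_notified_seg + 1
    while (seg + 1) * WAL_SEG_SIZE <= flushed_record_end:
        ready.append(seg)
        seg += 1
    return ready

# A record that starts in segment 0 but ends mid-segment 1: only
# segment 0 becomes archivable; segment 1 must wait.
print(segments_ready_for_archival(WAL_SEG_SIZE + 100, -1))
```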

- psql \dP: reference regclass with "pg_catalog." prefix. Strictly speaking this
isn't a bug, but since all references to catalog objects are schema-qualified,
we might as well be consistent. The omission first appeared in commit
1c5d9270e339, so backpatch to 12. Author: Justin Pryzby
<pryzbyj(at)telsasoft(dot)com> Discussion:

- psql \dX: reference regclass with "pg_catalog." prefix. Déjà vu of commit
fc40ba1296a7, for another backslash command. Strictly speaking this isn't a
bug, but since all references to catalog objects are schema-qualified, we
might as well be consistent. The omission first appeared in commit
ad600bba0422 and replicated in a4d75c86bf15; backpatch to 14. Author: Justin
Pryzby <pryzbyj(at)telsasoft(dot)com> Discussion:

- Keep stats up to date for partitioned tables. In the long-running saga of
  analyze on partitioned tables, one thing I missed while reverting 0827e8af70f4
  is the maintenance of analyze count and last analyze time for partitioned
  tables. This is a mostly trivial change that enables users to assess the need
for invoking manual ANALYZE on partitioned tables. This patch, posted by
Justin and modified a bit by me (Álvaro), can be mostly traced back to
Hosoya-san, though any problems introduced with the scissors are mine.
Backpatch to 14, in line with 6f8127b73901. Co-authored-by: Yuzuko Hosoya
<yuzukohosoya(at)gmail(dot)com> Co-authored-by: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Co-authored-by: Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> Reported-by: Justin
Pryzby <pryzby(at)telsasoft(dot)com> Discussion:

Tom Lane pushed:

- Prevent regexp back-refs from sometimes matching when they shouldn't. The
recursion in cdissect() was careless about clearing match data for capturing
parentheses after rejecting a partial match. This could allow a later
back-reference to succeed when by rights it should fail for lack of a defined
referent. To fix, think a little more rigorously about what the contract
between different levels of cdissect's recursion needs to be. With the right
spec, we can fix this using fewer rather than more resets of the match data;
the key decision being that a failed sub-match is now explicitly responsible
for clearing any matches it may have set. There are enough other cross-checks
and optimizations in the code that it's not especially easy to exhibit this
problem; usually, the match will fail as-expected. Plus, regexps that are
even potentially vulnerable are most likely user errors, since there's just
not much point in writing a back-ref that doesn't always have a referent.
These facts perhaps explain why the issue hasn't been detected, even though
it's almost certainly a couple of decades old. Discussion:
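The corrected semantics can be illustrated with Python's re engine, which already treats a back-reference to a non-participating capture group as failing. This is a demonstration of the intended behavior, not the fixed C code:

```python
import re

# A back-reference to a capture group that did not participate in the
# match should fail, not silently succeed.
pattern = re.compile(r'^(?:(a)|b)\1$')

print(bool(pattern.match('aa')))  # group 1 matched 'a', so \1 matches
print(bool(pattern.match('bb')))  # group 1 never matched: \1 must fail
```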

- Fix regexp misbehavior with capturing parens inside "{0}". Regexps like
"(.){0}...\1" drew an "invalid backreference number". That's not unreasonable
on its face, since the capture group will never be matched if it's iterated
zero times. However, other engines such as Perl's don't complain about this,
nor do we throw an error for related cases such as "(.)|\1", even though that
backref can never succeed either. Also, if the zero-iterations case happens
at runtime rather than compile time --- say, `"(x)*...\1"` when there's no `"x"`
to be found --- that's not an error, we just deem the backref to not match.
Making this even less defensible, no error was thrown for nested cases such as
"((.)){0}...\2"; and to add insult to injury, those cases could result in
assertion failures instead. (It seems that nothing especially bad happened in
non-assert builds, though.) Let's just fix it so that no error is thrown and
instead the backref is deemed to never match, so that compile-time detection
of no iterations behaves the same as run-time detection. Per report from Mark
Dilger. This appears to be an aboriginal error in Spencer's library, so
back-patch to all supported versions. Pre-v14, it turns out to also be
necessary to back-patch one aspect of commits cb76fbd7e/00116dee5, namely to
create capture-node subREs with the begin/end states of their subexpressions,
not the current lp/rp of the outer parseqatom invocation. Otherwise delsub
complains that we're trying to disconnect a state from itself. This is a bit
scary but code examination shows that it's safe: in the pre-v14 code, if we
want to wrap iteration around the subexpression, the first thing we do is
overwrite the atom's begin/end fields with new states. So the bogus values
didn't survive long enough to be used for anything, except if no iteration is
required, in which case it doesn't matter. Discussion:
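Perl-compatible engines such as Python's re already follow the behavior the commit adopts at run time: the pattern compiles without error, and a back-reference to a group iterated zero times simply never matches. A small sketch of that semantics, not of PostgreSQL's code:

```python
import re

# Zero iterations of (x)* leave group 1 unset, so \1 can never match;
# this is deemed "no match", not an error.
p = re.compile(r'^(x)*y\1$')

print(bool(p.match('xyx')))  # one iteration: \1 = 'x', matches
print(bool(p.match('y')))    # zero iterations: back-reference fails
```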

- Remove redundant test. The condition "context_start < context_end" is strictly
weaker than "context_end - context_start >= 50", so we don't need both.
Oversight in commit ffd3944ab, noted by tanghy.fnst. In passing, line-wrap a
nearby test to make it more readable. Discussion:

- Handle interaction of regexp's makesearch and MATCHALL more honestly. Second
thoughts about commit 824bf7190: we apply makesearch() to an NFA after having
determined whether it is a MATCHALL pattern. Prepending `".*"` doesn't make it
non-MATCHALL, but it does change the maximum possible match length, and
makesearch() failed to update that. This has no ill effects given the stylized
usage of search NFAs, but it seems like it's better to keep the data structure
consistent. In particular, fixing this allows more honest handling of the
MATCHALL check in matchuntil(): we can now assert that maxmatchall is
infinity, instead of lamely assuming that it should act that way. In passing,
improve the code in dump[c]nfa so that infinite maxmatchall is printed as
"inf" not a magic number.

- Count SP-GiST index scans in pg_stat statistics. Somehow, spgist overlooked
the need to call pgstat_count_index_scan(). Hence,
pg_stat_all_indexes.idx_scan and equivalent columns never became nonzero for
an SP-GiST index, although the related per-tuple counters worked fine. This
fix works a bit differently from other index AMs, in that the counter
increment occurs in spgrescan not spggettuple/spggetbitmap. It looks like this
won't make the user-visible semantics noticeably different, so I won't go to
  the trouble of introducing an is-this-the-first-call flag just to make the
counter bumps happen in the same places. Per bug #17163 from Christian Quest.
Back-patch to all supported versions. Discussion:

- Doc: add a little about LACON execution to src/backend/regex/README. I wrote
this while thinking about a possible optimization, but it's a useful
description of the existing code regardless of whether the optimization ever
happens. So push it separately.

Amit Kapila pushed:

- Fix Alter Subscription's Add/Drop Publication behavior. The current refresh
  behavior tries to refresh just the added/dropped publications, but that leads
  to removing the wrong tables from the subscription. We can't refresh just the dropped
publication because it is quite possible that some of the tables are removed
from publication by that time and now those will remain as part of the
subscription. Also, there is a chance that the tables that were part of the
publication being dropped are also part of another publication, so we can't
remove those. So, we decided that by default, add/drop commands will also act
like REFRESH PUBLICATION which means they will refresh all the publications.
We can keep the old behavior for "add publication" but it is better to be
consistent with "drop publication". Author: Hou Zhijie Reviewed-by: Masahiko
Sawada, Amit Kapila Backpatch-through: 14, where it was introduced Discussion:

- Fix toast rewrites in logical decoding. Commit 325f2ec555 introduced
pg_class.relwrite to skip operations on tables created as part of a heap
rewrite during DDL. It links such transient heaps to the original relation OID
via this new field in pg_class but forgot to do anything about toast tables.
So, logical decoding was not able to skip operations on internally created
  toast tables. This led to an error when decoding the WAL for the next
  operation, which appeared to contain toast data when in fact it had none.
  To fix this, we set pg_class.relwrite for
internally created toast tables as well which allowed skipping operations on
them during logical decoding. Author: Bertrand Drouvot Reviewed-by: David
Zhang, Amit Kapila Backpatch-through: 11, where it was introduced Discussion:

- Add logical change details to logical replication worker errcontext.
Previously, on the subscriber, we set the error context callback for the tuple
data conversion failures. This commit replaces the existing error context
callback with a comprehensive one so that it shows not only the details of
data conversion failures but also the details of logical change being applied
by the apply worker or table sync worker. The additional information displayed
will be the command, transaction id, and timestamp. The error context is
added to an error only when applying a change but not while doing other work
like receiving data etc. This will help users in diagnosing the problems that
occur during logical replication. It also can be used for future work that
allows skipping a particular transaction on the subscriber. Author: Masahiko
Sawada Reviewed-by: Hou Zhijie, Greg Nancarrow, Haiying Tang, Amit Kapila
Tested-by: Haiying Tang Discussion:

Fujii Masao pushed:

- ecpg: Remove trailing period from error message. This commit improves the
ecpg's error message that commit f576de1db1 updated, so that it gets rid of
trailing period and uppercases the command name in the error message. Author:
Kyotaro Horiguchi Reviewed-by: Fujii Masao Discussion:

- Improve error message about valid value for distance in phrase operator. The
distance in phrase operator must be an integer value between zero and
MAXENTRYPOS inclusive. But previously the error message about its valid value
included the information about its upper limit but not lower limit (i.e.,
zero). This commit improves the error message so that it also includes the
information about its lower limit. Back-patch to v9.6 where full-text phrase
search was supported. Author: Kyotaro Horiguchi Reviewed-by: Fujii Masao

- Avoid using ambiguous word "positive" in error message. There are two
identical error messages about valid value of modulus for hash partition, in
PostgreSQL source code. Commit 0e1275fb07 improved only one of them so that
ambiguous word "positive" was avoided there, and forgot to improve the other.
  This commit improves the other, which reduces the translator burden.
  Back-patch to v11 where the error message exists. Author: Kyotaro Horiguchi
Reviewed-by: Fujii Masao Discussion:

Etsuro Fujita pushed:

- Doc: Tweak function prototype indentation for consistency.

Peter Eisentraut pushed:

- Fix typo.

- psql: Make cancel test more timing robust. The previous coding relied on the
PID file appearing and the query starting "fast enough", which can fail on
slow machines. Also, there might have been an undocumented interference
between alarm and IPC::Run. This new coding doesn't rely on any of these
  concurrency mechanisms. Instead, we wait until the PID file is complete
before proceeding, and then also wait until the sleep query is registered by
the server. Discussion:

- Fix handling of partitioned index in RelationGetNumberOfBlocksInFork(). Since
a partitioned index doesn't have storage, getting the number of blocks from it
will not give sensible results. Existing callers already check that they
don't call it that way, so there doesn't appear to be a live problem. But for
correctness, handle RELKIND_PARTITIONED_INDEX together with the other
non-storage relkinds. Reviewed-by: Michael Paquier <michael(at)paquier(dot)xyz>
Reviewed-by: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> Discussion:

- Change Texinfo output to UTF-8. Since the whole documentation tool chain is
now UTF-8 and there is an increasing number of non-ISO-8859-1 characters in
the text, keeping the Texinfo output in ISO 8859-1 just creates unnecessary
complications. Depending on the platform, there are conversion failures and
thus build failures, or weirdly converted characters. By changing the output
to UTF-8, the whole encoding conversion business is sidestepped.

Robert Haas pushed:

- Fix broken snapshot handling in parallel workers. Pengchengliu reported an
  assertion failure in a parallel worker while performing a parallel scan using
an overflowed snapshot. The proximate cause is that TransactionXmin was set to
an incorrect value. The underlying cause is incorrect snapshot handling in
parallel.c. In particular, InitializeParallelDSM() was unconditionally
calling GetTransactionSnapshot(), because I (rhaas) mistakenly thought that
was always retrieving an existing snapshot whereas, at isolation levels less
than REPEATABLE READ, it's actually taking a new one. So instead do this only
at higher isolation levels where there actually is a single snapshot for the
whole transaction. By itself, this is not a sufficient fix, because we still
need to guarantee that TransactionXmin gets set properly in the workers. The
easiest way to do that seems to be to install the leader's active snapshot as
the transaction snapshot if the leader did not serialize a transaction
  snapshot. This doesn't affect the results of future GetTransactionSnapshot()
calls since those have to take a new snapshot anyway; what we care about is
the side effect of setting TransactionXmin. Report by Pengchengliu. Patch by
Greg Nancarrow, except for some comment text which I supplied. Discussion:

John Naylor pushed:

- Rename unicode_combining_table to unicode_width_table. No functional changes.
  A future commit will use this table for other purposes besides combining
  characters.

- Change mbbisearch to return the character range. Add a width field to
mbinterval and have mbbisearch return a pointer to the found range rather than
just bool for success. A future commit will add another width besides zero,
and this will allow that to use the same search. Reviewed by Jacob Champion
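The technique of returning the matched interval (with its width) rather than a bare bool can be sketched with a binary search over sorted codepoint ranges. The interval names and data below are illustrative, not PostgreSQL's actual tables:

```python
from bisect import bisect_right

# Hypothetical interval table: (first codepoint, last codepoint, width).
INTERVALS = [
    (0x0300, 0x036F, 0),   # combining diacritical marks: zero width
    (0x1100, 0x115F, 2),   # Hangul Jamo leading consonants: wide
    (0x4E00, 0x9FFF, 2),   # CJK unified ideographs: wide
]
FIRSTS = [iv[0] for iv in INTERVALS]

def mb_search(cp):
    """Binary-search the sorted intervals; return the containing
    interval (including its width) rather than just True/False."""
    i = bisect_right(FIRSTS, cp) - 1
    if i >= 0 and cp <= INTERVALS[i][1]:
        return INTERVALS[i]
    return None

print(mb_search(0x0301))  # the combining-marks interval, width 0
print(mb_search(0x0041))  # None: plain 'A' is in no interval
```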

- Revert "Change mbbisearch to return the character range". This reverts commit
78ab944cd4b9977732becd9d0bc83223b88af9a2. After I had committed eb0d0d2c7 and
78ab944cd, I decided to add a sanity check for a "can't happen" scenario just
to be cautious. It turned out that it already happened in the official Unicode
source data, namely that a character can be both wide and a combining
character. This fact renders the aforementioned commits unnecessary, so revert
both of them. Discussion:

- Revert "Rename unicode_combining_table to unicode_width_table". This reverts
commit eb0d0d2c7300c9c5c22b35975c11265aa4becc84. After I had committed
eb0d0d2c7 and 78ab944cd, I decided to add a sanity check for a "can't happen"
scenario just to be cautious. It turned out that it already happened in the
official Unicode source data, namely that a character can be both wide and a
combining character. This fact renders the aforementioned commits unnecessary,
so revert both of them. Discussion:

- Update display widths as part of updating Unicode. The hardcoded "wide
character" set in ucs_wcwidth() was last updated around the Unicode 5.0 era.
This led to misalignment when printing emojis and other codepoints that have
since been designated wide or full-width. To fix and keep up to date, extend
update-unicode to download the list of wide and full-width codepoints from the
  official sources. In passing, remove some comments about non-spacing
characters that haven't been accurate since we removed the former hardcoded
logic. Jacob Champion Reported and reviewed by Pavel Stehule Discussion:
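The idea behind the fix can be approximated with Python's unicodedata module: combining characters occupy zero columns, East Asian Wide/Fullwidth characters occupy two. This is only an approximation of wcwidth-style logic; the server uses its own generated tables:

```python
import unicodedata

def display_width(s):
    """Approximate terminal display width: combining characters take
    zero columns, East Asian Wide/Fullwidth characters take two."""
    width = 0
    for ch in s:
        if unicodedata.combining(ch):
            continue
        width += 2 if unicodedata.east_asian_width(ch) in ('W', 'F') else 1
    return width

print(display_width('abc'))      # 3
print(display_width('漢字'))     # 4: each ideograph is two columns
print(display_width('e\u0301'))  # 1: 'e' plus a zero-width accent
```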

- Extend collection of Unicode combining characters to beyond the BMP. The
former limit was perhaps a carryover from an older hand-coded table. Since
commit bab982161 we have enough space in mbinterval to store larger
codepoints, so collect all combining characters. Discussion:

Peter Geoghegan pushed:

- contrib/amcheck: Add heapam CHECK_FOR_INTERRUPTS(). Add a
CHECK_FOR_INTERRUPTS() call to make heap relation verification responsive to
query cancellations. Author: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
Backpatch: 14-, where amcheck heap verification was introduced.

- vacuumlazy.c: Remove unnecessary parentheses. This was arguably a minor
oversight in commit b4af70cb, which cleaned up the function signatures of
functions that modify IndexBulkDeleteResult variables.

- Reorder log_autovacuum_min_duration log output. This order seems more natural.
It starts with details that are particular to heap and index data structures,
and ends with system-level costs incurred during the autovacuum worker's
VACUUM/ANALYZE operation. Author: Peter Geoghegan <pg(at)bowt(dot)ie> Discussion:
Backpatch: 14-, which enhanced the log output in various ways.

- track_io_timing logging: Don't special case 0 ms. Adjust track_io_timing
related logging code added by commit 94d13d474d. Make it consistent with other
nearby autovacuum and autoanalyze logging code by removing logic that
suppressed zero millisecond outputs. log_autovacuum_min_duration log output
now reliably shows "read:" and "write:" millisecond-based values in its report
(when track_io_timing is enabled). Author: Peter Geoghegan <pg(at)bowt(dot)ie>
Reviewed-By: Stephen Frost <sfrost(at)snowman(dot)net> Discussion:
Backpatch: 14-, where the track_io_timing logging was introduced.

Daniel Gustafsson pushed:

- Avoid invoking PQfnumber in loop constructs. When looping over the resultset
from a SQL query, extracting the field number before the loop body to avoid
repeated calls to PQfnumber is an established pattern. On very wide tables
  this can have a performance impact, but it won't be noticeable in the common
case. This fixes a few queries which were extracting the field number in the
loop body. Author: Hou Zhijie <houzj(dot)fnst(at)fujitsu(dot)com> Reviewed-by: Nathan
Bossart <bossartn(at)amazon(dot)com> Discussion:
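The pattern generalizes beyond libpq: resolve the column index once, then index rows positionally inside the loop. A language-neutral sketch in Python, where the Result class is a stand-in for a query result, not a libpq API:

```python
class Result:
    """Stand-in for a query result; fnumber() mimics PQfnumber's
    linear scan over the column names."""
    def __init__(self, columns, rows):
        self.columns = columns
        self.rows = rows

    def fnumber(self, name):
        return self.columns.index(name)  # O(number of columns) per call

res = Result(['relname', 'relkind'], [('t1', 'r'), ('v1', 'v')])

# Hoist the lookup out of the loop: one name scan instead of one per row.
col = res.fnumber('relname')
names = [row[col] for row in res.rows]
print(names)
```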

- docs: clarify bgw_restart_time documentation. Author: Dave Cramer
<davecramer(at)gmail(dot)com> Reviewed-by: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> Discussion:

Stephen Frost pushed:

- docs: Add command tags for SQL commands. Commit 6c3ffd6 added a couple new
predefined roles but didn't properly wrap the SQL commands mentioned in the
description of those roles with command tags, so add them now.
Backpatch-through: 14 Reported-by: Michael Banck Discussion:

- Use maintenance_io_concurrency for ANALYZE prefetch. When prefetching pages
  for ANALYZE, we should be using maintenance_io_concurrency (by calling
get_tablespace_maintenance_io_concurrency(), not
get_tablespace_io_concurrency()). ANALYZE prefetching was introduced in
c6fc50c, so back-patch to 14. Backpatch-through: 14 Reported-By: Egor Rogov
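Both settings are ordinary GUCs (and can also be overridden per tablespace with ALTER TABLESPACE ... SET). A sketch of the relevant postgresql.conf lines, showing the stock defaults:

```
# postgresql.conf (illustrative; these are the stock defaults)
effective_io_concurrency = 1        # used for ordinary prefetching
maintenance_io_concurrency = 10     # used by maintenance work such as ANALYZE
```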

Noah Misch pushed:

- Fix data loss in wal_level=minimal crash recovery of CREATE TABLESPACE. If the
system crashed between CREATE TABLESPACE and the next checkpoint, the result
could be some files in the tablespace unexpectedly containing no rows.
Affected files would be those for which the system did not write WAL; see the
wal_skip_threshold documentation. Before v13, a different set of conditions
governed the writing of WAL; see v12's <sect2 id="populate-pitr">. (The v12
conditions were broader in some ways and narrower in others.) Users may want
to audit non-default tablespaces for unexpected short files. The bug could
have truncated an index without affecting the associated table, and reindexing
the index would fix that particular problem. This fixes the bug by making
create_tablespace_directories() more like TablespaceCreateDbspace().
create_tablespace_directories() was recursively removing tablespace contents,
reasoning that WAL redo would recreate everything removed that way. That
assumption holds for other wal_level values. Under wal_level=minimal, the old
approach could delete files for which no other copy existed. Back-patch to
9.6 (all supported versions). Reviewed by Robert Haas and Prabhat Sahu.
Reported by Robert Haas. Discussion:
