PostgreSQL Weekly News - February 14, 2021

Posted on 2021-02-15 by PWN

Security releases 13.2, 12.6, 11.11, 10.16, 9.6.21, and 9.5.25 are out. Please upgrade ASAP. 9.5.25 is the last release of PostgreSQL 9.5.

Person of the week:

PostgreSQL Product News

check_pgbackrest 2.0, a Nagios-compatible monitor for pgBackRest, released.

AGE 0.3.0, a PostgreSQL extension that provides graph database functionality, released.

PostgreSQL Jobs for February

PostgreSQL in the News

Planet PostgreSQL:

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm PST8PDT to

Applied Patches

Heikki Linnakangas pushed:

  • Fix permission checks on constraint violation errors on partitions. If a cross-partition UPDATE violates a constraint on the target partition, and the columns in the new partition are in different physical order than in the parent, the error message can reveal columns that the user does not have SELECT permission on. A similar bug was fixed earlier in commit 804b6b6db4. The cause of the bug is that the callers of the ExecBuildSlotValueDescription() function got confused when constructing the list of modified columns. If the tuple was routed from a parent, we converted the tuple to the parent's format, but the list of modified columns was grabbed directly from the child's RTE entry. ExecUpdateLockMode() had a similar issue. That led to confusion about which columns are key columns, leading to the wrong tuple lock being taken on tables referenced by foreign keys when a row is updated with INSERT ON CONFLICT UPDATE. A new isolation test is added for that corner case. With this patch, the ri_RangeTableIndex field is no longer set for partitions that don't have an entry in the range table. Previously, it was set to the RTE entry of the parent relation, but that was confusing. NOTE: This modifies the ResultRelInfo struct, replacing the ri_PartitionRoot field with ri_RootResultRelInfo. That's a bit risky to backpatch, because it breaks any extensions accessing the field. The change that ri_RangeTableIndex is not set for partitions could potentially break extensions, too. The ResultRelInfos are visible to FDWs at least, and this patch required small changes to postgres_fdw. Nevertheless, this seems like the least bad option. I don't think these fields are widely used in extensions; I don't think there are FDWs out there that use the FDW "direct update" API, other than postgres_fdw. If there are, you will get a compilation error, so hopefully it is caught quickly.
Backpatch to 11, where support for both cross-partition UPDATEs and unique indexes on partitioned tables was added. Reviewed-by: Amit Langote Security: CVE-2021-3393

Tom Lane pushed:

  • Fix mishandling of column-level SELECT privileges for join aliases. scanNSItemForColumn, expandNSItemAttrs, and ExpandSingleTable would pass the wrong RTE to markVarForSelectPriv when dealing with a join ParseNamespaceItem: they'd pass the join RTE, when what we need to mark is the base table that the join column came from. The end result was to not fill the base table's selectedCols bitmap correctly, resulting in an understatement of the set of columns that are read by the query. The executor would still insist on there being at least one selectable column; but with a correctly crafted query, a user having SELECT privilege on just one column of a table would nonetheless be allowed to read all its columns. To fix, make markRTEForSelectPriv fetch the correct RTE for itself, ignoring the possibly-mismatched RTE passed by the caller. Later, we'll get rid of some now-unused RTE arguments, but that risks API breaks so we won't do it in released branches. This problem was introduced by commit 9ce77d75c, so back-patch to v13 where that came in. Thanks to Sven Klemm for reporting the problem. Security: CVE-2021-20229

  • Remove no-longer-used RTE argument of markVarForSelectPriv(). In the wake of c028faf2a, this is no longer needed. I left it out of that patch since the API change would be undesirable in a released branch; but there's no reason not to do it in HEAD.

  • Simplify jsonfuncs.c code by using strtoint() not strtol(). Explicitly testing for INT_MIN and INT_MAX isn't particularly good style; it's tedious and may draw useless compiler warnings on machines where int and long are the same width. We invented strtoint() precisely for this usage, so use that instead. While here, remove gratuitous variations in the way the tests for did-strtoint-succeed were spelled. Also, avoid attempting to negate INT_MIN; that would probably work given that the result is implicitly cast to uint32, but I think it's nominally undefined behavior. Per gripe from Ranier Vilela, though this isn't his proposed patch. Discussion:

  • Remove dead code in ECPGconnect(), and improve documentation. The stanza in ECPGconnect() that intended to allow specification of a Unix socket directory path in place of a port has never executed since it was committed, nearly two decades ago; the preceding strrchr() already found the last colon so there cannot be another one. The lack of complaints about that is doubtless related to the fact that no user-facing documentation suggested it was possible. Rather than try to fix that up, let's just remove the unreachable code, and instead document the way that does work to write a socket directory path, namely specifying it as a "host" option. In support of that, make another pass at clarifying the syntax documentation for ECPG connection targets, particularly documenting which things are parsed as identifiers and where to use double quotes. Rearrange some things that seemed poorly ordered, and fix a couple of minor doc errors. Kyotaro Horiguchi, per gripe from Shenhao Wang (docs changes mostly by me) Discussion:

  • Avoid divide-by-zero in regex_selectivity() with long fixed prefix. Given a regex pattern with a very long fixed prefix (approaching 500 characters), the result of pow(FIXED_CHAR_SEL, fixed_prefix_len) can underflow to zero. Typically the preceding selectivity calculation would have underflowed as well, so that we compute 0/0 and get NaN. In released branches this leads to an assertion failure later on. That doesn't happen in HEAD, for reasons I've not explored yet, but it's surely still a bug. To fix, just skip the division when the pow() result is zero, so that we'll (most likely) return a zero selectivity estimate. In the edge cases where "sel" didn't yet underflow, perhaps this isn't desirable, but I'm not sure that the case is worth spending a lot of effort on. The results of regex_selectivity_sub() are barely worth the electrons they're written on anyway :-( Per report from Alexander Lakhin. Back-patch to all supported versions. Discussion:

  • Tweak compiler version cutoff for no_sanitize("alignment") support. Buildfarm results show that gcc up through 7.x produces annoying warnings for this construct (and, presumably, wouldn't do the right thing anyway). clang seems okay with the cutoff we have, though. Discussion: Discussion:
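
The jsonfuncs.c item above replaces explicit INT_MIN/INT_MAX tests with a strtoint()-style helper. The following is a minimal sketch of that idea, not the PostgreSQL source: sketch_strtoint() and parse_int() are hypothetical names, and the sketch assumes the helper is a thin wrapper that parses with strtol() and flags ERANGE when the value does not survive a round trip through int.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for a strtoint()-style helper: parse with strtol(),
 * then set ERANGE if the value does not fit in an int. Comparing the value
 * with its own int-converted copy avoids spelling out INT_MIN and INT_MAX.
 */
static int
sketch_strtoint(const char *str, char **endptr, int base)
{
    long val = strtol(str, endptr, base);

    if (val != (int) val)
        errno = ERANGE;
    return (int) val;
}

/*
 * Hypothetical caller: returns true only if the whole string is a valid
 * decimal integer that fits in an int. The caller owns the errno reset.
 */
static bool
parse_int(const char *str, int *result)
{
    char *end;

    errno = 0;
    *result = sketch_strtoint(str, &end, 10);
    return end != str && *end == '\0' && errno == 0;
}
```

With this shape, every call site tests success the same way (end pointer plus errno), which is the kind of uniformity the commit message asks for.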
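
The regex_selectivity() item above describes skipping a division when the fixed-prefix selectivity underflows to zero. The following is a rough sketch of that guard under stated assumptions: the value of FIXED_CHAR_SEL is assumed for illustration, the function names are hypothetical, and a multiplication loop stands in for pow() so the sketch has no libm dependency.

```c
/* Assumed per-character selectivity; the real constant may differ. */
#define FIXED_CHAR_SEL 0.20

/*
 * Hypothetical helper: FIXED_CHAR_SEL raised to fixed_prefix_len.
 * For a very long prefix the product underflows to exactly 0.0.
 */
static double
fixed_prefix_selectivity(int fixed_prefix_len)
{
    double prefixsel = 1.0;

    for (int i = 0; i < fixed_prefix_len; i++)
        prefixsel *= FIXED_CHAR_SEL;
    return prefixsel;
}

/*
 * Hypothetical helper: discount the fixed prefix from sel, but skip the
 * division when the prefix selectivity has underflowed to zero. Without
 * the guard, sel == 0 would produce 0/0, which is NaN.
 */
static double
discount_fixed_prefix(double sel, int fixed_prefix_len)
{
    if (fixed_prefix_len > 0)
    {
        double prefixsel = fixed_prefix_selectivity(fixed_prefix_len);

        if (prefixsel > 0.0)
            sel /= prefixsel;
    }
    return sel;
}
```

For short prefixes the division happens as usual; for a prefix of roughly 500 characters the guard leaves sel untouched, so an already-underflowed estimate stays at zero instead of turning into NaN.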

Peter Geoghegan pushed:

Michaël Paquier pushed:

Fujii Masao pushed:

  • Display the time when the process started waiting for the lock, in pg_locks. This commit adds a new column "waitstart" to the pg_locks view. This column reports the time when the server process started waiting for the lock if the lock is not held. This information is useful, for example, for computing how long a process has been waiting on a lock by subtracting "waitstart" in pg_locks from the current time, and for identifying locks that processes have been waiting on for a very long time. This feature uses the current time obtained for the deadlock timeout timer as "waitstart" (i.e., the time when this process started waiting for the lock). Since fetching the current time again can cause overhead, we reuse the already-obtained time to avoid that overhead. Note that "waitstart" is updated without holding the lock table's partition lock, to avoid the overhead of an additional lock acquisition. This can cause "waitstart" in pg_locks to be NULL for a very short period of time after the wait started even though "granted" is false. This is OK in practice because we can assume that users are likely to look at "waitstart" when waiting for the lock for a long time. Bump catalog version. Author: Atsushi Torikoshi Reviewed-by: Ian Lawrence Barwick, Robert Haas, Justin Pryzby, Fujii Masao Discussion:

  • Revert "Display the time when the process started waiting for the lock, in pg_locks.". This reverts commit 3b733fcd04195399db56f73f0616b4f5c6828e18. Per buildfarm members prion and rorqual.

Amit Kapila pushed:

  • Make pg_replication_origin_drop safe against concurrent drops. Currently, we get the origin id from the name and then drop the origin by taking ExclusiveLock on ReplicationOriginRelationId. So, two concurrent sessions can get the id from the name at the same time, and then when they try to drop the origin, one of the sessions will get either "tuple concurrently deleted" or "cache lookup failed for replication origin ..". To prevent this race condition we do the entire operation under lock. This obviates the need for the replorigin_drop() API and we have removed it, so any extension authors using it need to use replorigin_drop_by_name instead. See its usage in pg_replication_origin_drop(). Author: Peter Smith Reviewed-by: Amit Kapila, Euler Taveira, Petr Jelinek, and Alvaro Herrera Discussion:

  • Allow multiple xacts during table sync in logical replication. For the initial table data synchronization in logical replication, we use a single transaction to copy the entire table and then synchronize the position in the stream with the main apply worker. There are multiple downsides to this approach: (a) We have to perform the entire copy operation again if there is any error (network breakdown, error in the database operation, etc.) while we synchronize the WAL position between tablesync worker and apply worker; this will be onerous especially for large copies, (b) Using a single transaction in the synchronization phase (where we can receive WAL from multiple transactions) will have the risk of exceeding the CID limit, (c) The slot will hold the WAL till the entire sync is complete because we never commit till the end. This patch solves all the above downsides by allowing multiple transactions during the tablesync phase. The initial copy is done in a single transaction and after that, we commit each transaction as we receive it. To allow recovery after any error or crash, we use a permanent slot and origin to track the progress. The slot and origin will be removed once we finish the synchronization of the table. We also remove the slot and origin of tablesync workers if the user performs DROP SUBSCRIPTION .. or ALTER SUBSCRIPTION .. REFRESH and some of the table syncs are still not finished. The commands ALTER SUBSCRIPTION ... REFRESH PUBLICATION and ALTER SUBSCRIPTION ... SET PUBLICATION ... with refresh option as true cannot be executed inside a transaction block because they can now drop the slots, for which we have no provision to roll back. This will also open up the path for logical replication of 2PC transactions on the subscriber side. Previously, we couldn't do that because of the requirement of maintaining a single transaction in tablesync workers. Bump catalog version due to change of state in the catalog (pg_subscription_rel).
Author: Peter Smith, Amit Kapila, and Takamichi Osumi Reviewed-by: Ajin Cherian, Petr Jelinek, Hou Zhijie and Amit Kapila Discussion:

  • Fix Subscription test added by commit ce0fdbfe97. We want to test the variants of Alter Subscription that are not allowed in the transaction block but for that, we don't need to create a subscription that tries to connect to the publisher. As such, there is no problem with this test but it is good to allow such tests to run with wal_level = minimal and max_wal_senders = 0 so as to keep them consistent with other tests. Reported by buildfarm. Author: Amit Kapila Reviewed-by: Ajin Cherian Discussion:
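
The pg_replication_origin_drop item above fixes a race by doing the name lookup and the drop under one lock. The following is a simplified, self-contained sketch of that pattern, not the PostgreSQL implementation: the in-memory table, the mutex, and the function names origin_create() and origin_drop_by_name() are all hypothetical stand-ins for the catalog, the heavyweight lock, and the real API.

```c
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

#define MAX_ORIGINS 8

/* Hypothetical in-memory table standing in for the replication origin catalog. */
typedef struct Origin
{
    bool in_use;
    char name[32];
} Origin;

static Origin origins[MAX_ORIGINS];
static pthread_mutex_t origins_lock = PTHREAD_MUTEX_INITIALIZER;

/* Register a name; returns false if the name is too long or the table is full. */
static bool
origin_create(const char *name)
{
    bool created = false;

    if (strlen(name) >= sizeof(origins[0].name))
        return false;

    pthread_mutex_lock(&origins_lock);
    for (int i = 0; i < MAX_ORIGINS; i++)
    {
        if (!origins[i].in_use)
        {
            strcpy(origins[i].name, name);
            origins[i].in_use = true;
            created = true;
            break;
        }
    }
    pthread_mutex_unlock(&origins_lock);
    return created;
}

/*
 * Look up the name and drop the entry inside one critical section, so a
 * concurrent caller cannot remove the entry between the lookup and the
 * drop. Returns false if no such origin exists.
 */
static bool
origin_drop_by_name(const char *name)
{
    bool dropped = false;

    pthread_mutex_lock(&origins_lock);
    for (int i = 0; i < MAX_ORIGINS; i++)
    {
        if (origins[i].in_use && strcmp(origins[i].name, name) == 0)
        {
            origins[i].in_use = false;
            dropped = true;
            break;
        }
    }
    pthread_mutex_unlock(&origins_lock);
    return dropped;
}
```

If two sessions race to drop the same name, the second one finds nothing and gets a clean "not found" result rather than an error from operating on a stale id.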

Peter Eisentraut pushed:

Magnus Hagander pushed:

Alexander Korotkov pushed:

Bruce Momjian pushed:

Pending Patches

Tang Haiying sent in another revision of a patch to support tab completion for upper case inputs in psql when using set/reset/show.

Pavel Borisov sent in three revisions of a patch to make amcheck check the UNIQUE constraint for btree indexes.

Vigneshwaran C sent in three more revisions of a patch to make the libpq connection parameter "target_session_attrs" support new values: read-only, primary, standby, and prefer-standby.

Iwata Aya sent in two more revisions of a patch to add tracing to libpq.

Amit Langote, Greg Nancarrow, and Hou Zhijie traded patches to implement parallel execution for INSERT ... SELECT.

Scott Mead sent in another revision of a patch to make autovacuum dynamically decrease cost_limit and cost_delay.

Matthias van de Meent and Josef Šimánek traded patches to enhance COPY progress reporting.

Heikki Linnakangas and John Naylor traded patches to speed up utf-8 checking with SIMD instructions.

Mark Rofail and Joel Jacobson traded patches to implement foreign key arrays.

Amit Langote sent in another revision of a patch to set ForeignScanState.resultRelInfo and initialize result relation information lazily.

Peter Eisentraut sent in a patch to add routine usage information schema tables.

Heikki Linnakangas sent in another revision of a patch to add a 'noError' argument to encoding conversion functions, and use same to do COPY FROM encoding conversion/verification in larger chunks.

Alexey Bashtanov sent in a patch to add a bit_xor aggregate.

Daniel Gustafsson sent in another revision of a patch to make it possible to use NSS for libpq's TLS implementation.

Jacob Champion sent in two more revisions of a patch to log authenticated identity from all auth backends.

Kyotaro HORIGUCHI and Dilip Kumar traded patches to provide a new interface to get the recovery pause status.

Tom Lane sent in a patch to disallow some bug-prone characters from being used as the names of custom GUCs.

Nathan Bossart sent in a patch to broaden the scope of the heap-only tuples (HOT) optimization by being more discerning about which indexes to update, touching only those where the indexed value actually changed. Before this, HOT could only work on completely unindexed columns, as the alternative was to update all indexes regardless of whether anything in them had actually changed.

Peter Geoghegan sent in three revisions of a patch to use 64-bit XIDs in deleted nbtree pages, and add pages_newly_deleted to VACUUM VERBOSE.

Takayuki Tsunakawa sent in another revision of a patch to speed up COPY FROM when the target table has remote partitions.

Justin Pryzby sent in another revision of a patch to make CLUSTER work on partitioned tables.

Stephen Frost sent in another revision of a patch to include the I/O timing if track_io_timing is enabled in logs for autovacuum and autoanalyze along with the read rate and the dirty rate for autoanalyze.

Peter Smith sent in two more revisions of a patch to implement logical decoding of two-phase transactions.

Andy Fan sent in a patch to introduce a notnullattrs field in RelOptInfo to indicate which attributes are not null in the current query.

Etsuro Fujita sent in two more revisions of a patch to implement synchronous append on PostgreSQL FDW nodes.

Ranier Vilela and Michaël Paquier traded patches to fix a possible out-of-bounds access in pg_cryptohash_final by adding a length argument to same.

Dilip Kumar, Robert Haas, and Justin Pryzby traded patches to add custom compression methods for tables.

Michail Nikolaev sent in another revision of a patch to add full support for index LP_DEAD hint bits on standbys.

Justin Pryzby sent in a patch to touch up the documentation for 14.

Peter Eisentraut sent in another revision of a patch to implement SQL standard function bodies for SQL functions.

Peter Eisentraut sent in a patch to add tests for the bytea LIKE operator.

Fujii Masao sent in a patch intended to fix a bug that manifested as ERROR: invalid spinlock number: 0 by moving the assignment of written_lsn to a place where it can use pg_atomic_read_u64(&WalRcv->writtenUpto) better.

Tomáš Vondra sent in another revision of a patch to implement BRIN multi-range indexes.

Anastasia Lubennikova sent in another revision of a patch intended to fix a bug that manifested as pg_upgrade failing with non-standard ACLs.

Melanie Plageman sent in another revision of a patch to update comments and phase naming for parallel hash joins.

Noah Misch sent in two more revisions of a patch to dump public schema ownership and security labels, and dump COMMENT ON SCHEMA public.

Zhihong Yu and Ranier Vilela traded patches to fix a possible dereference of a null return value in src/backend/replication/logical/reorderbuffer.c.

Tom Lane sent in two revisions of a patch to invent rainbow arcs for regexes and short-circuit character-by-character scanning when matching a sub-NFA that is like "." or ".*" or variants of that, i.e., one that will match any sequence of some number of characters.

Thomas Munro sent in a patch to try to hold onto buffers between WAL records, which if successful amortizes the cost of looking up, pinning, locking, unlocking and unpinning said buffers over multiple executions.

Erik Rijkers and Amit Kapila traded patches to fix a recent breakage of logical replication.

Bharath Rupireddy sent in a patch to remove unnecessary wrapping of MakeTupleTableSlot in MakeSingleTupleTableSlot.

Li Japin sent in another revision of a patch to implement ALTER SUBSCRIPTION ... ADD/DROP PUBLICATION.

Noah Misch sent in a patch to add a public schema default ACL.