PostgreSQL Weekly News - December 13, 2020

Posted on 2020-12-14 by PWN

PostgreSQL Weekly News - December 13, 2020

PostgreSQL Person of the Week:

PostgreSQL Jobs for December

PostgreSQL in the News

Planet PostgreSQL:

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm PST8PDT to

Applied Patches

Michaël Paquier pushed:

  • Fix fd leak in pg_verifybackup. An error code path newly-introduced by 87ae969 forgot to close a file descriptor when verifying a file's checksum. Per report from Coverity, via Tom Lane.

  • Avoid using tuple from syscache for update of pg_database.datfrozenxid. pg_database.datfrozenxid gets updated using an in-place update at the end of vacuum or autovacuum. Since 96cdeae, as pg_database has a toast relation, it is possible for a pg_database tuple to have toast values if there is a large set of ACLs in place. In such a case, the in-place update would fail because of the flattening of the toast values done for the catcache entry fetched. Instead of using a copy from the catcache, this changes the logic to fetch the copy of the tuple by directly scanning pg_database. Per the lack of complaints on the matter, no backpatch is done. Note that before 96cdeae, attempting to insert such a tuple to pg_database would cause a "row is too big" error, so the end-of-vacuum problem was not reachable. Author: Ashwin Agrawal, Junfeng Yang Discussion:

  • pgcrypto: Detect errors with EVP calls from OpenSSL. The following routines are called within pgcrypto when handling digests but there were no checks for failures: - EVP_MD_CTX_size (can fail with -1 as of 3.0.0) - EVP_MD_CTX_block_size (can fail with -1 as of 3.0.0) - EVP_DigestInit_ex - EVP_DigestUpdate - EVP_DigestFinal_ex A set of elog(ERROR) is added by this commit to detect such failures, that should never happen except in the event of a processing failure internal to OpenSSL. Note that it would be possible to use ERR_reason_error_string() to get more context about such errors, but these refer mainly to the internals of OpenSSL, so it is not really obvious how useful that would be. This is left out for simplicity. Per report from Coverity. Thanks to Tom Lane for the discussion. Backpatch-through: 9.5

  • Simplify code for getting a unicode codepoint's canonical class. Three places of unicode_norm.c use a similar logic for getting the combining class from a codepoint. Commit 2991ac5 has added the function get_canonical_class() for this purpose, but it was only called by the backend. This commit refactors the code to use this function in all the places where the combining class is retrieved from a given codepoint. Author: John Naylor Discussion:

  • Refactor MD5 implementations according to new cryptohash infrastructure. This commit heavily reorganizes the MD5 implementations that exist in the tree in various aspects. First, MD5 is added to the list of options available in cryptohash.c and cryptohash_openssl.c. This means that if building with OpenSSL, EVP is used for MD5 instead of the fallback implementation that Postgres had for ages. With the recent refactoring work for cryptohash functions, this change is straight-forward. If not building with OpenSSL, a fallback implementation internal to src/common/ is used. Second, this reduces the number of MD5 implementations present in the tree from two to one, by moving the KAME implementation from pgcrypto to src/common/, and by removing the implementation that existed in src/common/. KAME was already structured with an init/update/final set of routines by pgcrypto (see original pgcrypto/md5.h) for compatibility with OpenSSL, so moving it to src/common/ has proved to be a straight-forward move, requiring no actual manipulation of the internals of each routine. Some benchmarking has not shown any performance gap between both implementations. Similarly to the fallback implementation used for SHA2, the fallback implementation of MD5 is moved to src/common/md5.c with an internal header called md5_int.h for the init, update and final routines. This gets then consumed by cryptohash.c. The original routines used for MD5-hashed passwords are moved to a separate file called md5_common.c, also in src/common/, aimed at being shared between all MD5 implementations as utility routines to keep compatibility with any code relying on them. Like the SHA2 changes, this commit had its round of tests on both Linux and Windows, across all versions of OpenSSL supported on HEAD, with and even without OpenSSL. Author: Michael Paquier Reviewed-by: Daniel Gustafsson Discussion:

  • Fix compilation of uuid-ossp. This module had a dependency on pgcrypto's md5.c that got removed by b67b57a. Instead of the code from pgcrypto, this code can just use the new cryptohash routines for MD5 as a drop-in replacement, so let's just do this switch. This has also the merit to simplify a bit the compilation of uuid-ossp. This requires --with-uuid to be reproduced, and I have used e2fs as a way to reproduce the failure, then test this commit. Per reports from buildfarm members longfin, florican and sifaka. Discussion:

Tom Lane pushed:

  • pg_dump: Reorganize dumpBaseType(). Along the same lines as ed2c7f65b and daa9fe8a5, reduce code duplication by having just one copy of the parts of the query that are the same across all server versions; and make the conditionals control the smallest possible amount of code. This is in preparation for adding another dumpable field to pg_type.

  • Add a couple of regression test cases related to array subscripting. Exercise some error cases that were never reached in the existing regression tests. This is partly for code-coverage reasons, and partly to memorialize the current behavior in advance of planned changes for generic subscripting. Also, I noticed that type_sanity's check to verify that all standard types have array types was never extended when we added arrays for all system catalog rowtypes (f7f70d5e2), nor when we added arrays over domain types (c12d570fa). So do that. Also, since the query's expected output isn't empty, it seems like a good idea to add an ORDER BY to make sure the result stays stable.

  • Doc: explain that the string types can't store \0 (ASCII NUL). This restriction was mentioned in connection with string literals, but it wasn't made clear that it's a general restriction not just a syntactic limitation in query strings. Per unsigned documentation comment. Discussion:

  • Doc: clarify that CREATE TABLE discards redundant unique constraints. The SQL standard says that redundant unique constraints are disallowed, but we long ago decided that throwing an error would be too user-unfriendly, so we just drop redundant ones. The docs weren't very clear about that though, as this behavior was only explained for PRIMARY KEY vs UNIQUE, not UNIQUE vs UNIQUE. While here, I couldn't resist doing some copy-editing and markup-fixing on the adjacent text about INCLUDE options. Per bug #16767 from Matthias vd Meent. Discussion:

  • Remove operator_precedence_warning. This GUC was always intended as a temporary solution to help with finding 9.4-to-9.5 migration issues. Now that all pre-9.5 branches are out of support, and 9.5 will be too before v14 is released, it seems like it's okay to drop it. Doing so allows removal of several hundred lines of poorly-tested code in parse_expr.c, which have been a fertile source of bugs when people did use this. Discussion:

  • Teach contain_leaked_vars that assignment SubscriptingRefs are leaky. array_get_element and array_get_slice qualify as leakproof, since they will silently return NULL for bogus subscripts. But array_set_element and array_set_slice throw errors for such cases, making them clearly not leakproof. contain_leaked_vars was evidently written with only the former case in mind, as it gave the wrong answer for assignment SubscriptingRefs (nee ArrayRefs). This would be a live security bug, were it not that assignment SubscriptingRefs can only occur in INSERT and UPDATE target lists, while we only care about leakproofness for qual expressions; so the wrong answer can't occur in practice. Still, that's a rather shaky answer for a security-related question; and maybe in future somebody will want to ask about leakproofness of a tlist. So it seems wise to fix and even back-patch this correction. (We would need some change here anyway for the upcoming generic-subscripting patch, since extensions might make different tradeoffs about whether to throw errors. Commit 558d77f20 attempted to lay groundwork for that by asking check_functions_in_node whether a SubscriptingRef contains leaky functions; but that idea fails now that the implementation methods of a SubscriptingRef are not SQL-visible functions that could be marked leakproof or not.) Back-patch to 9.6. While 9.5 has the same issue, the code's a bit different. It seems quite unlikely that we'd introduce any actual bug in the short time 9.5 has left to live, so the work/risk/reward balance isn't attractive for changing 9.5. Discussion:

  • Support subscripting of arbitrary types, not only arrays. This patch generalizes the subscripting infrastructure so that any data type can be subscripted, if it provides a handler function to define what that means. Traditional variable-length (varlena) arrays all use array_subscript_handler(), while the existing fixed-length types that support subscripting use raw_array_subscript_handler(). It's expected that other types that want to use subscripting notation will define their own handlers. (This patch provides no such new features, though; it only lays the foundation for them.) To do this, move the parser's semantic processing of subscripts (including coercion to whatever data type is required) into a method callback supplied by the handler. On the execution side, replace the ExecEvalSubscriptingRef* layer of functions with direct calls to callback-supplied execution routines. (Thus, essentially no new run-time overhead should be caused by this patch. Indeed, there is room to remove some overhead by supplying specialized execution routines. This patch does a little bit in that line, but more could be done.) Additional work is required here and there to remove formerly hard-wired assumptions about the result type, collation, etc of a SubscriptingRef expression node; and to remove assumptions that the subscript values must be integers. One useful side-effect of this is that we now have a less squishy mechanism for identifying whether a data type is a "true" array: instead of wiring in weird rules about typlen, we can look to see if pg_type.typsubscript == F_ARRAY_SUBSCRIPT_HANDLER. For this to be bulletproof, we have to forbid user-defined types from using that handler directly; but there seems no good reason for them to do so. This patch also removes assumptions that the number of subscripts is limited to MAXDIM (6), or indeed has any hard-wired limit. That limit still applies to types handled by array_subscript_handler or raw_array_subscript_handler, but to discourage other dependencies on this constant, I've moved it from c.h to utils/array.h. Dmitry Dolgov, reviewed at various times by Tom Lane, Arthur Zakirov, Peter Eisentraut, Pavel Stehule Discussion: Discussion:

  • Allow ALTER TYPE to update an existing type's typsubscript value. This is essential if we'd like to allow existing extension data types to support subscripting in future, since dropping and recreating the type isn't a practical thing for an extension upgrade script, and direct manipulation of pg_type isn't a great answer either. There was some discussion about also allowing alteration of typelem, but it's less clear whether that's a good idea or not, so for now I forebore. Discussion:

  • Provide an error cursor for "can't subscript" error messages. Commit c7aba7c14 didn't add this, but after more fooling with the feature I feel that it'd be useful. To make this possible, refactor getSubscriptingRoutines() so that the caller is responsible for throwing any error. (In clauses.c, I just chose to make the most conservative assumption rather than throwing an error. We don't expect failures there anyway really, so the code space for an error message would be a poor investment.)

  • Allow subscripting of hstore values. This is basically a finger exercise to prove that it's possible for an extension module to add subscripting ability. Subscripted fetch from an hstore is not different from the existing "hstore -> text" operator. Subscripted update does seem to be a little easier to use than the traditional update method using hstore concatenation, but it's not a fundamentally new ability. However, there may be some value in the code as sample code, since it shows what's basically the minimum-complexity way to implement subscripting when one needn't consider nested container objects. Discussion:

Heikki Linnakangas pushed:

Andres Freund pushed:

Fujii Masao pushed:

  • Bump catversion for pg_stat_wal changes. Oversight in 01469241b2. Reported-by: Andres Freund Discussion:

  • Speed up rechecking if relation needs to be vacuumed or analyze in autovacuum. After autovacuum collects the relations to vacuum or analyze, it rechecks whether each relation still needs to be vacuumed or analyzed before actually doing that. Previously this recheck could be a significant overhead especially when there were a very large number of relations. This was because each recheck forced the statistics to be refreshed, and the refresh of the statistics for a very large number of relations could cause heavy overhead. There was the report that this issue caused autovacuum workers to have gotten “stuck” in a tight loop of table_recheck_autovac() that rechecks whether a relation needs to be vacuumed or analyzed. This commit speeds up the recheck by making autovacuum worker reuse the previously-read statistics for the recheck if possible. Then if that "stale" statistics says that a relation still needs to be vacuumed or analyzed, autovacuum refreshes the statistics and does the recheck again. The benchmark shows that the more relations exist and autovacuum workers are running concurrently, the more this change reduces the autovacuum execution time. For example, when there are 20,000 tables and 10 autovacuum workers are running, the benchmark showed that the change improved the performance of autovacuum more than three times. On the other hand, even when there are only 1000 tables and only a single autovacuum worker is running, the benchmark didn't show any big performance regression by the change. Firstly POC patch was proposed by Jim Nasby. As the result of discussion, we used Tatsuhito Kasahara's version of the patch using the approach suggested by Tom Lane. Reported-by: Jim Nasby Author: Tatsuhito Kasahara Reviewed-by: Masahiko Sawada, Fujii Masao Discussion:

Dean Rasheed pushed:

Peter Eisentraut pushed:

Bruce Momjian pushed:

Noah Misch pushed:

Pending Patches

Bharath Rupireddy sent in a patch to allow parallel mode in REFRESH MATERIALIZED VIEW planning.

James Coleman sent in a patch to allow parallel LATERAL subqueries with LIMIT/OFFSET.

Peter Eisentraut sent in a WIP patch to fix temp-install tests to work with macOS SIP.

Pavel Stěhule sent in another revision of a patch to add a unistr function which evaluates various forms of unicode-escaped characters.

Bruce Momjian sent in two more revisions of a patch to implement key management for PostgreSQL.

Justin Pryzby sent in two more revisions of a patch to pg_dump which causes it to output separate "object" for ALTER TABLE..ATTACH PARTITION.

Peter Eisentraut sent in a patch to convert elog(LOG) calls to ereport() where appropriate.

David Zhang sent in two more revisions of a patch to add table access method as an option to pgbench.

Ajin Cherian and Peter Smith traded patches to add two-phase transactions to logical decoding.

Amit Langote, Kyotaro HORIGUCHI, and Keisuke Kuroda traded patches to fix a bug than manifested as huge memory consumption on partitioned tables with foreign keys.

Dean Rasheed sent in two more revisions of a patch to make OR clauses use extended statistics.

Peter Eisentraut sent in two revisions of a patch to remove gratuitous uses of deprecated SELECT INTO, and clarify status of SELECT INTO on the reference page.

Takayuki Tsunakawa sent in two revisions of a patch to fix a bug that manifested as ALTER TABLE SET LOGGED/UNLOGGED on a partitioned table does nothing silently.

Takamichi Osumi sent in another revision of a patch to add an option to disable WAL logging to speed up data loading.

Julien Rouhaud sent in a patch to add a new COLLATION option to REINDEX, which can be used to filter the list of indexes to rebuild. This is handy for times when system collation gets updated.

Daniel Gustafsson sent in another revision of a patch to support page checksum enable/disable in a running cluster.

Peter Eisentraut sent in a patch to remove unnecessary grammar symbols.

Thomas Munro sent in a WIP patch to use Github Actions for CI.

Bharath Rupireddy sent in three more revisions of a patch to two add new functions: pg_terminate_backend(pid, wait, timeout), which terminates and waits or times out for a given backend, and pg_wait_backend(pid, timeout), which checks the existence of the backend with a given PID and waits or times out until it goes away.

Pavel Borisov sent in two more revisions of a patch to implement covering SPGiST index.

Jeff Davis sent in a patch to ensure that the format is sent along with the type in walsender.

Craig Ringer sent in another revision of a patch to replace CLOBBER_CACHE_ALWAYS with a new GUC, debug_clobber_cache_depth.

Amit Kapila and Peter Smith traded patches to speed up throughput in logical replication's tablesync.

Kirk Jamison sent in two more revisions of a patch to prevent invalidating blocks in smgrextend() during recovery, add a bool param in smgrnblocks() for cached blocks to ensures that we return a reliable value from smgrnblocks, make DropRelFileNodeBuffers() more efficient during recovery by avoiding scanning the whole buffer pool when the relation is small enough, or the the total number of blocks to be invalidated is below the threshold of full scanning, and make DropRelFileNodesAllBuffers() more efficient in recovery by skipping the time-consuming scan of the whole buffer pool during recovery when the relation is small enough, or when the number of blocks to be invalidated is below the full scan threshold.

Tomáš Vondra sent in two more revisions of a patch to make it possible to use extended statistics on expressions.

Nathan Bossart and Michaël Paquier traded patches to add the checkpoint/restartpoint status to ps display.

Laurenz Albe sent in two more revisions of a patch to add session statistics to pg_stat_database.

Julien Rouhaud sent in two revisions of a patch to add a bool toplevel column to pg_stat_statements.

Tom Lane sent in a patch to remove operator_precedence_warning.

Antonin Houska sent in another revision of a patch to clean up orphaned files using undo logs.

Joel Jacobson sen in three revisions of a patch to add support for leading/trailing bytea trim()ing.

David Rowley sent in two more revisions of a patch to allow estimate_num_groups() to pass back further details about the estimation, allow users of simplehash.h to perform direct deletions, add a Result Cache executor node, remove some code duplication in nodeResultCache.c, and use a Result Cache node to cache results from subplans.

Greg Nancarrow sent in two more revisions of a patch to add new configuration parameter "disable_event_triggers", and add a new "client_connection" event, supporting a "logon trigger".

Bharath Rupireddy sent in another revision of a patch to add a postgres_fdw function to discard cached connections, add a add keep_connections GUC to the postgres_fdw to avoid caching connections, and add a postgres_fdw server-level option, keep_connection to not cache connections.

Peter Eisentraut sent in a patch to change the definitions of bitmap flags to bit-shifting style.

Nathan Bossart sent in three more revisions of a patch to add a SPREAD option to checkpoint.

Justin Pryzby sent in a patch to make the changes to pg_upgrade/ that are needed to allow testing upgrade from v11, and adds a pg_upgrade test to exercise binary compatibility.

Amit Langote sent in another revision of a patch to set ForeignScanState.resultRelInfo lazily, set ResultRelInfo.ri_PartitionRoot in all result relations, not just those that are targets of tuple routing, and initialize result relation information lazily.

Vigneshwaran C sent in a patch to add some copy-related data structures to typedefs.list which were were added during the split of copy into smaller files.

Peter Eisentraut sent in another revision of a patch to add primary keys and unique constraints to system catalogs.

Vigneshwaran C sent in another revision of a patch to make it possible to run COPY FROM with multiple workers.

Andrey Borodin sent in two more revisions of a patch to add functions to 'pageinspect' to inspect GiST indexes.

Bharath Rupireddy sent in three more revisions of a patch to make it possible to use parallel inserts in CTAS, and make some adjustments to tuple costs for this case.

Stephen Frost sent in two revisions of a patch to change the default checkpoint_completion_target to 0.9.

Greg Nancarrow sent in three more revisions of a patch to make it possible to parallelize parts of INSERT ... SELECT.

Rémi Lapeyre sent in another revision of a patch to add header support to the "COPY" text format.

Alexander Korotkov sent in another revision of a patch to support multiranges.

Takamichi Osumi sent in another revision of a patch to implement a stronger safeguard for archive recovery to ensure that it does not miss data.

Vigneshwaran C sent in another revision of a patch to print backtraces of postgres process that are part of the instance current instance.

Gilles Darold sent in a patch to add three hooks at the XactCommand level: start_xact_command_hook, called at end of start_xact_command(), finish_xact_command, called in finish_xact_command() just before CommitTransactionCommand(), and abort_current_transaction_hook, called after an error is encountered at end of AbortCurrentTransaction().

Masahiro Ikeda sent in two revisions of a patch to to add WAL write/fsync statistics to the pg_stat_wal view.

Denis Smirnov sent in a PoC patch to refactor the AM analyse API.

Bharath Rupireddy sent in a patch to add table Access Methods for Multi and Single Inserts.

Andrey Borodin and Gilles Darold traded patches to use a shared lock rather than an exclusive lock in GetMultiXactIdMembers for offsets and members, make MultiXact local cache size configurable, add a condition variable to wait for the next MultXact offset in a corner case, and add GUCs to tune MultiXact SLRUs.

Hou Zhijie sent in a patch to fix a typo about generate_gather_paths.

Shinya Kato sent in a patch to improve psql's tab completion for CLOSE, FETCH, and MOVE.

Michaël Paquier sent in a patch intended to fix a bug that manifested as occasional tablespace.sql failures in check-world -jnn by arranging for pg_regress to clean out the test tablespace dir, or create it if it doesn't exist on all platforms.

Kyotaro HORIGUCHI sent in a patch to add a new test to detect a replication bug, and ensure that WalSndSegmentOpen tracks a timeline switch while sending a historic timeline running physical replication.

Andrey Borodin sent in another revision of a patch to make it possible to disallow cancelation of syncronous commit.

Zeng Wenjing sent in another revision of a patch to implement global temporary tables.

Dilip Kumar sent in another revision of a patch to implement custom compression methods for tables, including ways to change them on the fly.

Fujii Masao sent in another revision of a patch to add a stats_reset time to pg_stat_statements.

Justin Pryzby sent in another revision of a patch to make pg_ls_*() show directories and shared filesets.

Justin Pryzby sent in a patch to add an INCLUDING ACCESS METHOD option to CREATE TABLE LIKE.

Chen Hujaun sent in another revision of a patch to make it possible to compress pages for OLTP.

Andrey Borodin sent in another revision of a patch to speed up the pglz compression code by converting some macro-functions to regular functions and using a more compact hash table, along with some other optimizations.

Atsushi Torikoshi sent in another revision of a patch to implement pg_get_target_backend_memory_contexts(), which is able to collect arbitrary backend process's memory contexts.

Peter Geoghegan sent in another revision of a patch to pass down a "logically unchanged index" hint, and use same to add bottom-up index deletion.

Dilip Kumar sent in another revision of a patch to ensure that pg_is_wal_replay_paused waits for recovery to pause.

Daniel Gustafsson sent in a patch to move the information callback earlier in TLS negotiation to capture the connection. The callback for retrieving state change information during connection setup was only installed when the connection was mostly set up, and thus didn't provide much information. This also extends the callback with printing detailed information about the state change.

Peter Eisentraut sent in a patch to allow a GRANTED BY clause in normal GRANT and REVOKE statements, per the SQL standard.

Stephen Frost sent in a patch intended to fix a bug that manifested as autovacuum worker doesn't immediately exit on postmaster death by replacing some system calls with a WaitLatch.

Bharath Rupireddy sent in a patch to fail fast in CTAS/CMV if relation already exists.

Lukas Meisegeier sent in a patch to add an ssltermination parameter for SNI-based load balancing.

Amit Kapila sent in a patch to speed up xor'ing two gist index signatures for tsvectors by using popcount64 on each of the chunks, and avoid a function pointer dereference for calls to pg_popcount32/64().

Kyotaro HORIGUCHI sent in another revision of a patch to rework the stats collector to use shared memory instead of files for temporary storage.

Kyotaro HORIGUCHI sent in a patch to fix the documentation for pg_shmem_allocations by mentioning that it's NULL for anonymous allocations.

Peter Eisentraut sent in a patch to clean up an ancient test style. Tests were written similar to SELECT '' AS two, i.* FROM INT2_TBL where the first column indicated the number of expected result rows. To clean this up, remove all those extra columns.

Tom Lane sent in two revisions of a patch to rework PL/pgsql's assignment implementation to use more of what's in core.

Peter Eisentraut and Justin Pryzby traded patches to allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly.