PostgreSQL Weekly News - September 19, 2021

Posted on 2021-09-20 by PWN

PostgreSQL Weekly News - September 19, 2021

Pgpool-II 4.2.5, a connection pooler and statement replication system for PostgreSQL, released

Database Lab 2.5, a tool for fast cloning of large PostgreSQL databases to build non-production environments, released.

pgexporter 0.1.0, a Prometheus exporter for PostgreSQL, released

PostgreSQL Product News

PostgreSQL Jobs for September

PostgreSQL in the News

Planet PostgreSQL:

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm PST8PDT to

Applied Patches

Michaël Paquier pushed:

  • Refactor the syslogger pipe protocol to use a bitmask for its options. The previous protocol expected a set of matching characters to check if a message sent was the last one or not, that changed depending on the destination wanted: - 't' and 'f' tracked the last message of a log sent to stderr. - 'T' and 'F' tracked the last message of a log sent to csvlog. This could be extended with more characters when introducing new destinations, but using a bitmask is much more elegant. This commit changes the protocol so as a bitmask is used in the header of a log chunk message sent to the syslogger, with the following options available for now: - log_destination as stderr. - log_destination as csvlog. - if a message is the last chunk of a message. Sehrope found this issue in a patch set to introduce JSON as an option for log_destination, but his patch made the size of the protocol header larger. This commit keeps the same size as the original, and adapts the protocol as wanted. Thanks also to Andrew Dunstan and Greg Stark for the discussion. Author: Michael Paquier, Sehrope Sarkuni Discussion:

  • Add regression tests for csvlog with the logging collector. These are added in the existing tests of pg_ctl for log rotation, that already tested stderr. The same amount of coverage is added for csvlog: - Checks for pg_current_logfile(). - Log rotation with expected file name. - Log contents generated. This test is refactored to minimize the amount of work required to add tests for new log formats, easing some upcoming work. Author: Michael Paquier, Sehrope Sarkuni Discussion:

  • Fix error handling with threads on OOM in ECPG connection logic. An out-of-memory failure happening when allocating the structures to store the connection parameter keywords and values would mess up with the set of connections saved, as on failure the pthread mutex would still be hold with the new connection object listed but free()'d. Rather than just unlocking the mutex, which would leave the static list of connections into an inconsistent state, move the allocation for the structures of the connection parameters before beginning the test manipulation. This ensures that the list of connections and the connection mutex remain consistent all the time in this code path. This error is unlikely going to happen, but this could mess up badly with ECPG clients in surprising ways, so backpatch all the way down. Reported-by: ryancaicse Discussion: Backpatch-through: 9.6

  • Remove code duplication for permission checks with replication slots. Two functions, both named check_permissions(), used the same checks to verify if a user had required privileges to work on replication slots. This commit removes the duplication, and moves the function doing the checks to slot.c to be centralized. Author: Bharath Rupireddy Reviewed-by: Nathan Bossart, Euler Taveira Discussion:

  • Update README for resource owners about the resource types supported. All the types supported were listed directly in the README, but it was very outdated. Rather than listing all the types supported in the README, this commit adds a reference to look at ResourceOwnerData in resowner.c to get this information. The order of the paragraphs is reworked a bit for clarity. Author: Amit Langote Discussion:

  • Support "postgres -C" with runtime-computed GUCs. Until now, the -C option of postgres was handled before a small subset of GUCs computed at runtime are initialized, leading to incorrect results as GUC machinery would fall back to default values for such parameters. For example, data_checksums could report "off" for a cluster as the control file is not loaded yet. Or wal_segment_size would show a segment size at 16MB even if initdb --wal-segsize used something else. Worse, the command would fail to properly report the recently-introduced shared_memory, that requires to load shared_preload_libraries as these could ask for a chunk of shared memory. Support for runtime GUCs comes with a limitation, as the operation is now allowed on a running server. One notable reason for this is that _PG_init() functions of loadable libraries are called before all runtime-computed GUCs are initialized, and this is not guaranteed to be safe to do on running servers. For the case of shared_memory_size, where we want to know how much memory would be used without allocating it, this limitation is fine. Another case where this will help is for huge pages, with the introduction of a different GUC to evaluate the amount of huge pages required for a server before starting it, without having to allocate large chunks of memory. This feature is controlled with a new GUC flag, and four parameters are classified as runtime-computed as of this change: - data_checksums - shared_memory_size - data_directory_mode - wal_segment_size Some TAP tests are added to provide some coverage here, using data_checksums in the tests of pg_checksums. Per discussion with Andres Freund, Justin Pryzby, Magnus Hagander and more. Author: Nathan Bossart Discussion:

  • Disable test for postgres -C on Msys. The output generated on Msys is incorrect because of the different way IPC::Run processes outputs with native Perl (converts natively \r\n to \n) and Msys perl (\r\n kept as-is), causing this test to fail. For now, just disable the test to bring the buildfarm to a green state. I think that the correct long-term solution would be to tweak all the routines command_checks_* in to handle this output like psql does when using Msys, by discarding \r automatically before comparing it. Per report from jacana and fairywren. Thanks to Tom Lane for the ping. Discussion:

  • Clarify some errors in pg_receivewal when closing WAL segments. A WAL segment closed during a WAL stream for pg_receivewal would generate incorrect error messages depending on the context, as the file name used when referring to a WAL segment ignored partial files or the compression method used. In such cases, the error message generated (failure on close, seek or rename) would not match a physical file name. The same code paths are used by pg_basebackup, but it uses no partial suffix so it is not impacted. 7fbe0c8 has introduced in walmethods.c a callback to get the exact physical file name used for a given context, this commit makes use of it to improve those error messages. This could be extended to more code paths of pg_basebackup/ in the future, if necessary. Extracted from a larger patch by the same author. Author: Georgios Kokolatos Discussion:

  • Improve some check logic in pg_receivewal. The following things are improved:

  • Fetch the system identifier from the source server before any WAL streaming loop. This triggers extra checks to make sure that pg_receivewal is still connected to a server with the same system ID with a correct timeline. - Switch umask() (for file creation mode mask) and RetrieveWalSegSize() (to fetch the size of WAL segments) a bit later before the initial stream attempt. If the connection was done with a database, pg_receivewal would fail but those commands were still executed, which was a waste. The slot creation and drop are now done before retrieving the segment size. Author: Bharath Rupireddy Reviewed-by: Ronan Dunklau, Michael Paquier Discussion:

Amit Kapila pushed:

Tom Lane pushed:

  • Fix EXIT out of outermost block in plpgsql. Ordinarily, using EXIT this way would draw "control reached end of function without RETURN". However, if the function is one where we don't require an explicit RETURN (such as a DO block), that should not happen. It did anyway, because add_dummy_return() neglected to account for the case. Per report from Herwig Goemans. Back-patch to all supported branches. Discussion:

  • Doc: improve documentation of CREATE/ALTER SUBSCRIPTION. Improve the descriptions of some options. Fix sloppy grammar and markup. Peter Smith and Tom Lane Discussion:

  • Clear conn->errorMessage at successful completion of PQconnectdb(). Commits ffa2e4670 and 52a10224e caused libpq's connection-establishment functions to usually leave a nonempty string in the connection's errorMessage buffer, even after a successful connection. While that was intentional on my part, more sober reflection says that it wasn't a great idea: the string would be a bit confusing. Also this broke at least one application that checked for connection success by examining the errorMessage, instead of using PQstatus() as documented. Let's clear the buffer at success exit, restoring the pre-v14 behavior. Discussion:

  • Fix planner error with multiple copies of an AlternativeSubPlan. It's possible for us to copy an AlternativeSubPlan expression node into multiple places, for example the scan quals of several partition children. Then it's possible that we choose a different one of the alternatives as optimal in each place. Commit 41efb8340 failed to consider this scenario, so its attempt to remove "unused" subplans could remove subplans that were still used elsewhere. Fix by delaying the removal logic until we've examined all the AlternativeSubPlans in a given query level. (This does assume that AlternativeSubPlans couldn't get copied to other query levels, but for the foreseeable future that's fine; cf qual_is_pushdown_safe.) Per report from Rajkumar Raghuwanshi. Back-patch to v14 where the faulty logic came in. Discussion:

  • Send NOTIFY signals during CommitTransaction. Formerly, we sent signals for outgoing NOTIFY messages within ProcessCompletedNotifies, which was also responsible for sending relevant ones of those messages to our connected client. It therefore had to run during the main-loop processing that occurs just before going idle. This arrangement had two big disadvantages: * Now that procedures allow intra-command COMMITs, it would be useful to send NOTIFYs to other sessions immediately at COMMIT (though, for reasons of wire-protocol stability, we still shouldn't forward them to our client until end of command). * Background processes such as replication workers would not send NOTIFYs at all, since they never execute the client communication loop. We've had requests to allow triggers running in replication workers to send NOTIFYs, so that's a problem. To fix these things, move transmission of outgoing NOTIFY signals into AtCommit_Notify, where it will happen during CommitTransaction. Also move the possible call of asyncQueueAdvanceTail there, to ensure we don't bloat the async SLRU if a background worker sends many NOTIFYs with no one listening. We can also drop the call of asyncQueueReadAllNotifications, allowing ProcessCompletedNotifies to go away entirely. That's because commit 790026972 added a call of ProcessNotifyInterrupt adjacent to PostgresMain's call of ProcessCompletedNotifies, and that does its own call of asyncQueueReadAllNotifications, meaning that we were uselessly doing two such calls (inside two separate transactions) whenever inbound notify signals coincided with an outbound notify. We need only set notifyInterruptPending to ensure that ProcessNotifyInterrupt runs, and we're done. The existing documentation suggests that custom background workers should call ProcessCompletedNotifies if they want to send NOTIFY messages. To avoid an ABI break in the back branches, reduce it to an empty routine rather than removing it entirely. Removal will occur in v15. Although the problems mentioned above have existed for awhile, I don't feel comfortable back-patching this any further than v13. There was quite a bit of churn in adjacent code between 12 and 13. At minimum we'd have to also backpatch 51004c717, and a good deal of other adjustment would also be needed, so the benefit-to-risk ratio doesn't look attractive. Per bug #15293 from Michael Powers (and similar gripes from others). Artur Zakirov and Tom Lane Discussion:

  • Improve log messages from pg_import_system_collations(). pg_import_system_collations() was a bit inconsistent about how it reported locales (names output by "locale -a") that it didn't make pg_collation entries for. IMV we should print some suitable message for every locale that we reject, except when it matches a pre-existing pg_collation entry. (This is all at DEBUG1 log level, though, so as not to create noise during initdb.) Add messages for the two cases that were previously not logged, namely unrecognized encoding and client-only encoding. Re-word the existing messages to have a consistent style. Anton Voloshin and Tom Lane Discussion:

  • Disallow LISTEN in background workers. It's possible to execute user-defined SQL in some background processes; for example, logical replication workers can fire triggers. This opens the possibility that someone would try to execute LISTEN in such a context. But since only regular backends ever call ProcessNotifyInterrupt, no messages would actually be received, and thus the registered listener would simply prevent the message queue from being cleaned. Eventually NOTIFY would stop working, which is bad. Perhaps someday somebody will invent infrastructure to make listening in a background worker actually useful. In the meantime, forbid it. Back-patch to v13, which is where we introduced the MyBackendType variable. It'd be a lot harder to implement the check without that, and it doesn't seem worth the trouble. Discussion:

  • Remove arbitrary 64K-or-so limit on rangetable size. Up to now the size of a query's rangetable has been limited by the constants INNER_VAR et al, which mustn't be equal to any real rangetable index. 65000 doubtless seemed like enough for anybody, and it still is orders of magnitude larger than the number of joins we can realistically handle. However, we need a rangetable entry for each child partition that is (or might be) processed by a query. Queries with a few thousand partitions are getting more realistic, so that the day when that limit becomes a problem is in sight, even if it's not here yet. Hence, let's raise the limit. Rather than just increase the values of INNER_VAR et al, this patch adopts the approach of making them small negative values, so that rangetables could theoretically become as long as INT_MAX. The bulk of the patch is concerned with changing Var.varno and some related variables from "Index" (unsigned int) to plain "int". This is basically cosmetic, with little actual effect other than to help debuggers print their values nicely. As such, I've only bothered with changing places that could actually see INNER_VAR et al, which the parser and most of the planner don't. We do have to be careful in places that are performing less/greater comparisons on varnos, but there are very few such places, other than the IS_SPECIAL_VARNO macro itself. A notable side effect of this patch is that while it used to be possible to add INNER_VAR et al to a Bitmapset, that will now draw an error. I don't see any likelihood that it wouldn't be a bug to include these fake varnos in a bitmapset of real varnos, so I think this is all to the good. Although this touches outfuncs/readfuncs, I don't think a catversion bump is required, since stored rules would never contain Vars with these fake varnos. Andrey Lepikhov and Tom Lane, after a suggestion by Peter Eisentraut Discussion:

  • Fix EXPLAIN to handle SEARCH BREADTH FIRST queries. The rewriter transformation for SEARCH BREADTH FIRST produces a FieldSelect on a Var of type RECORD, where the Var references the recursive union's worktable output. EXPLAIN VERBOSE failed to handle this case, because it only expected such Vars to appear in CteScans not WorkTableScans. Fix that, and add some test cases exercising EXPLAIN on SEARCH and CYCLE queries. In principle this oversight is an old bug, but it seems that the case is unreachable without SEARCH BREADTH FIRST, because the parser fails when attempting to create such a reference manually. So for today I'll just patch HEAD/v14. Someday we might find that the code portion of this patch needs to be back-patched further. Per report from Atsushi Torikoshi. Discussion:

  • Fix pull_varnos to cope with translated PlaceHolderVars. Commit 55dc86eca changed pull_varnos to use (if possible) the associated ph_eval_at for a PlaceHolderVar. I missed a fine point though: we might be looking at a PHV in the quals or tlist of a child appendrel, in which case we need to compute a ph_eval_at value that's been translated in the same way that the PHV itself has been (cf. adjust_appendrel_attrs). Fortunately, enough info is available in the PlaceHolderInfo to make such translation possible without additional outside data, so we don't need another round of uglification of planner APIs. This is a little bit complicated, but since it's a hard-to-hit corner case, I'm not much worried about adding cycles here. Per report from Jaime Casanova. Back-patch to v12, like the previous commit. Discussion:

  • Doc: fix typos. "PGcon" should be "PGconn". Noted by D. Frey. Discussion:

Andres Freund pushed:

Peter Eisentraut pushed:

Daniel Gustafsson pushed:

Fujii Masao pushed:

Peter Geoghegan pushed:

  • pageinspect: Make page deletion elog less chatty. An elog that reports the value of a transaction ID stored on a deleted nbtree page was added by commit e5d8a999, which taught page deletion to store full 64-bit XIDs. It seems very chatty on further reflection, so lower its elevel from NOTICE to DEBUG2. Author: Peter Geoghegan Backpatch: 14-, just like the nbtree XID enhancement.