PostgreSQL Weekly News - December 19, 2021

Posted on 2021-12-20 by PWN

FOSDEM PGDay 2022 will be held online on Feb 5-6, 2022.

A PostgreSQL Transition Guide, containing much hard-won wisdom, and available in French and English, has been published.

pgDay Paris 2022 will be held in Paris, France on March 24, 2022. The CfP is open through December 31, 2021 at midnight, Paris time.

Citus Con, a virtual global developer event, is happening April 12-13, 2022. The CfP is now open.

PostgreSQL Product News

Pgpool-II 4.3.0, a connection pooler and statement replication system for PostgreSQL, released.

Access-to-PostgreSQL v2.3 released.

check_pgbackrest 2.2, a Nagios-compatible monitor for pgBackRest, released.

DB Comparer 5.0 for PostgreSQL released.

Database .NET v33.6, a multi-database management tool, now with support for PostgreSQL, released.

pgAdmin4 6.3, a web- and native GUI control center for PostgreSQL, released.

pgFormatter 5.2, a formatter/beautifier for SQL code, released.

MySQL-to-PostgreSQL v5.5 released.

PostgreSQL Jobs for December

PostgreSQL Local

Nordic PGDay 2022 will be held in Helsinki, Finland at the Hilton Helsinki Strand Hotel on March 22, 2022. The CfP is open through December 31, 2021.

PostgreSQL in the News

Planet PostgreSQL:

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm PST8PDT to

Applied Patches

Michaël Paquier pushed:

  • Improve psql tab completion for views, FDWs, sequences and transforms. The following improvements are done: - Addition of type completion for ALTER SEQUENCE AS. - Ignore ALTER for transforms, as the command is not supported. - Addition of more completion for ALTER FOREIGN DATA WRAPPER. - Addition of options related to columns in ALTER VIEW. This is a continuation of the work done in 0cd6d3b. Author: Ken Kato Discussion:

  • Centralize timestamp computation of control file on updates. This commit moves the timestamp computation of the control file within the routine of src/common/ in charge of updating the backend's control file, which is shared by multiple frontend tools (pg_rewind, pg_checksums and pg_resetwal) and the backend itself. This change has as direct effect to update the control file's timestamp when writing the control file in pg_rewind and pg_checksums, something that is helpful to keep track of control file updates for those operations, something also tracked by the backend at startup within its logs. This part is arguably a bug, as ControlFileData->time should be updated each time a new version of the control file is written, but this is a behavior change so no backpatch is done. Author: Amul Sul Reviewed-by: Nathan Bossart, Michael Paquier, Bharath Rupireddy Discussion:

  • Fix compatibility thinko for fstat() on standard streams in win32stat.c. GetFinalPathNameByHandleA() cannot be used in compilation environments where _WIN32_WINNT < 0x0600, meaning at least Windows XP used by some buildfarm members under MinGW that Postgres still needs to support. This was reported as a compilation warning by the buildfarm, but this is actually worse than the report as the code would not have worked. Instead, this switches to GetFileInformationByHandle() that is able to fail for standard streams and succeed for redirected ones, which is what we are looking for here in the code emulating fstat(). We also know that it is able to work in all the environments still supported, thanks to the existing logic of win32stat.c. Issue introduced by 10260c7, so backpatch down to 14. Reported-by: Justin Pryzby, via buildfarm member jacana Author: Michael Paquier Reviewed-by: Juan José Santamaría Flecha Discussion: Backpatch-through: 14

  • Fix typos. Author: Lingjie Qiang Discussion:

  • Fix flags of some GUCs and improve some descriptions. This commit fixes some issues with GUCs: - enable_incremental_sort was not marked as GUC_EXPLAIN, causing it to not be listed in the output of EXPLAIN (SETTINGS) if using a value different than the default, contrary to the other planner-level GUCs. - trace_recovery_messages missed GUC_NOT_IN_SAMPLE, like the other developer options. - ssl_renegotiation_limit should be marked as COMPAT_OPTIONS_PREVIOUS. While on it, this fixes one incorrect comment related to autovacuum_freeze_max_age, and improves the descriptions of some other GUCs, recently introduced. Extracted from a larger patch set by the same author. Author: Justin Pryzby Description:

  • Improve psql tab completion for various DROP commands. The following improvements are done: - Handling of RESTRICT/CASCADE for DROP OWNED, matviews and policies. - Handling of DROP TRANSFORM. This is a continuation of the work done in 0cd6d3b and f44ceb4. Author: Ken Kato Reviewed-by: Asif Rehman Discussion:

  • Fix comment grammar in slotfuncs.c. Author: Bharath Rupireddy Discussion:

  • Move into separate file all the SQL queries used in pg_upgrade tests. The existing pg_upgrade/ and the buildfarm code have been holding the same set of SQL queries when doing cross-version upgrade tests to adapt the objects created by the regression tests before the upgrade (mostly, incompatible or non-existing objects need to be dropped from the origin, perhaps re-created). This moves all those SQL queries into a new, separate, file with a set of \if clauses to handle the version checks depending on the old version of the cluster to-be-upgraded. The long-term plan is to make the buildfarm code re-use this new SQL file, so as committers are able to fix any compatibility issues in the tests of pg_upgrade with a refresh of the core code, without having to poke at the buildfarm client. Note that this is only able to handle the main regression test suite, and that nothing is done yet for contrib modules (these have more issues like their database names). A backpatch down to 10 is done, adapting the version checks as this script needs to be only backward-compatible, so as it becomes possible to clean up a maximum amount of code within the buildfarm client. Author: Justin Pryzby, Michael Paquier Discussion: Backpatch-through: 10

  • pg_waldump: Emit stats summary when interrupted by SIGINT. Previously, pg_waldump would not display its statistics summary if it got interrupted by SIGINT (or say a simple Ctrl+C). It gains with this commit a signal handler for SIGINT, trapping the signal to exit at the earliest convenience to allow a display of the stats summary before exiting. This makes the reports more interactive, similarly to strace -c. This new behavior makes the combination of the options --stats and --follow much more useful, so as the user will get a report for any invocation of pg_waldump in such a case. Information about the LSN range of the stats computed is added as a header to the report displayed. This implementation comes from a suggestion by Álvaro Herrera and myself, following a complaint by the author of this patch about --stats and --follow not being useful together originally. As documented, this is not supported on Windows, though its support would be possible by catching the terminal events associated to Ctrl+C, for example (this may require a more centralized implementation, as other tools could benefit from a common API). Author: Bharath Rupireddy Discussion:

  • Improve the description of various GUCs. This commit fixes a couple of inconsistencies in the descriptions of some GUCs, while making their wording more general regarding the units they rely on. For most of them, this removes the use of terms like "N seconds" or "N bytes", which may not apply easily to all the languages these strings are translated to (from my own experience, this works in French and English, less in Japanese). Per debate between the authors listed below. Author: Justin Pryzby, Michael Paquier Discussion:

  • Fix corruption of toast indexes with REINDEX CONCURRENTLY. REINDEX CONCURRENTLY run on a toast index or a toast relation could corrupt the target indexes rebuilt, as a backend running in parallel that manipulates toast values would directly release the lock on the toast relation when its local operation is done, rather than releasing the lock once the transaction that manipulated the toast values committed. The fix done here is simple: we now hold a ROW EXCLUSIVE lock on the toast relation when saving or deleting a toast value until the transaction working on them is committed, so as a concurrent reindex happening in parallel would be able to wait for any activity and see any new rows inserted (or deleted). An isolation test is added to check after the case fixed here, which is a bit fancy by design as it relies on allow_system_table_mods to rename the toast table and its index to fixed names. This way, it is possible to reindex them directly without any dependency on the OID of the underlying relation. Note that this could not use a DO block either, as REINDEX CONCURRENTLY cannot be run in a transaction block. The test is backpatched down to 13, where it is possible, thanks to c4a7a39, to use allow_system_table_mods in a test suite. Reported-by: Alexey Ermakov Analyzed-by: Andres Freund, Noah Misch Author: Michael Paquier Reviewed-by: Nathan Bossart Discussion: Backpatch-through: 12

  • Improve parsing of options of CREATE/ALTER SUBSCRIPTION. This simplifies the code so as it is not necessary anymore for the caller of parse_subscription_options() to zero SubOpts, holding a bitmap of the provided options as well as the default/parsed option values. This also simplifies some checks related to the options supported by a command when checking for incompatibilities. While on it, the errors generated for unsupported combinations with "slot_name = NONE" are reordered. This may generate different errors compared to the previous major versions, but users have to go through all those errors to get a correct command in this case when using incorrect values for options "enabled" and "create_slot", so at the end the resulting command would remain the same. Author: Peter Smith Reviewed-by: Nathan Bossart Discussion:

  • Fix some typos with {a,an}. One of the changes impacts the documentation, so backpatch. Author: Peter Smith Discussion: Backpatch-through: 14

  • Improve description of some WAL records with transaction commands. This commit improves the description of some WAL records for the Transaction RMGR: - Track remote_apply for a transaction commit. This GUC is user-settable, so this information can be useful for debugging. - Add replication origin information for PREPARE TRANSACTION, with the origin ID, LSN and timestamp - Same as above, for ROLLBACK PREPARED. This impacts the format of pg_waldump or anything using these description routines, so no backpatch is done. Author: Masahiko Sawada, Michael Paquier Discussion:

  • Remove assertion for replication origins in PREPARE TRANSACTION. When using replication origins, pg_replication_origin_xact_setup() is an optional choice to be able to set a LSN and a timestamp to mark the origin, which would be additionally added to WAL for transaction commits or aborts (including 2PC transactions). An assertion in the code path of PREPARE TRANSACTION assumed that this data should always be set, so it would trigger when using replication origins without setting up an origin LSN. Some tests are added to cover this kind of scenario more thoroughly. Oversight in commit 1eb6d65. Per discussion with Amit Kapila and Masahiko Sawada. Discussion: Backpatch-through: 11

  • Adjust behavior of some env settings for the TAP tests of MSVC. edc2332 has introduced in some control on the environment variables LZ4, TAR and GZIP_PROGRAM to allow any TAP tests to be able to use those commands. This makes the settings more consistent with src/, as the same default gets used for Make and MSVC builds. Each parameter can be changed in, but as a default gets assigned after loading, it is not possible to unset any of these, and using an empty value would not work with "||=" either. As some environments may not have a compatible command in their PATH (tar coming from MinGW is an issue, for one), this could break tests without an exit path to bypass any failing test. This commit changes things so as the default values for LZ4, TAR and GZIP_PROGRAM are assigned before loading, not after. This way, we keep the same amount of compatibility as a GNU build with the same defaults, and it becomes possible to unset any of those values. While on it, this adds some documentation about those three variables in the section dedicated to the TAP tests for MSVC. Per discussion with Andrew Dunstan. Discussion: Backpatch-through: 10

  • Add option -N/--no-sync to pg_upgrade. This is an option consistent with what the other tools of src/bin/ (pg_checksums, pg_dump, pg_rewind and pg_basebackup) provide which is useful for leveraging the I/O effort when testing things. This is not to be used in a production environment. All the regression tests of pg_upgrade are updated to use this new option. This happens to cut at most a couple of seconds in environments constrained on I/O, by avoiding a flush of data folder for the new cluster upgraded. Author: Michael Paquier Reviewed-by: Peter Eisentraut Discussion:

  • Fix typo in TAP tests of pg_receivewal. Introduced in d62bcc8, noticed while hacking in the area.

Tom Lane pushed:

  • Replace random(), pg_erand48(), etc with a better PRNG API and algorithm. Standardize on xoroshiro128 as our basic PRNG algorithm, eliminating a bunch of platform dependencies as well as fundamentally-obsolete PRNG code. In addition, this API replacement will ease replacing the algorithm again in future, should that become necessary. xoroshiro128 is a few percent slower than the drand48 family, but it can produce full-width 64-bit random values not only 48-bit, and it should be much more trustworthy. It's likely to be noticeably faster than the platform's random(), depending on which platform you are thinking about; and we can have non-global state vectors easily, unlike with random(). It is not cryptographically strong, but neither are the functions it replaces. Fabien Coelho, reviewed by Dean Rasheed, Aleksander Alekseev, and myself Discussion:
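For readers curious what this family of generators looks like, here is a minimal Python sketch. It assumes the xoroshiro128** variant with the published 24/16/37 rotation constants, and it seeds the state with splitmix64. The class and function names are illustrative only and are not PostgreSQL's API. It shows the two properties the commit message highlights: full-width 64-bit output, and state that lives in an object rather than in a global.

```python
MASK64 = (1 << 64) - 1  # keep every intermediate value within 64 bits


def rotl(x: int, k: int) -> int:
    """Rotate the 64-bit value x left by k bits."""
    return ((x << k) | (x >> (64 - k))) & MASK64


def splitmix64(seed: int):
    """Yield 64-bit values; used here only to expand one seed into a state."""
    x = seed & MASK64
    while True:
        x = (x + 0x9E3779B97F4A7C15) & MASK64
        z = x
        z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
        z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
        yield z ^ (z >> 31)


class Xoroshiro128StarStar:
    """Illustrative generator; each instance carries its own two-word state."""

    def __init__(self, s0: int, s1: int) -> None:
        if s0 == 0 and s1 == 0:
            raise ValueError("the all-zero state is not allowed")
        self.s0 = s0 & MASK64
        self.s1 = s1 & MASK64

    @classmethod
    def from_seed(cls, seed: int) -> "Xoroshiro128StarStar":
        gen = splitmix64(seed)
        return cls(next(gen), next(gen))

    def next_u64(self) -> int:
        """Return the next 64-bit output and advance the state."""
        s0, s1 = self.s0, self.s1
        result = (rotl((s0 * 5) & MASK64, 7) * 9) & MASK64
        s1 ^= s0
        self.s0 = rotl(s0, 24) ^ s1 ^ ((s1 << 16) & MASK64)
        self.s1 = rotl(s1, 37)
        return result
```

Two generators built from the same seed produce the same stream, and independent instances do not disturb each other. As the commit message notes, this kind of generator is not cryptographically strong.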

  • Portability hack for pg_global_prng_state. PGDLLIMPORT is only appropriate for variables declared in the backend, not when the variable is coming from a library included in frontend code. (This isn't a particularly nice fix, but for now, use the same method employed elsewhere.) Discussion:

  • Simplify declaring variables exported from libpgcommon and libpgport. This reverts commits c2d1eea9e and 11b500072, as well as similar hacks elsewhere, in favor of setting up the PGDLLIMPORT macro so that it can just be used unconditionally. That can work because in frontend code, we need no marking in either the defining or consuming files for a variable exported from these libraries; and frontend code has no need to access variables exported from the core backend, either. While at it, write some actual documentation about the PGDLLIMPORT and PGDLLEXPORT macros. Patch by me, based on a suggestion from Robert Haas. Discussion:

  • Doc: improve documentation about ORDER BY in matviews. Remove the confusing use of ORDER BY in an example materialized view. It adds nothing to the example, but might encourage people to follow bad practice. Clarify REFRESH MATERIALIZED VIEW's note about whether view ordering is retained (it isn't). Maciek Sakrejda Discussion:

  • Cope with cross-compiling when checking for a random-number source. Commit 16f96c74d neglected to consider the possibility of cross-compiling, causing cross-compiles to fail at the configure stage unless you'd selected --with-openssl. Since we're now more or less assuming that /dev/urandom is available everywhere, it seems reasonable to assume that the cross-compile target has it too, rather than failing. Per complaint from Vincas Dargis. Back-patch to v14 where this came in. Discussion:

  • psql: include intra-query "--" comments in what's sent to the server. psql's lexer has historically deleted dash-dash (single-line) comments from what's collected and sent to the server. This is inconsistent with what it does for slash-star comments, and people have complained before that they wish such comments would be captured in the server log. Undoing the decision completely seems like too big a behavioral change, however. In particular, comments on lines preceding the start of a query are generally not thought of as being part of that query. What we can do to improve the situation is to capture comments that are clearly within a query, that is after the first non-whitespace, non-comment token but before the query's ending semicolon or backslash command. This is a nearly trivial code change, and it affects only a few regression test results. (It is tempting to try to apply the same rule to slash-star comments. But it's hard to see how to do that without getting strange history behavior for comments that cross lines, especially if the user then starts a new query on the same line as the star-slash. In view of the lack of complaints, let's leave that case alone.) Discussion:

  • psql: treat "--" comments between queries as separate history entries. If we've not yet collected any non-whitespace, non-comment token for a new query, flush the current input line to history before reading another line. This aligns psql's history behavior with the observation that lines containing only comments are generally not thought of as being part of the next query. psql's prompting behavior is consistent with that view, too, since it won't change the prompt until you enter something that's neither whitespace nor a "--" comment. Greg Nancarrow, simplified a bit by me Discussion:

  • psql: initialize comment-begin setting to a useful value by default. Readline's meta-# command is supposed to insert a comment marker at the start of the current line. However, the default marker is "#" which is entirely unhelpful for SQL. Set it to "-- " instead. (This setting can still be overridden in one's ~/.inputrc file, so this change won't affect people who have already taken steps to make the command useful.) Discussion:

  • Avoid leaking memory during large-scale REASSIGN OWNED BY operations. The various ALTER OWNER routines tend to leak memory in CurrentMemoryContext. That's not a problem when they're only called once per command; but in this usage where we might be touching many objects, it can amount to a serious memory leak. Fix that by running each call in a short-lived context. (DROP OWNED BY likely has a similar issue, except that you'll probably run out of lock table space before noticing. REASSIGN is worth fixing since for most non-table object types, it won't take any lock.) Back-patch to all supported branches. Unfortunately, in the back branches this helps to only a limited extent, since the sinval message queue bloats quite a lot in this usage before commit 3aafc030a, consuming memory more or less comparable to what's actually leaked. Still, it's clearly a leak with a simple fix, so we might as well fix it. Justin Pryzby, per report from Guillaume Lelarge Discussion:

  • Add configure probe for rl_variable_bind(). Some exceedingly ancient readline libraries lack this function, causing commit 3d858af07 to fail. Per buildfarm (via Michael Paquier). Discussion:

  • On Windows, close the client socket explicitly during backend shutdown. It turns out that this is necessary to keep Winsock from dropping any not-yet-sent data, such as an error message explaining the reason for process termination. It's pretty weird that the implicit close done by the kernel acts differently from an explicit close, but it's hard to argue with experimental results. Independently submitted by Alexander Lakhin and Lars Kanis (comments by me, though). Back-patch to all supported branches. Discussion: Discussion:

  • Refactor pg_dump's tracking of object components to be dumped. Split the DumpableObject.dump bitmask field into separate bitmasks tracking which components are requested to be dumped (in the existing "dump" field) and which components exist for the particular object (in the new "components" field). This gets rid of some klugy and easily-broken logic that involved setting bits and later clearing them. More importantly, it restores the originally intended behavior that pg_dump's secondary data-gathering queries should not be executed for objects we have no interest in dumping. That optimization got broken when the dump flag was turned into a bitmask, because irrelevant bits tended to remain set in many cases. Since the "components" field starts from a minimal set of bits and is added onto as needed, ANDing it with "dump" provides a reliable indicator of what we actually have to dump, without having to complicate the logic that manages the request bits. This makes a significant difference in the number of queries needed when, for example, there are many functions in extensions. Discussion: Discussion:
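The two-bitmask idea can be sketched in a few lines of Python. The flag names and the DumpableObject class below are hypothetical stand-ins, not pg_dump's actual definitions. The point is only that "dump" records what was requested, "components" starts minimal and grows as components are discovered, and the AND of the two says what really has to be dumped.

```python
from enum import IntFlag


class Component(IntFlag):
    """Hypothetical component bits; pg_dump's real flag names may differ."""
    NONE = 0
    DEFINITION = 1
    DATA = 2
    COMMENT = 4
    ACL = 8
    ALL = DEFINITION | DATA | COMMENT | ACL


class DumpableObject:
    """Illustrative object with separate 'requested' and 'exists' bitmasks."""

    def __init__(self, name: str, dump: Component = Component.ALL) -> None:
        self.name = name
        self.dump = dump                        # components requested
        self.components = Component.DEFINITION  # components known to exist

    def add_component(self, bit: Component) -> None:
        """Record that this object actually has the given component."""
        self.components |= bit

    def needs(self, bit: Component) -> bool:
        """True only if the component is both requested and present."""
        return bool(self.dump & self.components & bit)
```

Because "components" is only ever added to, no bits need to be set and later cleared, and a secondary query for, say, ACL data can be skipped whenever needs(Component.ACL) is false.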

  • Rethink pg_dump's handling of object ACLs. Throw away most of the existing logic for this, as it was very inefficient thanks to expensive sub-selects executed to collect ACL data that we very possibly would have no interest in dumping. Reduce the ACL handling in the initial per-object-type queries to be just collection of the catalog ACL fields, as it was originally. Fetch pg_init_privs data separately in a single scan of that catalog, and do the merging calculations on the client side. Remove the separate code path used for pre-9.6 source servers; there is no good reason to treat them differently from newer servers that happen to have empty pg_init_privs. Discussion: Discussion:

  • Postpone calls of unsafe server-side functions in pg_dump. Avoid calling pg_get_partkeydef(), pg_get_expr(relpartbound), and regtypeout until we have lock on the relevant tables. The existing coding is at serious risk of failure if there are any concurrent DROP TABLE commands going on --- including drops of other sessions' temp tables. Arguably this is a bug fix that should be back-patched, but it's moderately invasive and we've not had all that many complaints about such failures. Let's just put it in HEAD for now. Discussion: Discussion:

  • Avoid per-object queries in performance-critical paths in pg_dump. Instead of issuing a secondary data-collection query against each table to be dumped, issue just one query, with a WHERE clause restricting it to be applied to only the tables we intend to dump. Likewise for indexes, constraints, and triggers. This greatly reduces the number of queries needed to dump a database containing many tables. It might seem that WHERE clauses listing many target OIDs could be inefficient, but at least on recent server versions this provides a very substantial speedup. (In principle the same thing could be done with other object types such as functions; but that would require significant refactoring of pg_dump, so those will be tackled in a different way in a following patch.) The new WHERE clauses depend on the unnest() function, which is only present in 8.4 and above. We could implement them differently for older servers, but there is an ongoing discussion that will probably result in dropping pg_dump support for servers before 9.2, so that seems like it'd be wasted work. For now, just bump the server version check to require >= 8.4, without stopping to remove any of the code that's thereby rendered dead. We'll mop that situation up soon. Patch by me, based on an idea from Andres Freund. Discussion:

  • Use PREPARE/EXECUTE for repetitive per-object queries in pg_dump. For objects such as functions, pg_dump issues the same secondary data-collection query against each object to be dumped. This can't readily be refactored to avoid the repetitive queries, but we can PREPARE these queries to reduce planning costs. This patch applies the idea to functions, aggregates, operators, and data types. While it could be carried further, the remaining sorts of objects aren't likely to appear in typical databases enough times to be worth worrying over. Moreover, doing the PREPARE is likely to be a net loss if there aren't at least some dozens of objects to apply the prepared query to. Discussion:

  • Account for TOAST data while scheduling parallel dumps. In parallel mode, pg_dump tries to order the table-data-dumping jobs with the largest tables first. However, it was only consulting the pg_class.relpages value to determine table size. This ignores TOAST data, and so we could make poor scheduling decisions in cases where some large tables are mostly TOASTed data while others have very little. To fix, add in the relpages value for the TOAST table as well. This patch also fixes a potential integer-overflow issue that could result in poor scheduling on machines where off_t is only 32 bits wide. Such platforms are probably extinct in the wild, but we do still nominally support them, so repair. Per complaint from Hans Buschmann. Discussion:
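A small Python sketch of the scheduling rule described in this commit, using made-up table data. TableInfo and schedule_largest_first are illustrative names, not pg_dump code. Sizes are the sum of the heap's relpages and the TOAST table's relpages, and jobs are ordered largest first.

```python
from typing import List, NamedTuple


class TableInfo(NamedTuple):
    """Illustrative per-table size data."""
    name: str
    relpages: int        # pages in the main heap
    toast_relpages: int  # pages in the associated TOAST table, 0 if none


def total_pages(table: TableInfo) -> int:
    """Size estimate that counts TOAST data as well as the heap."""
    return table.relpages + table.toast_relpages


def schedule_largest_first(tables: List[TableInfo]) -> List[TableInfo]:
    """Order table-data jobs so the biggest tables start first."""
    return sorted(tables, key=total_pages, reverse=True)


tables = [
    TableInfo("orders", relpages=5000, toast_relpages=0),
    TableInfo("documents", relpages=200, toast_relpages=90000),
    TableInfo("settings", relpages=10, toast_relpages=0),
]
order = [t.name for t in schedule_largest_first(tables)]
```

With heap pages alone, "documents" would be scheduled after "orders" even though it holds far more data. Python integers do not overflow, so the integer-width issue mentioned in the commit does not arise in this sketch.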

  • On Windows, also call shutdown() while closing the client socket. Further experimentation shows that commit 6051857fc is not sufficient when using (some versions of?) OpenSSL. The reason is obscure, but calling shutdown(socket, SD_SEND) improves matters. Per testing by Andrew Dunstan and Alexander Lakhin. Back-patch as before. Discussion:

  • Doc: improve xfunc-c-type-table. List types numeric and timestamptz, which don't seem to have ever been included here. Restore bigint, which was no-doubt-accidentally deleted in v12. Fix some errors, or at least obsolete usages (nobody declares float arguments as "float8*" anymore, even though they might be that under the hood). Re-alphabetize. Remove the seeming claim that this is a complete list of built-in types. Per question from Oskar Stenberg. Discussion:

  • Create a new type category for "internal use" types. Historically we've put type "char" into the S (String) typcategory, although calling it a string is a stretch considering it can only store one byte. (In our actual usage, it's more like an enum.) This choice now seems wrong in view of the special heuristics that parse_func.c and parse_coerce.c have for TYPCATEGORY_STRING: it's not a great idea for "char" to have those preferential casting behaviors. Worse than that, recent patches inventing special-purpose types like pg_node_tree have assigned typcategory S to those types, meaning they also get preferential casting treatment that's designed on the assumption that they can hold arbitrary text. To fix, invent a new category TYPCATEGORY_INTERNAL for internal-use types, and assign that to all these types. I used code 'Z' for lack of a better idea ('I' was already taken). This change breaks one query in psql/describe.c, which now needs to explicitly cast a catalog "char" column to text before concatenating it with an undecorated literal. Also, a test case in contrib/citext now needs an explicit cast to convert citext to "char". Since the point of this change is to not have "char" be a surprisingly-available cast target, these breakages seem OK. Per report from Ian Campbell. Discussion:

  • Implement poly_distance(). geo_ops.c contains half a dozen functions that are just stubs throwing ERRCODE_FEATURE_NOT_SUPPORTED. Since it's been like that for more than twenty years, there's clearly not a lot of interest in filling in the stubs. However, I'm uncomfortable with deleting poly_distance(), since every other geometric type supports a distance-to-another-object-of-the-same-type function. We can easily add this capability by cribbing from poly_overlap() and path_distance(). It's possible that the (existing) test case for this will show some numeric instability, but hopefully the buildfarm will expose it if so. In passing, improve the documentation to try to explain why polygons are distinct from closed paths in the first place. Discussion:

  • Doc: de-document unimplemented geometric operators. In commit 791090bd7, I made an effort to fill in documentation for all geometric operators listed in pg_operator. However, it now appears that at least some of the omissions may have been intentional, because some of those operator entries point at unimplemented stub functions. Remove those from the docs again. (In HEAD, poly_distance stays, because c5c192d7b just added an implementation for it.) Per complaint from Anton Voloshin. Discussion:

  • Remove unimplemented/undocumented geometric functions & operators. Nobody has filled in these stubs for upwards of twenty years, so it's time to drop the idea that they might get implemented any day now. The associated pg_operator and pg_proc entries are just confusing wastes of space. Per complaint from Anton Voloshin. Discussion:

  • Fix datatype confusion in logtape.c's right_offset(). This could only matter if (a) long is wider than int, and (b) the heap of free blocks exceeds UINT_MAX entries, which seems pretty unlikely. Still, it's a theoretical bug, so backpatch to v13 where the typo came in (in commit c02fdc922). In passing, also make swap_nodes() use consistent datatypes. Ma Liangzhu Discussion:

  • Improve sift up/down code in binaryheap.c and logtape.c. Borrow the logic that's long been used in tuplesort.c: instead of physically swapping the data in two heap entries, keep the value that's being sifted up or down in a local variable, and just move the other values as necessary. This makes the code shorter as well as faster. It's not clear that any current callers are really time-critical enough to notice, but we might as well code heap maintenance the same way everywhere. Ma Liangzhu and Tom Lane Discussion:
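The technique described in this commit, holding the moving value in a local variable and shifting other entries into the hole instead of swapping pairs, can be sketched for a plain min-heap in Python. This is an illustration of the general idea, not a translation of binaryheap.c or logtape.c, and the function names are made up.

```python
def sift_up(heap, i):
    """Move heap[i] toward the root, shifting parents down into the hole."""
    value = heap[i]
    while i > 0:
        parent = (i - 1) // 2
        if heap[parent] <= value:
            break
        heap[i] = heap[parent]  # one assignment instead of a full swap
        i = parent
    heap[i] = value


def sift_down(heap, i, n):
    """Move heap[i] toward the leaves within the first n entries."""
    value = heap[i]
    while True:
        child = 2 * i + 1  # left child
        if child >= n:
            break
        if child + 1 < n and heap[child + 1] < heap[child]:
            child += 1     # right child is smaller
        if value <= heap[child]:
            break
        heap[i] = heap[child]
        i = child
    heap[i] = value


def heap_push(heap, item):
    """Append an item and restore the heap property."""
    heap.append(item)
    sift_up(heap, len(heap) - 1)


def heap_pop(heap):
    """Remove and return the smallest item."""
    top = heap[0]
    last = heap.pop()
    if heap:
        heap[0] = last
        sift_down(heap, 0, len(heap))
    return top
```

Each loop iteration writes one heap slot rather than two, and the sifted value is stored exactly once at the end.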

  • Remove pg_dump/pg_dumpall support for dumping from pre-9.2 servers. Per discussion, we'll limit support for old servers to those branches that can still be built easily on modern platforms, which as of now is 9.2 and up. Remove over a thousand lines of code dedicated to dumping from older server versions. (As in previous changes of this sort, we aren't removing pg_restore's ability to read older archive files ... though it's fair to wonder how that might be tested nowadays.) This cleans up some dead code left behind by commit 989596152. Discussion:

  • Remove pg_upgrade support for upgrading from pre-9.2 servers. Per discussion, we'll limit support for old servers to those branches that can still be built easily on modern platforms, which as of now is 9.2 and up. Discussion:

  • Remove pg_dump's --no-synchronized-snapshots switch. Server versions for which there was a plausible reason to use this switch are all out of support now. Leaving it around would accomplish little except to let careless DBAs shoot themselves in the foot. Discussion:

  • Always use ReleaseTupleDesc after lookup_rowtype_tupdesc et al. The API spec for lookup_rowtype_tupdesc previously said you could use either ReleaseTupleDesc or DecrTupleDescRefCount. However, the latter choice means the caller must be certain that the returned tupdesc is refcounted. I don't recall right now whether that was always true when this spec was written, but it's certainly not always true since we introduced shared record typcaches for parallel workers. That means that callers using DecrTupleDescRefCount are dependent on typcache behavior details that they probably shouldn't be. Hence, change the API spec to say that you must call ReleaseTupleDesc, and fix the half-dozen callers that weren't. AFAICT this is just future-proofing, there's no live bug here. So no back-patch. Per gripe from Chapman Flack. Discussion:

  • Clean up some more freshly-dead code in pg_dump and pg_upgrade. I missed a few things in 30e7c175b and e469f0aaf, as noted by Justin Pryzby. Discussion:

  • Remove psql support for server versions preceding 9.2. Per discussion, we'll limit support for old servers to those branches that can still be built easily on modern platforms, which as of now is 9.2 and up. Aside from removing code that is dead per the assumption of server >= 9.2, I tweaked the startup warning for unsupported versions to complain about too-old servers as well as too-new ones. The warning that "Some psql features might not work" applies precisely to both cases. Discussion:

  • Ensure casting to typmod -1 generates a RelabelType. Fix the code changed by commit 5c056b0c2 so that we always generate RelabelType, not something else, for a cast to unspecified typmod. Otherwise planner optimizations might not happen. It appears we missed this point because the previous experiments were done on type numeric: the parser undesirably generates a call on the numeric() length-coercion function, but then numeric_support() optimizes that down to a RelabelType, so that everything seems fine. It misbehaves for types that have a non-optimized length coercion function, such as bpchar. Per report from John Naylor. Back-patch to all supported branches, as the previous patch eventually was. Unfortunately, that no longer includes 9.6 ... we really shouldn't put this type of change into a nearly-EOL branch. Discussion:

  • Fix the public schema's permissions in a separate test script. In the wake of commit b073c3ccd, it's necessary to grant create permissions on the public schema to PUBLIC to get many of the core regression test scripts to pass. That commit did so via the quick-n-dirty expedient of adding the GRANT to the tablespace test, which runs first. This is problematic for single-machine replication testing, though. The least painful way to run the regression tests on such a setup is to skip the tablespace test, and that no longer works. To fix, let's invent a separate "test_setup" script to run first, and put the GRANT there. Revert b073c3ccd's changes to the tablespace.source files. In the future it might be good to try to reduce coupling between the various test scripts by having test_setup create widely-used objects, with the goal that most of the scripts could run after having run only test_setup. That's going to take some effort, so this commit just addresses my immediate pain point. Discussion:

  • Remove some more dead code in pg_dump. Coverity complained that parts of dumpFunc() and buildACLCommands() were now unreachable, as indeed they are. Remove 'em. In passing, make dumpFunc's handling of protrftypes less gratuitously different from other fields.
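
The ReleaseTupleDesc item above turns on the difference between an unconditional reference-count decrement and a release that first checks whether the descriptor is reference-counted at all. The following is a minimal, self-contained sketch of that idea, not PostgreSQL code: ToyTupleDesc, toy_decr_refcount and toy_release are hypothetical names, and the convention that a negative count means "not reference-counted" is an assumption made for this toy model.

```c
#include <assert.h>

/* Toy stand-in for a tuple descriptor; NOT PostgreSQL's real structure. */
typedef struct ToyTupleDesc
{
    int refcount;   /* assumed convention: -1 means "not reference-counted" */
} ToyTupleDesc;

/* Unconditional decrement: only valid when the descriptor is refcounted. */
static void
toy_decr_refcount(ToyTupleDesc *desc)
{
    assert(desc->refcount > 0);
    desc->refcount--;
}

/*
 * Safe release: decrements only when the descriptor is refcounted, and is a
 * no-op otherwise, so the caller need not know which kind it was handed.
 */
static void
toy_release(ToyTupleDesc *desc)
{
    if (desc->refcount >= 0)
        toy_decr_refcount(desc);
}
```

In this model, calling toy_release on a descriptor whose count is -1 leaves it untouched, whereas calling toy_decr_refcount on it would trip the assertion. That is the kind of dependence on the lookup's internals that the commit message says callers should avoid.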

Peter Geoghegan pushed:

  • vacuumlazy.c: Rename dead_tuples to dead_items. Commit 8523492d simplified what it meant for an item to be considered "dead" to VACUUM: TIDs collected in memory (in preparation for index vacuuming) must always come from LP_DEAD stub line pointers in heap pages, found following pruning. This formalized the idea that index vacuuming (and heap vacuuming) are optional processes. Unlike pruning, they can be delayed indefinitely, without any risk of that violating fundamental invariants. For example, leaving LP_DEAD items behind clearly won't add to the risk of transaction ID wraparound. You can't have transaction ID wraparound without transaction IDs. Renaming anything that references DEAD tuples (tuples with storage) reinforces all this. Code outside vacuumlazy.c continues to fudge the distinction between dead/deleted tuples, and LP_DEAD items. This is necessary because autovacuum scheduling is still mostly driven by "dead items/tuples" statistics. In the future we may find it useful to replace this model with something more sophisticated, as a step towards teaching autovacuum to perform more frequent vacuuming that targets individual indexes that happen to be more prone to becoming bloated through version churn. In passing, simplify some function signatures that deal with VACUUM's dead_items array. Author: Peter Geoghegan Reviewed-By: Masahiko Sawada Discussion:

  • vacuumlazy.c: fix remaining "dead tuple" references. Oversight in commit 4f8d9d12. Reported-By: Masahiko Sawada Discussion:

  • Standardize cleanup lock terminology. The term "super-exclusive lock" is a synonym for "buffer cleanup lock" that first appeared in nbtree many years ago. Standardize things by consistently using the term cleanup lock. This finishes work started by commit 276db875. There is no good reason to have two terms. But there is a good reason to only have one: to avoid confusion around why VACUUM acquires a full cleanup lock (not just an ordinary exclusive lock) in index AMs, during ambulkdelete calls. This has nothing to do with protecting the physical index data structure itself. It is needed to implement a locking protocol that ensures that TIDs pointing to the heap/table structure cannot get marked for recycling by VACUUM before it is safe (which is somewhat similar to how VACUUM uses cleanup locks during its first heap pass). Note that it isn't strictly necessary for index AMs to implement this locking protocol -- several index AMs use an MVCC snapshot as their sole interlock to prevent unsafe TID recycling. In passing, update the nbtree README. Cleanly separate discussion of the aforementioned index vacuuming locking protocol from discussion of the "drop leaf page pin" optimization added by commit 2ed5b87f. We now structure discussion of the latter by describing how individual index scans may safely opt out of applying the standard locking protocol (and so can avoid blocking progress by VACUUM). Also document why the optimization is not safe to apply during nbtree index-only scans. Author: Peter Geoghegan Discussion: Discussion:

  • Document that tar archives are now properly terminated. Commit 5a1007a5088cd6ddf892f7422ea8dbaef362372f changed the server behavior, but I didn't notice that the existing behavior was documented, and therefore did not update the documentation. This commit does that. I chose to mention that the behavior has changed rather than just removing the reference to a deviation from a standard. It seemed like that might be helpful to tool authors. Discussion:

  • Default to log_checkpoints=on, log_autovacuum_min_duration=10m. The idea here is that when a performance problem is known to have occurred at a certain point in time, it's a good thing if there is some information available from the logs to help figure out what might have happened around that time. This change attracted an above-average amount of dissent, because it means that a server with default settings will produce some amount of log output even if nothing has gone wrong. However, by my count, the mailing list discussion had about twice as many people in favor of the change as opposed. The reasons for believing that the extra log output is not an issue in practice are: (1) the rate at which messages can be generated by this setting is bounded to one every few minutes on a properly-configured system and (2) production systems tend to have a lot more junk in the log from that due to failed connection attempts, ERROR messages generated by application activity, and the like. Bharath Rupireddy, reviewed by Fujii Masao and by me. Many other people commented on the thread, but as far as I can see that was discussion of the merits of the change rather than review of the patch. Discussion:

  • Remove InitXLOGAccess(). It's not great that RecoveryInProgress() calls InitXLOGAccess(), because a status inquiry function typically shouldn't have the side effect of performing initializations. We could fix that by calling InitXLOGAccess() from some other place, but instead, let's remove it altogether. One thing InitXLOGAccess() did is initialize wal_segment_size, but it doesn't need to do that. In the postmaster, PostmasterMain() calls LocalProcessControlFile(), and all child processes will inherit that value -- except in EXEC_BACKEND builds, but then each backend runs SubPostmasterMain() which also calls LocalProcessControlFile(). The other thing InitXLOGAccess() did is update RedoRecPtr and doPageWrites, but that's not critical, because all code that uses them will just retry if it turns out that they've changed. The only difference is that most code will now see an initial value that is definitely invalid instead of one that might have just been way out of date, but that will only happen once per backend lifetime, so it shouldn't be a big deal. Patch by me, reviewed by Nathan Bossart, Michael Paquier, Andres Freund, Heikki Linnakangas, and Álvaro Herrera. Discussion:

  • postgres_fdw: Fix unexpected reporting of empty message. pgfdw_report_error() in postgres_fdw gets a message from PGresult or PGconn to report an error received from a remote server. Previously, if it could get a message from neither of them, it unexpectedly reported an empty message. The cause of this issue was that pgfdw_report_error() didn't properly handle the case where no message could be obtained and its local variable message_primary was set to '\0'. This commit improves pgfdw_report_error() so that it reports the message "could not obtain ..." when it gets no message and message_primary is set to '\0'. This is the same behavior as when message_primary is NULL. dblink_res_error() in dblink has the same issue, so this commit also improves it in the same way. Back-patch to all supported branches. Author: Fujii Masao Reviewed-by: Bharath Rupireddy Discussion:

  • postgres_fdw: Report warning when timeout expires while getting query result. When aborting a remote transaction or sending a cancel request to a remote server, postgres_fdw calls pgfdw_get_cleanup_result() to wait for the result of the transaction abort query or cancel request to arrive. It fails to get the result if the timeout expires or a connection problem occurs. Previously, postgres_fdw reported no warning message even when the timeout expired or a connection problem occurred in pgfdw_get_cleanup_result(). This could make troubleshooting harder when such an event occurred. This commit makes pgfdw_get_cleanup_result() tell its caller which problem (timeout or connection error) occurred, on failure, and also makes its caller report the proper warning message based on that information. Author: Fujii Masao Reviewed-by: Bharath Rupireddy Discussion:

  • doc: Add note about postgres_fdw.application_name. postgres_fdw.application_name can be any string of any length and contain even non-ASCII characters. However when it's passed to and used as application_name in a foreign server, it's truncated to less than NAMEDATALEN characters and any characters other than printable ASCII ones in it will be replaced with question marks. This commit adds these notes into the docs. Author: Hayato Kuroda Reviewed-by: Kyotaro Horiguchi, Fujii Masao Discussion:
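
The truncation and replacement described in the postgres_fdw.application_name note can be illustrated with a short sketch. This is not the server's implementation: toy_clean_application_name and TOY_NAMEDATALEN are hypothetical names, the limit of 64 assumes a default build, and the sketch works byte by byte, so a multibyte character would become several question marks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed limit for illustration; a default PostgreSQL build uses 64. */
#define TOY_NAMEDATALEN 64

/*
 * Copy src into dst, which must have room for TOY_NAMEDATALEN bytes.
 * Keeps at most TOY_NAMEDATALEN - 1 bytes and replaces every byte outside
 * printable ASCII (0x20..0x7E) with '?'. The result is NUL-terminated.
 */
static void
toy_clean_application_name(char *dst, const char *src)
{
    size_t limit = (size_t) (TOY_NAMEDATALEN - 1);
    size_t i;

    for (i = 0; i < limit && src[i] != '\0'; i++)
    {
        unsigned char c = (unsigned char) src[i];

        dst[i] = (c >= 0x20 && c <= 0x7E) ? (char) c : '?';
    }
    dst[i] = '\0';
}
```

Under these assumptions, an input containing a tab, such as "app\tname", comes out as "app?name", and any input longer than 63 bytes is cut to 63 bytes.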

