
PostgreSQL Weekly News - August 22, 2021

Posted on 2021-08-23 by PWN

Person of the week

PostgreSQL Product News

orafce_mail, a utility similar to Oracle's DBMS_MAIL, released.

PostgreSQL Jobs for August

https://archives.postgresql.org/pgsql-jobs/2021-08/

PostgreSQL in the News

Planet PostgreSQL: https://planet.postgresql.org/

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm PST8PDT to david@fetter.org.

Applied Patches

Michaël Paquier pushed:

Refresh apply delay on reload of recovery_min_apply_delay at recovery. This commit ensures that the wait interval in the replay delay loop waiting for an amount of time defined by recovery_min_apply_delay is correctly handled on reload, recalculating the delay if this GUC value is updated, based on the timestamp of the commit record being replayed. The previous behavior would be problematic for example with replay still waiting even if the delay got reduced or just cancelled. If the apply delay was increased to a larger value, the wait would have just respected the old value set, finishing earlier. Author: Soumyadeep Chakraborty, Ashwin Agrawal Reviewed-by: Kyotaro Horiguchi, Michael Paquier Discussion: https://postgr.es/m/CAE-ML+93zfr-HLN8OuxF0BjpWJ17O5dv1eMvSE5jsj9jpnAXZA@mail.gmail.com Backpatch-through: 9.6 https://git.postgresql.org/pg/commitdiff/e4ba1005c0f7a95e3252f38aee02426117b8e12b
Revert refactoring of hex code to src/common/. This is a combined revert of the following commits: - c3826f8, a refactoring piece that moved the hex decoding code to src/common/. This code was cleaned up by aef8948, as it originally included no overflow checks in the same way as the base64 routines in src/common/ used by SCRAM, making it unsafe for its purpose. - aef8948, a more advanced refactoring of the hex encoding/decoding code to src/common/ that added sanity checks on the result buffer for hex decoding and encoding. As reported by Hans Buschmann, those overflow checks are expensive, and it is possible to see a performance drop in the decoding/encoding of bytea or LOs the longer they are. Simple SQLs working on large bytea values show a clear difference in perf profile. - ccf4e27, a cleanup made possible by aef8948. The reverts of all those commits bring the performance of hex decoding and encoding back to what it was in ~13. For now and post-beta3, this is the simplest option. Reported-by: Hans Buschmann Discussion: https://postgr.es/m/1629039545467.80333@nidsa.net Backpatch-through: 14 https://git.postgresql.org/pg/commitdiff/2576dcfb76aa71e4222bac5a3a43f71875bfa9e8
Improve performance of float overflow checks in btree_gist. The current code could do unnecessary calls to isinf() (two for the argument values all the time while one could be sufficient in some cases). zero_is_valid was never used but the result value was still checked on 0 in the first position of the check. This is similar to 607f8ce. btree_gist has just copy-pasted the code doing those checks from the backend float4/8 code, as of the macro CHECKFLOATVAL(), to do the work. Author: Haiying Tang Discussion: https://postgr.es/m/OS0PR01MB611358E3A7BC3C2F874AC36BFBF39@OS0PR01MB6113.jpnprd01.prod.outlook.com https://git.postgresql.org/pg/commitdiff/32cf7f7acce3891cbc3de53327704372bdd36d38
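The idea behind the btree_gist float overflow change can be sketched as follows. This is a minimal illustration, not the btree_gist or backend code: the function name add_float8_checked is hypothetical. It tests the result first and looks at the inputs only when the result is infinite, so the common finite case costs a single isinf() call.

```c
#include <math.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not PostgreSQL's actual code): add two doubles and
 * report overflow. The inputs are inspected only when the result is
 * infinite, so a finite result needs just one isinf() call.
 */
bool
add_float8_checked(double a, double b, double *result)
{
    double sum = a + b;

    /* Overflow means: infinite result produced from finite inputs. */
    if (isinf(sum) && !isinf(a) && !isinf(b))
        return false;

    *result = sum;
    return true;
}
```

A caller would treat a false return as the overflow error case. An infinite input is passed through as a legitimate infinite result.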

Daniel Gustafsson pushed:

Clarify initdb --sync-only help message and docs. The initdb help message for --sync-only was a bit terse, and not really self-explanatory. Make it clearer that initdb --sync-only will exit after syncing, and expand the docs with a note on when the option can be useful. Also align the help output with others that exit immediately. Author: Nathan Bossart, Gurjeet Singh Discussion: https://postgr.es/m/CABwTF4U6hbNNE1bv=LxQdJybmUdZ5NJQ9rKY9tN82NXM8QH+iQ@mail.gmail.com https://git.postgresql.org/pg/commitdiff/ea499f3d28c657a044f0a948e6b77ac56f28a8f6
Emit namespace in the post-copy errmsg. During a VACUUM or CLUSTER command, the initial output emits a fully qualified relation path with namespace. The post-action errmsg only emitted the relation name however, which may lead to hard to parse output when using multiple jobs with vacuumdb as the output from different jobs may be interleaved. Include the full path in the post-action errmsg to be consistent with the initial errmsg. Author: Mike Fiedler miketheman@gmail.com Reviewed-by: Corey Huinker corey.huinker@gmail.com Discussion: https://postgr.es/m/CAMerE0oz+8G-aORZL_BJcPxnBqewZAvND4bSUysjz+r-oT1BxQ@mail.gmail.com https://git.postgresql.org/pg/commitdiff/069d33d0c5a021601245e44df77a0423ddd69359
Set type identifier on BIO. In OpenSSL there are two types of BIO's (I/O abstractions): source/sink and filters. A source/sink BIO is a source and/or sink of data, ie one acting on a socket or a file. A filter BIO takes a stream of input from another BIO and transforms it. In order for BIO_find_type() to be able to traverse the chain of BIO's and correctly find all BIO's of a certain type they shall have the type bit set accordingly, source/sink BIO's (what PostgreSQL implements) use BIO_TYPE_SOURCE_SINK and filter BIO's use BIO_TYPE_FILTER. In addition to these, file descriptor based BIO's should have the descriptor bit set, BIO_TYPE_DESCRIPTOR. The PostgreSQL implementation didn't set the type bits, which went unnoticed for a long time as it's only really relevant for code auditing the OpenSSL installation, or doing similar tasks. It is required by the API though, so this fixes it. Backpatch through 9.6 as this has been wrong for a long time. Author: Itamar Gafni Discussion: https://postgr.es/m/SN6PR06MB39665EC10C34BB20956AE4578AF39@SN6PR06MB3966.namprd06.prod.outlook.com Backpatch-through: 9.6 https://git.postgresql.org/pg/commitdiff/31f860a52bf97b898d8af6333b23869f1bbac17e
Fix pg_amcheck --skip option parameter handling. The skip options set for all-visible and all-frozen were incorrect as they used space rather than hyphen, causing a syntax error when invoked. Also, the option for not skipping any pages at all, none, was documented but not implemented. Backpatch through 14 where pg_amcheck was introduced. Bug: #17149 Reported-by: Chen Jiaoqian chenjq.jy@fujitsu.com Reviewed-by: Masahiko Sawada sawada.mshk@gmail.com Discussion: https://postgr.es/m/17149-5918ea748da36b15@postgresql.org Backpatch-through: 14 https://git.postgresql.org/pg/commitdiff/500256d953444628164f0b77ef1ce8c9e05e575f
Doc: Fix typo in logical decoding example. Fixes one occurrence of "atleast" in the logical decoding example section. Discussion: https://postgr.es/m/5467E625-1369-48CF-BE62-3BB69395C1F1@yesql.se https://git.postgresql.org/pg/commitdiff/76987bad3380be862ea3cc36d1709134be126150
Remove --quiet option from pg_amcheck. Using --quiet in combination with --no-strict-names didn't work as documented, a warning message was still emitted. Since the --quiet flag was working in an unconventional way compared to other utilities, fix by removing the functionality instead. Backpatch through 14 where pg_amcheck was introduced. Bug: #17148 Reported-by: Chen Jiaoqian chenjq.jy@fujitsu.com Reviewed-by: Julien Rouhaud rjuju123@gmail.com Discussion: https://postgr.es/m/17148-b5087318e2b04fc6@postgresql.org Backpatch-through: 14 https://git.postgresql.org/pg/commitdiff/9a9c8b92018d4d48f93cd8fa1895c53fa5946d75

John Naylor pushed:

Use direct function calls for pg_popcount{32,64} on non-x86 platforms. Previously, all pg_popcount{32,64} calls were indirected through a function pointer, even though we had no fast implementation for non-x86 platforms. Instead, for those platforms use wrappers around the pg_popcount{32,64}_slow functions. Review and additional hacking by David Rowley Reviewed by Álvaro Herrera Discussion: https://www.postgresql.org/message-id/flat/CAFBsxsE7otwnfA36Ly44zZO%2Bb7AEWHRFANxR1h1kxveEV%3DghLQ%40mail.gmail.com https://git.postgresql.org/pg/commitdiff/4864c8e8f184a35ed1c2c51a15e9a455e9fbb398
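The difference between an indirected call and a direct wrapper can be sketched as below. The names popcount32 and popcount32_slow are illustrative stand-ins, and the bit-clearing loop is a generic portable fallback rather than PostgreSQL's own slow implementation. The platform test is likewise an assumption made for the sketch.

```c
#include <stdint.h>

/* Portable fallback: count set bits by clearing the lowest one each pass. */
int
popcount32_slow(uint32_t word)
{
    int count = 0;

    while (word != 0)
    {
        word &= word - 1;       /* clear the lowest set bit */
        count++;
    }
    return count;
}

#if defined(__x86_64__) || defined(_M_X64)
/* x86: keep a function pointer so a faster routine could be chosen at run time. */
int (*popcount32) (uint32_t word) = popcount32_slow;
#else
/* Other platforms: a plain wrapper, callable without any indirection. */
int
popcount32(uint32_t word)
{
    return popcount32_slow(word);
}
#endif
```

Callers write popcount32(x) in both cases; only the x86 build pays for the pointer dereference.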

Tom Lane pushed:

Reduce memory consumption for pending invalidation messages. The existing data structures in inval.c are fairly inefficient for the common case of a command or subtransaction that registers a small number of cache invalidation events. While this doesn't matter if we commit right away, it can build up to a lot of bloat in a transaction that contains many DDL operations. By making a few more assumptions about the expected use-case, we can switch to a representation using densely-packed arrays. Although this eliminates some data-copying, it doesn't seem to make much difference time-wise. But the space consumption decreases substantially. Patch by me; thanks to Nathan Bossart for review. Discussion: https://postgr.es/m/2380555.1622395376@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/3aafc030a53621e91be2e7c1c72b5f3e8b103486
Improve regex compiler's arc moving/copying logic. The functions moveins(), copyins(), moveouts(), copyouts() are required to preserve the invariant that there are no duplicate arcs in the regex's NFA. Spencer's original implementation of them was O(N^2) since it checked separately for a match to each source arc. In commit 579840ca0 I improved that by adding sort/merge logic to be used if more than a few arcs are to be moved/copied. However, I now realize that that missed a bet. At many call sites, the target state is newly made and cannot have any existing in-arcs (respectively out-arcs) that could be duplicates. So spending any cycles at all on checking for duplicates is wasted effort; in these cases we can just blindly move/copy all the source arcs. Add code paths to do that. It turns out that for copyins()/copyouts(), all the call sites have this property, making all the "improved" logic in them flat out unreachable. Perhaps we'll need the full capability again someday, so I just #ifdef'd those paths out rather than removing them entirely. In passing, add a few test cases to improve code coverage in this area as well as in regc_locale.c/regc_pg_locale.c. Discussion: https://postgr.es/m/810272.1629064063@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/78a843f119ca7d4a6eb173a7ee3bed45889d48d8
Reduce assumptions about locale's behavior in new regex tests. I was overoptimistic to assume that UTF8-based locales would all consider U+1500 to be a member of the [[:alpha:]] char class. Tweak the test cases added by commit 78a843f11 to avoid that assumption. We might need to lobotomize them further, but this should be enough to fix the early buildfarm reports. https://git.postgresql.org/pg/commitdiff/b66336c4e1af0e8eae520623e4b018251807b0bb
Prevent ALTER TYPE/DOMAIN/OPERATOR from changing extension membership. If recordDependencyOnCurrentExtension is invoked on a pre-existing, free-standing object during an extension update script, that object will become owned by the extension. In our current code this is possible in three cases: * Replacing a "shell" type or operator. * CREATE OR REPLACE overwriting an existing object. * ALTER TYPE SET, ALTER DOMAIN SET, and ALTER OPERATOR SET. The first of these cases is intentional behavior, as noted by the existing comments for GenerateTypeDependencies. It seems like appropriate behavior for CREATE OR REPLACE too; at least, the obvious alternatives are not better. However, the fact that it happens during ALTER is an artifact of trying to share code (GenerateTypeDependencies and makeOperatorDependencies) between the CREATE and ALTER cases. Since an extension script would be unlikely to ALTER an object that didn't already belong to the extension, this behavior is not very troubling for the direct target object ... but ALTER TYPE SET will recurse to dependent domains, and it is very uncool for those to become owned by the extension if they were not already. Let's fix this by redefining the ALTER cases to never change extension membership, full stop. We could minimize the behavioral change by only changing the behavior when ALTER TYPE SET is recursing to a domain, but that would complicate the code and it does not seem like a better definition. Per bug #17144 from Alex Kozhemyakin. Back-patch to v13 where ALTER TYPE SET was added. (The other cases are older, but since they only affect the directly-named object, there's not enough of a problem to justify changing the behavior further back.) Discussion: https://postgr.es/m/17144-e67d7a8f049de9af@postgresql.org https://git.postgresql.org/pg/commitdiff/6b71c925cb817f79cb0d389edacdd033efaa301d
Fix check_agg_arguments' examination of aggregate FILTER clauses. Recursion into the FILTER clause was mis-implemented, such that a relevant Var or Aggref at the very top of the FILTER clause would be ignored. (Of course, that'd have to be a plain boolean Var or boolean-returning aggregate.) The consequence would be mis-identification of the correct semantic level of the aggregate, which could lead to not-per-spec query behavior. If the FILTER expression is an aggregate, this could also lead to failure to issue an expected "aggregate function calls cannot be nested" error, which would likely result in a core dump later on, since the planner and executor aren't expecting such cases to appear. The root cause is that commit b560ec1b0 blindly copied some code that assumed it's recursing into a List, and thus didn't examine the top-level node. To forestall questions about why this call doesn't look like the others, as well as possible future copy-and-paste mistakes, let's change all three check_agg_arguments_walker calls in check_agg_arguments, even though only the one for the filter clause is really broken. Per bug #17152 from Zhiyong Wu. This has been wrong since we implemented FILTER, so back-patch to all supported versions. (Testing suggests that pre-v11 branches manage to avoid crashing in the bad-Aggref case, thanks to "redundant" checks in ExecInitAgg. But I'm not sure how thorough that protection is, and anyway the wrong-behavior issue remains, so fix 9.6 and 10 too.) Discussion: https://postgr.es/m/17152-c7f906cc1a88e61b@postgresql.org https://git.postgresql.org/pg/commitdiff/2313dda9d493d3685ac7328b49dc6f5a87c1c295
Avoid trying to lock OLD/NEW in a rule with FOR UPDATE. transformLockingClause neglected to exclude the pseudo-RTEs for OLD/NEW when processing a rule's query. This led to odd errors or even crashes later on. This bug is very ancient, but it's not terribly surprising that nobody noticed, since the use-case for SELECT FOR UPDATE in a non-view rule is somewhere between thin and non-existent. Still, crashing is not OK. Per bug #17151 from Zhiyong Wu. Thanks to Masahiko Sawada for analysis of the problem. Discussion: https://postgr.es/m/17151-c03a3e6e4ec9aadb@postgresql.org https://git.postgresql.org/pg/commitdiff/8d2d6ec7708b475787fd92a9f828e554805e3df6
Fix performance bug in regexp's citerdissect/creviterdissect. After detecting a sub-match "dissect" failure (i.e., a backref match failure) in the i'th sub-match of an iteration node, we should proceed by adjusting the attempted length of the i'th submatch. As coded, though, these functions changed the attempted length of the last sub-match, and only after exhausting all possibilities for that would they back up to adjust the next-to-last sub-match, and then the second-from-last, etc; all of which is wasted effort, since only changing the start or length of the i'th sub-match can possibly make it succeed. This oversight creates the possibility for exponentially bad performance. Fortunately the problem is masked in most cases by optimizations or constraints applied elsewhere; which explains why we'd not noticed it before. But it is possible to reach the problem with fairly simple, if contrived, regexps. Oversight in my commit 173e29aa5. That's pretty ancient now, so back-patch to all supported branches. Discussion: https://postgr.es/m/1808998.1629412269@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/facce1da918a8bf55a8f54606512f944529cba85
Improve error messages about misuse of SELECT INTO. Improve two places in plpgsql, and one in spi.c, where an error message would confusingly tell you that you couldn't use a SELECT query, when what you had written was a SELECT query. The actual problem is that you can't use SELECT ... INTO in these contexts, but the messages failed to make that apparent. Special-case SELECT INTO to make these errors more helpful. Also, fix the same spots in plpgsql, as well as several messages in exec_eval_expr(), to not quote the entire complained-of query or expression in the primary error message. That behavior very easily led to violating our message style guideline about keeping the primary error message short and single-line. Also, since the important part of the message was after the inserted text, it could make the real problem very hard to see. We can report the query or expression as the first line of errcontext instead. Per complaint from Roger Mason. Back-patch to v14, since (a) some of these messages are new in v14 and (b) v14's translatable strings are still somewhat in flux. The problem's older than that of course, but I'm hesitant to change the behavior further back. Discussion: https://postgr.es/m/1914708.1629474624@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/26ae66090398082c54ce046936fc41633dbfc41e

Álvaro Herrera pushed:

Revert analyze support for partitioned tables. This reverts the following commits: 1b5617eb844cd2470a334c1d2eec66cf9b39c41a Describe (auto-)analyze behavior for partitioned tables 0e69f705cc1a3df273b38c9883fb5765991e04fe Set pg_class.reltuples for partitioned tables 41badeaba8beee7648ebe7923a41c04f1f3cb302 Document ANALYZE storage parameters for partitioned tables 0827e8af70f4653ba17ed773f123a60eadd9f9c9 autovacuum: handle analyze for partitioned tables There are efficiency issues in this code when handling databases with large numbers of partitions, and it doesn't look like there is any trivial way to handle those. There are some other issues as well. It's now too late in the cycle for nontrivial fixes, so we'll have to let Postgres 14 users continue to manually ANALYZE their partitioned tables, and hopefully we can fix the issues for Postgres 15. I kept [most of] be280cdad298 ("Don't reset relhasindex for partitioned tables on ANALYZE") because while we added it due to 0827e8af70f4, it is a good bugfix in its own right, since it affects manual analyze as well as autovacuum-induced analyze, and there's no reason to revert it. I retained the addition of relkind 'p' to tables included by pg_stat_user_tables, because reverting that would require a catversion bump. Also, in pg14 only, I keep a struct member that was added to PgStat_TabStatEntry to avoid breaking compatibility with existing stat files. Backpatch to 14. Discussion: https://postgr.es/m/20210722205458.f2bug3z6qzxzpx2s@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/6f8127b7390119c21479f5ce495b7d2168930e82

Heikki Linnakangas pushed:

doc: \123 and \x12 escapes in COPY are in database encoding. The backslash sequences, including \123 and \x12 escapes, are interpreted after encoding conversion. The docs failed to mention that. Backpatch to all supported versions. Reported-by: Andreas Grob Discussion: https://www.postgresql.org/message-id/17142-9181542ca1df75ab%40postgresql.org https://git.postgresql.org/pg/commitdiff/e9a79c220bf55e179bb8e0c37fca1239e0fb3b0b
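To make the COPY escape note concrete, here is a small sketch of decoding numeric backslash escapes into raw bytes. It is not the COPY implementation: decode_numeric_escapes and hex_value are hypothetical names, and the digit limits (up to three octal digits, up to two hex digits) are the assumptions the sketch works under. The decoded values are emitted as plain bytes with no further conversion, which is why they end up being read in the database encoding.

```c
#include <stddef.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int
hex_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/*
 * Hypothetical decoder: turns \ooo (1-3 octal digits) and \xhh (1-2 hex
 * digits) into single bytes and copies everything else unchanged.
 * dst needs room for strlen(src) bytes. Returns the bytes written.
 */
size_t
decode_numeric_escapes(const char *src, unsigned char *dst)
{
    size_t n = 0;

    while (*src != '\0')
    {
        if (src[0] == '\\' && src[1] >= '0' && src[1] <= '7')
        {
            unsigned int val = 0;
            int digits = 0;

            src++;                      /* skip the backslash */
            while (digits < 3 && *src >= '0' && *src <= '7')
            {
                val = val * 8 + (unsigned int) (*src - '0');
                src++;
                digits++;
            }
            dst[n++] = (unsigned char) (val & 0xFF);
        }
        else if (src[0] == '\\' && src[1] == 'x' && hex_value(src[2]) >= 0)
        {
            unsigned int val = 0;
            int digits = 0;

            src += 2;                   /* skip the backslash and 'x' */
            while (digits < 2 && hex_value(*src) >= 0)
            {
                val = val * 16 + (unsigned int) hex_value(*src);
                src++;
                digits++;
            }
            dst[n++] = (unsigned char) val;
        }
        else
            dst[n++] = (unsigned char) *src++;  /* ordinary byte */
    }
    return n;
}
```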

Michael Meskes pushed:

Improved ECPG warning as suggested by Michael Paquier and removed test case that triggers the warning during regression tests. https://git.postgresql.org/pg/commitdiff/f576de1db1eeca63180b1ffa4b42b1e360f88577

Amit Kapila pushed:

Fix typo in protocol.sgml. The 'Stream Stop' message is misspelled as 'Stream End' in the docs. Author: Masahiko Sawada Backpatch-through: 14, where it was introduced Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/0ac1aee0d7d8d5c3493e6e8b1d3925af35a31648
Rename LOGICAL_REP_MSG_STREAM_END to LOGICAL_REP_MSG_STREAM_STOP. In the code, most places used the term "Stream Stop" for the logical stream message. This commit improves consistency by renaming LogicalRepMsgType "LOGICAL_REP_MSG_STREAM_END" to "LOGICAL_REP_MSG_STREAM_STOP". Author: Masahiko Sawada Reviewed-by: Hou Zhijie, Amit Kapila Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/4cd7a189687171374ff302ad71c99d39ff6d2bab

Andres Freund pushed:

Unset MyBEEntry, making elog.c's call to pgstat_get_my_query_id() safe. Previously log messages late during shutdown could end up using either another backend's PgBackendStatus (multi user) or segfault (single user) because pgstat_get_my_query_id()'s check for !MyBEEntry didn't filter out use after pgstat_beshutdown_hook(). This became a bug in 4f0b0966c86, but was a bit fishy before. But given there's no known problematic cases before 14, it doesn't seem worth backpatching further. Also fixes a wrong filename in a comment, introduced in e1025044. Reported-By: Andres Freund andres@anarazel.de Reviewed-By: Julien Rouhaud rjuju123@gmail.com Discussion: https://postgr.es/m/Julien Rouhaud rjuju123@gmail.com Backpatch: 14- https://git.postgresql.org/pg/commitdiff/bed5eac2d50eb86a254861dcdea7b064d10c72cf

Peter Eisentraut pushed:

pg_resetwal: Improve numeric command-line argument parsing. Check errno after strtoul()/strtol() to handle out of range errors better. For out of range, strtoul() returns ULONG_MAX, and the previous code would proceed with that result. Reported-by: Mark Dilger mark.dilger@enterprisedb.com Discussion: https://www.postgresql.org/message-id/flat/6a10a211-872b-3c4c-106b-909ae5fefa61%40enterprisedb.com https://git.postgresql.org/pg/commitdiff/9a6345ed741783e8770ef160e822d2257873adef
pg_amcheck: Fix block number parsing on command line. The previous code wouldn't handle higher block numbers on systems where sizeof(long) == 4. Reviewed-by: Mark Dilger mark.dilger@enterprisedb.com Discussion: https://www.postgresql.org/message-id/flat/6a10a211-872b-3c4c-106b-909ae5fefa61%40enterprisedb.com https://git.postgresql.org/pg/commitdiff/f1899f251df421a4715ce5e231855eb6914bf77d
psql: Add test for query canceling. Query canceling in psql was accidentally broken by 3a5130672296ed4e682403a77a9a3ad3d21cef75 (since reverted), so having some test coverage for that seems useful. Reviewed-by: Fabien COELHO coelho@cri.ensmp.fr Discussion: https://www.postgresql.org/message-id/18c78a01-4a34-9dd4-f78b-6860f1420c8e@enterprisedb.com https://git.postgresql.org/pg/commitdiff/5b3f471ff23a2773e6c1ee1704377581c987107d
psql: Improve portability of query cancel test. Some shells apparently don't support $PPID, so skip the test in that case. https://git.postgresql.org/pg/commitdiff/c818c25f448d0085e1bb2be402463a4b28bc20c4
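The errno check described in the pg_resetwal entry above can be sketched like this. It is a minimal illustration, not the pg_resetwal code: parse_ulong is a hypothetical helper, and base 0 (accepting decimal, octal and hex input) is an assumption of the sketch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse an unsigned long, rejecting out-of-range
 * values, empty input and trailing characters.
 */
bool
parse_ulong(const char *arg, unsigned long *value)
{
    char       *endptr;
    unsigned long parsed;

    errno = 0;                  /* strtoul() leaves errno alone on success */
    parsed = strtoul(arg, &endptr, 0);

    if (errno == ERANGE)        /* out of range: strtoul() returned ULONG_MAX */
        return false;
    if (endptr == arg || *endptr != '\0')
        return false;           /* no digits at all, or junk after the number */

    *value = parsed;
    return true;
}
```

Without the errno test, an oversized argument would silently be treated as ULONG_MAX, which is the behavior the commit message describes for the previous code.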

David Rowley pushed:

Allow parallel DISTINCT. We've supported parallel aggregation since e06a38965. At the time, we didn't quite get around to also adding parallel DISTINCT. So, let's do that now. This is implemented by introducing a two-phase DISTINCT. Phase 1 is performed on parallel workers, rows are made distinct there either by hashing or by sort/unique. The results from the parallel workers are combined and the final distinct phase is performed serially to get rid of any duplicate rows that appear due to combining rows for each of the parallel workers. Author: David Rowley Reviewed-by: Zhihong Yu Discussion: https://postgr.es/m/CAApHDvrjRxVKwQN0he79xS+9wyotFXL=RmoWqGGO2N45Farpgw@mail.gmail.com https://git.postgresql.org/pg/commitdiff/22c4e88ebff408acd52e212543a77158bde59e69
Fix broken regression test caused by 22c4e88eb. Per buildfarm members hoverfly and thorntail https://git.postgresql.org/pg/commitdiff/945f395aeb74cea77d5239db01357bbcbea80534
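The two-phase DISTINCT described above can be sketched on a plain integer array. This is only an illustration of the idea, not planner or executor code: two_phase_distinct and sort_unique are hypothetical names, the "workers" run one after another rather than in parallel, and both phases use sort/unique rather than hashing.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

static int
cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

/* Sort vals[0..n) in place and drop duplicates; returns the new length. */
size_t
sort_unique(int *vals, size_t n)
{
    size_t out = 0;

    if (n == 0)
        return 0;
    qsort(vals, n, sizeof(int), cmp_int);
    for (size_t i = 1; i < n; i++)
    {
        if (vals[i] != vals[out])
            vals[++out] = vals[i];
    }
    return out + 1;
}

/*
 * Phase 1: each chunk ("worker") removes its own duplicates.
 * Phase 2: the combined output is deduplicated once more, serially,
 * because the same value may survive in several chunks.
 * Works in place; returns the number of distinct values left in rows.
 */
size_t
two_phase_distinct(int *rows, size_t n, size_t nworkers)
{
    size_t combined = 0;
    size_t chunk;

    if (nworkers == 0)
        nworkers = 1;
    chunk = (n + nworkers - 1) / nworkers;

    for (size_t start = 0; start < n; start += chunk)
    {
        size_t len = (n - start < chunk) ? n - start : chunk;
        size_t kept = sort_unique(rows + start, len);

        /* Pack this worker's output right after the previous ones. */
        memmove(rows + combined, rows + start, kept * sizeof(int));
        combined += kept;
    }
    return sort_unique(rows, combined);
}
```

The final pass is what makes the result correct: per-chunk deduplication only shrinks the input that the serial step has to process.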