PostgreSQL Weekly News - July 18, 2021

Posted on 2021-07-21 by PWN
PWN

PostgreSQL Weekly News - July 18, 2021

Person of the week

PostgreSQL Jobs for July

https://archives.postgresql.org/pgsql-jobs/2021-07/

PostgreSQL in the News

Planet PostgreSQL: https://planet.postgresql.org/

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm PST8PDT to david@fetter.org.

Applied Patches

Michaël Paquier pushed:

  • pageinspect: Improve page_header() for pages of 32kB. ld_upper, ld_lower, pd_special and the page size have been using smallint as return type, which could cause those fields to return negative values in certain cases for builds configures with a page size of 32kB. Bump pageinspect to 1.10. page_header() is able to handle the correct return type of those fields at runtime when using an older version of the extension, with some tests are added to cover that. Author: Quan Zongliang Reviewed-by: Michael Paquier, Bharath Rupireddy Discussion: https://postgr.es/m/8b8ec36e-61fe-14f9-005d-07bc85aa4eed@yeah.net https://git.postgresql.org/pg/commitdiff/127404fbe28455d6e8183fa58f3b7aefeba8f909

  • Fix issues with Windows' stat() for files pending on deletion. The code introduced by bed9075 to enhance the stat() implementation on Windows for file sizes larger than 4GB fails to properly detect files pending for deletion with its method based on NtQueryInformationFile() or GetFileInformationByHandleEx(), as proved by Alexander Lakhin in a custom TAP test of his own. The method used in the implementation of open() to sleep and loop when when failing on ERROR_ACCESS_DENIED (EACCES) is showing much more stability, so switch to this method. This could still lead to issues if the permission problem stays around for much longer than the timeout of 1 second used, but that should (hopefully) never happen in performance-critical paths. Still, there could be a point in increasing the timeouts for the sake of machines that handle heavy loads. Note that WIN32's open() now uses microsoft_native_stat() as it should be similar to stat() when working around issues with concurrent file deletions. I have spent some time testing this patch with pgbench in combination of the SQL functions from genfile.c, as well as running the TAP test provided on the thread with MSVC builds, and this looks much more stable than the previous method. Author: Alexander Lakhin Reviewed-by: Tom Lane, Michael Paquier, Justin Pryzby Discussion: https://postgr.es/m/c3427edf-d7c0-ff57-90f6-b5de3bb62709@gmail.com Backpatch-through: 14 https://git.postgresql.org/pg/commitdiff/54fb8c7ddf152629021cab3ac3596354217b7d81

  • Revert "Fix issues with Windows' stat() for files pending on deletion". This reverts commit 54fb8c7, as per the issues reported by fairywren when it comes to MinGW because of the lack of microsoft_native_stat() there. Using just stat() for MSVC is not sufficient to take care of the concurrency problems with files pending on deletion. It may be possible to paint some MINGW64 in the code to switch to a different implementation of stat() in this build context, but I am not sure either if relying on the implementation of stat() in MinGW to take care of the problems we are trying to fix is enough or not. So this needs more study. Discussion: https://postgr.es/m/YOvOlfRrIO0yGtgw@paquier.xyz Backpatch-through: 14 https://git.postgresql.org/pg/commitdiff/2c9b46c090e76c62f24563b9be2c34e6b92e9329

  • Install properly fe-auth-sasl.h. The internals of the frontend-side callbacks for SASL are visible in libpq-int.h, but the header was not getting installed. This would cause compilation failures for applications playing with the internals of libpq. Issue introduced in 9fd8557. Author: Mikhail Kulagin Reviewed-by: Jacob Champion Discussion: https://postgr.es/m/05ce01d777cb$40f31d60$c2d95820$@postgrespro.ru https://git.postgresql.org/pg/commitdiff/6c9c2831668345122fd0f92280b30f3bbe2dd4e6

  • Remove unnecessary assertion in postmaster.c. A code path asserted that the archiver was dead, but a check made that impossible to happen. Author: Bharath Rupireddy Discussion: https://postgr.es/m/CALj2ACW=CYE1ars+2XyPTEPq0wQvru4c0dPZ=Nrn3EqNBkksvQ@mail.gmail.com Backpatch-throgh: 14 https://git.postgresql.org/pg/commitdiff/dc2db1eac365b97c9129393acfe11102859f9e23

  • Add TAP tests for ZLIB compression for pg_receivewal. There is a non-trivial amount of code that handles ZLIB compression in pg_receivewal, from basics like the format name, the calculation of the start streaming position and of course the compression itself, but there was no automated coverage for it. This commit introduces a set of conditional tests (if the build supports ZLIB) to cover the creation of ZLIB-compressed WAL segments, the handling of the partial, compressed, WAL segments and the compression operation in itself. Note that there is an extra phase checking the validity of the generated files by using directly a gzip command, passed down by the Makefile of pg_receivewal. This part is skipped if the command cannot be found, something likely going to happen on Windows with MSVC except if one sets the variable GZIP_PROGRAM in the environment of the test. This set of tests will become handy for upcoming patches that add more options for the compression methods used by pg_receivewal, like LZ4, to make sure that no existing facilities are broken. Author: Georgios Kokolatos Reviewed-by: Gilles Darold, Michael Paquier Discussion: https://postgr.es/m/07BK3Mk5aEOsTwGaY77qBVyf9GjoEzn8TMgHLyPGfEFPIpTEmoQuP2P4c7teesjSg-LPeUafsp1flnPeQYINMSMB_UpggJDoduB5EDYBqaQ=@protonmail.com https://git.postgresql.org/pg/commitdiff/ffc9ddaea33f6dfd3dfa95828a0970fbb617bf8a

  • Fix portability issue with gzip in TAP test of pg_receivewal. The OpenBSD implementation of gzip considers only files suffixed by "Z", "gz", "z", "tgz" or "taz" as valid targets, discarding anything else and making a command using --test exit with an error code of 512 if anything invalid is found. The test introduced in ffc9dda tested a WAL segment suffixed as .gz.partial, enough to make the test fail. Testing only a full segment is fine enough in terms of coverage, so simplify the code by discarding the .gz.partial segment in this check. This should be enough to make the test pass with OpenBSD environments. Per report from curculio. Discussion: https://postgr.es/m/YPAdf9r5aJbDoHoq@paquier.xyz https://git.postgresql.org/pg/commitdiff/0da3c1bc3f7261d5157f5b86ade88e8b379f8686

  • Disable tests involving ZLIB on Windows for pg_receivewal. As reported by buildfarm member bowerbird, those tests are unstable on Windows. The failure produced there points to a problem with gzflush(), that fails to sync a file freshly-opened, with a gzFile properly opened. While testing this myself with MSVC, I bumped into a different error where a file could simply not be opened, so this makes me rather doubtful that testing this area on Windows is a good idea if this finishes with random concurrency failures. This requires more investigation, and keeping this buildfarm member red is not a good thing in the long-term, so for now this just disables this set of tests on Windows. Discussion: https://postgr.es/m/YPDLz2x3o1aX2wRh@paquier.xyz https://git.postgresql.org/pg/commitdiff/6cea447e6a10cd7ef511470e809a894a013e6a18

Heikki Linnakangas pushed:

Peter Eisentraut pushed:

Tom Lane pushed:

Thomas Munro pushed:

David Rowley pushed:

Amit Kapila pushed:

  • Add support for prepared transactions to built-in logical replication. To add support for streaming transactions at prepare time into the built-in logical replication, we need to do the following things: * Modify the output plugin (pgoutput) to implement the new two-phase API callbacks, by leveraging the extended replication protocol. * Modify the replication apply worker, to properly handle two-phase transactions by replaying them on prepare. * Add a new SUBSCRIPTION option "two_phase" to allow users to enable two-phase transactions. We enable the two_phase once the initial data sync is over. We however must explicitly disable replication of two-phase transactions during replication slot creation, even if the plugin supports it. We don't need to replicate the changes accumulated during this phase, and moreover, we don't have a replication connection open so we don't know where to send the data anyway. The streaming option is not allowed with this new two_phase option. This can be done as a separate patch. We don't allow to toggle two_phase option of a subscription because it can lead to an inconsistent replica. For the same reason, we don't allow to refresh the publication once the two_phase is enabled for a subscription unless copy_data option is false. Author: Peter Smith, Ajin Cherian and Amit Kapila based on previous work by Nikhil Sontakke and Stas Kelvich Reviewed-by: Amit Kapila, Sawada Masahiko, Vignesh C, Dilip Kumar, Takamichi Osumi, Greg Nancarrow Tested-By: Haiying Tang Discussion: https://postgr.es/m/02DA5F5E-CECE-4D9C-8B4B-418077E2C010@postgrespro.ru Discussion: https://postgr.es/m/CAA4eK1+opiV4aFTmWWUF9h_32=HfPOW9vZASHarT0UA5oBrtGw@mail.gmail.com https://git.postgresql.org/pg/commitdiff/a8fd13cab0ba815e9925dc9676e6309f699b5f72

Magnus Hagander pushed:

John Naylor pushed:

Dean Rasheed pushed:

Alexander Korotkov pushed:

Daniel Gustafsson pushed:

Álvaro Herrera pushed:

Pending Patches

David Rowley sent in another revision of a patch to track non-pruned partitions in a Bitmapset added to the RelOptInfo struct, and allow ordered partition scans in more cases.

Hayato Kuroda sent in another revision of a patch to fix DEALLOCATE and DESCRIBE in ECPG.

David Rowley sent in another revision of a patch to speed up transaction completion faster after many relations are accessed in a transaction.

Hou Zhijie sent in another revision of a patch to make it possible to use parallel operations in things that write.

Rahila Syed sent in two more revisions of a patch to make it possible to filter logical replication by column.

Álvaro Herrera sent in a patch intended to fix a bug that manifested as partition check not updated when insert into a partition.

Jeevan Ladhe sent in two revisions of a patch to specify in error messages which DBs had errors in pg_upgrade.

Magnus Hagander sent in another revision of a patch to add support for PROXY protocol.

Bharath Rupireddy sent in a patch to use WaitLatch for {pre, post}_auth_delay instead of pg_usleep.

Vigneshwaran C sent in three more revisions of a patch to add schema level granularity to PUBLICATIONs.

John Naylor sent in four more revisions of a patch to speed up verifying UTF-8.

David Rowley sent in four more revisions of a patch to add proper planner support for ORDER BY / DISTINCT aggregates.

Ronan Dunklau, David Rowley, and Ranier Vilela traded patches to use the optimized single-datum tuplesort in ExecSort.

Masahiro Ikeda sent in two revisions of a patch to fix some mistaken comments in heap_prune_chain().

Alexander Lakhin sent in another revision of a patch intended to fix a bug that manifested as more time spent in "delete pending" on Win32.

Tom Lane sent in another revision of a patch to reduce memory consumption for pending inval messages.

Peter Smith sent in a patch to avoid unnecessary calls to PGserverVersion.

Craig Ringer sent in a patch to teach pgflex.pl and pgbision.pl to read buildenv.pl for tool names, and support extra preprocessor definitions in config.pl.

Maxim Orlov sent in another revision of a patch to fix a parallel worker failed assertion and core dump.

Tom Lane sent in another revision of a patch intended to fix a bug that manifested as merge join on tables with different DB collation behind postgres_fdw fails.

Peter Smith and Euler Taveira de Oliveira traded patches to add row filtering for logical replication.

Ian Barwick sent in a patch to document pg_encoding_to_char() and pg_char_to_encoding().

Fabien COELHO sent in another revision of a patch to psql to factor out the echo code.

Heikki Linnakangas sent in another revision of a patch to refactor LogicalTapeSet/LogicalTape interface in a way that stops requiring knowing how many tapes will be created up front, and replace the polyphase merge algorithm with a simple balanced k-way merge.

Heikki Linnakangas sent in another revision of a patch to move a few ResourceOwnerEnlarge() calls for safety and clarity, make resowners more easily extensible, and optimize the hash function used in same.

Daniil Zakhlystov sent in another revision of a patch to add zlib and zstd streaming compression generally, then use this infrastructure to add them to libpq.

Zhihong Yu sent in two revisions of a patch to pfree() a palloc()ed string.

Ajin Cherian sent in another revision of a patch to skip empty transactions for logical replication.

Kyotaro HORIGUCHI sent in another revision of a patch to be strict in checking numeric parameters on command line, in environment variables, and document the effect of those changes on PGCTLTIMEOUT.

Hou Zhijie sent in a patch to avoid repeated calls to PQfnumber in pg_dump.

Gilles Darold sent in two more revisions of a patch to allows user-defined code to be executed at the start of any command through a xact registered callback.

Peter Smith sent in another revision of a patch to add prepare API support for streaming transactions.

Huailing Liu sent in a patch to remove a confusing SI inval.

Kyotaro HORIGUCHI sent in another revision of a patch to remove read_page callback from XLogReadRecord.

Heikki Linnakangas sent in another revision of a patch to add amcheck for GIN.

enis Hirn sent in another revision of a patch to allow multiple linear recursive self-references in common table expressions.

Haiying Tang sent in a patch to tighten up the help output for the things that psql tab-completes.

Vigneshwaran C sent in another revision of a patch to identify missing publications from publishers during CREATE/ALTER SUBSCRIPTION.

Arne Roland sent in another revision of a patch to rename triggers on partitioned tables recursively.

Andrey V. Lepikhov sent in another revision of a patch to teach optimizer to consider partitionwise joins of non-partitioned table with each partition of partitioned table, and disallow asymmetric machinery for joining of two partitioned (or appended) relations because it could cause huge consumption of CPU and memory during reparameterization of the NestLoop path.

Bharath Rupireddy sent in another revision of a patch to disambiguate error messages that use "non-negative".

Magnus Hagander sent in a patch to tag tarballs with their git revision.

Andrey V. Lepikhov sent in another revision of a patch to remove inner joins of a relation to itself if it can be proven that the join can be replaced with a scan.

Ryohei Takahashi sent in two revisions of a patch to speed up COMMIT PREPARED.

Mark Dilger sent in a patch to stop ignoring failures on file close.

Justin Pryzby sent in two more revisions of a patch to add new metacommands \dn+ to show the size of each schema and \dA+ for AMs.

Aleksander Alekseev sent in another revision of a patch to refactor procarray.

Yugo Nagata sent in another revision of a patch to fix bugs that manifested as pgbench errors and serialization/deadlock retries.

Justin Pryzby sent in another revision of a patch to support ALTER TABLE ... ACCESS METHOD and allow specifying access methods of partitioned tables to be inherited by partitions.

Dilip Kumar added a missing bbsink_forward_end_archive(sink) to src/backend/replication/basebackup_gzip.c.

Kyotaro HORIGUCHI sent in a patch to fix a number of repeated typos in the char-mapping tables.

Bertrand Drouvot sent in another revision of a patch to implement minimal logical decoding on standbys.

Li Japin and Amit Kapila traded patches to disallow setting a replication slot name to be an empty string.

Ranier Vilela sent in a patch to un-shadow some variables, and reduce some -Wsign-compare warnings from modern compilers.

Vigneshwaran C sent in another revision of a patch to include the actual datatype used in logical replication message description.

James Coleman sent in another revision of a patch to allow parallel LATERAL subqueries with LIMIT/OFFSET.

Dinesh Chemuduru sent in a patch to add new diagnostics, PG_PARSE_SQL_STATEMENT, and PG_PARSE_SQL_STATEMENT_POSITION, to PL/pgsql.

Andrew Dunstan sent in three more revisions of a patch to clean up PostgresNode.pm.

Álvaro Herrera and Ranier Vilela traded patches to remove a pointless strlen from slot.c.

Kyotaro HORIGUCHI sent in two more revisions of a patch to make FPI_FOR_HINT follow standard FPI emitting policy.

Yugo Nagata sent in a patch intended to fix a bug that manifested as by preparing commands at the start of a script instead of at the first execution of the command.

Soumyadeep Chakraborty sent in another revision of a patch to make ProcSendSignal() more efficient by referring target processes using pgprocno in order to avoid scanning ProcArray and keeping track of the startup process.

Yugo Nagata and Ranier Vilela traded patches intended to fix a bug that manifested as corruption of WAL page header is never reported.

Atsushi Torikoshi sent in a patch to document the necessity for superuser privileges to execute pg_import_system_collations().

Peter Smith sent in a patch to prevent potential buffer overruns when using strcpy to gid buffer.

Atsushi Torikoshi sent in another revision of a patch to add a function to log the complete query string and its plan for the query currently running on the backend with the specified process ID.