PostgreSQL Weekly News - October 3, 2021

Posted on 2021-10-04 by PWN

PostgreSQL Weekly News - October 3, 2021

PostgreSQL 14 released!

PostgreSQL Product News

pgtt 2.6, an extension to implement global temporary tables, released.

oracle_fdw 2.4.0 released.

pgFormatter 5.1, a formatter/beautifier for SQL code, released.

PostgreSQL Jobs for October

PostgreSQL in the News

Planet PostgreSQL:

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm PST8PDT to

Applied Patches

Thomas Munro pushed:

Peter Geoghegan pushed:

Michaël Paquier pushed:

Tom Lane pushed:

  • Re-enable contrib/bloom's TAP tests. These tests were disabled back in 2018 (commit d3c09b9b1) because of failures observed in the buildfarm. I've not been able to reproduce any failure on longfin's host, though, so I'm curious whether or to what extent we've fixed the problem. Let's re-enable it (in HEAD only) and see what blows up. Discussion:

  • Fix instability in contrib/bloom TAP tests. It turns out that the instability complained of in commit d3c09b9b1 has an embarrassingly simple explanation. The test script waits for the standby to flush incoming WAL to disk, but it should wait for the WAL to be replayed, since we are testing for the effects of that to be visible. While at it, use wait_for_catchup instead of reinventing that logic, and adjust $Test::Builder::Level to improve future error reports. Back-patch to v12 where the necessary infrastructure came in (cf. aforesaid commit). Also back-patch 7d1aa6bf1 so that the test will actually get run. Discussion:

  • Treat ETIMEDOUT as indicating a non-recoverable connection failure. Add ETIMEDOUT to ALL_CONNECTION_FAILURE_ERRNOS' list of "errnos that identify hard failure of a previously-established network connection". While one could imagine that this is sometimes recoverable, the same could be said of other entries such as ENETDOWN. In support of this, handle ETIMEDOUT on par with other socket errors in relevant infrastructure, such as TranslateSocketError(). (I made a couple of cosmetic adjustments in TranslateSocketError(), too.) The code now assumes that ETIMEDOUT is defined everywhere, which it should be given that POSIX has required it since SUSv2. Perhaps this should be back-patched, but I'm hesitant to do so given the lack of previous complaints, and the hazard that there's a small ABI break on Windows from redefining the symbol. Even if we decide to do that, it'd be prudent to let this bake awhile in HEAD first. Jelte Fennema Discussion:

  • Remove gratuitous environment dependency in test. Computing related timestamps by subtracting "N days" is sensitive to the prevailing timezone, since we interpret that as "same local time on the N'th prior day". Even though the intervals in question are only two to four days, through remarkable bad luck they managed to cross the end of Ramadan in 2014, causing the test's output to change if timezone is set to Africa/Casablanca. (Maybe in other Muslim areas as well; I didn't check.) There's absolutely no reason for this test to exercise interval subtraction, so just get rid of that and use plain timestamptz constants representing the intended values. Per report from Andres Freund. Back-patch to v10 where this test script came in. Discussion:

  • Fix Portal snapshot tracking to handle subtransactions properly. Commit 84f5c2908 forgot to consider the possibility that EnsurePortalSnapshotExists could run inside a subtransaction with lifespan shorter than the Portal's. In that case, the new active snapshot would be popped at the end of the subtransaction, leaving a dangling pointer in the Portal, with mayhem ensuing. To fix, make sure the ActiveSnapshot stack entry is marked with the same subtransaction nesting level as the associated Portal. It's certainly safe to do so since we won't be here at all unless the stack is empty; hence we can't create an out-of-order stack. Let's also apply this logic in the case where PortalRunUtility sets portalSnapshot, just to be sure that path can't cause similar problems. It's slightly less clear that that path can't create an out-of-order stack, so add an assertion guarding it. Report and patch by Bertrand Drouvot (with kibitzing by me). Back-patch to v11, like the previous commit. Discussion:

  • Avoid believing incomplete MCV-only stats in get_variable_range(). get_variable_range() would incautiously believe that statistics containing only an MCV list are sufficient to derive a range estimate. That's okay for an enum-like column that contains only MCVs, but otherwise the estimate could be pretty bad. Make it report that the range is indeterminate unless the MCVs plus nullfrac account for the whole table. I don't think this needs a dedicated test case, since a quick code coverage check verifies that the existing regression tests traverse all the alternatives. There is room to doubt that a future-proof test case could be built anyway, given that the submitted example accidentally doesn't fail before v11. Per bug #17207 from Simon Perepelitsa. Back-patch to v10. In principle this has been broken all along, but I'm hesitant to make such changes in 9.6, since if anyone is unhappy with 9.6.24's behavior there will be no second chance to fix it. Discussion:

  • Re-alphabetize the win32_tzmap[] array. The original intent seems to have been to sort case-insensitively by the Windows zone name, but various changes over the years did not get that memo. This commit just moves a few entries to restore exact alphabetic order, to ease comparison to the outputs of processing scripts. Back-patch to all supported branches, as is our usual practice for time zone data updates. Discussion:

  • Update our mapping of Windows time zone names using CLDR info. This corrects a bunch of entries in win32_tzmap[], and adds a few new ones, based on the CLDR project's windowsZones.xml file. Non-cosmetic changes fall into four main categories: * Flat-out errors: US/Aleutan doesn't exist America/Salvador doesn't exist Asia/Baku is wrong for Yerevan Asia/Dhaka (Bangladesh) is wrong for Astana (Kazakhstan) Europe/Bucharest is wrong for Chisinau America/Mexico_City is wrong for Chetumal America/Buenos_Aires is wrong for Cayenne America/Caracas has its own zone, so poor fit for La Paz US/Eastern is wrong for Haiti US/Eastern is wrong for Indiana (East) Asia/Karachi is wrong for Tashkent Etc/UTC+12 doesn't exist Signs of Etc/GMT zones were backwards * Judgment calls: (These changes follow CLDR's choices, except for the first one) Use Europe/London for "Greenwich Standard Time", since that seems much more likely than Africa/Casablanca to be what people will think that zone name means. CLDR has Atlantic/Reykjavik here, but that's no better. Asia/Shanghai seems a better fit than Hong Kong for "China Standard Time". Europe/Sarajevo is now a link to Belgrade, ie "Central Europe Standard Time"; so use Warsaw for "Central European Standard Time". America/Sao_Paulo seems more representative than Araguaina for "E. South America Standard Time". Africa/Johannesburg seems more representative than Harare for "South Africa Standard Time". * New Windows zone names: "Israel Standard Time" "Kaliningrad Standard Time" "Russia Time Zone N" for various N "Singapore Standard Time" "South Sudan Standard Time" "W. Central Africa Standard Time" "West Bank Standard Time" "Yukon Standard Time" Some of these replace older spellings, but I kept the older spellings too in case our code runs on a machine with the older data. * Replace aliases (tzdb Links) with underlying city-named zones: (This tracks tzdb's longstanding practice, and reduces inconsistency with the rest of the entries, as well as with CLDR.) US/Alaska Asia/Kuwait Asia/Muscat Canada/Atlantic Australia/Canberra Canada/Saskatchewan US/Central US/Eastern US/Hawaii US/Mountain Canada/Newfoundland US/Pacific Back-patch to all supported branches, as is our usual practice for time zone data updates. Discussion:

  • Fix checking of query type in plpgsql's RETURN QUERY command. Prior to v14, we insisted that the query in RETURN QUERY be of a type that returns tuples. (For instance, INSERT RETURNING was allowed, but not plain INSERT.) That happened indirectly because we opened a cursor for the query, so spi.c checked SPI_is_cursor_plan(). As a consequence, the error message wasn't terribly on-point, but at least it was there. Commit 2f48ede08 lost this detail. Instead, plain RETURN QUERY insisted that the query be a SELECT (by checking for SPI_OK_SELECT) while RETURN QUERY EXECUTE failed to check the query type at all. Neither of these changes was intended. The only convenient place to check this in the EXECUTE case is inside _SPI_execute_plan, because we haven't done parse analysis until then. So we need to pass down a flag saying whether to enforce that the query returns tuples. Fortunately, we can squeeze another boolean into struct SPIExecuteOptions without an ABI break, since there's padding space there. (It's unlikely that any extensions would already be using this new struct, but preserving ABI in v14 seems like a smart idea anyway.) Within spi.c, it seemed like _SPI_execute_plan's parameter list was already ridiculously long, and I didn't want to make it longer. So I thought of passing SPIExecuteOptions down as-is, allowing that parameter list to become much shorter. This makes the patch a bit more invasive than it might otherwise be, but it's all internal to spi.c, so that seems fine. Per report from Marc Bachmann. Back-patch to v14 where the faulty code came in. Discussion:

Peter Eisentraut pushed:

Magnus Hagander pushed:

Fujii Masao pushed:

Álvaro Herrera pushed:

  • Fix WAL replay in presence of an incomplete record. Physical replication always ships WAL segment files to replicas once they are complete. This is a problem if one WAL record is split across a segment boundary and the primary server crashes before writing down the segment with the next portion of the WAL record: WAL writing after crash recovery would happily resume at the point where the broken record started, overwriting that record ... but any standby or backup may have already received a copy of that segment, and they are not rewinding. This causes standbys to stop following the primary after the latter crashes: LOG: invalid contrecord length 7262 at A8/D9FFFBC8 because the standby is still trying to read the continuation record (contrecord) for the original long WAL record, but it is not there and it will never be. A workaround is to stop the replica, delete the WAL file, and restart it -- at which point a fresh copy is brought over from the primary. But that's pretty labor intensive, and I bet many users would just give up and re-clone the standby instead. A fix for this problem was already attempted in commit 515e3d84a0b5, but it only addressed the case for the scenario of WAL archiving, so streaming replication would still be a problem (as well as other things such as taking a filesystem-level backup while the server is down after having crashed), and it had performance scalability problems too; so it had to be reverted. This commit fixes the problem using an approach suggested by Andres Freund, whereby the initial portion(s) of the split-up WAL record are kept, and a special type of WAL record is written where the contrecord was lost, so that WAL replay in the replica knows to skip the broken parts. With this approach, we can continue to stream/archive segment files as soon as they are complete, and replay of the broken records will proceed across the crash point without a hitch. Because a new type of WAL record is added, users should be careful to upgrade standbys first, primaries later. Otherwise they risk the standby being unable to start if the primary happens to write such a record. A new TAP test that exercises this is added, but the portability of it is yet to be seen. This has been wrong since the introduction of physical replication, so backpatch all the way back. In stable branches, keep the new XLogReaderState members at the end of the struct, to avoid an ABI break. Author: Álvaro Herrera Reviewed-by: Kyotaro Horiguchi Reviewed-by: Nathan Bossart Discussion:

  • Repair two portability oversights of new test. First, as pointed out by Tom Lane and Michael Paquier, I failed to realize that Windows' PostgresNode needs an extra pg_hba.conf line (added by PostgresNode->set_replication_conf, called internally by ->init() when 'allows_streaming=>1' is given -- but I purposefully omitted that). I think a good fix should be to have nodes with only 'has_archiving=>1' set up for replication too, but that's a bigger discussion. Fix it by calling ->set_replication_conf, which is not unprecedented, as pointed out by Andrew Dunstan. I also forgot to uncomment a ->finish() call for a pumpable IPC::Run file descriptor. Apparently this is innocuous in almost all platforms. Backpatch to 14. The older branches were added this file too, but not this particular part of the test. Discussion: Discussion:

  • Remove unstable, unnecessary test; fix typo. Commit ff9f111bce24 added some test code that's unportable and doesn't add meaningful coverage. Remove it rather than try and get it to work everywhere. While at it, fix a typo in a log message added by the aforementioned commit. Backpatch to 14. Discussion:

  • Error out if SKIP LOCKED and WITH TIES are both specified. Both bugs #16676[1] and #17141[2] illustrate that the combination of SKIP LOCKED and FETCH FIRST WITH TIES break expectations when it comes to rows returned to other sessions accessing the same row. Since this situation is detectable from the syntax and hard to fix otherwise, forbid for now, with the potential to fix in the future. [1] [2] Backpatch-through: 13, where WITH TIES was introduced Author: David Christensen Discussion:

David Rowley pushed:

Amit Kapila pushed:

Daniel Gustafsson pushed:

Andres Freund pushed: