PostgreSQL Weekly News - November 14, 2021

Posted on 2021-11-15 by PWN

PostgreSQL 14.1, 13.5, 12.9, 11.14, 10.19, and 9.6.24 released. This is the final release in the 9.6 series, so put those upgrade plans in action if you haven't already.

PostgreSQL Product News

Pgpool-II 4.3 beta1, a connection pooler and statement replication system for PostgreSQL, released.

Odyssey 1.2, a multi-threaded connection pooler for PostgreSQL, released.

pgbouncer 1.16.1, a connection pooler and more for PostgreSQL, released.

PostgreSQL Jobs for November

PostgreSQL in the News

Planet PostgreSQL:

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm PST8PDT to

Applied Patches

David Rowley pushed:

Tom Lane pushed:

  • Reject extraneous data after SSL or GSS encryption handshake. The server collects up to a bufferload of data whenever it reads data from the client socket. When SSL or GSS encryption is requested during startup, any additional data received with the initial request message remained in the buffer, and would be treated as already-decrypted data once the encryption handshake completed. Thus, a man-in-the-middle with the ability to inject data into the TCP connection could stuff some cleartext data into the start of a supposedly encryption-protected database session. This could be abused to send faked SQL commands to the server, although that would only work if the server did not demand any authentication data. (However, a server relying on SSL certificate authentication might well not do so.) To fix, throw a protocol-violation error if the internal buffer is not empty after the encryption handshake. Our thanks to Jacob Champion for reporting this problem. Security: CVE-2021-23214

  • libpq: reject extraneous data after SSL or GSS encryption handshake. libpq collects up to a bufferload of data whenever it reads data from the socket. When SSL or GSS encryption is requested during startup, any additional data received with the server's yes-or-no reply remained in the buffer, and would be treated as already-decrypted data once the encryption handshake completed. Thus, a man-in-the-middle with the ability to inject data into the TCP connection could stuff some cleartext data into the start of a supposedly encryption-protected database session. This could probably be abused to inject faked responses to the client's first few queries, although other details of libpq's behavior make that harder than it sounds. A different line of attack is to exfiltrate the client's password, or other sensitive data that might be sent early in the session. That has been shown to be possible with a server vulnerable to CVE-2021-23214. To fix, throw a protocol-violation error if the internal buffer is not empty after the encryption handshake. Our thanks to Jacob Champion for reporting this problem. Security: CVE-2021-23222

  • Fix incorrect format placeholder. Per buildfarm warnings.

  • Fix instability in test. We've seen intermittent failures in this test on slower buildfarm machines, which I think can be explained by assuming that autovacuum emitted some additional WAL. Disable autovacuum to stabilize it. In passing, use stringwise not numeric comparison to compare WAL file names. Doesn't matter at present, but they are hex strings not decimal ... Discussion:

  • Doc: improve protocol spec for logical replication Type messages. protocol.sgml documented the layout for Type messages, but completely dropped the ball otherwise, failing to explain what they are, when they are sent, or what they're good for. While at it, do a little copy-editing on the description of Relation messages. In passing, adjust the comment for apply_handle_type() to make it clearer that we choose not to do anything when receiving a Type message, not that we think it has no use whatsoever. Per question from Stefen Hillman. Discussion:

  • Fall back to unsigned int, not int, for socklen_t. It's a coin toss which of these is a better default assumption. However, of the machines we have in the buildfarm, the only ones relying on the fallback socklen_t definition are ancient HPUX, and on that platform unsigned int is the right choice. Minor tweak to ee3a1a5b6. Discussion:

  • postgres_fdw: suppress casts on constants in limited cases. When deparsing an expression of the form "remote_var OP constant", we'd normally apply a cast to the constant to make sure that the remote parser thinks it's of the same type we do. However, doing so is often not necessary, and it causes problems if the user has intentionally declared the local column as being of a different type than the remote column. A plausible use-case for that is using text to represent a type that's an enum on the remote side. A comparison on such a column will get shipped as "var = 'foo'::text", which blows up on the remote side because there's no enum = text operator. But if we simply leave off the explicit cast, the comparison will do exactly what the user wants. It's possible to do this without major risk of semantic problems, by relying on the longstanding parser heuristic that "if one operand of an operator is of type unknown, while the other one has a known type, assume that the unknown operand is also of that type". Hence, this patch leaves off the cast only if (a) the operator inputs have the same type locally; (b) the constant will print as a string literal or NULL, both of which are initially taken as type unknown; and (c) the non-Const input is a plain foreign Var. Rule (c) guarantees that the remote parser will know the type of the non-Const input; moreover, it means that if this cast-omission does cause any semantic surprises, that can only happen in cases where the local column has a different type than the remote column. That wasn't guaranteed to work anyway, and this patch should represent a net usability gain for such cases. One point that I (tgl) remain slightly uncomfortable with is that we will ignore an implicit RelabelType when deciding if the non-Const input is a plain Var. That makes it a little squishy to argue that the remote should resolve the Const as being of the same type as its Var, because then our Const is not the same type as our Var. However, if we don't do that, then this hack won't work as desired if the user chooses to use varchar rather than text to represent some remote column. That seems useful, so do it like this for now. We might have to give up the RelabelType-ignoring bit if any problems surface. Dian Fay, with review and kibitzing by me Discussion:

  • Make psql's \password default to CURRENT_USER, not PQuser(conn). The documentation says plainly that \password acts on "the current user" by default. What it actually acted on, or tried to, was the username used to log into the current session. This is not the same thing if one has since done SET ROLE or SET SESSION AUTHORIZATION. Aside from the possible surprise factor, it's quite likely that the current role doesn't have permissions to set the password of the original role. To fix, use "SELECT CURRENT_USER" to get the role name to act on. (This syntax works with servers at least back to 7.0.) Also, in hopes of reducing confusion, include the role name that will be acted on in the password prompt. The discrepancy from the documentation makes this a bug, so back-patch to all supported branches. Patch by me; thanks to Nathan Bossart for review. Discussion:
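
As a rough illustration of the fix described in the first two items above (CVE-2021-23214 and CVE-2021-23222), the following Python sketch models a reader that fills its buffer greedily and then refuses to continue if bytes are left over once the encryption request has been consumed. It is not PostgreSQL's code: BufferedConnection, negotiate_encryption, ProtocolViolation and the error text are hypothetical names made up for this example, and the "socket" is just an in-memory byte string. The only real protocol detail assumed is the 8-byte SSLRequest message.

```python
import struct

# SSLRequest in the frontend/backend protocol: Int32 length (8), Int32 code 80877103.
SSL_REQUEST = struct.pack("!II", 8, 80877103)


class ProtocolViolation(Exception):
    """Hypothetical error: cleartext bytes were buffered past the request."""


class BufferedConnection:
    """Toy stand-in for a socket reader that reads up to a bufferload at a time."""

    def __init__(self, incoming: bytes, bufsize: int = 8192):
        self._wire = incoming   # bytes still "on the wire" (assumption: no real socket)
        self._buf = b""         # bytes already read but not yet consumed
        self._bufsize = bufsize

    def _fill(self) -> None:
        # A greedy read may pull in bytes that follow the current message.
        chunk = self._wire[: self._bufsize]
        self._wire = self._wire[self._bufsize:]
        self._buf += chunk

    def read_exact(self, n: int) -> bytes:
        while len(self._buf) < n:
            if not self._wire:
                raise EOFError("connection closed mid-message")
            self._fill()
        data, self._buf = self._buf[:n], self._buf[n:]
        return data

    def pending(self) -> int:
        return len(self._buf)


def negotiate_encryption(conn: BufferedConnection) -> bytes:
    """Consume the encryption request, then insist that the buffer is empty."""
    request = conn.read_exact(len(SSL_REQUEST))
    if conn.pending() > 0:
        # Anything still buffered arrived unencrypted; treating it as part of
        # the encrypted session is exactly the bug being fixed.
        raise ProtocolViolation("unencrypted data received after encryption request")
    return request


if __name__ == "__main__":
    negotiate_encryption(BufferedConnection(SSL_REQUEST))  # clean request: accepted
    try:
        negotiate_encryption(BufferedConnection(SSL_REQUEST + b"Q injected"))
    except ProtocolViolation as exc:
        print("rejected:", exc)
```

The same check applies in both directions: the server looks at what followed the client's request, and libpq looks at what followed the server's yes-or-no reply.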

Robert Haas pushed:

  • Minimal fix for unterminated tar archive problem. Commit 23a1c6578c87fca0e361c4f5f9a07df5ae1f9858 improved pg_basebackup's ability to parse tar archives, but also arranged to parse them only when we need to make some modification to the contents of the archive. That's a problem, because the server doesn't actually terminate tar archives. When the new parsing logic was engaged, pg_basebackup would properly terminate the tar file, but when it was skipped, pg_basebackup would just write whatever it got from the server, meaning that the terminator was missing. Most versions of tar are willing to overlook the missing terminator, but the AIX buildfarm animals were not. Fix by inventing a new kind of bbstreamer that just blindly adds a terminator, and using it whenever we don't parse the tar archive. Discussion:

  • Have the server properly terminate tar archives. Earlier versions of PostgreSQL featured a version of pg_basebackup that wanted to edit tar archives but was too dumb to parse them properly. The server made things easier for the client by failing to add the two blocks of zero bytes that ought to end a tar file, leaving it up to the client to do that. But since commit 23a1c6578c87fca0e361c4f5f9a07df5ae1f9858, we don't need this hack any more, because pg_basebackup is now smarter and can parse tar files even if they are properly terminated! So change the server to always properly terminate the tar files. Older versions of pg_basebackup can't talk to new servers anyway, so there's no compatibility break. On the pg_basebackup side, we still need to add the terminating zero bytes if we're talking to an older server, but not when the server is v15+. Hopefully at some point we'll be able to remove some of this compatibility cruft, but it seems best to hang on to it for now. In passing, add a file header comment to bbstreamer_tar.c, to make it clearer what's going on here. Discussion:

  • More cleanup of 'ThisTimeLineID'. In XLogCtlData, rename the structure member ThisTimeLineID to InsertTimeLineID and update the comments to make clear that it's only expected to be set after recovery is complete. In StartupXLOG, replace the local variables ThisTimeLineID and PrevTimeLineID with new local variables replayTLI and newTLI. In the old scheme, ThisTimeLineID was the replay TLI until we created a new timeline, and after that the replay TLI was in PrevTimeLineID. Now, replayTLI is the TLI from which we last replayed WAL throughout the entire function, and newTLI is either that, or the new timeline created upon promotion. Remove some misleading comments from the comment block just above where recoveryTargetTimeLineGoal and friends are declared. They have become incorrect, not only because ThisTimeLineID as a variable is now gone, but also because the rmgr code does not care about ThisTimeLineID and has not since what used to be the TLI field in the page header was repurposed to store the page checksum. Add a comment to GetFlushRecPtr that it's only supposed to be used in normal running, and an assertion to verify that this is so. Per some ideas from Michael Paquier and some of my own. Review by Michael Paquier also. Discussion:

  • Fix thinko in assertion in basebackup.c. Commit 5a1007a5088cd6ddf892f7422ea8dbaef362372f tried to introduce an assertion that the block size was at least twice the size of a tar block, but I got the math wrong. My error was reported to me off-list.

  • Improve performance of pgarch_readyXlog() with many status files. Previously, the archive_status directory was scanned for each file to archive. When there are many status files, say because archive_command has been failing for a long time, these directory scans can get very slow. With this change, the archiver remembers several files to archive during each directory scan, speeding things up. To ensure timeline history files are archived as quickly as possible, XLogArchiveNotify() forces the archiver to do a new directory scan as soon as the .ready file for one is created. Nathan Bossart, per a long discussion involving many people. It is not clear to me exactly who out of all those people reviewed this particular patch. Discussion: Discussion:
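
The tar-termination items above turn on one detail of the format: an archive is supposed to end with two 512-byte blocks of zero bytes. The Python sketch below is only an illustration of that detail, built on the standard tarfile module. build_unterminated_tar and add_terminator are hypothetical helper names, not anything from pg_basebackup or bbstreamer, and the single-member archive is made-up sample data.

```python
import io
import tarfile

BLOCK = 512  # tar block size in bytes


def build_unterminated_tar(name: str, payload: bytes) -> bytes:
    """Build a one-member archive that lacks the end-of-archive marker."""
    info = tarfile.TarInfo(name=name)
    info.size = len(payload)
    header = info.tobuf(format=tarfile.USTAR_FORMAT)  # one 512-byte header block
    padding = b"\0" * (-len(payload) % BLOCK)          # pad the data to a full block
    return header + payload + padding


def add_terminator(archive: bytes) -> bytes:
    """Blindly append the two zero blocks, without parsing the archive."""
    return archive + b"\0" * (2 * BLOCK)


if __name__ == "__main__":
    raw = build_unterminated_tar("PG_VERSION", b"15\n")
    done = add_terminator(raw)
    print(len(raw), len(done))
    with tarfile.open(fileobj=io.BytesIO(done)) as tf:
        print(tf.getnames())
```

Appending the terminator needs no knowledge of the archive's contents, which is why a streamer that "just blindly adds a terminator" is enough when the archive is not otherwise being parsed.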

Amit Kapila pushed:

Fujii Masao pushed:

Michaël Paquier pushed:

  • Make some comments use the term "ProcSignal" for consistency. The surroundings in procsignal.c prefer using "ProcSignal" rather than "procsignal". Author: Bharath Rupireddy Discussion:

  • Improve error messages for some callers of XLogReadRecord(). A couple of code paths related to logical decoding (WAL sender, slot advancing, etc.) use XLogReadRecord(), feeding on error messages generated by walreader.c on a failure. All those messages have no context, making it harder to spot where an error could come from, even if these should not happen. This commit adds that context, as all the other callers of XLogReadRecord() do already. Reviewed-by: Kyotaro Horiguchi Discussion:

  • Clean up compilation warnings coming from PL/Perl with clang-12~. clang-12 has introduced -Wcompound-token-split-by-macro, which causes a large number of warnings when building PL/Perl because of its interactions with upstream Perl. This commit adds one -Wno to CFLAGS at ./configure time if the flag is supported by the compiler to silence all those warnings. Upstream perl has fixed this issue, but it is going to take some time before this is spread across the buildfarm, and we have noticed that some animals would be more useful with an extra -Werror to help with the detection of incorrect placeholders (see b0cf544), dangomushi being one. Reviewed-by: Tom Lane Discussion: Backpatch-through: 10

  • Fix buffer overrun in unicode string normalization with empty input. PostgreSQL 13 and newer versions are directly impacted through the SQL function normalize(), which would write one byte past its allocation when given an empty string as input and recomposing it with NFC or NFKC. Older versions (v10~v12) are not directly affected by this problem as the only code path using normalization is SASLprep in SCRAM authentication that forbids the case of an empty string, but let's make the code more robust anyway there so that any out-of-core callers of this function are covered. The solution chosen to fix this issue is simple, with the addition of a fast-exit path if the decomposed string is found as empty. This would only happen for an empty string as at its lowest level a codepoint would be decomposed as itself if it has no entry in the decomposition table or if it has a decomposition size of 0. Some tests are added to cover this issue in v13~. Note that an empty string has always been considered as normalized (grammar "IS NF[K]{C,D} NORMALIZED", through the SQL function is_normalized()) for all the operations allowed (NFC, NFD, NFKC and NFKD) since this feature was introduced in 2991ac5. This behavior is unchanged but some tests are added in v13~ to check that. I have also checked "make normalization-check" in src/common/unicode/, while on it (works in 13~, and breaks in older stable branches independently of this commit). The release notes should just mention this commit for v13~. Reported-by: Matthijs van der Vleuten Discussion: Backpatch-through: 10

  • Fix memory overrun when querying pg_stat_slru. pg_stat_get_slru() in pgstatfuncs.c would point to one element after the end of the array PgStat_SLRUStats when it finished scanning its entries. This had no direct consequences as no data from the extra memory area was read, but static analyzers would rightfully complain here. So let's be clean. While on it, this adds one regression test in the area reserved for system views. Reported-by: Alexander Kozhemyakin, via AddressSanitizer Author: Kyotaro Horiguchi Discussion: Backpatch-through: 13
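
The shape of the normalization fix above (decompose first, return immediately if the result is empty, recompose only otherwise) can be sketched in a few lines of Python. This is an analogy built on the standard unicodedata module, not a port of the C code in src/common; normalize_sketch and the _DECOMPOSE table are hypothetical names for this example, and Python has no buffer to overrun, so the fast exit here only mirrors the control flow.

```python
import unicodedata

# Each target form starts from a decomposition: canonical (NFD) or compatibility (NFKD).
_DECOMPOSE = {"NFC": "NFD", "NFD": "NFD", "NFKC": "NFKD", "NFKD": "NFKD"}


def normalize_sketch(form: str, text: str) -> str:
    """Normalize text to the given form, with a fast exit for empty input."""
    decomposed = unicodedata.normalize(_DECOMPOSE[form], text)
    if not decomposed:
        # Only an empty input decomposes to nothing, so there is nothing to recompose.
        return ""
    if form in ("NFD", "NFKD"):
        return decomposed
    # NFC and NFKC apply canonical composition on top of the decomposed string.
    return unicodedata.normalize(form, decomposed)


if __name__ == "__main__":
    print(repr(normalize_sketch("NFC", "")))
    print(normalize_sketch("NFC", "e\u0301") == "\u00e9")
    print(unicodedata.is_normalized("NFC", ""))
```

As in the commit message, the empty string counts as already normalized in every form, so returning it unchanged does not alter behavior.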

Peter Eisentraut pushed:

Jeff Davis pushed:

Álvaro Herrera pushed:

Peter Geoghegan pushed:

Noah Misch pushed:

Andrew Dunstan pushed:

Daniel Gustafsson pushed: