Quick Links

PostgreSQL Weekly News - April 11, 2021

Posted on 2021-04-12 by PWN

PWN

PostgreSQL Weekly News - April 11, 2021

Feature freeze for PostgreSQL 14 has arrived. Any new feature that could be in PostgreSQL 14 is in the git repository.

PostgreSQL Product News

AGE 0.4.0, a PostgreSQL extension that provides graph database functionality, released. https://github.com/apache/incubator-age/releases/tag/0.4.0

PostgreSQL Jobs for April

https://archives.postgresql.org/pgsql-jobs/2021-04/

PostgreSQL in the News

Planet PostgreSQL: https://planet.postgresql.org/

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm PST8PDT to david@fetter.org.

Applied Patches

Tom Lane pushed:

Fix more confusion in SP-GiST. spg_box_quad_leaf_consistent unconditionally returned the leaf datum as leafValue, even though in its usage for poly_ops that value is of completely the wrong type. In versions before 12, that was harmless because the core code did nothing with leafValue in non-index-only scans ... but since commit 2a6368343, if we were doing a KNN-style scan, spgNewHeapItem would unconditionally try to copy the value using the wrong datatype parameters. Said copying is a waste of time and space if we're not going to return the data, but it accidentally failed to fail until I fixed the datatype confusion in ac9099fc1. Hence, change spgNewHeapItem to not copy the datum unless we're actually going to return it later. This saves cycles and dodges the question of whether lossy opclasses are returning the right type. Also change spg_box_quad_leaf_consistent to not return data that might be of the wrong type, as insurance against somebody introducing a similar bug into the core code in future. It seems like a good idea to back-patch these two changes into v12 and v13, although I'm afraid to change spgNewHeapItem's mistaken idea of which datatype to use in those branches. Per buildfarm results from ac9099fc1. Discussion: https://postgr.es/m/3728741.1617381471@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/dfc843d465689d2c2af8b0e01c66c51ccaae2343
Support INCLUDE'd columns in SP-GiST. Not much to say here: does what it says on the tin. We steal a previously-always-zero bit from the nextOffset field of leaf index tuples in order to track whether there is a nulls bitmap. Otherwise it works about like included columns in other index types. Pavel Borisov, reviewed by Andrey Borodin and Anastasia Lubennikova, and rather heavily editorialized on by me Discussion: https://postgr.es/m/CALT9ZEFi-vMp4faht9f9Junb1nO3NOSjhpxTmbm1UGLMsLqiEQ@mail.gmail.com https://git.postgresql.org/pg/commitdiff/09c1c6ab4bc5764dd69c53ccfd43b2060b1fd090
Clean up treatment of missing default and CHECK-constraint records. Andrew Gierth reported that it's possible to crash the backend if no pg_attrdef record is found to match an attribute that has atthasdef set. AttrDefaultFetch warns about this situation, but then leaves behind a relation tupdesc that has null "adbin" pointer(s), which most places don't guard against. We considered promoting the warning to an error, but throwing errors during relcache load is pretty drastic: it effectively locks one out of using the relation at all. What seems better is to leave the load-time behavior as a warning, but then throw an error in any code path that wants to use a default and can't find it. This confines the error to a subset of INSERT/UPDATE operations on the table, and in particular will at least allow a pg_dump to succeed. Also, we should fix AttrDefaultFetch to not leave any null pointers in the tupdesc, because that just creates an untested bug hazard. While at it, apply the same philosophy of "warn at load, throw error only upon use of the known-missing info" to CHECK constraints. CheckConstraintFetch is very nearly the same logic as AttrDefaultFetch, but for reasons lost in the mists of time, it was throwing ERROR for the same cases that AttrDefaultFetch treats as WARNING. Make the two functions more nearly alike. In passing, get rid of potentially-O(N^2) loops in equalTupleDesc by making AttrDefaultFetch sort the entries after fetching them, so that equalTupleDesc can assume that entries in two equal tupdescs must be in matching order. (CheckConstraintFetch already was sorting CHECK constraints, but equalTupleDesc hadn't been told about it.) There's some argument for back-patching this, but with such a small number of field reports, I'm content to fix it in HEAD. Discussion: https://postgr.es/m/87pmzaq4gx.fsf@news-spur.riddles.org.uk https://git.postgresql.org/pg/commitdiff/091e22b2e673e3e8480abd68fbb827c5d6979615
Fix missing #include in nodeResultCache.h. Per cpluspluscheck. https://git.postgresql.org/pg/commitdiff/789d81de8a50d9a23cc1a3b8ea5d839246020689
Postpone some stuff out of ExecInitModifyTable. Arrange to do some things on-demand, rather than immediately during executor startup, because there's a fair chance of never having to do them at all: * Don't open result relations' indexes until needed. * Don't initialize partition tuple routing, nor the child-to-root tuple conversion map, until needed. This wins in UPDATEs on partitioned tables when only some of the partitions will actually receive updates; with larger partition counts the savings is quite noticeable. Also, we can remove some sketchy heuristics in ExecInitModifyTable about whether to set up tuple routing. Also, remove execPartition.c's private hash table tracking which partitions were already opened by the ModifyTable node. Instead use the hash added to ModifyTable itself by commit 86dc90056. To allow lazy computation of the conversion maps, we now set ri_RootResultRelInfo in all child ResultRelInfos. We formerly set it only in some, not terribly well-defined, cases. This has user-visible side effects in that now more error messages refer to the root relation instead of some partition (and provide error data in the root's column order, too). It looks to me like this is a strict improvement in consistency, so I don't have a problem with the output changes visible in this commit. Extracted from a larger patch, which seemed to me to be too messy to push in one commit. Amit Langote, reviewed at different times by Heikki Linnakangas and myself Discussion: https://postgr.es/m/CA+HiwqG7ZruBmmih3wPsBZ4s0H2EhywrnXEduckY5Hr3fWzPWA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/c5b7ba4e67aeb5d6f824b74f94114d99ed6e42b7
Postpone some more stuff out of ExecInitModifyTable. Delay creation of the projections for INSERT and UPDATE tuples until they're needed. This saves a pretty fair amount of work when only some of the partitions are actually touched. The logic associated with identifying junk columns in UPDATE/DELETE is moved to another loop, allowing removal of one loop over the target relations; but it didn't actually change at all. Extracted from a larger patch, which seemed to me to be too messy to push in one commit. Amit Langote, reviewed at different times by Heikki Linnakangas and myself Discussion: https://postgr.es/m/CA+HiwqG7ZruBmmih3wPsBZ4s0H2EhywrnXEduckY5Hr3fWzPWA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/a1115fa0782378a8238045d238ae70cac36be8ae
Tighten up allowed names for custom GUC parameters. Formerly we were pretty lax about what a custom GUC's name could be; so long as it had at least one dot in it, we'd take it. However, corner cases such as dashes or equal signs in the name would cause various bits of functionality to misbehave. Rather than trying to make the world perfectly safe for that, let's just require that custom names look like "identifier.identifier", where "identifier" means something that scan.l would accept without double quotes. Along the way, this patch refactors things slightly in guc.c so that find_option() is responsible for reporting GUC-not-found cases, allowing removal of duplicative code from its callers. Per report from Hubert Depesz Lubaczewski. No back-patch, since the consequences of the problem don't seem to warrant changing behavior in stable branches. Discussion: https://postgr.es/m/951335.1612910077@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/3db826bd55cd1df0dd8c3d811f8e5b936d7ba1e4
Comment cleanup for a1115fa07. Amit Langote Discussion: https://postgr.es/m/CA+HiwqEcawatEaUh1uTbZMEZTJeLzbroRTz9_X9Z5CFjTWJkhw@mail.gmail.com https://git.postgresql.org/pg/commitdiff/0d46771eaaf77ad08555cf34e421234d5943edfa
Remove channel binding requirement from clientcert=verify-full test. This fails on older OpenSSL versions that lack channel binding support. Since that feature is not essential to this test case, just remove it, instead of complicating matters. Per buildfarm. Jacob Champion Discussion: https://postgr.es/m/fa8dbbb58c20b1d1adf0082769f80d5466eaf485.camel@vmware.com https://git.postgresql.org/pg/commitdiff/a282ee68a070a8adc6e6d45e8e643769c587ecc3
Allow psql's \df and \do commands to specify argument types. When dealing with overloaded function or operator names, having to look through a long list of matches is tedious. Let's extend these commands to allow specification of (input) argument types to let such results be trimmed down. Each additional argument is treated the same as the pattern argument of \dT and matched against the appropriate argument's type name. While at it, fix \dT (and these new options) to recognize the usual notation of "foo[]" for "the array type over foo", and to handle the special abbreviations allowed by the backend grammar, such as "int" for "integer". Greg Sabino Mullane, revised rather significantly by me Discussion: https://postgr.es/m/CAKAnmmLF9Hhu02N+s7uAyLc5J1xZReg72HQUoiKhNiJV3_jACQ@mail.gmail.com https://git.postgresql.org/pg/commitdiff/a3027e1e7f3d3a107ecd72d3b4d6333ea2aab6a5
Suppress uninitialized-variable warning. Several buildfarm critters that don't usually produce such warnings are complaining about e717a9a18. I think it's actually safe, but move initialization to silence the warning. https://git.postgresql.org/pg/commitdiff/01add89454d5dc289ed3126d5de03169bdeff41b
Add support for tab-completion of type arguments in \df, \do. Oversight in commit a3027e1e7. https://git.postgresql.org/pg/commitdiff/d1fcbde579d440c35023baa0de7ebf27f644a314
Doc: update documentation of check_function_bodies. Adjust docs and description string to note that check_function_bodies applies to procedures too. (In hindsight it should have been named check_routine_bodies, but it seems too late for that now.) Daniel Westermann Discussion: https://postgr.es/m/GV0P278MB04834A9EB9A74B036DC7CE49D2739@GV0P278MB0483.CHEP278.PROD.OUTLOOK.COM https://git.postgresql.org/pg/commitdiff/07b76833b15163c6574ea2c12d05d9a0800665e2
Fix failure of xlogprefetch.h to include all prerequisite headers. Per cpluspluscheck. https://git.postgresql.org/pg/commitdiff/99964c4ade468c35a3f6e248a2380a1ff67d9cd3
Fix uninitialized variable from commit a4d75c86b. The path for *exprs != NIL would misbehave, and likely crash, since pull_varattnos expects its last argument to be valid at call. Found by Coverity --- we have no coverage of this path in the regression tests. https://git.postgresql.org/pg/commitdiff/9cb92334092fa75afc62a71243bbc1f4612ecfa4
Add macro PGWARNING, and make PGERROR available on all platforms. We'd previously noted the need for coping with Windows headers that provide some other definition of macro "ERROR" than elog.h does. It turns out that R also wants to define ERROR, and WARNING too. PL/R has been working around this in a hacky way that broke when we recently changed the numeric value of ERROR. To let them have a more future-proof solution, provide an alternate macro PGWARNING for WARNING, and make PGERROR visible always, not only when #ifdef WIN32. Discussion: https://postgr.es/m/CADK3HHK6iMChd1yoOqssxBn5Z14Zar8Ztr3G-N_fuG7F8YTP3w@mail.gmail.com https://git.postgresql.org/pg/commitdiff/d7cff12c4c035b7cf12bb8454824f48f13018730

Michaël Paquier pushed:

Refactor all TAP test suites doing connection checks. This commit refactors more TAP tests to adapt with the recent introduction of connect_ok() and connect_fails() in PostgresNode, introduced by 0d1a3343. This changes the following test suites to use the same code paths for connection checks: - Kerberos - LDAP - SSL - Authentication Those routines are extended to be able to handle optional parameters that are set depending on each suite's needs, as of: - custom SQL query. - expected stderr matching pattern. - expected stdout matching pattern. The new design is extensible with more parameters, and there are some plans for those routines in the future with checks based on the contents of the backend logs. Author: Jacob Champion, Michael Paquier Discussion: https://postgr.es/m/d17b919e27474abfa55d97786cb9cfadfe2b59e9.camel@vmware.com https://git.postgresql.org/pg/commitdiff/c50624cdd248c13b4ba199f95e24c88d2cc8a097
Fix typo in collationcmds.c. Introduced by 51e225d. Author: Anton Voloshin Discussion: https://postgr.es/m/05477da0-703c-7de7-998c-5879738e8f39@postgrespro.ru https://git.postgresql.org/pg/commitdiff/9f6f1f9b8e61f9ce47e1936fc68c21a4a8d6722c
Change PostgresNode::connect_fails() to never send down queries. This type of failure is similar to what has been fixed in c757a3da, where an authentication failure combined with psql pushing a command down its communication pipe causes a test failure. This routine is designed to fail, so sending a query has little sense anyway. Per buildfarm members gaur and hoverfly, based on an analysis and fix from Tom Lane. Discussion: https://postgr.es/m/513200.1617634642@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/6d41dd045ada28ee14182112fc4cf50fb3879d28
Fix some issues with SSL and Kerberos tests. The recent refactoring done in c50624c accidentally broke a portion of the kerberos tests checking after a query, so add its functionality back. Some inactive SSL tests had their arguments in an incorrect order, which would cause them to fail if they were to run. Author: Jacob Champion Discussion: https://postgr.es/m/4f5b0b3dc0b6fe9ae6a34886b4d4000f61eb567e.camel@vmware.com https://git.postgresql.org/pg/commitdiff/5a71964a832febfee23cedc3bb354049d6ca78a7
Add some information about authenticated identity via log_connections. The "authenticated identity" is the string used by an authentication method to identify a particular user. In many common cases, this is the same as the PostgreSQL username, but for some third-party authentication methods, the identifier in use may be shortened or otherwise translated (e.g. through pg_ident user mappings) before the server stores it. To help administrators see who has actually interacted with the system, this commit adds the capability to store the original identity when authentication succeeds within the backend's Port, and generates a log entry when log_connections is enabled. The log entries generated look something like this (where a local user named "foouser" is connecting to the database as the database user called "admin"): LOG: connection received: host=[local] LOG: connection authenticated: identity="foouser" method=peer (/data/pg_hba.conf:88) LOG: connection authorized: user=admin database=postgres application_name=psql Port->authn_id is set according to the authentication method: bsd: the PostgreSQL username (aka the local username) cert: the client's Subject DN gss: the user principal ident: the remote username ldap: the final bind DN pam: the PostgreSQL username (aka PAM username) password (and all pw-challenge methods): the PostgreSQL username peer: the peer's pw_name radius: the PostgreSQL username (aka the RADIUS username) sspi: either the down-level (SAM-compatible) logon name, if compat_realm=1, or the User Principal Name if compat_realm=0 The trust auth method does not set an authenticated identity. Neither does clientcert=verify-full. Port->authn_id could be used for other purposes, like a superuser-only extra column in pg_stat_activity, but this is left as future work. PostgresNode::connect_{ok,fails}() have been modified to let tests check the backend log files for required or prohibited patterns, using the new log_like and log_unlike parameters. This uses a method based on a truncation of the existing server log file, like issues_sql_like(). Tests are added to the ldap, kerberos, authentication and SSL test suites. Author: Jacob Champion Reviewed-by: Stephen Frost, Magnus Hagander, Tom Lane, Michael Paquier Discussion: https://postgr.es/m/c55788dd1773c521c862e8e0dddb367df51222be.camel@vmware.com https://git.postgresql.org/pg/commitdiff/9afffcb833d3c5e59a328a2af674fac7e7334fc1
Remove redundant memset(0) calls for page init of some index AMs. Bloom, GIN, GiST and SP-GiST rely on PageInit() to initialize the contents of a page, and this routine fills entirely a page with zeros for a size of BLCKSZ, including the special space. Those index AMs have been using an extra memset() call to fill with zeros the special page space, or even the whole page, which is not necessary as PageInit() already does this work, so let's remove them. GiST was not doing this extra call, but has commented out a system call that did so since 6236991. While on it, remove one MAXALIGN() for SP-GiST as PageInit() takes care of that. This makes the whole page initialization logic more consistent across all index AMs. Author: Bharath Rupireddy Reviewed-by: Vignesh C, Mahendra Singh Thalor Discussion: https://postgr.es/m/CALj2ACViOo2qyaPT7krWm4LRyRTw9kOXt+g6PfNmYuGA=YHj9A@mail.gmail.com https://git.postgresql.org/pg/commitdiff/4c0239cb7a7775e3183cb575e62703d71bf3302d
Fix some failures with connection tests on Windows hosts. The truncation of the log file, that this set of tests relies on to make sure that a connection attempt matches with its expected backend log pattern, fails, as reported by buildfarm member fairywren. Instead of a truncation, do a rotation of the log file and restart the node. This will ensure that the connection attempt data is unique for each test. Discussion: https://postgr.es/m/YG05nCI8x8B+Ad3G@paquier.xyz https://git.postgresql.org/pg/commitdiff/c7578fa64019f27edc31261ea49066a4b2569a6c
Fix typos and grammar in documentation and code comments. Comment fixes are applied on HEAD, and documentation improvements are applied on back-branches where needed. Author: Justin Pryzby Discussion: https://postgr.es/m/20210408164008.GJ6592@telsasoft.com Backpatch-through: 9.6 https://git.postgresql.org/pg/commitdiff/609b0652af00374b89411ea2613fd5bb92bca92c

Peter Eisentraut pushed:

Renumber cursor option flags. Move the planner-control flags up so that there is more room for parse options. Some pending patches need some room there, so do this renumbering separately so that there is less potential for conflicts. https://git.postgresql.org/pg/commitdiff/a63dd8afe2b859b853d857cd8a0392ca1e04ab6c
Change return type of EXTRACT to numeric. The previous implementation of EXTRACT mapped internally to date_part(), which returned type double precision (since it was implemented long before the numeric type existed). This can lead to imprecise output in some cases, so returning numeric would be preferrable. Changing the return type of an existing function is a bit risky, so instead we do the following: We implement a new set of functions, which are now called "extract", in parallel to the existing date_part functions. They work the same way internally but use numeric instead of float8. The EXTRACT construct is now mapped by the parser to these new extract functions. That way, dumps of views etc. from old versions (which would use date_part) continue to work unchanged, but new uses will map to the new extract functions. Additionally, the reverse compilation of EXTRACT now reproduces the original syntax, using the new mechanism introduced in 40c24bfef92530bd846e111c1742c2a54441c62c. The following minor changes of behavior result from the new implementation: - The column name from an isolated EXTRACT call is now "extract" instead of "date_part". - Extract from date now rejects inappropriate field names such as HOUR. It was previously mapped internally to extract from timestamp, so it would silently accept everything appropriate for timestamp. - Return values when extracting fields with possibly fractional values, such as second and epoch, now have the full scale that the value has internally (so, for example, '1.000000' instead of just '1'). Reported-by: Petr Fedorov petr.fedorov@phystech.edu Reviewed-by: Tom Lane tgl@sss.pgh.pa.us Discussion: https://www.postgresql.org/message-id/flat/42b73d2d-da12-ba9f-570a-420e0cce19d9@phystech.edu https://git.postgresql.org/pg/commitdiff/a2da77cdb4661826482ebf2ddba1f953bc74afe4
ALTER SUBSCRIPTION ... ADD/DROP PUBLICATION. At present, if we want to update publications in a subscription, we can use SET PUBLICATION. However, it requires supplying all publications that exists and the new publications. If we want to add new publications, it's inconvenient. The new syntax only supplies the new publications. When the refresh is true, it only refreshes the new publications. Author: Japin Li japinli@hotmail.com Author: Bharath Rupireddy bharath.rupireddyforpostgres@gmail.com Discussion: https://www.postgresql.org/message-id/flat/MEYP282MB166939D0D6C480B7FBE7EFFBB6BC0@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM https://git.postgresql.org/pg/commitdiff/82ed7748b710e3ddce3f7ebc74af80fe4869492f
psql: Show all query results by default. Previously, psql printed only the last result if a command string returned multiple result sets. Now it prints all of them. The previous behavior can be obtained by setting the psql variable SHOW_ALL_RESULTS to off. Author: Fabien COELHO coelho@cri.ensmp.fr Reviewed-by: "Iwata, Aya" iwata.aya@jp.fujitsu.com Reviewed-by: Daniel Verite daniel@manitou-mail.org Reviewed-by: Peter Eisentraut peter.eisentraut@2ndquadrant.com Reviewed-by: Kyotaro Horiguchi horikyota.ntt@gmail.com Reviewed-by: vignesh C vignesh21@gmail.com Discussion: https://www.postgresql.org/message-id/flat/alpine.DEB.2.21.1904132231510.8961@lancre https://git.postgresql.org/pg/commitdiff/3a5130672296ed4e682403a77a9a3ad3d21cef75
Message improvement. The previous wording contained a superfluous comma. Adjust phrasing for grammatical correctness and clarity. https://git.postgresql.org/pg/commitdiff/0b5e8245283eef67e88fb5380836cdc2c743d848
Fix use of cursor sensitivity terminology. Documentation and comments in code and tests have been using the terms sensitive/insensitive cursor incorrectly relative to the SQL standard. (Cursor sensitivity is only relevant for changes made in the same transaction as the cursor, not for concurrent changes in other sessions.) Moreover, some of the behavior of PostgreSQL is incorrect according to the SQL standard, confusing the issue further. (WHERE CURRENT OF changes are not visible in insensitive cursors, but they should be.) This change corrects the terminology and removes the claim that sensitive cursors are supported. It also adds a test case that checks the insensitive behavior in a "correct" way, using a change command not using WHERE CURRENT OF. Finally, it adds the ASENSITIVE cursor option to select the default asensitive behavior, per SQL standard. There are no changes to cursor behavior in this patch. Discussion: https://www.postgresql.org/message-id/flat/96ee8b30-9889-9e1b-b053-90e10c050e85%40enterprisedb.com https://git.postgresql.org/pg/commitdiff/dd13ad9d39a1ba41cf329b6fe408b49be57c7b88
doc: Improve wording. Discussion: https://www.postgresql.org/message-id/flat/161626776179.652.11944895442156126506%40wrigleys.postgresql.org https://git.postgresql.org/pg/commitdiff/4560e0acdafd57f3ba109b98e15ac047798d960c
libpq: Set Server Name Indication (SNI) for SSL connections. By default, have libpq set the TLS extension "Server Name Indication" (SNI). This allows an SNI-aware SSL proxy to route connections. (This requires a proxy that is aware of the PostgreSQL protocol, not just any SSL proxy.) In the future, this could also allow the server to use different SSL certificates for different host specifications. (That would require new server functionality. This would be the client-side functionality for that.) Since SNI makes the host name appear in cleartext in the network traffic, this might be undesirable in some cases. Therefore, also add a libpq connection option "sslsni" to turn it off. Discussion: https://www.postgresql.org/message-id/flat/7289d5eb-62a5-a732-c3b9-438cee2cb709%40enterprisedb.com https://git.postgresql.org/pg/commitdiff/5c55dc8b47338e72a4e598c155d2048d756fd10e
SQL-standard function body. This adds support for writing CREATE FUNCTION and CREATE PROCEDURE statements for language SQL with a function body that conforms to the SQL standard and is portable to other implementations. Instead of the PostgreSQL-specific AS $$ string literal $$ syntax, this allows writing out the SQL statements making up the body unquoted, either as a single statement: CREATE FUNCTION add(a integer, b integer) RETURNS integer LANGUAGE SQL RETURN a + b; or as a block CREATE PROCEDURE insert_data(a integer, b integer) LANGUAGE SQL BEGIN ATOMIC INSERT INTO tbl VALUES (a); INSERT INTO tbl VALUES (b); END; The function body is parsed at function definition time and stored as expression nodes in a new pg_proc column prosqlbody. So at run time, no further parsing is required. However, this form does not support polymorphic arguments, because there is no more parse analysis done at call time. Dependencies between the function and the objects it uses are fully tracked. A new RETURN statement is introduced. This can only be used inside function bodies. Internally, it is treated much like a SELECT statement. psql needs some new intelligence to keep track of function body boundaries so that it doesn't send off statements when it sees semicolons that are inside a function body. Tested-by: Jaime Casanova jcasanov@systemguards.com.ec Reviewed-by: Julien Rouhaud rjuju123@gmail.com Discussion: https://www.postgresql.org/message-id/flat/1c11f1eb-f00c-43b7-799d-2d44132c02d7@2ndquadrant.com https://git.postgresql.org/pg/commitdiff/e717a9a18b2e34c9c40e5259ad4d31cd7e420750
Update Unicode data to CLDR 39. https://git.postgresql.org/pg/commitdiff/2e0e0666790e48cec716d4947f89d067ef53490c
doc: Prefer explicit JOIN syntax over old implicit syntax in tutorial. Update src/tutorial/basics.source to match. Author: Jürgen Purtz juergen@purtz.de Reviewed-by: Thomas Munro thomas.munro@gmail.com Reviewed-by: "David G. Johnston" david.g.johnston@gmail.com Discussion: https://www.postgresql.org/message-id/flat/158996922318.7035.10603922579567326239@wrigleys.postgresql.org https://git.postgresql.org/pg/commitdiff/fb310f17812e7321599845a29af2f36c7f1191c3
Add ORDER BY to some regression test queries. Apparently, an unrelated patch introduced some variation on the build farm. Reported-by: Magnus Hagander magnus@hagander.net https://git.postgresql.org/pg/commitdiff/7e3c54168d9ec058cb3c9d47f8105b1d32dc8ceb
doc: Additional documentation for date_bin. Reported-by: Justin Pryzby pryzby@telsasoft.com Author: John Naylor john.naylor@enterprisedb.com Discussion: https://www.postgresql.org/message-id/CAFBsxsEEm1nuhZmfVQxvu_i3nDDEuvNJ_WMrDo9whFD_jusp-A@mail.gmail.com https://git.postgresql.org/pg/commitdiff/49fb4e6b249029e75ccc6b490d76f15cd53d553b
doc: Fix man page whitespace issues. Whitespace between tags is significant, and in some cases it creates extra vertical space in man pages. The fix is to remove some newlines in the markup. https://git.postgresql.org/pg/commitdiff/9cae39b8e6a53decb37cce22852bb2e6d0e3f5d3
Improve behavior of date_bin with origin in the future. Currently, when the origin is after the input, the result is the timestamp at the end of the bin, rather than the beginning as expected. This puts the result consistently at the beginning of the bin. Author: John Naylor john.naylor@enterprisedb.com Discussion: https://www.postgresql.org/message-id/CAFBsxsGjLDxQofRfH+d4KSAXxPf3MMevUG7s6EDfdBOvHLDLjw@mail.gmail.com https://git.postgresql.org/pg/commitdiff/496e58bb0e5e939e6ed5839c92b05e3ab11b54bb

Álvaro Herrera pushed:

Align some terms in arch-dev.sgml to glossary. This mostly adds links to the glossary to the existing text, instead of using <firstterm>. Heikki left this out of 29ad6595ef7f out of stylistic concerns; these have since been addressed. Author: Jürgen Purtz juergen@purtz.de Discussion: https://postgr.es/m/67d7240f-8596-83fc-5e15-af06c128a0f5@purtz.de https://git.postgresql.org/pg/commitdiff/6734e806958c4ebd253adb30b255983981818710
Repair find_inheritance_children with no active snapshot. When working on a scan with only a catalog snapshot, we may not have an ActiveSnapshot set. If we were to come across a detached partition, that would cause a crash. Fix by only ignoring detached partitions when there's an active snapshot. https://git.postgresql.org/pg/commitdiff/4131f755d548f74eba56285dc674f1f26e4ed6b4
autovacuum: handle analyze for partitioned tables. Previously, autovacuum would completely ignore partitioned tables, which is not good regarding analyze -- failing to analyze those tables means poor plans may be chosen. Make autovacuum aware of those tables by propagating "changes since analyze" counts from the leaf partitions up the partitioning hierarchy. This also introduces necessary reloptions support for partitioned tables (autovacuum_enabled, autovacuum_analyze_scale_factor, autovacuum_analyze_threshold). It's unclear how best to document this aspect. Author: Yuzuko Hosoya yuzukohosoya@gmail.com Reviewed-by: Kyotaro Horiguchi horikyota.ntt@gmail.com Reviewed-by: Tomas Vondra tomas.vondra@enterprisedb.com Reviewed-by: Álvaro Herrera alvherre@alvh.no-ip.org Discussion: https://postgr.es/m/CAKkQ508_PwVgwJyBY=0Lmkz90j8CmWNPUxgHvCUwGhMrouz6UA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/0827e8af70f4653ba17ed773f123a60eadd9f9c9
Set pg_class.reltuples for partitioned tables. When commit 0827e8af70f4 added auto-analyze support for partitioned tables, it included code to obtain reltuples for the partitioned table as a number of catalog accesses to read pg_class.reltuples for each partition. That's not only very inefficient, but also problematic because autovacuum doesn't hold any locks on any of those tables -- and doesn't want to. Replace that code with a read of pg_class.reltuples for the partitioned table, and make sure ANALYZE and TRUNCATE properly maintain that value. I found no code that would be affected by the change of relpages from zero to non-zero for partitioned tables, and no other code that should be maintaining it, but if there is, hopefully it'll be an easy fix. Per buildfarm. Author: Álvaro Herrera alvherre@alvh.no-ip.org Reviewed-by: Zhihong Yu zyu@yugabyte.com Discussion: https://postgr.es/m/1823909.1617862590@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/0e69f705cc1a3df273b38c9883fb5765991e04fe
Document ANALYZE storage parameters for partitioned tables. Commit 0827e8af70f4 added parameters for autovacuum to support partitioned tables, but didn't add any docs. Add them. Discussion: https://postgr.es/m/20210408213051.GL6592@telsasoft.com https://git.postgresql.org/pg/commitdiff/41badeaba8beee7648ebe7923a41c04f1f3cb302
Suppress length of Notice/Error msgs in PQtrace regress mode. A (relatively minor) annoyance of ErrorResponse/NoticeResponse messages as printed by PQtrace() is that their length might vary when we move error messages from one source file to another, one function to another, or even when their location line numbers change number of digits. To avoid having to adjust expected files for some tests, make the regress mode of PQtrace() suppress the length word of NoticeResponse and ErrorResponse messages. Discussion: https://postgr.es/m/20210402023010.GA13563@alvherre.pgsql Reviewed-by: Tom Lane tgl@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/e7e341409a3d85aba4cf754ba9cf722a4d8e6676

Fujii Masao pushed:

Shut down transaction tracking at startup process exit. Maxim Orlov reported that the shutdown of standby server could result in the following assertion failure. The cause of this issue was that, when the shutdown caused the startup process to exit, recovery-time transaction tracking was not shut down even if it's already initialized, and some locks the tracked transactions were holding could not be released. At this situation, if other process was invoked and the PGPROC entry that the startup process used was assigned to it, it found such unreleased locks and caused the assertion failure, during the initialization of it. TRAP: FailedAssertion("SHMQueueEmpty(&(MyProc->myProcLocks[i]))" This commit fixes this issue by making the startup process shut down transaction tracking and release all locks, at the exit of it. Back-patch to all supported branches. Reported-by: Maxim Orlov Author: Fujii Masao Reviewed-by: Maxim Orlov Discussion: https://postgr.es/m/ad4ce692cc1d89a093b471ab1d969b0b@postgrespro.ru https://git.postgresql.org/pg/commitdiff/ad8b674922eb70dc5cd02951dd82fe2c4c37c80a
Add function to log the memory contexts of specified backend process. Commit 3e98c0bafb added pg_backend_memory_contexts view to display the memory contexts of the backend process. However its target process is limited to the backend that is accessing to the view. So this is not so convenient when investigating the local memory bloat of other backend process. To improve this situation, this commit adds pg_log_backend_memory_contexts() function that requests to log the memory contexts of the specified backend process. This information can be also collected by calling MemoryContextStats(TopMemoryContext) via a debugger. But this technique cannot be used in some environments because no debugger is available there. So, pg_log_backend_memory_contexts() allows us to see the memory contexts of specified backend more easily. Only superusers are allowed to request to log the memory contexts because allowing any users to issue this request at an unbounded rate would cause lots of log messages and which can lead to denial of service. On receipt of the request, at the next CHECK_FOR_INTERRUPTS(), the target backend logs its memory contexts at LOG_SERVER_ONLY level, so that these memory contexts will appear in the server log but not be sent to the client. It logs one message per memory context. Because if it buffers all memory contexts into StringInfo to log them as one message, which may require the buffer to be enlarged very much and lead to OOM error since there can be a large number of memory contexts in a backend. When a backend process is consuming huge memory, logging all its memory contexts might overrun available disk space. To prevent this, now this patch limits the number of child contexts to log per parent to 100. As with MemoryContextStats(), it supposes that practical cases where the log gets long will typically be huge numbers of siblings under the same parent context; while the additional debugging value from seeing details about individual siblings beyond 100 will not be large. There was another proposed patch to add the function to return the memory contexts of specified backend as the result sets, instead of logging them, in the discussion. However that patch is not included in this commit because it had several issues to address. Thanks to Tatsuhito Kasahara, Andres Freund, Tom Lane, Tomas Vondra, Michael Paquier, Kyotaro Horiguchi and Zhihong Yu for the discussion. Bump catalog version. Author: Atsushi Torikoshi Reviewed-by: Kyotaro Horiguchi, Zhihong Yu, Fujii Masao Discussion: https://postgr.es/m/0271f440ac77f2a4180e0e56ebd944d1@oss.nttdata.com https://git.postgresql.org/pg/commitdiff/43620e328617c1f41a2a54c8cee01723064e3ffa
Fix typo in pgstat.c. Introduced by 9868167500. Author: Vignesh C Discussion: https://postgr.es/m/CALDaNm1DqgaLBAJrtGznKk1sR1mH-augmp7LfGvxWwTUhah+rg@mail.gmail.com https://git.postgresql.org/pg/commitdiff/f5d94e405e17a49487672316610630be2f9d0bb7
Stop archive recovery if WAL generated with wal_level=minimal is found. Previously if hot standby was enabled, archive recovery exited with an error when it found WAL generated with wal_level=minimal. But if hot standby was disabled, it just reported a warning and continued in that case. Which could lead to data loss or errors during normal operation. A warning was emitted, but users could easily miss that and not notice this serious situation until they encountered the actual errors. To improve this situation, this commit changes archive recovery so that it exits with FATAL error when it finds WAL generated with wal_level=minimal whatever the setting of hot standby. This enables users to notice the serious situation soon. The FATAL error is thrown if archive recovery starts from a base backup taken before wal_level is changed to minimal. When archive recovery exits with the error, if users have a base backup taken after setting wal_level to higher than minimal, they can recover the database by starting archive recovery from that newer backup. But note that if such backup doesn't exist, there is no easy way to complete archive recovery, which may make the database server unstartable and users may lose whole database. The commit adds the note about this risk into the document. Even in the case of unstartable database server, previously by just disabling hot standby users could avoid the error during archive recovery, forcibly start up the server and salvage data from it. But note that this commit makes this procedure unavailable at all. Author: Takamichi Osumi Reviewed-by: Laurenz Albe, Kyotaro Horiguchi, David Steele, Fujii Masao Discussion: https://postgr.es/m/OSBPR01MB4888CBE1DA08818FD2D90ED8EDF90@OSBPR01MB4888.jpnprd01.prod.outlook.com https://git.postgresql.org/pg/commitdiff/9de9294b0c4dac77edb80f029648afca79d14653
postgres_fdw: Allow partitions specified in LIMIT TO to be imported. Commit f49bcd4ef3 disallowed postgres_fdw to import table partitions. Because all data can be accessed through the partitioned table which is the root of the partitioning hierarchy, importing only partitioned table should allow access to all the data without creating extra objects. This is a reasonable default when importing a whole schema. But there may be the case where users want to explicitly import one of a partitioned tables' partitions. For that use case, this commit allows postgres_fdw to import tables or foreign tables which are partitions of some other table only when they are explicitly specified in LIMIT TO clause. It doesn't change the behavior that any partitions not specified in LIMIT TO are automatically excluded in IMPORT FOREIGN SCHEMA command. Author: Matthias van de Meent Reviewed-by: Bernd Helmle, Amit Langote, Michael Paquier, Fujii Masao Discussion: https://postgr.es/m/CAEze2Whwg4i=mzApMe+PXxCEfgoZmHGqdqQFW7J4bmj_5p6t1A@mail.gmail.com https://git.postgresql.org/pg/commitdiff/a3740c48eb2f91663c7c06c948dfcfb6493d2588
Fix test added by commit 9de9294b0c. The buildfarm members "drongo" and "fairywren" reported that the regression test (024_archive_recovery.pl) added by commit 9de9294b0c failed. The cause of this failure is that the test calls $node->init() without "allows_streaming => 1" and which doesn't add pg_hba.conf entry for TCP/IP connection from pg_basebackup. This commit fixes the issue by specifying "allows_streaming => 1" when calling $node->init(). Author: Fujii Masao Discussion: https://postgr.es/m/3cc3909d-f779-7a74-c201-f1f7f62c7497@oss.nttdata.com https://git.postgresql.org/pg/commitdiff/8ee9b662daa6d51b54d21ec274f22a218462ad2d
Allow TRUNCATE command to truncate foreign tables. This commit introduces new foreign data wrapper API for TRUNCATE. It extends TRUNCATE command so that it accepts foreign tables as the targets to truncate and invokes that API. Also it extends postgres_fdw so that it can issue TRUNCATE command to foreign servers, by adding new routine for that TRUNCATE API. The information about options specified in TRUNCATE command, e.g., ONLY, CACADE, etc is passed to FDW via API. The list of foreign tables to truncate is also passed to FDW. FDW truncates the foreign data sources that the passed foreign tables specify, based on those information. For example, postgres_fdw constructs TRUNCATE command using them and issues it to the foreign server. For performance, TRUNCATE command invokes the FDW routine for TRUNCATE once per foreign server that foreign tables to truncate belong to. Author: Kazutaka Onishi, Kohei KaiGai, slightly modified by Fujii Masao Reviewed-by: Bharath Rupireddy, Michael Paquier, Zhihong Yu, Alvaro Herrera, Stephen Frost, Ashutosh Bapat, Amit Langote, Daniel Gustafsson, Ibrar Ahmed, Fujii Masao Discussion: https://postgr.es/m/CAOP8fzb_gkReLput7OvOK+8NHgw-RKqNv59vem7=524krQTcWA@mail.gmail.com Discussion: https://postgr.es/m/CAJuF6cMWDDqU-vn_knZgma+2GMaout68YUgn1uyDnexRhqqM5Q@mail.gmail.com https://git.postgresql.org/pg/commitdiff/8ff1c94649f5c9184ac5f07981d8aea9dfd7ac19
Remove COMMIT_TS_SETTS record. Commit 438fc4a39c prevented the WAL replay from writing COMMIT_TS_SETTS record. By this change there is no code that generates COMMIT_TS_SETTS record in PostgreSQL core. Also we can think that there are no extensions using the record because we've not received so far any complaints about the issue that commit 438fc4a39c fixed. Therefore this commit removes COMMIT_TS_SETTS record and its related code. Even without this record, the timestamp required for commit timestamp feature can be acquired from the COMMIT record. Bump WAL page magic. Reported-by: lx zou zoulx1982@163.com Author: Fujii Masao Reviewed-by: Alvaro Herrera Discussion: https://postgr.es/m/16931-620d0f2fdc6108f1@postgresql.org https://git.postgresql.org/pg/commitdiff/08aa89b326261b669648df97d4f2a6edba22d26a
Avoid unnecessary table open/close in TRUNCATE command. ExecuteTruncate() filters out the duplicate tables specified in the TRUNCATE command, for example in the case where "TRUNCATE foo, foo" is executed. Such duplicate tables obviously don't need to be opened and closed because they are skipped. But previously it always opened the tables before checking whether they were duplicated ones or not, and then closed them if they were. That is, the duplicated tables were opened and closed unnecessarily. This commit changes ExecuteTruncate() so that it opens the table after it confirms that table is not duplicated one, which leads to avoid unnecessary table open/close. Do not back-patch because such unnecessary table open/close is not a bug though it exists in older versions. Author: Bharath Rupireddy Reviewed-by: Amul Sul, Fujii Masao Discussion: https://postgr.es/m/CALj2ACUdBO_sXJTa08OZ0YT0qk7F_gAmRa9hT4dxRcgPS4nsZA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/81a23dd87999ec9fb62554328c69c5b678612d56

Stephen Frost pushed:

Add pg_read_all_data and pg_write_all_data roles. A commonly requested use-case is to have a role who can run an unfettered pg_dump without having to explicitly GRANT that user access to all tables, schemas, et al, without that role being a superuser. This address that by adding a "pg_read_all_data" role which implicitly gives any member of this role SELECT rights on all tables, views and sequences, and USAGE rights on all schemas. As there may be cases where it's also useful to have a role who has write access to all objects, pg_write_all_data is also introduced and gives users implicit INSERT, UPDATE and DELETE rights on all tables, views and sequences. These roles can not be logged into directly but instead should be GRANT'd to a role which is able to log in. As noted in the documentation, if RLS is being used then an administrator may (or may not) wish to set BYPASSRLS on the login role which these predefined roles are GRANT'd to. Reviewed-by: Georgios Kokolatos Discussion: https://postgr.es/m/20200828003023.GU29590@tamriel.snowman.net https://git.postgresql.org/pg/commitdiff/6c3ffd697e2242f5497ea4b40fffc8f6f922ff60

Peter Geoghegan pushed:

Simplify state managed by VACUUM. Reorganize the state struct used by VACUUM -- group related items together to make it easier to understand. Also stop relying on stack variables inside lazy_scan_heap() -- move those into the state struct instead. Doing things this way simplifies large groups of related functions whose function signatures had a lot of unnecessary redundancy. Switch over to using int64 for the struct fields used to count things that are reported to the user via log_autovacuum and VACUUM VERBOSE output. We were using double, but that doesn't seem to have any advantages. Using int64 makes it possible to add assertions that verify that the first pass over the heap (pruning) encounters precisely the same number of LP_DEAD items that get deleted from indexes later on, in the second pass over the heap. These assertions will be added in later commits. Finally, adjust the signatures of functions with IndexBulkDeleteResult pointer arguments in cases where there was ambiguity about whether or not the argument relates to a single index or all indexes. Functions now use the idiom that both ambulkdelete() and amvacuumcleanup() have always used (where appropriate): accept a mutable IndexBulkDeleteResult pointer argument, and return a result IndexBulkDeleteResult pointer to caller. Author: Peter Geoghegan pg@bowt.ie Reviewed-By: Masahiko Sawada sawada.mshk@gmail.com Reviewed-By: Robert Haas robertmhaas@gmail.com Discussion: https://postgr.es/m/CAH2-WzkeOSYwC6KNckbhk2b1aNnWum6Yyn0NKP9D-Hq1LGTDPw@mail.gmail.com https://git.postgresql.org/pg/commitdiff/b4af70cb210393c9c8f41643acf6b213e21178e7
Propagate parallel VACUUM's buffer access strategy. Parallel VACUUM relied on global variable state from the leader process being propagated to workers on fork(). Commit b4af70cb removed most uses of global variables inside vacuumlazy.c, but did not account for the buffer access strategy state. To fix, propagate the state through shared memory instead. Per buildfarm failures on elver, curculio, and morepork. Many thanks to Thomas Munro for off-list assistance with this issue. https://git.postgresql.org/pg/commitdiff/49f49defe7c0a330cca084de5da14ccdfdafc6a3
Allocate access strategy in parallel VACUUM workers. Commit 49f49def took entirely the wrong approach to fixing this issue. Just allocate a local buffer access strategy in each individual worker instead of trying to propagate state. This state was never propagated by parallel VACUUM in the first place. It looks like the only reason that this worked following commit 40d964ec was that it involved static global variables, which are initialized to 0 per the C standard. A more comprehensive fix may be necessary, even on HEAD. This fix should at least get the buildfarm green once again. Thanks once again to Thomas Munro for continued off-list assistance with the issue. https://git.postgresql.org/pg/commitdiff/f6b8f19a084ce949522fcbc940dc116c034cfc47
Refactor lazy_scan_heap() loop. Add a lazy_scan_heap() subsidiary function that handles heap pruning and tuple freezing: lazy_scan_prune(). This is a great deal cleaner. The code that remains in lazy_scan_heap()'s per-block loop can now be thought of as code that either comes before or after the call to lazy_scan_prune(), which is now the clear focal point. This division is enforced by the way in which we now manage state. lazy_scan_prune() outputs state (using its own struct) that describes what to do with the page following pruning and freezing (e.g., visibility map maintenance, recording free space in the FSM). It doesn't get passed any special instructional state from the preamble code, though. Also cleanly separate the logic used by a VACUUM with INDEX_CLEANUP=off from the logic used by single-heap-pass VACUUMs. The former case is now structured as the omission of index and heap vacuuming by a two pass VACUUM. The latter case goes back to being used only when the table happens to have no indexes (just as it was before commit a96c41fe). This structure is much more natural, since the whole point of INDEX_CLEANUP=off is to skip the index and heap vacuuming that would otherwise take place. The single-heap-pass case doesn't skip any useful work, though -- it just does heap pruning and heap vacuuming together when the table happens to have no indexes. Both of these changes are preparation for an upcoming patch that generalizes the mechanism used by INDEX_CLEANUP=off. The later patch will allow VACUUM to give up on index and heap vacuuming dynamically, as problems emerge (e.g., with wraparound), so that an affected VACUUM operation can finish up as soon as possible. Also fix a very old bug in single-pass VACUUM VERBOSE output. We were reporting the number of tuples deleted via pruning as a direct substitute for reporting the number of LP_DEAD items removed in a function that deals with the second pass over the heap. But that doesn't work at all -- they're two different things. To fix, start tracking the total number of LP_DEAD items encountered during pruning, and use that in the report instead. A single pass VACUUM will always vacuum away whatever LP_DEAD items a heap page has immediately after it is pruned, so the total number of LP_DEAD items encountered during pruning equals the total number vacuumed-away. (They are not equal in the INDEX_CLEANUP=off case, but that's okay because skipping index vacuuming is now a totally orthogonal concept to one-pass VACUUM.) Also stop reporting the count of LP_UNUSED items in VACUUM VERBOSE output. This makes the output of VACUUM VERBOSE more consistent with log_autovacuum's output (because it never showed information about LP_UNUSED items). VACUUM VERBOSE reported LP_UNUSED items left behind by the last VACUUM, and LP_UNUSED items created via pruning HOT chains during the current VACUUM (it never included LP_UNUSED items left behind by the current VACUUM's second pass over the heap). This makes it useless as an indicator of line pointer bloat, which must have been the original intention. (Like the first VACUUM VERBOSE issue, this issue was arguably an oversight in commit 282d2a03, which added the heap-only tuple optimization.) Finally, stop reporting empty_pages in VACUUM VERBOSE output, and start reporting pages_removed instead. This also makes the output of VACUUM VERBOSE more consistent with log_autovacuum's output (which does not show empty_pages, but does show pages_removed). An empty page isn't meaningfully different to a page that is almost empty, or a page that is empty but for only a small number of remaining LP_UNUSED items. Author: Peter Geoghegan pg@bowt.ie Reviewed-By: Robert Haas robertmhaas@gmail.com Reviewed-By: Masahiko Sawada sawada.mshk@gmail.com Discussion: https://postgr.es/m/CAH2-WznneCXTzuFmcwx_EyRQgfsfJAAsu+CsqRFmFXCAar=nJw@mail.gmail.com https://git.postgresql.org/pg/commitdiff/7ab96cf6b312cfcd79cdc1a69c6bdb75de0ed30f
Remove tupgone special case from vacuumlazy.c. Retry the call to heap_prune_page() in rare cases where there is disagreement between the heap_prune_page() call and the call to HeapTupleSatisfiesVacuum() that immediately follows. Disagreement is possible when a concurrently-aborted transaction makes a tuple DEAD during the tiny window between each step. This was the only case where a tuple considered DEAD by VACUUM still had storage following pruning. VACUUM's definition of dead tuples is now uniformly simple and unambiguous: dead tuples from each page are always LP_DEAD line pointers that were encountered just after we performed pruning (and just before we considered freezing remaining items with tuple storage). Eliminating the tupgone=true special case enables INDEX_CLEANUP=off style skipping of index vacuuming that takes place based on flexible, dynamic criteria. The INDEX_CLEANUP=off case had to know about skipping indexes up-front before now, due to a subtle interaction with the special case (see commit dd695979) -- this was a special case unto itself. Now there are no special cases. And so now it won't matter when or how we decide to skip index vacuuming: it won't affect how pruning behaves, and it won't be affected by any of the implementation details of pruning or freezing. Also remove XLOG_HEAP2_CLEANUP_INFO records. These are no longer necessary because we now rely entirely on heap pruning taking care of recovery conflicts. There is no longer any need to generate recovery conflicts for DEAD tuples that pruning just missed. This also means that heap vacuuming now uses exactly the same strategy for recovery conflicts as index vacuuming always has: REDO routines never need to process a latestRemovedXid from the WAL record, since earlier REDO of the WAL record from pruning is sufficient in all cases. The generic XLOG_HEAP2_CLEAN record type is now split into two new record types to reflect this new division (these are called XLOG_HEAP2_PRUNE and XLOG_HEAP2_VACUUM). Also stop acquiring a super-exclusive lock for heap pages when they're vacuumed during VACUUM's second heap pass. A regular exclusive lock is enough. This is correct because heap page vacuuming is now strictly a matter of setting the LP_DEAD line pointers to LP_UNUSED. No other backend can have a pointer to a tuple located in a pinned buffer that can be invalidated by a concurrent heap page vacuum operation. Heap vacuuming can now be thought of as conceptually similar to index vacuuming and conceptually dissimilar to heap pruning. Heap pruning now has sole responsibility for anything involving the logical contents of the database (e.g., managing transaction status information, recovery conflicts, considering what to do with HOT chains). Index vacuuming and heap vacuuming are now only concerned with recycling garbage items from physical data structures that back the logical database. Bump XLOG_PAGE_MAGIC due to pruning and heap page vacuum WAL record changes. Credit for the idea of retrying pruning a page to avoid the tupgone case goes to Andres Freund. Author: Peter Geoghegan pg@bowt.ie Reviewed-By: Andres Freund andres@anarazel.de Reviewed-By: Masahiko Sawada sawada.mshk@gmail.com Discussion: https://postgr.es/m/CAH2-WznneCXTzuFmcwx_EyRQgfsfJAAsu+CsqRFmFXCAar=nJw@mail.gmail.com https://git.postgresql.org/pg/commitdiff/8523492d4e349c4714aa2ab0291be175a88cb4fc
Truncate line pointer array during VACUUM. Teach VACUUM to truncate the line pointer array of each heap page when a contiguous group of LP_UNUSED line pointers appear at the end of the array -- these unused and unreferenced items are excluded. This process occurs during VACUUM's second pass over the heap, right after LP_DEAD line pointers on the page (those encountered/pruned during the first pass) are marked LP_UNUSED. Truncation avoids line pointer bloat with certain workloads, particularly those involving continual range DELETEs and bulk INSERTs against the same table. Also harden heapam code to check for an out-of-range page offset number in places where we weren't already doing so. Author: Matthias van de Meent boekewurm+postgres@gmail.com Author: Peter Geoghegan pg@bowt.ie Reviewed-By: Masahiko Sawada sawada.mshk@gmail.com Reviewed-By: Peter Geoghegan pg@bowt.ie Discussion: https://postgr.es/m/CAEze2WjgaQc55Y5f5CQd3L=eS5CZcff2Obxp=O6pto8-f0hC4w@mail.gmail.com Discussion: https://postgr.es/m/CAH2-Wzn6a64PJM1Ggzm=uvx2otsopJMhFQj_g1rAj4GWr3ZSzw@mail.gmail.com https://git.postgresql.org/pg/commitdiff/3c3b8a4b26891892bccf3d220580a7f413c0b9ca
Add wraparound failsafe to VACUUM. Add a failsafe mechanism that is triggered by VACUUM when it notices that the table's relfrozenxid and/or relminmxid are dangerously far in the past. VACUUM checks the age of the table dynamically, at regular intervals. When the failsafe triggers, VACUUM takes extraordinary measures to finish as quickly as possible so that relfrozenxid and/or relminmxid can be advanced. VACUUM will stop applying any cost-based delay that may be in effect. VACUUM will also bypass any further index vacuuming and heap vacuuming -- it only completes whatever remaining pruning and freezing is required. Bypassing index/heap vacuuming is enabled by commit 8523492d, which made it possible to dynamically trigger the mechanism already used within VACUUM when it is run with INDEX_CLEANUP off. It is expected that the failsafe will almost always trigger within an autovacuum to prevent wraparound, long after the autovacuum began. However, the failsafe mechanism can trigger in any VACUUM operation. Even in a non-aggressive VACUUM, where we're likely to not advance relfrozenxid, it still seems like a good idea to finish off remaining pruning and freezing. An aggressive/anti-wraparound VACUUM will be launched immediately afterwards. Note that the anti-wraparound VACUUM that follows will itself trigger the failsafe, usually before it even begins its first (and only) pass over the heap. The failsafe is controlled by two new GUCs: vacuum_failsafe_age, and vacuum_multixact_failsafe_age. There are no equivalent reloptions, since that isn't expected to be useful. The GUCs have rather high defaults (both default to 1.6 billion), and are expected to generally only be used to make the failsafe trigger sooner/more frequently. Author: Masahiko Sawada sawada.mshk@gmail.com Author: Peter Geoghegan pg@bowt.ie Discussion: https://postgr.es/m/CAD21AoD0SkE11fMw4jD4RENAwBMcw1wasVnwpJVw3tVqPOQgAw@mail.gmail.com Discussion: https://postgr.es/m/CAH2-WzmgH3ySGYeC-m-eOBsa2=sDwa292-CFghV4rESYo39FsQ@mail.gmail.com https://git.postgresql.org/pg/commitdiff/1e55e7d1755cefbb44982fbacc7da461fa8684e6
Teach VACUUM to bypass unnecessary index vacuuming. VACUUM has never needed to call ambulkdelete() for each index in cases where there are precisely zero TIDs in its dead_tuples array by the end of its first pass over the heap (also its only pass over the heap in this scenario). Index vacuuming is simply not required when this happens. Index cleanup will still go ahead, but in practice most calls to amvacuumcleanup() are usually no-ops when there were zero preceding ambulkdelete() calls. In short, VACUUM has generally managed to avoid index scans when there were clearly no index tuples to delete from indexes. But cases with close to no index tuples to delete were another matter -- a round of ambulkdelete() calls took place (one per index), each of which performed a full index scan. VACUUM now behaves just as if there were zero index tuples to delete in cases where there are in fact "virtually zero" such tuples. That is, it can now bypass index vacuuming and heap vacuuming as an optimization (though not index cleanup). Whether or not VACUUM bypasses indexes is determined dynamically, based on the just-observed number of heap pages in the table that have one or more LP_DEAD items (LP_DEAD items in heap pages have a 1:1 correspondence with index tuples that still need to be deleted from each index in the worst case). We only skip index vacuuming when 2% or less of the table's pages have one or more LP_DEAD items -- bypassing index vacuuming as an optimization must not noticeably impede setting bits in the visibility map. As a further condition, the dead_tuples array (i.e. VACUUM's array of LP_DEAD item TIDs) must not exceed 32MB at the point that the first pass over the heap finishes, which is also when the decision to bypass is made. (The VACUUM must also have been able to fit all TIDs in its maintenance_work_mem-bound dead_tuples space, though with a default maintenance_work_mem setting it can't matter.) This avoids surprising jumps in the duration and overhead of routine vacuuming with workloads where successive VACUUM operations consistently have almost zero dead index tuples. The number of LP_DEAD items may well accumulate over multiple VACUUM operations, before finally the threshold is crossed and VACUUM performs conventional index vacuuming. Even then, the optimization will have avoided a great deal of largely unnecessary index vacuuming. In the future we may teach VACUUM to skip index vacuuming on a per-index basis, using a much more sophisticated approach. For now we only consider the extreme cases, where we can be quite confident that index vacuuming just isn't worth it using simple heuristics. Also log information about how many heap pages have one or more LP_DEAD items when autovacuum logging is enabled. Author: Masahiko Sawada sawada.mshk@gmail.com Author: Peter Geoghegan pg@bowt.ie Discussion: https://postgr.es/m/CAD21AoD0SkE11fMw4jD4RENAwBMcw1wasVnwpJVw3tVqPOQgAw@mail.gmail.com Discussion: https://postgr.es/m/CAH2-WzmkebqPd4MVGuPTOS9bMFvp9MDs5cRTCOsv1rQJ3jCbXw@mail.gmail.com https://git.postgresql.org/pg/commitdiff/5100010ee4d5c8ef46619dbd1d17090c627e6d0a
Silence another _bt_check_unique compiler warning. Per complaint from Tom Lane Discussion: https://postgr.es/m/1922884.1617909599@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/796092fb84c08162ae803e83a13aa8bd6d9b23d0

Amit Kapila pushed:

Refactor function parse_output_parameters. Instead of using multiple parameters in parse_ouput_parameters function signature, use the struct PGOutputData that encapsulates all pgoutput options. It will be useful for future work where we need to add other options in pgoutput. Author: Euler Taveira Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/CADK3HHJ-+9SO7KuRLH=9Wa1rAo60Yreq1GFNkH_kd0=CdaWM+A@mail.gmail.com https://git.postgresql.org/pg/commitdiff/531737ddad214cb8a675953208e2f3a6b1be122b
Allow pgoutput to send logical decoding messages. The output plugin accepts a new parameter (messages) that controls if logical decoding messages are written into the replication stream. It is useful for those clients that use pgoutput as an output plugin and needs to process messages that were written by pg_logical_emit_message(). Although logical streaming replication protocol supports logical decoding messages now, logical replication does not use this feature yet. Author: David Pirotte, Euler Taveira Reviewed-by: Euler Taveira, Andres Freund, Ashutosh Bapat, Amit Kapila Discussion: https://postgr.es/m/CADK3HHJ-+9SO7KuRLH=9Wa1rAo60Yreq1GFNkH_kd0=CdaWM+A@mail.gmail.com https://git.postgresql.org/pg/commitdiff/ac4645c0157fc5fcef0af8ff571512aa284a2cec
Fix the tests added by commit ac4645c015. In the tests, after disabling the subscription, we were not waiting for the replication connection to drop from the publisher. So when the test was trying to use the same slot to fetch the messages via SQL API, it sometimes gives an error that the replication slot is active for other PID. Per buildfarm. https://git.postgresql.org/pg/commitdiff/266b5673b4b6bed2e9ebfe73ca967f44d6dc0e6c
Fix typo in jsonfuncs.c. Author: Tatsuro Yamada Discussion: https://postgr.es/m/7c166a60-2808-6b89-9524-feefc6233748@nttcom.co.jp_1 https://git.postgresql.org/pg/commitdiff/8ffb003591ff02f59d92c36a5791307881863146

David Rowley pushed:

Fix compiler warning in fe-trace.c for MSVC. It seems that in MSVC timeval's tv_sec field is of type long. localtime() takes a time_t pointer. Since long is 32-bit even on 64-bit builds in MSVC, passing a long pointer instead of the correct time_t pointer generated a compiler warning. Fix that. Reviewed-by: Tom Lane Discussion: https://postgr.es/m/CAApHDvoRG25X_=ZCGSPb4KN_j2iu=G2uXsRSg8NBZeuhkOSETg@mail.gmail.com https://git.postgresql.org/pg/commitdiff/9bc9b4609a246ded5caf3f3d4c0013a002ba2323
Fix compiler warning for MSVC in libpq_pipeline.c. DEBUG was already defined by the MSVC toolchain for "Debug" builds. On these systems the unconditional #define DEBUG was causing a 'DEBUG': macro redefinition warning. Here we rename DEBUG to DEBUG_OUPUT and also get rid of the #define which defined this constant. This appears to have been left in the code by mistake. Discussion: https://postgr.es/m/CAApHDvqTTgDm38s4HRj03nhzhzQ1oMOj-RXFUB1pE6Bj07jyuQ@mail.gmail.com https://git.postgresql.org/pg/commitdiff/3b82d990ab784881153c0f127e4c1211e9b6065c
Cleanup partition pruning step generation. There was some code in gen_prune_steps_from_opexps that needlessly checked a list was not empty when it clearly had to contain at least one item. This prompted a further cleanup operation in partprune.c. Additionally, the previous code could end up adding additional needless INTERSECT steps. However, those do not appear to be able to cause any misbehavior. gen_prune_steps_from_opexps is now no longer in charge of generating combine pruning steps. Instead, gen_partprune_steps_internal, which already does some combine step creation has been given the sole responsibility of generating all combine steps. This means that when we recursively call gen_partprune_steps_internal, since it always now adds a combine step when it produces multiple steps, we can just pay attention to the final step returned. In passing, do quite a bit of work on the comments to try to more clearly explain the role of both gen_partprune_steps_internal and gen_prune_steps_from_opexps. This is fairly complex code so some extra effort to give any new readers an overview of how things work seems like a good idea. Author: Amit Langote Reported-by: Andy Fan Reviewed-by: Kyotaro Horiguchi, Andy Fan, Ryan Lambert, David Rowley Discussion: https://postgr.es/m/CAKU4AWqWoVii+bRTeBQmeVW+PznkdO8DfbwqNsu9Gj4ubt9A6w@mail.gmail.com https://git.postgresql.org/pg/commitdiff/5ac9c4307337313bedeafc21dbbab93ba809241c
Speedup ScalarArrayOpExpr evaluation. ScalarArrayOpExprs with "useOr=true" and a set of Consts on the righthand side have traditionally been evaluated by using a linear search over the array. When these arrays contain large numbers of elements then this linear search could become a significant part of execution time. Here we add a new method of evaluating ScalarArrayOpExpr expressions to allow them to be evaluated by first building a hash table containing each element, then on subsequent evaluations, we just probe that hash table to determine if there is a match. The planner is in charge of determining when this optimization is possible and it enables it by setting hashfuncid in the ScalarArrayOpExpr. The executor will only perform the hash table evaluation when the hashfuncid is set. This means that not all cases are optimized. For example CHECK constraints containing an IN clause won't go through the planner, so won't get the hashfuncid set. We could maybe do something about that at some later date. The reason we're not doing it now is from fear that we may slow down cases where the expression is evaluated only once. Those cases can be common, for example, a single row INSERT to a table with a CHECK constraint containing an IN clause. In the planner, we enable this when there are suitable hash functions for the ScalarArrayOpExpr's operator and only when there is at least MIN_ARRAY_SIZE_FOR_HASHED_SAOP elements in the array. The threshold is currently set to 9. Author: James Coleman, David Rowley Reviewed-by: David Rowley, Tomas Vondra, Heikki Linnakangas Discussion: https://postgr.es/m/CAAaqYe8x62+=wn0zvNKCj55tPpg-JBHzhZFFc6ANovdqFw7-dA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/50e17ad281b8d1c1b410c9833955bc80fbad4078
Improve slightly misleading comments in nodeFuncs.c. There were some comments in nodeFuncs.c that, depending on your interpretation of the word "result", could lead you to believe that the comments were badly copied and pasted from somewhere else. If you thought of "result" as the return value of the function that the comment is written in, then you'd be misled. However, if you'd correctly interpreted "result" to mean the result type of the given node type, you'd not have seen any issues. Here we do a small cleanup to try to prevent any future misinterpretations. Per wording suggestion from Tom Lane. Reviewed-by: Tom Lane Discussion: https://postgr.es/m/CAApHDvp+Bw=2Qiu5=uXMKfC7gd0+B=4JvexVgGJU=am2g9a1CA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/152d33bccec7176f50be225bdbedf2e6de214e54

Etsuro Fujita pushed:

Adjust input value to WaitEventSetWait() in ExecAppendAsyncEventWait(). Adjust the number of events given to WaitEventSetWait() so that it doesn't exceed the maximum number of events in the WaitEventSet given to that function (set->nevents_space) in hopes of making the buildfarm green. Per valgrind failure report from Tom Lane and the buildfarm. Author: Etsuro Fujita Discussion: https://postgr.es/m/3411577.1617289776%40sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/a8af856d3257138590788e40eb84049def147acf

Dean Rasheed pushed:

pgbench: Function to generate random permutations. This adds a new function, permute(), that generates pseudorandom permutations of arbitrary sizes. This can be used to randomly shuffle a set of values to remove unwanted correlations. For example, permuting the output from a non-uniform random distribution so that all the most common values aren't collocated, allowing more realistic tests to be performed. Formerly, hash() was recommended for this purpose, but that suffers from collisions that might alter the distribution, so recommend permute() for this purpose instead. Fabien Coelho and Hironobu Suzuki, with additional hacking be me. Reviewed by Thomas Munro, Alvaro Herrera and Muhammad Usama. Discussion: https://postgr.es/m/alpine.DEB.2.21.1807280944370.5142@lancre https://git.postgresql.org/pg/commitdiff/6b258e3d688db14aadb58dde2a72939362310684

Heikki Linnakangas pushed:

Mark test_enc_conversion() as STRICT. Reported-by: Jaime Casanova, using SQLsmith Discussion: https://www.postgresql.org/message-id/20210402235337.GA4082@ahch-to https://git.postgresql.org/pg/commitdiff/c4c393b3ec83ceb4b4d7f37cdd5302126377d069
Add sortsupport for gist_btree opclasses, for faster index builds. Commit 16fa9b2b30 introduced a faster way to build GiST indexes, by sorting all the data. This commit adds the sortsupport functions needed to make use of that feature for btree_gist. Author: Andrey Borodin Discussion: https://www.postgresql.org/message-id/2F3F7265-0D22-44DB-AD71-8554C743D943@yandex-team.ru https://git.postgresql.org/pg/commitdiff/9f984ba6d23dc6eecebf479ab1d3f2e550a4e9be
Revert "Add sortsupport for gist_btree opclasses, for faster index builds.". This reverts commit 9f984ba6d23dc6eecebf479ab1d3f2e550a4e9be. It was making the buildfarm unhappy, apparently setting client_min_messages in a regression test produces different output if log_statement='all'. Another issue is that I now suspect the bit sortsupport function was in fact not correct to call byteacmp(). Revert to investigate both of those issues. https://git.postgresql.org/pg/commitdiff/d92b1cdbab408d8f1299257125c9ae375f3ca644

Tomáš Vondra pushed:

Fix handling of clauses incompatible with extended statistics. Handling of incompatible clauses while applying extended statistics was a bit confused - while handling a mix of compatible and incompatible clauses it sometimes incorrectly treated the incompatible clauses as compatible, resulting in a crash. Fixed by reworking the code applying the selected statistics object to make it easier to understand, and adding a proper compatibility check. Reported-by: David Rowley Discussion: https://postgr.es/m/CAApHDvpYT10-nkSp8xXe-nbO3jmoaRyRFHbzh-RWMfAJynqgpQ%40mail.gmail.com https://git.postgresql.org/pg/commitdiff/518442c7f334f3b05ea28b7ef50f1b551cfcc23e
Don't add non-existent pages to bitmap from BRIN. The code in bringetbitmap() simply added the whole matching page range to the TID bitmap, as determined by pages_per_range, even if some of the pages were beyond the end of the heap. The query then might fail with an error like this: ERROR: could not open file "base/20176/20228.2" (target block 262144): previous segment is only 131021 blocks In this case, the relation has 262093 pages (131072 and 131021 pages), but we're trying to acess block 262144, i.e. first block of the 3rd segment. At that point _mdfd_getseg() notices the preceding segment is incomplete, and fails. Hitting this in practice is rather unlikely, because:
Most indexes use power-of-two ranges, so segments and page ranges align perfectly (segment end is also a page range end). * The table size has to be just right, with the last segment being almost full - less than one page range from full segment, so that the last page range actually crosses the segment boundary. * Prefetch has to be enabled. The regular page access checks that pages are not beyond heap end, but prefetch does not. On older releases (before 12) the execution stops after hitting the first non-existent page, so the prefetch distance has to be sufficient to reach the first page in the next segment to trigger the issue. Since 12 it's enough to just have prefetch enabled, the prefetch distance does not matter. Fixed by not adding non-existent pages to the TID bitmap. Backpatch all the way back to 9.6 (BRIN indexes were introduced in 9.5, but that release is EOL). Backpatch-through: 9.6 https://git.postgresql.org/pg/commitdiff/23607a8156d595522c232ff3933d77041d3adaa1

Andres Freund pushed:

Increment xactCompletionCount during subtransaction abort. Snapshot caching, introduced in 623a9ba79b, did not increment xactCompletionCount during subtransaction abort. That could lead to an older snapshot being reused. That is, at least as far as I can see, not a correctness issue (for MVCC snapshots there's no difference between "in progress" and "aborted"). The only difference between the old and new snapshots would be a newer ->xmax. While HeapTupleSatisfiesMVCC makes the same visibility determination, reusing the old snapshot leads HeapTupleSatisfiesMVCC to not set HEAP_XMIN_INVALID. Which subsequently causes the kill_prior_tuple optimization to not kick in (via HeapTupleIsSurelyDead() returning false). The performance effects of doing the same index-lookups over and over again is how the issue was discovered... Fix the issue by incrementing xactCompletionCount in XidCacheRemoveRunningXids. It already acquires ProcArrayLock exclusively, making that an easy proposition. Add a test to ensure that kill_prior_tuple prevents index growth when it involves aborted subtransaction of the current transaction. Author: Andres Freund Discussion: https://postgr.es/m/20210406043521.lopeo7bbigad3n6t@alap3.anarazel.de Discussion: https://postgr.es/m/20210317055718.v6qs3ltzrformqoa%40alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/90c885cdab8bc5a5f12a243774fa0db51002a2fd
Cope with NULL query string in ExecInitParallelPlan(). It's far from clear that this is the right approach - but a good portion of the buildfarm has been red for a few hours, on the last day of the CF. And this fixes at least the obvious crash. So let's go with that for now. Discussion: https://postgr.es/m/20210407225806.majgznh4lk34hjvu%40alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/b3ee4c503872f3d0a5d6a7cbde48815f555af15b

Magnus Hagander pushed:

Refactor hba_authname. The previous implementation (from 9afffcb833) had an unnecessary check on the boundaries of the enum which trigtered compile warnings. To clean it up, move the pre-existing static assert to a central location and call that. Reported-By: Erik Rijkers Reviewed-By: Michael Paquier Discussion: https://postgr.es/m/1056399262.13159.1617793249020@webmailclassic.xs4all.nl https://git.postgresql.org/pg/commitdiff/c1968426ba3de1fe37848863e35fff30261bf941
Track identical top vs nested queries independently in pg_stat_statements. Changing pg_stat_statements.track between 'all' and 'top' would control if pg_stat_statements tracked just top level statements or also statements inside functions, but when tracking all it would not differentiate between the two. Being table to differentiate this is useful both to track where the actual query is coming from, and to see if there are differences in executions between the two. To do this, add a boolean to the hash key indicating if the statement was top level or not. Experience from the pg_stat_kcache module shows that in at least some "reasonable worloads" only <5% of the queries show up both top level and nested. Based on this, admittedly small, dataset, this patch does not try to de-duplicate those query texts, and will just store one copy for the top level and one for the nested. Author: Julien Rohaud Reviewed-By: Magnus Hagander, Masahiro Ikeda Discussion: https://postgr.es/m/20201202040516.GA43757@nol https://git.postgresql.org/pg/commitdiff/6b4d23feef6e334fb85af077f2857f62ab781848
Add functions to wait for backend termination. This adds a function, pg_wait_for_backend_termination(), and a new timeout argument to pg_terminate_backend(), which will wait for the backend to actually terminate (with or without signaling it to do so depending on which function is called). The default behaviour of pg_terminate_backend() remains being timeout=0 which does not waiting. For pg_wait_for_backend_termination() the default wait is 5 seconds. Author: Bharath Rupireddy Reviewed-By: Fujii Masao, David Johnston, Muhammad Usama, Hou Zhijie, Magnus Hagander Discussion: https://postgr.es/m/CALj2ACUBpunmyhYZw-kXCYs5NM+h6oG_7Df_Tn4mLmmUQifkqA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/aaf043257205ec523f1ba09a3856464d17cf2281
Merge v1.10 of pg_stat_statements into v1.9. v1.9 is already new in this version of PostgreSQL, so turn it into just one change. Author: Julien Rohaud Discussion: https://postgr.es/m/20210408120505.7zinijtdexbyghvb@nol https://git.postgresql.org/pg/commitdiff/5844c23dc50508aefeb8183be45f4ee99e9dec17
Fix typo. Author: Daniel Westermann Backpatch-through: 9.6 Discussion: https://postgr.es/m/GV0P278MB0483A7AA85BAFCC06D90F453D2739@GV0P278MB0483.CHEP278.PROD.OUTLOOK.COM https://git.postgresql.org/pg/commitdiff/1798d8f8b6fbb8ff922f640ff2d5dbd3e47064a2

Robert Haas pushed:

amcheck: Remove duplicate XID/MXID bounds checks. Commit 3b6c1259f9ca8e21860aaf24ec6735a8e5598ea0 resulted in the same xmin and xmax bounds checking being performed in both check_tuple() and check_tuple_visibility(). Remove the duplication. While at it, adjust some code comments that were overlooked in that commit. Mark Dilger Discussion: http://postgr.es/m/AC5479E4-6321-473D-AC92-5EC36299FBC2@enterprisedb.com https://git.postgresql.org/pg/commitdiff/4573f6a9af6e232ba073392716a051ae2017d1e9
amcheck: fix multiple problems with TOAST pointer validation. First, don't perform database access while holding a buffer lock. When checking a heap, we can validate that TOAST pointers are sane by performing a scan on the TOAST index and looking up the chunks that correspond to each value ID that appears in a TOAST poiner in the main table. But, to do that while holding a buffer lock at least risks causing other backends to wait uninterruptibly, and probably can cause undetected and uninterruptible deadlocks. So, instead, make a list of checks to perform while holding the lock, and then perform the checks after releasing it. Second, adjust things so that we don't try to follow TOAST pointers for tuples that are already eligible to be pruned. The TOAST tuples become eligible for pruning at the same time that the main tuple does, so trying to check them may lead to spurious reports of corruption, as observed in the buildfarm. The necessary infrastructure to decide whether or not the tuple being checked is prunable was added by commit 3b6c1259f9ca8e21860aaf24ec6735a8e5598ea0, but it wasn't actually used for its intended purpose prior to this patch. Mark Dilger, adjusted by me to avoid a memory leak. Discussion: http://postgr.es/m/AC5479E4-6321-473D-AC92-5EC36299FBC2@enterprisedb.com https://git.postgresql.org/pg/commitdiff/ec7ffb8096e8eb90f4c9230f7ba9487f0abe1a9f

Bruce Momjian pushed:

Move pg_stat_statements query jumbling to core. Add compute_query_id GUC to control whether a query identifier should be computed by the core (off by default). It's thefore now possible to disable core queryid computation and use pg_stat_statements with a different algorithm to compute the query identifier by using a third-party module. To ensure that a single source of query identifier can be used and is well defined, modules that calculate a query identifier should throw an error if compute_query_id specified to compute a query id and if a query idenfitier was already calculated. Discussion: https://postgr.es/m/20210407125726.tkvjdbw76hxnpwfi@nol Author: Julien Rouhaud Reviewed-by: Alvaro Herrera, Nitin Jadhav, Zhihong Yu https://git.postgresql.org/pg/commitdiff/5fd9dfa5f50e4906c35133a414ebec5b6d518493
Make use of in-core query id added by commit 5fd9dfa5f5. Use the in-core query id computation for pg_stat_activity, log_line_prefix, and EXPLAIN VERBOSE. Similar to other fields in pg_stat_activity, only the queryid from the top level statements are exposed, and if the backends status isn't active then the queryid from the last executed statements is displayed. Add a %Q placeholder to include the queryid in log_line_prefix, which will also only expose top level statements. For EXPLAIN VERBOSE, if a query identifier has been computed, either by enabling compute_query_id or using a third-party module, display it. Bump catalog version. Discussion: https://postgr.es/m/20210407125726.tkvjdbw76hxnpwfi@nol Author: Julien Rouhaud Reviewed-by: Alvaro Herrera, Nitin Jadhav, Zhihong Yu https://git.postgresql.org/pg/commitdiff/4f0b0966c866ae9f0e15d7cc73ccf7ce4e1af84b
Fix regression test failure caused by commit 4f0b0966c8. The query originally used was too simple, cause explain_filter() to be unable to remove JIT output text. Reported-by: Tom Lane Author: Julien Rouhaud https://git.postgresql.org/pg/commitdiff/bc70728693bc2d28db7125e7a24d78ad7612f58c
Add csvlog output for the new query_id value. This also adjusts the printf format for query id used by log_line_prefix (%Q). Reported-by: Justin Pryzby Discussion: https://postgr.es/m/20210408005402.GG24239@momjian.us Author: Julien Rouhaud, Bruce Momjian https://git.postgresql.org/pg/commitdiff/f57a2f5e03054ade221e554c70e628e1ffae1b66
Fixes for query_id feature. Ignore parallel workers in pg_stat_statements Oversight in 4f0b0966c8 which exposed queryid in parallel workers. Counters are aggregated by the main backend process so parallel workers would report duplicated activity, and could also report activity for the wrong entry as they are only aware of the top level queryid. Fix thinko in pg_stat_get_activity when retrieving the queryid. Remove unnecessary call to pgstat_report_queryid(). Reported-by: Amit Kapila, Andres Freund, Thomas Munro Discussion: https://postgr.es/m/20210408051735.lfbdzun5zdlax5gd@alap3.anarazel.de p634GTSOqnDW86Owrn6qDAVosC5dJjXjp7BMfc5Gz1Q@mail.gmail.com Author: Julien Rouhaud https://git.postgresql.org/pg/commitdiff/0f61727b75b93915ca9a9f20c996ed7020996a38

Thomas Munro pushed:

Provide ReadRecentBuffer() to re-pin buffers by ID. If you know the ID of a buffer that recently held a block that you would like to pin, this function can be used check if it's still there. It can be used to avoid a second lookup in the buffer mapping table after PrefetchBuffer() reports a cache hit. Reviewed-by: Andres Freund andres@anarazel.de Discussion: https://postgr.es/m/CA+hUKGJ4VJN8ttxScUFM8dOKX0BrBiboo5uz1cq=AovOddfHpA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/2f27f8c511494cca9e0e9a4eeeb102fa9f193002
Optionally prefetch referenced data in recovery. Introduce a new GUC recovery_prefetch, disabled by default. When enabled, look ahead in the WAL and try to initiate asynchronous reading of referenced data blocks that are not yet cached in our buffer pool. For now, this is done with posix_fadvise(), which has several caveats. Better mechanisms will follow in later work on the I/O subsystem. The GUC maintenance_io_concurrency is used to limit the number of concurrent I/Os we allow ourselves to initiate, based on pessimistic heuristics used to infer that I/Os have begun and completed. The GUC wal_decode_buffer_size is used to limit the maximum distance we are prepared to read ahead in the WAL to find uncached blocks. Reviewed-by: Alvaro Herrera alvherre@2ndquadrant.com (parts) Reviewed-by: Andres Freund andres@anarazel.de (parts) Reviewed-by: Tomas Vondra tomas.vondra@2ndquadrant.com (parts) Tested-by: Tomas Vondra tomas.vondra@2ndquadrant.com Tested-by: Jakub Wartak Jakub.Wartak@tomtom.com Tested-by: Dmitry Dolgov 9erthalion6@gmail.com Tested-by: Sait Talha Nisanci Sait.Nisanci@microsoft.com Discussion: https://postgr.es/m/CA%2BhUKGJ4VJN8ttxScUFM8dOKX0BrBiboo5uz1cq%3DAovOddfHpA%40mail.gmail.com https://git.postgresql.org/pg/commitdiff/1d257577e08d3e598011d6850fd1025858de8c8c
Remove read_page callback from XLogReader. Previously, the XLogReader module would fetch new input data using a callback function. Redesign the interface so that it tells the caller to insert more data with a special return value instead. This API suits later patches for prefetching, encryption and maybe other future projects that would otherwise require continually extending the callback interface. As incidental cleanup work, move global variables readOff, readLen and readSegNo inside XlogReaderState. Author: Kyotaro HORIGUCHI horiguchi.kyotaro@lab.ntt.co.jp Author: Heikki Linnakangas hlinnaka@iki.fi (parts of earlier version) Reviewed-by: Antonin Houska ah@cybertec.at Reviewed-by: Alvaro Herrera alvherre@2ndquadrant.com Reviewed-by: Takashi Menjo takashi.menjo@gmail.com Reviewed-by: Andres Freund andres@anarazel.de Reviewed-by: Thomas Munro thomas.munro@gmail.com Discussion: https://postgr.es/m/20190418.210257.43726183.horiguchi.kyotaro%40lab.ntt.co.jp https://git.postgresql.org/pg/commitdiff/323cbe7c7ddcf18aaf24b7f6d682a45a61d4e31b
Add circular WAL decoding buffer. Teach xlogreader.c to decode its output into a circular buffer, to support optimizations based on looking ahead. * XLogReadRecord() works as before, consuming records one by one, and allowing them to be examined via the traditional XLogRecGetXXX() macros.
An alternative new interface XLogNextRecord() is added that returns pointers to DecodedXLogRecord structs that can be examined directly. * XLogReadAhead() provides a second cursor that lets you see further ahead, as long as data is available and there is enough space in the decoding buffer. This returns DecodedXLogRecord pointers to the caller, but also adds them to a queue of records that will later be consumed by XLogNextRecord()/XLogReadRecord(). The buffer's size is controlled with wal_decode_buffer_size. The buffer could potentially be placed into shared memory, for future projects. Large records that don't fit in the circular buffer are called "oversized" and allocated separately with palloc(). Discussion: https://postgr.es/m/CA+hUKGJ4VJN8ttxScUFM8dOKX0BrBiboo5uz1cq=AovOddfHpA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/f003d9f8721b3249e4aec8a1946034579d40d42c
Remove duplicate typedef. Thinko in commit 323cbe7c, per complaint from BF animal locust's older GCC compiler. https://git.postgresql.org/pg/commitdiff/34399a670a1c559ef8a355ed25d090d73e400ad4
Doc: Review for "Optionally prefetch referenced data in recovery.". Typos, corrections and language improvements in the docs, and a few in code comments too. Reported-by: Justin Pryzby pryzby@telsasoft.com Discussion: https://postgr.es/m/20210409033703.GP6592%40telsasoft.com https://git.postgresql.org/pg/commitdiff/dc88460c24ed71ba7464ef4749e5f25da1bf6652
Make new GUC short descriptions more consistent. Reported-by: Daniel Westermann (DWE) daniel.westermann@dbi-services.com Discussion: https://postgr.es/m/GV0P278MB0483490FEAC879DCA5ED583DD2739%40GV0P278MB0483.CHEP278.PROD.OUTLOOK.COM https://git.postgresql.org/pg/commitdiff/846d35b2dc1bd4d09f5e3e5e5a2cc8efafeb600e

Noah Misch pushed:

Standardize pg_authid oid_symbol values. Commit c9c41c7a337d3e2deb0b2a193e9ecfb865d8f52b used two different naming patterns. Standardize on the majority pattern, which was the only pattern in the last reviewed version of that commit. https://git.postgresql.org/pg/commitdiff/df5efaf4410f94cc1b69e8ade1d64dc92232ec1d

Pending Patches

Bharath Rupireddy sent in another revision of a patch to add table AMs for multi- and single inserts, and use same for CTAS, REFRESH MATERIALIZED VIEW, and COPY.

Sait Talha Nisanci sent in another revision of a patch intended to fix a bug that manifested as a crash in record_type_typmod_compare.

Bharath Rupireddy sent in two more revisions of a patch to clarify the error message caused by adding a non-table to a publication, naming the type of object it is rather than the type (table) it's not.

Jaime Casanova sent in a patch to use AV worker items infrastructure for GIN pending list's cleanup.

Jürgen Purtz sent in another revision of a patch to add a chapter on architecture to the tutorial.

Andres Freund sent in another revision of a patch to move the stats collector's temporary storage from files to shared memory.

Amul Sul sent in three more revisions of a patch to implement ALTER SYSTEM READ {ONLY | WRITE}.

Michaël Paquier sent in another revision of a patch to make it possible to use NSS as a libpq TLS backend.

Vigneshwaran C, Amit Kapila, and Masahiko Sawada traded patches to use HTAB for replication slot stats.

Bruce Momjian sent in two revisions of a patch to fix and clarify the documentation for some corner cases in interval arithmetic.

Jaime Casanova sent in a patch to document the fact that BRIN's autosummarize parameter is off by default.

Heikki Linnakangas sent in another revision of a patch to simplify COPY FROM parsing by forcing lookahead.

Bruce Momjian sent in another revision of a patch to implement key management.

Amit Langote sent in two more revisions of a patch to allow batching of inserts during cross-partition updates.

Bertrand Drouvot sent in three more revisions of a patch to make it possible to have minimal logical decoding on standbys.

Himanshu Upadhyaya sent in two revisions of a patch to fix an infelicity between PREPARE TRANSACTION and TEMP TABLEs.

Thomas Munro and Kyotaro HORIGUCHI traded patches to remove the read_page callback from XLogReadRecord.

Justin Pryzby sent in two more revisions of a patch to make pg_ls_* show directories and shared filesets.

Hou Zhijie, Masahiko Sawada, Amit Langote, and Shi Yu traded patches to plug a table reference leak in logical replication.

Fabien COELHO and Michaël Paquier traded patches to add a SHOW_ALL_RESULTS option to psql.

Takamichi Osumi sent in another revision of a patch to make it possible to disable WAL logging to speed up data loading.

Yugo Nagata sent in another revision of a patch to implement incremental view maintenance.

Michael Banck sent in another revision of a patch to add a new PGC_ADMINSET GUC context, administrator, and add a pg_change_role_settings predefined role.

Pavel Borisov sent in a patch to ensure same treatment of page header and page special size alignment during page init. Both are now just checked for proper alignment with asserts not MAXALIGNing anything silently. Caller should give properly maxalinged values into page init function.

Thomas Munro sent in another revision of a patch to add a PSQL_WATCH_PAGER setting for psql's \watch command.

Tom Lane sent in three more revisions of a patch to make it possible for psql \df to choose functions by their arguments.

Vigneshwaran C sent in another revision of a patch to identify missing publications from publishers during CREATE/ALTER SUBSCRIPTION.

Ajin Cherian and Amit Kapila traded patches to add missing documentation for streaming in-progress transactions.

Peter Smith sent in three more revisions of a patch to add logical decoding of two-phase transactions.

Thomas Munro sent in another revision of a patch to implement WAL prefetching.

Thomas Munro and Andrey Borodin traded patches to make all SLRU buffer sizes configurable.

Andrey V. Lepikhov sent in another revision of a patch to remove the 64k rangetable limit.

Haotian Wu sent in a patch to add a --drop-cascade to pg_dump/restore.

Bharath Rupireddy sent in a patch to disallow the RESTART option for CREATE SEQUENCE, as it only makes sense in the case of ALTER SEQUENCE.

Andres Freund sent in a patch to fix a race in InvalidateObsoleteReplicationSlots() and re-remove SlotAcquireBehavior.

Bharath Rupireddy sent in three revisions of a patch to simplify the backend terminate and wait logic in the postgres_fdw test.

Justin Pryzby sent in another revision of a patch to change track_activity_query_size from the previously correct Resource Usage / Memory to STATS_COLLECTOR category, make log_autovacuum_min_duration LOGGING_WHAT, make track_commit_timestamp REPLICATION_SENDING, and change force_parallel_mode to a DEVELOPER GUC, and remove it from sample config to help avoid users finding this option and changing it in hopes that it'll make their queries faster, but without reading the documentation or understanding what it does.

Justin Pryzby sent in another revision of a patch to speed up COPY FROM to partitioned tables with foreign partitions.

Thomas Munro sent in a patch to use SIGIO to detect postmaster death.

Pavel Borisov sent in a patch to stabilize the tablespaces test for partitioned indexes. When doing tablespace tests, sometimes (very rarely) the order of parent and child tables changed, which made the test fail.

Maxim Orlov sent in another revision of a patch intended to fix a bug that manifested as SSL negotiation error on massive connect/disconnect.

Andrey V. Lepikhov sent in another revision of a patch to make asymmetric partitionwise joins work more efficiently by teaching the optimizer to consider a partitionwise join of a non-partitioned table with each partition of partitioned table. This technique causes changes to the 'reparameterize by child' machinery.

Michaël Paquier sent in a patch to move tablespace path re-creation from the makefiles to pg_regress.

Pavel Stěhule sent in another revision of a patch to implement schema variables.

Tom Lane sent in a patch intended to fix a bug that manifested as reference leak with type by giving up on associating a long-lived tupdesc refcount with these expression nodes at all, relying instead on the fact that typcache entries never go away once made.

Rémi Lapeyre sent in three more revisions of a patch to add header support to "COPY TO" text format, and a corresponding header matching mode to "COPY FROM".

Justin Pryzby sent in another revision of a patch to ALTER TABLE ... DETACH CONCURRENTLY to avoid creation of redundant constraint.

Pavel Stěhule sent in another revision of a patch to add an --options-file option to pg_dump/pg_restore.

Tom Lane sent in a patch to make PGWARNING and PGERROR universally available.

Peter Geoghegan sent in a patch intended to fix a bug that manifested as PANIC: wrong buffer passed to visibilitymap_clear by acquiring a super-exclusive lock in lazy_vacuum_heap_rel() again.

Ranier Vilela sent in a patch to fix an uninitialized scalar variable in src/backend/statistics/extended_stats.c.