Quick Links

PostgreSQL Weekly News - August 8, 2021

Posted on 2021-08-10 by PWN

PWN

PostgreSQL Weekly News - August 8, 2021

PGConf NYC is happening December 3-4, 2021. The CfP is open, as are opportunities to sponsor.

PostgreSQL Product News

Pgpool-II 4.2.4, 4.1.8, 4.0.15, 3.7.20 and 3.6.27, a connection pooler and statement replication system for PostgreSQL, released

pgmoneta 0.4.0, a backup and restore system for PostgreSQL, released

Buildfarm 13.1 software, a continuous integration system for the PostgreSQL project, released

dbForge Schema Compare 1.2 for PostgreSQL released

pg_timetable 4.0.0, a job scheduler for PostgreSQL, released. https://github.com/cybertec-postgresql/pg_timetable/releases

Person of the week

PostgreSQL Jobs for August

https://archives.postgresql.org/pgsql-jobs/2021-08/

PostgreSQL in the News

Planet PostgreSQL: https://planet.postgresql.org/

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm PST8PDT to david@fetter.org.

Applied Patches

Amit Kapila pushed:

Fix test failure in 021_twophase.pl. The test is expecting two prepared transactions corresponding to two subscriptions but it waits to catch up for just one subscription. Fix it by allowing to wait for both subscriptions. Reported-by: Michael Paquier, as per buildfarm Author: Ajin Cherian Reviewed-By: Amit Kapila, Vignesh C, Peter Smith Discussion: https://postgr.es/m/CAA4eK1+_0iNQ8Z=KVTjmmAqNX-hyv+1+fnZ-Yx8CVP=uAcekqw@mail.gmail.com https://git.postgresql.org/pg/commitdiff/eaf5321c352478266cebe2aa50ea4c34a8fdd2c7
Add prepare API support for streaming transactions in logical replication. Commit a8fd13cab0 added support for prepared transactions to built-in logical replication via a new option "two_phase" for a subscription. The "two_phase" option was not allowed with the existing streaming option. This commit permits the combination of "streaming" and "two_phase" subscription options. It extends the pgoutput plugin and the subscriber side code to add the prepare API for streaming transactions which will apply the changes accumulated in the spool-file at prepare time. Author: Peter Smith and Ajin Cherian Reviewed-by: Vignesh C, Amit Kapila, Greg Nancarrow Tested-By: Haiying Tang Discussion: https://postgr.es/m/02DA5F5E-CECE-4D9C-8B4B-418077E2C010@postgrespro.ru Discussion: https://postgr.es/m/CAMGcDxeqEpWj3fTXwqhSwBdXd2RS9jzwWscO-XbeCfso6ts3+Q@mail.gmail.com https://git.postgresql.org/pg/commitdiff/63cf61cdeb7b0450dcf3b2f719c553177bac85a2

Etsuro Fujita pushed:

Fix oversight in commit 1ec7fca8592178281cd5cdada0f27a340fb813fc. I failed to account for the possibility that when ExecAppendAsyncEventWait() notifies multiple async-capable nodes using postgres_fdw, a preceding node might invoke process_pending_request() to process a pending asynchronous request made by a succeeding node. In that case the succeeding node should produce a tuple to return to the parent Append node from tuples fetched by process_pending_request() when notified. Repair. Per buildfarm via Michael Paquier. Back-patch to v14, like the previous commit. Thanks to Tom Lane for testing. Discussion: https://postgr.es/m/YQP0UPT8KmPiHTMs%40paquier.xyz https://git.postgresql.org/pg/commitdiff/a8ed9bd59d48c13da50ed2358911721b2baf1de3
postgres_fdw: Fix issues with generated columns in foreign tables. postgres_fdw imported generated columns from the remote tables as plain columns, and caused failures like "ERROR: cannot insert a non-DEFAULT value into column "foo"" when inserting into the foreign tables, as it tried to insert values into the generated columns. To fix, we do the following under the assumption that generated columns in a postgres_fdw foreign table are defined so that they represent generated columns in the underlying remote table: * Send DEFAULT for the generated columns to the foreign server on insert or update, not generated column values computed on the local server.
Add to postgresImportForeignSchema() an option "import_generated" to include column generated expressions in the definitions of foreign tables imported from a foreign server. The option is true by default. The assumption seems reasonable, because that would make a query of the postgres_fdw foreign table return values for the generated columns that are consistent with the generated expression. While here, fix another issue in postgresImportForeignSchema(): it tried to include column generated expressions as column default expressions in the foreign table definitions when the import_default option was enabled. Per bug #16631 from Daniel Cherniy. Back-patch to v12 where generated columns were added. Discussion: https://postgr.es/m/16631-e929fe9db0ffc7cf%40postgresql.org https://git.postgresql.org/pg/commitdiff/aa769f80ed80b7adfbdea9a6bc267ba4aeb80fd7

Andres Freund pushed:

Remove misplaced comment from AuxiliaryProcessMain(). The comment didn't make sense anymore since at least 626eb021988. As it didn't actually explain anything anyway, just remove it. Author: Andres Freund andres@anarazel.de https://git.postgresql.org/pg/commitdiff/8b1de88b7ce9fe0458d3950121a797fd3d988f6c
pgstat: split reporting/fetching of bgwriter and checkpointer stats. These have been unrelated since bgwriter and checkpointer were split into two processes in 806a2aee379. As there several pending patches (shared memory stats, extending the set of tracked IO / buffer statistics) that are made a bit more awkward by the grouping, split them. Done separately to make reviewing easier. This does not change the contents of pg_stat_bgwriter or move fields out of bgwriter/checkpointer stats that arguably do not belong in either. However pgstat_fetch_global() was renamed and split into pgstat_fetch_stat_checkpointer() and pgstat_fetch_stat_bgwriter(). Author: Andres Freund andres@anarazel.de Discussion: https://postgr.es/m/20210405092914.mmxqe7j56lsjfsej@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/1bc8e7b0991c1eae5fa6dc2d29bb2280efb52740
pgbench: When using pipelining only do PQconsumeInput() when necessary. Up to now we did a PQconsumeInput() for each pipelined query, asking the OS for more input - which it often won't have, as all results might already have been sent. That turns out to have a noticeable performance impact. Alvaro Herrera reviewed the idea to add the PQisBusy() check, but not this concrete patch. Author: Andres Freund andres@anarazel.de Discussion: https://postgr.es/m/20210720180039.23rivhdft3l4mayn@alap3.anarazel.de Backpatch: 14, where libpq/pgbench pipelining was introduced. https://git.postgresql.org/pg/commitdiff/87bff68840d542011ed8f60427502fb90fdf2873
process startup: Rename postmaster's --forkboot to --forkaux. It is confusing that aux processes are started with --forkboot, given that bootstrap mode cannot run below postmaster. Author: Andres Freund andres@anarazel.de Reviewed-By: Kyotaro Horiguchi horikyota.ntt@gmail.com Reviewed-By: Robert Haas robertmhaas@gmail.com Discussion: https://postgr.es/m/20210802164124.ufo5buo4apl6yuvs@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/50017f77722b8b998ead5ca6fdb0b821fe7a34d2
process startup: Separate out BootstrapModeMain from AuxiliaryProcessMain. There practically was no shared code between the two, once all the ifs are removed. And it was quite confusing that aux processes weren't actually started by the call to AuxiliaryProcessMain() in main(). There's more to do, AuxiliaryProcessMain() should move out of bootstrap.c, and BootstrapModeMain() shouldn't use/be part of AuxProcType. Author: Andres Freund andres@anarazel.de Reviewed-By: Kyotaro Horiguchi horikyota.ntt@gmail.com Reviewed-By: Robert Haas robertmhaas@gmail.com Discussion: https://postgr.es/m/20210802164124.ufo5buo4apl6yuvs@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/5aa4a9d2077fa902b4041245805082fec6be0648
process startup: auxprocess: reindent block. Kept separate for ease of review, particularly because pgindent insists on reflowing a few comments. Author: Andres Freund andres@anarazel.de Reviewed-By: Kyotaro Horiguchi horikyota.ntt@gmail.com Reviewed-By: Robert Haas robertmhaas@gmail.com Discussion: https://postgr.es/m/20210802164124.ufo5buo4apl6yuvs@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/27f790346621e1db3cc0305e7ae2b2cbfb537aa6
process startup: Move AuxiliaryProcessMain into its own file. After the preceding commits the auxprocess code is independent from bootstrap.c - so a dedicated file seems less confusing. Author: Andres Freund andres@anarazel.de Reviewed-By: Kyotaro Horiguchi horikyota.ntt@gmail.com Reviewed-By: Robert Haas robertmhaas@gmail.com Discussion: https://postgr.es/m/20210802164124.ufo5buo4apl6yuvs@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/0a692109dcc73178962069addf7478ac89950e4d
process startup: Remove bootstrap / checker modes from AuxProcType. Neither is actually initialized as an auxiliary process, so it does not really make sense to reserve a PGPROC etc for them. This keeps checker mode implemented by exiting partway through bootstrap mode. That might be worth changing at some point, perhaps if we ever extend checker mode to be a more general tool. Author: Andres Freund andres@anarazel.de Reviewed-By: Kyotaro Horiguchi horikyota.ntt@gmail.com Reviewed-By: Robert Haas robertmhaas@gmail.com Discussion: https://postgr.es/m/20210802164124.ufo5buo4apl6yuvs@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/f8dd4ecb0b7fc3420e199021375e622815cd326f
process startup: Centralize pgwin32_signal_initialize() calls. For one, the existing location lead to somewhat awkward code in main(). For another, the new location is easier to understand anyway. Author: Andres Freund andres@anarazel.de Reviewed-By: Kyotaro Horiguchi horikyota.ntt@gmail.com Reviewed-By: Robert Haas robertmhaas@gmail.com Discussion: https://postgr.es/m/20210802164124.ufo5buo4apl6yuvs@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/07bf37850991c68a7038fb06186bddfd64c72faf
Call pgwin32_signal_initialize() in postmaster as well. This was an oversight in 07bf3785099. While it does reduce the benefit of the simplification due to that commit, it still seems like a win to me. It seems like it might be a good idea to have a function mirroring InitPostmasterChild() / InitStandaloneProcess() for postmaster in miscinit.c to make it easier to keep initialization between the three possible environment in sync. Author: Andres Freund andres@anarazel.de Discussion: https://postgr.es/m/20210805214109.lzfk3r3ae37bahmv@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/0de13bbc47d19c95de132cc85c341fdab079c170
process startup: Always call Init[Auxiliary]Process() before BaseInit(). For EXEC_BACKEND InitProcess()/InitAuxiliaryProcess() needs to have been called well before we call BaseInit(), as SubPostmasterMain() needs LWLocks to work. Having the order of initialization differ between platforms makes it unnecessarily hard to understand the system and to add initialization points for new subsystems without a lot of duplication. To be able to change the order, BaseInit() cannot trigger CreateSharedMemoryAndSemaphores() anymore - obviously that needs to have happened before we can call InitProcess(). It seems cleaner to create shared memory explicitly in single user/bootstrap mode anyway. After this change the separation of bufmgr initialization into InitBufferPoolAccess() / InitBufferPoolBackend() is not meaningful anymore so the latter is removed. Author: Andres Freund andres@anarazel.de Reviewed-By: Kyotaro Horiguchi horikyota.ntt@gmail.com Discussion: https://postgr.es/m/20210802164124.ufo5buo4apl6yuvs@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/b406478b87e2234c0be4ca4105eee3bb466a646b
pgstat: Bring up pgstat in BaseInit() to fix uninitialized use of pgstat by AV. Previously pgstat_initialize() was called in InitPostgres() and AuxiliaryProcessMain(). As it turns out there was at least one case where we reported stats before pgstat_initialize() was called, see AutoVacWorkerMain()'s intentionally early call to pgstat_report_autovac(). This turns out to not be a problem with the current pgstat implementation as pgstat_initialize() only registers a shutdown callback. But in the shared memory based stats implementation we are working towards pgstat_initialize() has to do more work. After b406478b87e BaseInit() is a central place where initialization shared by normal backends and auxiliary backends can be put. Obviously BaseInit() is called before InitPostgres() registers ShutdownPostgres. Previously ShutdownPostgres was the first before_shmem_exit callback, now that's commonly pgstats. That should be fine. Previously pgstat_initialize() was not called in bootstrap mode, but there does not appear to be a need for that. It's now done unconditionally. To detect future issues like this, assertions are added to a few places verifying that the pgstat subsystem is initialized and not yet shut down. Author: Andres Freund andres@anarazel.de Discussion: https://postgr.es/m/20210405092914.mmxqe7j56lsjfsej@alap3.anarazel.de Discussion: https://postgr.es/m/20210802164124.ufo5buo4apl6yuvs@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/ee3f8d3d3aec0d7c961d6b398d31504bb272a450
Make parallel worker shutdown complete entirely via before_shmem_exit(). This is a step toward storing stats in dynamic shared memory. As dynamic shared memory segments are detached from just after before_shmem_exit() callbacks are processed, but before on_shmem_exit() callbacks are, no stats can be collected after before_shmem_exit() callbacks have been processed. Parallel worker shutdown can cause stats to be emitted during DSM detach callbacks, e.g. for SharedFileSet (which closes its files, which can causes fd.c to emit stats about temporary files). Therefore parallel worker shutdown needs to complete during the processing of before_shmem_exit callbacks. One might think this problem could instead be solved by carefully ordering the attaching to DSM segments, so that the pgstats segments get detached from later than the parallel query ones. That turns out to not work because the stats hash might need to grow which can cause new segments to be allocated, which then will be detached from earlier. There are two code changes: First, call ParallelWorkerShutdown() via before_shmem_exit. That's a good idea on its own, because other shutdown callbacks like ShutdownPostgres and ShutdownAuxiliaryProcess are called via before_*. Second, explicitly detach from the parallel query DSM segment, thereby ensuring all stats are emitted during ParallelWorkerShutdown(). There are nicer solutions to these problems, but it's not obvious which of those solutions is the correct one. As the shared memory stats work already is a huge amount of work... Author: Andres Freund andres@anarazel.de Discussion: https://postgr.es/m/20210405092914.mmxqe7j56lsjfsej@alap3.anarazel.de Discussion: https://postgr.es/m/20210803023612.iziacxk5syn2r4ut@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/fa91d4c91f28f4819dc54f93adbd413a685e366a
Schedule ShutdownXLOG() in single user mode using before_shmem_exit(). Previously on_shmem_exit() was used. The upcoming shared memory stats patch uses DSM segments to store stats, which can not be used after the dsm_backend_shutdown() call in shmem_exit(). There does not seem to be any reason to do ShutdownXLOG() via on_shmem_exit(), so change it. Author: Andres Freund andres@anarazel.de Discussion: https://postgr.es/m/20210405092914.mmxqe7j56lsjfsej@alap3.anarazel.de Discussion: https://postgr.es/m/20210803023612.iziacxk5syn2r4ut@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/a1bb3d5dbe6a66ae73d7805a63b951793b5d55df
pgstat: Schedule per-backend pgstat shutdown via before_shmem_exit(). Previously on_shmem_exit() was used. The upcoming shared memory stats patch uses DSM segments to store stats, which can not be used after the dsm_backend_shutdown() call in shmem_exit(). The preceding commits were required to permit this change. This commit is split off the shared memory stats patch to make it easier to isolate problems caused by the ordering changes rather than the much larger changes in where stats are stored. Author: Andres Freund andres@anarazel.de Author: Kyotaro Horiguchi horikyota.ntt@gmail.com Discussion: https://postgr.es/m/20210405092914.mmxqe7j56lsjfsej@alap3.anarazel.de Discussion: https://postgr.es/m/20210802164124.ufo5buo4apl6yuvs@alap3.anarazel.de Discussion: https://postgr.es/m/20210803023612.iziacxk5syn2r4ut@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/fb2c5028e63589c01fbdf8b031be824766dc7a68
Move temporary file cleanup to before_shmem_exit(). As reported by a few OSX buildfarm animals there exist at least one path where temporary files exist during AtProcExit_Files() processing. As temporary file cleanup causes pgstat reporting, the assertions added in ee3f8d3d3ae caused failures. This is not an OSX specific issue, we were just lucky that timing on OSX reliably triggered the problem. The known way to cause this is a FATAL error during perform_base_backup() with a MANIFEST used - adding an elog(FATAL) after InitializeBackupManifest() reliably reproduces the problem in isolation. The problem is that the temporary file created in InitializeBackupManifest() is not cleaned up via resource owner cleanup as WalSndResourceCleanup() currently is only used for non-FATAL errors. That then allows to reach AtProcExit_Files() with existing temporary files, causing the assertion failure. To fix this problem, move temporary file cleanup to a before_shmem_exit() hook and add assertions ensuring that no temporary files are created before / after temporary file management has been initialized / shut down. The cleanest way to do so seems to be to split fd.c initialization into two, one for plain file access and one for temporary file access. Right now there's no need to perform further fd.c cleanup during process exit, so I just renamed AtProcExit_Files() to BeforeShmemExit_Files(). Alternatively we could perform another pass through the files to check that no temporary files exist, but the added assertions seem to provide enough protection against that. It might turn out that the assertions added in ee3f8d3d3ae will cause too much noise - in that case we'll have to downgrade them to a WARNING, at least temporarily. This commit is not necessarily the best approach to address this issue, but it should resolve the buildfarm failures. We can revise later. Author: Andres Freund andres@anarazel.de Discussion: https://postgr.es/m/20210807190131.2bm24acbebl4wl6i@alap3.anarazel.de https://git.postgresql.org/pg/commitdiff/675c945394b36c2db0e8c8c9f6209c131ce3f0a8

Thomas Munro pushed:

Run checkpointer and bgwriter in crash recovery. Start up the checkpointer and bgwriter during crash recovery (except in --single mode), as we do for replication. This wasn't done back in commit cdd46c76 out of caution. Now it seems like a better idea to make the environment as similar as possible in both cases. There may also be some performance advantages. Reviewed-by: Robert Haas robertmhaas@gmail.com Reviewed-by: Aleksander Alekseev aleksander@timescale.com Tested-by: Jakub Wartak Jakub.Wartak@tomtom.com Discussion: https://postgr.es/m/CA%2BhUKGJ8NRsqgkZEnsnRc2MFROBV-jCnacbYvtpptK2A9YYp9Q%40mail.gmail.com https://git.postgresql.org/pg/commitdiff/7ff23c6d277d1d90478a51f0dd81414d343f3850
Further simplify a bit of logic in StartupXLOG(). Commit 7ff23c6d277d1d90478a51f0dd81414d343f3850 left us with two identical cases. Collapse them. Author: Robert Haas robertmhaas@gmail.com Discussion: https://postgr.es/m/CA%2BhUKGJ8NRsqgkZEnsnRc2MFROBV-jCnacbYvtpptK2A9YYp9Q%40mail.gmail.com https://git.postgresql.org/pg/commitdiff/8f7c8e2bef2bd2587e5d66dd20de15f3db0a6bcb

Tom Lane pushed:

Doc: minor improvements for logical replication protocol documentation. Where appropriate, annotate message field data types with the backend code's name for the data type, eg XLogRecPtr or TimestampTz. Previously we just said "Int64" which didn't provide as much info. Also clarify references to object OIDs, and make use of the existing convention to denote the value of a field that must have a fixed value. Vignesh C, reviewed by Peter Smith and Euler Taveira. Discussion: https://postgr.es/m/CALDaNm0+fatx57KFcBopUZWQpH_tz3WKKfm-_eiTwcXui5BnhQ@mail.gmail.com https://git.postgresql.org/pg/commitdiff/a5cb4f9829fbfd68655543d2d371a18a8eb43b84
Add assorted new regexp_xxx SQL functions. This patch adds new functions regexp_count(), regexp_instr(), regexp_like(), and regexp_substr(), and extends regexp_replace() with some new optional arguments. All these functions follow the definitions used in Oracle, although there are small differences in the regexp language due to using our own regexp engine -- most notably, that the default newline-matching behavior is different. Similar functions appear in DB2 and elsewhere, too. Aside from easing portability, these functions are easier to use for certain tasks than our existing regexp_match[es] functions. Gilles Darold, heavily revised by me Discussion: https://postgr.es/m/fc160ee0-c843-b024-29bb-97b5da61971f@darold.net https://git.postgresql.org/pg/commitdiff/6424337073589476303b10f6d7cc74f501b8d9d7
Don't elide casting to typmod -1. Casting a value that's already of a type with a specific typmod to an unspecified typmod doesn't do anything so far as run-time behavior is concerned. However, it really ought to change the exposed type of the expression to match. Up to now, coerce_type_typmod hasn't bothered with that, which creates gotchas in contexts such as recursive unions. If for example one side of the union is numeric(18,3), but it needs to be plain numeric to match the other side, there's no direct way to express that. This is easy enough to fix, by inserting a RelabelType to update the exposed type of the expression. However, it's a bit nervous-making to change this behavior, because it's stood for a really long time. (I strongly suspect that it's like this in part because the logic pre-dates the introduction of RelabelType in 7.0. The commit log message for 57b30e8e2 is interesting reading here.) As a compromise, we'll sneak the change into 14beta3, and consider back-patching to stable branches if no complaints emerge in the next three months. Discussion: https://postgr.es/m/CABNQVagu3bZGqiTjb31a8D5Od3fUMs7Oh3gmZMQZVHZ=uWWWfQ@mail.gmail.com https://git.postgresql.org/pg/commitdiff/5c056b0c2519e602c2e98bace5b16d2ecde6454b
Really fix the ambiguity in REFRESH MATERIALIZED VIEW CONCURRENTLY. Rather than trying to pick table aliases that won't conflict with any possible user-defined matview column name, adjust the queries' syntax so that the aliases are only used in places where they can't be mistaken for column names. Mostly this consists of writing "alias.*" not just "alias", which adds clarity for humans as well as machines. We do have the issue that "SELECT alias.*" acts differently from "SELECT alias", but we can use the same hack ruleutils.c uses for whole-row variables in SELECT lists: write "alias.*::compositetype". We might as well revert to the original aliases after doing this; they're a bit easier to read. Like 75d66d10e, back-patch to all supported branches. Discussion: https://postgr.es/m/2488325.1628261320@sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/9179a82d7af38112cd0f6e84ab62d0b3592a6e0e
Make regexp engine's backref-related compilation state more bulletproof. Up to now, we remembered the definition of a capturing parenthesis subexpression by storing a pointer to the associated subRE node. That was okay before, because that subRE didn't get modified anymore while parsing the rest of the regexp. However, in the wake of commit ea1268f63, that's no longer true: the outer invocation of parseqatom() feels free to scribble on that subRE. This seems to work anyway, because the states we jam into the child atom in the "prepare a general-purpose state skeleton" stanza aren't really semantically different from the original endpoints of the child atom. But that would be mighty easy to break, and it's definitely not how things worked before. Between this and the issue fixed in the prior commit, it seems best to get rid of this dependence on subRE nodes entirely. We don't need the whole child subRE for future backrefs, only its starting and ending NFA states; so let's just store pointers to those. Also, in the corner case where we make an extra subRE to handle immediately-nested capturing parentheses, it seems like it'd be smart to have the extra subRE have the same begin/end states as the original child subRE does (s/s2 not lp/rp). I think that linking it from lp to rp might actually be semantically wrong, though since Spencer's original code did it that way, I'm not totally certain. Using s/s2 is certainly not wrong, in any case. Per report from Mark Dilger. Back-patch to v14 where the problematic patches came in. Discussion: https://postgr.es/m/0203588E-E609-43AF-9F4F-902854231EE7@enterprisedb.com https://git.postgresql.org/pg/commitdiff/cb76fbd7ec87e44b3c53165d68dc2747f7e26a9a
Fix use-after-free issue in regexp engine. Commit cebc1d34e taught parseqatom() to optimize cases where a branch contains only one, "messy", atom by getting rid of excess subRE nodes. The way we really should do that is to keep the subRE built for the "messy" child atom; but to avoid changing parseqatom's nominal API, I made it delete that node after copying its fields to the outer subRE made by parsebranch(). It seems that that actually worked at the time; but it became dangerous after ea1268f63, because that later commit allowed the lower invocation of parse() to return a subRE that was also pointed to by some v->subs[] entry. This meant we could wind up with a dangling pointer in v->subs[], allowing a later backref to misbehave, but only if that subRE struct had been reused in between. So the damage seems confined to cases like '((...))...(...\2'. To fix, do what I should have done before and modify parseqatom's API to make it possible for it to remove the caller's subRE instead of the callee's. That's safer because we know that subRE isn't complete yet, so noplace else will have a pointer to it. Per report from Mark Dilger. Back-patch to v14 where the problematic patches came in. Discussion: https://postgr.es/m/0203588E-E609-43AF-9F4F-902854231EE7@enterprisedb.com https://git.postgresql.org/pg/commitdiff/cc1868799c8311ed1cc3674df2c5e1374c914deb
Rethink regexp engine's backref-related compilation state. I had committer's remorse almost immediately after pushing cb76fbd7e, upon finding that removing capturing subexpressions' subREs from the data structure broke my proposed patch for REG_NOSUB optimization. Revert that data structure change. Instead, address the concern about not changing capturing subREs' endpoints by not changing the endpoints. We don't need to, because the point of that bit was just to ensure that the atom has endpoints distinct from the outer state pair that we're stringing the branch between. We already made suitable states in the parenthesized-subexpression case, so the additional ones were just useless overhead. This seems more understandable than Spencer's original coding, and it ought to be a shade faster too by saving a few state creations and arc changes. (I actually see a couple percent improvement on Jacobson's web corpus, though that's barely above the noise floor so I wouldn't put much stock in that result.) Also, fix the logic added by ea1268f63 to ensure that the subRE recorded in v->subs[subno] is exactly the one with capno == subno. Spencer's original coding recorded the child subRE of the capture node, which is okay so far as having the right endpoint states is concerned, but as of cb76fbd7e the capturing subRE itself always has those endpoints too. I think the inconsistency is confusing for the REG_NOSUB optimization. As before, backpatch to v14. Discussion: https://postgr.es/m/0203588E-E609-43AF-9F4F-902854231EE7@enterprisedb.com https://git.postgresql.org/pg/commitdiff/00116dee5ad4c1964777c91e687bb98b1d9f7ea0
Doc: remove bogus <indexterm> items. Copy-and-pasteo in 665c5855e, evidently. The 9.6 docs toolchain whined about duplicate index entries, though our modern toolchain doesn't. In any case, these GUCs surely are not about the default settings of these values. https://git.postgresql.org/pg/commitdiff/cf5ce5aa70d33fc3048ab63c50585b8cc0f11dfd

David Rowley pushed:

Track a Bitmapset of non-pruned partitions in RelOptInfo. For partitioned tables with large numbers of partitions where queries are able to prune all but a very small number of partitions, the time spent in the planner looping over RelOptInfo.part_rels checking for non-NULL RelOptInfos could become a large portion of the overall planning time. Here we add a Bitmapset that records the non-pruned partitions. This allows us to more efficiently skip the pruned partitions by looping over the Bitmapset. This will cause a very slight slow down in cases where no or not many partitions could be pruned, however, those cases are already slow to plan anyway and the overhead of looping over the Bitmapset would be unmeasurable when compared with the other tasks such as path creation for a large number of partitions. Reviewed-by: Amit Langote, Zhihong Yu Discussion: https://postgr.es/m/CAApHDvqnPx6JnUuPwaf5ao38zczrAb9mxt9gj4U1EKFfd4AqLA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/475dbd0b718de8ac44da144f934651b959e3b705
Allow ordered partition scans in more cases. 959d00e9d added the ability to make use of an Append node instead of a MergeAppend when we wanted to perform a scan of a partitioned table and the required sort order was the same as the partitioned keys and the partitioned table was defined in such a way that earlier partitions were guaranteed to only contain lower-order values than later partitions. However, previously we didn't allow these ordered partition scans for LIST partitioned table when there were any partitions that allowed multiple Datums. This was a very cheap check to make and we could likely have done a little better by checking if there were interleaved partitions, but at the time we didn't have visibility about which partitions were pruned, so we still may have disallowed cases where all interleaved partitions were pruned. Since 475dbd0b7, we now have knowledge of pruned partitions, we can do a much better job inside partitions_are_ordered(). Here we pass which partitions survived partition pruning into partitions_are_ordered() and, for LIST partitioning, have it check to see if any live partitions exist that are also in the new "interleaved_parts" field defined in PartitionBoundInfo. For RANGE partitioning we can relax the code which caused the partitions to be unordered if a DEFAULT partition existed. Since we now know which partitions were pruned, partitions_are_ordered() now returns true when the DEFAULT partition was pruned. Reviewed-by: Amit Langote, Zhihong Yu Discussion: https://postgr.es/m/CAApHDvrdoN_sXU52i=QDXe2k3WAo=EVry29r2+Tq2WYcn2xhEA@mail.gmail.com https://git.postgresql.org/pg/commitdiff/db632fbca392389807ffb9d9b2207157e8e9b3e8
Remove unused function declaration. It appears that check_track_commit_timestamp was declared but has never been defined in our code base. Likely this is just leftover cruft from a development version of the original patch to add commit timestamps. Let's just remove the useless declaration. The inclusion of guc.h also seems surplus to requirements. Author: Andrey Lepikhov Discussion: https://postgr.es/m/f49aefb5-edbb-633a-af07-3e777023a94d@postgrespro.ru https://git.postgresql.org/pg/commitdiff/75a2d132eaef7d951db894cf3201f85e9a524774

Bruce Momjian pushed:

doc: add example of using pg_dump with GNU split and gzip. This is only possible with GNU split, not other versions like BSD split. Reported-by: jim@jdoherty.net Discussion: https://postgr.es/m/162653459215.701.6323855956817776386@wrigleys.postgresql.org Backpatch-through: 9.6 https://git.postgresql.org/pg/commitdiff/cfbbb8610d17bc6d82f37a446c38b29e2a5258f4
doc: mention inheritance's tableoid can be used in partitioning. Previously tableoid was not mentioned in the partition doc section. We only had a link to the "all the normal rules" of inheritance section. Reported-by: michal.palenik@freemap.sk Discussion: https://postgr.es/m/162627031219.693.11508199541771263335@wrigleys.postgresql.org Backpatch-through: 10 https://git.postgresql.org/pg/commitdiff/691272cae96b3c947d3d2d4d8c43c499e72c65a2
pg_upgrade: improve docs about extension upgrades. The previous wording was unclear about the steps needed to upgrade extensions, and how to update them after pg_upgrade. Reported-by: Dave Cramer Discussion: https://postgr.es/m/CADK3HHKawwbOcGwMGnDuAf3-U8YfvTcS8jqDv3UM=niijs3MMA@mail.gmail.com Backpatch-through: 9.6 https://git.postgresql.org/pg/commitdiff/5090d709f172ecd00b16b6e336c8c149a3f3d33d
pg_upgrade: warn about extensions that need updating. Also create a script that can be run to update them. Reported-by: Dave Cramer Discussion: https://postgr.es/m/CADK3HHKawwbOcGwMGnDuAf3-U8YfvTcS8jqDv3UM=niijs3MMA@mail.gmail.com Backpatch-through: 9.6 https://git.postgresql.org/pg/commitdiff/e462856a7a559c94bad51507c6b324f337d8254b
interval: round values when spilling to months. Previously spilled units greater than months were truncated to months. Also document the spill behavior. Reported-by: Bryn Llewelly Discussion: https://postgr.es/m/BDAE4B56-3337-45A2-AC8A-30593849D6C0@yugabyte.com Backpatch-through: master https://git.postgresql.org/pg/commitdiff/95ab1e0a9db321dd796344d526457016eada027f
C comment: correct heading of extension query. Reported-by: Justin Pryzby Discussion: https://postgr.es/m/20210803161345.GZ12533@telsasoft.com Backpatch-through: 9.6 https://git.postgresql.org/pg/commitdiff/9e51cc87fd0ac46b183cb7302a6751d52d3f159a

Peter Geoghegan pushed:

Make vacuum_index_cleanup reloption RELOPT_TYPE_ENUM. Oversight in commit 3499df0d, which generalized the reloption as a way of giving users a way to consistently avoid VACUUM's index bypass optimization. Per off-list report from Nikolay Shaplov. Backpatch: 14-, where index cleanup reloption was extended. https://git.postgresql.org/pg/commitdiff/cc8033e1dafe89271aa86c2f2f86a828956929f0

Dean Rasheed pushed:

Fix division-by-zero error in to_char() with 'EEEE' format. This fixes a long-standing bug when using to_char() to format a numeric value in scientific notation -- if the value's exponent is less than -NUMERIC_MAX_DISPLAY_SCALE-1 (-1001), it produced a division-by-zero error. The reason for this error was that get_str_from_var_sci() divides its input by 10^exp, which it produced using power_var_int(). However, the underflow test in power_var_int() causes it to return zero if the result scale is too small. That's not a problem for power_var_int()'s only other caller, power_var(), since that limits the rscale to 1000, but in get_str_from_var_sci() the exponent can be much smaller, requiring a much larger rscale. Fix by introducing a new function to compute 10^exp directly, with no rscale limit. This also allows 10^exp to be computed more efficiently, without any numeric multiplication, division or rounding. Discussion: https://postgr.es/m/CAEZATCWhojfH4whaqgUKBe8D5jNHB8ytzemL-PnRx+KCTyMXmg@mail.gmail.com https://git.postgresql.org/pg/commitdiff/226ec49ffd78c0f246da8fceb3094991dd2302ff
Adjust the integer overflow tests in the numeric code. Formerly, the numeric code tested whether an integer value of a larger type would fit in a smaller type by casting it to the smaller type and then testing if the reverse conversion produced the original value. That's perfectly fine, except that it caused a test failure on buildfarm animal castoroides, most likely due to a compiler bug. Instead, do these tests by comparing against PG_INT16/32_MIN/MAX. That matches existing code in other places, such as int84(), which is more widely tested, and so is less likely to go wrong. While at it, add regression tests covering the numeric-to-int8/4/2 conversions, and adjust the recently added tests to the style of 434ddfb79a (on the v11 branch) to make failures easier to diagnose. Per buildfarm via Tom Lane, reviewed by Tom Lane. Discussion: https://postgr.es/m/2394813.1628179479%40sss.pgh.pa.us https://git.postgresql.org/pg/commitdiff/2642df9fac09540c761441edd9bdd0a72c62f39c

Fujii Masao pushed:

Remove unused argument "txn" in maybe_send_schema(). Commit 464824323e added the argument "txn" into maybe_send_schema() though it was not used. Author: Hou Zhijie Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/OS0PR01MB5716146E43928FB92D45D29794EC9@OS0PR01MB5716.jpnprd01.prod.outlook.com https://git.postgresql.org/pg/commitdiff/93d573d86571d148e2d14415166ec6981d34ea9d

Peter Eisentraut pushed:

Fix wording. https://git.postgresql.org/pg/commitdiff/05e60aece33e84960ef566e4735b2d93c2d791c8
Add missing message punctuation. https://git.postgresql.org/pg/commitdiff/ba4eb86ceff4eded5920d36ef82a2096db7ad0c0
Message style improvements. https://git.postgresql.org/pg/commitdiff/f4f4a649d80fea3ae0214b06e160eb5d46b48a16
pg_amcheck: Add missing translation markers. https://git.postgresql.org/pg/commitdiff/789d8060f0517d4da0776480d937d8b64d5c5976
pg_amcheck: Message style improvements. https://git.postgresql.org/pg/commitdiff/80dfbbf1b17a1fb1123401799efdc660ee977051
Remove T_MemoryContext. This is an abstract node that shouldn't have a node tag defined. Discussion: https://www.postgresql.org/message-id/flat/c1097590-a6a4-486a-64b1-e1f9cc0533ce@enterprisedb.com https://git.postgresql.org/pg/commitdiff/256909c6c1679767230d1088f1bfab727dd99e14
Change NestPath node to contain JoinPath node. This makes the structure of all JoinPath-derived nodes the same, independent of whether they have additional fields. Discussion: https://www.postgresql.org/message-id/flat/c1097590-a6a4-486a-64b1-e1f9cc0533ce@enterprisedb.com https://git.postgresql.org/pg/commitdiff/18fea737b5e47cc677eaf97365c765a47f346763
Change SeqScan node to contain Scan node. This makes the structure of all Scan-derived nodes the same, independent of whether they have additional fields. Discussion: https://www.postgresql.org/message-id/flat/c1097590-a6a4-486a-64b1-e1f9cc0533ce@enterprisedb.com https://git.postgresql.org/pg/commitdiff/2226b4189bb4ccfcc53917a8695d24e91ff2f950
Check the size in COPY_POINTER_FIELD. instead of making each caller do it. Discussion: https://www.postgresql.org/message-id/flat/c1097590-a6a4-486a-64b1-e1f9cc0533ce@enterprisedb.com https://git.postgresql.org/pg/commitdiff/c1132aae336c41cf9d316222e525d8d593c2b5d2
Remove some unnecessary casts in format arguments. We can use %zd or %zu directly, no need to cast to int. Conversely, some code was casting away from int when it could be using %d directly. https://git.postgresql.org/pg/commitdiff/ae03a7c7391dfc59f14226b7cfd8ef9f3390e56f