Quick Links

== PostgreSQL Weekly News - March 1, 2020 ==

From:	David Fetter <david(at)fetter(dot)org>
To:	PostgreSQL Announce <pgsql-announce(at)postgresql(dot)org>
Subject:	== PostgreSQL Weekly News - March 1, 2020 ==
Date:	2020-03-01 23:36:46
Message-ID:	20200301233645.GA4835@fetter.org
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-announce

Postgres Ibiza will be held in Ibiza, Spain on June 25-26, 2020. The CfP is open until
March 8, 2020.
https://pgibz.io/

PGDay Ukraine will take place April 25th, 2020 in Lviv at the Bank Hotel.
https://pgday.org.ua/

== PostgreSQL Product News ==

pgBackRest 2.24, a backup and restore system for PostgreSQL, released.
https://pgbackrest.org/release.html#2.24

Database Lab 0.3.0, a tool for fast cloning of large PostgreSQL databases to
build non-production environments, released:
https://gitlab.com/postgres-ai/database-lab

Joe 0.5.0, a Slack chatbot that helps backend developers and DBAs troubleshoot
and optimize PostgreSQL queries, releaesd.
https://gitlab.com/postgres-ai/joe/-/releases#0.5.0

HighGo PostgreSQL Server 1.2, a PostgreSQL derivative with a focus on database
backup performance enhancement and maintaining high usability, released.
https://www.highgo.ca/products/

pgquarrel 0.7.0, a tool which compares PostgreSQL database schemas and outputs a
set of commands to turn a database schema into another one, released.
http://eulerto.github.io/pgquarrel

== PostgreSQL Jobs for March ==

http://archives.postgresql.org/pgsql-jobs/2020-03/

== PostgreSQL Local ==

pgDay Israel 2020 will take place on March 19, 2020 in Tel Aviv.
http://pgday.org.il/

pgDay Paris 2020 will be held in Paris, France on March 26, 2020 at Espace
Saint-Martin.
https://2020.pgday.paris/

Nordic PGDay 2020 will be held in Helsinki, Finland at the Hilton Helsinki
Strand Hotel on March 24, 2020.

PostgreSQL(at)SCaLE is a two day, two track event which takes place on
March 5-6, 2020, at Pasadena Convention Center, as part of SCaLE 18X.
https://www.socallinuxexpo.org/scale/18x/postgresscale

PGConfNepal 2020 will be held April 17-18, 2020 at Kathmandu University, Dhulikhel,
Nepal.
https://pgconf.org.np/

The German-speaking PostgreSQL Conference 2020 will take place on May 15, 2019
in Stuttgart.

PGCon 2020 will take place in Ottawa on May 26-29, 2020.
https://www.pgcon.org/2020/

PGDay.IT 2020 will take place June 11-12 in Bergamo, Italy.
https://2020.pgday.it/en/

Swiss PGDay 2020 will take place in Rapperswil (near Zurich) on June 18-19, 2020.
The Call for Speakers is open through March 17, 2020.
https://www.pgday.ch/2020/

PostgresLondon 2020 will be July 7-8, 2020 with an optional training day on
July 6. The CfP is open at https://forms.gle/5m8ybUt9YDZG4gVU7 through March
27, 2020.
http://postgreslondon.org

PG Day Russia will be in Saint Petersburg on July 10, 2020. The CfP is open at
https://pgday.ru/en/2020/for-speakers through April 6, 2020.
https://pgday.ru/en/2020/

FOSS4G 2020, will take place in Calgary, Alberta, Canada August 24-29 2020.
the Call for Papers is currently open at https://2020.foss4g.org/speakers/
https://2020.foss4g.org/

Austrian pgDay will take place September 18, 2020 at Schloss Schoenbrunn
(Apothekertrakt) in Vienna. The CfP is open until April 19, 2020 at
https://pgday.at/en/talk-commitee/
https://pgday.at/en/

== PostgreSQL in the News ==

Planet PostgreSQL: http://planet.postgresql.org/

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm PST8PDT to david(at)fetter(dot)org(dot)

== Applied Patches ==

Michaël Paquier pushed:

- Add prefix checks in exclude lists for pg_rewind, pg_checksums and base
backups. An instance of PostgreSQL crashing with a bad timing could leave
behind temporary pg_internal.init files, potentially causing failures when
verifying checksums. As the same exclusion lists are used between pg_rewind,
pg_checksums and basebackup.c, all those tools are extended with prefix checks
to keep everything in sync, with dedicated checks added for pg_internal.init.
Backpatch down to 11, where pg_checksums (pg_verify_checksums in 11) and
checksum verification for base backups have been introduced. Reported-by:
Michael Banck Author: Michael Paquier Reviewed-by: Kyotaro Horiguchi, David
Steele Discussion:
https://postgr.es/m/62031974fd8e941dd8351fbc8c7eff60d59c5338.camel@credativ.de
Backpatch-through: 11
https://git.postgresql.org/pg/commitdiff/bf883b211eae18662f2dfaede02f5d115bf0b805

- Issue properly WAL record for CID of first catalog tuple in multi-insert.
Multi-insert for heap is not yet used actively for catalogs, but the code to
support this case is in place for logical decoding. The existing code forgot
to issue a XLOG_HEAP2_NEW_CID record for the first tuple inserted, leading to
failures when attempting to use multiple inserts for catalogs at decoding
time. This commit fixes the problem by WAL-logging the needed CID. This is
not an active bug, so no back-patch is done. Author: Daniel Gustafsson
Discussion: https://postgr.es/m/E0D4CC67-A1CF-4DF4-991D-B3AC2EB5FAE9@yesql.se
https://git.postgresql.org/pg/commitdiff/7d672b76bf27327dc3527dabcd8be4e2dedf430f

- Fix build failure on header generation with repetitive builds of MSVC.
GenerateConfigHeader() in Solution.pm was complaining about unused define
symbols even if a newer config header was not generated, causing successive
build attempts with MSVC to fail. Oversight in commit 8f4fb4c. Author:
Kyotaro Horiguchi Reviewed-by: Juan José Santamaría Flecha Discussion:
https://postgr.es/m/20200218.160500.44393633318853097.horikyota.ntt@gmail.com
https://git.postgresql.org/pg/commitdiff/59f9cd9dd5e4db8c59c57a17388c17564a3211a3

- createdb: Fix quoting of --encoding, --lc-ctype and --lc-collate. The original
coding failed to properly quote those arguments, leading to failures when
using quotes in the values used. As the quoting can be encoding-sensitive,
the connection to the backend needs to be taken before applying the correct
quoting. Author: Michael Paquier Reviewed-by: Daniel Gustafsson Discussion:
https://postgr.es/m/20200214041004.GB1998@paquier.xyz Backpatch-through: 9.5
https://git.postgresql.org/pg/commitdiff/008cf040962c98c7c55d54c28dcb43c3c1d83c92

- Skip foreign tablespaces when running pg_checksums/pg_verify_checksums.
Attempting to use pg_checksums (pg_verify_checksums in 11) on a data folder
which includes tablespace paths used across multiple major versions would
cause pg_checksums to scan all directories present in pg_tblspc, and not only
marked with TABLESPACE_VERSION_DIRECTORY. This could lead to failures when
for example running sanity checks on an upgraded instance with --check. Even
worse, it was possible to rewrite on-disk pages with --enable for a cluster
potentially online. This commit makes pg_checksums skip any directories not
named TABLESPACE_VERSION_DIRECTORY, similarly to what is done for base
backups. Reported-by: Michael Banck Author: Michael Banck, Bernd Helmle
Discussion:
https://postgr.es/m/62031974fd8e941dd8351fbc8c7eff60d59c5338.camel@credativ.de
backpatch-through: 11
https://git.postgresql.org/pg/commitdiff/428a2609ef64b69d709418a50e192ff11a2643f1

- Remove TAP test for createdb --lc-ctype. OpenBSD falls back to "C" when using
an incorrect input with setlocale() and LC_CTYPE, causing this test,
introduced by 008cf04, to fail. This removes the culprit test to avoid the
portability issue. Per report from Robert Haas, via buildfarm member
curculio. Discussion:
https://postgr.es/m/CA+TgmoZ6ddh3mHD9gU8DvNYoFmuJaYYn1+4AvZNp25vTdRwCAQ@mail.gmail.com
Backpatch-through: 11
https://git.postgresql.org/pg/commitdiff/c4b0edb07ed53063ea4c86cd7918ad6ea01d8979

Robert Haas pushed:

- Move bitmap_hash and bitmap_match to bitmapset.c. The closely-related function
bms_hash_value is already defined in that file, and this change means that
hashfn.c no longer needs to depend on nodes/bitmapset.h. That gets us closer
to allowing use of the hash functions in hashfn.c in frontend code. Patch by
me, reviewed by Suraj Kharage and Mark Dilger. Discussion:
http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
https://git.postgresql.org/pg/commitdiff/07b95c3d8334f737d4717c91967729f7721e785c

- Put all the prototypes for hashfn.c into the same header file. Previously,
some of the prototypes for functions in hashfn.c were in utils/hashutils.h and
others were in utils/hsearch.h, but that is confusing and has no particular
benefit. Patch by me, reviewed by Suraj Kharage and Mark Dilger. Discussion:
http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
https://git.postgresql.org/pg/commitdiff/9341c783cc42ffae5860c86bdc713bd47d734ffd

- Adapt hashfn.c and hashutils.h for frontend use. hash_any() and its various
variants are defined to return Datum, which is a backend-only concept, but the
underlying functions actually want to return uint32 and uint64, and only
return Datum because it's convenient for callers who are using them to
implement a hash function for some SQL datatype. However, changing these
functions to return uint32 and uint64 seems like it might lead to programming
errors or back-patching difficulties, both because they are widely used and
because failure to use UInt{32,64}GetDatum() might not provoke a compilation
error. Instead, rename the existing functions as well as changing the return
type, and add static inline wrappers for those callers that need the previous
behavior. Although this commit adapts hashutils.h and hashfn.c so that they
can be compiled as frontend code, it does not actually do anything that would
cause them to be so compiled. That is left for another commit. Patch by me,
reviewed by Suraj Kharage and Mark Dilger. Discussion:
http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
https://git.postgresql.org/pg/commitdiff/a91e2fa94180f24dd68fb6c99136cda820e02089

- Move src/backend/utils/hash/hashfn.c to src/common. This also involves
renaming src/include/utils/hashutils.h, which becomes
src/include/common/hashfn.h. Perhaps an argument can be made for keeping the
hashutils.h name, but it seemed more consistent to make it match the name of
the file, and also more descriptive of what is actually going on here. Patch
by me, reviewed by Suraj Kharage and Mark Dilger. Off-list advice on how not
to break the Windows build from Davinder Singh and Amit Kapila. Discussion:
http://postgr.es/m/CA+TgmoaRiG4TXND8QuM6JXFRkM_1wL2ZNhzaUKsuec9-4yrkgw@mail.gmail.com
https://git.postgresql.org/pg/commitdiff/05d8449e73694585b59f8b03aaa087f04cc4679a

Peter Eisentraut pushed:

- Change client-side fsync_fname() to report errors fatally. Given all we have
learned about fsync() error handling in the last few years, reporting an
fsync() error non-fatally is not useful, unless you don't care much about the
file, in which case you probably don't need to use fsync() in the first place.
Change fsync_fname() and durable_rename() to exit(1) on fsync() errors other
than those that we specifically chose to ignore. This affects initdb,
pg_basebackup, pg_checksums, pg_dump, pg_dumpall, and pg_rewind. Reviewed-by:
Michael Paquier <michael(at)paquier(dot)xyz> Discussion:
https://www.postgresql.org/message-id/flat/d239d1bd-aef0-ca7c-dc0a-da14bdcf0392%402ndquadrant.com
https://git.postgresql.org/pg/commitdiff/1420617b14e2e3722367b826986a50ea33ff62ec

- Add PostgreSQL home page to --help output. Per emerging standard in GNU
programs and elsewhere. Autoconf already has support for specifying a home
page, so we can just that. Reviewed-by: Daniel Gustafsson <daniel(at)yesql(dot)se>
Discussion:
https://www.postgresql.org/message-id/flat/8d389c5f-7fb5-8e48-9a4a-68cec44786fa%402ndquadrant.com
https://git.postgresql.org/pg/commitdiff/1933ae629e7b706c6c23673a381e778819db307d

- Refer to bug report address by symbol rather than hardcoding. Use the
PACKAGE_BUGREPORT macro that is created by Autoconf for referring to the bug
reporting address rather than hardcoding it everywhere. This makes it easier
to change the address and it reduces translation work. Reviewed-by: Daniel
Gustafsson <daniel(at)yesql(dot)se> Discussion:
https://www.postgresql.org/message-id/flat/8d389c5f-7fb5-8e48-9a4a-68cec44786fa%402ndquadrant.com
https://git.postgresql.org/pg/commitdiff/864934131ef72dc3a403ad1375a94543fcc04206

Tom Lane pushed:

- Account explicitly for long-lived FDs that are allocated outside fd.c. The
comments in fd.c have long claimed that all file allocations should go through
that module, but in reality that's not always practical. fd.c doesn't supply
APIs for invoking some FD-producing syscalls like pipe() or epoll_create();
and the APIs it does supply for non-virtual FDs are mostly insistent on
releasing those FDs at transaction end; and in some cases the actual open()
call is in code that can't be made to use fd.c, such as libpq. This has led
to a situation where, in a modern server, there are likely to be seven or so
long-lived FDs per backend process that are not known to fd.c. Since
NUM_RESERVED_FDS is only 10, that meant we had *very* few spare FDs if
max_files_per_process is >= the system ulimit and fd.c had opened all the
files it thought it safely could. The contrib/postgres_fdw regression test,
in particular, could easily be made to fall over by running it under a
restrictive ulimit. To improve matters, invent functions
Acquire/Reserve/ReleaseExternalFD that allow outside callers to tell fd.c that
they have or want to allocate a FD that's not directly managed by fd.c. Add
calls to track all the fixed FDs in a standard backend session, so that we are
honestly guaranteeing that NUM_RESERVED_FDS FDs remain unused below the EMFILE
limit in a backend's idle state. The coding rules for these functions say
that there's no need to call them in code that just allocates one FD over a
fairly short interval; we can dip into NUM_RESERVED_FDS for such cases. That
means that there aren't all that many places where we need to worry. But
postgres_fdw and dblink must use this facility to account for long-lived FDs
consumed by libpq connections. There may be other places where it's worth
doing such accounting, too, but this seems like enough to solve the immediate
problem. Internally to fd.c, "external" FDs are limited to max_safe_fds/3
FDs. (Callers can choose to ignore this limit, but of course it's unwise to do
so except for fixed file allocations.) I also reduced the limit on
"allocated" files to max_safe_fds/3 FDs (it had been max_safe_fds/2).
Conceivably a smarter rule could be used here --- but in practice, on
reasonable systems, max_safe_fds should be large enough that this isn't much
of an issue, so KISS for now. To avoid possible regression in the number of
external or allocated files that can be opened, increase FD_MINFREE and the
lower limit on max_files_per_process a little bit; we now insist that the
effective "ulimit -n" be at least 64. This seems like pretty clearly a bug
fix, but in view of the lack of field complaints, I'll refrain from risking a
back-patch. Discussion:
https://postgr.es/m/E1izCmM-0005pV-Co@gemulon.postgresql.org
https://git.postgresql.org/pg/commitdiff/3d475515a15f70a4a3f36fbbba93db6877ff8346

- Fix compile failure. I forgot that some compilers won't handle #if constructs
within ereport() calls. Duplicating most of the call is annoying but simple.
Per buildfarm.
https://git.postgresql.org/pg/commitdiff/36390713a60f446da7e7ae758771c9104fa89394

- Suppress unnecessary RelabelType nodes in more cases. eval_const_expressions
sometimes produced RelabelType nodes that were useless because they just
relabeled an expression to the same exposed type it already had. This is
worth avoiding because it can cause two equivalent expressions to not be
equal(), preventing recognition of useful optimizations. In the test case
added here, an unpatched planner fails to notice that the "sqli = constant"
clause renders a sort step unnecessary, because one code path produces an
extra RelabelType and another doesn't. Fix by ensuring that
eval_const_expressions_mutator's T_RelabelType case will not add in an
unnecessary RelabelType. Also save some code by sharing a subroutine with the
effectively-equivalent cases for CollateExpr and CoerceToDomain. (CollateExpr
had no bug, and I think that the case couldn't arise with CoerceToDomain, but
it seems prudent to do the same check for all three cases.) Back-patch to
v12. In principle this has been wrong all along, but I haven't seen a case
where it causes visible misbehavior before v12, so refrain from changing
stable branches unnecessarily. Per investigation of a report from Eric
Gillum. Discussion:
https://postgr.es/m/CAMmjdmvAZsUEskHYj=KT9sTukVVCiCSoe_PBKOXsncFeAUDPCQ@mail.gmail.com
https://git.postgresql.org/pg/commitdiff/a477bfc1dfb8d2b7125a9818edcbf383bf82d62a

- Doc: correct thinko in pg_buffercache documentation. Access to this module is
granted to the pg_monitor role, not pg_read_all_stats. (Given the view's
performance impact, it seems wise to be restrictive, so I think this was the
correct decision --- and anyway it was clearly intentional.) Per bug #16279
from Philip Semanchuk. Discussion:
https://postgr.es/m/16279-fcaac33c68aab0ab@postgresql.org
https://git.postgresql.org/pg/commitdiff/963ea55c34cc325243fd30468ce8c629572f0c73

- Avoid failure if autovacuum tries to access a just-dropped temp namespace.
Such an access became possible when commit 246a6c8f7 added more aggressive
cleanup of orphaned temp relations by autovacuum. Since autovacuum's snapshot
might be slightly stale, it could attempt to access an already-dropped temp
namespace, resulting in an assertion failure or null-pointer dereference. (In
practice, since we don't drop temp namespaces automatically but merely recycle
them, this situation could only arise if a superuser does a manual drop of a
temp namespace. Still, that should be allowed.) The core of the bug, IMO, is
that isTempNamespaceInUse and its callers failed to think hard about whether
to treat "temp namespace isn't there" differently from "temp namespace isn't
in use". In hopes of forestalling future mistakes of the same ilk, replace
that function with a new one checkTempNamespaceStatus, which makes the same
tests but returns a three-way enum rather than just a bool.
isTempNamespaceInUse is gone entirely in HEAD; but just in case some external
code is relying on it, keep it in the back branches, as a bug-compatible
wrapper around the new function. Per report originally from Prabhat Kumar
Sahu, investigated by Mahendra Singh and Michael Paquier; the final form of
the patch is my fault. This replaces the failed fix attempt in a052f6cbb.
Backpatch as far as v11, as 246a6c8f7 was. Discussion:
https://postgr.es/m/CAKYtNAr9Zq=1-ww4etHo-VCC-k120YxZy5OS01VkaLPaDbv2tg@mail.gmail.com
https://git.postgresql.org/pg/commitdiff/80d76be51cfb7f8f54e779ec5a287d80ac28b6e4

- Remove obsolete comment. Noted while studying subplan hash issue.
https://git.postgresql.org/pg/commitdiff/6afc8aefd3b95c0c4e7d07d2c99b90ce83e313de

- Correctly re-use hash tables in buildSubPlanHash(). Commit 356687bd8 omitted
to remove leftover code for destroying a hashed subplan's hash tables, with
the result that the tables were always rebuilt not reused; this leads to
severe memory leakage if a hashed subplan is re-executed enough times.
Moreover, the code for reusing the hashnulls table had a typo that would have
made it do the wrong thing if it were reached. Looking at the code coverage
report shows severe under-coverage of the potential callers of
ResetTupleHashTable, so add some test cases that exercise them. Andreas
Karlsson and Tom Lane, per reports from Ranier Vilela and Justin Pryzby.
Backpatch to v11, as the faulty commit was. Discussion:
https://postgr.es/m/edb62547-c453-c35b-3ed6-a069e4d6b937@proxel.se Discussion:
https://postgr.es/m/CAEudQAo=DCebm1RXtig9OH+QivpS97sMkikt0A9qHmMUs+g6ZA@mail.gmail.com
Discussion: https://postgr.es/m/20200210032547.GA1412@telsasoft.com
https://git.postgresql.org/pg/commitdiff/58c47ccfff20b8c125903482725c1dbfd30beade

Andres Freund pushed:

- expression eval: Reduce number of steps for agg transition invocations. Do so
by combining the various steps that are part of aggregate transition function
invocation into one larger step. As some of the current steps are only
necessary for some aggregates, have one variant of the aggregate transition
step for each possible combination. To avoid further manual copies of code in
the different transition step implementations, move most of the code into
helper functions marked as "always inline". The benefit of this change is an
increase in performance when aggregating lots of rows. This comes in part due
to the reduced number of indirect jumps due to the reduced number of steps,
and in part by reducing redundant setup code across steps. This mainly
benefits interpreted execution, but the code generated by JIT is also improved
a bit. As a nice side-effect it also ends up making the code a bit simpler.
A small additional optimization is removing the need to set
aggstate->curaggcontext before calling ExecAggInitGroup, choosing to instead
passign curaggcontext as an argument. It was, in contrast to other aggregate
related functions, only needed to fetch a memory context to copy the
transition value into. Author: Andres Freund Discussion:
https://postgr.es/m/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de
https://postgr.es/m/5c371df7cee903e8cd4c685f90c6c72086d3a2dc.camel@j-davis.com
https://git.postgresql.org/pg/commitdiff/2742c45080077ed3b08b810bb96341499b86d530

Magnus Hagander pushed:

- Include error code in message from pg_upgrade. In passing, also quote the
filename in one message where it wasn't. Author: Dagfinn Ilmari Mannsåker
Discussion: https://postgr.es/m/87pne2w98h.fsf@wibble.ilmari.org
https://git.postgresql.org/pg/commitdiff/4109bb5de4998b9301ea2ac18c9d6dfb0b4f900b

Peter Geoghegan pushed:

- Add equalimage B-Tree support functions. Invent the concept of a B-Tree
equalimage ("equality implies image equality") support function, registered as
support function 4. This indicates whether it is safe (or not safe) to apply
optimizations that assume that any two datums considered equal by an operator
class's order method must be interchangeable without any loss of semantic
information. This is static information about an operator class and a
collation. Register an equalimage routine for almost all of the existing
B-Tree opclasses. We only need two trivial routines for all of the opclasses
that are included with the core distribution. There is one routine for
opclasses that index non-collatable types (which returns 'true'
unconditionally), plus another routine for collatable types (which returns
'true' when the collation is a deterministic collation). This patch is
infrastructure for an upcoming patch that adds B-Tree deduplication. Author:
Peter Geoghegan, Anastasia Lubennikova Discussion:
https://postgr.es/m/CAH2-Wzn3Ee49Gmxb7V1VJ3-AC8fWn-Fr8pfWQebHe8rYRxt5OQ@mail.gmail.com
https://git.postgresql.org/pg/commitdiff/612a1ab76724aa1514b6509269342649f8cab375

- Add deduplication to nbtree. Deduplication reduces the storage overhead of
duplicates in indexes that use the standard nbtree index access method. The
deduplication process is applied lazily, after the point where opportunistic
deletion of LP_DEAD-marked index tuples occurs. Deduplication is only applied
at the point where a leaf page split would otherwise be required. New posting
list tuples are formed by merging together existing duplicate tuples. The
physical representation of the items on an nbtree leaf page is made more space
efficient by deduplication, but the logical contents of the page are not
changed. Even unique indexes make use of deduplication as a way of
controlling bloat from duplicates whose TIDs point to different versions of
the same logical table row. The lazy approach taken by nbtree has significant
advantages over a GIN style eager approach. Most individual inserts of index
tuples have exactly the same overhead as before. The extra overhead of
deduplication is amortized across insertions, just like the overhead of page
splits. The key space of indexes works in the same way as it has since commit
dd299df8 (the commit that made heap TID a tiebreaker column). Testing has
shown that nbtree deduplication can generally make indexes with about 10 or 15
tuples for each distinct key value about 2.5X - 4X smaller, even with single
column integer indexes (e.g., an index on a referencing column that
accompanies a foreign key). The final size of single column nbtree indexes
comes close to the final size of a similar contrib/btree_gin index, at least
in cases where GIN's posting list compression isn't very effective. This can
significantly improve transaction throughput, and significantly reduce the
cost of vacuuming indexes. A new index storage parameter (deduplicate_items)
controls the use of deduplication. The default setting is 'on', so all new
B-Tree indexes automatically use deduplication where possible. This decision
will be reviewed at the end of the Postgres 13 beta period. There is a
regression of approximately 2% of transaction throughput with synthetic
workloads that consist of append-only inserts into a table with several
non-unique indexes, where all indexes have few or no repeated values. The
underlying issue is that cycles are wasted on unsuccessful attempts at
deduplicating items in non-unique indexes. There doesn't seem to be a way
around it short of disabling deduplication entirely. Note that deduplication
of items in unique indexes is fairly well targeted in general, which avoids
the problem there (we can use a special heuristic to trigger deduplication
passes in unique indexes, since we're specifically targeting "version bloat").
Bump XLOG_PAGE_MAGIC because xl_btree_vacuum changed. No bump in
BTREE_VERSION, since the representation of posting list tuples works in a way
that's backwards compatible with version 4 indexes (i.e. indexes built on
PostgreSQL 12). However, users must still REINDEX a pg_upgrade'd index to use
deduplication, regardless of the Postgres version they've upgraded from. This
is the only way to set the new nbtree metapage flag indicating that
deduplication is generally safe. Author: Anastasia Lubennikova, Peter
Geoghegan Reviewed-By: Peter Geoghegan, Heikki Linnakangas Discussion:
https://postgr.es/m/55E4051B.7020209@postgrespro.ru
https://postgr.es/m/4ab6e2db-bcee-f4cf-0916-3a06e6ccbb55@postgrespro.ru
https://git.postgresql.org/pg/commitdiff/0d861bbb702f8aa05c2a4e3f1650e7e8df8c8c27

- Silence compiler warning in nbtinsert.c. Per buildfarm member longfin.
https://git.postgresql.org/pg/commitdiff/2d8a6fad18fcec1850b79ddcf3a0032058d86715

- Silence another compiler warning in nbtinsert.c. Per complaint from Álvaro
Herrera.
https://git.postgresql.org/pg/commitdiff/2c0797da2c26dd5574e0d9128887c1fc2a385c25

- Doc: Fix deduplicate_items index term. Reported-By: Fujii Masao Discussion:
https://postgr.es/m/18f07ae8-7d89-537c-b0a9-54100a1b46da@oss.nttdata.com
https://git.postgresql.org/pg/commitdiff/e537aed61db767745b614600be15cd19bb581330

- Teach pageinspect about nbtree deduplication. Add a new bt_metap() column to
display the metapage's allequalimage field. Also add three new columns to
contrib/pageinspect's bt_page_items() function: * Add a boolean column
("dead") that displays the LP_DEAD bit value for each non-pivot tuple. * Add
a TID column ("htid") that displays a single heap TID value for each tuple.
This is the TID that is returned by BTreeTupleGetHeapTID(), so comparable
values are shown for pivot tuples, plain non-pivot tuples, and posting list
tuples. * Add a TID array column ("tids") that displays TIDs from each
tuple's posting list, if any. This works just like the "tids" column from
pageinspect's gin_leafpage_items() function. No version bump for the
pageinspect extension, since there hasn't been a stable Postgres release since
the last version bump (the last bump was part of commit 58b4cb30). Author:
Peter Geoghegan Discussion:
https://postgr.es/m/CAH2-WzmSMmU2eNvY9+a4MNP+z02h6sa-uxZvN3un6jY02ZVBSw@mail.gmail.com
https://git.postgresql.org/pg/commitdiff/93ee38eade1b2b4964354b95b01b09e17d6f098d

- Doc: Fix pageinspect bt_page_items() example. Oversight in commit 93ee38ea.
https://git.postgresql.org/pg/commitdiff/dba915333967609cc7ca73f0904dfdc08db0a052

Álvaro Herrera pushed:

- Fix docs regarding AFTER triggers on partitioned tables. In commit
86f575948c77 I forgot to update the trigger.sgml paragraph that needs to
explain that AFTER triggers are allowed in partitioned tables. Do so now.
Discussion: https://postgr.es/m/20200224185850.GA30899@alvherre.pgsql
https://git.postgresql.org/pg/commitdiff/3acfe6b089a5283c2961eb2ff135106e23b89263

- Record parents of triggers. This let us get rid of a recently introduced ugly
hack (commit 1fa846f1c9af). Author: Álvaro Herrera Reviewed-by: Amit Langote,
Tom Lane Discussion: https://postgr.es/m/20200217215641.GA29784@alvherre.pgsql
https://git.postgresql.org/pg/commitdiff/b9b408c487244ef8e6d613d183c56eb2c62990b1

- Catversion bump for b9b408c48724. Per Tom Lane.
https://git.postgresql.org/pg/commitdiff/afb5465e0cfce7637066eaaaeecab30b0f23fbe3

- Add comments on avoid reuse of parse-time snapshot. Apparently, reusing the
parse-time query snapshot for later steps (execution) is a frequently
considered optimization ... but it doesn't work, for reasons discovered in
thread [1]. Adding some comments about why it doesn't really work can relieve
some future hackers from wasting time reimplementing it again. [1]
https://postgr.es/m/flat/5075D8DF(dot)6050500(at)fuzzy(dot)cz Author: Michail Nikolaev
Discussion:
https://postgr.es/m/CANtu0ogp6cTvMJObXP8n=k+JtqxY1iT9UV5MbGCpjjPa5crCiw@mail.gmail.com
https://git.postgresql.org/pg/commitdiff/db989184cda7f4aa1ff764cca96142029e7e093b

Jeff Davis pushed:

- Save calculated transitionSpace in Agg node. This will be useful in the
upcoming Hash Aggregation work to improve estimates for hash table sizing.
Discussion:
https://postgr.es/m/37091115219dd522fd9ed67333ee8ed1b7e09443.camel%40j-davis.com
https://git.postgresql.org/pg/commitdiff/c11cb17dc55a6b432dc637491a4491920f9c2770

- Fix commit c11cb17d. I neglected to update copyfuncs/outfuncs/readfuncs.
Discussion: https://postgr.es/m/12491.1582833409%40sss.pgh.pa.us
https://git.postgresql.org/pg/commitdiff/32bb4535a0be892f4a2794a26ba429032f4ae51e

Dean Rasheed pushed:

- Fix corner-case loss of precision in numeric ln(). When deciding on the local
rscale to use for the Taylor series expansion, ln_var() neglected to account
for the fact that the result is subsequently multiplied by a factor of
2^(nsqrt+1), where nsqrt is the number of square root operations performed in
the range reduction step, which can be as high as 22 for very large inputs.
This could result in a loss of precision, particularly when combined with
large rscale values, for which a large number of Taylor series terms is
required (up to around 400). Fix by computing a few extra digits in the
Taylor series, based on the weight of the multiplicative factor
log10(2^(nsqrt+1)). It remains to be proven whether or not the other 8 extra
digits used for the Taylor series is appropriate, but this at least deals with
the obvious oversight of failing to account for the effects of the final
multiplication. Per report from Justin AnyhowStep. Reviewed by Tom Lane.
Discussion: https://postgr.es/m/16280-279f299d9c06e56f@postgresql.org
https://git.postgresql.org/pg/commitdiff/43a899f41f46918a0bf442edb091b08c214c68f8

== Pending Patches ==

Dagfinn Ilmari Mannsåker sent in another revision of a patch to fix parallel
make of libpq.

Peter Eisentraut sent in a patch to remove win32ver.rc from version_stamp.pl.

Peter Eisentraut sent in a patch to make the bitmap of updated columns available
to a trigger in TriggerData.

Andy Fan sent in another revision of a patch to erase the distinctClause if the
result is unique by definition.

Tom Lane sent in another revision of a patch to account honestly for external FD
usage.

Alexander Korotkov sent in another revision of a patch to improve checking for
missing parent links.

Tom Lane sent in another revision of a patch to implement unicode escapes with
any backend encoding.

Peter Eisentraut sent in another revision of a patch to propagate ALTER TABLE
... SET STORAGE to indexes.

Juan José Santamaría Flecha sent in another revision of a patch to fix command
line colorization on windows.

Thomas Munro sent in a patch to use FeBeWaitSet for GSSAPI and SSL waits.

Antonin Houska sent in a patch to move two variables in RelationCopyStorage() to
a narrower scope.

Antonin Houska sent in a patch to change an Assertion form "not persistent" to
"is ephemeral".

Masahiko Sawada and Cary Huang traded patches to implement an internal key
management system.

Rémi Lapeyre sent in another revision of a patch to enable COPY "test" to output
a header and add header matching mode to COPY FROM.

Zeng Wenjing sent in four more revisions of a patch to implement global
temporary tables.

David Steele sent in a patch to skip pg_internal.init.

Konstantin Knizhnik sent in a patch to port vectorize_engine to zedstore.

Michael Banck sent in a patch to fix checksum verification in base backups for
random or zero page headers.

Tom Lane and Andrew Dunstan traded patches to help transition PL/PythonU from 2
to 3.

Imai Yoshikazu sent in another revision of a patch to add a pg_stat_waitaccum
view and change the measuring method of wait event time from INSTR_TIME to
rdtsc.

Daniel Gustafsson sent in two more revisions of a patch to use
heap_multi_insert() for pg_attribute/depend insertions.

Alexander Korotkov sent in six more revisions of a patch to pg_rewind to add
options to restore WAL files from archives.

Justin Pryzby sent in two more revisions of a patch to make HashAggregate report
bucket and memory stats.

Fujii Masao sent in another revision of a patch to see to it that Wait event is
reported while waiting for WAL archiving to finish.

Fujii Masao sent in another revision of a patch to implement a
pg_stat_progress_basebackup view.

Dagfinn Ilmari Mannsåker sent in three revisions of a patch to make pg_upgrade's
version file parsing error message consistent.

Julien Rouhaud sent in two more revisions of a patch to implement collation
versioning.

David Cramer sent in three more revisions of a patch to throw an error on failed
COMMIT.

Yugo Nagata sent in a patch to the auto_explain contrib extension which makes it
possible to log plans before the queries are executed.

Yuzuko Hosoya sent in another revision of a patch to fix an infelicity between
autovacuum and partitioned tables.

Masahiko Sawada sent in another revision of a patch to identify user-created
objects.

John Naylor and David Fetter traded patches to truncate timestamps on arbitrary
intervals.

David Fetter sent in two more revisions of a patch to use compiler intrinsics
for bit operations in hashing.

Masahiko Sawada sent in two more revisions of a patch to improve wait events of
recovery conflict resolution.

Pavel Stěhule sent in three more revisions of a patch to implement schema
variables.

Thomas Munro sent in another revision of a patch to prevent WaitEventSet syscall
churn.

Jeff Davis sent in another revision of a patch to implement memory-bounded hash
aggregation.

Robert Haas sent in another revision of a patch to implement backup manifests.

Nikita Glukhov sent in another revision of a patch to implement JSONPATH syntax
extensions.

Álvaro Herrera sent in a patch to implement BEFORE ROW triggers for partitioned
tables.

Jeff Janes sent in a patch to psql to enable tab completion for ALTER TEXT
SEARCH DICTIONARY.

Jeff Janes sent in a patch to add an absolute value option to the dict_int
function.

Justin Pryzby sent in another revision of a patch to include the block number in
the error context for vacuum.

Kyotaro HORIGUCHI sent in two revisions of a patch to add a TAP test for a crash
bug, and fix said crash bug in targeted promotion.

Takuma Hoshiai sent in another revision of a patch to implement incremental
maintenance of materialized views.

Jeff Davis sent in a patch to add a function that makes it possible to add new
tapes dynamically.

Peter Eisentraut sent in a patch to improve handling of parameter differences in
physical replication.

Tom Lane sent in two revisions of a patch to ensure less-silly selectivity for
JSONB matching operators.

Tom Lane sent in another revision of a patch to revisit opclass member checks
and dependency strength.

Kyotaro HORIGUCHI sent in another revision of a patch to allow wait event sets
to be registered to resource owners, add some infrastructure for asynchronous
execution, and use same for the PostgreSQL FDW.

Justin Pryzby sent in another revision of a patch to emit a more specific error
message when failing to alter a partitioned index, and allow reloptions on
partitioned tables and indexes.

Kyotaro HORIGUCHI sent in a patch to make end-of-recovery errors less scary.

Masahiko Sawada sent in another revision of a patch to add a RESUME option to
VACUUM and autovacuum.

Richard Guo sent in a PoC patch to pull up EXPR_SUBLINK.

Peter Eisentraut sent in a patch to remove HAVE_WORKING_LINK.

Juan José Santamaría Flecha sent in three more revisions of a patch to support
large files on Win32.

Álvaro Herrera sent in another revision of a patch to migrate commandTag from
string to enum.

Tomáš Vondra sent in two revisions of a patch to enable ALTER TYPE to change
storage strategy.

Noah Misch sent in another revision of a patch to fix the WAL-skipping feature.

Nikita Glukhov sent in a patch to introduce opclass parameters and use same.

Dilip Kumar sent in another revision of a patch to remove an infelicity between
logical_work_mem and logical streaming of large in-progress transactions.

David Zhang sent in another revision of a patch to fix psql to report IO errors
on writing query results.

Aleksey Kondratov and Justin Pryzby traded patches to Allow CLUSTER, VACUUM FULL
and REINDEX to change tablespace on the fly.

David Cramer sent in another revision of a patch to implement binary support for
the pgoutput plugin.

Justin Pryzby sent in another revision of a patch to fix the fact that ALTER tbl
rewrite loses CLUSTER ON index.

Tomáš Vondra sent in another revision of a patch to implement SLRU statistics.

Andrey Borodin sent in two more revisions of a patch to add sort support for
point gist_point_sortsupport, implement GiST build using sort support, and add a
function relopt for gist build.

Chis Bandy sent in a patch to add schema and table names to partition error.

Andreas Karlsson sent in another revision of a patch to fix the process of
resetting of subplan hash tables.

Fabien COELHO sent in another revision of a patch to add some barriers to
pgbench.

Ivan Panchenko sent in two revisions of a patch to make a transform which
transforms bool for PL/Perl[U].

Tom Lane sent in two more revisions of a patch to allow localized month names in
to_date and to_timestamp.

KaiGai Kohei sent in another revision of a patch to enable TRUNCATE on foreign
tables.

Mike Palmiotto sent in another revision of a patch to coalesce all of the
postmaster subprocess startups into one central framework, and use same to add a
hook to allow extensions to set worker metadata.

Andres Freund sent in a patch to make connections scale better by looking at
PGXACT->xmin a lot less frequently.

Amit Langote sent in another revision of a patch to fix partitionwise join to
handle FULL JOINs correctly.

Pavel Stěhule sent in a patch to add a reglanguage type.

Floris Van Nee sent in another revision of a patch to avoid a pipeline stall in
_bt_compare(), and inline _bt_compare().

Browse pgsql-announce by date

	From	Date	Subject
Next Message	Andrey M. Borodin	2020-03-04 09:14:29	Odyssey 1.1
Previous Message	David Christensen	2020-02-28 22:40:25	Bucardo 5.6.0 released!