Re: PG 12 draft release notes

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Emre Hasegeli <emre(at)hasegeli(dot)com>, Tomas Vondra <tv(at)fuzzy(dot)cz>, Peter Geoghegan <pg(at)bowt(dot)ie>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Surafel Temesgen <surafel3000(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: PG 12 draft release notes
Date:	2019-05-21 19:47:34
Message-ID:	20190521194734.tllkyg4akjj4txbb@momjian.us
Lists:	pgsql-hackers

On Mon, May 20, 2019 at 03:17:19PM -0700, Andres Freund wrote:
> Hi,
>
> Note that I've added a few questions to individuals involved with
> specific points. If you're in the To: list, please search for your name.
>
>
> On 2019-05-11 16:33:24 -0400, Bruce Momjian wrote:
> > I have posted a draft copy of the PG 12 release notes here:
> >
> > http://momjian.us/pgsql_docs/release-12.html
> > They are committed to git.
>
> Thanks!
>
> <title>Migration to Version 12</title>
>
> There's a number of features in the compat section that are more general
> improvements with a side of incompatibility. Won't it be confusing to
> e.g. have the ryu floating point conversion speedups in the compat
> section, but not in the "General Performance" section?

Yes, it can be. What I did with the btree item was to split out the max
length change from the larger changes. We can do the same for other
items. As you rightly stated, it is for cases where the incompatibility
is minor compared to the change. Do you have a list of the ones that
need this treatment?

> <para>
> Remove the special behavior of <link
> linkend="datatype-oid">OID</link> columns (Andres Freund,
> John Naylor)
> </para>
>
> Should we mention that tables with OIDs have to have their oids removed
> before they can be upgraded?

Uh, is that true? pg_upgrade? pg_dump?

> <para>
> Refactor <link linkend="functions-geometry">geometric
> functions</link> and operators (Emre Hasegeli)
> </para>
>
> <para>
> This could lead to more accurate, but slightly different, results
> from previous releases.
> </para>
> </listitem>
> <listitem>
> 
>
> <para>
> Restructure <link linkend="datatype-geometric">geometric
> types</link> to handle NaN, underflow, overflow and division by
> zero more consistently (Emre Hasegeli)
> </para>
> </listitem>
>
> <listitem>
> 
>
> <para>
> Improve behavior and error reporting for the <link
> linkend="datatype-geometric">line data type</link> (Emre Hasegeli)
> </para>
> </listitem>
>
> Is that sufficient explanation? Feels like we need to expand a bit
> more. In particular, is it possible that a subset of the changes here
> require reindexing?
>
> Also, aren't three different entries a bit too much?

The 'line' item related to more errors than just the ones listed for the
geometric data types, so I was not clear how to do that as a single
entry. I think there is a much larger compatibility breakage
possibility with 'line'.

> <listitem>
> 
>
> <para>
> Improve speed of btree index insertions (Peter Geoghegan,
> Alexander Korotkov)
> </para>
>
> <para>
> The new code improves the space-efficiency of page splits,
> reduces locking overhead, and gives better performance for
> <command>UPDATE</command>s and <command>DELETE</command>s on
> indexes with many duplicates.
> </para>
> </listitem>
>
> <listitem>
> 
>
> <para>
> Have new btree indexes sort duplicate index entries in heap-storage
> order (Peter Geoghegan, Heikki Linnakangas)
> </para>
>
> <para>
> Indexes <application>pg_upgraded</application> from previous
> releases will not have this ordering.
> </para>
> </listitem>
>
> I'm not sure that the grouping here is quite right. And the second entry
> probably should have some explanation about the benefits?

Agreed.

> <listitem>
> 
>
> <para>
> Reduce locking requirements for index renaming (Peter Eisentraut)
> </para>
> </listitem>
>
> Should we specify the newly required lock level? Because it's quite
> relevant for users what exactly they're now able to do concurrently in
> operation?

Sure.

> <listitem>
> 
>
> <para>
> Add support for <link linkend="sql-createfunction">function
> selectivity</link> (Tom Lane)
> </para>
> </listitem>
>
> Hm, that message doesn't seem like an accurate description of that
> commit (if anything it's a391ff3c?). Given that it all requires C
> hackery, perhaps we ought to move it to the source code section? And
> isn't the most important part of this set of changes
>
> commit 74dfe58a5927b22c744b29534e67bfdd203ac028
> Author: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> Date: 2019-02-11 21:26:08 -0500
>
> Allow extensions to generate lossy index conditions.

Uh, I missed that as an important item. Can someone give me some text?

> <listitem>
> 
>
> <para>
> Greatly reduce memory consumption of <xref linkend="sql-copy"/>
> and function calls (Andres Freund, Tomas Vondra, Tom Lane)
> </para>
> </listitem>
>
> Grouping these three changes together makes no sense to me.
>
> I think the first commit just ought not to be mentioned separately, it's
> just a fix for a memory leak in 31f3817402, essentially a 12 only bugfix?

Oh, I was not aware of that.

> The second commit is about position() etc, which seems not to match that
> description either?

Ugh.

> The third is probably more appropriate to be in the source code
> section. While it does speed up function calls a bit (in particular
> plpgsql which is very function call heavy), it also is a breaking change
> for some external code? Not sure why Tom is listed with this entry?

The order of names is just a guess when multiple commits are merged ---
this needs help.

> <listitem>
> 
>
> <para>
> Improve search performance for multi-byte characters (Heikki
> Linnakangas)
> </para>
> </listitem>
>
> That's the second reference to the commit. I suspect this is much better
> separate, so I'd just remove it from above.

Done.

> <listitem>
> 
>
> <para>
> Allow <link linkend="storage-toast"><literal>TOAST</literal></link>
> values to be minimally decompressed (Paul Ramsey)
> </para>
>
> I'd s/minimal/partial/ - I don't think the code guarantees anything
> about it being minimal? And "minimally decompressed" also is somewhat
> confusing, because it sounds like it's about the compression quality
> rather than only decompressing part of the data.

It is confusing. Is "partially decompressed" better?

> <listitem>
> 
>
> <para>
> Prevent <xref linkend="sql-truncate"/> from requesting a lock on
> tables for which it lacks permission (Michaël Paquier)
> </para>
>
> <para>
> This prevents unauthorized locking delays.
> </para>
> </listitem>
>
> <listitem>
> 
>
> <para>
> Prevent <command>VACUUM</command> and <command>ANALYZE</command>
> from requesting a lock on tables for which it lacks permission
> (Michaël Paquier)
> </para>
>
> <para>
> This prevents unauthorized locking delays.
> </para>
> </listitem>
>
>
> I don't think this should be in the <title><acronym>Authentication</acronym></title>
> section.

I put it in that section since I thought the motivation was to prevent
people from locking up connecting to the database if someone has a
pending VACUUM/ANALYZE. No?

> Also perhaps, s/it/the user/, or "the caller"?

Agreed, "the user".

> <listitem>
> 
>
> <para>
> Reduce the default value of <xref
> linkend="guc-autovacuum-vacuum-cost-delay"/> to 2ms (Tom Lane)
> </para>
> </listitem>
>
> I think this needs to explain that this can increase autovacuum's IO
> throughput considerably.

Uh, well, do we normally document the effect of a change like this? It
will cause vacuum to be more aggressive, and increase I/O? Do we want to
re-educate on what this parameter does?

> <listitem>
> 
>
> <para>
> Allow <xref linkend="guc-vacuum-cost-delay"/> to specify
> sub-millisecond delays (Tom Lane)
> </para>
>
> <para>
> Floating-point values can also now be specified.
> </para>
> </listitem>
>
> And this should be merged with the previous entry?

Uh, I thought the change of default and its range were different enough
that combining them would add confusion.

> <listitem>
> 
>
> <para>
> Allow time-based server variables to use <link
> linkend="config-setting">micro-seconds</link> (us) (Tom Lane)
> </para>
> </listitem>
>
> <listitem>
> 
>
> <para>
> Allow fractional input for integer server variables (Tom Lane)
> </para>
>
> <para>
> For example, <command>SET work_mem = '30.1GB'</command>.
> </para>
> </listitem>
>
> <listitem>
> 
>
> <para>
> Allow units to be specified for floating-point server variables
> (Tom Lane)
> </para>
> </listitem>
>
> Can't we combine these? Seems excessively detailed in comparison to the
> rest of the entries.

See above. It seems confusing to combine them but please propose text
if you think it is possible.
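
For reference while wording these, here is a rough sketch of what the
three entries allow. The first statement is the example already in the
draft; the other values are arbitrary, and the results are not shown.

```sql
-- Fractional input for an integer server variable (example from the draft notes).
SET work_mem = '30.1GB';

-- Microsecond (us) units on a time-based server variable.
SET vacuum_cost_delay = '500us';

-- Units on a floating-point server variable (vacuum_cost_delay is
-- floating-point in PG 12).
SET vacuum_cost_delay = '1.5ms';
```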

> <listitem>
> 
>
> <para>
> Add an explicit value of <literal>current</literal> for <xref
> linkend="guc-recovery-target-time"/> (Peter Eisentraut)
> </para>
> </listitem>
>
> Seems like this should be combined with the earlier "Cause recovery to
> advance to the latest timeline by default" entry.

The odd part is that the old default was 'current' but there was no way
to specify current --- you just specified nothing. That seemed
confusing enough that having them combined would add confusion, but if
you have some suggested text?

> <listitem>
> 
>
> <para>
> Add support for <link linkend="sql-createtable">generated
> columns</link> (Peter Eisentraut)
> </para>
>
> <para>
> Rather than storing a value only at row creation time, generated
> columns are also modified during updates, and can reference other
> table columns.
> </para>
> </listitem>
>
> I find this description confusing. How about cribbing from the commit?
> Roughly like
>
> This allows creating columns that are computed from expressions,
> including references to other columns in the same table, rather than
> having to be specified by the inserter/updater.
>
> Think we also ought to mention that this is only stored generated
> columns, given that the SQL feature also includes virtual columns?

OK, new text is:

The content of generated columns is computed from expressions
(including references to other columns in the same table)
rather than being specified by <command>INSERT</command> or
<command>UPDATE</command> commands.
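
A minimal sketch of the behavior described by that text, assuming a
hypothetical table "people" (the table and column names are only for
illustration). Only STORED generated columns exist in PG 12:

```sql
-- height_in is computed from height_cm; it cannot be written directly.
CREATE TABLE people (
    name      text,
    height_cm numeric,
    height_in numeric GENERATED ALWAYS AS (height_cm / 2.54) STORED
);

-- The generated column is omitted from INSERT; its value is computed.
INSERT INTO people (name, height_cm) VALUES ('example', 180);

-- Updating the referenced column recomputes the generated column.
UPDATE people SET height_cm = 175 WHERE name = 'example';
```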
>
> <listitem>
> 
>
> <para>
> Add <xref linkend="sql-vacuum"/> and <command>CREATE
> TABLE</command> options to prevent <command>VACUUM</command>
> from truncating trailing empty pages (Tsunakawa Takayuki)
> </para>
>
> <para>
> The options are <varname>vacuum_truncate</varname> and
> <varname>toast.vacuum_truncate</varname>. This reduces vacuum
> locking requirements.
> </para>
> </listitem>
>
> Maybe add something like: "This can be helpful to avoid query
> cancellations on standby that are not avoided by hot_standby_feedback."?

So you turn off truncate on the primary because the replay of the
truncate on the standby might cause a cancellation? I was not aware that
was a common problem.
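
For context, a sketch of how the options in this entry are spelled,
assuming a hypothetical existing table "my_table" with a TOAST table:

```sql
-- Storage parameters: disable truncation of trailing empty pages for the
-- table and for its TOAST table.
ALTER TABLE my_table SET (vacuum_truncate = false,
                          toast.vacuum_truncate = false);

-- Per-command form: skip truncation for this VACUUM run only.
VACUUM (TRUNCATE false) my_table;
```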

> <listitem>
> 
>
> <para>
> Allow vacuum to avoid index cleanup with the
> <literal>INDEX_CLEANUP</literal> option (Masahiko Sawada)
> </para>
> </listitem>
>
> I think we ought to expand a bit more on why one would do that,
> including perhaps some caveat?

I actually have no idea why someone would want to do that.

> <listitem>
> 
>
> <para>
> Allow modifications of system tables using <xref
> linkend="sql-altertable"/> (Peter Eisentraut)
> </para>
>
> <para>
> This allows modifications of <literal>reloptions</literal> and
> autovacuum settings.
> </para>
> </listitem>
>
> I think the first paragraph is a bit dangerous. This does *not*
> generally allow modifications of system tables using ALTER TABLE.

OK, new text added "options":

Allow modifications of system table options using <xref
linkend="sql-altertable"/> (Peter Eisentraut)

> <listitem>
> 
>
> <para>
> Compute behavior based on pgbench's <option>--rate</option>
> value more precisely (Tom Lane)
> </para>
> </listitem>
>
> "Computing behavior" sounds a bit odd. Maybe "Improve precision of
> pgbench's <option>--rate</option>" option?

Done.

> <listitem>
> 
>
> <para>
> Allow restoration of an <command>INSERT</command>-statement dump
> to skip rows which would cause conflicts (Surafel Temesgen)
> </para>
>
> <para>
> The <application>pg_dump</application> option is
> <option>--on-conflict-do-nothing</option>.
> </para>
> </listitem>
>
> Hm, this doesn't seem that clear. It's not really a restoration time
> option, and it sounds a bit like that in the above. How about instead saying something
> like:
> Allow pg_dump to emit INSERT ... ON CONFLICT DO NOTHING (Surafel).

Done.

> <listitem>
> 
>
> <para>
> Allow the number of float digits to be specified
> for <application>pg_dump</application> and
> <application>pg_dumpall</application> (Andrew Dunstan)
> </para>
>
> <para>
> This allows the float digit output to match previous dumps.
> </para>
>
> Hm, feels like that should be combined with the ryu compat entry?

Uh, but it relates to this specific command, and it is a new feature
rather than a compatibility.

> <para>
> Add <xref linkend="sql-create-access-method"/> command to create
> new table types (Haribabu Kommi, Andres Freund, Álvaro Herrera,
> Dimitri Dolgov)
> </para>
>
> A few points:
>
> 1) Is this really source code, given that CREATE ACCESS METHOD TYPE
> TABLE is a DDL command, and USING (...) for CREATE TABLE etc is an
> option to DDL commands?

I struggled with this. It is a new command, but it has no use yet to
users, so if we move it out of "source code" we need to be clear it has
no useful purpose yet. Can we do that clearly?

> 2) I think the description sounds a bit too much like it's about new
> forms of tables, rather than their storage. How about something
> roughly like:
>
> Allow different <link linkend="tableam">table access methods</> to be
> <link linkend="sql-create-access-method">created</> and <link
> linkend="sql-createtable-method">used</>. This allows to develop and
> use new ways of storing and accessing table data, optimized for
> different use-cases, without having to modify
> PostgreSQL. The existing <literal>heap</literal> access method
> remains the default.

I added a new detail paragraph:

This enables the development of new <link linkend="tableam">table
access methods</>, which can optimize storage for different
use-cases. The existing <literal>heap</literal> access method
remains the default.
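
As a sketch of what is user-visible today: the only table access method
handler shipped in core is heap_tableam_handler, so the example below
just registers a second name for heap. The names "heap2", "tab_heap2",
and "tab_default" are made up for illustration, and CREATE ACCESS METHOD
requires superuser.

```sql
-- Register a table access method backed by the built-in heap handler.
CREATE ACCESS METHOD heap2 TYPE TABLE HANDLER heap_tableam_handler;

-- Choose the access method explicitly with USING.
CREATE TABLE tab_heap2 (id integer) USING heap2;

-- Without USING, the default (heap) access method is used.
CREATE TABLE tab_default (id integer);
```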

> 3) This misses a large set of commits around making tableam possible, in
> particular the commits around
>
> commit 4da597edf1bae0cf0453b5ed6fc4347b6334dfe1
> Author: Andres Freund <andres(at)anarazel(dot)de>
> Date: 2018-11-16 16:35:11 -0800
>
> Make TupleTableSlots extensible, finish split of existing slot type.
>
> Given that those commits entail an API break relevant for extensions,
> should we have them as a separate "source code" note?

I have added this commit to the table-am item. I don't know if this is
something that extension people care about, but if so, we should
certainly add it.

> 4) I think the attribution isn't quite right. For one, a few names with
> substantial work are missing (Amit Khandekar, Ashutosh Bapat,
> Alexander Korotkov), and the order doesn't quite seem right. On the
> latter part I might be somewhat petty, but I spent *many* months of
> my life on this.
>
> How about:
> Andres Freund, Haribabu Kommi, Alvaro Herrera, Alexander Korotkov, David Rowley, Dimitri Dolgov
> if we keep 3) separate and

I used the above list since I combined 3 so far.

> Andres Freund, Haribabu Kommi, Alvaro Herrera, Ashutosh Bapat, Alexander Korotkov, Amit Khandekar, David Rowley, Dimitri Dolgov
> otherwise?
>
> I think it might actually make sense to take David off this list,
> because his tableam work is essentially part of its own entry, as

> 
>
> <para>
> Improve speed of <command>COPY</command> into partitioned tables
> (David Rowley)
> </para>
>
> since his copy.c portions of 86b85044e823a largely are a rewrite of
> the above commit.
>

OK, David removed.

> 
>
> <para>
> Document that the <literal>B</literal>/bytes units can be specified
> for <link linkend="config-setting">server variables</link>
> (Greg Stark)
> </para>
> </listitem>
>
> Given how large changes we skip over in the release notes, I don't
> really see a point in including changes like this. Feels like we'd at
> the very least also have to include larger changes with typo/grammar
> fixes etc?

I mentioned it since it was added in a prior release, but was not
documented, so effectively there was no way for someone to know it was
possible before, so I thought it made sense to mention it.

I have only corrected a small number of issues above and look for
guidance to finish the rest. I will reply to the other emails in this
thread now.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +

In response to

Re: PG 12 draft release notes at 2019-05-20 22:17:19 from Andres Freund

Responses

Re: PG 12 draft release notes at 2019-05-21 19:57:56 from Andres Freund
