Re: Annotated release notes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: PostgreSQL-documentation <pgsql-docs(at)postgresql(dot)org>
Subject: Re: Annotated release notes
Date: 2003-10-31 05:43:48
Message-ID: 7423.1067579028@sss.pgh.pa.us
Lists: pgsql-docs pgsql-hackers

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> OK, I have committed changes to release.sgml so most complex entries
> have a paragraph describing the change. You can see the result at:
> http://candle.pha.pa.us/main/writings/pgsql/sgml/release.html#RELEASE-7-4
> I need people to check this and help me with the items marked 'bjm'.

Okay, a few comments ...

<listitem><para> IN/NOT IN subqueries are now much more efficient</para>
<para>
In previous releases, IN/NOT IN subqueries were joined to the
upper query by sequentially scanning the subquery looking for
a join. The 7.4 code uses the same sophisticated techniques
used by ordinary joins and so is much faster, and is now faster
than EXISTS subqueries.
</para>
</listitem>

This might be an overstatement. How about "... is much faster. An IN
will now usually be as fast as or faster than an equivalent EXISTS
subquery; this reverses the conventional wisdom that applied to previous
Postgres releases."
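
For readers unsure what "an equivalent EXISTS subquery" means here, a minimal sketch follows. It only shows that the two spellings return the same rows. It uses SQLite through Python's sqlite3 module purely as a stand-in, so it says nothing about the PostgreSQL planner or about speed. The tables `orders` and `vip` are made up for illustration.

```python
import sqlite3

# Hypothetical tables, used only to compare the two query forms.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, customer_id INTEGER);
    CREATE TABLE vip (customer_id INTEGER);
    INSERT INTO orders VALUES (1, 10), (2, 20), (3, 30), (4, 10);
    INSERT INTO vip VALUES (10), (30);
""")

# IN form: the subquery does not refer to the outer query.
in_form = """
    SELECT id FROM orders
    WHERE customer_id IN (SELECT customer_id FROM vip)
    ORDER BY id
"""

# EXISTS form: the same condition written as a correlated subquery.
exists_form = """
    SELECT id FROM orders o
    WHERE EXISTS (SELECT 1 FROM vip v WHERE v.customer_id = o.customer_id)
    ORDER BY id
"""

in_rows = conn.execute(in_form).fetchall()
exists_rows = conn.execute(exists_form).fetchall()
print(in_rows == exists_rows)
```

The equivalence shown is for IN only; NOT IN and NOT EXISTS can give different answers when the subquery produces NULLs.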

<listitem><para> Improved GROUP BY processing by using hash buckets</para>
<para>
In previous releases, GROUP BY totals were accumulated by
sequentially scanning the list of groups looking for a match;
the 7.4 code places GROUP BY values in hash buckets so the
proper match can be found much quicker. This is particularly
significant in speeding up queries that have a large
number of distinct GROUP BY values.
</para>
</listitem>

This is backwards. I suggest "In previous releases, GROUP BY required
sorting the input data to bring group members together. 7.4 can do it
that way, or can accumulate data into per-group hash buckets in-memory.
The hash technique avoids a sort and so can be much faster, if the
number of distinct GROUP BY values is not too large to fit in memory."
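
The difference between the two strategies can be sketched in a few lines of Python. This is only an illustration of the idea, not the executor's code; the function names and sample rows are invented here.

```python
from itertools import groupby
from operator import itemgetter


def sum_by_sort(rows):
    """Sort on the grouping key so group members are adjacent, then total each run."""
    ordered = sorted(rows, key=itemgetter(0))
    return {
        key: sum(value for _, value in group)
        for key, group in groupby(ordered, key=itemgetter(0))
    }


def sum_by_hash(rows):
    """Single pass with no sort: one in-memory accumulator per distinct key."""
    totals = {}
    for key, value in rows:
        totals[key] = totals.get(key, 0) + value
    return totals


rows = [("b", 2), ("a", 1), ("b", 5), ("c", 3), ("a", 4)]
print(sum_by_sort(rows) == sum_by_hash(rows))
```

The hash version keeps one entry per distinct key in memory, which is why it only pays off when the number of distinct GROUP BY values is small enough to fit.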

<listitem><para> ANSI joins are now better optimized</para>
<para>
Prior releases evaluated ANSI join syntax only in the order
specified by the query; 7.4 allows full optimization of
queries using ANSI join syntax, meaning the optimizer considers
all possible join orderings and chooses the most efficient.
</para>
</listitem>

This is correct only for inner joins. Outer joins still follow the
syntax-implied ordering. Not sure what the best rewording is.
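
One reason the distinction matters for any rewording: changing the order in which an outer join is evaluated relative to other joins can change the result, which is not true of inner joins alone. A toy sketch with nested-loop joins over tuples follows. The helpers `inner_join` and `left_join` and the three one-column tables are hypothetical and have nothing to do with the planner's actual code.

```python
def inner_join(left, right, pred):
    """Nested-loop inner join: keep only the pairs that satisfy pred."""
    return [l + r for l in left for r in right if pred(l, r)]


def left_join(left, right, pred, right_width):
    """Nested-loop left outer join: unmatched left rows are padded with None."""
    out = []
    for l in left:
        matches = [l + r for r in right if pred(l, r)]
        out.extend(matches or [l + (None,) * right_width])
    return out


a = [(1,), (2,)]
b = [(1,)]
c = [(5,)]

# (a LEFT JOIN b ON a.x = b.x) JOIN c ON b.x = c.x
ab = left_join(a, b, lambda ra, rb: ra[0] == rb[0], right_width=1)
outer_first = inner_join(ab, c, lambda rab, rc: rab[1] == rc[0])

# a LEFT JOIN (b JOIN c ON b.x = c.x) ON a.x = b.x
bc = inner_join(b, c, lambda rb, rc: rb[0] == rc[0])
inner_first = left_join(a, bc, lambda ra, rbc: ra[0] == rbc[0], right_width=2)

print(outer_first)
print(inner_first)
```

With this data the first ordering returns no rows, while the second returns both rows of `a` padded with nulls.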

<listitem><para> Full support for IPv6 connections and IPv6 address
data types</para>
<para>
Prior releases allowed only IPv6 connections and IP data types only
supported IPv4 addresses. This release adds full IPv6 support in
both of these areas.
</para>
</listitem>

Surely "allowed only IPv4 connections".

<listitem><para> New protocol improves connection speed/reliability,
and adds error codes, status information, a binary protocol, error
reporting verbosity, and cleaner startup packets.</para>
</listitem>

I dunno anything about improving connection speed/reliability. How
about "New client-to-server protocol adds error codes, more status
information, better support for binary data transmission, parameter
values separated from SQL commands, prepared statements available at the
protocol level, clean recovery from COPY failures, and cleaner startup
packets. The older protocol is still supported by both servers and
clients."

<listitem><para>Align shared buffers on 32-byte boundary for copy speed improvement (Manfred Spraul)</para>
<para>
Certain CPU's perform faster data copies when addresses are 32-bit
aligned.
</para>
</listitem>

bit -> byte
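
The distinction is not cosmetic: a 32-bit-aligned address is a multiple of 4 bytes, while a 32-byte-aligned address is a multiple of 32. A small sketch of the usual round-up arithmetic follows. `align_up` and `is_aligned` are hypothetical helpers written for this note, not taken from the patch.

```python
ALIGNMENT = 32  # bytes; must be a power of two for the mask trick below


def align_up(addr, alignment=ALIGNMENT):
    """Round a non-negative address up to the next multiple of alignment."""
    return (addr + alignment - 1) & ~(alignment - 1)


def is_aligned(addr, alignment=ALIGNMENT):
    """True if addr is a multiple of alignment."""
    return addr % alignment == 0


# Address 36 is 32-bit (4-byte) aligned but not 32-byte aligned.
print(is_aligned(36, 4), is_aligned(36, 32), align_up(36))
```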

<listitem><para>Fix subquery aggregates of upper query columns to match SQL spec. (Tom)</para>
<para>
bjm
</para>
</listitem>

Try:

Fix aggregates in subqueries to match SQL spec

The SQL spec says that an aggregate function appearing within a nested
subquery belongs to the outer query if its argument contains only
outer-query variables. Prior PG releases did not handle this fine point
correctly.

<listitem><para>Add option to prevent auto-addition of tables referenced in query (Nigel J.
Andrews) </para>
<para>
By default, tables mentioned in the query are automatically added
to the FROM clause if they are not already there. This option
disabled that behavior.
</para>
</listitem>

I'd suggest "... not already there. This is compatible with
historical Postgres behavior but is contrary to the SQL spec.
This option allows selecting spec-compatible behavior."

<listitem><para>Multiple pggla_dump fixes, including tar format and large objects</para></listitem>

"pggla_dump"?

<listitem><para>Syntax errors now reported as 'syntax error' rather than 'parse error' (Tom)</para></listitem>

Is it worth giving this its own bullet point? It's far down in the
noise compared to all the other message rewordings.

regards, tom lane
