Quick Links

Probable documentation errors or improvements

From:	Yaroslav <ladayaroslav(at)yandex(dot)ru>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Probable documentation errors or improvements
Date:	2020-09-10 19:19:55
Message-ID:	1599765595731-0.post@n3.nabble.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Disclaimer: I'm not a native speaker, so not sure those are actually
incorrect, and can't offer non-trivial suggestions.

General ones:
. "boolean" vs "Boolean" --- usage seems to be inconsistent, even in the
same context.

. Transaction isolation levels are sometimes called "transaction isolation
modes", is that correct?

. In https://www.postgresql.org/docs/current/tableam.html, links to source
code
are also hyperlinks into git, like (from SGML source):

For details, please refer to the <ulink
url="https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/include/access/tableam.h;hb=HEAD">
<filename>src/include/access/tableam.h</filename></ulink> file.

Perhaps, other similar links in documentation should also be made into
hyperlinks?

--------------------------------------------------------------------------------
Specific ones:

--
https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-CONSTANTS

4.1.2.4 Dollar-quoted String Constants

While the standard syntax for specifying string constants is usually
convenient, it can be difficult to understand when the desired string
contains
many single quotes or backslashes, since each of those must be doubled.
-- Not so for backslashes (perhaps, this sentence is from
pre-standard_conforming_strings ages).

----- <in the same section> ------

Notice that inside the dollar-quoted string, single quotes can be used
without
needing to be escaped. Indeed, no characters inside a dollar-quoted string
are
ever escaped: the string content is always written literally. Backslashes
are
not special, and neither are dollar signs, unless they are part of a
sequence
matching the opening tag.
-- Backslashes, again. Though here in may be justified, not sure.

----- <in the same section> ------

$function$
BEGIN
RETURN ($1 ~ $q$[\t\r\n\v\\]$q$);
END;
$function$
-- While it's understandable what the example is trying to demonstrate,
single-quoted string would work here, too (so, no actual advantage, in
this case).

----- <in the same section> ------

With single-quote syntax, each backslash in the above example would have to
be
written as four backslashes, which would be reduced to two backslashes in
parsing the original string constant, and then to one when the inner string
constant is re-parsed during function execution.
-- Nothing needs to be changed about backslashes, yet again.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/ddl-basics.html
5.1. Table Basics

A table in a relational database is much like a table on paper: It consists
of
rows and columns.
-- Why "I" in It after ":" is capitalized?

----- <in the same section> ------
Some of the frequently used data types are integer for whole numbers,
numeric
for possibly fractional numbers, text for character strings, date for dates,
time for time-of-day values, and timestamp for values containing both date
and
time.
-- Perhaps, add (or replace with) timestamptz for storing moments in time
(or
something like that)?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/ddl-constraints.html#DDL-CONSTRAINTS-EXCLUSION

5.4.6. Exclusion Constraints

Exclusion constraints ensure that if any two rows are compared on the
specified
columns or expressions using the specified operators, at least one of these
operator comparisons will return false or null. The syntax is:

CREATE TABLE circles (
c circle,
EXCLUDE USING gist (c WITH &&)
);
-- Not only the definition is hard to grasp, but the example doesn't
clarify
a lot, as it's not explained what it actually achieves (especially
given
geometric data types and operators are described several chapters
later).

--------------------------------------------------------------------------------
https://www.postgresql.org/docs/current/ddl-alter.html#DDL-ALTER-REMOVING-A-CONSTRAINT

5.6.4. Removing a Constraint

(If you are dealing with a generated constraint name like $2, don't forget
that
you'll need to double-quote it to make it a valid identifier.)
-- I don't think current releases generate names like that anymore?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/ddl-rowsecurity.html

5.8. Row Security Policies
-- A general note: interaction of row security with search_path is not
documented at all, but it may be important is some cases, like using
functions in row security policies.

----- <in the same section> ------

We can then see that an administrator connecting over a network will not see
any records, due to the restrictive policy:

=> SELECT current_user;
current_user
--------------
admin
(1 row)

=> select inet_client_addr();
inet_client_addr
------------------
127.0.0.1
(1 row)

=> SELECT current_user;
current_user
--------------
admin
-- "SELECT current_user;" twice (I guess first one should have been psql
-h).

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/dml-update.html

6.2. Updating Data

Only if you have a primary key in the table (independent of whether you
declared it or not) can you reliably address individual rows by choosing a
condition that matches the primary key.
-- If it's about relational theory here, it's candidate key, actually (and
there could be more that one).

Graphical database access tools rely on this fact to allow you to update
rows
individually.
-- Is it certain there's no tool out there that uses "WHERE CURRENT OF
cursor"
for that, for instance? So, perhaps, change it to "usually rely" or
"Many
graphical..."?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/queries-table-expressions.html

7.2. Table Expressions

Trivial table expressions simply refer to a table on disk, a so-called base
table, but more complex expressions can be used to modify or combine base
tables in various ways.
-- Formally, referenced "tables" may be [materialized] views, etc.
And the same, below:

A table reference can be a table name (possibly schema-qualified), or a
derived
table such as a subquery, a JOIN construct, or complex combinations of
these.

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/queries-table-expressions.html#QUERIES-FROM

7.2.1.5. LATERAL Subqueries
-- This section is yet another place where SQL style could use
unification.

For example, if get_product_names() returns the names of products made by a
manufacturer, but some manufacturers in our table currently produce no
products, we could find out which ones those are like this:

SELECT m.name
FROM manufacturers m LEFT JOIN LATERAL get_product_names(m.id) pname ON
true
WHERE pname IS NULL;
-- Is it really a good example? A "groupwise maximum" would be much better
one, IMHO. Besides, the query could have been (more naturally) written
as:

SELECT m.name
FROM manufacturers
WHERE EXISTS (
SELECT 1
FROM get_product_names(m.id)
)

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/datatype-character.html

8.3. Character Types

SQL defines two primary character types: character varying(n) and
character(n),
where n is a positive integer.
-- SQL standard, perhaps?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/functions-math.html#FUNCTIONS-MATH-FUNC-TABLE

Table 9.5. Mathematical Functions

In div(y numeric, x numeric) example:
div(9,4)
-- No space after the comma.

----- <in the same section> ------

mod(9,4)
-- The same as above.

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-AGGREGATE-TABLE

Table 9.55. General-Purpose Aggregate Functions

bit_and -- the bitwise AND of all non-null input values, or null if none
bit_or -- the bitwise OR of all non-null input values, or null if none
bool_and -- true if all input values are true, otherwise false
-- Descriptions of NULL handling are inconsistent: either the one in bit_*
should be removed, or it should be added everywhere except in
non-strict
functions?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/functions-aggregate.html
9.20. Aggregate Functions

Users accustomed to working with other SQL database management systems might
be
disappointed by the performance of the count aggregate when it is applied to
the
entire table.
-- Which modern DBMS(s) is that about? Perhaps, the note could be removed?

----- <in the same section> ------

This ordering is unspecified by default, but can be controlled by writing an
ORDER BY clause within the aggregate call, as shown in Section 4.2.7.
Alternatively, supplying the input values from a sorted subquery will
usually
work. For example:

SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;

Beware that this approach can fail if the outer query level contains
additional
processing, such as a join, because that might cause the subquery's output
to be
reordered before the aggregate is computed.
-- Perhaps, as the reliable way exists, the "dangerous" one could be
removed?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/functions-srf.html

9.24. Set Returning Functions

In the example:
-- basic usage
SELECT generate_subscripts('{NULL,1,NULL,2}'::int[], 1) AS s;
-- Why not:
SELECT n
FROM generate_subscripts('{NULL,1,NULL,2}'::int[], 1) AS s(n);

----- <in the same section> ------

SELECT a AS array, s AS subscript, a[s] AS value
FROM (SELECT generate_subscripts(a, 1) AS s, a FROM arrays) foo;
-- IMO, bad style: SRF call in the select list + implicit cross join + no
AS before "foo", but used in every field alias; besides, "array" and
"value" are keywords.

----- <in the same section> ------

CREATE OR REPLACE FUNCTION unnest2(anyarray)
-- The same bad style as above.

----- <in the same section> ------

SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
-- Nobody but a superuser can try this example (besides, the output is
quite
long). Why not come up with a better one?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/functions-info.html#FUNCTIONS-INFO-ACCESS-TABLE

After Table 9.64. Access Privilege Inquiry Functions, in has_table_privilege
description:

(Thus, there are actually six variants of has_table_privilege, which can be
distinguished by the number and types of their arguments.)
-- This note (describing an implementation detail) is present here, but
not
for the other has_* functions. Perhaps, it could be removed or
generalized?

----- <in the same section> ------

The desired access privilege type is specified by a text string, which must
evaluate to one of the values SELECT, INSERT, UPDATE, DELETE, TRUNCATE,
REFERENCES, or TRIGGER. Optionally, WITH GRANT OPTION can be added to a
privilege type to test whether the privilege is held with grant option.
Also,
multiple privilege types can be listed separated by commas, in which case
the
result will be true if any of the listed privileges is held.
-- The third sentence contradicts the first one ("must evaluate to one" vs
"
"multiple privilege types can be listed")?

----- <in the same section> ------

has_sequence_privilege checks whether a user can access a sequence in a
particular way. The possibilities for its arguments are analogous to
has_table_privilege. The desired access privilege type must evaluate to one
of
USAGE, SELECT, or UPDATE.
-- It also supports combinations of privileges. Besides, the support of
optional WITH GRANT OPTION (as well as in the other similar functions)
is not entirely clear from the description, IMHO.

----- <in the same section> ------
-- A note on style here --- in that section, there are several paragraphs
of
the common form:

"XXX checks whether a user can access a YYY in a particular way. Its
argument
possibilities are analogous to has_table_privilege. The desired access
privilege
type must evaluate to ZZZ."
-- Could all of these be rewritten in a better way?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/functions-info.html#FUNCTIONS-TXID-SNAPSHOT-PARTS

Table 9.75. Snapshot Components

In the xip_list description:

The list includes only those active txids between xmin and xmax;
-- Is this correct English?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADMIN-BACKUP

9.26.3. Backup Control Functions

pg_current_wal_lsn displays the current write-ahead log write location in
the
same format used by the above functions.
-- "in the same format _as_ used", perhaps?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/functions-event-triggers.html#PG-EVENT-TRIGGER-SQL-DROP-FUNCTIONS

9.28.2. Processing Objects Dropped by a DDL Command

In example of "CREATE FUNCTION test_event_trigger_for_drops()":
END LOOP;
END
$$;
-- Style nitpick: no ";" after the last END, unlike in other examples.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/typeconv-func.html

Example 10.7. Variadic Function Resolution

CREATE FUNCTION public.variadic_example(VARIADIC numeric[]) RETURNS int
LANGUAGE sql AS 'SELECT 1';
CREATE FUNCTION
-- "CREATE FUNCTION" is both input SQL and psql output, hardly
distinguishable.
Besides, the function body is not dollar-quoted (bad style). The same
goes
for the following functions in the example.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/indexes-bitmap-scans.html

11.5. Combining Multiple Indexes

For example, given an index on (a, b) a query condition like WHERE a = 5 AND
b =
6 could use the index, but a query like WHERE a = 5 OR b = 6 could not
directly
use the index.
-- "could not directly use" is vague, IMO (i.e. PostgreSQL actually can
and
sometimes will use an index like that for the query).

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/indexes-partial.html

11.8. Partial Indexes

Matching takes place at query planning time, not at run time. As a result,
parameterized query clauses do not work with a partial index. For example a
prepared query with a parameter might specify “x < ?” which will never imply
“x < 2” for all possible values of the parameter.
-- First of all, parameters syntax in postgres is "$n", not "?"; then,
partial
indexes work just fine with parameterization or prepared statements?
E.g.:
CREATE TABLE a(b int, c int);
CREATE INDEX ON a(b) WHERE c > 2;
PREPARE foo(int) AS SELECT b FROM a WHERE c > $1 AND b = 15;
EXPLAIN EXECUTE foo(3); -- Can use the index

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/textsearch-intro.html

12.1. Introduction

Textual search operators have existed in databases for years. PostgreSQL has
~,
~*, LIKE, and ILIKE operators for textual data types, but they lack many
essential properties required by modern information systems:
-- Honestly, the sentence reads like an ad to me. ;)

It is possible to use OR to search for multiple derived forms, but this is
tedious and error-prone (some words can have several thousand derivatives).
-- It's also possible to use "|" in regexps for the purpose (and much
easier
to list thousands of derivatives this way).

They tend to be slow because there is no index support, so they must process
all
documents for every search.
-- There's pg_trgm, now.

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/textsearch-intro.html#TEXTSEARCH-MATCHING

12.1.2. Basic Text Matching

There are functions to_tsquery, plainto_tsquery, and phraseto_tsquery that
are
helpful in converting user-written text into a proper tsquery, primarily by
normalizing words appearing in the text.
-- There's also websearch_to_tsquery now, is it relevant here?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/textsearch-tables.html

12.2.1. Searching a Table

A more complex example is to select the ten most recent documents that
contain
create and table in the title or body:

SELECT title
FROM pgweb
WHERE to_tsvector(title || ' ' || body) @@ to_tsquery('create & table')
ORDER BY last_mod_date DESC
LIMIT 10;
-- It's rather create and table in the title and/or body?
E.g. it matches with:
WITH pgweb(title, body, last_mod_date) AS (VALUES ('create', 'table',
'2020-01-01'))

----- <in the same section> ------

For clarity we omitted the coalesce function calls which would be needed to
find
rows that contain NULL in one of the two fields.
-- "title || ' ' || body" above could be replaced with, say:
"concat_ws(' ', title, body)" to avoid the problem, and this paragraph
removed.

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/textsearch-controls.html#TEXTSEARCH-PARSING-QUERIES

12.3.2. Parsing Queries

SELECT websearch_to_tsquery('english', 'The fat rats');
websearch_to_tsquery
----------------------
'fat' & 'rat'
(1 row)
-- All the examples of websearch_to_tsquery have "(1 row)" after each one,
while
the ones for previous functions lack it.

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/textsearch-features.html#TEXTSEARCH-MANIPULATE-TSQUERY

12.4.2. Manipulating Queries

tsquery && tsquery
Returns the AND-combination of the two given queries.
-- Is this "given" a good English (there are several similar sentences
below)?

----- <in the same section> ------

SELECT to_tsquery('fat') <-> to_tsquery('cat | rat');
?column?
-----------------------------------
'fat' <-> 'cat' | 'fat' <-> 'rat'
-- This returns "'fat' <-> ( 'cat' | 'rat' )" now.

Returns a query that searches for a match to the first given query followed
by a
match to the second given query at a distance of at distance lexemes, using
the
<N> tsquery operator.
-- Seems like "at a distance of at distance lexemes" has an extra "at"?

----- <in the same section> ------

SELECT numnode(plainto_tsquery('the any'));
NOTICE: query contains only stopword(s) or doesn't contain lexeme(s),
ignored
numnode
---------
0
-- The actual NOTICE spelling now is:
"text-search query contains only stop words or doesn't contain lexemes,
ignored"

----- <in the same section> ------

SELECT querytree(to_tsquery('!defined'));
querytree
-----------

-- This returns "T" now.

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/textsearch-dictionaries.html#TEXTSEARCH-SYNONYM-DICTIONARY

12.6.3. Synonym Dictionary

mydb=# CREATE TEXT SEARCH DICTIONARY syn (template=synonym,
synonyms='synonym_sample');
-- Why is this "mydb=#" prefix in all the examples here (previous ones
don't
have it)?

SELECT ts_lexize('syn','indices');
-- No space after the comma in the function argument list (there are
several
like this below).

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/textsearch-dictionaries.html#TEXTSEARCH-THESAURUS

12.6.4.1. Thesaurus Configuration

CREATE TEXT SEARCH DICTIONARY thesaurus_simple (
TEMPLATE = thesaurus,
DictFile = mythesaurus,
Dictionary = pg_catalog.english_stem
);
-- Why TEMPLATE is spelled all-CAPS here, while DictFile is camel-case
(seems
inconsistent style)?

----- <in the same section> ------

Thesauruses are used during indexing so any change in the thesaurus
dictionary's
parameters requires reindexing. For most other dictionary types, small
changes
such as adding or removing stopwords does not force reindexing.
-- Is it "stop words" or "stopwords", or "stop-words"? Spelling seems to
be
inconsistent in the documentation (and error messages).

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/textsearch-indexes.html

12.9. GIN and GiST Index Types

A GiST index can be covering, i.e. use the INCLUDE clause. Included columns
can
have data types without any GiST operator class. Included attributes will be
stored uncompressed.
-- Is this really relevant here (it's the same for any GiST index)?

----- <in the same section> ------

Since random access to table records is slow, this limits the usefulness of
GiST
indexes.
-- Or it might not be --- if it's SSD, or the index pages are in RAM.

----- <in the same section> ------

Note that GIN index build time can often be improved by increasing
maintenance_work_mem, while GiST index build time is not sensitive to that
parameter.
-- Buffered GiST build does depend on maintenance_work_mem value.

----- <in the same section> ------

Partitioning can be done at the database level using table inheritance, or
by
distributing documents over servers and collecting external search results,
e.g.
via Foreign Data access.
-- Why is it "table inheritance" here, not declarative partitioning?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/textsearch-psql.html

12.10. psql Support

-- All of this is documented in psql reference, and examples here are
lengthy.

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/transaction-iso.html#XACT-REPEATABLE-READ

13.2.2. Repeatable Read Isolation Level

The Repeatable Read mode provides a rigorous guarantee that each transaction
sees a completely stable view of the database. However, this view will not
necessarily always be consistent with some serial (one at a time) execution
of
concurrent transactions of the same level. For example, even a read only
transaction at this level may see a control record updated to show that a
batch
has been completed but not see one of the detail records which is logically
part
of the batch because it read an earlier revision of the control record.
-- The example is very vague by itself, IMO. I suppose it's a reference to
https://wiki.postgresql.org/wiki/SSI#Deposit_Report , but is it really
expected a reader will understand that from the above paragraph?!

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/transaction-iso.html#XACT-SERIALIZABLE

13.2.3. Serializable Isolation Level

Suppose that serializable transaction A computes:

<skip>

Concurrently, serializable transaction B computes:

<skip>

but since there is no serial order of execution consistent with the result,
using Serializable transactions will allow one transaction to commit and
will
roll the other back with this message:
-- Inconsistent capitalization of Serializable?

----- <in the same section> ------

If you explicitly request a SERIALIZABLE READ ONLY DEFERRABLE transaction,
it
will block until it can establish this fact.
-- Actually, the first query in such a transaction will block. Perhaps, it
should be clarified?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/explicit-locking.html#LOCKING-ROWS

13.3.2. Row-Level Locks

The FOR UPDATE lock mode is also acquired by any DELETE on a row, and also
by an
UPDATE that modifies the values on certain columns.
-- on certain -> of certain?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/explicit-locking.html#ADVISORY-LOCKS

13.3.5. Advisory Locks

Like all locks in PostgreSQL, a complete list of advisory locks currently
held
by any session can be found in the pg_locks system view.
-- Row-level locks are not there --- perhaps, "Like all locks ..." part
is
just unnecessary?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/using-explain.html

14.1. Using EXPLAIN

Examples in this section are drawn from the regression test database after
doing
a VACUUM ANALYZE, using 9.3 development sources.
-- Which is EOL by now. Isn't it a time to update?

----- <in the same section> ------

You should be able to get similar results if you try the examples yourself,
but
your estimated costs and row counts might vary slightly because ANALYZE's
statistics are random samples rather than exact, and because costs are
inherently somewhat platform-dependent.
-- What does it mean (that some planner cost constants are
platform-dependent)?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/using-explain.html#USING-EXPLAIN-BASICS

14.1.1. EXPLAIN Basics

(Every correct plan will output the same row set, we trust.)
-- Which is outright wrong:
SELECT *
FROM any_table
WHERE random() < 0.1;

----- <in the same section> ------
Returning to our example:
-- Is it really worth repeating it in full (it's not that far above)?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/using-explain.html#USING-EXPLAIN-ANALYZE

14.1.2. EXPLAIN ANALYZE

The numbers provided by BUFFERS help to identify which parts of the query
are
the most I/O-intensive.
-- No mention of what these numbers actually *are* (pages / blocks).

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/planner-stats.html#PLANNER-STATS-EXTENDED

14.2.2.1. Functional Dependencies

In a fully normalized database, functional dependencies should exist only on
primary keys and superkeys. However, in practice many data sets are not
fully
normalized for various reasons; intentional denormalization for performance
reasons is a common example.
-- Confusing terminology: IIRC, modern normalization theory doesn't use
PKs
(but "candidate keys"). Besides, "denormalization" is confusing by
itself.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/parallel-safety.html

15.4.1. Parallel Labeling for Functions and Aggregates

The following operations are always parallel restricted.
-- Should end with ":" instead of "."?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/install-procedure.html

16.4. Installation Procedure

The standard installation provides all the header files needed for client
application development as well as for server-side program development, such
as
custom functions or data types written in C. (Prior to PostgreSQL 8.0, a
separate make install-all-headers command was needed for the latter, but
this
step has been folded into the standard install.)
-- I guess the last sentence is not relevant anymore.

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/installation-platform-notes.html#INSTALLATION-NOTES-CYGWIN

16.7.2. Cygwin

The adduser command is not supported; use the appropriate user management
application on Windows NT, 2000, or XP. Otherwise, skip this step.

The su command is not supported; use ssh to simulate su on Windows NT, 2000,
or
XP. Otherwise, skip this step.
-- AFAIK, all these operating systems are dead.

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/install-windows-full.html#id-1.6.4.8.8

17.1.1. Requirements

ActiveState TCL

Required for building PL/Tcl (Note: version 8.4 is required, the free
Standard
Distribution is sufficient).
-- version 8.4 or later, perhaps?

----- <in the same section> ------
Note
The obsolete winflex binaries distributed on the PostgreSQL FTP site and
referenced in older documentation will fail with “flex: fatal internal
error,
exec failed” on 64-bit Windows hosts. Use Flex from MSYS instead.
-- Is this note still relevant?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/creating-cluster.html

18.2. Creating a Database Cluster

After initialization, a database cluster will contain a database named
postgres,
which is meant as a default database for use by utilities, users and third
party
applications. The database server itself does not require the postgres
database
to exist, but many external utility programs assume it exists. Another
database
created within each cluster during initialization is called template1.
-- Why template0 is not mentioned at all in this subchapter?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/kernel-resources.html#SYSVIPC

18.4.1. Shared Memory and Semaphores

The maximum number of semaphores in the system is set by SEMMNS, which
consequently must be at least as high as max_connections plus
autovacuum_max_workers plus max_wal_senders, plus max_worker_processes, plus
one
extra for each 16 allowed connections plus workers (see the formula in Table
18.1).
-- Why duplicate formulas (in the table and here)?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/upgrading.html#UPGRADING-VIA-PGDUMPALL

18.6.1. Upgrading Data via pg_dumpall

One upgrade method is to dump data from one major version of PostgreSQL and
reload it in another — to do this, you must use a logical backup tool like
pg_dumpall; file system level backup methods will not work.
-- IMO, it would be nice to call dumps dumps, not "logical backups".

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/runtime-config-developer.html

19.17. Developer Options

trace_lock_oidmin (integer)

If set, do not trace locks for tables below this OID. (use to avoid output
on system tables)
-- Incorrect "." position, should be moved to the end of the sentence.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/role-attributes.html

21.2. Role Attributes

(except for superusers, since those bypass all permission checks)
-- Repeated several times in this subchapter (could it be rephrased?).

----- <in the same section> ------

The password and md5 authentication methods make use of passwords. Database
passwords are separate from operating system passwords.
-- What about SCRAM (scram-sha-256)?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/manage-ag-dropdb.html

22.5. Destroying a Database

template1 would be the only option for dropping the last user database of a
given cluster.
-- What about using "postgres" (or even template0, if one allows
connections
to it) database?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/manage-ag-tablespaces.html

22.6. Tablespaces

Note that in PostgreSQL 9.1 and earlier you will also need to update the
pg_tablespace catalog with the new locations. (If you do not, pg_dump will
continue to output the old tablespace locations.)
-- Outdated comment (9.1 is EOL).

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/maintenance.html

Chapter 24. Routine Database Maintenance Tasks

PostgreSQL, like any database software, requires that certain tasks be
performed
regularly to achieve optimum performance.
-- Why not remove "like any database software"? I really doubt anyone
checked
every "database software" in the world (and, say, sqlite comes to
mind).
Besides, how is that relevant to the point made here?

----- <in the same section> ------

The other main category of maintenance task is periodic “vacuuming” of the
database.
-- "of maintenance tasks" (plural), perhaps?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND

24.1.5. Preventing Transaction ID Wraparound Failures

vacuum_freeze_min_age controls how old an XID value has to be before rows
bearing that XID will be frozen. Increasing this setting may avoid
unnecessary
work if the rows that would otherwise be frozen will soon be modified again,
but
decreasing this setting increases the number of transactions that can elapse
before the table must be vacuumed again.
-- Isn't it "decreasing this setting _decreases_ the number of
transactions..."?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/backup.html

Chapter 25. Backup and Restore

As with everything that contains valuable data, PostgreSQL databases should
be
backed up regularly.
-- IMNSHO, the whole point of the chapter is backwards. If the data is
valuable, one wants to prevent (or recover) from its loss. Backups are
just
a *mean* for that, while data loss avoidance is the *goal*. Enters
disaster
recovery (and more broadly business continuity). Two metrics are
fundamental to that: RPO and RTO. Using pg_dump, one cannot even get
predictable values for those in any non-trivial case. Therefore, "25.1.
SQL
Dump" should not be first and foremost in the chapter (if mentioned at
all); and the whole chapter better be written with the above in mind.
At
least, it would be nice if dumps were called dumps, not backups, as
well as
the process of making those was called dumping in "25.1. SQL Dump".

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-BASE-BACKUP

25.3.2. Making a Base Backup

To aid you in doing this, the base backup process creates a backup history
file
that is immediately stored into the WAL archive area.
-- The sentence isn't clear --- what's meant by "WAL archive area" here?

----- <in the same section> ------

If you used the label to identify the associated dump file, then the
archived
history file is enough to tell you which dump file to restore.
-- What is meant by "dump file" here?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP

25.3.3.1. Making a Non-Exclusive Low-Level Backup

On a primary, if archive_mode is enabled and the wait_for_archive parameter
is
true, pg_stop_backup does not return until the last segment has been
archived.
-- It's not entirely clear that wait_for_archive is pg_stop_backup
parameter,
and even if yes, which one is it (first or second).

----- <in the same section> ------

25.3.3.3. Backing Up the Data Directory

This is easy to arrange if pg_wal/ is a symbolic link pointing to someplace
outside the cluster directory, which is a common setup anyway for
performance
reasons.
-- Perhaps, this "common" setup should be also mentioned / recommended
elsewhere, then?

----- <in the same section> ------

In case of confusion it is therefore possible to look inside a backup file
and
determine exactly which backup session the dump file came from.
-- "inside a backup _label_ file", perhaps? And what's meant by "dump"
here?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-PITR-RECOVERY

25.3.4. Recovering Using a Continuous Archive Backup

It is important that the command return nonzero exit status on failure.
-- "returns", perhaps?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/continuous-archiving.html#BACKUP-TIPS

25.3.6.1. Standalone Hot Backups

With this preparation, a backup can be taken using a script like the
following:
-- The following is the recipe for an exclusive backup, which is
deprecated.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/high-availability.html

Chapter 26. High Availability, Load Balancing, and Replication

Some solutions are synchronous, meaning that a data-modifying transaction is
not
considered committed until all servers have committed the transaction. This
guarantees that a failover will not lose any data and that all load-balanced
servers will return consistent results no matter which server is queried.
-- The last sentence confuses distributed durability ("guarantees that a
failover will not lose any data") with distributed atomic visibility
("all
load-balanced servers will return consistent results") --- the latter
is
impossible to guarantee by any method of replication provided by
PostgreSQL.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/log-shipping-alternative.html

26.4. Alternative Method for Log Shipping

Note that in this mode, the server will apply WAL one file at a time, so if
you
use the standby server for queries (see Hot Standby), there is a delay
between
an action in the master and when the action becomes visible in the standby,
corresponding the time it takes to fill up the WAL file.
-- Is it "corresponding the time" or "corresponding to the time"?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/log-shipping-alternative.html#WARM-STANDBY-RECORD

26.4.2. Record-Based Log Shipping

Starting with PostgreSQL version 9.0, you can use streaming replication (see
Section 26.2.5) to achieve the same benefits with less effort.
-- The described method seems to be quite non-trivial, and, it seems,
provides
no benefits over streaming replication. Perhaps, this section should be
removed altogether?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/hot-standby.html#HOT-STANDBY-ADMIN

26.5.3. Administrator's Overview

WAL file control commands will not work during recovery, e.g.
pg_start_backup,
pg_switch_wal etc.
-- pg_start_backup does work during recovery (for non-exclusive backups).

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/wal-reliability.html

29.1. Reliability

While forcing data to the disk platters periodically might seem like a
simple
operation, it is not. Because disk drives are dramatically slower than main
memory and CPUs, several layers of caching exist between the computer's main
memory and the disk platters.
-- Perhaps, "disk platters" (used twice here) could be improved (SSDs
don't
have any)?

----- <in the same section> ------

These commands are not directly accessible to PostgreSQL, but some file
systems
(e.g., ZFS, ext4) can use them to flush data to the platters on
write-back-enabled drives.
-- Perhaps, "platters" could be improved (SSDs don't have any)?

----- <in the same section> ------

Another risk of data loss is posed by the disk platter write operations
themselves. Disk platters are divided into sectors, commonly 512 bytes each.
-- Perhaps, "platters" / "Disk platters" could be improved (SSDs don't
have
any)?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/logical-replication.html

Chapter 30. Logical Replication

The subscriber applies the data in the same order as the publisher so that
transactional consistency is guaranteed for publications within a single
subscription.
-- "transactional consistency" is vague. Perhaps, a warning analogous
to the one in
https://www.postgresql.org/docs/current/applevel-consistency.html#SERIALIZABLE-CONSISTENCY
could be added here for clarification?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/logical-replication-architecture.html

30.5.1. Initial Snapshot

Logical replication starts by copying a snapshot of the data on the
publisher
database.
-- "from the publisher", perhaps?

----- <in the same section> ------

The subscriber applies data in the order in which commits were made on the
publisher so that transactional consistency is guaranteed for the
publications
within any single subscription.
-- "transactional consistency" is vague (not clear if it's atomic, here)?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/libpq-status.html

33.2. Connection Status Functions

(server_encoding, TimeZone, and integer_datetimes were not reported by
releases
before 8.0; standard_conforming_strings was not reported by releases before
8.1;
IntervalStyle was not reported by releases before 8.4; application_name was
not
reported by releases before 9.0.)
-- All of these releases are very old --- perhaps, this could be removed?

Pre-3.0-protocol servers do not report parameter settings, but libpq
includes
logic to obtain values for server_version and client_encoding anyway.
-- Pre-3.0-protocol servers are very old --- perhaps, this could be
removed?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/libpq-async.html

33.4. Asynchronous Command Processing

Like PQprepare, it will not work on 2.0-protocol connections.
-- The chapter has several warnings like this. Perhaps (as protocol 2.0 is
very old) these could be removed?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/lo-interfaces.html#LO-CREATE

34.3.1. Creating a Large Object

lo_create is new as of PostgreSQL 8.1; if this function is run against an
older
server version, it will fail and return InvalidOid.
-- PostgreSQL 8.1 is very old --- perhaps, the warning could be removed?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/xfunc-sql.html

37.5. Query Language (SQL) Functions

Alternatively, an SQL function can be declared to return a set (that is,
multiple rows) by specifying the function's return type as SETOF sometype,
or
equivalently by declaring it as RETURNS TABLE(columns).
-- "an SQL function" here...

Alternatively, if you want to define a SQL function that performs actions
but
has no useful value to return, you can define it as returning void.
-- But "a SQL function" here. Is it correct?

----- <in the same section> ------

CREATE FUNCTION clean_emp() RETURNS void AS '
DELETE FROM emp
WHERE salary < 0;
' LANGUAGE SQL;
-- Why not use dollar-quoting here (as recommended in the text below)?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/xfunc-sql.html#XFUNC-SQL-FUNCTION-ARGUMENTS

37.5.1. Arguments for SQL Functions

The ability to use names to reference SQL function arguments was added in
PostgreSQL 9.2. Functions to be used in older servers must use the $n
notation.
-- Perhaps (as 9.2 is EOL), this note could be removed?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/xfunc-sql.html#XFUNC-SQL-BASE-FUNCTIONS

37.5.2. SQL Functions on Base Types

-- Alternative syntax for string literal:
CREATE FUNCTION one() RETURNS integer AS '
SELECT 1 AS result;
' LANGUAGE SQL;
-- What's the point of the "alternative syntax" example (besides
thickening
the manual)?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/xfunc-c.html#id-1.8.3.13.13

37.10.9. Polymorphic Arguments and Return Types

There are two routines provided in fmgr.h to allow a version-1 C function to
discover the actual data types of its arguments and the type it is expected
to
return.
-- Remove "version-1", perhaps?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/extend-extensions.html#id-1.8.3.20.11

37.17.1. Extension Files

The CREATE EXTENSION command relies on a control file for each extension,
which
must be named the same as the extension with a suffix of .control, and must
be
placed in the installation's SHAREDIR/extension directory.
-- In the other places in documentation, it's $SHAREDIR (with dollar).

By default, the script file(s) are also placed in the SHAREDIR/extension
directory; but the control file can specify a different directory for the
script
file(s).
-- In the other places in documentation, it's $SHAREDIR (with dollar).

Unless an absolute path is given, the name is relative to the installation's
SHAREDIR directory.
-- In the other places in documentation, it's $SHAREDIR (with dollar).

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/rules-views.html#id-1.8.6.7.7

40.2.3. The Power of Views in PostgreSQL

The benefit of implementing views with the rule system is, that the planner
has
all the information about which tables have to be scanned plus the
relationships
between these tables plus the restrictive qualifications from the views plus
the
qualifications from the original query in one single query tree.
-- The comma is not needed here?

----- <in the same section> ------

And the rule system as implemented in PostgreSQL ensures, that this is all
information available about the query up to that point.
-- The comma is not needed here?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/rules-triggers.html

40.7. Rules Versus Triggers

So if many rows are affected in one statement, a rule issuing one extra
command
is likely to be faster than a trigger that is called for every single row
and
must re-determine what to do many times.
-- There are triggers with transition relations now --- perhaps, this
should
be corrected?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/plpgsql-trigger.html#PLPGSQL-TRIGGER-EXAMPLE

Example 42.3. A PL/pgSQL Trigger Function

last_date timestamp,
-- Why not timestamptz?

NEW.last_date := current_timestamp;
-- current_timestamp returns timestamptz (which should be used here).

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/plpgsql-trigger.html#PLPGSQL-TRIGGER-AUDIT-EXAMPLE

Example 42.4. A PL/pgSQL Trigger Function for Auditing

CREATE TABLE emp_audit(
operation char(1) NOT NULL,
stamp timestamp NOT NULL,
-- Should be timestamptz (BTW, now() is used later to fill it).

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/plpgsql-trigger.html#PLPGSQL-VIEW-TRIGGER-AUDIT-EXAMPLE

Example 42.5. A PL/pgSQL View Trigger Function for Auditing

CREATE TABLE emp_audit(
...
stamp timestamp NOT NULL
-- Should be timestamptz

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/plpgsql-trigger.html#PLPGSQL-TRIGGER-SUMMARY-EXAMPLE

Example 42.6. A PL/pgSQL Trigger Function for Maintaining a Summary Table

<<insert_update>>
LOOP
UPDATE sales_summary_bytime
SET amount_sold = amount_sold + delta_amount_sold,
units_sold = units_sold + delta_units_sold,
amount_cost = amount_cost + delta_amount_cost
WHERE time_key = delta_time_key;

EXIT insert_update WHEN found;

BEGIN
INSERT INTO sales_summary_bytime (
time_key,
amount_sold,
units_sold,
amount_cost)
VALUES (
delta_time_key,
delta_amount_sold,
delta_units_sold,
delta_amount_cost
);

EXIT insert_update;

EXCEPTION
WHEN UNIQUE_VIOLATION THEN
-- do nothing
END;
END LOOP insert_update;
-- The above works only in READ COMMITTED --- perhaps, it should be
mentioned?
Or just replaced with INSERT ... ON CONFLICT ... DO UPDATE?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/plpgsql-trigger.html#PLPGSQL-TRIGGER-AUDIT-TRANSITION-EXAMPLE

Example 42.7. Auditing with Transition Tables

CREATE TABLE emp_audit(
operation char(1) NOT NULL,
stamp timestamp NOT NULL,
-- Should be timestamptz

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/plpgsql-implementation.html#PLPGSQL-PLAN-CACHING

42.11.2. Plan Caching

CREATE FUNCTION logfunc2(logtxt text) RETURNS void AS $$
DECLARE
curtime timestamp;
BEGIN
curtime := 'now';
INSERT INTO logtable VALUES (logtxt, curtime);
END;
$$ LANGUAGE plpgsql;
-- Why not timestamptz in the above?

----- <in the same section> ------

In the case of logfunc1, the PostgreSQL main parser knows when analyzing the
INSERT that the string 'now' should be interpreted as timestamp, because the
target column of logtable is of that type. Thus, 'now' will be converted to
a
timestamp constant when the INSERT is analyzed, and then used in all
invocations
of logfunc1 during the lifetime of the session.
-- timestamp -> timestamptz?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/pltcl-functions.html

43.2. PL/Tcl Functions and Arguments

In a nonstrict function, if the actual value of an argument is null, the
corresponding $n variable will be set to an empty string.
-- It's "n variable" (without dollar).

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/pltcl-dbaccess.html

43.5. Database Access from PL/Tcl

For example:
spi_exec "SELECT count(*) AS cnt FROM pg_proc"
will set the Tcl variable $cnt to the number of rows in the pg_proc system
catalog.
-- It's "variable cnt" (without dollar).

----- <in the same section> ------

We need backslashes inside the query string given to spi_prepare to ensure
that
the $n markers will be passed through to spi_prepare as-is, and not replaced
by
Tcl variable substitution.
-- Perhaps, add that:
{SELECT count(*) AS cnt FROM t1 WHERE num >= $1 AND num <= $2}
could have been used instead (or use it instead and remove the
paragraph)?

----- <in the same section> ------

quote string
Doubles all occurrences of single quote and backslash characters in the
given
string.
-- And indeed it does, so:
set x {don't \add \ }
elog NOTICE "$x = [quote $x]"
Results in:
NOTICE: don't \add \ = don''t \\add \\
Which is not going to work correctly with default
standard_conforming_strings
setting, and the following example is actually wrong:

"SELECT '[ quote $val ]' AS ret"
-- Should be "SELECT E'[ quote $val ]' AS ret" (or, perhaps, it's a bug in
"quote" function?)

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/pltcl-trigger.html

43.6. Trigger Functions in PL/Tcl

The information from the trigger manager is passed to the function body in
the
following variables:

$TG_name
-- It's "TG_name" (without dollar). The same goes for all the following
variables.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/pltcl-event-trigger.html

43.7. Event Trigger Functions in PL/Tcl

$TG_event
-- It's "TG_event" (without dollar). The same goes for the following
variable.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/plperl-funcs.html

44.1. PL/Perl Functions and Arguments

It is usually most convenient to use dollar quoting (see Section 4.1.2.4)
for
the string constant. If you choose to use escape string syntax E'', you must
double any single quote marks (') and backslashes (\) used in the body of
the
function (see Section 4.1.2.1).
-- Perhaps, the last sentence (or both) could be removed (it's
sufficiently
described elsewhere; PL/Tcl and PL/Python chapters have no similar text
about quoting)?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-copy.html

filename

The path name of the input or output file. An input file name can be an
absolute or relative path, but an output file name must be an absolute
path.
Windows users might need to use an E'' string and double any backslashes
used in the path name.
-- But why they might need to use an E'' string (this is quite vague)?

FREEZE
This is intended as a performance option for initial data loading.
-- But actually it's almost useless (as visibility map is not written
by
COPY FREEZE, therefore index-only scans don't work as intended, and
the
next VACUUM will re-scan the whole table). Perhaps, add a note about
that?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-createconversion.html

The privileges required to create a conversion might be changed in a future
release.
-- But are still the same (looked back as far as 7.3). What's the use of
this
warning (perhaps, it could be removed)?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-createextension.html

PostgreSQL will create the extension using details from the file
SHAREDIR/extension/extension_name.control.
-- In the other places in documentation, it's $SHAREDIR (with dollar).

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-createforeigntable.html

CREATE FOREIGN TABLE films (
code char(5) NOT NULL,
title varchar(40) NOT NULL,
did integer NOT NULL,
date_prod date,
kind varchar(10),
-- Perhaps, replace all the char types here with text + CHECKs (would be
nice
if the documentation adhered to
https://wiki.postgresql.org/wiki/Don%27t_Do_This)?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-createtable.html
-- For all the "Examples" section: perhaps, replace all the char types
here
with text + CHECKs (would be nice if the documentation adhered to
https://wiki.postgresql.org/wiki/Don%27t_Do_This)?

Also:
modtime timestamp DEFAULT current_timestamp
-- Replace with timestamptz.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-createtransform.html

Synopsis
FROM SQL WITH FUNCTION from_sql_function_name [ (argument_type [, ...])
],
TO SQL WITH FUNCTION to_sql_function_name [ (argument_type [, ...]) ]
-- It seems that a transform function might accept many arguments, but
it's
not documented like that below?

from_sql_function_name[(argument_type [, ...])]
It must take one argument of type internal and return type internal.
-- [, ...] contradicts the description?

to_sql_function_name[(argument_type [, ...])]
It must take one argument of type internal and return the type that is
the
type for the transform.
-- [, ...] contradicts the description?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-createview.html

CREATE VIEW comedies AS
SELECT f.*,
country_code_to_name(f.country_code) AS country,
(SELECT avg(r.rating)
FROM user_ratings r
WHERE r.film_id = f.id) AS avg_rating
FROM films f
WHERE f.kind = 'Comedy';
-- Using AS for aliases (user_ratings AS r, films AS f) would be better
style, IMO.
--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-dropfunction.html

* The standard only allows one function to be dropped per command.
-- The following items don't have points at the end.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-rollback-to.html

The SQL standard specifies that the key word SAVEPOINT is mandatory, but
PostgreSQL and Oracle allow it to be omitted.
-- Why do we need to document Oracle behavior here?

----- <in the same section> ------

Also, SQL has an optional clause AND [ NO ] CHAIN which is not currently
supported by PostgreSQL.
-- ISO SQL (from a draft of 2011) states: "If AND CHAIN is specified, then
<savepoint clause> shall not be specified." So, the sentence is
incorrect?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-select.html

[ HAVING condition [, ...] ]
-- There could be only one condition in HAVING, "[, ...]" is incorrect.

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/sql-select.html#SQL-FOR-UPDATE-SHARE

Previous releases failed to preserve a lock which is upgraded by a later
savepoint. For example, this code:

would fail to preserve the FOR UPDATE lock after the ROLLBACK TO. This has
been
fixed in release 9.3.
-- 9.3 is already unsupported --- perhaps, the warning could be removed
now?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-select.html#id-1.9.3.171.9

Examples

To join the table films with the table distributors:

SELECT f.title, f.did, d.name, f.date_prod, f.kind
FROM distributors d, films f
WHERE f.did = d.did
-- Why not use explicit JOIN ... ON here? Also, there's no ";" at the end,
unlike in the following examples.

----- <in the same section> ------

SELECT kind, sum(len) AS total FROM films GROUP BY kind;
-- Would be nice if examples were formatted in the same style (the
following
ones, too).

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-select.html#id-1.9.3.171.10

Note that if a FROM clause is not specified, the query cannot reference any
database tables. For example, the following query is invalid:

SELECT distributors.* WHERE distributors.name = 'Westward';

PostgreSQL releases prior to 8.1 would accept queries of this form, and add
an
implicit entry to the query's FROM clause for each table referenced by the
query. This is no longer allowed.
-- 8.1 is unsupported for years. Perhaps, this warning could be removed?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-selectinto.html

[ HAVING condition [, ...] ]
-- There could be only one condition in HAVING, "[, ...]" is incorrect.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-set.html

SET TIME ZONE
<skip>
LOCAL
DEFAULT
Set the time zone to your local time zone (that is, the server's default
value of timezone).
-- Seems like it resets to the database / role default (if any of
those is
set), instead.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-show.html

These variables can be set using the SET statement, by editing the
postgresql.conf configuration file, through the PGOPTIONS environmental
variable
(when using libpq or a libpq-based application), or through command-line
flags
when starting the postgres server. See Chapter 19 for details.
-- The list of methods is not exhaustive --- perhaps, add "etc.", or just
remove / rephrase these sentences?

----- <in the same section> ------

Show all settings:
<skip>
(196 rows)
-- It's 299 rows as of PostgreSQL 12.4. Perhaps, remove the "rows" line?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sql-truncate.html

When RESTART IDENTITY is specified, the implied ALTER SEQUENCE RESTART
operations are also done transactionally; that is, they will be rolled back
if
the surrounding transaction does not commit. This is unlike the normal
behavior
of ALTER SEQUENCE RESTART.
-- But ALTER SEQUENCE RESTART is also transactional... what is this about?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/app-pgbasebackup.html

Especially if the option --checkpoint=fast is not used, this can take some
time
during which pg_basebackup will be appear to be idle.
-- "will be appear to be" -> "will appear to be"?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/pgarchivecleanup.html

pg_archivecleanup is designed to work with PostgreSQL 8.0 and later when
used as
a standalone utility, or with PostgreSQL 9.0 and later when used as an
archive
cleanup command.
-- Isn't this information useless / deprecated by now?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/app-postgres.html

The -- options will not work on FreeBSD or OpenBSD. Use -c instead. This is
a
bug in the affected operating systems; a future release of PostgreSQL will
provide a workaround if this is not fixed.
-- This warning appeared here in PostgreSQL 8.2. Is it still relevant?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/view-pg-locks.html

51.74. pg_locks

The waiting process will sleep until the other lock is released (or a
deadlock
situation is detected).
-- There's also lock timeout which seems relevant here.

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/catalog-pg-authid.html

Table 51.8. pg_authid Columns
-- Some rows in the description have point at the end, some don't
(seemingly
at random).

--------------------------------------------------------------------------------
https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-ASYNC

52.2.6. Asynchronous Operations

(server_encoding, TimeZone, and integer_datetimes were not reported by
releases
before 8.0; standard_conforming_strings was not reported by releases before
8.1;
IntervalStyle was not reported by releases before 8.4; application_name was
not
reported by releases before 9.0.)
-- All these versions are unsupported --- perhaps, the sentence could be
removed?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/protocol-logical-replication.html#PROTOCOL-LOGICAL-MESSAGES-FLOW

52.5.3. Logical Replication Protocol Message Flow

The origin message indicated that the transaction originated on different
replication node.
-- origin -> Origin (as spelled in the other cases in the paragraph),
indicated -> indicates?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/error-message-reporting.html

As of PostgreSQL 9.3, complete coverage exists only for errors in SQLSTATE
class 23 (integrity constraint violation), but this is likely to be expanded
in
future.
-- Perhaps, this information could be updated?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/nls-translator.html#id-1.10.7.2.6

54.1.4. Editing the PO Files

There is (unsurprisingly) a PO mode for Emacs, which I find quite useful.
-- Who is "I"? Perhaps, this should be rephrased? Also, why only Emacs is
mentioned, not any other editors? Perhaps, something like

https://www.gnu.org/software/trans-coord/manual/web-trans/html_node/PO-Editors.html
should be referenced, instead?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/index-functions.html

61.2. Index Access Method Functions

False means it is certain that the index entry matches the scan keys. true
means
this is not certain, and the conditions represented by the scan keys must be
rechecked against the heap tuple after fetching it.
-- "False" is capitalized, "true" isn't --- this can't be correct?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/gin-implementation.html

66.4. Implementation

As of PostgreSQL 9.1, null key values can be included in the index.
-- As 9.1 is already unsupported, perhaps "As of PostgreSQL 9.1" could be
removed?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/gin-implementation.html#GIN-FAST-UPDATE

66.4.1. GIN Fast Update Technique

As of PostgreSQL 8.4, GIN is capable of postponing much of this work by
inserting new tuples into a temporary, unsorted list of pending entries.
-- As 8.4 is already unsupported, perhaps "As of PostgreSQL 8.4" could be
removed?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/gin-tips.html

66.5. GIN Tips and Tricks

Insertion into a GIN index can be slow due to the likelihood of many
keys
being inserted for each item. So, for bulk insertions into a table it is
advisable to drop the GIN index and recreate it after finishing bulk
insertion.

As of PostgreSQL 8.4, this advice is less necessary since delayed
indexing
is used (see Section 66.4.1 for details). But for very large updates it
may
still be best to drop and recreate the index.
-- As 8.4 is already unsupported, perhaps it's better to rephrase both
paragraphs?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/gin-tips.html

66.5. GIN Tips and Tricks

gin_pending_list_limit can be overridden for individual GIN indexes by
changing
storage parameters, and which allows each GIN index to have its own cleanup
threshold.
-- "and which allows" -> "and it allows", perhaps?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/row-estimation-examples.html

70.1. Row Estimation Examples

The outputs shown are taken from version 8.3. The behavior of earlier (or
later)
versions might vary.
-- 8.3 is very old --- perhaps, it's time to update the chapter?

----- <in the same section> ------

Note also that since ANALYZE uses random sampling while producing
statistics,
the results will change slightly after any new ANALYZE.
-- Strictly speaking, it's "might change", not "will change", as small
tables
are scanned in full anyway (which could be relevant here), no?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/errcodes-appendix.html

Appendix A. PostgreSQL Error Codes

As of PostgreSQL 9.3, complete coverage for this feature exists only for
errors
in SQLSTATE class 23 (integrity constraint violation), but this is likely to
be
expanded in future.
-- Perhaps, this information could be updated?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/release-12-3.html

E.2. Release 12.3

The previous coding could allow the file to be created with permissions that
wouldn't allow the postmaster to write on it.
-- Is "write on" correct?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/cube.html#id-1.11.7.18.7

F.9.4. Defaults

I believe this union:
-- "I" disagrees with the general style of the documentation?

----- <in the same section> ------

In all binary operations on differently-dimensioned cubes, I assume the
lower-dimensional one to be a Cartesian projection, i. e., having zeroes in
place of coordinates omitted in the string representation.
-- "I" disagrees with the general style of the documentation?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/isn.html

F.19. isn

It is hoped that a future version of this module will obtained the prefix
list
from one or more tables that can be easily updated by users as needed;
however,
at present, the list can only be updated by modifying the source code and
recompiling.
-- "will obtained" -> "will obtain"?

--------------------------------------------------------------------------------
--
https://www.postgresql.org/docs/current/pgcrypto.html#PGCRYPTO-HASH-SPEED-TABLE

[In the description of the Table F.18. Hash Algorithm Speeds]

That way I can show the speed with different numbers of iterations.
-- "I" disagrees with the general style of the documentation?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/seg.html#id-1.11.7.43.5

F.34.2. Syntax

Because ... is widely used in data sources, it is allowed as an alternative
spelling of ...
-- While the first of those is <...>, and the second is <..>., it's a
little
confusing / not very apparent, IMO --- perhaps, the sentence could be
rephrased so the operator is not at the end of it?

--------------------------------------------------------------------------------
-- https://www.postgresql.org/docs/current/sepgsql.html#SEPGSQL-INSTALLATION

F.35.2. Installation

Be sure that the libselinux-devel RPM is installed at build time.
-- Why RPM, specifically (debian(-based) distributions support SELinux,
too)?

-----
WBR, Yaroslav Schekin.
--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html

Responses

Re: Probable documentation errors or improvements at 2020-09-13 19:36:26 from David G. Johnston
Re: Probable documentation errors or improvements at 2020-09-19 17:55:59 from Justin Pryzby

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Julien Rouhaud	2020-09-10 19:27:40	Re: Collation versioning
Previous Message	Surafel Temesgen	2020-09-10 18:55:27	Re: Improvements in Copy From