== PostgreSQL Weekly News - October 26 2008 ==

From:	David Fetter <david(at)fetter(dot)org>
To:	PostgreSQL Announce <pgsql-announce(at)postgresql(dot)org>
Subject:	== PostgreSQL Weekly News - October 26 2008 ==
Date:	2008-10-27 03:37:26
Message-ID:	20081027033726.GD30186@fetter.org
Lists:	pgsql-announce

Bug fix releases 8.3.5, etc. will be out soon. Get those last-minute
fixes in.

November's commitfest begins this coming week. Watch for some great
new features.

== PostgreSQL Product News ==

check_postgres 2.3.10 released.
http://bucardo.org/check_postgres/

pgdview 0.3 released.
http://pgfoundry.org/projects/pg-rdump/

== PostgreSQL Jobs for October ==

http://archives.postgresql.org/pgsql-jobs/2008-10/threads.php

== PostgreSQL Local ==

Dickson Guedes is looking for volunteers to help with a PgMeeting in
Florianópolis. Write to guediz AT gmail DOT com if you want to help.

PostgreSQL has a table at LinuxLive, Olympia, London, UK on 23-25
October, 2008. Write to Dave Page to participate.
dpage AT pgadmin DOT org

There will be a PostgreSQL BoF at Ontario Linux fest October 25.
http://www.onlinux.ca/

David Fetter and Robert Treat will be speaking at the Beijing Perl
Workshop on November 8.
http://conference.perlchina.org/bjpw2008/schedule

PGDay Rio de la Plata will be in Buenos Aires November 22.
http://pgday.postgres-arg.org/

== PostgreSQL in the News ==

Planet PostgreSQL: http://planet.postgresql.org/

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm Pacific time.
Please send English language ones to david(at)fetter(dot)org, German language
to pwn(at)pgug(dot)de, Italian language to pwn(at)itpug(dot)org.

== Applied Patches ==

Tom Lane committed:

- In pgsql/src/tools/findoidjoins/make_oidjoins_check, fix bogus
comment emitted by make_oidjoins_check, per Greg Stark.

- Update oidjoins test to match CVS HEAD.

- Implement comparison of generic records (composite types), and
invent a pseudo-type record[] to represent arrays of
possibly-anonymous composite types. Since composite datums carry
their own type identification, no extra knowledge is needed at the
array level. The main reason for doing this right now is that it is
necessary to support the general case of detection of cycles in
recursive queries: if you need to compare more than one column to
detect a cycle, you need to compare a ROW() to an array built from
ROW()s, at least if you want to do it as the spec suggests. Add
some documentation and regression tests concerning the cycle
detection issue.

- Eliminate unnecessary array[] decoration in examples of recursive
cycle detection.

- Add docs and regression test about sorting the output of a recursive
query in depth-first search order. Upon close reading of SQL:2008,
it seems that the spec's SEARCH DEPTH FIRST and SEARCH BREADTH FIRST
options do not actually guarantee any particular result order: what
they do is provide a constructed column that the user can then sort
on in the outer query. So this is actually just as much
functionality ...

- In pgsql/src/backend/utils/adt/timestamp.c, fix
EncodeSpecialTimestamp to throw error on unrecognized input, rather
than returning a failure code that none of its callers bothered to
check for.

- Extend the date type to support infinity and -infinity, analogously
to the timestamp types. Turns out this doesn't even reduce the
available range of dates, since the restriction to dates that work
for Julian-date arithmetic is much tighter than the int32 range
anyway. Per a longstanding TODO item.

- Update citext expected output for recent change in error message
location pointers. This is only a whitespace change, which ought to
be ignored by regression testing, but for some reason buildfarm
member spoonbill doesn't like it.

- In pgsql/src/backend/catalog/index.c, add a defense to prevent
storing pseudo-type data into index columns. Formerly, the lack of
any opclasses that could accept such data was enough of a defense,
but now with a "record" opclass we need to check more carefully.
(You can still use that opclass for an index, but you have to store
a named composite type not an anonymous one.)

- In pgsql/src/backend/catalog/heap.c, make the system-attributes loop
in AddNewAttributeTuples depend on lengthof(SysAtt) not
FirstLowInvalidHeapAttributeNumber, for consistency with the other
uses of the SysAtt array, and to make it clearer that it doesn't
walk off the end of that array.

- In pgsql/src/backend/executor/spi.c, fix SPI_getvalue and
SPI_getbinval to range-check the given attribute number according to
the TupleDesc's natts, not the number of physical columns in the
tuple. The previous coding would do the wrong thing in cases where
natts is different from the tuple's column count: either incorrectly
report error when it should just treat the column as null, or
actually crash due to indexing off the end of the TupleDesc's
attribute array. (The second case is probably not possible in
modern PG versions, due to more careful handling of inheritance
cases than we once had. But it's still a clear lack of robustness
here.) The incorrect error indication is ignored by all callers
within the core PG distribution, so this bug has no symptoms visible
within the core code, but it might well be an issue for add-on
packages. So patch all the way back.

- In pgsql/src/port/win32error.c, reduce chatter from _dosmaperr() when
used in FRONTEND code. ITAGAKI Takahiro.

- In pgsql/src/include/nodes/relation.h, improve comments about
RelOptInfo.reltargetlist.

- In pgsql/src/backend/optimizer/path/costsize.c, salvage a little bit
of work from a failed patch: simplify and speed up set_rel_width().
The code had been catering for the possibility of different varnos
in the relation targetlist, but this is impossible for a base
relation (and if it were possible, putting all the widths in the
same RelOptInfo would be wrong anyway).

- Add a new column to pg_am to specify whether an index AM supports
backward scanning; GiST and GIN do not, and it seems like too much
trouble to make them do so. By teaching ExecSupportsBackwardScan()
about this restriction, we ensure that the planner will protect a
scroll cursor from the problem by adding a Materialize node. In
passing, fix another longstanding bug in the same area: backwards
scan of a plan with set-returning functions in the targetlist did
not work either, since the TupFromTlist expansion code pays no
attention to direction (and has no way to run a SRF backwards
anyway). Again the fix is to make ExecSupportsBackwardScan check
this restriction. Also adjust the index AM API specification to
note that mark/restore support is unnecessary if the AM can't
produce ordered output.

- Remove useless mark/restore support in hash index AM, per
discussion. (I'm leaving GiST/GIN cleanup to Teodor.)

- In pgsql/src/backend/catalog/sql_features.txt, fix broken SQL
features data, per buildfarm results.

- Add a concept of "placeholder" variables to the planner. These are
variables that represent some expression that we desire to compute
below the top level of the plan, and then let that value "bubble up"
as though it were a plain Var (ie, a column value). The immediate
application is to allow sub-selects to be flattened even when they
are below an outer join and have non-nullable output expressions.
Formerly we couldn't flatten because such an expression wouldn't
properly go to NULL when evaluated above the outer join. Now, we
wrap it in a PlaceHolderVar and arrange for the actual evaluation to
occur below the outer join. When the resulting Var bubbles up
through the join, it will be set to NULL if necessary, yielding the
correct results. This fixes a planner limitation that's existed
since 7.1. In future we might want to use this mechanism to
re-introduce some form of Hellerstein's "expensive functions"
optimization, ie place the evaluation of an expensive function at
the most suitable point in the plan tree.

- Dept of better ideas: refrain from creating the planner's
placeholder_list until vars are distributed to rels during
query_planner() startup. We don't really need it before that, and
not building it early has some advantages. First, we don't need to
put it through the various preprocessing steps, which saves some
cycles and eliminates the need for a number of routines to support
PlaceHolderInfo nodes at all. Second, this means one less unused
plan for any sub-SELECT appearing in a placeholder's expression,
since we don't build placeholder_list until after sublink expansion
is complete.

- When estimating without benefit of MCV lists (suggesting that one or
both inputs is unique or nearly so), make eqjoinsel() clamp the
ndistinct estimates to be not more than the estimated number of rows
coming from the input relations. This allows the estimate to change
in response to the selectivity of restriction conditions on the
inputs. This is a pretty narrow patch and maybe we should be more
aggressive about similarly clamping ndistinct in other cases; but
I'm worried about double-counting the effects of the restriction
conditions. However, it seems to help for the case exhibited by
Grzegorz Jaskiewicz (antijoin against a small subset of a relation),
so let's try this for a while.

- Remove useless ps_OuterTupleSlot field from PlanState. I suppose
this was used long ago, but in the current code the ecxt_outertuple
field of ExprContext is doing all the work. Spotted by Ran Tang.

- Fix an oversight in two different recent patches: nodes that support
SRFs in their targetlists had better reset ps_TupFromTlist during
ReScan calls. There's no need to back-patch here since nodeAgg and
nodeGroup didn't even pretend to support SRFs in prior releases.

- Reduce the memory footprint of large pending-trigger-event lists, as
per my recent proposal. In typical cases, we now need 12 bytes per
insert or delete event and 16 bytes per update event; previously we
needed 40 bytes per event on 32-bit hardware and 80 bytes per event
on 64-bit hardware. Even in the worst case usage pattern with a
large number of distinct triggers being fired in one query, usage is
at most 32 bytes per event. It seems to be a bit faster than the
old code as well, due to reduction of palloc overhead. This commit
doesn't address the TODO item of allowing the event list to spill to
disk; rather it's trying to stave off the need for that. However,
it probably makes that task a bit easier by reducing the data
structure's dependency on pointers. It would now be practical to
dump an event list to disk by "chunks" instead of individual events.

- Fix an old bug in after-trigger handling: AfterTriggerEndQuery took
the address of afterTriggers->query_stack[afterTriggers->query_depth]
and hung onto it through all its firings of triggers. However, if a
trigger causes sufficiently many nested query executions, query_stack
will get repalloc'd bigger, leaving AfterTriggerEndQuery --- and
hence afterTriggerInvokeEvents --- using a stale pointer. So far as
I can find, the only consequence of this error is to stomp on a
couple of words of already-freed memory; which would lead to a
failure only if that chunk had already gotten re-allocated for
something else. So it's hard to exhibit a simple failure case, but
this is surely a bug. I noticed this while working on my recent
patch to reduce pending-trigger space usage. The present patch is
mighty ugly, because it requires making afterTriggerInvokeEvents
know about all the possible event lists it might get called on.
Fortunately, this is only needed in back branches because CVS HEAD
avoids the problem in a different way: afterTriggerInvokeEvents only
touches the passed AfterTriggerEventList pointer once at startup.
Back branches are stable enough that wiring in knowledge of all
possible call usages doesn't seem like a killer problem. Back-patch
to 8.0. 7.4's trigger code is completely different and doesn't seem
to have the problem (it doesn't even use repalloc).

- Add a heuristic to transformAExprIn() to make it prefer expanding "x
IN (list)" into an OR of equality comparisons, rather than x =
ANY(ARRAY[...]), when there are Vars in the right-hand side. This
avoids a performance regression compared to pre-8.2 releases, in
cases where the OR form can be optimized into scans of multiple
indexes. Limit the possible downside by preferring this form only
when the list isn't very long (I set the cutoff at 32 elements,
which is a bit arbitrary but in the right ballpark). Per discussion
with Jim Nasby. In passing, also make it try the OR form if it
cannot select a common type for the array elements; we've seen a
complaint or two about how the OR form worked for such cases and
ARRAY doesn't.

- In pgsql/src/backend/optimizer/plan/initsplan.c, be a little smarter
about qual handling for semi-joins: a qual that mentions only the
outer side can be pushed down rather than having to be evaluated at
the join.

- In pgsql/src/backend/parser/parse_expr.c, better solution to the
IN-list issue: instead of having an arbitrary cutoff, treat Var and
non-Var IN-list items differently. Only non-Var items are
candidates to go into an ANY(ARRAY) construct --- we put all Vars as
separate OR conditions on the grounds that that leaves more scope
for optimization. Per suggestion from Robert Haas.
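
As a rough illustration of the Var/non-Var split described in the last
item above, here is a toy Python model. It is not the code in
parse_expr.c: the Item class, the expand_in_list name and the string
output are all invented for this sketch, and it ignores details such as
type resolution that the real transformAExprIn() has to deal with.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One IN-list element (hypothetical type, for illustration only)."""
    text: str      # SQL text of the element
    is_var: bool   # True if the element is a column reference (a Var)


def expand_in_list(lhs, items):
    """Render "lhs IN (items)" the way the item above describes.

    Non-Var items are gathered into one ANY(ARRAY[...]) test; every Var
    becomes its own equality, and the pieces are ORed together.
    The ordering of the pieces is a choice made by this sketch.
    """
    non_vars = [it.text for it in items if not it.is_var]
    var_items = [it.text for it in items if it.is_var]

    clauses = []
    if non_vars:
        clauses.append(f"{lhs} = ANY(ARRAY[{', '.join(non_vars)}])")
    clauses.extend(f"{lhs} = {v}" for v in var_items)
    return " OR ".join(clauses)


if __name__ == "__main__":
    mixed = [Item("1", False), Item("t.a", True), Item("2", False)]
    print(expand_in_list("x", mixed))
```

Keeping each Var as a separate OR arm is what leaves the planner free
to consider the multi-index plans mentioned in the earlier IN-list item.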

Heikki Linnakangas committed:

- In pgsql/src/backend/postmaster/bgwriter.c, fix oversight in the
relation forks patch: forgot to copy fork number to fsync requests.
This should fix the installcheck failure of the buildfarm member
"kudu".

Michael Meskes committed:

- In ECPG, fixed parsing of parameters. Added regression test for
this.

Alvaro Herrera committed:

- In pgsql/src/backend/commands/cluster.c, ensure that CLUSTER leaves
the toast table and index with consistent names, by renaming the new
copies after the catalog games.

- In pgsql/src/backend/utils/error/elog.c, refactor some duplicate
code to set up formatted_log_time and formatted_start_time.

- Rework subtransaction commit protocol for hot standby. This patch
eliminates the marking of subtransactions as SUBCOMMITTED in pg_clog
during their commit; instead they remain in-progress until main
transaction commit. At main transaction commit, the commit protocol
is atomic-by-page instead of one transaction at a time. To avoid a
race condition with some subtransactions appearing committed before
others in the case where they span more than one pg_clog page, we
conserve the logic that marks them subcommitted before marking the
parent committed. Simon Riggs with minor help from me.

- In pgsql/src/backend/access/transam/transam.c, these functions no
longer return a value, per complaint from gothic_moth via Zdenek
Kotala.

- In pgsql/src/backend/storage/buffer/bufmgr.c, properly access a
buffer's LSN using existing access macros instead of abusing
knowledge of page layout. Stolen from Jonah Harris' CRC patch.

Neil Conway committed:

- In pgsql/src/backend/executor/nodeAgg.c, fix a small memory leak in
ExecReScanAgg() in the hashed aggregation case. In the previous
coding, the list of columns that needed to be hashed on was
allocated in the per-query context, but we reallocated every time
the Agg node was rescanned. Since this information doesn't change
over a rescan, just construct the list of columns once during
ExecInitAgg().
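
The pattern behind that fix can be sketched in a few lines of Python.
This is only a toy model under stated assumptions, not nodeAgg.c: the
ToyHashAgg class, its fields and the list_builds counter are invented
here to show "build once at init, reset only per-scan state on rescan".

```python
class ToyHashAgg:
    """Toy stand-in for a hashed aggregation node (hypothetical)."""

    def __init__(self, group_cols, agg_input_cols):
        self.list_builds = 0
        # The column list depends only on the plan, so build it once here.
        self.hash_columns = self._find_hash_columns(group_cols, agg_input_cols)
        self.hash_table = {}

    def _find_hash_columns(self, group_cols, agg_input_cols):
        # Count calls so the effect of the fix is observable.
        self.list_builds += 1
        # Ordered union of the columns that must be kept in the hash table.
        return list(dict.fromkeys([*group_cols, *agg_input_cols]))

    def rescan(self):
        # Reset per-scan state only; the column list is reused as-is.
        self.hash_table = {}


if __name__ == "__main__":
    node = ToyHashAgg(["a", "b"], ["b", "c"])
    for _ in range(1000):
        node.rescan()
    print(node.hash_columns, node.list_builds)
```

In the leaky version the call to build the list would sit inside
rescan(), allocating a fresh list in a long-lived context every time.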

Teodor Sigaev committed:

- During repeated rescans of a GiST index, it's possible for a scan
key to be NULL while SK_SEARCHNULL is not set. Add a check for NULL
keys during key initialization: if a key is NULL and SK_SEARCHNULL
is not set, then nothing can be satisfied. With assert-enabled
compilation, this case causes a core dump. The bug was introduced in
8.3 by support for IS NULL index scans.

- In pgsql/src/backend/tsearch/wparser_def.c, fix small bug in
headline generation. Patch from Sushant Sinha.

- Improve headline generation. Now a headline can contain several
fragments, a la Google. Sushant Sinha.

- Remove mark/restore support in GIN and GiST indexes. Per Tom's
comment. Also remove useless GISTScanOpaque->flags field.

- In pgsql/src/backend/access/gist/gistget.c, remove support of
backward scan in GiST per discussion.

- Fix GiST's tuple killing: GISTScanOpaque->curpos wasn't set
correctly. As a result, killtuple() marked the wrong tuple on the
page as dead. The bug was introduced by me while fixing possible
duplicates during GiST index scans.

Peter Eisentraut committed:

- In pgsql/src/backend/catalog/sql_features.txt, small correction to
the SQL feature table.

- Update feature list for SQL:2008.

- In pgsql/doc/src/sgml/ref/truncate.sgml, update compatibility
section of TRUNCATE for SQL:2008 final.

- In pgsql/src/backend/catalog/sql_features.txt, AS is no longer
required in SELECT list.

- In pgsql/src/backend/catalog/sql_features.txt, feature F402 "Named
column joins for LOBs, arrays, and multisets" is supported, to the
extent that LOBs, arrays, and multisets are supported.

- In pgsql/src/backend/catalog/sql_features.txt, feature T152
"DISTINCT predicate with negation" is supported.

- In pgsql/src/backend/catalog/sql_features.txt, Feature T411 is not
found in SQL:2003 or 2008 anymore, so it must have been dropped or
it was a mistake.

- In pgsql/src/backend/parser/gram.y, SQL 200N -> SQL:2003.

- Allow SQL:2008 syntax ALTER TABLE ... ALTER COLUMN ... SET DATA TYPE
alongside our traditional syntax.

- Use format_type_be() instead of TypeNameToString() for some more
user-facing error messages where the type existence is established.

- In pgsql/src/interfaces/ecpg/test/Makefile, clean regression.out.

- SQL:2008 alternative syntax for LIMIT/OFFSET: OFFSET num {ROW|ROWS}
FETCH {FIRST|NEXT} [num] {ROW|ROWS} ONLY

- In pgsql/src/backend/catalog/sql_features.txt, feature T401 is not
listed in the SQL standard. Must have been a mistake.

- In pgsql/src/backend/catalog/sql_features.txt, feature T173
"Extended LIKE clause in table definition" is supported
(INCLUDING/EXCLUDING DEFAULTS)

- In pgsql/src/backend/catalog/sql_features.txt, on second thought,
let's not get involved in correcting the feature list in 8.3. The
list is quite outdated, and fixing it up would require more effort.
Plus, we don't want diverging information schema contents.
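
For readers unfamiliar with the SQL:2008 OFFSET ... FETCH syntax listed
among Peter's commits above, the following Python sketch models its
row-selection semantics next to LIMIT/OFFSET. The function names are
made up for this example, and treating an omitted count as one row is
stated here as an assumption based on how PostgreSQL documents FETCH.

```python
def limit_offset(rows, limit=None, offset=0):
    """Model of "... LIMIT limit OFFSET offset" over an ordered row list."""
    end = None if limit is None else offset + limit
    return rows[offset:end]


def offset_fetch(rows, offset=0, count=1):
    """Model of "OFFSET offset ROWS FETCH FIRST [count] ROWS ONLY".

    The count is optional in the grammar; this sketch assumes an
    omitted count means a single row.
    """
    return limit_offset(rows, limit=count, offset=offset)


if __name__ == "__main__":
    rows = list(range(10))
    # OFFSET 2 ROWS FETCH FIRST 3 ROWS ONLY  ~  LIMIT 3 OFFSET 2
    print(offset_fetch(rows, offset=2, count=3))
    # FETCH FIRST ROW ONLY  ~  LIMIT 1
    print(offset_fetch(rows))
```

Under these assumptions the two spellings select the same rows; the new
syntax is an alternative way of writing the clause, not a new feature.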

Magnus Hagander committed:

- Make pg_hba authoption be a set of 0 or more name=value pairs. Make
LDAP use this instead of the hacky previous method to specify the DN
to bind as. Make all auth options behave the same when they are not
compiled into the server. Rename "ident maps" to "user name maps",
and support them for all auth methods that provide an external
username. This makes a backwards incompatible change in the format
of pg_hba.conf for the ident, PAM and LDAP authentication methods.

- In pgsql/src/interfaces/libpq/fe-connect.c, fix memory leak when
using gsslib parameter in libpq connections.

- In pgsql/src/backend/libpq/README.SSL, remove large parts of the old
SSL readme, which consisted of a couple of copy/pasted emails. Much
of the contents had already been migrated into the main
documentation, some was out of date and some just plain wrong. Keep
the "protocol-flowchart" which can still be useful.

- In pgsql/src/backend/libpq/be-secure.c, remove a "TODO-list"
structure at the top of the file, referring back to the old set of
SSL patches. Hasn't been updated since, and we keep the TODOs in the
"real" TODO list, really...

- In pgsql/src/interfaces/libpq/fe-secure.c, remove notes from the
frontend SSL source that are incorrect or end-user documentation
that lives in the actual documentation.

- In pgsql/src/backend/libpq/hba.c, replace now unnecessary goto
statements by using return directly.
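
To make the "set of 0 or more name=value pairs" idea from the first
item above concrete, here is a minimal Python sketch of parsing such an
option field. It is not the parser in hba.c: parse_auth_options is a
hypothetical helper, the sample option names and values are only
illustrative, and shlex's quoting rules stand in for whatever
tokenizing pg_hba.conf really uses.

```python
import shlex


def parse_auth_options(field):
    """Parse a whitespace-separated list of name=value pairs into a dict.

    An empty field yields an empty dict (zero options is valid).
    A token without "=" or with an empty name is rejected.
    """
    options = {}
    for token in shlex.split(field):
        name, sep, value = token.partition("=")
        if not sep or not name:
            raise ValueError(f"expected name=value, got {token!r}")
        options[name] = value
    return options


if __name__ == "__main__":
    # Illustrative values only; not taken from any real configuration.
    print(parse_auth_options('map=admins ldapserver="ldap.example.net"'))
    print(parse_auth_options(""))
```

Because every method's options share one shape, a check like "this
option is not compiled into the server" can be applied uniformly.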

== Rejected Patches (for now) ==

No one was disappointed this week :-)

== Pending Patches ==

KaiGai Kohei sent in another revision of his SE-PostgreSQL patches.

Jim Cox sent in another revision of his patch to add a VERBOSE option
to CLUSTER.

Euler Taveira de Oliveira sent in another revision of his reloptions
patch.

Simon Riggs sent in two revisions of a full-on Hot Standby patch.

Pavel Stehule sent in another WIP revision of his GROUPING SETS patch.

Ian Caulfield and Robert Haas each sent in a patch to implement
array_agg.

Guillaume Lelarge sent in a patch to enable people to change a
database's tablespace.

Jim Nasby sent in a patch to implement array_length().

KaiGai Kohei sent in another set of revisions to his SE-PostgreSQL
patches.

Magnus Hagander sent in another patch to clean up SSL.

Simon Riggs sent in a rework of his subtransaction commit patch.

Charles Duffy sent in a patch which makes WAL segments more
compressible in some cases.

Magnus Hagander sent in a patch which adds a configuration option to
pg_hba.conf for "clientcert".

Peter Eisentraut sent in a patch to implement CURRENT_CATALOG and
CURRENT_SCHEMA.

Alvaro Herrera sent in a patch to double-buffer page writes in
preparation for block-level CRC checks.

Ramon Lawrence sent in a patch intended to improve the performance of
hybrid hash joins for large multi-batch joins where the probe relation
has skew.

Simon Riggs sent in another patch intended to make queries safe during
recovery.

Andrew Dunstan sent in a patch to implement a trigger function which
drops updates which would have no effect.

Simon Riggs sent in a patch to reduce some DDL locks to ShareLock.

Guillaume Lelarge sent in another revision of his patch to implement
ALTER DATABASE WITH TABLESPACE.

Robert Haas sent in a patch to refactor BufferAccessStrategy for bulk
inserts.

Pavel Stehule sent in a WIP patch to allow functions to have default
values for their parameters.

Jeff Davis sent in a WIP patch to implement ARRAY_AGG() and
ARRAY_ACCUM().

Tom Lane sent in a WIP patch which converts SQL-language functions to
return tuplestores.
