== PostgreSQL Weekly News - April 01 2012 ==
In a move that stunned the industry, Oracle has bought up all the
PostgreSQL support companies including Bull, Command Prompt, Dalibo,
EMC, EnterpriseDB, OmniTI, PostgreSQL Experts, Second Quadrant and
VMware. CEO Larry Ellison remarked that the move was necessary in
order to ensure that his next yacht would be larger rather than
smaller than his previous one.
== PostgreSQL Product News ==
psycopg2 2.4.5, a Python connector for PostgreSQL, released.
== PostgreSQL Jobs for April ==
== PostgreSQL Local ==
PGDay NYC will be held April 2, 2012 at Lighthouse International in
New York City.
London PostgreSQL Evening Meetup, 17 April 2012
PGCon 2012 will be held 17-18 May 2012, in Ottawa at the University of
Ottawa. It will be preceded by two days of tutorials on 15-16 May 2012.
PGDay France will be in Lyon on June 7, 2012.
== PostgreSQL in the News ==
Planet PostgreSQL: http://planet.postgresql.org/
PostgreSQL Weekly News is brought to you this week by David Fetter
Submit news and announcements by Sunday at 3:00pm Pacific time.
Please send English language ones to david(at)fetter(dot)org, German language
to pwn(at)pgug(dot)de, Italian language to pwn(at)itpug(dot)org(dot) Spanish language
== Reviews ==
== Applied Patches ==
Robert Haas pushed:
- Code cleanup for heap_freeze_tuple. It used to be case that lazy
vacuum could call this function with only a shared lock on the
buffer, but neither lazy vacuum nor any other code path does that
any more. Simplify the code accordingly and clean up some related,
- New GUC, track_iotiming, to track I/O timings. Currently, the only
way to see the numbers this gathers is via EXPLAIN (ANALYZE,
BUFFERS), but the plan is to add visibility through the stats
collector and pg_stat_statements in subsequent patches. Ants Aasma,
reviewed by Greg Smith, with some further changes by me.
- Expose track_iotiming information via pg_stat_statements. Ants
Aasma, reviewed by Greg Smith, with very minor tweaks by me.
- pg_test_timing utility, to measure clock monotonicity and timing
cost. Ants Aasma, Greg Smith
- Doc fix for pg_test_timing. Fujii Masao
- pg_basebackup: Error message improvements. Fujii Masao
- pg_basebackup: Error handling fixes. Thomas Ogrisegg and Fujii
- Attempt to unbreak pg_test_timing on Windows. Per buildfarm, and
- pg_test_timing: Lame hack to work around compiler warning. Fujii
Masao, plus a comment by me. While I'm at it, correctly tabify this
chunk of code.
Peter Eisentraut pushed:
- Remove dead assignment found by Coverity
- Improve PL/Python database access function documentation. Organize
the function descriptions as a list instead of running text, for
- pg_dump: Small message adjustment for consistency
- Run maintainer-check on all PO files, not only configured ones. The
intent is to allow configure --enable-nls=xx for installation speed
and size, but have maintainer-check check all source files
- Tweak markup to avoid extra whitespace in man pages
- initdb: Mark more messages for translation. Some Windows-only
messages had apparently been forgotten so far. Also make the
wording of the messages more consistent with similar messages other
parts, such as pg_ctl and pg_regress.
- Add new files to NLS file lists. Some of these are newly added,
some are older and were forgotten, some don't contain any
translatable strings right now but look like they could in the
- Replace printf format %i by %d. see also
- pgxs: Supply default values for BISON and FLEX variables.
Otherwise, the availability of these variables depends on what
happened to be available at the time the PostgreSQL build was
- Fix recently introduced typo in NLS file lists
- NLS: Seed Language field in PO header. Use msgmerge --lang option
to seed the Language field, recently introduced by gettext, in the
header of the new PO file.
Tom Lane pushed:
- Silence compiler warning about uninitialized variable.
- Bend parse location rules for the convenience of pg_stat_statements.
Generally, the parse location assigned to a multiple-token construct
is the location of its leftmost token. This commit breaks that rule
for the syntaxes TYPENAME 'LITERAL' and CAST(CONSTANT Alexander
Shulgin TYPENAME) --- the resulting Const will have the location of
the literal string, not the typename or CAST keyword. The cases
where this matters are pretty thin on the ground (no error messages
in the regression tests change, for example), and it's unlikely that
any user would be confused anyway by an error cursor pointing at the
literal. But still it's less than consistent. The reason for
changing it is that contrib/pg_stat_statements wants to know the
parse location of the original literal, and it was agreed that this
is the least unpleasant way to preserve that information through
parse analysis. Peter Geoghegan
- Add some infrastructure for contrib/pg_stat_statements. Add a
queryId field to Query and PlannedStmt. This is not used by the
core backend, except for being copied around at appropriate times.
It's meant to allow plug-ins to track a particular query forward
from parse analysis to execution. The queryId is intentionally not
dumped into stored rules (and hence this commit doesn't bump
catversion). You could argue that choice either way, but it seems
better that stored rule strings not have any dependency on plug-ins
that might or might not be present. Also, add a
post_parse_analyze_hook that gets invoked at the end of parse
analysis (but only for top-level analysis of complete queries, not
cases such as analyzing a domain's default-value expression). This
is mainly meant to be used to compute and assign a queryId, but it
could have other applications. Peter Geoghegan
- Improve contrib/pg_stat_statements to lump "similar" queries
together. pg_stat_statements now hashes selected fields of the
analyzed parse tree to assign a "fingerprint" to each query, and
groups all queries with the same fingerprint into a single entry in
the pg_stat_statements view. In practice it is expected that
queries with the same fingerprint will be equivalent except for
values of literal constants. To make the display more useful, such
constants are replaced by "?" in the displayed query strings. This
mechanism currently supports only optimizable queries (SELECT,
INSERT, UPDATE, DELETE). Utility commands are still matched on the
basis of their literal query strings. There remain some open
questions about how to deal with utility statements that contain
optimizable queries (such as EXPLAIN and SELECT INTO) and how to
deal with expiring speculative hashtable entries that are made to
save the normalized form of a query string. However, fixing these
issues should require only localized changes, and since there are
other open patches involving contrib/pg_stat_statements, it seems
best to go ahead and commit what we've got. Peter Geoghegan,
reviewed by Daniel Farina
- Improve handling of utility statements containing plannable
statements. When tracking nested statements,
contrib/pg_stat_statements formerly double-counted the execution
costs of utility statements that directly contain an executable
statement, such as EXPLAIN and DECLARE CURSOR. This was not obvious
since the ProcessUtility and Executor hooks would each add their
measured costs to the same stats table entry. However, with the new
implementation that hashes utility and plannable statements
differently, this showed up as seemingly-duplicate stats entries.
Fix that by disabling the Executor hooks when the query has a
queryId of zero, which was the case already for such statements but
is now more clearly specified in the code. (The zero queryId was
causing problems anyway because all such statements would add to a
single bogus entry.) The PREPARE/EXECUTE case still results in
counting the same execution in two different stats table entries,
but it should be much less surprising to users that there are two
entries in such cases. In passing, include a CommonTableExpr's
ctename in the query hash. I had left it out originally on the
grounds that we wanted to omit all inessential aliases, but since
RTE_CTE RTEs are hashing their referenced names, we'd better hash
the CTE names too to make sure we don't hash semantically different
queries the same.
- Improve contrib/pg_stat_statements' handling of PREPARE/EXECUTE
statements. It's actually more useful for the module to ignore
these. Ignoring EXECUTE (and not incrementing the nesting level)
allows the executor hooks to charge the time to the underlying
prepared query, which shows up as a stats entry with the original
PREPARE as query string (possibly modified by suppression of
constants, which might not be terribly useful here but it's not
worth avoiding). This is much more useful than cluttering the stats
table with a distinct entry for each textually distinct EXECUTE.
Experimentation with this idea shows that it's also preferable to
ignore PREPARE. If we don't, we get two stats table entries, one
with the query string hash and one with the jumble-derived hash, but
with the same visible query string (modulo those constants). This
is confusing and not very helpful, since the first entry will only
receive costs associated with initial planning of the query, which
is not something counted at all normally by pg_stat_statements.
(And if we do start tracking planning costs, we'd want them blamed
on the other hash table entry anyway.)
- Fix dblink's failure to report correct connection name in error
messages. The DBLINK_GET_CONN and DBLINK_GET_NAMED_CONN macros did
not set the surrounding function's conname variable, causing errors
to be incorrectly reported as having occurred on the "unnamed"
connection in some cases. This bug was actually visible in two
cases in the regression tests, but apparently whoever added those
cases wasn't paying attention. Noted by Kyotaro Horiguchi, though
this is different from his proposed patch. Back-patch to 8.4; 8.3
does not have the same type of error reporting so the patch is not
- Add PGDLLIMPORT to ScanKeywords and NumScanKeywords. Per buildfarm,
this is now needed by contrib/pg_stat_statements.
- Fix glitch recently introduced in psql tab completion.
Over-optimization (by me, looks like :-() broke the case of
recognizing a word boundary just before a quoted identifier.
Reported and diagnosed by Dean Rasheed.
- Rename frontend keyword arrays to avoid conflict with backend. ecpg
and pg_dump each contain keyword arrays with structure similar to
the backend's keyword array. Up to now, we actually named those
arrays the same as the backend's and relied on parser/keywords.h to
declare them. This seems a tad too cute, though, and it breaks now
that we need to PGDLLIMPORT-decorate the backend symbols. Rename to
avoid the problem. Per buildfarm. (It strikes me that maybe we
should get rid of the separate keywords.c files altogether, and just
define these arrays in the modules that use them, but that's a
rather more invasive change.)
- Fix O(N^2) behavior in pg_dump for large numbers of owned sequences.
The loop that matched owned sequences to their owning tables
required time proportional to number of owned sequences times number
of tables; although this work was only expended in selective-dump
situations, which is probably why the issue wasn't recognized long
since. Refactor slightly so that we can perform this work after the
index array for findTableByOid has been set up, reducing the time to
O(M log N). Per gripe from Mike Roest. Since this is a
longstanding performance bug, backpatch to all supported versions.
- Fix O(N^2) behavior in pg_dump when many objects are in dependency
loops. Combining the loop workspace with the record of
already-processed objects might have been a cute trick, but it
behaves horridly if there are many dependency loops to repair: the
time spent in the first step of findLoop() grows as O(N^2). Instead
use a separate flag array indexed by dump ID, which we can check in
constant time. The length of the workspace array is now never more
than the actual length of a dependency chain, which should be
reasonably short in all cases of practical interest. The code is
noticeably easier to understand this way, too. Per gripe from Mike
Roest. Since this is a longstanding performance bug, backpatch to
all supported versions.
Andrew Dunstan pushed:
- Remove now redundant pgpipe code.
- Unbreak Windows builds broken by pgpipe removal.
Heikki Linnakangas pushed:
- Inherit max_safe_fds to child processes in EXEC_BACKEND mode.
Postmaster sets max_safe_fds by testing how many open file
descriptors it can open, and that is normally inherited by all child
processes at fork(). Not so on EXEC_BACKEND, ie. Windows, however.
Because of that, we effectively ignored max_files_per_process on
Windows, and always assumed a conservative default of 32
simultaneous open files. That could have an impact on performance,
if you need to access a lot of different files in a query. After
this patch, the value is passed to child processes by
save/restore_backend_variables() among many other global variables.
It has been like this forever, but given the lack of complaints
about it, I'm not backpatching this.
Simon Riggs pushed:
- Correct epoch of txid_current() when executed on a Hot Standby
server. Initialise ckptXidEpoch from starting checkpoint and
maintain the correct value as we roll forwards. This allows
GetNextXidAndEpoch() to return the correct epoch when executed
during recovery. Backpatch to 9.0 when the problem is first
observable by a user. Bug report from Daniel Farina
== Rejected Patches (for now) ==
No one was disappointed this week :-)
== Pending Patches ==
Kyotaro HORIGUCHI sent in two more revisions of the patch to create a
new tuple storage format for libpq and use same in dblink.
Shigeru HANADA sent in two more revisions of the patch to add a
PostgreSQL FDW along with infrastructure for same.
Peter Eisentraut and Alexander Shulgin traded patches to add a URI
format for connection strings in libpq.
Fujii Masao sent in two revisions of a patch to make pg_basebackup
exit on error.
Ants Aasma sent in a patch to use lazy hash aggregation to speed up
cases where no actual aggregates are used.
Dimitri Fontaine sent in two more revisions of the patch to add finer
dependencies for EXTENSIONs.
Marco Nenciarini sent in another revision of the patch to allow each
element of an array to be an enforced foreign key reference.
Peter Eisentraut sent in a patch to fix some infelicities between
pgxs, bison and flexx.
Andrew Dunstan and Joachim Wieland traded patches to implement
Zoltan Boszormenyi sent in two more revisions of the ECPG FETCH
Daniel Farina sent in another revision of the patch to allow same-role
Pavel Stehule sent in another revision of the CHECK TRIGGER
functionality for PL/pgsql.
Peter Eisentraut sent in a patch which reverts the default
capitalization behavior in psql's tab completion to that prior to a
previous patch while expanding the tunability of that capitalization
with tab completion.
Robert Haas sent in two patches to measure lwlock-related latency
Heikki Linnakangas sent in a patch to set stack_base_ptr in
pgsql-announce by date
|Next:||From: Marc Balmer||Date: 2012-04-05 15:56:56|
|Subject: LuaPgSQL, an Lua Binding for PostgreSQL|
|Previous:||From: Daniele Varrazzo||Date: 2012-03-29 17:14:12|
|Subject: Psycopg 2.4.5 released|