== PostgreSQL Weekly News - June 03 2007 ==

From: David Fetter <david(at)fetter(dot)org>
To: PostgreSQL Announce <pgsql-announce(at)postgresql(dot)org>
Subject: == PostgreSQL Weekly News - June 03 2007 ==
Date: 2007-06-04 05:56:39
Message-ID: 20070604055639.GC1867@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-announce

== PostgreSQL Weekly News - June 03 2007 ==

Lively discussions continue on the many new features going into 8.3.

== PostgreSQL Product News ==

Continuent uni/cluster for PostgreSQL 2007 will be available June 4
http://www.continuent.com/index.php?option=com_content&task=view&id=212&Itemid=169

GNUmed 0.2.6.1 released.
http://wiki.gnumed.de/bin/view/Gnumed/GnumedTodo#AnchorDone

phpPgAdmin 4.1.2 released. Upgrade ASAP.
http://sourceforge.net/project/showfiles.php?group_id=37132

== PostgreSQL Jobs for June ==

http://archives.postgresql.org/pgsql-jobs/2007-06/threads.php

== PostgreSQL Local ==

Registration for pgday.it in Prato, Tuscany, Italy on July 6 and 7 is
open.

Some Important URLs:

Registration: http://www.pgday.it/en/generale/registrazione
PGDay's web site: http://www.pgday.it/en/
Sponsorship campaign: http://www.pgday.it/en/sponsorizzazioni/come
How to get to Prato: http://www.pgday.it/en/logistica/raggiungere_prato
Accommodations in Prato: http://www.pgday.it/en/logistica/dove_dormire
PostgreSQL official T-shirts http://www.prato.linux.it/node/30
Sign up to tour Tuscany: http://www.pgday.it/it/node/158

Important dates:

May 31: Deadline for the call for papers
June 5: Special accommodation rates expire for Hotel San Marco and Art Hotel Milano

== PostgreSQL in the News ==

Planet PostgreSQL: http://www.planetpostgresql.org/

General Bits, Archives and occasional new articles:
http://www.varlena.com/GeneralBits/

PostgreSQL Weekly News is brought to you this week by David Fetter

To get your submission into the upcoming issue, make sure it arrives
at david(at)fetter(dot)org or in German at pwn(at)pgug(dot)de by Sunday at 3:00pm
Pacific Time.

== Applied Patches ==

Andrew Dunstan committed:

- Improve efficiency of LIKE/ILIKE code, especially for multi-byte
charsets, and most especially for UTF8. Remove unnecessary special
cases for bytea processing and single-byte charset ILIKE. a ILIKE b
is now processed as lower(a) LIKE lower(b) in all cases. The code is
now considerably simpler. All comparisons are now performed
byte-wise, and the text and pattern are also advanced byte-wise
where it is safe to do so - essentially where a wildcard is not
being matched. Andrew Dunstan, from an original patch by ITAGAKI
Takahiro, with ideas from Tom Lane and Mark Mielke.

Teodor Sigaev committed:

- Replace ReadBuffer to ReadBufferWithStrategy in all vacuum-involved
places to implement limited-size "ring" of buffers for VACUUM for
GIN & GIST.

Peter Eisentraut committed:

- Clarify some error messages about duplicate things.

- Make some messages more consistent.

- Downgrade some low-level startup messages to DEBUG1.

Bruce Momjian committed:

- Remove description for now-complete TODO item: "-Add a GUC variable
to control the tablespace for temporary objects and sort files."

- Updated TODO: "Allow free-behind capability for large sequential
scans to avoid kernel cache spoiling."

- Update wording and add URL for TODO: "Research self-referential
UPDATEs that see inconsistent row versions in read-committed mode."

- Wording improvement in FAQ_DEV.

- Update FAQ_DEV URL to output for text format.

- Add URL for code comments to developer's FAQ.

- Update TODO to read: "Consider allowing 64-bit integers and floats
to be passed by value on 64-bit platforms Also change 32-bit floats
(float4) to be passed by value at the same time."

- Add to TODO: "Consider allowing 64-bit integers to be passed by
value on 64-bit platforms."

- Les Hill's patch which adds standard error redirection for OS/X &
darwin startup script.

- Guillaume Cottenceau's patch which updates documentation to mention
VACUUM FULL and CLUSTER where appropriate.

- Add URL for TODO: "Improve speed with indexes."

- Jim Nasby's patch which adds a documentation reference to
statistical functions from func.sgml.

- Mark Cotner's patch to update /contrib OS/X startup files, and move
to a separate OS/X directory.

- Update cvsutils documentation description.

- David Fetter's patch to update cvs instructions to suggest cvsutils.

- Fix trivial misspelling in comment.

- Add to TODO: "Fix self-referential UPDATEs that see inconsistent row
versions in read-committed mode."

Tom Lane committed:

- Create a GUC parameter temp_tablespaces that allows selection of the
tablespace(s) in which to store temp tables and temporary files.
This is a list to allow spreading the load across multiple
tablespaces (a random list element is chosen each time a temp object
is to be created). Temp files are not stored in per-database
pgsql_tmp/ directories anymore, but per-tablespace directories.
Jaime Casanova and Albert Cervera, with review by Bernd Helmle and
Tom Lane.

- Fix erroneous error reporting for overlength input in text_date(),
text_time(), and text_timetz(). 7.4-vintage bug found by Greg
Stark.

- Fix aboriginal bug in BufFileDumpBuffer that would cause it to write
the wrong data when dumping a bufferload that crosses a
component-file boundary. This probably has not been seen in the
wild because (a) component files are normally 1GB apiece and (b)
non-block-aligned buffer usage is relatively rare. But it's fairly
easy to reproduce a problem if one reduces RELSEG_SIZE in a test
build. Kudos to Kurt Harriman for spotting the bug.

- Make CREATE/DROP/RENAME DATABASE wait a little bit to see if other
backends will exit before failing because of conflicting DB usage.
Per discussion, this seems a good idea to help mask the fact that
backend exit takes nonzero time. Remove a couple of
thereby-obsoleted sleeps in contrib and PL regression test
sequences.

- Buy back some of the cycles spent in more-expensive hash functions
by selecting power-of-2, rather than prime, numbers of buckets in
hash joins. If the hash functions are doing their jobs properly by
making all hash bits equally random, this is good enough, and it
saves expensive integer division and modulus operations.

- Fix performance problems in multi-batch hash joins by ensuring that
we select a well-randomized batch number even when given a
poorly-randomized hash value. This is a bit inefficient but seems
the only practical solution given the constraint that we can't
change the hash functions in released branches. Per report from
Joseph Shraibman. Applied to 8.1 and 8.2 only --- HEAD is getting a
cleaner fix, and 8.0 and before use different coding that seems less
vulnerable.

- Fix several hash functions that were taking chintzy shortcuts
instead of delivering a well-randomized hash value. I got religion
on this after observing that performance of multi-batch hash join
degrades terribly if the higher-order bits of hash values aren't
random, as indeed was true for say hashes of small integer values.
It's now expected and documented that hash functions should use
hash_any or some comparable method to ensure that all bits of their
output are about equally random. initdb forced because this change
invalidates existing hash indexes. For the same reason, this isn't
back-patchable; the hash join performance problem will get a
band-aid fix in the back branches.

- The shortcut exit that I recently added to ExecInitIndexScan() for
EXPLAIN-only operation was a little too short; it skipped
initializing the node's result tuple type, which may be needed
depending on what's above the indexscan node. Call
ExecAssignResultTypeFromTL before exiting. (For good luck I moved
up the ExecAssignScanProjectionInfo call as well, so that everything
except indexscan-specific initialization will still be done.) Per
example from Grant Finnemore.

- Change build_index_pathkeys() so that the expressions it builds to
represent index key columns always have the type expected by the
index's associated operators, ie, we add RelabelType nodes when
dealing with binary-compatible index opclasses. This is needed to
get varchar indexes to play nicely with the new EquivalenceClass
machinery, as per recent gripe from Josh Berkus that CVS HEAD was
failing to match a varchar index column to a constant restriction in
the query. It seems likely that this change will allow removal of a
lot of ugly ad-hoc RelabelType-stripping that the planner has
traditionally done while matching expressions to other expressions,
but I'll worry about that some other day.

- Fix overly-strict sanity check in BeginInternalSubTransaction that
made it fail when used in a deferred trigger. Bug goes back to 8.0;
no doubt the reason it hadn't been noticed is that we've been
discouraging use of user-defined constraint triggers. Per report
from Frank van Vugt.

- Make large sequential scans and VACUUMs work in a limited-size
"ring" of buffers, rather than blowing out the whole shared-buffer
arena. Aside from avoiding cache spoliation, this fixes the problem
that VACUUM formerly tended to cause a WAL flush for every page it
modified, because we had it hacked to use only a single buffer.
Those flushes will now occur only once per ring-ful. The exact ring
size, and the threshold for seqscans to switch into the ring usage
pattern, remain under debate; but the infrastructure seems done.
The key bit of infrastructure is a new optional BufferAccessStrategy
object that can be passed to ReadBuffer operations; this replaces
the former StrategyHintVacuum API. This patch also changes the
buffer usage-count methodology a bit: we now advance usage_count
when first pinning a buffer, rather than when last unpinning it. To
preserve the behavior that a buffer's lifetime starts to decrease
when it's released, the clock sweep code is modified to not
decrement usage_count of pinned buffers. Work not done in this
commit: teach GiST and GIN indexes to use the vacuum
BufferAccessStrategy for vacuum-driven fetches. Original patch by
Simon, reworked by Heikki and again by Tom.

- Tweak the code in a couple of places to try to deliver more
user-friendly error messages when a single COPY line is too long for
us to handle. Per example from Johann Spies.

Neil Conway committed:

- Remove incorrect semicolon in example. This was previously fixed in
HEAD only -- backporting to 8.2. Per report from Frank van Vugt.

- Allow leading and trailing whitespace in the input to the boolean
type. Also, add explicit casts between boolean and text/varchar.
Both of these changes are for conformance with SQL:2003. Update the
regression tests, bump the catversion.

- Tweak: use memcpy() in text_time(), rather than manually copying
bytes in a loop.

- Fix a bug in input processing for the "interval" type. Previously,
"microsecond" and "millisecond" units were not considered valid
input by themselves, which caused inputs like "1 millisecond" to be
rejected erroneously. Update the docs, add regression tests, and
backport to 8.2 and 8.1

- mmgr README tweak: "either" is no longer correct. The previous
wording compared PortalContext with QueryContext, but the latter no
longer exists.

- Stop a few regression tests from needlessly disabling GEQO. This was
necessary in 1997, when geqo_threshold did not exist, but it is no
longer needed.

- Code cleanup: use "bool" for Boolean variables, rather than "int".

Michael Meskes committed:

- Applied patch send by Joachim Wieland to fix INTEGER_DATETIMES under
MSVC.

- Applied Joachim Wieland's patch for ecpg_config.h creation on Vista.
Changed variable test to not run into infinite loops on backend
errors.

== Rejected Patches (for now) ==

No one was disappointed this week :-)

== Pending Patches ==

Browse pgsql-announce by date

  From Date Subject
Next Message Abhijit Menon-Sen 2007-06-05 05:28:12 Archiveopteryx 2.0 released
Previous Message Robert Treat 2007-06-02 03:40:17 phpPgAdmin 4.1.2 released