== PostgreSQL Weekly News - June 03 2007 ==

From: David Fetter <david(at)fetter(dot)org>
To: PostgreSQL Announce <pgsql-announce(at)postgresql(dot)org>
Subject: == PostgreSQL Weekly News - June 03 2007 ==
Date: 2007-06-04 05:56:39
Message-ID:
Lists: pgsql-announce

Lively discussions continue on the many new features going into 8.3.

== PostgreSQL Product News ==

Continuent uni/cluster for PostgreSQL 2007 will be available June 4.

GNUmed released.

phpPgAdmin 4.1.2 released.  Upgrade ASAP.

== PostgreSQL Jobs for June ==

== PostgreSQL Local ==

Registration for PGDay in Prato, Tuscany, Italy on July 6 and 7 is

Some Important URLs:

PGDay's web site:
Sponsorship campaign:
How to get to Prato:
Accommodations in Prato:
PostgreSQL official T-shirts
Sign up to tour Tuscany:

Important dates:

May 31: Deadline for the call for papers
June 5: Special accommodation rates expire for Hotel San Marco and Art Hotel Milano

== PostgreSQL in the News ==

Planet PostgreSQL:

General Bits, Archives and occasional new articles:

PostgreSQL Weekly News is brought to you this week by David Fetter.

To get your submission into the upcoming issue, make sure it arrives
at david(at)fetter(dot)org or in German at pwn(at)pgug(dot)de by Sunday at 3:00pm
Pacific Time.

== Applied Patches == 

Andrew Dunstan committed:

- Improve efficiency of LIKE/ILIKE code, especially for multi-byte
  charsets, and most especially for UTF8. Remove unnecessary special
  cases for bytea processing and single-byte charset ILIKE.  a ILIKE b
  is now processed as lower(a) LIKE lower(b) in all cases. The code is
  now considerably simpler. All comparisons are now performed
  byte-wise, and the text and pattern are also advanced byte-wise
  where it is safe to do so - essentially where a wildcard is not
  being matched.  Andrew Dunstan, from an original patch by ITAGAKI
  Takahiro, with ideas from Tom Lane and Mark Mielke.
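
As a rough illustration of the change described above, here is a minimal
Python sketch of LIKE matching, with ILIKE defined as lower(a) LIKE
lower(b).  The function names like() and ilike() are hypothetical.  The
sketch compares Python characters rather than bytes, ignores escape
characters, and is not the PostgreSQL implementation.

```python
def like(text: str, pattern: str) -> bool:
    """Match text against a SQL-style pattern.

    '%' matches any run of characters (including none) and '_' matches
    exactly one character.  Escape characters are not handled.
    """
    t = 0          # current position in text
    p = 0          # current position in pattern
    star_p = -1    # pattern position just after the most recent '%'
    star_t = 0     # text position where that '%' began matching

    while t < len(text):
        if p < len(pattern) and pattern[p] == "%":
            # Remember the wildcard and first try matching it to nothing.
            star_p = p + 1
            star_t = t
            p += 1
        elif p < len(pattern) and (pattern[p] == "_" or pattern[p] == text[t]):
            # '_' or an equal literal character: advance both sides.
            t += 1
            p += 1
        elif star_p != -1:
            # Mismatch: let the last '%' absorb one more character.
            star_t += 1
            t = star_t
            p = star_p
        else:
            return False

    # Any pattern left over must consist only of '%'.
    while p < len(pattern) and pattern[p] == "%":
        p += 1
    return p == len(pattern)


def ilike(text: str, pattern: str) -> bool:
    """Case-insensitive match: lower both sides, then reuse like()."""
    return like(text.lower(), pattern.lower())
```

With this definition, ilike("HeLLo", "h%O") and like("hello", "h_llo")
both succeed, while like("hello", "h_l") fails because the pattern
ends before the text does.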

Teodor Sigaev committed:

- Replace ReadBuffer with ReadBufferWithStrategy in all vacuum-involved
  places to implement a limited-size "ring" of buffers for VACUUM for

Peter Eisentraut committed:

- Clarify some error messages about duplicate things.

- Make some messages more consistent.

- Downgrade some low-level startup messages to DEBUG1.

Bruce Momjian committed:

- Remove description for now-complete TODO item: "-Add a GUC variable
  to control the tablespace for temporary objects and sort files."

- Updated TODO: "Allow free-behind capability for large sequential
  scans to avoid kernel cache spoiling."

- Update wording and add URL for TODO: "Research self-referential
  UPDATEs that see inconsistent row versions in read-committed mode."

- Wording improvement in FAQ_DEV.

- Update FAQ_DEV URL to output for text format.

- Add URL for code comments to developer's FAQ.

- Update TODO to read: "Consider allowing 64-bit integers and floats
  to be passed by value on 64-bit platforms.  Also change 32-bit floats
  (float4) to be passed by value at the same time."

- Add to TODO: "Consider allowing 64-bit integers to be passed by
  value on 64-bit platforms."

- Les Hill's patch which adds standard error redirection for OS/X &
  darwin startup script.

- Guillaume Cottenceau's patch which updates documentation to mention
  VACUUM FULL and CLUSTER where appropriate.

- Add URL for TODO: "Improve speed with indexes."

- Jim Nasby's patch which adds a documentation reference to
  statistical functions from func.sgml.

- Mark Cotner's patch to update /contrib OS/X startup files, and move
  to a separate OS/X directory.

- Update cvsutils documentation description.

- David Fetter's patch to update cvs instructions to suggest cvsutils.

- Fix trivial misspelling in comment.

- Add to TODO: "Fix self-referential UPDATEs that see inconsistent row
  versions in read-committed mode."

Tom Lane committed:

- Create a GUC parameter temp_tablespaces that allows selection of the
  tablespace(s) in which to store temp tables and temporary files.
  This is a list to allow spreading the load across multiple
  tablespaces (a random list element is chosen each time a temp object
  is to be created).  Temp files are not stored in per-database
  pgsql_tmp/ directories anymore, but per-tablespace directories.
  Jaime Casanova and Albert Cervera, with review by Bernd Helmle and
  Tom Lane.

- Fix erroneous error reporting for overlength input in text_date(),
  text_time(), and text_timetz().  7.4-vintage bug found by Greg

- Fix aboriginal bug in BufFileDumpBuffer that would cause it to write
  the wrong data when dumping a bufferload that crosses a
  component-file boundary.  This probably has not been seen in the
  wild because (a) component files are normally 1GB apiece and (b)
  non-block-aligned buffer usage is relatively rare.  But it's fairly
  easy to reproduce a problem if one reduces RELSEG_SIZE in a test
  build.  Kudos to Kurt Harriman for spotting the bug.

- Make CREATE/DROP/RENAME DATABASE wait a little bit to see if other
  backends will exit before failing because of conflicting DB usage.
  Per discussion, this seems a good idea to help mask the fact that
  backend exit takes nonzero time.  Remove a couple of
  thereby-obsoleted sleeps in contrib and PL regression test

- Buy back some of the cycles spent in more-expensive hash functions
  by selecting power-of-2, rather than prime, numbers of buckets in
  hash joins.  If the hash functions are doing their jobs properly by
  making all hash bits equally random, this is good enough, and it
  saves expensive integer division and modulus operations.

- Fix performance problems in multi-batch hash joins by ensuring that
  we select a well-randomized batch number even when given a
  poorly-randomized hash value.  This is a bit inefficient but seems
  the only practical solution given the constraint that we can't
  change the hash functions in released branches.  Per report from
  Joseph Shraibman.  Applied to 8.1 and 8.2 only --- HEAD is getting a
  cleaner fix, and 8.0 and before use different coding that seems less

- Fix several hash functions that were taking chintzy shortcuts
  instead of delivering a well-randomized hash value.  I got religion
  on this after observing that performance of multi-batch hash join
  degrades terribly if the higher-order bits of hash values aren't
  random, as indeed was true for say hashes of small integer values.
  It's now expected and documented that hash functions should use
  hash_any or some comparable method to ensure that all bits of their
  output are about equally random.  initdb forced because this change
  invalidates existing hash indexes.  For the same reason, this isn't
  back-patchable; the hash join performance problem will get a
  band-aid fix in the back branches.

- The shortcut exit that I recently added to ExecInitIndexScan() for
  EXPLAIN-only operation was a little too short; it skipped
  initializing the node's result tuple type, which may be needed
  depending on what's above the indexscan node.  Call
  ExecAssignResultTypeFromTL before exiting.  (For good luck I moved
  up the ExecAssignScanProjectionInfo call as well, so that everything
  except indexscan-specific initialization will still be done.) Per
  example from Grant Finnemore.

- Change build_index_pathkeys() so that the expressions it builds to
  represent index key columns always have the type expected by the
  index's associated operators, ie, we add RelabelType nodes when
  dealing with binary-compatible index opclasses.  This is needed to
  get varchar indexes to play nicely with the new EquivalenceClass
  machinery, as per recent gripe from Josh Berkus that CVS HEAD was
  failing to match a varchar index column to a constant restriction in
  the query.  It seems likely that this change will allow removal of a
  lot of ugly ad-hoc RelabelType-stripping that the planner has
  traditionally done while matching expressions to other expressions,
  but I'll worry about that some other day.

- Fix overly-strict sanity check in BeginInternalSubTransaction that
  made it fail when used in a deferred trigger.  Bug goes back to 8.0;
  no doubt the reason it hadn't been noticed is that we've been
  discouraging use of user-defined constraint triggers.  Per report
  from Frank van Vugt.

- Make large sequential scans and VACUUMs work in a limited-size
  "ring" of buffers, rather than blowing out the whole shared-buffer
  arena.  Aside from avoiding cache spoliation, this fixes the problem
  that VACUUM formerly tended to cause a WAL flush for every page it
  modified, because we had it hacked to use only a single buffer.
  Those flushes will now occur only once per ring-ful.  The exact ring
  size, and the threshold for seqscans to switch into the ring usage
  pattern, remain under debate; but the infrastructure seems done.
  The key bit of infrastructure is a new optional BufferAccessStrategy
  object that can be passed to ReadBuffer operations; this replaces
  the former StrategyHintVacuum API.  This patch also changes the
  buffer usage-count methodology a bit: we now advance usage_count
  when first pinning a buffer, rather than when last unpinning it.  To
  preserve the behavior that a buffer's lifetime starts to decrease
  when it's released, the clock sweep code is modified to not
  decrement usage_count of pinned buffers.  Work not done in this
  commit: teach GiST and GIN indexes to use the vacuum
  BufferAccessStrategy for vacuum-driven fetches.  Original patch by
  Simon, reworked by Heikki and again by Tom.

- Tweak the code in a couple of places to try to deliver more
  user-friendly error messages when a single COPY line is too long for
  us to handle.  Per example from Johann Spies.
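
The power-of-2 bucket change in the hash join item above can be
illustrated with a small Python sketch.  The function names are
hypothetical and this is not the executor's code.  It only shows that
when the bucket count is a power of 2, a bit mask selects the same
bucket as a modulus would, without a division.

```python
def next_power_of_two(n: int) -> int:
    """Return the smallest power of two that is >= n (n must be positive)."""
    if n < 1:
        raise ValueError("n must be positive")
    return 1 << (n - 1).bit_length()


def bucket_by_mask(hash_value: int, nbuckets: int) -> int:
    """Pick a bucket with a bit mask; nbuckets must be a power of two."""
    if nbuckets < 1 or nbuckets & (nbuckets - 1):
        raise ValueError("nbuckets must be a power of two")
    # Keeps only the low-order bits of the hash value.
    return hash_value & (nbuckets - 1)


def bucket_by_modulus(hash_value: int, nbuckets: int) -> int:
    """Pick a bucket with a modulus; works for any positive nbuckets."""
    return hash_value % nbuckets
```

For a non-negative hash value h and n = next_power_of_two(1000),
bucket_by_mask(h, n) and bucket_by_modulus(h, n) agree.  The mask looks
only at the low-order bits, so it relies on the hash function
randomizing those bits well.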

Neil Conway committed:

- Remove incorrect semicolon in example. This was previously fixed in
  HEAD only -- backporting to 8.2. Per report from Frank van Vugt.

- Allow leading and trailing whitespace in the input to the boolean
  type. Also, add explicit casts between boolean and text/varchar.
  Both of these changes are for conformance with SQL:2003.  Update the
  regression tests, bump the catversion.

- Tweak: use memcpy() in text_time(), rather than manually copying
  bytes in a loop.

- Fix a bug in input processing for the "interval" type. Previously,
  "microsecond" and "millisecond" units were not considered valid
  input by themselves, which caused inputs like "1 millisecond" to be
  rejected erroneously.  Update the docs, add regression tests, and
  backport to 8.2 and 8.1.

- mmgr README tweak: "either" is no longer correct. The previous
  wording compared PortalContext with QueryContext, but the latter no
  longer exists.

- Stop a few regression tests from needlessly disabling GEQO. This was
  necessary in 1997, when geqo_threshold did not exist, but it is no
  longer needed.

- Code cleanup: use "bool" for Boolean variables, rather than "int".

Michael Meskes committed:

- Applied patch sent by Joachim Wieland to fix INTEGER_DATETIMES under

- Applied Joachim Wieland's patch for ecpg_config.h creation on Vista.
  Changed variable test to not run into infinite loops on backend

== Rejected Patches (for now) ==

No one was disappointed this week :-)

== Pending Patches ==

