PostgreSQL Weekly News - March 25 2007

The deadline for Summer of Code projects has been extended to Monday,
March 26th.  Get your submissions in!

Pavel Stehule has published his Czech language training materials.

== PostgreSQL Product News ==

Devrim GUNDUZ pushed postgresql-dbi-link to Fedora and Red Hat
Enterprise Linux Extras (EPEL).  This is the first PostgreSQL-related
package in EL.

PostgreSQL Maestro 7.3 released.

GT portalBase released.

phpPgAdmin 4.1.1 released.

Slony-I 1.1.9 and 1.2.9 are out.  Schedule those upgrades :)

Sparsegraph 0.1 released.

== PostgreSQL Jobs for March ==

== PostgreSQL Local ==

Everything this week was global.

== PostgreSQL in the News ==

Planet PostgreSQL:

General Bits, Archives and occasional new articles:

PostgreSQL Weekly News is brought to you this week by David Fetter,
Devrim GUNDUZ and Robert Treat.

To get your submission into the upcoming issue, make sure it arrives
at david(at)fetter(dot)org or in German at pwn(at)pgug(dot)de by Sunday at 3:00pm
Pacific Time.

== Applied Patches == 

Tatsuo Ishii committed:

- Add new encoding EUC_JIS_2004 and SHIFT_JIS_2004, along with new
  conversions among EUC_JIS_2004, SHIFT_JIS_2004 and UTF-8.  Bumped
  catalog version.

- Allow 4-byte UTF-8 (UCS-4 range 00010000-001FFFFF). This is
  necessary to support JIS X 0213 <--> UTF-8 conversion.
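
As a rough illustration of the 4-byte form mentioned above (this is
not the committed code), the sketch below encodes a single code point
as UTF-8.  The name utf8_encode is made up for this example.  The
4-byte branch covers 00010000-001FFFFF, the range quoted in the commit
message; surrogate values are not rejected, to keep the sketch short.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Encode one code point as UTF-8 into out[].
 * Returns the number of bytes written (1 to 4), or 0 if the value is
 * larger than 0x1FFFFF and so does not fit in the 4-byte form.
 * utf8_encode is a hypothetical name, not a PostgreSQL function.
 */
size_t
utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80)
    {
        /* 1 byte: 0xxxxxxx */
        out[0] = (unsigned char) cp;
        return 1;
    }
    if (cp < 0x800)
    {
        /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char) (0xC0 | (cp >> 6));
        out[1] = (unsigned char) (0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000)
    {
        /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char) (0xE0 | (cp >> 12));
        out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char) (0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x1FFFFF)
    {
        /* 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char) (0xF0 | (cp >> 18));
        out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char) (0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

For example, code point 0x10000 comes out as the bytes F0 90 80 80.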

Alvaro Herrera committed:

- Set the node properly, per Tom.

- Separate the code to start a new worker into its own function.  The
  code is exactly the same, modulo whitespace.

- Separate fetch of pg_autovacuum tuple into its own function.

- We no longer need to palloc the VacuumStmt node; keeping it on the
  stack is simpler.

- Remove the currently unused FRONTEND case in dllist.c.  This allows
  the usage of palloc instead of malloc, which means a list can be
  freed simply by deleting the memory context that contains it.

Teodor Sigaev committed:

- In contrib/tsearch2/wordparser/parser.c, fix a parser bug on Windows
  with UTF8 encoding and C locale.  The cause was sizeof(wchar_t)
  being 2 instead of 4.

Jan Wieck committed:

- Add three new regexp functions: regexp_matches,
  regexp_split_to_array, and regexp_split_to_table. These functions
  provide access to the capture groups resulting from a POSIX regular
  expression match, and provide the ability to split a string on a
  POSIX regular expression, respectively. Patch from Jeremy Drake;
  code review by Neil Conway, additional comments and suggestions from
  Tom and Peter E.  This patch bumps the catversion, adds some
  regression tests, and updates the docs.

- Bumping catversion due to changes to pg_trigger and pg_rewrite.

- Change pg_trigger and extend pg_rewrite in order to allow triggers
  and rules to be defined with different, per session controllable,
  behaviors for replication purposes.
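
To give a feel for the two ideas behind the new regexp functions
(reading capture groups, and splitting on a pattern), here is a small
sketch using the C library's POSIX regcomp/regexec.  It does not use
PostgreSQL's own regex engine or the new SQL functions, and the names
capture_group and split_count are invented for this example.  Anchors
and empty matches are handled only crudely.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

#define MAX_GROUPS 10

/*
 * Copy capture group `group` of the first match of `pattern` in
 * `input` into `out` (truncated to outlen - 1 bytes, NUL-terminated).
 * Returns 0 on success, or -1 if the pattern is invalid, nothing
 * matches, or the group took no part in the match.
 */
int
capture_group(const char *pattern, const char *input, size_t group,
              char *out, size_t outlen)
{
    regex_t     re;
    regmatch_t  m[MAX_GROUPS];
    int         rc = -1;

    if (group >= MAX_GROUPS || outlen == 0)
        return -1;
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    if (regexec(&re, input, MAX_GROUPS, m, 0) == 0 &&
        m[group].rm_so != -1)
    {
        size_t      len = (size_t) (m[group].rm_eo - m[group].rm_so);

        if (len >= outlen)
            len = outlen - 1;
        memcpy(out, input + m[group].rm_so, len);
        out[len] = '\0';
        rc = 0;
    }
    regfree(&re);
    return rc;
}

/*
 * Count the pieces obtained by splitting `input` at every match of
 * `pattern`.  Returns -1 if the pattern is invalid.
 */
int
split_count(const char *pattern, const char *input)
{
    regex_t     re;
    regmatch_t  m;
    int         pieces = 1;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    while (regexec(&re, input, 1, &m, 0) == 0)
    {
        if (m.rm_eo == 0)
            break;              /* empty match at the start: stop */
        pieces++;
        input += m.rm_eo;       /* continue after the separator */
    }
    regfree(&re);
    return pieces;
}
```

With the pattern "([a-z]+)@([a-z.]+)" and the input
"mail bob@example.org now", group 2 is "example.org"; splitting
"a, b,c" on ",[ ]*" yields three pieces.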

Tom Lane committed:

- Seems some people have been forgetting to run autoheader.

- Add -lcrypto as one of the possible link dependencies of libkrb5.
  Per report from Jim Rosenberg.  This possibly should get
  back-patched, but I'm a bit suspicious of it still because of the
  lack of prior reports.

- Fix plancache's invalidation callback to do the right thing for a SI
  reset event, namely invalidate everything.  This oversight probably
  explains the rare failures that some buildfarm machines have been
  showing for the plancache regression test.

- Make _SPI_execute_plan pass the query source string down to
  ProcessUtility if possible.  I had left this undone in the first
  pass at the API change for ProcessUtility, but forgot to revisit it
  after the plancache changes made it possible to do it.

- Remove the prohibition on executing cursor commands through
  SPI_execute.  Vadim had included this restriction in the original
  design of the SPI code, but I'm darned if I can see a reason for it.
  I left the macro definition of SPI_ERROR_CURSOR in place, so as not
  to needlessly break any SPI callers that are checking for it, but
  that code will never actually be returned anymore.

- Clean up the representation of special snapshots by including a
  "method pointer" in every Snapshot struct.  This allows removal of
  the case-by-case tests in HeapTupleSatisfiesVisibility, which should
  make it a bit faster (I didn't try any performance tests though).
  More importantly, we are no longer violating portable C practices by
  assuming that small integers are distinct from all pointer values,
  and HeapTupleSatisfiesDirty no longer has a non-reentrant API
  involving side-effects on a global variable.  There were a couple of
  places calling HeapTupleSatisfiesXXX routines directly rather than
  through the HeapTupleSatisfiesVisibility macro.  Since these places
  had to be changed anyway, I chose to make them go through the macro
  for uniformity.  Along the way I renamed HeapTupleSatisfiesSnapshot
  to HeapTupleSatisfiesMVCC to emphasize that it's only used with
  MVCC-type snapshots.  I was sorely tempted to rename
  HeapTupleSatisfiesVisibility to HeapTupleSatisfiesSnapshot, but
  forbore for the moment to avoid confusion and reduce the likelihood
  that this patch breaks some of the pending patches.  Might want to
  reconsider doing that later.

- Fix broken markup.

- Adjust DatumGetBool macro so that it isn't fooled by garbage in the
  Datum to the left of the actual bool value.  While in most cases
  there won't be any, our support for old-style user-defined functions
  violates the C spec to the extent of calling functions that might
  return char or short through a function pointer declared to return
  "char *", which we then coerce to Datum.  It is not surprising that
  the result might contain garbage high-order bits ... what is
  surprising is that we didn't see such cases long ago.  Per report
  from Magnus.

- Fix plancache so that any required replanning is done with the same
  search_path that was active when the plan was first made.  To do
  this, improve namespace.c to support a stack of "override" search
  path settings (we must have a stack since nested replan events are
  entirely possible).  This facility replaces the "special namespace"
  hack formerly used by CREATE SCHEMA, and should be able to support
  per-function search path settings as well.

- Arrange for PreventTransactionChain to reject commands submitted as
  part of a multi-statement simple-Query message.  This bug goes all
  the way back, but unfortunately is not nearly so easy to fix in
  existing releases; it is only the recent ProcessUtility API change
  that makes it fixable in HEAD.  Per report from William Garrison.  

- Allow DROP TABLESPACE to succeed (with a warning) if the pg_tblspc
  symlink doesn't exist.  This allows DROP to be used to clean out the
  pg_tablespace catalog entry in a situation where a previous DROP
  attempt failed before committing but after having removed the
  directories and symlink.  Per report from William Garrison.  Even
  though his test case depends on an unrelated bug in
  PreventTransactionChain, it's certainly possible for this situation
  to arise due to other problems, eg a system crash at just the right

- Fix some problems with selectivity estimation for partial indexes.
  First, genericcostestimate() was being way too liberal about
  including partial-index conditions in its selectivity estimate,
  resulting in substantial underestimates for situations such as an
  indexqual "x = 42" used with an index on x "WHERE x >= 40 AND x <
  50".  While the code is intentionally set up to favor selecting
  partial indexes when available, this was too much...  Second,
  choose_bitmap_and() was likewise easily fooled by cases of this
  type, since it would similarly think that the partial index had
  selectivity independent of the indexqual.  Fixed by using
  predicate_implied_by() rather than simple equality checks to
  determine redundancy.  This is a good deal more expensive but I
  don't see much alternative.  At least the extra cost is only paid
  when there's actually a partial index under consideration.  Per
  report from Jeff Davis.  I'm not going to risk back-patching this,

- Further buildfarm experience shows that actually we can't run the
  plancache test in parallel with the rules test at all, because the
  former wants to create a couple of temp views, which can sometimes
  show up in the latter's output.  Let's try it in the next parallel
  group instead.

- Fix 8.2 breakage of domains over array types, and add a regression
  test case to cover it.  Per report from Anton Pikhteryev.
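
The DatumGetBool adjustment in Tom Lane's list above is easy to
picture with a toy model.  The sketch below is not the PostgreSQL
macro: Datum is a simplified stand-in type, and the macro and function
names are invented here.  It only shows why testing the whole word
goes wrong when bits above the bool value hold garbage, and how
looking at the low-order byte alone avoids that.

```c
#include <stdint.h>

/* Simplified stand-in for a pointer-sized Datum (an assumption). */
typedef uintptr_t Datum;

/* Tests the whole word: garbage in the high-order bits reads as true. */
#define DATUM_GET_BOOL_NAIVE(d)   ((d) != 0)

/* Looks only at the low-order byte, where the bool value lives. */
#define DATUM_GET_BOOL_MASKED(d)  (((unsigned char) (d)) != 0)

/*
 * Build a Datum whose low byte is `value` and whose higher bits hold
 * leftover junk, as might happen when a function returning char is
 * called through a pointer declared to return a wider type.
 */
Datum
garbage_datum(unsigned char value)
{
    return ((Datum) 0xABCD00) | (Datum) value;
}
```

Here DATUM_GET_BOOL_NAIVE(garbage_datum(0)) is true even though the
stored bool is false, while DATUM_GET_BOOL_MASKED(garbage_datum(0))
is false.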

Bruce Momjian committed:

- Add to TODO: "Allow BEFORE INSERT triggers on views."

- Add to TODO: "Add more logical syntax CLUSTER table ORDER BY index;
  support current syntax for backward compatibility."

- Document that LDAP URLs should be double-quoted in pg_hba.conf
  because commas are often present in the URL.  Backpatch to 8.2.X.

- Cleanup for procarray.c.

- Add to TODO: "Fix cases where invalid byte encodings are accepted by
  the database, but throw an error on SELECT."

- In pgsql/src/bin/pg_dump/pg_dump.c, add comment that pg_dump
  'append' format is used only by pg_dump, per Dave Page.

- Add to TODO in the CLUSTER section: "Add VERBOSE option to report
  tables as they are processed, like VACUUM VERBOSE."

- Nikolay Samokhvalov's version of xmlpath().

- In pg_dump, change strcasecmp to pg_strcasecmp.

- Nikolay Samokhvalov's patch which adds xmlpath() to evaluate XPath
  expressions, with namespaces support.

- Allow the pgstat process to restart immediately after receiving a
  SIGQUIT signal, rather than waiting for PGSTAT_RESTART_INTERVAL.

- Properly enforce pg_dump -F format options; only single letters or
  full words are supported, per report from Mark Stosberg.

- Remove tabs from SGML files.

- Add to TODO: "During index creation, pre-sort the tuples to improve
  build speed."

- Remove TODO item, not wanted: "Add NUMERIC division operator that
  doesn't round?"

- Add URL for TODO: "Add locale-aware MONEY type, and support multiple

- Add URL for TODO: "Allow accurate statistics to be collected on
  indexes with more than one column or expression indexes, perhaps
  using per-index statistics."

- In FAQ, reference upgrade info via URL.

- In FAQ_DEV, remove last line of patch license, per Andreas

- Add URL for TODO: "Simplify ability to create partitioned tables."

- Add to TODO: Allow sequential scans to take advantage of other
  concurrent sequential scans, also called "Synchronised Scanning"

Magnus Hagander committed:

- Add support for installing NLS files, and update support to use
  gettext from gnuwin32.

- In pgsql/src/tools/msvc/, install contrib sql and readme

- In pgsql/src/tools/msvc/, properly parse the name of
  contrib modules that aren't named the same way as their directory
  (notably xml2/pgxml and intarray/_int).

- Forgot commit: support for special-cases in pgcrypto in

- Add support for running contribcheck on MSVC.

- Make the MSVC build generate SQL files for /contrib based on

- In pgsql/src/tools/msvc/, add define to exclude
  configured libraries, to be able to easily build a stripped down
  version of libpq.  To be used by the installer.

- Remove stray headers for old sysv shmem emulation from
  pgsql/src/include/port/win32.h.  Also remove headers for old sysv
  semaphore emulation that were forgotten when that was changed about
  a year ago.

- Added file needed for PL regression test to pgsql/src/tools/msvc/

- Add documentation about vcregress.

- In pgsql/src/tools/msvc/vcregress.bat, add support for running
  regression tests on procedural languages.

- Properly return the exit code when regression tests fail.

- Native shared memory implementation for win32.  Uses same underlying
  tech as before, but not the sysv emulation layer.

- In pgsql/src/tools/msvc/, change ecpglib to require
  libpgport, per Andrew Dunstan.

== Rejected Patches (for now) ==

Zdenek Kotala's patch to allow people to switch from the default
built-in timezone to another timezone source, typically the OS
timezone location.  This was considered too invasive compared to the
alternative of removing the old file and symlinking to a new one.

Bruce Momjian's patch to improve how quickly VACUUM can expire rows.
It missed some important cases including CURSORs.

== Pending Patches ==

Magnus Hagander sent in a patch that lets people pull some stats out
of the bgwriter in order to track things.

Heikki Linnakangas sent in a patch intended to make CLUSTER an
MVCC-safe operation.

Jeff Davis sent in another revision of his Synchronized Scan patch,
this time reporting every 16 pages.

ITAGAKI Takahiro sent in another version of his Load distributed
checkpoint patch.

Pavan Deolasee sent in another revision of his HOT WIP patch.

Alvaro Herrera sent in a WIP patch to implement multiple concurrent
workers in autovacuum.

