Re: PostgreSQL roadmap for 8.2 and beyond.

From: Rod Taylor <pg(at)rbt(dot)ca>
To: karen hill <karen_hill22(at)yahoo(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PostgreSQL roadmap for 8.2 and beyond.
Date: 2005-10-16 01:16:06
Message-ID: 1129425366.33171.36.camel@home
Lists: pgsql-hackers

On Fri, 2005-10-14 at 09:57 -0700, karen hill wrote:
> Autovacuum is getting put into the 8.1 release which
> is awesome. A lot of us are wondering now that
> PostgreSQL has all the features that many of us need,
> what are the features being planned for future
> releases?

You know, as PostgreSQL becomes more advanced I find the features on my
"wanted" list growing instead of shrinking.

The reason for this is that I use it in wider and more varied
situations.

I am fairly sure there are easily 5 years worth of work remaining at the
current development pace.

> What do you see for 8.2 and beyond? What type of
> features are you devs planning for 9.0? It would be

Here is a summary from the last time this question was asked. That was
around when 8.0 was about to be released, so a small percentage of these
might already be done.

Of course, there is also everything in the TODO list and a large part of
the SQL Specs to be implemented on top of all of the below.

http://www.postgresql.org/docs/faqs.TODO.html
http://www.postgresql.org/docs/8.0/interactive/unsupported-features-sql-standard.html

Dave Fetter:
* optional interface which sends a row typeoid along with each
row in a result set
* more visibility from RULEs into the expression tree generated
by the parser and/or other RULEs
* SQL/MED (or at least things that would make it easier to
implement)
* Debugging hooks into all the PLs
* Some way of estimating a "query progress meter" for
long-running queries
* MULTISET, COLLECT, UNNEST, FUSION, INTERSECT

MERGE! MERGE! MERGE! MERGE! MERGE! MERGE!

Gavin Sherry:
Grouping sets
Recursive queries
Window functions
Updatable views
Updatable cursors
Materialised views
Debug-able PL/PgSQL -- EXPLAIN [ANALYZE] functionality, step
through?
Cost estimation for functions -- perhaps a pipe dream, I know

Performance:

Better bulk load
'Continuous' vacuum at a fraction of the IO cost of normal
vacuum
Multimaster replication
General OLTP throughput improvements -- where and how, I'm not
sure.

Indexes:

Bitmap indexes (as opposed to bitmap scans)

Merlin Moncure:
1. Proper row constructor, such that
select (1,2,1) > (2,1,1);
returns the right answer,
and
select * from t where (t1,t2,t3) > (c1, c2, c3) order by t1,t2,t3 limit 1
returns the right answer and uses an index on t1,t2,t3 if it exists.

this is on the TODO.

2. In the planner, a parameterized limit for prepared statements to
assume a small value (like 1).

3. Ability to create arrays of composite types (and nest them).
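
To illustrate item 1 above: SQL row-wise comparison is lexicographic,
which happens to be how Python compares tuples, so a small Python sketch
can show the answers one would expect. The sample rows and the helper
name next_row_after are invented for the example, and it says nothing
about how the planner would actually use an index.

```python
# Lexicographic (row-wise) comparison, modelled with Python tuples.
# The first differing column decides: 1 < 2, so (1,2,1) > (2,1,1) is false.
assert ((1, 2, 1) > (2, 1, 1)) is False

# Hypothetical table t(t1, t2, t3); the rows are invented sample data.
rows = [(1, 1, 1), (1, 2, 1), (1, 2, 3), (2, 1, 1), (2, 1, 2)]


def next_row_after(rows, key):
    """Emulate: select * from t where (t1,t2,t3) > key
    order by t1,t2,t3 limit 1. Returns None when no row follows key."""
    candidates = sorted(r for r in rows if r > key)
    return candidates[0] if candidates else None


print(next_row_after(rows, (1, 2, 1)))
```

This is the usual "fetch the next row after the last one I saw" pattern,
which is why an index on (t1,t2,t3) matters for it.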

William Zhang:
* Updatable Views per SQL
* INTERVAL data type per SQL
* BLOB/CLOB data type per SQL
* Faster bulk load
* Remove "current transaction is aborted, commands ignored ..."
* Compile with MSVC on Win32 platforms. MySQL supports it.
* Thread-safe libpq, ecpg.

Chris Browne:
- Vacuum Space Map - Maintain a map of recently-expired rows

This allows vacuum to target specific pages for possible free
space without requiring a sequential scan.

- Deferrable unique constraint

- Probably trivially easy would be to add an index to pg_listener

- Tougher but better would be to have pg_listener be an in-memory
structure rather than being physically represented as a table

- MERGE / UPSERT

- Config file "#includes" for postgresql.conf, pg_hba.conf

- Some better ability to terminate backends

- Automatically updatable views (per SQL 99)
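
The Vacuum Space Map item above can be pictured with a toy model. This
is only a sketch of the idea as described there (remember which pages
hold recently-expired rows, then visit just those pages); the class and
method names are hypothetical and do not correspond to anything in the
PostgreSQL source.

```python
from collections import defaultdict


class VacuumSpaceMap:
    """Toy model: remember which pages hold recently-expired rows."""

    def __init__(self):
        # page number -> count of expired rows noted on that page
        self._expired = defaultdict(int)

    def note_expired(self, page_no):
        """Called when a row on page_no expires (update or delete)."""
        self._expired[page_no] += 1

    def pages_to_vacuum(self):
        """Pages worth visiting, in order, instead of a sequential scan."""
        return sorted(self._expired)

    def mark_vacuumed(self, page_no):
        """Forget a page once vacuum has reclaimed its space."""
        self._expired.pop(page_no, None)


vsm = VacuumSpaceMap()
for page in (7, 3, 7, 42):
    vsm.note_expired(page)
targets = vsm.pages_to_vacuum()
print(targets)
```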

Ron Mayer:
Standards stuff:

* Updateable views (easier to use Ruby/Rails's ActiveRecord on
legacy data)
* The elementary OLAP stuff

Contrib related stuff:

* Contrib/xml2 working with XML Namespaces.
* Some sort of GIST index for querying XML data (XPath?
SQL/XML?)

* The array functions and indexes from contrib/intarray
and contrib/intagg made more general to work with other
data types. (I find these contrib modules quite useful)

Annoyances:

* more sane math with intervals. For example, try:
select '0.01 years'::interval, '0.01 months'::interval;

Ease of use:

* Nice defaults for autovacuum and checkpoints and bgwriter
that automatically avoid big I/O spikes by magically
distributing I/O in a nice way.

Easier COPY for client library authors:

* A way to efficiently insert many values like COPY from STDIN
from client libraries that don't support COPY from STDIN.
Perhaps it could happen through the apparently standards-compliant
"INSERT INTO table VALUES (1,2),(3,4),(5,6)" [feature id F641]
or perhaps through a new
COPY tablename FROM STRING 'a big string instead of stdin'
feature that would be easier for clients to support?

It seems in most new client libraries COPY FROM STDIN
stays broken for quite a long time. Would an
alternative COPY FROM A_BIG_STRING be easier for them
to support and therefore available more often?
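
For what the multi-row VALUES form would look like from a client
library, here is a rough Python sketch. It uses the standard library's
sqlite3 module purely as a stand-in, on the assumption that SQLite
accepts the same "VALUES (..),(..)" syntax; the helper name insert_many
and the table "pairs" are made up for the example.

```python
import sqlite3


def insert_many(conn, table, rows):
    """Send all rows in one multi-row INSERT ... VALUES statement.

    The table name is interpolated directly, so it must be trusted.
    """
    width = len(rows[0])
    one_row = "(" + ",".join("?" * width) + ")"
    sql = f"INSERT INTO {table} VALUES " + ",".join([one_row] * len(rows))
    # Flatten [(1, 2), (3, 4)] into [1, 2, 3, 4] to match the placeholders.
    flat = [value for row in rows for value in row]
    conn.execute(sql, flat)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pairs (a INTEGER, b INTEGER)")
insert_many(conn, "pairs", [(1, 2), (3, 4), (5, 6)])
count = conn.execute("SELECT count(*) FROM pairs").fetchone()[0]
print(count)
```

One statement and one round trip for all three rows, without the client
needing any COPY protocol support.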

Meta-stuff

* A failover plus load-balancing (pgpool+slony?)
installer for dummies that handles simple cases.

* A single place to find all the useful non-core stuff
like projects on pgfoundry, gborg, contrib, and
various other places around the net (PL/R PL/Ruby Postgis).
Perhaps if the postgresql website had a small wiki
somewhere where anyone could add links with a short
description to any such projects it'd be easier to
know what's out there...

* Nice APIs and documentation [probably already exists]
to continue encouraging projects like PostGIS and PL/R
that IMHO are the biggest advantage of postgresql over
the commercial vendors' offerings.

Heikki Linnakangas:
* concurrent, partial vacuum that would for example only scan pages
that happen to be in memory
* index-only scans
* database assertions

* lightweight PITR that wouldn't require shutting down and restoring a
backup. I'm thinking something like "REWIND TO xid 12345". It could be
implemented by just setting already-committed transactions as aborted
in the clog (vacuum and commit status hint bits need to be disabled
beforehand). This would be very handy for automatic regression testing
applications. You could load the test database just once, then run a
test case, rewind, run another test case, rewind and so on.

As more disruptive longer-term things:

* multiple alternative access plans for prepared statements. For
example, if you have a query like "SELECT * FROM history WHERE
timestamp BETWEEN ? AND ?", the optimal access plan depends a lot on
the parameters. Postgres could keep all the plans that are optimal for
some combination of parameters, and choose the most efficient one at
execution time depending on the parameters. The execution side would
actually be quite simple to implement. Introduce a new conditional node
type that has > 1 child nodes, and a condition that is evaluated at
execution time and determines which child node to use. Determining the
conditions would require big changes to the planner and estimation
routines.

* support for Tutorial D as an alternative to SQL. It would be great
for educational purposes.
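
The conditional node from the "multiple alternative access plans" item
above might look roughly like this on the execution side. It is a sketch
under stated assumptions: the class names, the two child plans and the
7-day threshold are all hypothetical, and the hard part (having the
planner derive the conditions) is not modelled at all.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class PlanNode:
    """Stand-in for an ordinary plan; just reports what it would run."""
    name: str

    def execute(self, params):
        return f"{self.name} with {params}"


@dataclass
class ConditionalNode:
    """Holds more than one child plan and picks one per execution."""
    branches: List[Tuple[Callable[[tuple], bool], PlanNode]]
    default: PlanNode

    def choose(self, params):
        # Evaluate each condition against the actual parameter values.
        for condition, child in self.branches:
            if condition(params):
                return child
        return self.default

    def execute(self, params):
        return self.choose(params).execute(params)


# Parameters are (start_day, end_day) for the BETWEEN ? AND ? range.
# Assumed rule: narrow ranges favour the index, wide ones a seq scan.
plan = ConditionalNode(
    branches=[(lambda p: p[1] - p[0] <= 7, PlanNode("index scan"))],
    default=PlanNode("seq scan"),
)
print(plan.choose((0, 3)).name)
print(plan.choose((0, 365)).name)
```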

My own wish list:
* Identity/generator support (per standard)
* Merge (update/insert as required)
* Multi-CPU sorts. Take a large single sort like an index
creation and split the work among multiple CPUs.
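
A minimal sketch of the multi-CPU sort idea: split the input into
chunks, sort each chunk in its own worker, then merge the sorted runs.
It is written in Python with the standard library only; the function
name parallel_sort and its parameters are invented for illustration and
imply nothing about how a backend would do it.

```python
import heapq
from concurrent.futures import ProcessPoolExecutor


def parallel_sort(values, workers=4, executor_cls=ProcessPoolExecutor):
    """Sort chunks of values in separate workers, then merge the runs."""
    if len(values) < 2 or workers < 2:
        return sorted(values)
    chunk = -(-len(values) // workers)  # ceiling division
    chunks = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with executor_cls(max_workers=workers) as pool:
        runs = list(pool.map(sorted, chunks))  # one sorted run per chunk
    # k-way merge of the already-sorted runs into a single sorted list
    return list(heapq.merge(*runs))


if __name__ == "__main__":
    import random

    data = [random.randrange(1_000_000) for _ in range(10_000)]
    assert parallel_sort(data) == sorted(data)
```

The merge step is single-threaded here, so the gain comes only from
sorting the chunks concurrently.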
