Re: TODO list comments

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: TODO list comments
Date: 2005-08-26 18:52:21
Message-ID: 200508261852.j7QIqLE22874@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Great updates! Let me comment on each one.

> I made a pass over the TODO list to see what was out of date.
>
> > * Allow administrators to safely terminate individual sessions either
> > via an SQL function or SIGTERM
> >
> > Currently SIGTERM of a backend can lead to lock table corruption.
>
> This comment may be out of date. Suggest
>
> Lock table corruption following SIGTERM of an individual backend
> has been reported in 8.0. A possible cause is fixed in 8.1, but
> it is unknown whether other trouble spots exist. This item is
> mainly a matter of doing adequate testing rather than of writing
> any new code.

Done.

>
> > o Allow postgresql.conf values to be set so they can not be changed
> > by the user
>
> Is that really a good idea? The ones that are unsafe are restricted already.

Well, a typical case would be log_statement, but I see that is
super-user now. I guess we are OK, removed. If we get more problems,
we can re-add something later.

> > * %Remove Money type, add money formatting for decimal type
>
> There's a fair-size contingent that doesn't want Money removed
> completely, but just reimplemented as an I/O wrapper around type
> numeric. Maybe that's even what you mean by the TODO item, but
> it's not clear. Please at least mention the alternative.

Updated:

* Improve the MONEY data type

Change the MONEY data type to use DECIMAL internally, with special
locale-aware output formatting.

> > o %Allow MIN()/MAX() on arrays
>
> This is done.

OK.

> > o Modify array literal representation to handle array index lower bound
> > of other than one
>
> This too.

OK.

>
> > o Add security checking for large objects
> >
> > Currently large objects entries do not have owners. Permissions can
> > only be set at the pg_largeobject table level.
>
> This comment is wrong: trying to set the permissions on pg_largeobject
> would have no effect whatsoever on the lo_xxx functions, so there is not
> even a partial solution available now.

Oh, comment removed.

> > o Auto-delete large objects when referencing row is deleted
>
> This should note that contrib/lo already offers a solution.

Done.

> > * %Have views on temporary tables exist in the temporary namespace
> > * Allow temporary views on non-temporary tables
>
> Both of these are done in 8.1.

OK.

> > * %Allow RULE recompilation
>
> Eh? Perhaps you meant "automatically regenerate cached plans when
> needed", in which case it's redundant with the Dependency Checking
> entries. Whatever it means, this doesn't seem a particularly simple
> item.

Agreed, updated to:

* Allow VIEW/RULE recompilation when the underlying tables change

> > * %Allow TRUNCATE ... CASCADE/RESTRICT
>
> Huh? What would that do?

I assume it is just like DELETE CASCADE, but it TRUNCATES rather than
DELETE. Description added.

> > * Make row-wise comparisons work per SQL spec
>
> This could probably be marked as a % item.

Done.

> > o Currently the system uses the operating system COPY command to
> > create a new database. Add ON COMMIT capability to CREATE TABLE AS
> > SELECT
>
> This seems a bit garbled, and anyway the first part is done.

Yep, garbled. I have removed the first part.

> > o %Add ALTER DOMAIN TYPE
>
> To do what, exactly? This is unclear.

I assume it would allow the underlying data type to be changed. Updated
text:

o Add ALTER DOMAIN to modify the underlying data type

> > o -Allow objects to be moved to different schemas
>
> This is only partly done --- the 8.1 patch didn't cover all object types.

Updated to:

o Add missing object types for ALTER ... SET SCHEMA

> > o %Disallow dropping of an inherited constraint
> > ...
> > o %Prevent child tables from altering constraints like CHECK that were
> > inherited from the parent table
>
> These seem to be duplicates, or at least in need of merging.

Merged and updated:

o %Prevent child tables from altering or dropping constraints
like CHECK that were inherited from the parent table

> > o Handle references to temporary tables that are created, destroyed,
> > then recreated during a session, and EXECUTE is not used
> >
> > This requires the cached PL/PgSQL byte code to be invalidated when
> > an object referenced in the function is changed.
>
> This is redundant with the Dependency Checking item about regenerating
> cached plans.

Removed and description added to dependency item:

* Track dependencies in function bodies and recompile/invalidate

This is particularly important for references to temporary tables
in PL/PgSQL because PL/PgSQL caches query plans. The only workaround
in PL/PgSQL is to use EXECUTE.

> > o Add table function support to pltcl, plperl, plpython?
>
> Isn't this done for plperl?

Right, plperl removed.

> > o Allow PL/pgSQL to name columns by ordinal position, e.g. rec.(3)
>
> This doesn't seem like an amazingly good idea; would prefer to see a way
> to get the column name list and use names dynamically. Numbers have all
> the same problems as "SELECT *" ...

Yep, updated:

o Allow function argument names to be queries from PL/PgSQL

> > o Add MOVE to PL/pgSQL
>
> This should be generalized: upgrade plpgsql cursor support to have all
> the FETCH and MOVE options of the main language.
>
> > o Add support for polymorphic arguments and return types to plperl
>
> I think all the PLs except plpgsql need this.

Updated to:

o Add support for polymorphic arguments and return types to
languages other than PL/PgSQL

> Also, all the PLs except plpgsql are well behind the curve on supporting
> parameter names and OUT parameters. Please add TODO item(s) for these.

OK:

o Add support for OUT and INOUT parameters to languages other
than PL/PgSQL

> > * Allow libpq to access SQLSTATE so pg_ctl can test for connection failure
> >
> > This would be used for checking if the server is up.
>
> Huh? What has SQLSTATE got to do with connection failure checking?

I am confused too. What we have now is this check in pg_ctl.c:

(PQstatus(conn) == CONNECTION_OK ||
(strcmp(PQerrorMessage(conn),
PQnoPasswordSupplied) == 0)))

I assume it is to allow better checking of the password request, but I
am unsure. I will remove the item unless someone else understands it.
Or is it this from libpq, PG_DIAG_SQLSTATE? I think the underlying
problem is that pg_ctl doesn't have a good way to determine if the
server is running based on a failure to connect. Anyway, item removed
unless we get more reports of problems.

> > * Have initdb set DateStyle based on locale?
>
> Is this really a good idea? Being standardized on ISO format seems like
> a good thing to me, and encouraging people to adopt ambiguous formats as
> default a very bad thing. They can do it if they like, certainly, but
> having initdb do it for them just seems like not the direction we want.

True, but we still have to have a default of what do with with input
like 01/02/03. TODO updated:

* Have initdb set the input DateStyle (MDY or DMY) based on locale?

> > * Add a schema option to createlang
>
> This is superseded by events: createlang now puts the functions in
> pg_catalog, and there doesn't seem any particularly good reason to
> want to put them elsewhere.

OK, removed.

> > o Improve psql's handling of multi-line queries
>
> Uh, what's wrong with it? This item seems far too vague.

Later emails in this thread clarify this and I will address it there.

> > o Add pg_dumpall custom format dumps.
> >
> > This is probably best done by combining pg_dump and pg_dumpall
> > into a single binary.
>
> This is probably obsoleted by events, too. Now that we can dump blobs
> in text mode, I see no reason that we ever need to do this.
> pg_restore's only real reason to live is to support selective restore
> (ie, pulling out just a few objects from an existing dump) and I do not
> see that you need that for pg_dumpall dumps.

Hmm, good points. I added a question mark and removed the description.
If we add CSV output format to pg_dump, I can imagine pg_dumpall perhaps
optionally using that too.

> > o Remove unnecessary abstractions in pg_dump source code
>
> Like which?

Uh, we have talked about it. It is the complex function pointer usage
in the code that needs cleaning. Updated:

o Remove unnecessary function pointer abstractions in pg_dump source
code

> > * %Remove CREATE CONSTRAINT TRIGGER
> >
> > This was used in older releases to dump referential integrity
> > constraints.
>
> Do we really want to remove it, and thereby guarantee we can't load
> dumps from those old releases?

OK, TODO item removed, and I saw later discussion that is still useful.

> > * Fetch heap pages matching index entries in sequential order
> >
> > Rather than randomly accessing heap pages based on index entries, mark
> > heap pages needing access in a bitmap and do the lookups in sequential
> > order. Another method would be to sort heap ctids matching the index
> > before accessing the heap rows.
>
> This is done (see bitmap index scans).

OK.

> > * Hash
>
> Why doesn't the hash index section mention the need for WAL support?
> Multicolumn hash indexes might be interesting too.

Added:

o Add WAL logging for crash recovery
o Allow multi-column hash indexes

> > o Pack hash index buckets onto disk pages more efficiently
> >
> > Currently no only one hash bucket can be stored on a page. Ideally
> > several hash buckets could be stored on a single page and greater
> > granularity used for the hash algorithm.
>
> I think that should read "Currently only one ..."

Thanks, fixed.

> > * Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options
>
> This item should probably point out that the optimal settings are
> certainly platform-dependent.

Added, though they are not optimizer per-platform. Rather we use the
best one we find, and assume the _best_ is the same on all platforms.

Updated:

* Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options

Ideally this requires a separate test program that can be run
at initdb time or optionally later.

> > * Improve the background writer
> >
> > Allow the background writer to more efficiently write dirty buffers
> > from the end of the LRU cache and use a clock sweep algorithm to
> > write other dirty buffers to reduced checkpoint I/O
>
> This is done.

OK.

> > * Improve speed with indexes
> >
> > For large table adjustements during vacuum, it is faster to reindex
> > rather than update the index.
>
> This applies only to VACUUM FULL, so it probably needs to be reworded.

Updated.

> > * Reduce lock time by moving tuples with read lock, then write
> > lock and truncate table
>
> Ditto.

Updated.

> > * Auto-vacuum
> >
> > o %Suggest VACUUM FULL if a table is nearly empty
>
> It seems like a fairly bad idea for auto-vacuum to do a VACUUM FULL
> ever, given the locking effects. And how is a background daemon going
> to "suggest" anything? It could write to the postmaster log but it's
> entirely likely the user would never notice.

Well, something in the logs is better than nothing. We already warn
about other server issues like too frequent checkpoints in the server
logs.

New TODO:

o %Issue log message to suggest VACUUM FULL if a table is nearly
empty?

Added a question mark because this clearly needs more thought.

> > * -Improve SMP performance on i386 machines
> >
> > i386-based SMP machines can generate excessive context switching
> > caused by lock failure in high concurrency situations. This may be
> > caused by CPU cache line invalidation inefficiencies.
>
> This isn't really done, I don't think, we just ameliorated what was the
> largest cause of it. It'll be back...

Yep, we can re-add it later once we get a report.
>
> > * Fix priority ordering of read and write light-weight locks (Neil)
>
> I think we concluded that was a bad idea.
>
> > * Add WAL index reliability improvement to non-btree indexes
>
> Oh, here it is. I'd be inclined to put this item in the index section
> instead, so that you can have separate entries for each index type.
> Besides, GIST is done.

Right, I think Hash is the only one left, and you suggested adding that
above. Item removed.

> > * -Use CHECK constraints to influence optimizer decisions
> >
> > CHECK constraints contain information about the distribution of values
> > within the table. This is also useful for implementing subtables where
> > a tables content is distributed across several subtables.
>
> This isn't completely done by any means.

Uh, isn't this constraint_elimination? The only problem is that it
isn't automatic. The remaining part seems useless now that the
optimizer statistics are pretty good. OK?

Added:

* Allow constraint_elimination to be automatically performed

This requires additional code to reduce the performance loss caused by
constraint elimination.

> > * ANALYZE should record a pg_statistic entry for an all-NULL column
>
> This is done.

OK

> > * Remove memory/file descriptor freeing before ereport(ERROR)
> > * Promote debug_query_string into a server-side function current_query()
> > * Allow the identifier length to be increased via a configure option
>
> All of those could probably be marked %.

Done.

> > * Allow cross-compiling by generating the zic database on the target system
> > * Fix cross-compiling of time zone database via 'zic'
>
> These look like duplicates to me ...

Yep, second one removed.

> > o Improve dlerror() reporting string
>
> I think this got done.

OK.

> > o Add support for Unicode
>
> This is done too, though not tested enough.

OK.

Great changes. All applied.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2005-08-26 18:59:52 Any MIPS assembly experts in the house?
Previous Message Junaili Lie 2005-08-26 18:47:58 Re: Call for 7.5 feature completion