ALTER SET DISTINCT vs. Oracle-like DBMS_STATS

From: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: ALTER SET DISTINCT vs. Oracle-like DBMS_STATS
Date: 2009-07-06 10:28:57
Message-ID: 20090706185545.9DF3.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

A new feature "ALTER TABLE ... ALTER COLUMN ... SET DISTINCT" is
submitted to the next commetfest:
http://archives.postgresql.org/message-id/603c8f070905041913r667b3f32oa068d758ba5f17e1@mail.gmail.com

but I have another approach for the plan stability issues. It might conflict
ALTER SET DISTINCT patch in terms of duplicated functionality, so I want to
discuss them.

It is just similar to Oracle's DBMS_STATS package.
http://download.oracle.com/docs/cd/B28359_01/appdev.111/b28419/d_stats.htm
If it were, "ALTER TABLE tablename ALTER COLUMN 3rd-column SET DISITNCT 100"
could be written as:

INSERT INTO dbms_stats.columns(starelid, ataattnum, stadistinct)
VALUES ('tablename'::regclass, 3, 100);

Of course ALTER SET DISTINCT is easy-to-use, so it could be an alias for
the above INSERT command.

The "DBMS_STATS for Postgres" is based on new statstics hooks in 8.4 --
get_relation_info_hook, get_attavgwidth_hook, get_relation_stats_hook
and get_index_stats_hook. The module has dbms_stats.relations and
dbms_stats.columns tables and hides pg_class and pg_statistics when enabled.
So, if once you set a value to dbms_stats.columns.stadistinct, the value
hides pg_statistics.stadistinct and planner always uses it for planning.

You can modify statistics of your tables by inserting values directly
to relations and columns tables. Also lock() or unlock() functions
are useful to use a bit customized stats based on existing values.

- TABLE dbms_stats.relations : hide pg_class.relpages, reltuples.
- TABLE dbms_stats.columns : hide pg_statistic.
- FUNCTION dbms_stats.lock() : copy pg_class and pg_statistic to the above tables.
- FUNCTION dbms_stats.unlock(): delete some rows from the above tables.

The module also supports backup-statstics feature.

- TABLE dbms_stats.backup, relations_backup, columns_backup
- FUNCTION dbms_stats.backup() : backup statistics to the above tables.
- FUNCTION dbms_stats.restore() : restore statistics from
- FUNCTION dbms_stats.export() : export statistics to external text file.
- FUNCTION dbms_stats.import() : import statistics from external text file.

If acceptable, I'd like to submit DBMS_STATS for Postgres module
to September commitfest. I'm not sure the feature should be in core,
in contrib, or in pgFoundry... Comments welcome.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2009-07-06 10:56:41 Re: WIP: generalized index constraints
Previous Message Simon Riggs 2009-07-06 10:25:59 Re: FYI: fdatasync vs sync_file_range