ANALYZE patch for review

From: "Mark Cave-Ayland" <m(dot)cave-ayland(at)webbased(dot)co(dot)uk>
To: <pgsql-patches(at)postgresql(dot)org>
Subject: ANALYZE patch for review
Date: 2004-01-23 16:54:06
Message-ID: 8F4A22E017460A458DB7BBAB65CA6AE512D0F6@openmanage
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches

Hi everyone,

Here is a first attempt at a patch to allow a customised ANALYZE
function to be specified for user-defined types, relating to the
following two threads:

http://archives.postgresql.org/pgsql-hackers/2003-10/msg00113.php and
http://archives.postgresql.org/pgsql-hackers/2004-01/msg00236.php

This is my first attempt at a patch for Postgres so I don't expect it to
get applied without significant review. Included along with this email
is a sample datatype that can be used to see that the ANALYZE function
is being called by inserting a fixed value into the pg_statistic table.

Some points:

1) This patch does nothing other than provide a mechanism for
the user to place
their own values in the pg_statistic table. Currently I am still
trying to complete
the loop by finding out how and in what format a GiST
selectivity function has
access to this data.

The user-defined function uses a new STATISTICS_KIND_CUSTOM
constant to hold the data as I am not 100% sure how the planner
interprets the
STATISTIC_KIND_* values in the pg_statistic table. My aim is to
be able to store
a 2D histogram so that spatial row count estimates can be used
by the planner
in PostGIS (http://postgis.refractions.net) without having to
maintain the
statistics by manually running a stored procedure. The
pg_statistic.h file doesn't
seem clear to me as to what the values of the various columns
should be when not
dealing with single one-dimensional histograms.....

2) All existing types are assigned the default analyze function
called
pg_analyze_alg_default().

3) The function is called within examine_attribute() to setup a
valid VacAttrStats
function if the column is analyzable and point to the algorithm
function itself
which currently uses the existing API of compute_scalar_stats()
and compute_minimal_stats().

4) The VacAttrStats structure may need to be moved out to a
different .h file
for access by the programmer (currently it is in
commands/vacuum.h) - maybe a
new analyze.h would be better?

5) Currently no thought has been given to providing statistics
on functional
indexes (see
http://archives.postgresql.org/pgsql-general/2004-01/msg00978.php)

In other words, the basic work is done but there is still some way to go
before it is complete. Hopefully by putting this out now other
pgsql-hackers will get the chance to review and refine the patch
(particularly in the area of functional indexes) so this
feature can make it out into 7.5.

Many thanks to the hackers who have spent their time helping me get this
far (particularly Tom).

Mark.

---

Mark Cave-Ayland
Webbased Ltd.
Tamar Science Park
Derriford
Plymouth
PL6 8BX
England

Tel: +44 (0)1752 764445
Fax: +44 (0)1752 764446

This email and any attachments are confidential to the intended
recipient and may also be privileged. If you are not the intended
recipient please delete it from your system and notify the sender. You
should not copy it or use it for any purpose nor disclose or distribute
its contents to any other person.

Attachment Content-Type Size
mytype.sql application/octet-stream 558 bytes
analyze.patch application/octet-stream 81.9 KB
Makefile application/octet-stream 102 bytes
mytype.c application/octet-stream 1.6 KB

Responses

Browse pgsql-patches by date

  From Date Subject
Next Message Jan Wieck 2004-01-23 17:37:52 Re: pltcl - "Cache lookup for attribute" error - version
Previous Message Tom Lane 2004-01-23 15:52:51 Re: pltcl - "Cache lookup for attribute" error - version 2