New WIP patch for cross column statistics Re: TEXT vs PG_NODE_TREE in system columns (cross column and expression statistics patch)

From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Hans-Juergen Schoenig <hs(at)cybertec(dot)at>
Subject: New WIP patch for cross column statistics Re: TEXT vs PG_NODE_TREE in system columns (cross column and expression statistics patch)
Date: 2011-08-04 12:13:18
Message-ID: 4E3A8CDE.7020409@cybertec.at
Lists: pgsql-hackers
Hi,

On 2011-04-28 17:20, Alvaro Herrera wrote:
> Excerpts from Boszormenyi Zoltan's message of Thu Apr 28 11:03:56 -0300 2011:
>> Hi,
>>
>> attached is the WIP patch for cross-column statistics and
>> extra expression statistics.
>>
>> My question is: why is pg_node_tree unusable as a
>> syscache attribute? I attempted to alias it as text in the patch
>> but I get the following error if I try to use it by setting
>> USE_SYSCACHE_FOR_SEARCH to 1 in selfuncs.c.
>> Directly using the underlying pg_statistic3 doesn't cause an error.
> Two comments:
> 1. it seems that expression stats are mostly separate from cross-column
> stats; does it really make sense to submit the two in the same patch?
>
> 2. there are almost no code comments anywhere
>
> 3. (bonus) if you're going to copy/paste pg_attribute.h verbatim into
> the new files, please remove the bits you currently have in "#if 0".
> (Not to mention the fact that the new catalogs seem rather poorly
> named).

OK, we took a different route this time. Here is what we came
up with. Attached are two patches.

attnum-int2vector.patch implements:

- int2vector support routines and catalog entries for them
- pg_statistic is modified so that "staattnum int2" is converted to
  "staattnums int2vector". RemoveStatistics() is modified to take
  an array of AttrNumber and its length.
- pg_attribute.attstattarget is moved to pg_statistic.statarget, and
  pg_statistic gains a new "stavalid" bool field. Two support routines
  are added: AddStatistics() and InvalidateStatistics(). Entries
  in pg_statistic for table columns are created upon table creation
  and ALTER TABLE ADD COLUMN and maintained for the lifetime
  of the column. Exceptions are system tables: calling AddStatistics()
  for them during initdb is a Catch-22 when pg_statistic doesn't yet
  exist. For these, ANALYZE creates the pg_statistic record just
  as before. ALTER TABLE ALTER COLUMN SET DATA TYPE
  only invalidates the record by setting "stavalid" to false.
- Factor out common code for getting the statistics tuple into a
  new function called validate_statistics().
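
The catalog behaviour described in the list above can be pictured with a toy in-memory model. This is only an illustrative Python sketch, not code from the patch. The class names StatEntry and StatCatalog, the method signatures, the analyze() and lookup_valid() helpers and the sample OID 16384 are all hypothetical. Only the column names (staattnums, statarget, stavalid) and the routine names AddStatistics(), InvalidateStatistics() and RemoveStatistics() come from the description. The sketch also assumes that a freshly created entry starts out invalid until ANALYZE fills it, and that a lookup ignores invalid entries; the mail does not say either.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence, Tuple

# Key of a statistics row: (table OID, attribute numbers), standing in
# for the int2vector column "staattnums".
Key = Tuple[int, Tuple[int, ...]]


@dataclass
class StatEntry:
    starelid: int
    staattnums: Tuple[int, ...]
    statarget: int = -1     # assumption: -1 means "use the default target"
    stavalid: bool = False  # assumption: invalid until ANALYZE has run


class StatCatalog:
    """Hypothetical stand-in for pg_statistic; not the patch's code."""

    def __init__(self) -> None:
        self._rows: Dict[Key, StatEntry] = {}

    @staticmethod
    def _key(relid: int, attnums: Sequence[int]) -> Key:
        return (relid, tuple(attnums))

    def add_statistics(self, relid: int, attnums: Sequence[int],
                       target: int = -1) -> StatEntry:
        # Models AddStatistics(): create the row if it does not exist yet.
        key = self._key(relid, attnums)
        if key not in self._rows:
            self._rows[key] = StatEntry(relid, key[1], target)
        return self._rows[key]

    def invalidate_statistics(self, relid: int, attnums: Sequence[int]) -> None:
        # Models InvalidateStatistics(): keep the row, clear "stavalid".
        entry = self._rows.get(self._key(relid, attnums))
        if entry is not None:
            entry.stavalid = False

    def remove_statistics(self, relid: int, attnums: Sequence[int]) -> None:
        # Models RemoveStatistics() taking an array of attribute numbers.
        self._rows.pop(self._key(relid, attnums), None)

    def analyze(self, relid: int, attnums: Sequence[int]) -> StatEntry:
        # ANALYZE fills the row; for system tables it also creates it.
        entry = self.add_statistics(relid, attnums)
        entry.stavalid = True
        return entry

    def lookup_valid(self, relid: int,
                     attnums: Sequence[int]) -> Optional[StatEntry]:
        # Return the row only if it exists and is marked valid.
        entry = self._rows.get(self._key(relid, attnums))
        return entry if entry is not None and entry.stavalid else None


catalog = StatCatalog()
catalog.add_statistics(16384, [1])         # CREATE TABLE / ADD COLUMN
catalog.analyze(16384, [1])                # ANALYZE marks the row valid
catalog.invalidate_statistics(16384, [1])  # ALTER COLUMN SET DATA TYPE
```

In this model the row survives a data type change and is only flagged, which matches the "maintained for the lifetime of the column" behaviour described above.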

cross-col-syntax.patch builds on the first patch and implements:

CREATE CROSS COLUMN STATISTICS ON TABLE tabname (col, ...)
     [ WITH ( statistics_target ) ] ;

DROP CROSS COLUMN STATISTICS ON TABLE tabname (col, ...) ;

CREATE CROSS COLUMN STATISTICS ON INDEX idxname
     [ WITH ( statistics_target ) ] ;

DROP CROSS COLUMN STATISTICS ON INDEX idxname ;

and puts new records into pg_statistic with array_length(staattnums, 1) > 1.
Note: this patch should record dependencies on the respective table or
index and on the columns involved, but it currently doesn't.
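
To make the dependency point concrete, here is a small Python sketch. It is hypothetical and not taken from the patch: statistics rows are reduced to (starelid, staattnums) pairs, the function names and the sample OID 16384 are made up, and the "more than one attribute number" test simply mirrors the array_length(staattnums, 1) > 1 condition above. The second function shows which rows a dependency mechanism would need to find when a column is dropped.

```python
from typing import Iterable, List, Tuple

# A statistics row reduced to (table OID, attribute numbers).
StatRow = Tuple[int, Tuple[int, ...]]


def cross_column_rows(rows: Iterable[StatRow]) -> List[StatRow]:
    """Rows that cover more than one column."""
    return [row for row in rows if len(row[1]) > 1]


def rows_depending_on(rows: Iterable[StatRow], relid: int,
                      attnum: int) -> List[StatRow]:
    """Rows of table `relid` that mention column `attnum`."""
    return [row for row in rows if row[0] == relid and attnum in row[1]]


rows = [
    (16384, (1,)),    # ordinary single-column statistics
    (16384, (2,)),
    (16384, (1, 2)),  # cross-column statistics on columns 1 and 2
    (16384, (2, 3)),  # cross-column statistics on columns 2 and 3
]

# Everything that would have to be removed if column 2 were dropped.
to_remove = rows_depending_on(rows, 16384, 2)
```

Without recorded dependencies, nothing would trigger such a lookup when a column, table or index is dropped, so stale cross-column rows could be left behind.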

The data structure for storing the N-dimensional histogram has not been decided yet.

Comments?

Best regards,
Zoltán Böszörményi

-- 
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/


Attachment: cross-col-syntax.patch
Description: text/plain (15.7 KB)
Attachment: attnum-int2vector.patch
Description: text/plain (73.2 KB)
