From: | Jan Urbański <j(dot)urbanski(at)students(dot)mimuw(dot)edu(dot)pl> |
---|---|
To: | rahulg <rahulgupta83(at)gmail(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Statistic Estimation in PostgreSQL |
Date: | 2008-10-15 19:56:04 |
Message-ID: | 48F64AD4.8080401@students.mimuw.edu.pl |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
rahulg wrote:
> I am facing problem in tracing in what events the selectivity
> histogram in pg_statistic is stored/updated.
>
> I went through the code in src/backend/commands/analyze.c and got to
> see the code computing the histogram but when I tried to trace the
> caller of analyze_rel or compute_index_stats, I find out that only
> during Vaccum command, these functions are invoked. I am little
> surprised as I did get to see PostgreSQL giving good selectivity
> estimations despite me running Vaccuum command even once.
>
> I did try to trace the control flow during table operations
> (creation / insertion) but couldn't find any histogram updation
> functions being invoked though I did get to know that some functions
> related to catalog updations is being invoked.
>
> Can anybody tell me in what all events is the pg_statistic table
> exactly updated?
The flow is like this:
analyze_rel() determines the type-specific analyze function to call for
each column:
http://git.postgresql.org/?p=postgresql.git;a=blob;f=src/backend/commands/analyze.c;h=d96b7666f08d9ac74cccf05851fa8fd8fc2d4948;hb=HEAD#l270
the function OID gets pulled from pg_type, from the typanalyze column:
http://www.postgresql.org/docs/8.3/static/catalog-pg-type.html
currently there is only one datatype that has a custom analyze function
(tsvector). All others get analyzed using the standard one:
http://git.postgresql.org/?p=postgresql.git;a=blob;f=src/backend/commands/analyze.c;h=d96b7666f08d9ac74cccf05851fa8fd8fc2d4948;hb=HEAD#l1489
depending on whether the datatype has a < operator or not, the column is
analyzed using compute_scalar_stats() or compute_minimal_stats().
The actual call of the chosen function happens here:
http://git.postgresql.org/?p=postgresql.git;a=blob;f=src/backend/commands/analyze.c;h=d96b7666f08d9ac74cccf05851fa8fd8fc2d4948;hb=HEAD#l425
And the write to pg_statistic happens here:
http://git.postgresql.org/?p=postgresql.git;a=blob;f=src/backend/commands/analyze.c;h=d96b7666f08d9ac74cccf05851fa8fd8fc2d4948;hb=HEAD#l446
HTH,
Jan
--
Jan Urbanski
GPG key ID: E583D7D2
ouden estin
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Davis | 2008-10-15 19:58:16 | Re: Deriving Recovery Snapshots |
Previous Message | Alvaro Herrera | 2008-10-15 19:25:33 | Re: Statistic Estimation in PostgreSQL |