Modifying update_attstats of analyze.c for C Strings

From: Ashoke <s(dot)ashoke(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Modifying update_attstats of analyze.c for C Strings
Date: 2014-07-08 05:22:00
Message-ID: CALpszJOkbYcGXehaLqDMqT6P-BfurPZOHT-ywGqzGMxE+R3gSQ@mail.gmail.com
Lists: pgsql-hackers

Hi,

I am trying to implement functionality similar to ANALYZE, but it needs to
use different values for the MCV/histogram bounds (the values are valid and
are stored in inp->str[][]) when the column under consideration is varchar
(C strings). I have written a function *dummy_update_attstats* with the
following changes. Everything else is the same as in *update_attstats* in
*~/src/backend/commands/analyze.c*:

---
{
    ArrayType *arry;

    if (strcmp(col_type, "varchar") == 0)
        arry = construct_array(stats->stavalues[k],
                               stats->numvalues[k],
                               CSTRINGOID,
                               -2,
                               false,
                               'c');
    else
        arry = construct_array(stats->stavalues[k],
                               stats->numvalues[k],
                               stats->statypid[k],
                               stats->statyplen[k],
                               stats->statypbyval[k],
                               stats->statypalign[k]);
    values[i++] = PointerGetDatum(arry);    /* stavaluesN */
}
---

I update hist_values in the appropriate function as follows:

---
if (strcmp(col_type, "varchar") == 0)
    hist_values[i] = datumCopy(CStringGetDatum(inp->str[i][j]),
                               false,
                               -2);
---

I based this on the following reference:
http://www.postgresql.org/message-id/attachment/20352/vacattrstats-extend.diff

My issue is this: when I use my approach for strings, the MCV/histogram_bounds
values in pg_stats do not have double quotes (" ") around each string. That is:

If the normal *update_attstats* is used, histogram_bounds for *TPCH
nation(n_name)* are: *"ALGERIA ","ARGENTINA ",...*
If I use *dummy_update_attstats* as above, histogram_bounds for *TPCH
nation(n_name)* are: *ALGERIA,ARGENTINA,...*

This becomes an issue if a string contains ',' (commas), as in the
*n_comment* column of the *nation* table, for example.
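
For reference, the PostgreSQL documentation describes the array text output rule as follows: an element is double-quoted if it is an empty string, contains curly braces, delimiter characters, double quotes, backslashes, or white space, or matches the word NULL. The sketch below is a standalone illustration of that documented rule, not PostgreSQL's actual code; the names `needs_quotes` and `format_element` are hypothetical. Under this rule, a value with a trailing space such as "ALGERIA " would be quoted while "ALGERIA" would not, which may account for the difference in the two outputs above.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: case-insensitive match against the word NULL. */
static bool is_null_word(const char *s)
{
    const char *w = "null";
    size_t i;

    if (strlen(s) != 4)
        return false;
    for (i = 0; i < 4; i++)
        if (tolower((unsigned char) s[i]) != w[i])
            return false;
    return true;
}

/* Hypothetical helper: applies the documented quoting rule to one element. */
static bool needs_quotes(const char *s, char delim)
{
    const char *p;

    if (s[0] == '\0' || is_null_word(s))
        return true;
    for (p = s; *p != '\0'; p++)
    {
        unsigned char c = (unsigned char) *p;

        if (c == '"' || c == '\\' || c == '{' || c == '}' ||
            c == (unsigned char) delim || isspace(c))
            return true;
    }
    return false;
}

/* Append one character, always leaving room for the terminating NUL. */
static bool put(char *out, size_t outsz, size_t *n, char ch)
{
    if (*n + 1 >= outsz)
        return false;
    out[(*n)++] = ch;
    return true;
}

/*
 * Write the text form of one array element into out.
 * Returns the number of characters written, or -1 if out is too small.
 */
static int format_element(const char *s, char delim, char *out, size_t outsz)
{
    size_t n = 0;
    bool quote = needs_quotes(s, delim);
    const char *p;

    assert(s != NULL && out != NULL);
    if (outsz == 0)
        return -1;
    if (quote && !put(out, outsz, &n, '"'))
        return -1;
    for (p = s; *p != '\0'; p++)
    {
        /* Inside quotes, embedded quotes and backslashes are escaped. */
        if (quote && (*p == '"' || *p == '\\') && !put(out, outsz, &n, '\\'))
            return -1;
        if (!put(out, outsz, &n, *p))
            return -1;
    }
    if (quote && !put(out, outsz, &n, '"'))
        return -1;
    out[n] = '\0';
    return (int) n;
}
```

Calling `format_element("ALGERIA ", ',', buf, sizeof buf)` on a sufficiently large buffer yields the quoted form, while the same call with "ALGERIA" leaves the value bare.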

Could someone point out the problem and suggest a solution?

Thank you.

--
Regards,
Ashoke
