Re: ALTER TABLE ... ALTER COLUMN ... SET DISTINCT

From: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE ... ALTER COLUMN ... SET DISTINCT
Date: 2009-07-31 15:09:50
Message-ID: 14A8765D-B477-47E9-BA85-AB76A68AFEC3@hi-media.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Finally some update on the patch.

Le 18 juil. 09 à 20:55, Robert Haas a écrit :
> This is one of the things that I hate about the requirement to post
> context diffs: filterdiff, at least for me, strips out the git tags
> that indicate the base rev of the patch.

Yes, and as I didn't have the time to install filterdiff I've attached
a revision of your patch in git format which adresses the problem I
mentioned, in a tarball also containing raw notes, tests, and
regression.{out,diffs}.

mqke check is failing on opr_sanity test in what looks like an
ordering issue, but I didn't feel confident enough to adapt the .out
to force the regression into passing.

>> - style issue, convert at PQgetvalue() time
>
> Ah, good catch.

Fixed in the updated version, which uses a float4 variable in pg_dump
and strtod() rather than atof. This last point is maybe overkill as
I'm using:
tbinfo->attdistinct[j] = strtod(PQgetvalue(res, j, i_attdistinct),
(char **)NULL);

>> What about adding the following before the switch, to do like
>> surrounding
>> code?
>> Assert(IsA(newValue, Integer) || IsA(newValue, Float));
>
> Not a good plan. In my experience, gcc doesn't like switch ()
> statements over enums that don't list all the values, unless they have
> a default clause; it masterminds by giving you a warning that you've
> "inadvertently" left out some values.

I've left this part alone but still wonders about this, which is a new
user visible error message kind:
default:
elog(ERROR, "unrecognized node type: %d",
(int) nodeTag(newValue));

>> Given your revised version I'll try to play around with ndistinct
>> behavior
>> and exercise dump and restore, etc, but for now I'll pause my work.

I failed to have 0 to allow for analyze to compute a value again, as
shown in the raw notes in attachement:

dim=# alter table foo alter column x set distinct 1;
ALTER TABLE
...
dim=# alter table foo alter column x set distinct 0;
ALTER TABLE
dim=# analyze verbose foo;
INFO: analyzing "public.foo"
INFO: "foo": scanned 4 of 4 pages, containing 1000 live rows and 0
dead rows; 1000 rows in sample, 1000 estimated total rows
ANALYZE
dim=# select attname, attdistinct from pg_attribute where attrelid =
'foo'::regclass and attname = 'x';
attname | attdistinct
---------+-------------
x | 0
(1 row)

What I understand from the doc part of your work contradicts what I
see here...

Regards,
--
dim

Will mark as Waiting on Author, as I need Robert to tell me what to do
next.

Attachment Content-Type Size
ndistinct.tgz application/octet-stream 13.2 KB
unknown_filename text/plain 4 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2009-07-31 15:10:47 Re: Review: Revise parallel pg_restore's scheduling heuristic
Previous Message Tom Lane 2009-07-31 13:51:57 Re: More thoughts on sorting