Re: ALTER TABLE ... ALTER COLUMN ... SET DISTINCT

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ALTER TABLE ... ALTER COLUMN ... SET DISTINCT
Date: 2009-04-05 02:56:21
Message-ID: 603c8f070904041956i18fc51e6p2236bb06aaab36d6@mail.gmail.com
Lists: pgsql-hackers

On Sat, Apr 4, 2009 at 10:31 PM, Alvaro Herrera
<alvherre(at)commandprompt(dot)com> wrote:
> Robert Haas wrote:
>> On Sat, Apr 4, 2009 at 7:04 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> > Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> >> Per previous discussion.
>> >> http://archives.postgresql.org/message-id/8066.1229106059@sss.pgh.pa.us
>> >> http://archives.postgresql.org/message-id/603c8f070904021926g92eb55sdfc68141133957c1@mail.gmail.com
>> >
>> > I'm not thrilled about adding a column to pg_attribute for this.
>> > Isn't there some way of keeping it in pg_statistic?
>>
>> I don't like the idea of keeping it in pg_statistic.  Right now, all
>> of the data in pg_statistic is transient, so you could theoretically
>> truncate the table at any time without losing anything permanent.
>
> Maybe use a new catalog?

If we go that route, it would probably make sense to move
attstattarget there as well. Obviously it wouldn't make sense to move
anything that's in the critical path of ordinary database operations,
but maybe attislocal or attinhcount could be moved as well. But I'm
not sure it's really warranted because, AFAIK, we have no evidence
that this is a real as opposed to a theoretical problem, and even if
we moved all of that stuff, that's only 12 bytes, and now you have
another table that's competing for space in the system cache. If
someone could demonstrate (say, by reducing NAMEDATALEN) that a
smaller pg_attribute structure would generate a real performance
benefit, then it would be worth spending the time to figure out a way
to make that happen (obviously without actually reducing NAMEDATALEN;
that's only a possible way to measure the impact).

>> What is the specific nature of your concern?  I thought about the
>> possibility of a distributed performance penalty that might be
>> associated with enlarging pg_attribute, but increasing the size of a
>> structure that is already 112 bytes by another 4 doesn't seem likely
>> to be significant, especially since we're not crossing a power-of-two
>> boundary.
>
> FWIW it has been said that whoever is concerned about pg_attribute bloat
> should be first looking at getting rid of the redundant entries for
> system columns, for each and every table.

That's a different kind of bloat (more rows vs. larger rows) but a
valid point all the same. I suspect neither type has much practical
impact, and that if we listed all the performance problems that
PostgreSQL has today, neither would be in the top 500. Bad ndistinct
estimates would be, however.
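
For what it's worth, the power-of-two point quoted above is easy to
check with a few lines of C. This is only a sketch: the helper names
are made up for illustration, and it assumes that allocations get
rounded up to the next power of two, which is the premise of the
argument rather than something verified here. The 112 and 4 are the
figures from this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: smallest power of two >= n.
 * Assumes n is a small, nonzero struct size (no overflow handling).
 */
size_t next_pow2(size_t n)
{
    size_t p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

/*
 * Hypothetical helper: true if growing a struct from old_size to
 * new_size pushes it into a larger power-of-two bucket.
 */
bool crosses_pow2_boundary(size_t old_size, size_t new_size)
{
    return next_pow2(new_size) > next_pow2(old_size);
}
```

Under that assumption, crosses_pow2_boundary(112, 112 + 4) is false,
since both sizes round up to 128; the answer would only change once
the structure grew past 128 bytes.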

...Robert
