Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Jameison Martin <jameisonb(at)yahoo(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Kevin Grittner <kgrittn(at)mail(dot)com>, josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]
Date: 2013-06-24 20:50:58
Message-ID: 21604.1372107058@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> If there's an actual performance consequence of applying this patch,
> then I think that's a good reason for rejecting it. But if the best
> argument we can come up with is that we might someday try to do even
> more clever things with the tuple's natts value, I guess I'm not very
> impressed. The reason why we have to rewrite the table when someone
> adds a column with a non-NULL default is because we need to store the
> new value in every row. Sure, we could defer that in this particular
> case. But users might be mighty dismayed to see CLUSTER or VACUUM
> FULL -- or a dump-and-reload! -- cause the table to become much LARGER
> than it was before. Having some kind of column-oriented compression
> would be darn nifty, but this particular path doesn't seem
> particularly promising to me.

The point of what I was suggesting isn't to conserve storage, but to
reduce downtime during a schema change. Remember that a rewriting ALTER
TABLE locks everyone out of that table for a long time.

> So, Tom, how's that pluggable storage format coming? :-)

Well, actually, it's looking to me like heap_form_tuple will be
underneath the API cut, because alternate storage managers will probably
have other tuple storage formats --- column stores being the poster
child here, but in any case the tuple header format is very unlikely
to be universal.

Which means that whether this patch gets applied to mainline is going
to be moot for Salesforce's purposes; they will certainly want the
equivalent logic in their storage code, because they've got tables with
many hundreds of mostly-null columns, but whether heap_form_tuple acts
this way or not won't affect them.
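
To make the storage arithmetic behind the many-hundreds-of-columns case
concrete, here is a minimal sketch of how the null bitmap, and hence the
tuple header, shrinks when trailing nulls are dropped from natts. It is
not the patch and not the heap_form_tuple code. The helper names and both
constants are assumptions for illustration: a 23-byte fixed header and
8-byte maximum alignment, as on typical 64-bit builds.

```c
#include <stddef.h>

/* Assumed for illustration: fixed part of the heap tuple header, in bytes. */
#define FIXED_HEADER_BYTES 23
/* Assumed for illustration: maximum alignment on a typical 64-bit build. */
#define MAX_ALIGN 8

/* One bit per attribute, rounded up to whole bytes. */
static size_t
bitmap_len(int natts)
{
    return ((size_t) natts + 7) / 8;
}

/*
 * Header size for a tuple that carries a null bitmap: fixed part plus
 * bitmap, rounded up to the alignment boundary.  A tuple with no nulls
 * at all carries no bitmap, so this applies only when a null is present.
 */
static size_t
header_size(int natts)
{
    size_t raw = FIXED_HEADER_BYTES + bitmap_len(natts);

    return (raw + MAX_ALIGN - 1) / MAX_ALIGN * MAX_ALIGN;
}

/*
 * Number of attributes left after dropping trailing nulls: scan back from
 * the last attribute until a non-null one is found.
 */
static int
truncated_natts(const int *isnull, int natts)
{
    while (natts > 0 && isnull[natts - 1])
        natts--;
    return natts;
}
```

Under these assumptions, an 800-column row whose last non-null value sits
in column 9 needs a 128-byte header if all 800 attributes are recorded and
a 32-byte header if natts is truncated to 9. For a 10-column table the
header is 32 bytes with or without truncation, since alignment padding
absorbs the difference.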

So unless we consider that many-hundreds-of-columns is a design center
for general purpose use of Postgres, we should be evaluating this patch
strictly on its usefulness for more typical table widths. And my take
on that is that (1) lots of columns isn't our design center (for the
reasons you mentioned among others), and (2) the case for the patch
looks pretty weak otherwise.

regards, tom lane
