Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Jameison Martin <jameisonb(at)yahoo(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap
Date: 2012-08-09 10:14:33
Message-ID: CA+U5nMJaE_3bR8mw5v-PQKJd2bR1gn0qFN5aeP4qFo8Zu9+_fg@mail.gmail.com
Lists: pgsql-hackers

On 17 April 2012 17:22, Jameison Martin <jameisonb(at)yahoo(dot)com> wrote:

> The following patch truncates trailing null attributes from heap rows to
> reduce the size of the row bitmap.

> The intuition for this change is that ALTER TABLE t ADD COLUMN c type NULL
> is a metadata only change. Postgres works fine when a row's metadata (tuple
> descriptor) is inconsistent with the actual row data: extra columns are
> assumed to be null. This change just adjusts the number of attributes for a
> row and the row bitmap to only track up to the last non-null attribute.

This is an interesting patch, but it has drawn various comments.

When I look at this I see that it would change the NULL bitmap for all
existing rows, which means it forces a complete unload/reload of data.
We've moved away from doing things like that, so in its current form
we'd probably want to reject that.

If I might suggest a way forward?

Keep NULL bitmaps as they are now. Add another flag that indicates
when a partial NULL bitmap, with trailing NULL columns trimmed, is in
use. Then we can decide whether a table will benefit from a full or a
partial bitmap and record that in the tupledesc. That way the tupledesc
tells heap_form_tuple which kind of NULL bitmap is preferred for new
tuples. The user might be able to switch that preference on or off, but
by default postgres would decide for us, based on null statistics and
the like gathered at ANALYZE time.

That mechanism is both compatible with existing on-disk formats and
means that the common path for smaller tables is unaffected, yet we
gain the benefit of the patch for larger tables.
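
To make the suggestion concrete, here is a rough standalone sketch of
how a per-table preference could drive the choice when a new tuple is
formed. It is not taken from the patch or from the PostgreSQL source:
the NullBitmapPref enum and the stored_natts() and bitmap_len()
helpers are hypothetical names invented for illustration. Only the
one-bit-per-column rounding is meant to mirror how the heap null
bitmap is sized.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-table preference; not an existing PostgreSQL field. */
typedef enum NullBitmapPref
{
    NULL_BITMAP_FULL,       /* current behaviour: one bit per column */
    NULL_BITMAP_TRIMMED     /* bits only up to the last non-null column */
} NullBitmapPref;

/* Bytes needed for a null bitmap covering natts columns, one bit each. */
static size_t
bitmap_len(int natts)
{
    assert(natts >= 0);
    return (size_t) (natts + 7) / 8;
}

/*
 * Number of attributes a new tuple would record. With the trimmed
 * preference, trailing NULL columns are dropped; readers treat any
 * column beyond the stored count as NULL.
 */
static int
stored_natts(const bool *isnull, int natts, NullBitmapPref pref)
{
    assert(natts >= 0);
    if (pref == NULL_BITMAP_TRIMMED)
    {
        while (natts > 0 && isnull[natts - 1])
            natts--;
    }
    return natts;
}
```

With this sketch, a 100-column row whose last 97 columns are NULL
would need bitmap_len(3) = 1 byte under the trimmed preference instead
of bitmap_len(100) = 13 bytes, while tables left on the full
preference behave exactly as today.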

It would be good to see you take this all the way.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
