Re: [GENERAL] Large databases, performance

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Ron Johnson <ron(dot)l(dot)johnson(at)cox(dot)net>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: [GENERAL] Large databases, performance
Date: 2002-10-08 15:51:12
Message-ID: 2502.1034092272@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers pgsql-performance pgsql-sql

Ron Johnson <ron(dot)l(dot)johnson(at)cox(dot)net> writes:
> On Tue, 2002-10-08 at 09:38, Tom Lane wrote:
>> That argument loses a lot of its force when you consider that Postgres
>> uses non-overwriting storage management. We never do an UPDATE in-place
>> anyway, and so it matters little whether the updated record is the same
>> size as the original.

> Must you update any relative indexes, in order to point to the
> new location of the record?

We make new index entries for the new record, yes. Both the old and new
records must be indexed (until one or the other is garbage-collected by
VACUUM) so that transactions can find whichever version they are
supposed to be able to see according to the tuple visibility rules.

>> It's not simpler: it's more complicated, because you need an additional
>> input item to figure out the size of any given column in a record.

> With fixed-length, why? From the metadata, you can compute the intra-
> record offsets.

Sure, but you need an additional item of metadata than you otherwise
would (this is atttypmod, in Postgres terms). I'm not certain that the
typmod is available everyplace that would need to be able to figure out
the physical width of a column.

> On that system, even variable-length records don't need record-size
> fields. Any repeating text (more that ~4 chars) is replaced with
> run-length encoding. This includes the phantom spaces at the end
> of the field.

Interesting that you should bring that up in the context of an argument
for supporting fixed-width fields ;-). Doesn't any form of data
compression bring you right back into variable-width land?

Postgres' approach to data compression is that it's done per-field,
and only on variable-width fields. We steal a couple of bits from the
length word to allow flagging of compressed and out-of-line values.
If we were to make CHAR(n) fixed-width then it would lose the ability
to participate in either compression or out-of-line storage.

Between that and the multibyte-encoding issue, I think it's very
difficult to make a case that the general-purpose CHAR(n) type should
be implemented as fixed-width. If someone has a specialized application
where they need a restricted fixed-width string type, it's not that
hard to make a user-defined type that supports only a single column
width (and thereby gets around the typmod issue). So I'm satisfied with
saying "define your own type if you want this".

regards, tom lane

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Johnson, Shaunn 2002-10-08 16:21:05 phpPgAdmin + PostgreSQL + authentication
Previous Message Josh Berkus 2002-10-08 15:33:53 Re: CHAR, VARCHAR, TEXT (Was Large Databases)

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2002-10-08 15:53:52 Re: inline newNode()
Previous Message Josh Berkus 2002-10-08 15:33:53 Re: CHAR, VARCHAR, TEXT (Was Large Databases)

Browse pgsql-performance by date

  From Date Subject
Next Message Ron Johnson 2002-10-08 17:42:20 Re: CHAR, VARCHAR, TEXT (Was Large Databases)
Previous Message Josh Berkus 2002-10-08 15:33:53 Re: CHAR, VARCHAR, TEXT (Was Large Databases)

Browse pgsql-sql by date

  From Date Subject
Next Message Ian Harding 2002-10-08 16:13:23 Re: Temporary tables and indexes
Previous Message Josh Berkus 2002-10-08 15:39:45 Re: [NOVICE] update question