Re: [PERFORM] Terrible performance on wide selects

From: Hannu Krosing <hannu(at)tm(dot)ee>
To: Dann Corbit <DCorbit(at)connx(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Steve Crawford <scrawford(at)pinpointresearch(dot)com>, pgsql-performance(at)postgreSQL(dot)org, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [PERFORM] Terrible performance on wide selects
Date: 2003-01-23 10:28:21
Message-ID: 1043317701.2348.32.camel@localhost.localdomain
Lists: pgsql-hackers pgsql-performance

Dann Corbit wrote on Thu, 23.01.2003 at 02:39:
> [snip]
> > For the disk case, why not have the start of the record
> > contain an array of offsets to the start of the data for each
> > column? It would only be necessary to have a list for
> > variable fields.
> >
> > So (for instance) if you have 12 variable fields, you would
> > store 12 integers at the start of the record.
>
> You have to store this information anyway (for variable length objects).
> By storing it at the front of the record you would lose nothing (except
> the logical coupling of an object with its length). But I would think
> that it would not consume any additional storage.

I don't think it will win much either (except for possible cache
locality with really huge page sizes), as the problem is _not_ scanning
over big strings finding their end marker, but instead is chasing long
chains of pointers.
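
To make the pointer-chasing cost concrete, here is a toy sketch of the current (len,data) arrangement. It is not PostgreSQL's actual on-disk format: the 16-bit length word, the 4-byte ints, and the names pack_current and varlen_offset are all assumptions made for illustration. The point is only that finding the k-th varlen field takes k hops, however short the strings are.

```python
import struct

def pack_current(ints, strings):
    """Toy layout: fixed-width ints first, then each varlen field as (len, data)."""
    buf = b"".join(struct.pack("<i", v) for v in ints)
    for s in strings:
        # Hypothetical 16-bit length word in front of each varlen value.
        buf += struct.pack("<H", len(s)) + s
    return buf

def varlen_offset(buf, n_ints, k):
    """Return (offset of the k-th varlen field's length word, hops taken).

    Every earlier varlen field has to be visited to learn where the
    next one starts, so the cost grows linearly with k.
    """
    pos = 4 * n_ints          # skip the fixed-width ints
    hops = 0
    for _ in range(k):
        (length,) = struct.unpack_from("<H", buf, pos)
        pos += 2 + length     # skip this field's length word and its data
        hops += 1
    return pos, hops
```

The work per hop does not depend on the string length, only on how many varlen fields precede the one being fetched.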

There could be some merit in the idea of storing, at the beginning of
the tuple, pointers to every field from the first varlen field onward
(a 16-bit int should be enough), so people can minimize the overhead by
moving fixlen fields to the beginning. Once we have this setup, we no
longer need the varlen lengths /stored/ together with the field data.

This adds the complexity of converting from (len,data) to (ptr,...,data)
when constructing the tuple.

For example, a tuple (int,int,int,varchar,varchar)

which is currently stored as

(intdata1, intdata2, intdata3, (len4, vardata4), (len5,vardata5))

should be rewritten on storage to

(ptr4,ptr5),(intdata1, intdata2, intdata3, vardata4,vardata5)

But it seems to solve the O(N) problem quite nicely (and forces no
storage growth for tuples with fixlen fields at the beginning of the tuple).
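
A minimal sketch of the (ptr4,ptr5),(...) layout, under stated assumptions: pointers are 16-bit offsets from the start of the tuple, ints are 4 bytes, the number of varlen fields is known from the table definition, and NULLs are ignored entirely. The names pack_proposed and get_varlen are hypothetical, and this is a toy model rather than PostgreSQL code.

```python
import struct

def pack_proposed(ints, strings):
    """Toy layout: (ptr, ..., ptr), then fixed-width ints, then raw varlen data."""
    n = len(strings)
    fixed = b"".join(struct.pack("<i", v) for v in ints)
    offsets = []
    pos = 2 * n + len(fixed)  # first varlen byte follows the header and the ints
    for s in strings:
        offsets.append(pos)
        pos += len(s)
    header = struct.pack("<%dH" % n, *offsets)
    return header + fixed + b"".join(strings)

def get_varlen(buf, n_varlen, k):
    """Fetch the k-th varlen field with at most two header reads, no walking."""
    (start,) = struct.unpack_from("<H", buf, 2 * k)
    if k + 1 < n_varlen:
        # The next pointer marks where this field ends.
        (end,) = struct.unpack_from("<H", buf, 2 * (k + 1))
    else:
        # Last field runs to the end of the tuple.
        end = len(buf)
    return buf[start:end]
```

In this toy model each 16-bit pointer takes the place of a 16-bit length word, so the packed tuple is the same size as the (len,data) form, and a field's length is recovered from the next pointer or from the end of the tuple.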

We must also account for NULL fields in the calculations.

--
Hannu Krosing <hannu(at)tm(dot)ee>
