Re: Terrible performance on wide selects

From: "Dann Corbit" <DCorbit(at)connx(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Steve Crawford" <scrawford(at)pinpointresearch(dot)com>, <pgsql-performance(at)postgreSQL(dot)org>, <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Terrible performance on wide selects
Date: 2003-01-23 00:21:18
Message-ID: D90A5A6C612A39408103E6ECDD77B8294CD863@voyager.corporate.connx.com
Lists: pgsql-hackers pgsql-performance

> -----Original Message-----
> From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
> Sent: Wednesday, January 22, 2003 4:18 PM
> To: Dann Corbit
> Cc: Steve Crawford; pgsql-performance(at)postgreSQL(dot)org;
> pgsql-hackers(at)postgreSQL(dot)org
> Subject: Re: [HACKERS] Terrible performance on wide selects
>
>
> "Dann Corbit" <DCorbit(at)connx(dot)com> writes:
> > Why not waste a bit of memory and make the row buffer the maximum
> > possible length? E.g. for varchar(2000) allocate 2000 characters +
> > size element and point to the start of that thing.
>
> Surely you're not proposing that we store data on disk that way.
>
> The real issue here is avoiding overhead while extracting
> columns out of a stored tuple. We could perhaps use a
> different, less space-efficient format for temporary tuples
> in memory than we do on disk, but I don't think that will
> help a lot. The nature of O(N^2) bottlenecks is you have to
> kill them all --- for example, if we fix printtup and don't
> do anything with ExecEvalVar, we can't do more than double
> the speed of Steve's example, so it'll still be slow. So we
> must have a solution for the case where we are disassembling
> a stored tuple, anyway.
>
> I have been sitting here toying with a related idea, which is
> to use the heap_deformtuple code I suggested before to form
> an array of pointers to Datums in a specific tuple (we could
> probably use the TupleTableSlot mechanisms to manage the
> memory for these). Then subsequent accesses to individual
> columns would just need an array-index operation, not a
> nocachegetattr call. The trick with that would be that if
> only a few columns are needed out of a row, it might be a net
> loss to compute the Datum values for all columns. How could
> we avoid slowing that case down while making the wide-tuple
> case faster?

For the disk case, why not have the start of the record contain an array
of offsets to the start of the data for each column? Offsets would only
be needed for the variable-length fields.

So (for instance) if you have 12 variable-length fields, you would store
12 integers at the start of the record.
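
To make the idea concrete, here is a minimal sketch in C of a record that
carries an offset array in front of its data. This is not PostgreSQL's
actual tuple format. The layout, the names record_form and record_get,
and the choice of 32-bit offsets are all assumptions made for
illustration. The sketch also stores a field count and one extra "end"
offset beyond the one-integer-per-field scheme described above, so that
each field's length can be derived from two adjacent offsets.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical record layout (an illustration, not PostgreSQL's format):
 *
 *   uint32 nvar                 number of variable-length fields
 *   uint32 offsets[nvar + 1]    byte offset of each field, plus an end marker
 *   bytes  data                 field contents, stored back to back
 */

/* memcpy-based accessors avoid alignment and aliasing problems. */
static uint32_t read_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

static void write_u32(unsigned char *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}

/* Build a record from nvar byte strings. The caller frees the result. */
unsigned char *record_form(const char *const *fields, const uint32_t *lens,
                           uint32_t nvar, uint32_t *total_len)
{
    /* Header holds the count, nvar offsets, and one end marker. */
    uint32_t header = (uint32_t) sizeof(uint32_t) * (nvar + 2);
    uint32_t total = header;
    uint32_t off = header;
    unsigned char *rec;
    uint32_t i;

    for (i = 0; i < nvar; i++)
        total += lens[i];

    rec = malloc(total);
    if (rec == NULL)
        return NULL;

    write_u32(rec, nvar);
    for (i = 0; i < nvar; i++) {
        write_u32(rec + sizeof(uint32_t) * (1 + i), off);
        memcpy(rec + off, fields[i], lens[i]);
        off += lens[i];
    }
    /* End marker: one past the last data byte. */
    write_u32(rec + sizeof(uint32_t) * (1 + nvar), off);

    if (total_len != NULL)
        *total_len = total;
    return rec;
}

/* Fetch field k with two offset reads; no walk over earlier fields. */
const unsigned char *record_get(const unsigned char *rec, uint32_t k,
                                uint32_t *len)
{
    uint32_t nvar = read_u32(rec);
    uint32_t start, end;

    assert(k < nvar);
    start = read_u32(rec + sizeof(uint32_t) * (1 + k));
    end = read_u32(rec + sizeof(uint32_t) * (2 + k));
    *len = end - start;
    return rec + start;
}
```

With a layout like this, fetching column k costs the same no matter how
many variable-length columns precede it, at the price of four bytes per
variable-length field in every stored record.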
