
Re: Terrible performance on wide selects

From: "Dann Corbit" <DCorbit(at)connx(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Steve Crawford" <scrawford(at)pinpointresearch(dot)com>,<pgsql-performance(at)postgreSQL(dot)org>, <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Terrible performance on wide selects
Date: 2003-01-23 00:21:18
Message-ID: D90A5A6C612A39408103E6ECDD77B8294CD863@voyager.corporate.connx.com
Lists: pgsql-hackers, pgsql-performance
> -----Original Message-----
> From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us] 
> Sent: Wednesday, January 22, 2003 4:18 PM
> To: Dann Corbit
> Cc: Steve Crawford; pgsql-performance(at)postgreSQL(dot)org; 
> pgsql-hackers(at)postgreSQL(dot)org
> Subject: Re: [HACKERS] Terrible performance on wide selects 
> 
> 
> "Dann Corbit" <DCorbit(at)connx(dot)com> writes:
> > Why not waste a bit of memory and make the row buffer the maximum 
> > possible length? E.g. for varchar(2000) allocate 2000 characters + 
> > size element and point to the start of that thing.
> 
> Surely you're not proposing that we store data on disk that way.
> 
> The real issue here is avoiding overhead while extracting 
> columns out of a stored tuple.  We could perhaps use a 
> different, less space-efficient format for temporary tuples 
> in memory than we do on disk, but I don't think that will 
> help a lot.  The nature of O(N^2) bottlenecks is you have to 
> kill them all --- for example, if we fix printtup and don't 
> do anything with ExecEvalVar, we can't do more than double 
> the speed of Steve's example, so it'll still be slow.  So we 
> must have a solution for the case where we are disassembling 
> a stored tuple, anyway.
> 
> I have been sitting here toying with a related idea, which is 
> to use the heap_deformtuple code I suggested before to form 
> an array of pointers to Datums in a specific tuple (we could 
> probably use the TupleTableSlot mechanisms to manage the 
> memory for these).  Then subsequent accesses to individual 
> columns would just need an array-index operation, not a 
> nocachegetattr call.  The trick with that would be that if 
> only a few columns are needed out of a row, it might be a net 
> loss to compute the Datum values for all columns.  How could 
> we avoid slowing that case down while making the wide-tuple 
> case faster?

For the disk case, why not have the start of the record contain an array
of offsets to the start of the data for each column?  An offset list
would only be necessary for the variable-width fields, since fixed-width
columns already sit at known offsets.

So (for instance) if you have 12 variable fields, you would store 12
integers at the start of the record.
