Terrible performance on wide selects

From: Steve Crawford <scrawford(at)pinpointresearch(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Terrible performance on wide selects
Date: 2003-01-17 19:37:26
Message-ID: 20030117193726.1CFCB103E5@polaris.pinpointresearch.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers pgsql-performance

I have a table which is rather wide (~800 columns) and consists of a few
columns of identifying data (run time, channel and such) and up to several
hundred columns of collected data (no, normalization does not suggest putting
collected data in another table - collected item 1 always corresponds to
collected item 1 but is completely different than item 3).

My test table is very short (62 rows) but in production would grow by several
thousand rows per day. Unfortunately if my test data is correct, performance
on wide selects is so bad that it will render the system unusable.

Here's the test. I have created two versions of the table - one stores the
collected data in an array of text and the other stores the data in
individual columns, no joins, no indexes. Times are averages of many runs -
the times varied very little and the data is small enough that I'm sure it
was served from RAM. Postgres CPU utilization observed on the longer runs was
98-99%. Changing the output format didn't seem to change things significantly.

Times for selecting all the columns in the table:
select * from columnversion;
8,000 ms

select * from arrayversion;
110 ms

select * from arraytocolumnview (data in the array version but converted to
columns in the view)
10,000 ms

Times to select a single column in a table:
select runstarttime from columversion;
32 ms

select runstarttime from arrayversion;
6 ms

So the question is, does it seem reasonable that a query on fundamentally
identical data should take 70-90 times as long when displayed as individual
columns vs. when output as a raw array and, more imporantly, what can I do to
get acceptable performance on this query?

Cheers,
Steve

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Bruno Wolff III 2003-01-17 19:40:44 Re: point -> double,double ?
Previous Message Tom Lane 2003-01-17 19:35:05 Re: point -> double,double ?

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Treat 2003-01-17 19:43:00 Re: Generate user/group sysids from a sequence?
Previous Message Tom Lane 2003-01-17 19:32:49 Re: Generate user/group sysids from a sequence?

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2003-01-17 23:06:52 Re: Terrible performance on wide selects
Previous Message Jeff 2003-01-17 18:39:41 Re: 7.3.1 New install, large queries are slow