Quick Links

Terrible performance on wide selects

From:	Steve Crawford <scrawford(at)pinpointresearch(dot)com>
To:	pgsql-general(at)postgresql(dot)org
Subject:	Terrible performance on wide selects
Date:	2003-01-17 19:37:26
Message-ID:	20030117193726.1CFCB103E5@polaris.pinpointresearch.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general pgsql-hackers pgsql-performance

I have a table which is rather wide (~800 columns) and consists of a few
columns of identifying data (run time, channel and such) and up to several
hundred columns of collected data (no, normalization does not suggest putting
collected data in another table - collected item 1 always corresponds to
collected item 1 but is completely different than item 3).

My test table is very short (62 rows) but in production would grow by several
thousand rows per day. Unfortunately if my test data is correct, performance
on wide selects is so bad that it will render the system unusable.

Here's the test. I have created two versions of the table - one stores the
collected data in an array of text and the other stores the data in
individual columns, no joins, no indexes. Times are averages of many runs -
the times varied very little and the data is small enough that I'm sure it
was served from RAM. Postgres CPU utilization observed on the longer runs was
98-99%. Changing the output format didn't seem to change things significantly.

Times for selecting all the columns in the table:
select * from columnversion;
8,000 ms

select * from arrayversion;
110 ms

select * from arraytocolumnview (data in the array version but converted to
columns in the view)
10,000 ms

Times to select a single column in a table:
select runstarttime from columversion;
32 ms

select runstarttime from arrayversion;
6 ms

So the question is, does it seem reasonable that a query on fundamentally
identical data should take 70-90 times as long when displayed as individual
columns vs. when output as a raw array and, more imporantly, what can I do to
get acceptable performance on this query?

Cheers,
Steve

Responses

Re: Terrible performance on wide selects at 2003-01-17 23:06:52 from Tom Lane
Re: Terrible performance on wide selects at 2003-01-22 23:14:30 from Tom Lane

Browse pgsql-general by date

	From	Date	Subject
Next Message	Bruno Wolff III	2003-01-17 19:40:44	Re: point -> double,double ?
Previous Message	Tom Lane	2003-01-17 19:35:05	Re: point -> double,double ?

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Robert Treat	2003-01-17 19:43:00	Re: Generate user/group sysids from a sequence?
Previous Message	Tom Lane	2003-01-17 19:32:49	Re: Generate user/group sysids from a sequence?

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Tom Lane	2003-01-17 23:06:52	Re: Terrible performance on wide selects
Previous Message	Jeff	2003-01-17 18:39:41	Re: 7.3.1 New install, large queries are slow