crosstab speed

From: "Jeremiah Elliott" <jeremiah0(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: crosstab speed
Date: 2008-11-13 20:42:32
Message-ID: aa98dd100811131242r18325669m93c938382f3a7e28@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

ok, I have an application that I am trying to speed up. Its a reporting
application that makes heavy use of the crosstab function.

Here is some of the setup / configuration details:
Postgres 8.3.3
RedHat Enterprise 5.2 (2.6.18 kernel)
sun x4600, 8 dual core opteron 8218 processors, 32BG, StorageTek SAN
6 15k FC disks raid 10 for data,
2 15k FC disks raid 1 for xlog,
2 10k SAS disks raid 1 for OS
The table that I am querying has just under 600k records, 55 columns, 30
indexes
The table is not static, there are several hundred inserts a day into it.
This is not the only application that uses postgres on this server. There
are several other transactional apps as well

here is an example query
select "COL_HEAD"[1] as site, "COL_HEAD"[2] as product_line_description,
"COL_HEAD"[3] as report_sls, "COL_HEAD"[4] as fy_period, "2006" , "2007" ,
"2008" , "2009" from public.crosstab('select
ARRAY[site::text,product_line_description::text,report_sls::text,fy_period::text]
as COL_HEADER, fy_year, sum(invoice_value) from order_data_tbl where
fy_year is not null group by
site::text,product_line_description::text,report_sls::text,fy_period::text,
fy_year order by
site::text,product_line_description::text,report_sls::text,fy_period::text',
'select fy_year from order_data_tbl where fy_year is not null group by
fy_year order by fy_year') as order_data_tbl("COL_HEAD" text[], "2006"
numeric(20,2) , "2007" numeric(20,2) , "2008" numeric(20,2) , "2009"
numeric(20,2) )

The crostab function is taking between 5 and 15 seconds to return. While the
query is running one of the cores will be close to 100%, but watching iostat
makes be believe that the entire table is cached and none of it is being
read from disk. Depending on what report is being run the indexes may or may
not be of any assistance. In the above query the planner does not use an
index. Depending on what the user is looking for some indexes will be used
because there is more specified in the where clause, at which point the
query time can be under two seconds. The problem is that most reports that
get generated with this application don't have a where clause. Are there any
changes that can make to my config to speed up these huge aggregating
queries?

Here is my postgresql.conf

max_connections = 1500
shared_buffers = 8GB
work_mem = 2GB
maintenance_work_mem = 8GB
max_fsm_pages = 2048000
wal_buffers = 1024kB
checkpoint_segments = 256
checkpoint_timeout = 10min
effective_cache_size = 20GB
default_statistics_target = 100
log_destination = 'stderr'
logging_collector = on
log_directory = 'pg_log'
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 1GB
log_error_verbosity = default
autovacuum = on
autovacuum_max_workers = 9
datestyle = 'iso, mdy'
lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8'
default_text_search_config = 'pg_catalog.english'
synchronize_seqscans = on
log_min_duration_statement = 250

-Jeremiah Elliott

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Scott Marlowe 2008-11-13 21:41:07 Re: crosstab speed
Previous Message Bruno Baguette 2008-11-13 15:22:47 Re: Slow SQL query (14-15 seconds)