slow query on tables with new columns added.

From: "M(dot) D(dot)" <lists(at)turnkey(dot)bz>
To: pgsql-performance(at)postgresql(dot)org
Subject: slow query on tables with new columns added.
Date: 2011-09-23 17:49:45
Message-ID: 4E7CC6B9.2040404@turnkey.bz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi everyone,

I did a software upgrade, and with it came a new feature where when
selecting a customer it queries for the sum of a few columns. This
takes 7 seconds for the 'Cash Sale' customer - by far the most active
customer. I'd like to see if it's possible to get it down a bit by
changing settings.

Query:
explain analyse select sum(item_points),sum(disc_points) from invoice
left join gltx on invoice.invoice_id = gltx.gltx_id
where gltx.inactive_on is null and gltx.posted = 'Y' and
gltx.customer_id = 'A0ZQ2gsACIsEKLI638ikyg'

item_points and disc_points are the 2 columns added, so they are mostly 0.

table info:
CREATE TABLE gltx -- rows: 894,712
(
gltx_id character(22) NOT NULL,
"version" integer NOT NULL,
created_by character varying(16) NOT NULL,
updated_by character varying(16),
inactive_by character varying(16),
created_on date NOT NULL,
updated_on date,
inactive_on date,
external_id numeric(14,0),
data_type integer NOT NULL,
"number" character varying(14) NOT NULL,
reference_str character varying(14),
post_date date NOT NULL,
post_time time without time zone NOT NULL,
work_date date NOT NULL,
memo text,
customer_id character(22),
vendor_id character(22),
station_id character(22),
employee_id character(22),
store_id character(22) NOT NULL,
shift_id character(22),
link_id character(22),
link_num integer NOT NULL,
printed character(1) NOT NULL,
paid character(1) NOT NULL,
posted character(1) NOT NULL,
amount numeric(18,4) NOT NULL,
card_amt numeric(18,4) NOT NULL,
paid_amt numeric(18,4) NOT NULL,
paid_date date,
due_date date,
CONSTRAINT gltx_pkey PRIMARY KEY (gltx_id),
CONSTRAINT gltx_c0 FOREIGN KEY (customer_id)
REFERENCES customer (customer_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT gltx_c1 FOREIGN KEY (vendor_id)
REFERENCES vendor (vendor_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT gltx_c2 FOREIGN KEY (station_id)
REFERENCES station (station_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT gltx_c3 FOREIGN KEY (employee_id)
REFERENCES employee (employee_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT gltx_c4 FOREIGN KEY (store_id)
REFERENCES store (store_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT gltx_c5 FOREIGN KEY (shift_id)
REFERENCES shift (shift_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT gltx_c6 FOREIGN KEY (link_id)
REFERENCES gltx (gltx_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE SET NULL
)
WITH (
OIDS=FALSE
);
ALTER TABLE gltx OWNER TO quasar;
GRANT ALL ON TABLE gltx TO quasar;

CREATE INDEX gltx_i0
ON gltx
USING btree
(data_type);

CREATE INDEX gltx_i1
ON gltx
USING btree
(post_date);

CREATE INDEX gltx_i2
ON gltx
USING btree
(number);

CREATE INDEX gltx_i3
ON gltx
USING btree
(data_type, number);

CREATE INDEX gltx_i4
ON gltx
USING btree
(customer_id, paid);

CREATE INDEX gltx_i5
ON gltx
USING btree
(vendor_id, paid);

CREATE INDEX gltx_i6
ON gltx
USING btree
(work_date);

CREATE INDEX gltx_i7
ON gltx
USING btree
(link_id);

CREATE TABLE invoice -- 623,270 rows
(
invoice_id character(22) NOT NULL,
ship_id character(22),
ship_via character varying(20),
term_id character(22),
promised_date date,
tax_exempt_id character(22),
customer_addr text,
ship_addr text,
comments text,
item_points numeric(14,0) NOT NULL,
disc_points numeric(14,0) NOT NULL,
CONSTRAINT invoice_pkey PRIMARY KEY (invoice_id),
CONSTRAINT invoice_c0 FOREIGN KEY (invoice_id)
REFERENCES gltx (gltx_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE CASCADE,
CONSTRAINT invoice_c1 FOREIGN KEY (ship_id)
REFERENCES customer (customer_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT invoice_c2 FOREIGN KEY (term_id)
REFERENCES term (term_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT invoice_c3 FOREIGN KEY (tax_exempt_id)
REFERENCES tax (tax_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
)
WITH (
OIDS=FALSE
);

Both tables have mostly writes, some updates, very few deletes.

Explain analyse: (http://explain.depesz.com/s/SYW)

Aggregate (cost=179199.52..179199.53 rows=1 width=10) (actual time=7520.922..7520.924 rows=1 loops=1)
-> Merge Join (cost=9878.78..177265.66 rows=386771 width=10) (actual time=104.651..6690.194 rows=361463 loops=1)
Merge Cond: (invoice.invoice_id = gltx.gltx_id)
-> Index Scan using invoice_pkey on invoice (cost=0.00..86222.54 rows=623273 width=33) (actual time=0.010..1316.507 rows=623273 loops=1)
-> Index Scan using gltx_pkey on gltx (cost=0.00..108798.53 rows=386771 width=23) (actual time=104.588..1822.886 rows=361464 loops=1)
Filter: ((gltx.inactive_on IS NULL) AND (gltx.posted = 'Y'::bpchar) AND (gltx.customer_id = 'A0ZQ2gsACIsEKLI638ikyg'::bpchar))
Total runtime: 7521.026 ms

PostgreSQL: 9.0.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-50), 64-bit - self compiled
Linux: Linux server.domain.lan 2.6.18-238.12.1.el5xen #1 SMP Tue May 31 13:35:45 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux

Hardware: single CPU: model name : Intel(R) Xeon(R) CPU E5335 @ 2.00GHz
RAM: 8GB
DB Size: 5876MB
HDs: Raid 1 Sata drives - dell PowerEdge 1900 - lower middle class server

Postgres config:
max_connections = 200 #it's a bit high I know, but most connections are idle
shared_buffers = 2048MB #
work_mem = 8MB # tried up to 32MB, but no diff
maintenance_work_mem = 16MB #
bgwriter_delay = 2000ms #
checkpoint_segments = 15 #
checkpoint_completion_target = 0.8 #
seq_page_cost = 5.0 #
random_page_cost = 2.5 #
effective_cache_size = 2048MB # just upgraded to 2GB. had another aggressive memory using program before, so did not want to have this high
log_destination = 'stderr' #
logging_collector = off #
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' #
log_rotation_age = 1d #
log_min_duration_statement = 10000 #
log_line_prefix='%t:%r:%u(at)%d:[%p]: ' #
track_activities = on
track_counts = on
track_activity_query_size = 1024 #
autovacuum = on #
autovacuum_max_workers = 5 #
datestyle = 'iso, mdy'
lc_messages = 'en_US.UTF-8' #
lc_monetary = 'en_US.UTF-8' #
lc_numeric = 'en_US.UTF-8' #
lc_time = 'en_US.UTF-8' #
default_text_search_config = 'pg_catalog.english'

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Antonio Rodriges 2011-09-23 20:03:57 [PERFORMANCE] Insights: fseek OR read_cluster?
Previous Message Hany ABOU-GHOURY 2011-09-23 03:43:04 Re: Fwd: [PERFORM] PG 9 adminstrations