Re: 7.3.1 New install, large queries are slow

From: "Roman Fail" <rfail(at)posportal(dot)com>
To: <josh(at)agliodbs(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: 7.3.1 New install, large queries are slow
Date: 2003-01-15 23:30:55
Message-ID: 9B1C77393DED0D4B9DAA1AA1742942DA0E4BFD@pos_pdc.posportal.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Thanks to everyone for the quick replies! I'm sure that my lack of skill with SQL queries is the main problem. What's strange to me is how MSSQL takes my bad queries and makes them look good anyway. It must have a real smart planner.

Several changes: shared_buffers = 131072, sort_mem = 32768, shmmax = 2097152000, shmall = 131072000. I couldn't find any info out there on the relationship between shmmax and shmall, so I just preserved the ratio from the RedHat defaults (1:16). As far as sort_mem goes, I expect to be running no more than 3 concurrent queries and they will all be just as complex as this one. Do you think sort_mem=32768 is a reasonable size? None of these changes seemed to help speed up things however.

REINDEX INDEX batchdetail_ix_tranamount_idx; was executed successfully, although it took 15 minutes.
ANALYZE executed in 2 minutes, even though I increased default_statistics_target = 30. Should I increase it even more? I don't mind the extra overhead each night if it will make my queries faster. (Idiot check: I did actually stop and start the postmaster after changing all these settings).

Andrew Sullivan wrote:
>First, the performance of foreign keys is flat-out awful in Postgres.
>I suggest avoiding them if you can.

I don't have any problem getting rid of FKs, especially if it might actually help performance. The nightly data import is well-defined and should always observe referential integrity, so I guess the db doesn't really need to enforce it. In MSSQL, adding FKs was supposed to actually benefit SELECT performance. Is it pretty much universally accepted that I should drop all my foreign keys?

>Second, ordering joins explicitly (with the JOIN keyword) constrains
>the planner, and may select bad plan. The explain analyse output
>was nice, but I didn't see the query, so I can't tell what the plan
>maybe ought to be.

I think this is the most likely problem. I've read through Chapter 10 of the 7.3 docs, but I still don't feel like I know what would be a good order. How do you learn this stuff anyway? Trial and error?

>Third, I didn't see any suggestion that you'd moved the WAL onto its
>own disk. That will mostly help when you are under write load;

I don't think I'm going to bother with moving the WAL....the write load during the day is very, very light (when queries are run). Disk I/O is clearly not the limiting factor (yet!).

So here's the query, and another EXPLAIN ANALYZE to go with it (executed after all setting changes). The same result columns and JOINS are performed all day with variations on the WHERE clause; other possible search columns are the ones that are indexed (see below). The 4 tables that use LEFT JOIN only sometimes have matching records, hence the OUTER join.

EXPLAIN ANALYZE
SELECT b.batchdate, d.batchdetailid, t.bankno, d.trandate, d.tranamount,
d.submitinterchange, d.authamount, d.authno, d.cardtypeid, d.mcccode,
m.name AS merchantname, c.cardtype, m.merchid,
p1.localtaxamount, p1.productidentifier, dr.avsresponse,
cr.checkoutdate, cr.noshowindicator, ck.checkingacctno,
ck.abaroutingno, ck.checkno
FROM tranheader t
INNER JOIN batchheader b ON t.tranheaderid = b.tranheaderid
INNER JOIN merchants m ON m.merchantid = b.merchantid
INNER JOIN batchdetail d ON d.batchid = b.batchid
INNER JOIN cardtype c ON d.cardtypeid = c.cardtypeid
LEFT JOIN purc1 p1 ON p1.batchdetailid = d.batchdetailid
LEFT JOIN direct dr ON dr.batchdetailid = d.batchdetailid
LEFT JOIN carrental cr ON cr.batchdetailid = d.batchdetailid
LEFT JOIN checks ck ON ck.batchdetailid = d.batchdetailid
WHERE t.clientid = 6
AND d.tranamount BETWEEN 500.0 AND 700.0
AND b.batchdate > '2002-12-15'
AND m.merchid = '701252267'
ORDER BY b.batchdate DESC
LIMIT 50

Limit (cost=1829972.39..1829972.39 rows=1 width=285) (actual time=1556497.79..1556497.80 rows=5 loops=1)
-> Sort (cost=1829972.39..1829972.39 rows=1 width=285) (actual time=1556497.78..1556497.79 rows=5 loops=1)
Sort Key: b.batchdate
-> Nested Loop (cost=1771874.32..1829972.38 rows=1 width=285) (actual time=1538783.03..1556486.64 rows=5 loops=1)
Join Filter: ("inner".batchdetailid = "outer".batchdetailid)
-> Nested Loop (cost=1771874.32..1829915.87 rows=1 width=247) (actual time=1538760.60..1556439.67 rows=5 loops=1)
Join Filter: ("inner".batchdetailid = "outer".batchdetailid)
-> Nested Loop (cost=1771874.32..1829915.86 rows=1 width=230) (actual time=1538760.55..1556439.50 rows=5 loops=1)
Join Filter: ("inner".batchdetailid = "outer".batchdetailid)
-> Nested Loop (cost=1771874.32..1829915.85 rows=1 width=221) (actual time=1538760.51..1556439.31 rows=5 loops=1)
Join Filter: ("inner".batchdetailid = "outer".batchdetailid)
-> Nested Loop (cost=1771874.32..1773863.81 rows=1 width=202) (actual time=1529153.84..1529329.65 rows=5 loops=1)
Join Filter: ("outer".cardtypeid = "inner".cardtypeid)
-> Merge Join (cost=1771874.32..1773862.58 rows=1 width=188) (actual time=1529142.55..1529317.99 rows=5 loops=1)
Merge Cond: ("outer".batchid = "inner".batchid)
-> Sort (cost=116058.42..116058.43 rows=3 width=118) (actual time=14184.11..14184.14 rows=17 loops=1)
Sort Key: b.batchid
-> Hash Join (cost=109143.44..116058.39 rows=3 width=118) (actual time=12398.29..14184.03 rows=17 loops=1)
Hash Cond: ("outer".merchantid = "inner".merchantid)
-> Merge Join (cost=109137.81..114572.94 rows=295957 width=40) (actual time=12359.75..13848.67 rows=213387 loops=1)
Merge Cond: ("outer".tranheaderid = "inner".tranheaderid)
-> Index Scan using tranheader_ix_tranheaderid_idx on tranheader t (cost=0.00..121.15 rows=1923 width=16) (actual time=0.17..10.91 rows=1923 loops=1)
Filter: (clientid = 6)
-> Sort (cost=109137.81..109942.73 rows=321966 width=24) (actual time=12317.83..12848.43 rows=329431 loops=1)
Sort Key: b.tranheaderid
-> Seq Scan on batchheader b (cost=0.00..79683.44 rows=321966 width=24) (actual time=29.93..10422.75 rows=329431 loops=1)
Filter: (batchdate > '2002-12-15 00:00:00'::timestamp without time zone)
-> Hash (cost=5.63..5.63 rows=1 width=78) (actual time=21.06..21.06 rows=0 loops=1)
-> Index Scan using merchants_ix_merchid_idx on merchants m (cost=0.00..5.63 rows=1 width=78) (actual time=21.05..21.05 rows=1 loops=1)
Index Cond: (merchid = '701252267'::character varying)
-> Sort (cost=1655815.90..1656810.15 rows=397698 width=70) (actual time=1513860.73..1514497.92 rows=368681 loops=1)
Sort Key: d.batchid
-> Index Scan using batchdetail_ix_tranamount_idx on batchdetail d (cost=0.00..1597522.38 rows=397698 width=70) (actual time=14.05..1505397.17 rows=370307 loops=1)
Index Cond: ((tranamount >= 500.0) AND (tranamount <= 700.0))
-> Seq Scan on cardtype c (cost=0.00..1.10 rows=10 width=14) (actual time=2.25..2.28 rows=10 loops=5)
-> Seq Scan on purc1 p1 (cost=0.00..44285.35 rows=941335 width=19) (actual time=2.40..3812.43 rows=938770 loops=5)
-> Seq Scan on direct dr (cost=0.00..0.00 rows=1 width=9) (actual time=0.00..0.00 rows=0 loops=5)
-> Seq Scan on carrental cr (cost=0.00..0.00 rows=1 width=17) (actual time=0.00..0.00 rows=0 loops=5)
-> Seq Scan on checks ck (cost=0.00..40.67 rows=1267 width=38) (actual time=0.50..7.05 rows=1267 loops=5)
Total runtime: 1556553.76 msec


Tomasz Myrta wrote:
>Seq Scan on batchheader b (cost=0.00..79587.23 rows=308520 width=56)
>Can you write what condition and indexes does batchheader have?

batchheader has 2.6 million records:
CREATE TABLE public.batchheader (
batchid int8 DEFAULT nextval('"batchheader_batchid_key"'::text) NOT NULL,
line int4,
tranheaderid int4,
merchantid int4,
batchdate timestamp,
merchref char(16),
carryindicator char(1),
assocno varchar(6),
merchbankno char(4),
debitcredit char(1),
achpostdate timestamp,
trancode char(4),
netdeposit numeric(18, 4),
CONSTRAINT batchheader_ix_batchid_idx UNIQUE (batchid),
CONSTRAINT batchheader_pkey PRIMARY KEY (batchid),
CONSTRAINT fk_bh_th FOREIGN KEY (tranheaderid) REFERENCES tranheader (tranheaderid) ON DELETE RESTRICT ON UPDATE NO ACTION NOT DEFERRABLE INITIALLY IMMEDIATE
) WITH OIDS;
CREATE UNIQUE INDEX batchheader_ix_batchid_idx ON batchheader USING btree (batchid);
CREATE INDEX batchheader_ix_batchdate_idx ON batchheader USING btree (batchdate);
CREATE INDEX batchheader_ix_merchantid_idx ON batchheader USING btree (merchantid);
CREATE INDEX batchheader_ix_merchref_idx ON batchheader USING btree (merchref);
CREATE INDEX batchheader_ix_netdeposit_idx ON batchheader USING btree (netdeposit);

And here's batchdetail too, just for kicks. 23 million records.
CREATE TABLE public.batchdetail (
batchdetailid int8 DEFAULT nextval('public.batchdetail_batchdetailid_seq'::text) NOT NULL,
line int4,
batchid int4,
merchno varchar(16),
assocno varchar(6),
refnumber char(23),
trandate timestamp,
tranamount numeric(18, 4),
netdeposit numeric(18, 4),
cardnocfb bytea,
bestinterchange char(2),
submitinterchange char(2),
downgrader1 char(4),
downgrader2 char(4),
downgrader3_1 char(1),
downgrader3_2 char(1),
downgrader3_3 char(1),
downgrader3_4 char(1),
downgrader3_5 char(1),
downgrader3_6 char(1),
downgrader3_7 char(1),
onlineentry char(1),
achflag char(1),
authsource char(1),
cardholderidmeth char(1),
catindicator char(1),
reimbattribute char(1),
motoindicator char(1),
authcharind char(1),
banknetrefno char(9),
banknetauthdate char(6),
draftaflag char(1),
authcurrencycode char(3),
authamount numeric(18, 4),
validcode char(4),
authresponsecode char(2),
debitnetworkid char(3),
switchsetindicator char(1),
posentrymode char(2),
debitcredit char(1),
reversalflag char(1),
merchantname varchar(25),
authno char(6),
rejectreason char(4),
cardtypeid int4,
currencycode char(3),
origtranamount numeric(18, 4),
foreigncard char(1),
carryover char(1),
extensionrecord char(2),
mcccode char(4),
terminalid char(8),
submitinterchange3b char(3),
purchaseid varchar(25),
trancode char(4),
CONSTRAINT batchdetail_pkey PRIMARY KEY (batchdetailid)
) WITH OIDS;
CREATE INDEX batchdetail_ix_authno_idx ON batchdetail USING btree (authno);
CREATE INDEX batchdetail_ix_batchdetailid_idx ON batchdetail USING btree (batchdetailid);
CREATE INDEX batchdetail_ix_cardnocfb_idx ON batchdetail USING btree (cardnocfb);
CREATE INDEX batchdetail_ix_posentrymode_idx ON batchdetail USING btree (posentrymode);
CREATE INDEX batchdetail_ix_submitinterchange3b_idx ON batchdetail USING btree (submitinterchange3b);
CREATE INDEX batchdetail_ix_tranamount_idx ON batchdetail USING btree (tranamount);

Roman Fail
Sr. Web Application Developer
POS Portal, Inc.
Sacramento, CA


Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Kevin Brown 2003-01-16 02:05:27 Re: 7.3.1 New install, large queries are slow
Previous Message Hannu Krosing 2003-01-15 20:42:59 Re: 7.3.1 New install, large queries are slow