Re: need to speed up query

From: Shaun Thomas <sthomas(at)leapfrogonline(dot)com>
To: Justin <justin(at)emproshunts(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: need to speed up query
Date: 2008-05-06 16:43:29
Message-ID: 1210092209.14833.32.camel@berners-lee
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Tue, 2008-05-06 at 03:01 +0100, Justin wrote:

> i've had to write queries to get trail balance values out of the GL
> transaction table and i'm not happy with its performance

Go ahead and give this a try:

SELECT p.period_id, p.period_start, p.period_end, a.accnt_id,
a.accnt_number, a.accnt_descrip, p.period_yearperiod_id,
a.accnt_type,
SUM(CASE WHEN g.gltrans_date < p.period_start
THEN g.gltrans_amount ELSE 0.0
END)::text::money AS beginbalance,
SUM(CASE WHEN g.gltrans_date < p.period_end
AND g.gltrans_date >= p.period_start
AND g.gltrans_amount <= 0::numeric
THEN g.gltrans_amount ELSE 0.0
END)::text::money AS negative,
SUM(CASE WHEN g.gltrans_date <= p.period_end
AND g.gltrans_date >= p.period_start
AND g.gltrans_amount >= 0::numeric
THEN g.gltrans_amount ELSE 0.0
END)::text::money AS positive,
SUM(CASE WHEN g.gltrans_date <= p.period_end
AND g.gltrans_date >= p.period_start
THEN g.gltrans_amount ELSE 0.0
END)::text::money AS difference,
SUM(CASE WHEN g.gltrans_date <= p.period_end
THEN g.gltrans_amount ELSE 0.0
END)::text::money AS endbalance,
FROM period p
CROSS JOIN accnt a
LEFT JOIN gltrans g ON (g.gltrans_accnt_id = a.accnt_id
AND g.gltrans_posted = true)
ORDER BY period.period_id, accnt.accnt_number;

Depending on how the planner saw your old query, it may have forced
several different sequence or index scans to get the information from
gltrans. One thing all of your subqueries had in common was a join on
the account id and listing only posted transactions. It's still a big
gulp, but it's only one gulp.

The other thing I did was that I guessed you added the coalesce clause
because the subqueries individually could return null rowsets for
various groupings, and you wouldn't want that. This left-join solution
only lets it add to your various sums if it matches all the conditions,
otherwise it falls through the list of cases until nothing matches. If
some of your transactions can have null amounts, you might consider
turning g.gltrans into COALESCE(g.gltrans, 0.0) instead.

Otherwise, this *might* work; without knowing more about your schema,
it's only a guess. I'm a little skeptical about the conditionless
cross-join, but whatever.

Either way, by looking at this query, it looks like some year-end
summary piece, or an at-a-glance idea of your account standings. The
problem you're going to have with this is that there's no way to truly
optimize this. One way or another, you're going to incur some
combination of three sequence scans or three index scans; if those
tables get huge, you're in trouble. You might want to consider a
denormalized summary table that contains this information (and maybe
more) maintained by a trigger or regularly invoked stored-procedure and
then you can select from *that* with much less agony.

Then there's fact-tables, but that's beyond the scope of this email. ;)

Good luck!

--

Shaun Thomas
Database Administrator

Leapfrog Online
807 Greenwood Street
Evanston, IL 60201
Tel. 847-440-8253
Fax. 847-570-5750
www.leapfrogonline.com

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Shaun Thomas 2008-05-06 16:43:40 Re: multiple joins + Order by + LIMIT query performance issue
Previous Message Antoine Baudoux 2008-05-06 16:42:34 Re: multiple joins + Order by + LIMIT query performance issue