Re: Nested loop performance

From: "Nick Fankhauser" <nickf(at)ontko(dot)com>
To: "Richard Poole" <richard(at)ruthie(dot)org>, "Pgsql-Performance(at)Postgresql(dot) Org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Nested loop performance
Date: 2003-12-17 15:26:25
Message-ID: NEBBLAAHGLEEPCGOBHDGMEPAKBAA.nickf@ontko.com
Lists: pgsql-performance

> It seems that your basic problem is that you're fetching lots of rows
> from two big ol' tables.

> It doesn't seem to me that there would be a substantially better plan
> for this query with your tables as they stand.

That's more or less the conclusion I had come to. I was just hoping someone
else could point out an approach I've been missing. (sigh!)

> If your data were more
> normalised, then your big scans might be quicker (because their rows
> would be smaller so they would hit fewer disk pages),

This started off as a 5-table join on well-normalized data. Unfortunately,
the actor table doesn't get any smaller, and the work involved in
calculating the "case_count" information on the fly was clearly becoming a
problem, particularly with actors that had a heavy caseload (busy attorneys
and judges). The actor_summary approach makes these previous problem cases
go away, but the payback is that (as you correctly pointed out) queries on
average citizens who only have one case suffer from the de-normalized
approach.

We're currently considering the approach of just returning all of the rows
to our application, and doing the aggregation and limit work in the app. The
inconsistency of the data makes it very tough for the query planner to come
up with a strategy that is always a winner.
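For what it's worth, the app-side idea amounts to something like the sketch
below: pull the raw (actor, case) rows out of the database, count cases per
actor in the application, and apply the limit there. The row shape and names
here are hypothetical, just to illustrate the shape of the work being moved
out of the planner's hands:

```python
from collections import Counter

def top_actors(rows, limit):
    """Aggregate case counts per actor in the application and apply
    the limit there, rather than asking the planner to pick a plan
    that works for both heavy and light caseloads.

    rows: iterable of (actor_id, case_id) pairs (hypothetical shape).
    Returns up to `limit` (actor_id, case_count) pairs, busiest first.
    """
    counts = Counter(actor_id for actor_id, _case_id in rows)
    return counts.most_common(limit)

# Example: one busy actor, one light one.
rows = [("smith", 1), ("smith", 2), ("jones", 3), ("smith", 4)]
print(top_actors(rows, 1))  # -> [('smith', 3)]
```

The tradeoff is the same one you noted for the summary table: you pay the
cost of shipping all the rows to the client, but the aggregation cost
becomes predictable regardless of how skewed the caseloads are.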

Thanks for your thoughts!

-Nick
