From: | Oleg Lebedev <oleg(dot)lebedev(at)waterford(dot)org> |
---|---|
To: | "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com> |
Cc: | Josh Berkus <josh(at)agliodbs(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: TPC-R benchmarks |
Date: | 2003-10-02 22:27:29 |
Message-ID: | 993DBE5B4D02194382EC8DF8554A52731E783E@postoffice.waterford.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Thanks everyone for the help.
I have another question. How do I optimize my indexes for the query that
contains a lot of ORed blocks, each of which contains a bunch of ANDed
expressions? The structure of each ORed block is the same except the
right-hand-side values vary.
The first expression of each AND-block is a join condition. However,
postgres tries to use a sequential scan on both of the tables applying
the OR-ed blocks of ANDed expressions. So, the cost of the plan is
around 700,000,000,000.
Here is an example:
select
sum(l_extendedprice* (1 - l_discount)) as revenue
from
lineitem,
part
where
(
p_partkey = l_partkey
and p_brand = 'Brand#24'
and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM
PKG')
and l_quantity >= 4 and l_quantity <= 4 + 10
and p_size between 1 and 5
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON'
)
or
(
p_partkey = l_partkey
and p_brand = 'Brand#22'
and p_container in ('MED BAG', 'MED BOX', 'MED PKG',
'MED PACK')
and l_quantity >= 18 and l_quantity <= 18 + 10
and p_size between 1 and 10
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON'
)
or
(
p_partkey = l_partkey
and p_brand = 'Brand#33'
and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG
PKG')
and l_quantity >= 24 and l_quantity <= 24 + 10
and p_size between 1 and 15
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON'
);
-----Original Message-----
From: scott.marlowe [mailto:scott(dot)marlowe(at)ihs(dot)com]
Sent: Thursday, October 02, 2003 1:44 PM
To: Oleg Lebedev
Cc: Josh Berkus; pgsql-performance(at)postgresql(dot)org
Subject: RE: [PERFORM] TPC-R benchmarks
On Thu, 2 Oct 2003, Oleg Lebedev wrote:
> I was trying to get the pg_stats information to Josh and decided to
> recreate the indexes on all my tables. After that I ran vacuum full
> analyze, re-enabled nestloop and ran explain analyze on the query. It
> ran in about 2 minutes. I attached the new query plan. I am not sure
> what did the trick, but 2 minutes is much better than 2 hours. But
> then again, I can't take long lunches anymore :)
> Is there any way to make this query run even faster without increasing
> the memory dedicated to postgres?
> Thanks.
As long as the estimated row counts and real ones match up, and
postgresql
seems to be picking the right plan, there's probably not a lot to be
done.
You might want to look at increasing sort_mem a bit, but don't go crazy,
as being too high can result in swap storms under load, which are a very
bad thing.
I'd check for index growth. You may have been reloading your data over
and over and had an index growth problem. Next time instead of
recreating
the indexed completely, you might wanna try reindex indexname.
Also, 7.4 mostly fixes the index growth issue, especially as it applies
to
truncating/reloading a table over and over, so moving to 7.4 beta3/4 and
testing might be a good idea (if you aren't there already).
What you want to avoid is having postgresql switch back to that nestloop
join on you in the middle of the day, and to prevent that you might need
to have higher statistics targets so the planner gets the right number
all the time.
*************************************
This e-mail may contain privileged or confidential material intended for the named recipient only.
If you are not the named recipient, delete this message and all attachments.
Unauthorized reviewing, copying, printing, disclosing, or otherwise using information in this e-mail is prohibited.
We reserve the right to monitor e-mail sent through our network.
*************************************
From | Date | Subject | |
---|---|---|---|
Next Message | Dror Matalon | 2003-10-02 22:33:13 | Re: count(*) slow on large tables |
Previous Message | Christopher Browne | 2003-10-02 21:57:30 | Re: count(*) slow on large tables |