Re: Join Query Performance Issue

From: Thomas Zaksek <zaksek(at)ptt(dot)uni-due(dot)de>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Join Query Performance Issue
Date: 2008-02-12 10:11:11
Message-ID: 47B170BF.5000206@ptt.uni-due.de
Lists: pgsql-performance

Scott Marlowe wrote:
> On Feb 11, 2008 12:08 PM, Thomas Zaksek <zaksek(at)ptt(dot)uni-due(dot)de> wrote:
>
>> I have serious performance problems with the following type of queries:
>>
>> explain analyse SELECT '12.11.2007 18:04:00 UTC' AS zeit,
>> 'M' AS datatyp,
>> p.zs_nr AS zs_de,
>> j_ges,
>> de_mw_abh_j_lkw(mw_abh) AS j_lkw,
>> de_mw_abh_v_pkw(mw_abh) AS v_pkw,
>> de_mw_abh_v_lkw(mw_abh) AS v_lkw,
>> de_mw_abh_p_bel(mw_abh) AS p_bel
>> FROM messpunkt p, messungen_v_dat_2007_11_12 m, de_mw w
>> WHERE m.ganglinientyp = 'M'
>> AND 381 = m.minute_tag
>> AND (p.nr, p.mw_nr) = (m.messpunkt, w.nr);
>>
>> Explain analyze returns
>>
>> Nested Loop  (cost=0.00..50389.39 rows=3009 width=10) (actual time=0.503..320.872 rows=2189 loops=1)
>>   ->  Nested Loop  (cost=0.00..30668.61 rows=3009 width=8) (actual time=0.254..94.116 rows=2189 loops=1)
>>
>
> This nested loop is using up most of your time.  Try increasing
> work_mem and see if it chooses a better join plan, and/or turn off
> nested loops for a moment and see if that helps:
>
>   set enable_nestloop = off
>
> Note that set enable_xxx = off is kind of a hammer-to-the-forebrain
> setting.  It's not subtle, and the planner can't work around it.  So
> use them with caution.  That said, I had one reporting query that
> simply wouldn't run fast without turning off nested loops for that
> one.  But don't turn off nested loops universally; they are still a
> good choice for smaller amounts of data.
>
I tried turning off nestloop, but with terrible results:

Hash Join  (cost=208328.61..228555.14 rows=3050 width=10) (actual time=33421.071..40362.136 rows=2920 loops=1)
  Hash Cond: (w.nr = p.mw_nr)
  ->  Seq Scan on de_mw w  (cost=0.00..14593.79 rows=891479 width=10) (actual time=0.012..3379.971 rows=891479 loops=1)
  ->  Hash  (cost=208290.49..208290.49 rows=3050 width=8) (actual time=33420.877..33420.877 rows=2920 loops=1)
        ->  Merge Join  (cost=5303.71..208290.49 rows=3050 width=8) (actual time=31.550..33407.688 rows=2920 loops=1)
              Merge Cond: (p.nr = m.messpunkt)
              ->  Index Scan using messpunkt_nr_idx on messpunkt p  (cost=0.00..238879.39 rows=6306026 width=12) (actual time=0.056..17209.317 rows=4339470 loops=1)
              ->  Sort  (cost=5303.71..5311.34 rows=3050 width=4) (actual time=25.973..36.858 rows=2920 loops=1)
                    Sort Key: m.messpunkt
                    ->  Index Scan using messungen_v_dat_2007_11_12_messpunkt_minute_tag_idx on messungen_v_dat_2007_11_12 m  (cost=0.00..5127.20 rows=3050 width=4) (actual time=0.124..12.822 rows=2920 loops=1)
                          Index Cond: ((ganglinientyp = 'M'::bpchar) AND (651 = minute_tag))
Total runtime: 40373.512 ms
(12 rows)
Looks crappy, doesn't it?
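
In case it matters: I only disabled nested loops for this one test, not in
postgresql.conf, roughly along the lines of the following sketch (SET LOCAL
keeps the override inside a single transaction, so nothing else is affected):

BEGIN;
SET LOCAL enable_nestloop = off;
EXPLAIN ANALYSE SELECT ... ;  -- same query as above
COMMIT;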

I also tried to increase work_mem; the config is now
work_mem = 4MB
maintenance_work_mem = 128MB,
but with regard to performance it wasn't effective at all.
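
To experiment further I could also try larger values per session before
touching the config again, something like this sketch (the 32MB figure is
just an example value, not what is actually configured):

SET work_mem = '32MB';   -- session only, overrides the 4MB from postgresql.conf
EXPLAIN ANALYSE SELECT ... ;
RESET work_mem;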

PostgreSQL runs on an HP server with dual Opterons and 3GB of RAM; what
are good settings here? The database will have to work with tables of
several tens of millions of rows, but only a few columns each, and no
more than ~5 clients will access the database at the same time.
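
The settings I assume matter most for a box like this are shared_buffers,
effective_cache_size and work_mem; as a rough sketch of what I would try on
a 3GB machine (the numbers are guesses on my part, not tested values):

shared_buffers = 512MB          # guess: roughly 1/6 of RAM
effective_cache_size = 2GB      # guess: what the OS cache can hold
work_mem = 16MB                 # guess: per sort/hash node, with ~5 clients
maintenance_work_mem = 128MB    # unchanged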
