Problem with large query

From: Marc Cousin <mcousin(at)sigma(dot)fr>
To: pgsql-performance(at)postgresql(dot)org
Subject: Problem with large query
Date: 2004-09-08 13:49:49
Message-ID: 200409081549.49537.mcousin@sigma.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi. I hope I'm not asking a too trivial question here...

I'm having trouble with a (quite big) query, and can't find a way to make it
faster.

Here is the information :

Tables :
============================
sces_vte -> 2753539 rows
sces_art -> 602327
sces_fsf -> 8126
sces_frc -> 7763
sces_tps -> 38
sces_gtr -> 35

Query :
===========================
SELECT
sces_gtr_art.gtr_cod,
sces_gtr_art.gtr_lib,
sces_frc_art.fou_cod,
sces_frc_art.fou_lib,
sces_tps.tps_annee_mois,
TO_NUMBER('200401','999999'),
TO_NUMBER('200405','999999'),
sces_tps.tps_libc,
sum(sces_vte.vte_mnt),
sum(sces_vte.vte_qte),
sum(sces_vte.vte_ton),
sces_famille.fsf_codfam,
sces_famille.fsf_lib,
sces_s_famille.fsf_codsfm,
sces_s_famille.fsf_lib
FROM
sces_vte,
sces_art,
sces_fsf sces_famille,
sces_fsf sces_s_famille,
sces_frc sces_frc_art,
sces_tps,
sces_gtr sces_gtr_art
WHERE
( sces_famille.fsf_codfam=sces_s_famille.fsf_codfam )
AND ( sces_famille.fsf_codseg= 0 and sces_famille.fsf_codsfm = 0 )
AND ( sces_vte.tps_annee_mois=sces_tps.tps_annee_mois )
AND ( sces_vte.art_cod=sces_art.art_cod and
sces_vte.dos_cod=sces_art.dos_cod )
AND ( sces_gtr_art.gtr_cod=sces_frc_art.gtr_cod )
AND ( sces_frc_art.gtr_cod=sces_art.gtr_cod and
sces_frc_art.fou_cod=sces_art.fou_cod )
AND ( sces_s_famille.fsf_codfam=sces_art.fsf_codfam and
sces_s_famille.fsf_codsfm=sces_art.fsf_codsfm )
AND ( sces_s_famille.fsf_codseg = 0 )
AND (
( ( ( sces_tps.tps_annee_mois ) >= ( TO_NUMBER('200401','999999') ) and
( sces_tps.tps_annee_mois ) <= (
TO_NUMBER('200405','999999') )
)
OR
(
( sces_tps.tps_annee_mois ) >= ( TO_NUMBER('200401','999999') )-100 and
( sces_tps.tps_annee_mois ) <= (
TO_NUMBER('200405','999999') )-100
) )
AND ( sces_gtr_art.gtr_cod in (2))
)
GROUP BY
sces_gtr_art.gtr_cod,
sces_gtr_art.gtr_lib,
sces_frc_art.fou_cod,
sces_frc_art.fou_lib,
sces_tps.tps_annee_mois,
TO_NUMBER('200401','999999'),
TO_NUMBER('200405','999999'),
sces_tps.tps_libc,
sces_famille.fsf_codfam,
sces_famille.fsf_lib,
sces_s_famille.fsf_codsfm,
sces_s_famille.fsf_lib

Explain Analyze Plan :
====================================
GroupAggregate (cost=27161.91..27938.72 rows=16354 width=280) (actual time=484509.210..544436.148 rows=4115 loops=1)
-> Sort (cost=27161.91..27202.79 rows=16354 width=280) (actual time=484496.188..485334.151 rows=799758 loops=1)
Sort Key: sces_gtr_art.gtr_cod, sces_gtr_art.gtr_lib, sces_frc_art.fou_cod, sces_frc_art.fou_lib, sces_tps.tps_annee_mois, 200401::numeric, 200405::numeric, sces_tps.tps_libc, sces_famille.fsf_codfam, sces_famille.fsf_lib, sces_s_famille.fsf_codsfm, sces_s_famille.fsf_lib
-> Merge Join (cost=25727.79..26017.34 rows=16354 width=280) (actual time=58945.821..69321.146 rows=799758 loops=1)
Merge Cond: (("outer".fsf_codfam = "inner".fsf_codfam) AND ("outer".fsf_codsfm = "inner".fsf_codsfm))
-> Sort (cost=301.36..304.60 rows=1298 width=83) (actual time=27.926..28.256 rows=332 loops=1)
Sort Key: sces_s_famille.fsf_codfam, sces_s_famille.fsf_codsfm
-> Seq Scan on sces_fsf sces_s_famille (cost=0.00..234.24 rows=1298 width=83) (actual time=0.042..19.124 rows=1341 loops=1)
Filter: (fsf_codseg = 0::numeric)
-> Sort (cost=25426.43..25448.05 rows=8646 width=225) (actual time=58917.106..59693.810 rows=799758 loops=1)
Sort Key: sces_art.fsf_codfam, sces_art.fsf_codsfm
-> Merge Join (cost=24726.32..24861.08 rows=8646 width=225) (actual time=19036.709..29404.943 rows=799758 loops=1)
Merge Cond: ("outer".tps_annee_mois = "inner".tps_annee_mois)
-> Sort (cost=2.49..2.53 rows=17 width=23) (actual time=0.401..0.428 rows=20 loops=1)
Sort Key: sces_tps.tps_annee_mois
-> Seq Scan on sces_tps (cost=0.00..2.14 rows=17 width=23) (actual time=0.068..0.333 rows=20 loops=1)
Filter: (((tps_annee_mois >= 200301::numeric) OR (tps_annee_mois >= 200401::numeric)) AND ((tps_annee_mois <= 200305::numeric) OR (tps_annee_mois >= 200401::numeric)) AND ((tps_annee_mois >= 200301::numeric) OR (tps_annee_mois <= 200405::numeric)) AND ((tps_annee_mois <= 200305::numeric) OR (tps_annee_mois <= 200405::numeric)))
-> Sort (cost=24723.83..24747.97 rows=9656 width=214) (actual time=19036.223..19917.214 rows=799757 loops=1)
Sort Key: sces_vte.tps_annee_mois
-> Nested Loop (cost=21825.09..24084.74 rows=9656 width=214) (actual time=417.603..8644.294 rows=399879 loops=1)
-> Nested Loop (cost=21825.09..21837.50 rows=373 width=195) (actual time=417.444..672.741 rows=14158 loops=1)
-> Seq Scan on sces_gtr sces_gtr_art (cost=0.00..1.44 rows=1 width=40) (actual time=0.026..0.085 rows=1 loops=1)
Filter: (gtr_cod = 2::numeric)
-> Merge Join (cost=21825.09..21832.34 rows=373 width=165) (actual time=417.400..568.247 rows=14158 loops=1)
Merge Cond: ("outer".fsf_codfam = "inner".fsf_codfam)
-> Sort (cost=255.24..255.30 rows=24 width=74) (actual time=16.597..16.692 rows=106 loops=1)
Sort Key: sces_famille.fsf_codfam
-> Seq Scan on sces_fsf sces_famille (cost=0.00..254.69 rows=24 width=74) (actual time=0.029..15.971 rows=155 loops=1)
Filter: ((fsf_codseg = 0::numeric) AND (fsf_codsfm = 0::numeric))
-> Sort (cost=21569.85..21571.64 rows=715 width=91) (actual time=400.631..416.871 rows=14162 loops=1)
Sort Key: sces_art.fsf_codfam
-> Nested Loop (cost=0.00..21535.95 rows=715 width=91) (actual time=1.320..230.975 rows=14162 loops=1)
-> Seq Scan on sces_frc sces_frc_art (cost=0.00..182.75 rows=728 width=51) (actual time=1.195..14.316 rows=761 loops=1)
Filter: (2::numeric = gtr_cod)
-> Index Scan using ind_art_02 on sces_art (cost=0.00..29.24 rows=7 width=61) (actual time=0.040..0.160 rows=19 loops=761)
Index Cond: ((2::numeric = sces_art.gtr_cod) AND ("outer".fou_cod = sces_art.fou_cod))
-> Index Scan using idx_vte_02 on sces_vte (cost=0.00..6.01 rows=1 width=62) (actual time=0.037..0.259 rows=28 loops=14158)
Index Cond: ((sces_vte.art_cod = "outer".art_cod) AND (sces_vte.dos_cod = "outer".dos_cod))
Total runtime: 545435.989 ms

From what I understand from the plan, the worst part of it is the sort. Is there a way I can improve this query ?
(Obviously, as it has many rows, it will still be a slow query, but here it's too slow for us...).

I allready extended the sort_mem (up to 500 MB to be sure, the server has plenty of RAM),
the query has become faster, but I don't know what else to do, to speed up the sort.

BTW the query was generated, not written. We allready are trying to write something better, but we are still facing a
gigantic sort at the end (we need the group by, and there are many lines from the main table (sces_vte) to be retrieved and aggregated)...

Thanks in advance...

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2004-09-08 14:40:43 Re: Problem with large query
Previous Message Jeremy M. Guthrie 2004-09-07 21:32:03 Stuck using Sequential Scan