Re: limiting results makes the query slower

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>
Cc: psql performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: limiting results makes the query slower
Date: 2009-08-23 21:41:39
Message-ID: 603c8f070908231441n19440d94vef2e32e28e514076@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Thu, Aug 20, 2009 at 9:50 PM, Jaime
Casanova<jcasanov(at)systemguards(dot)com(dot)ec> wrote:
> in a web app we have a query that we want to show in limited results
> at a time, this one executes in 10 seconds if i use limit but executes
> in 300ms if i remove it.
> why is that happening? the query is using and index for avoiding the
> sort so the nestloop should go only for the first 20 records on
> tgen_persona, no?
> below some more info
>
> postgresql 8.3.7
> ram 32GB
> shared_buffers 8GB
> work_mem 8MB
>
> tgen_persona has 185732 records and tcom_invitacion is a partitioned
> (by dates: 1 month every partition) table and has more than 29million
> records in the partitions
>
> explain analyze here: http://explain.depesz.com/s/B4
>
> the situation improves if i disable nestloops, explain analyze with
> nestloop off here: http://explain.depesz.com/s/Jv
>
> select Per.razon_social as MAIL,inv.cata_esta_calificacion,
>       inv.observa_calificacion,
>       to_char(inv.fech_crea,'YYYY:MM:DD') as fech_crea,
>       case when (( select cod_estado FROM TPRO_PROVEEDOR
>                     WHERE id_proveedor = (select max(a.id_proveedor)
>                                             from tpro_proveedor a
>                                            where persona_id = Inv.persona_id )
>                  )='Habilitado')
>            then 'Habilitado'
>            else 'Deshabilitado'
>       end as empresa_id
>  from tgen_persona Per, tcom_invitacion Inv
>  where Per.persona_id = Inv.persona_id
>   and inv.id_soli_compra = '60505'
>  ORDER BY Per.razon_social asc limit 20 offset 0

This is pretty common. Tom Lane pointed out in a message I don't
feel like searching for right now that LIMIT tends to magnify the
effect of bad selectivity estimates. In this case, the join
selectivity is off by more than 3 orders of magnitude right here:

Nested Loop (cost=0.00..4280260.77 rows=8675 width=588) (actual
time=4835.934..11335.731 rows=2 loops=1)

I'm not familiar with how we estimate join selectivity in this case,
but it's obviously giving really, really wrong answers. The problem
may be here:

Append (cost=0.00..22.00 rows=23 width=560) (actual time=0.055..0.055
rows=0 loops=185732)

It appears that we're estimating 23 rows because we have 23
partitions, and we're estimating one row for each. There are a lot of
places in the planner where we round off to an integer with a floor of
1, which may be part of the problem here... but I don't know without
looking at the code.

...Robert

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Jeff Janes 2009-08-23 22:26:16 Re: [PERFORMANCE] how to set wal_buffers
Previous Message Jaime Casanova 2009-08-23 20:25:59 Re: [PERFORMANCE] how to set wal_buffers