Re: PostgreSQL 10.0 SELECT LIMIT performance problem

From: Victor Yegorov <vyegorov(at)gmail(dot)com>
To: Mareks(dot)Kalnacs(at)datakom(dot)lv
Cc: pgsql-bugs(at)postgresql(dot)org, Maris(dot)Rucis(at)datakom(dot)lv
Subject: Re: PostgreSQL 10.0 SELECT LIMIT performance problem
Date: 2018-09-12 12:31:50
Message-ID: CAGnEbohpDWcJbuCJexRWW-XsjmPVH=ZYoWn+b8zs5a3xBcawkQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

ср, 12 сент. 2018 г. в 11:39, Mareks Kalnačs <Mareks(dot)Kalnacs(at)datakom(dot)lv>:

> We have some serious performance problem with SELECTS when add limit, for
> example, execute time without limit ~250msec (316 rows returned), when add
> limit 20, execute time 15 – 50secs.
>

Hi, Mareks.

As this is not actually a bug, it's better to use pgsql-performance or
pgsql-general for such questions next time.

We have select with subselect filter:
>
> select a.id, a.jdata
>
> from oss_alarms a
>
> where
>
> a.jdata->>'dn' in
>
> (
>
> select o.jdata->>'ossDn'
>
> from oss_objects o, tvc_entity e
>
> where e.jtype='object'
>
> and
> o.jdata->>'sid'=e.jdata->>'siteId'
>
> and
> tvc_unaccent(e.jdata->>'name') like tvc_unaccent('%zaube%')
>
> )
>
> order by (tvc_convert_array_to_date(a.jdata -> 'alarmTime')) desc
>
> limit 20;
>

As execution plan without limit shows, your join conditions yield 316 rows,
explicit sort is fast for this amount

For the plan with the limit, planner prefers to use
`idx_oss_alarms_alarm_time` index, to avoid extra sort.
This leads to:
Rows Removed by Join Filter: 10717797

I.e. you're reading 10M rows via index scan and later throw them away, as
they do not match your join condition:
Join Filter: ((a.jdata ->> 'dn'::text) = (o.jdata ->>
'ossDn'::text))

It looks like a.jdata->'dn' and a.jdata->'alarmTime' are correlated,
although planner doesn't knows that.
PostgreSQL 10 has `CREATE STATISTICS`, but it works on table columns, not
expressions.
You can try disabling `idx_oss_alarms_alarm_time` by using expression in
`ORDER BY`:

order by (idx_oss_alarms_alarm_timetvc_convert_array_to_date(a.jdata ->
'alarmTime'))+INTERVAL '0' desc

But be warned — this can help in some situations and make things worse in
others.

I would recommend to move JOIN and ORDER BY columns out of JSON and make
them direct table columns.

--
Victor Yegorov

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Pavel Stehule 2018-09-12 12:43:06 Re: PostgreSQL 10.0 SELECT LIMIT performance problem
Previous Message Mareks Kalnačs 2018-09-12 08:31:50 PostgreSQL 10.0 SELECT LIMIT performance problem