Re:

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Yevhenii Kurtov <yevhenii(dot)kurtov(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re:
Date: 2017-06-28 07:43:25
Message-ID: CAFj8pRBqvyQrEK-msP5DO7aame+Sfc_yCuxRUDQ+xxhXM3tdfQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

2017-06-28 9:28 GMT+02:00 Yevhenii Kurtov <yevhenii(dot)kurtov(at)gmail(dot)com>:

> Hello Pavel,
>
> Can you please give a tip how to rewrite the query with UNION clause? I
> didn't use it at all before actually and afraid that will not get it
> properly from the first time :)
>

SELECT c0."id" FROM "campaign_jobs" AS c0
WHERE (((c0."status" = $1) AND NOT (c0."id" = ANY($2))))
UNION SELECT c0."id" FROM "campaign_jobs" AS c0
WHERE ((c0."status" = $3) AND (c0."failed_at" > $4))
UNION SELECT c0."id" FROM "campaign_jobs" AS c0
WHERE ((c0."status" = $5) AND (c0."started_at" < $6))
ORDER BY c0."priority" DESC, c0."times_failed"
LIMIT $7
FOR UPDATE SKIP LOCKED

Something like this

Pavel

>
> On Wed, Jun 28, 2017 at 2:12 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
> wrote:
>
>>
>>
>> 2017-06-28 8:47 GMT+02:00 Yevhenii Kurtov <yevhenii(dot)kurtov(at)gmail(dot)com>:
>>
>>> Hello,
>>>
>>> We have a query that is run almost each second and it's very important
>>> to squeeze every other ms out of it. The query is:
>>>
>>> SELECT c0."id" FROM "campaign_jobs" AS c0
>>> WHERE (((c0."status" = $1) AND NOT (c0."id" = ANY($2))))
>>> OR ((c0."status" = $3) AND (c0."failed_at" > $4))
>>> OR ((c0."status" = $5) AND (c0."started_at" < $6))
>>> ORDER BY c0."priority" DESC, c0."times_failed"
>>> LIMIT $7
>>> FOR UPDATE SKIP LOCKED
>>>
>>> I added following index:
>>>
>>> CREATE INDEX ON campaign_jobs(id, status, failed_at, started_at,
>>> priority DESC, times_failed);
>>>
>>> And it didn't help at all, even opposite - the planning phase time grew
>>> up from ~2ms up to ~40 ms leaving execution time intact:
>>>
>>> Limit (cost=29780.02..29781.27 rows=100 width=18) (actual
>>> time=827.753..828.113 rows=100 loops=1)
>>> -> LockRows (cost=29780.02..32279.42 rows=199952 width=18) (actual
>>> time=827.752..828.096 rows=100 loops=1)
>>> -> Sort (cost=29780.02..30279.90 rows=199952 width=18)
>>> (actual time=827.623..827.653 rows=100 loops=1)
>>> Sort Key: priority DESC, times_failed
>>> Sort Method: external sort Disk: 5472kB
>>> -> Seq Scan on campaign_jobs c0 (cost=0.00..22138.00
>>> rows=199952 width=18) (actual time=1.072..321.410 rows=200000 loops=1)
>>> Filter: (((status = 0) AND (id <> ALL
>>> ('{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
>>> 23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,
>>> 43,44,45,46,47,48}'::integer[]))) OR ((status = 2) AND (failed_at >
>>> '2017-06-22 03:18:09'::timestamp without time zone)) OR ((status = 1) AND
>>> (started_at < '2017-06-23 03:11:09'::timestamp without time zone)))
>>> Planning time: 40.734 ms
>>> Execution time: 913.638 ms
>>> (9 rows)
>>>
>>>
>>> I see that query still went through the Seq Scan instead of Index Scan.
>>> Is it due to poorly crafted index or because of query structure? Is it
>>> possible to make this query faster?
>>>
>>
>> There are few issues
>>
>> a) parametrized LIMIT
>> b) complex predicate with lot of OR
>> c) slow external sort
>>
>> b) signalize maybe some strange in design .. try to replace "OR" by
>> "UNION" query
>> c) if you can and you have good enough memory .. try to increase work_mem
>> .. maybe 20MB
>>
>> if you change query to union queries, then you can use conditional indexes
>>
>> create index(id) where status = 0;
>> create index(failed_at) where status = 2;
>> create index(started_at) where status = 1;
>>
>> Regards
>>
>> Pavel
>>
>>
>>>
>>> Thanks
>>>
>>
>>
>

In response to

  • Re: at 2017-06-28 07:28:00 from Yevhenii Kurtov

Responses

  • Re: at 2017-06-28 13:10:13 from Brad DeJong

Browse pgsql-performance by date

  From Date Subject
Next Message Ulf Lohbrügge 2017-06-28 09:38:14 Re: Performance of information_schema with many schemata and tables
Previous Message Yevhenii Kurtov 2017-06-28 07:28:00 Re: