Quick Links

Re:

From:	Yevhenii Kurtov <yevhenii(dot)kurtov(at)gmail(dot)com>
To:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc:	"pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject:	Re:
Date:	2017-06-28 07:28:00
Message-ID:	CAJhrTGxKR099VKLPYm-scCpj7En0cD9TegHQM9AFDzjs0tkXQw@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

Hello Pavel,

Can you please give a tip how to rewrite the query with UNION clause? I
didn't use it at all before actually and afraid that will not get it
properly from the first time :)

On Wed, Jun 28, 2017 at 2:12 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
wrote:

>
>
> 2017-06-28 8:47 GMT+02:00 Yevhenii Kurtov <yevhenii(dot)kurtov(at)gmail(dot)com>:
>
>> Hello,
>>
>> We have a query that is run almost each second and it's very important to
>> squeeze every other ms out of it. The query is:
>>
>> SELECT c0."id" FROM "campaign_jobs" AS c0
>> WHERE (((c0."status" = $1) AND NOT (c0."id" = ANY($2))))
>> OR ((c0."status" = $3) AND (c0."failed_at" > $4))
>> OR ((c0."status" = $5) AND (c0."started_at" < $6))
>> ORDER BY c0."priority" DESC, c0."times_failed"
>> LIMIT $7
>> FOR UPDATE SKIP LOCKED
>>
>> I added following index:
>>
>> CREATE INDEX ON campaign_jobs(id, status, failed_at, started_at, priority
>> DESC, times_failed);
>>
>> And it didn't help at all, even opposite - the planning phase time grew
>> up from ~2ms up to ~40 ms leaving execution time intact:
>>
>> Limit (cost=29780.02..29781.27 rows=100 width=18) (actual
>> time=827.753..828.113 rows=100 loops=1)
>> -> LockRows (cost=29780.02..32279.42 rows=199952 width=18) (actual
>> time=827.752..828.096 rows=100 loops=1)
>> -> Sort (cost=29780.02..30279.90 rows=199952 width=18) (actual
>> time=827.623..827.653 rows=100 loops=1)
>> Sort Key: priority DESC, times_failed
>> Sort Method: external sort Disk: 5472kB
>> -> Seq Scan on campaign_jobs c0 (cost=0.00..22138.00
>> rows=199952 width=18) (actual time=1.072..321.410 rows=200000 loops=1)
>> Filter: (((status = 0) AND (id <> ALL
>> ('{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
>> 23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,
>> 43,44,45,46,47,48}'::integer[]))) OR ((status = 2) AND (failed_at >
>> '2017-06-22 03:18:09'::timestamp without time zone)) OR ((status = 1) AND
>> (started_at < '2017-06-23 03:11:09'::timestamp without time zone)))
>> Planning time: 40.734 ms
>> Execution time: 913.638 ms
>> (9 rows)
>>
>>
>> I see that query still went through the Seq Scan instead of Index Scan.
>> Is it due to poorly crafted index or because of query structure? Is it
>> possible to make this query faster?
>>
>
> There are few issues
>
> a) parametrized LIMIT
> b) complex predicate with lot of OR
> c) slow external sort
>
> b) signalize maybe some strange in design .. try to replace "OR" by
> "UNION" query
> c) if you can and you have good enough memory .. try to increase work_mem
> .. maybe 20MB
>
> if you change query to union queries, then you can use conditional indexes
>
> create index(id) where status = 0;
> create index(failed_at) where status = 2;
> create index(started_at) where status = 1;
>
> Regards
>
> Pavel
>
>
>>
>> Thanks
>>
>
>

In response to

Re: at 2017-06-28 07:12:35 from Pavel Stehule

Responses

Re: at 2017-06-28 07:43:25 from Pavel Stehule

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Pavel Stehule	2017-06-28 07:43:25	Re:
Previous Message	Pavel Stehule	2017-06-28 07:12:35	Re: