Re:

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Yevhenii Kurtov <yevhenii(dot)kurtov(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re:
Date: 2017-06-28 07:12:35
Message-ID: CAFj8pRCxqJsOtpahwtdh4M12OGEs9z-zvz8EfAWE19s+Wh075w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

2017-06-28 8:47 GMT+02:00 Yevhenii Kurtov <yevhenii(dot)kurtov(at)gmail(dot)com>:

> Hello,
>
> We have a query that is run almost each second and it's very important to
> squeeze every other ms out of it. The query is:
>
> SELECT c0."id" FROM "campaign_jobs" AS c0
> WHERE (((c0."status" = $1) AND NOT (c0."id" = ANY($2))))
> OR ((c0."status" = $3) AND (c0."failed_at" > $4))
> OR ((c0."status" = $5) AND (c0."started_at" < $6))
> ORDER BY c0."priority" DESC, c0."times_failed"
> LIMIT $7
> FOR UPDATE SKIP LOCKED
>
> I added following index:
>
> CREATE INDEX ON campaign_jobs(id, status, failed_at, started_at, priority
> DESC, times_failed);
>
> And it didn't help at all, even opposite - the planning phase time grew up
> from ~2ms up to ~40 ms leaving execution time intact:
>
> Limit (cost=29780.02..29781.27 rows=100 width=18) (actual
> time=827.753..828.113 rows=100 loops=1)
> -> LockRows (cost=29780.02..32279.42 rows=199952 width=18) (actual
> time=827.752..828.096 rows=100 loops=1)
> -> Sort (cost=29780.02..30279.90 rows=199952 width=18) (actual
> time=827.623..827.653 rows=100 loops=1)
> Sort Key: priority DESC, times_failed
> Sort Method: external sort Disk: 5472kB
> -> Seq Scan on campaign_jobs c0 (cost=0.00..22138.00
> rows=199952 width=18) (actual time=1.072..321.410 rows=200000 loops=1)
> Filter: (((status = 0) AND (id <> ALL
> ('{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
> 23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,
> 43,44,45,46,47,48}'::integer[]))) OR ((status = 2) AND (failed_at >
> '2017-06-22 03:18:09'::timestamp without time zone)) OR ((status = 1) AND
> (started_at < '2017-06-23 03:11:09'::timestamp without time zone)))
> Planning time: 40.734 ms
> Execution time: 913.638 ms
> (9 rows)
>
>
> I see that query still went through the Seq Scan instead of Index Scan. Is
> it due to poorly crafted index or because of query structure? Is it
> possible to make this query faster?
>

There are few issues

a) parametrized LIMIT
b) complex predicate with lot of OR
c) slow external sort

b) signalize maybe some strange in design .. try to replace "OR" by "UNION"
query
c) if you can and you have good enough memory .. try to increase work_mem
.. maybe 20MB

if you change query to union queries, then you can use conditional indexes

create index(id) where status = 0;
create index(failed_at) where status = 2;
create index(started_at) where status = 1;

Regards

Pavel

>
> Thanks
>

In response to

  • at 2017-06-28 06:47:44 from Yevhenii Kurtov

Responses

  • Re: at 2017-06-28 07:28:00 from Yevhenii Kurtov

Browse pgsql-performance by date

  From Date Subject
Next Message Yevhenii Kurtov 2017-06-28 07:28:00 Re:
Previous Message Yevhenii Kurtov 2017-06-28 06:47:44