Re: Very slow Query compared to Oracle / SQL - Server

From: Alexey M Boltenkov <padrebolt(at)yandex(dot)ru>
To: Semen Yefimenko <semen(dot)yefimenko(at)gmail(dot)com>
Cc: "luis(dot)roberto(at)siscobra(dot)com(dot)br" <luis(dot)roberto(at)siscobra(dot)com(dot)br>, pgsql-performance <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Re: Very slow Query compared to Oracle / SQL - Server
Date: 2021-05-06 20:17:24
Message-ID: 612406bc-0ca3-19ef-c77a-dcab91fda40d@yandex.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 05/06/21 23:02, Alexey M Boltenkov wrote:
> On 05/06/21 22:58, Alexey M Boltenkov wrote:
>> Have you try of excluding not null from index? Can you give
>> dispersion of archivestatus?
>>
>>
>> 06.05.2021, 21:59, "Semen Yefimenko" <semen(dot)yefimenko(at)gmail(dot)com>:
>>
>> Yes, rewriting the query with an IN clause was also my first
>> approach, but I didn't help much.
>> The Query plan did change a little bit but the performance was
>> not impacted.
>>
>> CREATE INDEX idx_arcstatus_le1 ON schema.logtable (
>> archivestatus ) where (archivestatus <= 1)
>> ANALYZE  schema.logtable
>>
>>
>> This resulted in this query plan:
>>
>> Gather Merge  (cost=344618.96..394086.05 rows=423974
>> width=2549) (actual time=7327.777..9142.358 rows=516031 loops=1)
>>   Output: column1, .. , column54
>>   Workers Planned: 2
>>   Workers Launched: 2
>>   Buffers: shared hit=179817 read=115290
>>   ->  Sort  (cost=343618.94..344148.91 rows=211987
>> width=2549) (actual time=7258.314..7476.733 rows=172010 loops=3)
>>         Output: column1, .. , column54
>>         Sort Key: logtable.timestampcol DESC
>>         Sort Method: quicksort  Memory: 64730kB
>>         Worker 0:  Sort Method: quicksort  Memory: 55742kB
>>         Worker 1:  Sort Method: quicksort  Memory: 55565kB
>>         Buffers: shared hit=179817 read=115290
>>         Worker 0: actual time=7231.774..7458.703 rows=161723
>> loops=1
>>           Buffers: shared hit=55925 read=36265
>>         Worker 1: actual time=7217.856..7425.754 rows=161990
>> loops=1
>>           Buffers: shared hit=56197 read=36242
>>         ->  Parallel Bitmap Heap Scan on schema.logtable
>>  (cost=5586.50..324864.86 rows=211987 width=2549) (actual
>> time=1073.266..6805.850 rows=172010 loops=3)
>>               Output: column1, .. , column54
>>               Recheck Cond: ((logtable.entrytype = 4000) OR
>> (logtable.entrytype = 4001) OR (logtable.entrytype = 4002))
>>               Filter: (logtable.archivestatus <= 1)
>>               Heap Blocks: exact=109146
>>               Buffers: shared hit=179803 read=115290
>>               Worker 0: actual time=1049.875..6809.231
>> rows=161723 loops=1
>>                 Buffers: shared hit=55918 read=36265
>>               Worker 1: actual time=1035.156..6788.037
>> rows=161990 loops=1
>>                 Buffers: shared hit=56190 read=36242
>>               ->  BitmapOr  (cost=5586.50..5586.50
>> rows=514483 width=0) (actual time=945.179..945.179 rows=0
>> loops=1)
>>                     Buffers: shared hit=3 read=1329
>>                     ->  Bitmap Index Scan on idx_entrytype
>>  (cost=0.00..738.13 rows=72893 width=0) (actual
>> time=147.915..147.916 rows=65970 loops=1)
>> Index Cond: (logtable.entrytype = 4000)
>> Buffers: shared hit=1 read=171
>>                     ->  Bitmap Index Scan on idx_entrytype
>>  (cost=0.00..2326.17 rows=229965 width=0) (actual
>> time=473.450..473.451 rows=225040 loops=1)
>> Index Cond: (logtable.entrytype = 4001)
>> Buffers: shared hit=1 read=579
>>                     ->  Bitmap Index Scan on idx_entrytype
>>  (cost=0.00..2140.61 rows=211624 width=0) (actual
>> time=323.801..323.802 rows=225021 loops=1)
>> Index Cond: (logtable.entrytype = 4002)
>> Buffers: shared hit=1 read=579
>> Settings: random_page_cost = '1', search_path = '"$user",
>> schema, public', temp_buffers = '80MB', work_mem = '1GB'
>> Planning Time: 0.810 ms
>> Execution Time: 9647.406 ms
>>
>>
>> seemingly faster.
>> After doing a few selects, I reran ANALYZE:
>> Now it's even faster, probably due to cache and other mechanisms.
>>
>> Gather Merge  (cost=342639.19..391676.44 rows=420290
>> width=2542) (actual time=2944.803..4534.725 rows=516035 loops=1)
>>   Output: column1, .. , column54
>>   Workers Planned: 2
>>   Workers Launched: 2
>>   Buffers: shared hit=147334 read=147776
>>   ->  Sort  (cost=341639.16..342164.53 rows=210145
>> width=2542) (actual time=2827.256..3013.960 rows=172012 loops=3)
>>         Output: column1, .. , column54
>>         Sort Key: logtable.timestampcol DESC
>>         Sort Method: quicksort  Memory: 71565kB
>>         Worker 0:  Sort Method: quicksort  Memory: 52916kB
>>         Worker 1:  Sort Method: quicksort  Memory: 51556kB
>>         Buffers: shared hit=147334 read=147776
>>         Worker 0: actual time=2771.975..2948.928 rows=153292
>> loops=1
>>           Buffers: shared hit=43227 read=43808
>>         Worker 1: actual time=2767.752..2938.688 rows=148424
>> loops=1
>>           Buffers: shared hit=42246 read=42002
>>         ->  Parallel Bitmap Heap Scan on schema.logtable
>>  (cost=5537.95..323061.27 rows=210145 width=2542) (actual
>> time=276.401..2418.925 rows=172012 loops=3)
>>               Output: column1, .. , column54
>>               Recheck Cond: ((logtable.entrytype = 4000) OR
>> (logtable.entrytype = 4001) OR (logtable.entrytype = 4002))
>>               Filter: (logtable.archivestatus <= 1)
>>               Heap Blocks: exact=122495
>>               Buffers: shared hit=147320 read=147776
>>               Worker 0: actual time=227.701..2408.580
>> rows=153292 loops=1
>>                 Buffers: shared hit=43220 read=43808
>>               Worker 1: actual time=225.996..2408.705
>> rows=148424 loops=1
>>                 Buffers: shared hit=42239 read=42002
>>               ->  BitmapOr  (cost=5537.95..5537.95
>> rows=509918 width=0) (actual time=203.940..203.941 rows=0
>> loops=1)
>>                     Buffers: shared hit=1332
>>                     ->  Bitmap Index Scan on idx_entrytype
>>  (cost=0.00..680.48 rows=67206 width=0) (actual
>> time=31.155..31.156 rows=65970 loops=1)
>> Index Cond: (logtable.entrytype = 4000)
>> Buffers: shared hit=172
>>                     ->  Bitmap Index Scan on idx_entrytype
>>  (cost=0.00..2220.50 rows=219476 width=0) (actual
>> time=112.459..112.461 rows=225042 loops=1)
>> Index Cond: (logtable.entrytype = 4001)
>> Buffers: shared hit=580
>>                     ->  Bitmap Index Scan on idx_entrytype
>>  (cost=0.00..2258.70 rows=223236 width=0) (actual
>> time=60.313..60.314 rows=225023 loops=1)
>> Index Cond: (logtable.entrytype = 4002)
>> Buffers: shared hit=580
>> Settings: random_page_cost = '1', search_path = '"$user",
>> schema, public', temp_buffers = '80MB', work_mem = '1GB'
>> Planning Time: 0.609 ms
>> Execution Time: 4984.490 ms
>>
>> I don't see the new index used but it seems it's boosting the
>> performance nevertheless.
>> I kept the query, so I didn't rewrite the query to be WITHOUT nulls.
>> Thank you already for the hint. What else can I do? With the
>> current parameters, the query finishes in about 3.9-5.2 seconds
>> which is already much better but still nowhere near the speeds of
>> 280 ms in oracle.
>> I would love to get it to at least 1 second.
>>
>>
>> Am Do., 6. Mai 2021 um 20:20 Uhr schrieb Alexey M Boltenkov
>> <padrebolt(at)yandex(dot)ru <mailto:padrebolt(at)yandex(dot)ru>>:
>>
>> On 05/06/21 21:15, Alexey M Boltenkov wrote:
>>
>> On 05/06/21 19:11, luis(dot)roberto(at)siscobra(dot)com(dot)br
>> <mailto:luis(dot)roberto(at)siscobra(dot)com(dot)br> wrote:
>>
>> ----- Mensagem original -----
>>
>> De: "Semen Yefimenko"<semen(dot)yefimenko(at)gmail(dot)com>
>> <mailto:semen(dot)yefimenko(at)gmail(dot)com>
>> Para: "pgsql-performance"<pgsql-performance(at)lists(dot)postgresql(dot)org>
>> <mailto:pgsql-performance(at)lists(dot)postgresql(dot)org>
>> Enviadas: Quinta-feira, 6 de maio de 2021 11:38:39
>> Assunto: Very slow Query compared to Oracle / SQL - Server
>>
>> SELECT column1,..., column54 where ((entrytype = 4000 or entrytype = 4001 or
>> entrytype = 4002) and (archivestatus <= 1)) order by timestampcol desc;
>>
>>
>>
>> The first thing I would try is rewriting the query to:
>>
>> SELECT column1,..., column54
>> FROM logtable
>> WHERE (entrytype in (4000,4001,4002))
>> AND (archivestatus <= 1))
>> ORDER BY timestampcol DESC;
>>
>> Check if that makes a difference...
>>
>> Luis R. Weck
>>
>>
>>
>> The IN statement will probable result in just recheck
>> condition change to entrytype = any('{a,b,c}'::int[]).
>> Looks like dispersion of archivestatus is not enough to
>> use index idx_arcstatus.
>>
>> Please try to create partial index with condition like
>> (archivestatus <= 1) and rewrite select to use
>> (archivestatus is not null and archivestatus <= 1).
>>
>> CREATE INDEX idx_arcstatus_le1 ON schema.logtable (
>> archivestatus ) where (archivestatus <= 1) TABLESPACE
>> tablespace;
>>
>> I'm sorry, 'archivestatus is not null' is only necessary for
>> index without nulls.
>>
>>
>> CREATE INDEX idx_arcstatus_le1 ON schema.logtable (
>> archivestatus ) where (archivestatus is not null and
>> archivestatus <= 1) TABLESPACE tablespace;
>>
> BTW, please try to reset random_page_cost.
>
>
The root of problem is in:

              Heap Blocks: exact=122495

  Buffers: shared hit=147334 read=147776

Have you tune shared buffers enough? Each block is of 8k by default.
Пиши напрямую мимо рассылки, если что :)

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Imre Samu 2021-05-06 21:16:45 Re: Very slow Query compared to Oracle / SQL - Server
Previous Message Alexey M Boltenkov 2021-05-06 20:02:07 Re: Very slow Query compared to Oracle / SQL - Server