Re: Execution plan analysis

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Neto pr <netopr9(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Execution plan analysis
Date: 2017-08-25 15:33:54
Message-ID: 9fdf19f5-a49f-2d85-1e9e-a3cdb335eb80@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

So looking at the plans, essentially the only part that is different is
the scan node at the very bottom - in one case it's a sequential scan,
in the other case (the slow one) it's the bitmap index scan.

Essentially it's this:

-> Seq Scan on lineitem
(cost=0.00..2624738.17 ...)
(actual time=0.839..74391.087 ...)

vs. this:

-> Bitmap Heap Scan on lineitem
(cost=336295.10..1970056.39 ...)
(actual time=419620.817..509685.421 ...)
-> Bitmap Index Scan on idx_l_shipmodelineitem000
(cost=0.00..336227.49 ...)
(actual time=419437.172..419437.172 ...)

All the nodes are the same and perform about the same in both cases, so
you can ignore them. This difference it the the root cause you need to
investigate.

The question is why is the sequential scan so much faster than bitmap
index scan? Ideally, the bitmap heap scan should scan the index (in a
mostly sequential way), build a bitmap, and then read just the matching
part of the table (sequentially, by skipping some of the pages).

Now, there are a few reasons why this might not work that well.

Perhaps the table fits into RAM, but table + index does not. That would
make the sequential scan much faster than the index path. Not sure if
this is the case, as you haven't mentioned which TPC-H scale are you
testing, but you only have 4GB of RAM which if fairly low.

Another bit is prefetching - with sequential scans, the OS is able to
prefetch the next bit of data automatically (read-ahead). With bitmap
index scans that's not the case, producing a lot of individual
synchronous I/O requests. See if increasing effective_cache_size (from
default 1 to 16 or 32) helps.

Try generating the plans with EXPLAIN (ANALYZE, BUFFERS), that should
tell us more about how many blocks are found in shared buffers, etc.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Aniko Belim 2017-08-25 15:36:29 Partitioned table - scans through every partitions
Previous Message Tom Lane 2017-08-25 15:07:21 Re: 10x faster sort performance on Skylake CPU vs Ivy Bridge