Re: why is bitmap index chosen for this query?

From: Stephen Byers <stephenabyers(at)yahoo(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Steinar H(dot) Gunderson" <sgunderson(at)bigfoot(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org, Stephen Byers <stephenabyers(at)yahoo(dot)com>
Subject: Re: why is bitmap index chosen for this query?
Date: 2006-05-18 19:38:18
Message-ID: 20060518193818.81332.qmail@web30710.mail.mud.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I repeated explain analyze on the query 5 times and it came up with the same plan.

You asked about index order and physical table order. In general the index order is indeed close to the same order as the physical table order. However, this query is likely an exception. The data is actually from a backup server that has filled a hole for some of the time range that I'm specifying in my query.

Work_mem was set to 10240. After your suggestion, I bumped it to 102400 and it looks like it did significantly impact performance.

adbs_db=# explain analyze select spacecraft_time_secs,mnemonic,volume_label from packets
adbs_db-# where environment_name='PASITCTX01'
adbs_db-# and system_time_secs>=1132272000 and system_time_secs<=1143244800;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on packets (cost=247205.64..2838556.55 rows=12473252 width=47) (actual time=32118.943..187075.742 rows=13365371 loops=1)
Recheck Cond: (((environment_name)::text = 'PASITCTX01'::text) AND (system_time_secs >= 1132272000) AND (system_time_secs <= 1143244800))
-> Bitmap Index Scan on packets_i4 (cost=0.00..247205.64 rows=12473252 width=0) (actual time=30370.789..30370.789 rows=13365371 loops=1)
Index Cond: (((environment_name)::text = 'PASITCTX01'::text) AND (system_time_secs >= 1132272000) AND (system_time_secs <= 1143244800))
Total runtime: 191995.431 ms
(5 rows)
adbs_db=#
adbs_db=#
adbs_db=# adbs_db=# set enable_bitmapscan to off;
SET
adbs_db=# explain analyze select spacecraft_time_secs,mnemonic,volume_label from packets
adbs_db-# where environment_name='PASITCTX01'
adbs_db-# and system_time_secs>=1132272000 and system_time_secs<=1143244800;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on packets (cost=0.00..3046021.47 rows=12473252 width=47) (actual time=56616.457..475839.789 rows=13365371 loops=1)
Filter: (((environment_name)::text = 'PASITCTX01'::text) AND (system_time_secs >= 1132272000) AND (system_time_secs <= 1143244800))
Total runtime: 481228.409 ms
(3 rows)
adbs_db=#
adbs_db=#
adbs_db=# adbs_db=# set enable_seqscan to off;
SET
adbs_db=# explain analyze select spacecraft_time_secs,mnemonic,volume_label from packets
adbs_db-# where environment_name='PASITCTX01'
adbs_db-# and system_time_secs>=1132272000 and system_time_secs<=1143244800;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using packets_i4 on packets (cost=0.00..19909080.77 rows=12473273 width=47) (actual time=3.511..188273.177 rows=13365371 loops=1)
Index Cond: (((environment_name)::text = 'PASITCTX01'::text) AND (system_time_secs >= 1132272000) AND (system_time_secs <= 1143244800))
Total runtime: 194061.497 ms
(3 rows)



Wow -- so what does that mean? Do I need to leave my work_mem at 100MB?? I mentioned that my application actually uses a cursor to walk through this data. Even though the bitmap scan technically had the fastest time with explain analyze, it takes a long while (20 seconds) before the results start to come back through the cursor. Conversely, with the index scan, results immediately come back through the cursor method (which is more desirable). Thoughts?

Example:
begin;
declare myCursor cursor for
select spacecraft_time_secs,mnemonic,volume_label from packets
where environment_name='PASITCTX01'
and system_time_secs>=1132272000
and system_time_secs<=1143244800;
fetch 10 from myCursor;
end;


PS, this is on a Sun Fire V240 with 4GB RAM, Solaris 8.

Thanks,
Steve


Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
"Steinar H. Gunderson" writes:
> What about the working set? Have you tried running the queries multiple times
> in a row to see if the results change? It might be that your initial bitmap
> scan puts all the relevant bits into cache, which will skew the results.

If the examples were done in the order shown, the seqscan ought to have
pretty well blown out the cache ... but I concur that it'd be
interesting to check whether repeated executions of the same plan show
markedly different times.

Also, is the index order closely correlated to the actual physical
table order?

What is work_mem set to, and does increasing it substantially make the
bitmap scan work any better?

Considering that the query is fetching about half of the table, I'd have
thought that the planner was correct to estimate that bitmap or seqscan
ought to win. For the plain indexscan to win, the order correlation
must be quite strong, and I'm also guessing that the bitmap scan must
have run into some substantial trouble (like discarding a lot of info
because of lack of work_mem).

IIRC, the planner doesn't currently try to model the effects of a bitmap
scan going into lossy mode, which is something it probably should try to
account for.

regards, tom lane


---------------------------------
Sneak preview the all-new Yahoo.com. It's not radically different. Just radically better.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Steinar H. Gunderson 2006-05-18 19:48:13 Re: why is bitmap index chosen for this query?
Previous Message Tom Lane 2006-05-18 17:34:19 Re: why is bitmap index chosen for this query?