From: | David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> |
---|---|
To: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, dilipbalaut(at)gmail(dot)com |
Subject: | Parallel Bitmap scans a bit broken |
Date: | 2017-03-09 15:47:37 |
Message-ID: | CAKJS1f8OtrHE+-P+=E=4ycnL29e9idZKuaTQ6o2MbhvGN9D8ig@mail.gmail.com |
Lists: | pgsql-hackers |
I was just doing some testing on [1] when I noticed that there's a problem
with parallel bitmap index scans.
Test case (with the patch from [1] applied):
=# create table r1(value int);
CREATE TABLE
=# insert into r1 select (random()*1000)::int from generate_series(1,1000000);
INSERT 0 1000000
=# create index on r1 using brin(value);
CREATE INDEX
=# set enable_seqscan=0;
SET
=# explain select * from r1 where value=555;
                                       QUERY PLAN
-----------------------------------------------------------------------------------------
 Gather  (cost=3623.52..11267.45 rows=5000 width=4)
   Workers Planned: 2
   ->  Parallel Bitmap Heap Scan on r1  (cost=2623.52..9767.45 rows=2083 width=4)
         Recheck Cond: (value = 555)
         ->  Bitmap Index Scan on r1_value_idx  (cost=0.00..2622.27 rows=522036 width=0)
               Index Cond: (value = 555)
(6 rows)
=# explain analyze select * from r1 where value=555;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
The crash occurs in tbm_shared_iterate() at:
PagetableEntry *page = &ptbase[idxpages[istate->spageptr]];
In tbm_prepare_shared_iterate() I see that tbm->npages is zero. I'm unsure
whether bringetbitmap() handles npages differently than btgetbitmap() does.
Either way, because npages is 0, tbm->ptpages is never allocated
in tbm_prepare_shared_iterate():
    if (tbm->npages)
    {
        tbm->ptpages = dsa_allocate(tbm->dsa, sizeof(PTIterationArray) +
                                    tbm->npages * sizeof(int));
so when tbm_shared_iterate() runs this code:
    /*
     * If both chunk and per-page data remain, must output the numerically
     * earlier page.
     */
    if (istate->schunkptr < istate->nchunks)
    {
        PagetableEntry *chunk = &ptbase[idxchunks[istate->schunkptr]];
        PagetableEntry *page = &ptbase[idxpages[istate->spageptr]];
        BlockNumber chunk_blockno;

        chunk_blockno = chunk->blockno + istate->schunkbit;
        if (istate->spageptr >= istate->npages ||
            chunk_blockno < page->blockno)
        {
            /* Return a lossy page indicator from the chunk */
            output->blockno = chunk_blockno;
            output->ntuples = -1;
            output->recheck = true;
            istate->schunkbit++;
            LWLockRelease(&istate->lock);
            return output;
        }
    }
it fails, because idxpages points to random memory and is read before
spageptr is checked against npages.
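To make the ordering problem concrete, here is a minimal standalone sketch. It is not the tidbitmap.c code and not a proposed patch: the Entry and IterState types and the next_is_lossy() function are hypothetical stand-ins, with locking and the output struct left out. The only point it illustrates is that idxpages can be read after the spageptr >= npages test rather than before it, so an unallocated page array is never touched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for PagetableEntry. */
typedef struct
{
    unsigned    blockno;
} Entry;

/* Hypothetical, simplified stand-in for the shared iterator state. */
typedef struct
{
    const Entry *ptbase;
    const int  *idxchunks;      /* index array for lossy chunks */
    const int  *idxpages;       /* may be NULL when npages == 0 */
    int         nchunks;
    int         npages;
    int         schunkptr;
    int         spageptr;
    int         schunkbit;
} IterState;

/*
 * Returns 1 and stores the block number in *blockno when the next item to
 * emit is a lossy page from a chunk; returns 0 otherwise.
 */
static int
next_is_lossy(IterState *st, unsigned *blockno)
{
    if (st->schunkptr < st->nchunks)
    {
        const Entry *chunk;
        unsigned    chunk_blockno;

        assert(st->idxchunks != NULL);
        chunk = &st->ptbase[st->idxchunks[st->schunkptr]];
        chunk_blockno = chunk->blockno + (unsigned) st->schunkbit;

        /*
         * The || short-circuits, so idxpages is only indexed once we know
         * spageptr is within npages.
         */
        if (st->spageptr >= st->npages ||
            chunk_blockno < st->ptbase[st->idxpages[st->spageptr]].blockno)
        {
            *blockno = chunk_blockno;
            st->schunkbit++;
            return 1;
        }
    }
    return 0;
}
```

With npages = 0 and idxpages = NULL, which mirrors the BRIN case above where every entry is lossy, this version returns the chunk's block without ever reading idxpages.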
This is probably a simple fix for the authors, so I'm passing it along. I
can't quite see how the part above is meant to work.
[1]
https://www.postgresql.org/message-id/attachment/50164/brin-correlation-v3.patch
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services