Re: Parallel Seq Scan

From: Daniel Bausch <bausch(at)dvs(dot)tu-darmstadt(dot)de>
To: David Fetter <david(at)fetter(dot)org>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, John Gorman <johngorman2(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Parallel Seq Scan
Date: 2015-02-06 14:05:29
Message-ID: 8761bfdtjq.fsf@gelnhausen.dvs.informatik.tu-darmstadt.de
Lists: pgsql-hackers

Hi David and others!

David Fetter <david(at)fetter(dot)org> writes:

> On Tue, Jan 27, 2015 at 08:02:37AM +0100, Daniel Bausch wrote:
>>
>> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>>
>> >> Wait for first IO, issue second IO request
>> >> Compute
>> >> Already have second IO request, issue third
>> >> ...
>> >
>> >> We'd be a lot less sensitive to IO latency.
>> >
>> > It would take about five minutes of coding to prove or disprove this:
>> > stick a PrefetchBuffer call into heapgetpage() to launch a request for the
>> > next page as soon as we've read the current one, and then see if that
>> > makes any obvious performance difference. I'm not convinced that it will,
>> > but if it did then we could think about how to make it work for real.
>>
>> Sorry for dropping in so late...
>>
>> I have done all this two years ago. For TPC-H Q8, Q9, Q17, Q20, and Q21
>> I see a speedup of ~100% when using IndexScan prefetching + Nested-Loops
>> Look-Ahead (the outer loop!).
>> (On SSD with 32 Pages Prefetch/Look-Ahead + Cold Page Cache / Small RAM)
>
> Would you be so kind as to pass along any patches (ideally applicable
> to git master), tests, and specific measurements you made?

Attached are my patches, based on the old revision
36f4c7843cf3d201279855ed9a6ebc1deb3c9463
(Adjust cube.out expected output for new test queries.)

I have not yet tested whether they still apply against HEAD.

Disclaimer: This was just a proof of concept, so the implementation
quality is poor. Nevertheless, the performance looked promising,
although it still needs a lot of extra rules for special cases, such as
detecting accidental sequential scans. The general assumption is no
concurrency: a single query owning the machine.
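
For readers without the patches at hand, the general idea discussed in
this thread (request upcoming pages before computing on the current one)
can be sketched outside PostgreSQL. This is not code from the attached
patches. It is a minimal illustration that assumes a Unix-like platform
where Python exposes os.posix_fadvise; the function name
scan_with_prefetch, the 8 kB page size, and the checksum standing in for
"Compute" are all hypothetical choices made for the example.

```python
import os

PAGE_SIZE = 8192   # assumed page size for this sketch
LOOK_AHEAD = 32    # look-ahead distance in pages, mirroring the "32 Pages" setting quoted above


def scan_with_prefetch(path, look_ahead=LOOK_AHEAD):
    """Read a file page by page while hinting the kernel about upcoming pages.

    Returns (number of pages read, byte checksum modulo 2**32).
    """
    checksum = 0
    pages = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        n_pages = (size + PAGE_SIZE - 1) // PAGE_SIZE
        hinted = 1  # pages [0, hinted) need no hint; page 0 is read directly
        for page in range(n_pages):
            # Keep the hinted window up to `look_ahead` pages ahead of `page`.
            target = min(n_pages, page + 1 + look_ahead)
            if hinted < target:
                os.posix_fadvise(
                    fd,
                    hinted * PAGE_SIZE,
                    (target - hinted) * PAGE_SIZE,
                    os.POSIX_FADV_WILLNEED,
                )
                hinted = target
            # Blocking read of the current page, then the "compute" step.
            data = os.pread(fd, PAGE_SIZE, page * PAGE_SIZE)
            checksum = (checksum + sum(data)) % (1 << 32)
            pages += 1
    finally:
        os.close(fd)
    return pages, checksum
```

The hint is advisory only, so the result is identical with or without
it; any benefit would show up as reduced wait time on a cold page cache,
which this sketch does not measure.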

Here is a comparison using dbt3. Q8, Q9, Q17, Q20, and Q21 are
significantly improved.

|     | baseline   | indexscan  | indexscan+nestloop |
|     |            | patch 1+2  | patch 3            |
|-----+------------+------------+--------------------|
| Q1  | 76.124261  | 73.165161  | 76.323119          |
| Q2  | 9.676956   | 11.211073  | 10.480668          |
| Q3  | 36.836417  | 36.268022  | 36.837226          |
| Q4  | 48.707501  | 64.2255    | 30.872218          |
| Q5  | 59.371467  | 59.205048  | 58.646096          |
| Q6  | 70.514214  | 73.021006  | 72.64643           |
| Q7  | 63.667594  | 63.258499  | 62.758288          |
| Q8  | 70.640973  | 33.144454  | 32.530732          |
| Q9  | 446.630473 | 379.063773 | 219.926094         |
| Q10 | 49.616125  | 49.244744  | 48.411664          |
| Q11 | 6.122317   | 6.158616   | 6.160189           |
| Q12 | 74.294292  | 87.780442  | 87.533936          |
| Q13 | 32.37932   | 32.771938  | 33.483444          |
| Q14 | 47.836053  | 48.093996  | 47.72221           |
| Q15 | 139.350038 | 138.880208 | 138.681336         |
| Q16 | 12.092429  | 12.120661  | 11.668971          |
| Q17 | 9.346636   | 4.106042   | 4.018951           |
| Q18 | 66.106875  | 123.754111 | 122.623193         |
| Q19 | 22.750504  | 23.191532  | 22.34084           |
| Q20 | 80.481986  | 29.906274  | 28.58106           |
| Q21 | 396.897269 | 355.45988  | 214.44184          |
| Q22 | 6.834841   | 6.600922   | 6.524032           |
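
To relate the table to the "~100% speedup" figure quoted earlier, the
ratios for the five highlighted queries can be computed directly from
the rows above. The snippet below is only a convenience sketch: the
numbers are copied from the table, the message does not state their
unit, and it assumes that smaller values are better. The helper name
speedup_percent is made up for this example.

```python
# Timings copied from the table above: (baseline, patch 1+2, patch 3).
timings = {
    "Q8": (70.640973, 33.144454, 32.530732),
    "Q9": (446.630473, 379.063773, 219.926094),
    "Q17": (9.346636, 4.106042, 4.018951),
    "Q20": (80.481986, 29.906274, 28.58106),
    "Q21": (396.897269, 355.45988, 214.44184),
}


def speedup_percent(baseline, patched):
    """Speedup of `patched` over `baseline` in percent (100 means twice as fast)."""
    return (baseline / patched - 1.0) * 100.0


for query, (baseline, indexscan, indexscan_nestloop) in timings.items():
    print(
        f"{query}: indexscan {speedup_percent(baseline, indexscan):6.1f}%, "
        f"indexscan+nestloop {speedup_percent(baseline, indexscan_nestloop):6.1f}%"
    )
```

The same calculation can be run over the remaining rows; a negative
result marks a query that got slower with the patches, as Q18 does in
the table above.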

Regards,
Daniel
--
MSc. Daniel Bausch
Research Assistant (Computer Science)
Technische Universität Darmstadt
http://www.dvs.tu-darmstadt.de/staff/dbausch

Attachment Content-Type Size
0001-Quick-proof-of-concept-for-indexscan-prefetching.patch text/x-diff 5.3 KB
0002-Fix-index-only-scan-and-rescan.patch text/x-diff 4.3 KB
0003-First-try-on-tuple-look-ahead-in-nestloop.patch text/x-diff 16.5 KB
0004-Limit-recursive-prefetching-for-merge-join.patch text/x-diff 10.7 KB
