Re: BitmapHeapScan streaming read user and prelim refactoring

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Melanie Plageman <melanieplageman(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Subject: Re: BitmapHeapScan streaming read user and prelim refactoring
Date: 2024-03-29 11:17:13
Message-ID: CA+hUKGKVT1DpFG_k7xAm_6Mepeg0U76XwRPh+gx9nAGvUzZOwA@mail.gmail.com
Lists: pgsql-hackers

I spent a bit of time today testing Melanie's v11, except with
read_stream.c v13, on Linux, ext4, and 3000 IOPS cloud storage. I
think I now know roughly what's going on. Here are some numbers,
using your random table from above and a simple SELECT * FROM t WHERE
a < 100 OR a = 123456. I'll keep parallelism out of this for now.
These are milliseconds:

eic   unpatched   patched
  0        4172      9572
  1       30846     10376
  2       18435      5562
  4       18980      3503
  8       18980      2680
 16       18976      3233

So with eic=0, unpatched wins. The reason is that Linux readahead
wakes up and scans the table at 150MB/s, because there are enough
clusters to trigger it. But patched doesn't look quite so sequential
to the kernel, because I/O combining has already merged the sequential
accesses into single wider reads...

At eic=1, unpatched completely collapses. I'm not sure why exactly.

Once you go above eic=1, Linux seems to get out of the way and just do
what we asked it to do. For unpatched, iostat shows exactly 3000 IOPS,
exactly 8KB avg read size, and (therefore) throughput of 24MB/sec. You
can also see the queue depth being exactly what we asked for, e.g. 7.9
or whatever for eic=8. Meanwhile patched eats it for breakfast because
it issues wide requests, averaging around 27KB.
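
To make the arithmetic above explicit, here is a rough sketch of how
the iostat figures relate to each other. The function names are made
up for illustration. The 81MB/s figure is only a ceiling that assumes
the device stays IOPS-limited at 27KB reads; it is not something I
measured. The latency figure is likewise just Little's law applied to
the observed queue depth.

```python
def throughput_mb_per_s(iops: float, avg_read_kb: float) -> float:
    """Throughput in MB/s (taking 1 MB = 1000 KB) for an IOPS rate and read size."""
    return iops * avg_read_kb / 1000.0


def implied_latency_ms(queue_depth: float, iops: float) -> float:
    """Little's law: queue depth = IOPS * latency, so latency = depth / IOPS."""
    return queue_depth / iops * 1000.0


# Unpatched above eic=1: 3000 IOPS of 8KB reads.
unpatched_mb_s = throughput_mb_per_s(3000, 8)

# Patched: same IOPS cap, ~27KB average reads (assumed ceiling, not measured).
patched_ceiling_mb_s = throughput_mb_per_s(3000, 27)

# Per-request latency implied by a queue depth of 7.9 at 3000 IOPS.
latency_ms = implied_latency_ms(7.9, 3000)

print(f"unpatched: {unpatched_mb_s:.1f} MB/s")
print(f"patched ceiling: {patched_ceiling_mb_s:.1f} MB/s")
print(f"implied latency at eic=8: {latency_ms:.2f} ms")
```

The point is just that with a fixed IOPS budget, the only way to move
more bytes per second is to make each request wider.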

It seems more informative to look at the absolute numbers rather than
the A/B ratios, because then you can see how the numbers themselves
are already completely nuts, sort of interference patterns from
interaction with kernel heuristics.
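
For comparison, this is what the A/B ratios look like when computed
from the table above. It is only a sketch for illustration (the
variable and function names are mine). The timings are the ones
reported in this message.

```python
# eic -> (unpatched ms, patched ms), copied from the table in this message.
timings_ms = {
    0: (4172, 9572),
    1: (30846, 10376),
    2: (18435, 5562),
    4: (18980, 3503),
    8: (18980, 2680),
    16: (18976, 3233),
}


def speedups(timings: dict[int, tuple[int, int]]) -> dict[int, float]:
    """Return unpatched/patched for each eic; values above 1.0 mean patched is faster."""
    return {eic: unpatched / patched for eic, (unpatched, patched) in timings.items()}


ratios = speedups(timings_ms)
for eic, ratio in ratios.items():
    unpatched, patched = timings_ms[eic]
    print(f"eic={eic:2d} unpatched={unpatched:6d} patched={patched:6d} ratio={ratio:.2f}")
```

The ratio at eic=1 looks like a healthy win for patched, but the
absolute numbers show it comes mostly from unpatched collapsing, while
patched is slightly slower there than at eic=0.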

On the other hand this might be a pretty unusual data distribution.
People who store random numbers or hashes or whatever probably don't
really search for ranges of them (unless they're trying to mine
bitcoins in SQL). I dunno. Maybe we need more realistic tests, or
maybe we're just discovering all the things that are bad about the
pre-existing code.
