Re: Parallel Seq Scan

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, John Gorman <johngorman2(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel Seq Scan
Date: 2015-01-27 21:46:44
Message-ID: 20150127214644.GN3854@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert, all,

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> There is a considerable amount of variation in the amount of time this
> takes to run based on how much of the relation is cached. Clearly,
> there's no way for the system to cache it all, but it can cache a
> significant portion, and that affects the results to no small degree.
> dd on hydra prints information on the data transfer rate; on uncached
> 1GB segments, it runs at right around 400 MB/s, but that can soar to
> upwards of 3GB/s when the relation is fully cached. I tried flushing
> the OS cache via echo 1 > /proc/sys/vm/drop_caches, and found that
> immediately after doing that, the above command took 5m21s to run -
> i.e. ~321000 ms. Most of your test times are faster than that, which
> means they reflect some degree of caching. When I immediately reran
> the command a second time, it finished in 4m18s the second time, or
> ~258000 ms. The rate was the same as the first test - about 400 MB/s
> - for most of the files, but 27 of the last 28 files went much faster,
> between 1.3 GB/s and 3.7 GB/s.

[...]

> With 0 workers, first run took 883465.352 ms, and second run took 295050.106 ms.
> With 8 workers, first run took 340302.250 ms, and second run took 307767.758 ms.
>
> This is a confusing result, because you expect parallelism to help
> more when the relation is partly cached, and make little or no
> difference when it isn't cached. But that's not what happened.

These numbers seem to indicate that the oddball is the single-threaded
uncached run. If I followed correctly, the uncached 'dd' took 321s,
which is relatively close to the uncached-lots-of-workers and the two
cached runs. What in the world is the uncached single-thread case doing
that it takes an extra 543s, or over twice as long? It's clearly not
disk i/o which is causing the slowdown, based on your dd tests.

One possibility might be round-trip latency. The multi-threaded case is
able to keep the CPUs and the i/o system going, and the cached results
don't have as much latency since things are cached, but the
single-threaded uncached case going i/o -> cpu -> i/o -> cpu, ends up
with a lot of wait time as it switches between being on CPU and waiting
on the i/o.

Just some thoughts.

Thanks,

Stephen

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jim Nasby 2015-01-27 22:30:46 Re: Shortcoming in CLOBBER_FREED_MEMORY coverage: disk buffer pointers
Previous Message Robert Haas 2015-01-27 21:10:30 Re: Parallel Seq Scan