Re: [REVIEW] Re: Compression of full-page-writes

From: Rahila Syed <rahilasyed90(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [REVIEW] Re: Compression of full-page-writes
Date: 2014-12-10 14:10:46
Message-ID: CAH2L28ujKkJHP--cYob9J1Q_dX3Yy2g-rKmeeOr-vQXoPrwSog@mail.gmail.com
Lists: pgsql-hackers

>What I would suggest is instrument the backend with getrusage() at
>startup and shutdown and have it print the difference in user time and
>system time. Then, run tests for a fixed number of transactions and
>see how the total CPU usage for the run differs.
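For reference, a minimal standalone C sketch of this kind of getrusage() accounting (hypothetical helper names; not the exact instrumentation used to produce the numbers below):

/*
 * Record getrusage() at process startup and report the user/system CPU
 * time consumed at exit, so per-run CPU cost can be compared in absolute
 * terms rather than as a utilization percentage.
 */
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <sys/resource.h>

static struct rusage start_usage;

/* Call once at startup. */
static void
cpu_usage_start(void)
{
    getrusage(RUSAGE_SELF, &start_usage);
}

/* Registered with atexit(); prints the difference at shutdown. */
static void
cpu_usage_report(void)
{
    struct rusage end_usage;
    double      user_diff;
    double      sys_diff;

    getrusage(RUSAGE_SELF, &end_usage);

    user_diff = (end_usage.ru_utime.tv_sec - start_usage.ru_utime.tv_sec)
        + (end_usage.ru_utime.tv_usec - start_usage.ru_utime.tv_usec) / 1e6;
    sys_diff = (end_usage.ru_stime.tv_sec - start_usage.ru_stime.tv_sec)
        + (end_usage.ru_stime.tv_usec - start_usage.ru_stime.tv_usec) / 1e6;

    fprintf(stderr, "user diff: %.2fs system diff: %.2fs\n",
            user_diff, sys_diff);
}

int
main(void)
{
    cpu_usage_start();
    atexit(cpu_usage_report);

    /* ... work being measured goes here ... */

    return 0;
}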

Following are the numbers obtained from tests measuring absolute CPU usage, with a fixed number of transactions and a longer run duration, using the latest FPW compression patch.

pgbench command: pgbench -r -t 250000 -M prepared

To ensure that the data is not highly compressible, the empty filler column of pgbench_accounts was altered using:

alter table pgbench_accounts alter column filler type text using
gen_random_uuid()::text

checkpoint_segments = 1024
checkpoint_timeout = 5min
fsync = on

The tests ran for around 30 minutes each. A manual checkpoint was issued before each test.

Compression | WAL generated | %compression | Latency avg | Latency stddev | TPS    | CPU usage (seconds)
------------+---------------+--------------+-------------+----------------+--------+-----------------------------------------
on          | 1531.4 MB     | ~35%         | 7.351 ms    | 13.759 ms      | 135.96 | user diff: 562.67s, system diff: 41.40s
off         | 2373.1 MB     | -            | 6.781 ms    | 14.152 ms      | 147.40 | user diff: 354.20s, system diff: 39.67s

The compression obtained is quite high, close to 35%. User-level CPU usage is noticeably higher with compression on than with it off, but the reduction in WAL volume is also substantial.

Server specifications:
Processors: Intel® Xeon® Processor E5-2650 (2 GHz, 8C/16T, 20 MB) * 2
RAM: 32 GB
Disk: 450 GB 10K Hot Plug 2.5-inch SAS HDD * 8 (each 450 GB, 2.5-inch, 6 Gb/s, 10,000 rpm)

Thank you,

Rahila Syed

On Fri, Dec 5, 2014 at 10:38 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Fri, Dec 5, 2014 at 1:49 AM, Rahila Syed <rahilasyed(dot)90(at)gmail(dot)com>
> wrote:
> >>If that's really true, we could consider having no configuration any
> >>time, and just compressing always. But I'm skeptical that it's
> >>actually true.
> >
> > I was referring to this for CPU utilization:
> >
> http://www.postgresql.org/message-id/1410414381339-5818552.post@n5.nabble.com
> >
> > The above tests were performed on machine with configuration as follows
> > Server specifications:
> > Processors:Intel® Xeon ® Processor E5-2650 (2 GHz, 8C/16T, 20 MB) * 2 nos
> > RAM: 32GB
> > Disk : HDD 450GB 10K Hot Plug 2.5-inch SAS HDD * 8 nos
> > 1 x 450 GB SAS HDD, 2.5-inch, 6Gb/s, 10,000 rpm
>
> I think that measurement methodology is not very good for assessing
> the CPU overhead, because you are only measuring the percentage CPU
> utilization, not the absolute amount of CPU utilization. It's not
> clear whether the duration of the tests was the same for all the
> configurations you tried - in which case the number of transactions
> might have been different - or whether the number of operations was
> exactly the same - in which case the runtime might have been
> different. Either way, it could obscure an actual difference in
> absolute CPU usage per transaction. It's unlikely that both the
> runtime and the number of transactions were identical for all of your
> tests, because that would imply that the patch makes no difference to
> performance; if that were true, you wouldn't have bothered writing
> it....
>
> What I would suggest is instrument the backend with getrusage() at
> startup and shutdown and have it print the difference in user time and
> system time. Then, run tests for a fixed number of transactions and
> see how the total CPU usage for the run differs.
>
> Last cycle, Amit Kapila did a bunch of work trying to compress the WAL
> footprint for updates, and we found that compression was pretty darn
> expensive there in terms of CPU time. So I am suspicious of the
> finding that it is free here. It's not impossible that there's some
> effect which causes us to recoup more CPU time than we spend
> compressing in this case that did not apply in that case, but the
> projects are awfully similar, so I tend to doubt it.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
