Re: Some ideas about Vacuum

From: "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>
To: "Markus Schiltknecht" <markus(at)bluegap(dot)ch>
Cc: "Gregory Stark" <stark(at)enterprisedb(dot)com>, "pgsql-hackers list" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Some ideas about Vacuum
Date: 2008-01-10 09:16:31
Message-ID: 9362e74e0801100116s368a30der4d6b2a29d588b45c@mail.gmail.com
Lists: pgsql-hackers

Markus,
I was rethinking what you said. I feel that if we read the WAL
through the archiver (where the archiver is switched on), which reads the
entire WAL anyway, it might save some CPU cycles on updates, inserts and
deletes.
The question is about reducing I/Os, and I have no doubt about that.
But if we keep the WAL on a separate disk and make Vacuum scan
through it (in case the archiver is absent), it would take that I/O off the
disk containing the data. Essentially, the I/O effects are separated. We
might end up doing more I/Os, but it would not affect the OLTP transactions.
I would also like to clarify one more thing. I am not asking to
remove the DSM approach. I am just thinking of creating the DSM by
reading through the WAL, instead of asking the inserts, updates and
deletes to do the DSM creation.
Of course, if a person places both the WAL and the data files on the
same disk drives, this would reduce performance. But can we take that
hit?
I think what Gregory is getting at is, "if we schedule the Vacuum
after 20% of the table has changed, then we essentially say we need 120% of the disk
space and hence our select operations might end up doing more I/Os."
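
The 120% figure can be checked with a small calculation. This is only a sketch under a simplifying assumption: every changed tuple leaves behind one dead version of the same size, and nothing is reclaimed until Vacuum runs. The function name `pages_before_vacuum` and the 1000-page table are made up for illustration.

```python
def pages_before_vacuum(live_pages: int, changed_fraction: float) -> int:
    """Pages on disk just before Vacuum runs.

    Assumption: each changed tuple leaves one dead version of the same
    size, so dead space grows in proportion to the changed fraction.
    """
    dead_pages = round(live_pages * changed_fraction)
    return live_pages + dead_pages


if __name__ == "__main__":
    live = 1000                      # hypothetical table size in pages
    total = pages_before_vacuum(live, 0.20)
    print(f"{total} pages on disk, {total / live:.0%} of the live data")
```

Under that assumption a sequential scan has to read the dead pages too, which is where the extra select I/O comes from.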
Please put forward your suggestions.

Hi All,

Essentially, to conclude:
a) If there is an archiver running, we are putting slightly more CPU cycles
on the archiver to help form the DSM.
b) If there is no archiver, and the DBA places the WAL on a separate disk,
Vacuum will do more I/O on that disk to form the DSM.
c) In case someone has neither enabled the archiver nor is ready to spare
a disk for WAL, this approach reduces the performance of that setup.
Are my conclusions right?
If they are right, what percentage of setups falls into the third case? (Field
experts out there!!)
If that percentage is high, we should stop this line of thinking.

Thanks,
Gokul.
