Re: wait events for disk I/O

From: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: wait events for disk I/O
Date: 2017-03-06 08:27:55
Message-ID: CAGPqQf2-CYrM+dxPoge5Gm_G+QLq-PObbezWyjt=Vs2RsspnMA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Mar 4, 2017 at 7:53 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:

> On Mon, Feb 20, 2017 at 4:04 PM, Rushabh Lathia
> <rushabh(dot)lathia(at)gmail(dot)com> wrote:
> >
> > My colleague Rahila reported compilation issue with
> > the patch. Issue was only coming with we do the clean
> > build on the branch.
> >
> > Fixed the same into latest version of patch.
> >
>
> Few assorted comments:
>
> 1.
> + <row>
> + <entry><literal>WriteBuffile</></entry>
> + <entry>Waiting during
> buffile read operation.</entry>
> + </row>
>
> Operation name and definition are not matching.
>
>
Will fix this.

>
> 2.
> +FilePrefetch(File file, off_t offset, int amount, uint32 wait_event_info)
> {
> #if defined(USE_POSIX_FADVISE) && defined(POSIX_FADV_WILLNEED)
> int returnCode;
> @@ -1527,8 +1527,10 @@ FilePrefetch(File file, off_t offset, int amount)
> if (returnCode < 0)
> return returnCode;
>
> + pgstat_report_wait_start(wait_event_info);
> returnCode = posix_fadvise(VfdCache[file].fd, offset, amount,
> POSIX_FADV_WILLNEED);
> + pgstat_report_wait_end();
>
> AFAIK, this call is non-blocking and will just initiate a read, so do
> you think we should record wait event for such a call.
>
>
Yes, prefecth call is a non-blocking and will just initiate a read. But
this info
about the prefetch into wait events will give more info about the system.

3.
> - written = FileWrite(src->vfd, waldata_start, len);
> + written = FileWrite(src->vfd, waldata_start, len,
> + WAIT_EVENT_WRITE_REWRITE_DATA_BLOCK);
> if (written != len)
> ereport(ERROR,
> (errcode_for_file_access(),
> @@ -957,7 +960,7 @@ logical_end_heap_rewrite(RewriteState state)
> hash_seq_init(&seq_status, state->rs_logical_mappings);
> while ((src = (RewriteMappingFile *) hash_seq_search(&seq_status)) !=
> NULL)
> {
> - if (FileSync(src->vfd) != 0)
> + if (FileSync(src->vfd, WAIT_EVENT_SYNC_REWRITE_DATA_BLOCK) != 0)
>
>
> Do we want to consider recording wait event for both write and sync?
> It seems to me OS level writes are relatively cheap and sync calls are
> expensive, so shouldn't we just log for sync calls? I could see the
> similar usage at multiple places in the patch.
>
>
Yes, I thought of adding wait event only for the sync but then recording the
wait event for both write and sync. I understand that OS level writes are
cheap but it still have some cost attached to that. Also I thought for the
monitoring tool being develop using this wait events, will have more useful
capture data if we try to collect as much info as we can. Or may be not.

I am open for other opinion/suggestions.

--
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
>

--
Rushabh Lathia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2017-03-06 08:29:01 Re: Partitioned tables and relfilenode
Previous Message Amit Langote 2017-03-06 08:23:33 Re: Partitioned tables and relfilenode