Re: _mdfd_getseg can be expensive

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <kgrittn(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: _mdfd_getseg can be expensive
Date: 2015-12-15 18:04:22
Message-ID: 20151215180422.GA6858@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2014-11-01 18:23:47 +0100, Andres Freund wrote:
> On 2014-11-01 12:57:40 -0400, Tom Lane wrote:
> > Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > > On 2014-10-31 18:48:45 -0400, Tom Lane wrote:
> > >> While the basic idea is sound, this particular implementation seems
> > >> pretty bizarre. What's with the "md_seg_no" stuff, and why is that
> > >> array typed size_t?
> >
> > > It stores the length of the array of _MdfdVec entries.
> >
> > Oh. "seg_no" seems like not a very good choice of name then.
> > Perhaps "md_seg_count" or something like that would be more intelligible.
>
> That's fine with me.

Went with md_num_open_segs - reading the code that was easier to
understand.

> So I'll repost a version with those fixes.

Took a while. But here we go. The attached version is a significantly
revised version of my earlier patch. Notably I've pretty much entirely
revised the code in _mdfd_getseg() to be more similar to the state in
master. Also some comment policing.

As it's more than a bit painful to test with large numbers of 1GB
segments, I've rebuilt my local postgres with 100kb segments. Running a
pgbench -s 300 with 128MB shared buffers clearly shows the efficiency
differnces: 320tps vs 4900 in an assertion enabled build. Obviously
that's kind of an artificial situation...

But I've also verified this with a) fake relations built out of sparse
files, b) actually large relations. The latter shows performance
benefits as well, but my patience limits the amount of testing I was
willing to do...

Kevin, Robert: I've CCed you because Robert commented in the recent
mdnblocks() blocks thread that Kevin lamented the O(n)'ness of md.c...

Andres

Attachment Content-Type Size
0001-Faster-PageIsVerified-for-the-all-zeroes-case.patch text/x-patch 1.4 KB
0002-Improve-scalability-of-md.c-for-large-relations.patch text/x-patch 19.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2015-12-15 18:46:03 Re: Cube extension kNN support
Previous Message Tomas Vondra 2015-12-15 17:53:02 Re: pam auth - add rhost item