Re: Relation extension scalability

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>
Cc: David Steele <david(at)pgmasters(dot)net>, Andres Freund <andres(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Relation extension scalability
Date: 2015-04-02 00:24:05
Message-ID: 551C8C25.1070904@BlueTreble.com
Lists: pgsql-hackers

On 3/30/15 10:48 PM, Amit Kapila wrote:
> > If we're able to extend based on page-level locks rather than the global
> > relation locking that we're doing now, then I'm not sure we really need
> > to adjust how big the extents are any more. The reason for making
> > bigger extents is because of the locking problem we have now when lots
> > of backends want to extend a relation, but, if I'm following correctly,
> > that'd go away with Andres' approach.
> >
>
> The benefit of extending in bigger chunks in background is that backend
> would need to perform such an operation at relatively lesser frequency
> which in itself could be a win.

The other potential advantage (and I have to think this could be a BIG
advantage) is that extending by a large amount makes it more likely you'll
get contiguous blocks on the storage. That's going to make a big
difference for SeqScan speed. It'd be interesting if someone with access
to some real systems could test that: in particular, a SeqScan of a
possibly fragmented table vs. one of the same size that was created all
at once. For extra credit, compare both to dd bs=8192 reading a file of
the same size as the overall table.
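
For anyone who wants a baseline without dd, here is a minimal sketch of the same kind of measurement: read a file front to back in 8192-byte blocks and report the throughput. The names sequential_read_throughput and make_test_file are made up for illustration, and this is not how PostgreSQL itself reads relation files. For a meaningful number the file has to be much larger than RAM, or the OS page cache has to be dropped first; otherwise you are timing memory, not storage.

```python
import os
import tempfile
import time

BLOCK_SIZE = 8192  # same block size as "dd bs=8192"


def sequential_read_throughput(path, block_size=BLOCK_SIZE):
    """Read a file front to back in fixed-size blocks.

    Returns (bytes_read, megabytes_per_second).
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 gives an unbuffered raw file, so each read() is one request
    # of block_size bytes to the OS (which may still serve it from its cache).
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:  # empty bytes object means end of file
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    mb = total / (1024 * 1024)
    rate = mb / elapsed if elapsed > 0 else float("inf")
    return total, rate


def make_test_file(n_blocks, block_size=BLOCK_SIZE):
    """Hypothetical helper: write n_blocks zero-filled blocks to a temp file."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for _ in range(n_blocks):
            f.write(b"\0" * block_size)
    return path
```

Running this against a relation's data file and against a file of the same size written in one go would give the fragmented vs. created-at-once comparison, assuming the cache is cold in both cases.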

What I've seen in the real world is very, very poor SeqScan performance
on tables that were relatively large. It was so bad that I had to SeqScan
8-16 tables in parallel to max out the IO system the same way I could with
a single dd bs=8k of a large file (in this case, something like 480MB/s).
A single SeqScan would only do something like 30MB/s.
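
A rough way to reproduce the "N scans in parallel" side of that comparison outside the database is sketched below: several files are read concurrently in 8k blocks and the aggregate throughput is reported. This only approximates the access pattern; parallel_read_throughput, read_all and write_dummy_file are illustrative names, and threads stand in for the separate backends that would each run a SeqScan.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 8192  # 8k blocks, as in "dd bs=8k"


def read_all(path, block_size=BLOCK_SIZE):
    """Read one file sequentially in fixed-size blocks; return bytes read."""
    total = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def parallel_read_throughput(paths, workers):
    """Read all files concurrently with `workers` threads.

    Returns (total_bytes_read, aggregate_megabytes_per_second).
    File reads release the GIL, so the threads do overlap their I/O.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        total = sum(pool.map(read_all, paths))
    elapsed = time.perf_counter() - start
    rate = (total / (1024 * 1024)) / elapsed if elapsed > 0 else float("inf")
    return total, rate


def write_dummy_file(n_blocks, block_size=BLOCK_SIZE):
    """Hypothetical helper: create a temp file of n_blocks zero-filled blocks."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"\0" * (n_blocks * block_size))
    return path
```

Comparing workers=1 against workers=8 or 16 on large, cold files would show whether a single sequential reader can saturate the storage on its own, which is what a single dd managed to do here.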
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
