Re: Relation extension scalability

From: Qingqing Zhou <zhouqq(dot)postgres(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Relation extension scalability
Date: 2015-04-17 18:19:19
Message-ID: CAJjS0u0HgsTzFKEjg6+yOVgQswS5W0pySCa=nRXJcbhXvi13Uw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Mar 29, 2015 at 11:56 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> I'm not sure whether the above is the best solution however. For one I
> think it's not necessarily a good idea to opencode this in hio.c - I've
> not observed it, but this probably can happen for btrees and such as
> well. For another, this is still a exclusive lock while we're doing IO:
> smgrextend() wants a page to write out, so we have to be careful not to
> overwrite things.
>
I think relaxing a global lock will fix the contention mostly.
However, several people suggested that extending with many pages have
other benefits. This hints for a more fundamental change in our
storage model. Currently we map one file per relation. While it is
simple and robust, considering partitioned table, maybe later columnar
storage are integrated into the core, this model needs some further
thoughts. Think about a 1000 partitioned table with 100 columns: that
is 100K files, no to speak of other forks - surely we can continue
challenging file system's limit or playing around vfds, but we have a
chance now to think ahead.

Most commercial database employs a DMS storage model, where it manages
object mapping and freespace itself. So different objects are sharing
storage within several files. Surely it has historic reasons, but it
has several advantages over current model:
- remove fd pressure
- remove double buffering (by introducing ADIO)
- controlled layout and access pattern (sequential and read-ahead)
- better quota management
- performance potentially better

Considering platforms supported and the stableness period needed, we
shall support both current storage model and DMS model. I will stop
here to see if this deserves further discussion.

Regards,
Qingqing

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2015-04-17 18:19:48 Re: INSERT ... ON CONFLICT IGNORE (and UPDATE) 3.0
Previous Message Heikki Linnakangas 2015-04-17 18:18:08 Re: Replication identifiers, take 4