Modeling consumed shmem sizes, and some thorns

From: Daniel Farina <daniel(at)heroku(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Modeling consumed shmem sizes, and some thorns
Date: 2012-05-02 20:38:34
Message-ID: CAAZKuFZrsYpA85c3uBwsrVVdAYxMJ_K1SkxKPp+M9KxomCq8SQ@mail.gmail.com
Lists: pgsql-hackers

Hello List,

I'd like to share with you some experiences we've had while
investigating what we'd have to do to make very-very tiny databases.

First, the formulae at
http://www.postgresql.org/docs/9.1/static/kernel-resources.html#SHARED-MEMORY-PARAMETERS
(Table 17-2) seem misleading, particularly with regard to the overhead
of supporting a large number of connections: they undercount by quite
a lot. The estimate of 270 bytes per lock slot seems far too low on
Linux-amd64, and the shortfall grows by 50% or more once the predicate
locks added in 9.1 are taken into account.

Presuming the base cost of 1800 bytes per connection is still right,
experimentally the right numbers seem closer to 700 bytes per
max_locks_per_transaction slot and 650 bytes per
max_pred_locks_per_transaction slot, although there appears to be some
non-linear behavior that makes this a hazy projection.
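To illustrate what these figures imply, here is a rough sketch of the
arithmetic. The per-slot constants are the experimental numbers above
(and the 1800-byte base from the docs), and the settings plugged in at
the end are purely hypothetical:

```python
# Rough estimate of connection/lock-related shared memory on 9.1,
# Linux-amd64, using the experimentally observed per-slot costs.
# All constants are approximations, not exact accounting.

BASE_PER_CONNECTION = 1800   # bytes per connection (per the 9.1 docs)
PER_LOCK_SLOT = 700          # bytes per max_locks_per_transaction slot
PER_PRED_LOCK_SLOT = 650     # bytes per max_pred_locks_per_transaction slot

def lock_shmem_estimate(max_connections, max_locks_per_xact,
                        max_pred_locks_per_xact):
    """Estimated bytes of shmem consumed by connection/lock structures."""
    per_slot = (BASE_PER_CONNECTION
                + PER_LOCK_SLOT * max_locks_per_xact
                + PER_PRED_LOCK_SLOT * max_pred_locks_per_xact)
    return max_connections * per_slot

# Hypothetical "largest config in the fleet": 500 connections,
# defaultish lock settings of 64 each.
estimate = lock_shmem_estimate(500, 64, 64)
print(estimate)                         # 44100000 bytes, roughly 42 MiB
print(estimate / (100 * 1024 * 1024))   # fraction of a 100MB shmem budget
```

At these (made-up) settings the lock structures alone would eat over
40% of a 100MB shmem allotment, which matches the shape of the problem
described below.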

Besides accuracy, there is a thornier problem here that has to do with
hot standby (though the use case is replication more generally) when
one has heterogeneously sized database resources. As-is, the
locking-related settings -- max_connections, max_prepared_xacts, and
max_locks_per_xact (but not the predicate-lock settings; is that an
oversight?) -- must be at least as large on a standby as on the
primary.
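Concretely, the constraint looks something like this (the values here
are hypothetical; the point is that the standby's settings may not be
smaller than the primary's, regardless of the standby's size):

```
# primary postgresql.conf
max_connections = 100
max_prepared_transactions = 0
max_locks_per_transaction = 64

# standby postgresql.conf -- each of these must be >= the primary's
# value, even if the standby machine is much smaller
max_connections = 100
max_prepared_transactions = 0
max_locks_per_transaction = 64
```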

In a heterogeneous environment where one uses WAL-based replication,
this means that to obtain unity and full compatibility among
different-sized systems, one must always permit a large number of
connections -- specifically, the largest number supported by any
database configuration in the fleet. Those connection slots can then
occupy a large fraction of the memory allotted to a small database: on
a database intended to receive only 100MB of shmem, lock-related
consumption can approach nearly 50% of the total, which is rather
unfortunate. I can see why that would be hard to fix (maybe a more
logical replication layer is a better investment of time), but I
thought it an interesting consideration worth discussing.

--
fdr
