Re: Postgres as In-Memory Database?

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Stefan Keller <sfkeller(at)gmail(dot)com>
Cc: pgsql-general List <pgsql-general(at)postgresql(dot)org>
Subject: Re: Postgres as In-Memory Database?
Date: 2014-04-01 02:20:54
Message-ID: CAMkU=1zAY9hWXKWyDqBG1SoU50UkwKegP5Ts+s3QSoNqjKjwTQ@mail.gmail.com
Lists: pgsql-general

On Sunday, March 30, 2014, Stefan Keller <sfkeller(at)gmail(dot)com> wrote:

> Hi Jeff
>
> 2013/11/20 Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
>
>
>>
>> I don't know what you mean about enhancements in the buffer pool. For an
>> in-memory database, there shouldn't be a buffer pool in the first place, as
>> it is *all* in memory.
>>
>
> You are right: In-memory DBs are making buffer-pooling obsolete - except
> for making data persistent (see below).
>

I would be very reluctant to use any database engine which considered disk
access obsolete. There will always be a risk that data grows to exceed
RAM, and that it would be inconvenient to expand RAM fast enough to
accommodate it. I've played those games enough with Perl and C in-memory
systems. You fight and squeeze to fit the data into RAM, then the data
size grows 3% and all of your work is for naught. You can buy more RAM, if
you have the budget, and the RAM isn't back-ordered for 3 months because
the factory that makes it had a fire, and if more RAM fits on your
motherboard, and....

> >
> > Do you know why it is slow? I'd give high odds that it would be a
> specific implementation detail in
> > the code that is suboptimal, or maybe a design decision of PostGIS,
> rather than some high level
> > architectural decision of PostgreSQL.
>
> Referring to the application is something you can always say - but that
> shouldn't prevent us from enhancing Postgres.
>
>
> Postgres has been enhanced. Now we need to change osm2pgsql to take
> advantage of them. It defines indexes on the tables that are going to be
> bulk loaded with COPY, which defeats some recent optimizations made to
> COPY. The creation of the indexes should be delayed until after the bulk
> load is done.
>
> A further enhancement to Postgres would be to automatically defer
> creation of the indexes when a table is truncated or created within a
> transaction, so that users get the benefit of the improvement.
>
> These enhancements to osm2pgsql seem reasonable to me. I hope
> somebody has time to take care of them.
>

I have a fork of osm2pgsql on github which delays the index build until the
COPY is done. I'm not really motivated to convince anyone to merge it (as
my interest is PostgreSQL, not OSM itself), but if someone wants to pick it
up, that is fine with me. It helps somewhat, but it is not a game-changer
because there are other bigger bottlenecks, at least for HDD-based systems.
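
Roughly, the reordering amounts to the following (the column list and names
here are only illustrative, not the exact osm2pgsql schema):

    BEGIN;
    CREATE TABLE planet_osm_nodes (id bigint, lat int, lon int, tags text[]);
    COPY planet_osm_nodes FROM STDIN;   -- bulk load into the still-unindexed table
    COMMIT;
    -- only now build the index, in one pass over the loaded data
    CREATE UNIQUE INDEX planet_osm_nodes_pkey ON planet_osm_nodes (id);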

One of the bigger bottlenecks is building the GIN indexes on the way table
at the end. Setting maintenance_work_mem to huge values helps a lot, if
you can find a setting that is safe given the multiple index builds that
may be running at that point (by then, osm2pgsql's node cache has been
released, so there is substantial RAM to re-purpose). It would be better
for this use if PostgreSQL built the index using an external sort,
rather than iterating over the table building maintenance_work_mem-sized
chunks of red-black trees. The problem there is that OSM uses the GIN
index in an odd way (the vast majority of nodes occur in exactly one way,
with a minority occurring in more than one), and using a disk sort might
not be ideal for the more common GIN use cases, where a given
token usually occurs in far more than one document. So an improvement that
only helps osm2pgsql and degrades other uses is unlikely to be adopted.
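
For that index-build step, the session-level setting is the relevant one; a
sketch (the names follow the osm2pgsql convention but treat them, and the
memory figure, as illustrative):

    SET maintenance_work_mem = '2GB';   -- per index build; mind how many run at once
    CREATE INDEX planet_osm_ways_nodes ON planet_osm_ways USING gin (nodes);
    RESET maintenance_work_mem;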

Another bottleneck is just the raw COPY into the node table. When that is
running against an unindexed table which was created in the same
transaction, I see that osm2pgsql takes about 50% of a CPU to print a
copy-stream, and postgresql uses about 50% of a CPU to parse that stream
and insert into the table. So they add up to about 1 CPU despite the fact
that this is a multi-CPU machine. They seem to be playing ping-pong with
the pipe buffer when in theory they should each be able to run at almost
full speed. I don't know how to get them to stop playing ping-pong, but I
have other use cases where this shows up, so a trade-off-free solution would
be nifty. I suspect that is more of a kernel issue than a postgresql or
osm2pgsql one.

You could do the COPY in parallel in multiple threads, but the problem
there is you can't use the "created in same transaction" optimization to
avoid WAL overhead. There is no fix to this without changing PostgreSQL to
accommodate it, but I have no clear idea how one would do that. Importing
a snapshot doesn't seem like it would be enough, as you can only import
snapshots for reading, not for writing.
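
For reference, the WAL-skipping pattern I mean looks roughly like this
(table name illustrative; it requires wal_level = minimal, and the COPY must
run in the very transaction that created or truncated the table, which is
exactly what stops you from spreading it across connections):

    BEGIN;
    TRUNCATE planet_osm_nodes;          -- or CREATE TABLE ... in this same transaction
    COPY planet_osm_nodes FROM STDIN;   -- can skip WAL; the table is fsynced at commit
    COMMIT;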

Also, I'm not too sure how seriously to take the goal of optimizing
osm2pgsql. Development on it seems to be less than vigorous. And its
purpose is to create a database to be used, so wouldn't it make more sense
to optimize the use, not the creation? And if you do want to optimize the
creation, the obvious way to do it is to create the export in a way more
closely aligned to that need, rather than a generic export.

> In the meantime I discussed with HANA users and thought about what makes
> in-memory dbs special and how to configure Postgres to be an in-memory db.
>
> There seem to be two main things which make in-memory dbs special:
> 1. Index: With all data in memory, there is no need for a default index.
> A full-table scan "suddenly" becomes the default.
>

Surely not. I would say that full table scans are *already* the default,
deviated from only if it thinks an index seems to be better. If you don't
make an index, it can't seem to be better. And I don't know of any
higher-level in-memory language which fails to provide a way to do
efficient searching into an in-memory structure, usually in the form of
hash tables or balanced trees. If "let's seq scan everything as long as it
is already in memory" is a good idea, why would Perl, Java, Python, etc.
(not to mention C libraries and the source code of PostgreSQL itself)
provide ways to do efficient searches in memory?
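
You can see the default for yourself on a trivial (made-up) table:

    CREATE TABLE t (id int, payload text);
    EXPLAIN SELECT * FROM t WHERE id = 42;   -- Seq Scan on t: the only plan available
    CREATE INDEX t_id_idx ON t (id);
    EXPLAIN SELECT * FROM t WHERE id = 42;   -- now it may switch to the index, if that looks cheaper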

The caveat here is you can't make certain constraints without an index. In
theory you *could* have a unique constraint without an index to support
it, but if it were implemented you would probably rapidly learn that you
don't actually want to do that.

> 2. Persistence: All updates/changes to the db are streamed to disk.
>

That is already done through WAL files. But they are also written to
memory in their proper location, and then that location is written to disk
as well in a scattered manner once per checkpoint. If you suppress
checkpoints entirely then every time the database is restarted it would
have to read the entire history of the database since creation from the WAL
to create the initial in-memory image. So, how should checkpoints be
handled?

> 3. An increase of data simply needs to be compensated with more memory
> (since memory became cheap).
>

The size of the data seems to be increasing at the same rate as RAM is
getting cheaper, if not faster. And the new RAM might need a new
motherboard, and then a new power supply, and a new dev environment and
new test environment and then a new rack to mount them, and then a union
electrician to wire up the new rack, and then a new data center to hold the
next new rack.....

The same could be said for new hard drives, too, but I usually have more
head room in my hard drives than in my RAM.

> AFAIK one can configure Postgres for all these properties: One can force
> the use of full-table scan(?) and replicate to disk(?).
>

Yes, this can already be done. Don't build an index, and delay checkpoints as
long as you dare (1 hour is currently the max, but a simple hack can extend
that. Whether you will like the consequences of that is another matter).
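
As a rough postgresql.conf sketch (values are only illustrative, syntax is of
the current 9.3 era, and stretching the checkpoint interval lengthens crash
recovery accordingly):

    checkpoint_timeout  = 1h     # the current upper limit without patching the source
    checkpoint_segments = 300    # keep WAL volume from forcing checkpoints sooner
    shared_buffers      = 8GB    # sized so the working set actually stays in memory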

>
> What do you think about this?
>

I think there is no reason to think these changes will do much good.
In-memory databases usually run as libraries, not as client-server
infrastructures. Without that, you already have one hand tied behind your
back. Also, I think they usually map datatypes directly to hardware-supported
types, which is surely faster but would render PostgreSQL's flexible type
system nonfunctional. I don't know how they deal with visibility, but I
have to assume they either compromise on correctness or limit the amount
of updating which is possible on the in-memory portion. By the time you do
all that, it isn't clear what would be left of the things that
make PostgreSQL PostgreSQL.

> Are there any show cases out there?
>

What did the HANA users have to say? Seems like they would be in the best
position to provide the test cases.

Cheers,

Jeff

>
