Re: Postgres as In-Memory Database?

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Edson Richter <edsonrichter(at)hotmail(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Postgres as In-Memory Database?
Date: 2013-11-20 03:30:10
Message-ID: CAMkU=1y0_QQ3qOvgRsFbCTEfVcHG1qigbvGQwt+FUNCTdDJp7Q@mail.gmail.com
Lists: pgsql-general

On Tuesday, November 19, 2013, Edson Richter wrote:

> On 19/11/2013 22:29, Jeff Janes wrote:
>
> On Sun, Nov 17, 2013 at 4:46 PM, Edson Richter <edsonrichter(at)hotmail(dot)com>
> wrote:
>
>
>> Yes, those optimizations I was talking about: having database server
>> store transaction log in high speed solid state disks and consider it done
>> while background thread will update data in slower disks...
>>
>> There is no reason to wait for fsync on slow disks to guarantee
>> consistency... If the database server crashes, then it just needs to "redo"
>> the logged transactions from the fast disk into the slower data storage and
>> the server is ready to go (I think this has been the Sybase/MS SQL strategy
>> for years).
>>
>
>
> Using a nonvolatile write cache for pg_xlog is certainly possible and
> often done with PostgreSQL. It is not important that the nonvolatile write
> cache is fronting for SSD, fronting for HDD is fine as the write cache
> turns the xlog into pure sequential writes and HDD should not have a
> problem keeping up.
>
> Cheers,
>
> Jeff
>
> Hmm... I agree about the technology (SSD vs. HDD, etc.), but maybe I
> misunderstood: I have read that to keep data always safe, I must use
> fsync, and as a result every transaction must wait for data to be written to
> disk before returning success.
>

A transaction must wait for the *xlog* to be fsynced to "disk", but
non-volatile write cache counts as disk. It does not need to wait for the
ordinary data files to be fsynced. Checkpoints do need to wait for the
ordinary data files to be fsynced, but the checkpoint process is a
background process and it can wait for that without impeding user processes.
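
The ordering above can be illustrated with a toy write-ahead log. This is
only a sketch in Python, not how PostgreSQL is implemented: the class
ToyStore, the file names wal.log and data.db, and the key=value record
format are all made up for the illustration. The point is that commit()
fsyncs only the sequential log, checkpoint() is the only place the data
file is fsynced, and recover() replays the log after a crash.

```python
import os


class ToyStore:
    """Toy write-ahead-log store (hypothetical; not PostgreSQL's design)."""

    def __init__(self, directory):
        self.wal_path = os.path.join(directory, "wal.log")
        self.data_path = os.path.join(directory, "data.db")
        self.wal = open(self.wal_path, "ab")
        self.pages = {}      # in-memory copy of the data ("buffers")
        self.dirty = set()   # keys changed since the last checkpoint

    def commit(self, key, value):
        # The only fsync a committing caller waits for: the log append.
        self.wal.write(f"{key}={value}\n".encode())
        self.wal.flush()
        os.fsync(self.wal.fileno())
        # The data itself is only changed in memory at this point.
        self.pages[key] = value
        self.dirty.add(key)

    def checkpoint(self):
        # Background work: write the data file and fsync it.
        with open(self.data_path, "wb") as f:
            for key, value in sorted(self.pages.items()):
                f.write(f"{key}={value}\n".encode())
            f.flush()
            os.fsync(f.fileno())
        self.dirty.clear()

    def close(self):
        self.wal.close()


def recover(directory):
    """Rebuild state after a crash: load the data file, then replay the log."""
    pages = {}
    for name in ("data.db", "wal.log"):
        path = os.path.join(directory, name)
        if os.path.exists(path):
            with open(path, "rb") as f:
                for line in f:
                    key, _, value = line.decode().rstrip("\n").partition("=")
                    pages[key] = value
    return pages
```

If the process dies after commit() but before any checkpoint(), recover()
still returns every committed key, because the log alone is enough to redo
the changes. A real system also truncates or recycles the log after a
checkpoint, which this sketch leaves out.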

If the checkpointer falls far enough behind, then things do start to fall
apart, but I think that this is true of any system. So you can't just get
a BBU for the xlog and ignore all other IO entirely--eventually the
other data does need to reach disk, and if it gets dirtied faster than it
gets cleaned for a prolonged period then things will freeze up.
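
A rough way to see the "dirtied faster than cleaned" problem is a small
back-of-the-envelope simulation. This is a deliberately simplified model
in Python, not a description of PostgreSQL's buffer management: the
function simulate() and all of its parameters (pages dirtied per second,
pages cleaned per second, and a fixed pool of buffer pages) are assumptions
made up for the illustration.

```python
def simulate(dirty_rate, clean_rate, buffer_pages, seconds):
    """Return the second at which every buffer page is dirty, or None.

    dirty_rate   -- pages dirtied per second by user activity (assumed)
    clean_rate   -- pages written out per second in the background (assumed)
    buffer_pages -- total pages available to hold dirty data (assumed)
    """
    dirty = 0
    for t in range(1, seconds + 1):
        dirty += dirty_rate                  # user processes dirty pages
        dirty = max(0, dirty - clean_rate)   # background writer cleans some
        if dirty >= buffer_pages:
            return t                         # no clean pages left: stall
    return None


# Cleaning keeps up: the backlog never builds.
print(simulate(dirty_rate=100, clean_rate=120, buffer_pages=1000, seconds=60))
# Cleaning falls behind by 50 pages per second: the pool fills eventually.
print(simulate(dirty_rate=150, clean_rate=100, buffer_pages=1000, seconds=60))
```

Short bursts above the cleaning rate are absorbed by the pool; it is only a
sustained deficit that ends in a stall, and a faster xlog device does
nothing to change that.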

> By using the approach I've described you still have fsync (and data will be
> 100% safe), but the transaction is considered successful once it is written
> to the transaction log, which is purely sequential (and even pre-allocated,
> with no need to ask the OS for new files or new space) - and there is also
> no need to wait for slow writes to the data pages.
>
> Am I wrong?
>

No user-facing process needs to wait for the data pages to fsync, unless
things have really gotten fouled up.

Cheers,

Jeff
