From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: large dataset with write vs read clients
Date: 2010-10-10 06:43:12
Message-ID: 4CB16080.1050406@postnewspapers.com.au
Lists: pgsql-performance
On 10/10/2010 5:35 AM, Mladen Gogala wrote:
> I have a logical problem with asynchronous commit. The "commit" command
> should instruct the database to make the outcome of the transaction
> permanent. The application should wait to see whether the commit was
> successful or not. Asynchronous behavior in the commit statement breaks
> the ACID rules and should not be used in a RDBMS system. If you don't
> need ACID, you may not need RDBMS at all. You may try with MongoDB.
> MongoDB is web scale: http://www.youtube.com/watch?v=b2F-DItXtZs
That argument makes little sense to me.
Because you can afford a clearly defined and bounded loosening of the
durability guarantee provided by the database - you know and accept that
you may lose the last few seconds of work if your OS crashes or your UPS
fails - it follows that you don't really need durability guarantees at
all, let alone all that atomic commit silliness, transaction isolation,
or the guarantee of a consistent on-disk state?
Some of the other flavours of non-SQL database, both those that've been
around forever (PICK/UniVerse/etc, Berkeley DB, Cache, etc) and those
that're new and fashionable (Cassandra, CouchDB, etc), provide some ACID
properties anyway. Not wanting an SQL interface to your database doesn't
mean you have to throw out all that other database-y goodness - unless
you've been drinking too much of the NoSQL kool-aid.
There *are* situations in which it's necessary to switch to distributed,
eventually-consistent databases with non-traditional approaches to data
management. It's awfully nice not to have to, though: doing so can force
you into a lot of wheel reinvention when it comes to querying, analysing
and reporting on your data.
FWIW, a common approach in this sort of situation - accepting that
RDBMSs aren't great at continuous fast loading of individual records -
has historically been to log the records in batches to a flat file,
Berkeley DB, etc as a staging area. You periodically rotate that file
out and bulk-load its contents into the RDBMS for analysis and
reporting. This doesn't have to be hourly; every minute is usually
pretty reasonable, and it still gives your database a much easier time
without forcing you to modify your app to batch inserts into
transactions or anything like that.
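The load step for a rotated staging file can be sketched as a single
COPY per batch; the table name and file path here are hypothetical, but
COPY itself is the standard PostgreSQL bulk-load path and replaces
thousands of individual INSERTs with one server-side operation:

```sql
-- Bulk-load the rotated staging file in one transaction.
-- Table and path are made up for illustration. Note that COPY ... FROM
-- a file runs as the server process, so the file must be readable by
-- it; from a client, psql's \copy does the equivalent client-side.
BEGIN;
COPY log_entries (logged_at, payload)
    FROM '/var/spool/myapp/staging.rotated.csv' WITH CSV;
COMMIT;
-- The app then truncates or renames the staging file and keeps
-- appending new records to a fresh one.
```

Because the whole batch loads in one transaction, a failed load leaves
the table untouched and the staging file intact for a retry.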
--
Craig Ringer
Tech-related writing at http://soapyfrogs.blogspot.com/