Re: Core team statement on replication in PostgreSQL

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: adsmail(at)wars-nicht(dot)de, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Core team statement on replication in PostgreSQL
Date: 2008-05-31 01:32:26
Message-ID: 4840AAAA.7010109@dunslane.net
Lists: pgsql-advocacy pgsql-hackers

Tatsuo Ishii wrote:
>> Andreas 'ads' Scherbaum wrote:
>>
>>> On Thu, 29 May 2008 23:02:56 -0400 Andrew Dunstan wrote:
>>>
>>>> Well, yes, but you do know about archive_timeout, right? No need to wait
>>>> 2 hours.
>>>>
>>> Then you ship 16 MB of binary stuff every 30 seconds or every
>>> minute, but you only have a few kilobytes of real data in the
>>> logfile. This must be taken into account, especially if you ship the
>>> logfile over the internet (meaning: no high-speed connection, maybe
>>> even pay-per-traffic) to the slave.
>>>
>> Sure there's a price to pay. But that doesn't mean the facility doesn't
>> exist. And I rather suspect that most of Josh's customers aren't too
>> concerned about traffic charges or affected by such bandwidth
>> restrictions. Certainly, none of my clients are, and they aren't in the
>> giant class. Shipping a 16 MB file, particularly if compressed, every
>> minute or so, is not such a huge problem for a great many commercial
>> users, and even many domestic users.
>>
>
> Sumitomo Electric Co., Ltd., a Japanese company with 20 billion
> dollars in sales (parent company of Sumitomo Electric Information
> Systems Co., Ltd., one of the companies supporting the Recursive SQL
> development), uses 100 PostgreSQL servers. They take backups by log
> shipping to another data center and have problems with the volume of
> log data being transferred. They said this is one of the big problems
> they have with PostgreSQL and hope it will be solved in the near
> future.
>

Excellent data point. Now, what I'd like to know is whether they are
getting into trouble simply because of the volume of log data generated
or because they have a short archive_timeout set. If it's the former
(which seems more likely) then none of the ideas I have seen so far in
this discussion seems likely to help, and that would indeed be a major
issue we should look at. Another question is this: are they being
overwhelmed by the amount of network traffic generated, or by the
difficulty of the postgres producers and/or consumers keeping up? If
it's network traffic, then perhaps compression would help us.
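
To make that concrete, here is a minimal sketch of what a compressing
archive_command might look like; the script name, host, and paths are
invented for illustration:

    # postgresql.conf
    archive_mode = on
    archive_timeout = 60    # force a segment switch at least once a minute
    archive_command = '/usr/local/bin/ship_wal.sh %p %f'

    #!/bin/sh
    # ship_wal.sh: compress the finished segment, then copy it to the
    # standby. $1 is the segment's path (%p), $2 its file name (%f).
    gzip -c "$1" > /tmp/"$2".gz &&
        scp /tmp/"$2".gz standby.example.com:/wal_archive/ &&
        rm /tmp/"$2".gz

Note that archive_command must return zero only when the segment has
been safely archived, which the && chain here preserves.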

Maybe we need to set some goals for the level of log volume we expect
to be able to create/send/consume.
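
For instance (rough numbers, assuming a 16 MB segment is shipped every
time a one-minute archive_timeout fires):

    16 MB/segment * 60 segments/hour * 24 hours ~= 23 GB/day

of raw traffic, even when only a few kilobytes of it is live WAL data.
Compression should recover most of that, although recycled segments
carry stale data in their unused tail, so the tail may need zeroing
before gzip can do much with it.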

cheers

andrew
