Re: Sync Rep v17

From: Daniel Farina <daniel(at)heroku(dot)com>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sync Rep v17
Date: 2011-02-25 02:13:25
Message-ID: AANLkTinFcM494Vn+Fj2nqctAVo4zZMv6zKn30TMWR18N@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 22, 2011 at 11:43 AM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
> On Sat, Feb 19, 2011 at 11:26 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>
>> DEBUG:  write 0/3027BC8 flush 0/3014690 apply 0/3014690
>> DEBUG:  released 0 procs up to 0/3014690
>> DEBUG:  write 0/3027BC8 flush 0/3027BC8 apply 0/3014690
>> DEBUG:  released 2 procs up to 0/3027BC8
>> WARNING:  could not locate ourselves on wait queue
>> server closed the connection unexpectedly
>>        This probably means the server terminated abnormally
>>        before or while processing the request.
>> The connection to the server was lost. Attempting reset: DEBUG:
>
> you can make this happen more easily, i just run "pgbench -n -c10 -j10
> test" and qot that warning and sometimes a segmentation fault and
> sometimes a failed assertion

I have also reproduced this. Notably, things seem fine as long as
pgbench is confined to one backend, but as soon as two are used (-c 2)
by the feature I can get segfaults.

In the UI department, I am finding it somewhat difficult to affirm
that I am, in fact, synchronously replicating anything in the HA
scenario (where I never want to block. However, by enjoying the patch
at DEBUG3 and running what I think to be syncrepped and non-syncrepped
runs, I believe that I am not committing user error (seeing syncrep
waiting vs. lack thereof). This is in part hard to confirm because
the single-backend performance (if DEBUG3 is to be believed, I will
write a real torture test later) of syncrep is actually very good, I
was expecting a more perceptible performance dropoff. But then again,
I imagine the real kicker will happen when we can run concurrent
backends. Also, Amazon EBS doesn't have the fastest disks, and within
an availability zone network latency is awfully low.

I can't quite explain what I was seeing before w.r.t. memory usage,
and more pressingly, a very slow recover. I turned off hot standby and
was messing around and, before I knew it, the server was caught up. I
do not know if that was just coincidence (probably) or overhead
imposed by HS. The very high RES number was linux fooling me, as most
of it was SHR and in SHMEM.

--
fdr

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Farina 2011-02-25 02:26:01 Re: Sync Rep v17
Previous Message Robert Haas 2011-02-25 00:08:54 Re: Allow pg_archivecleanup to ignore extensions