Re: pg_dump and pgpool

From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: smarlowe(at)g2switchworks(dot)com
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-general(at)postgresql(dot)org
Subject: Re: pg_dump and pgpool
Date: 2004-12-31 06:55:37
Message-ID: 20041231.155537.42772569.t-ishii@sra.co.jp
Lists: pgsql-general

> On Thu, 2004-12-30 at 09:20, Tom Lane wrote:
> > Scott Marlowe <smarlowe(at)g2switchworks(dot)com> writes:
> > >> I don't think it's worth that price to support a fundamentally bogus
> > >> approach to backup.
> >
> > > But it's not bogus. It allows me to compare two databases running under
> > > a pgpool synchronous cluster and KNOW if there are inconsistencies in
> > > data between them, so it is quite useful to me.
> >
> > As a data comparison tool it is certainly bogus. What about different
> > row ordering between the two databases, for instance?
>
> Apparently pgpool knows that different order is ok.

I think pgpool actually behaves differently from what you expect.
pgpool just ignores the content of the data. Let me show you an example.

on the master:

test=# select * from t1;
i
---
1
2
(2 rows)

on the secondary:
test=# select * from t1;
i
---
1
3
(2 rows)

result from pgpool:

test=# select * from t1;
i
---
1
2
(2 rows)

However, it does check the packet length. Here is another example.

on the master:

test=# select * from t2;
t
-----
abc
(1 row)

on the secondary:

test=# select * from t2;
t
------
abcd
(1 row)

result from pgpool:

test=# select * from t2;
t
-----
abc
(1 row)

LOG: pid 1093: SimpleForwardToFrontend: length does not match between backends master(13) secondary(14) kind:(D)
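
So if the contents differ but the lengths happen to match, pgpool notices
nothing at all. A made-up example (not from a real session):

on the master:     insert into t2 values ('abc');
on the secondary:  insert into t2 values ('xyz');

A SELECT through pgpool would just return the master's 'abc' and log
nothing, because both data packets have the same length even though the
contents differ.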

> Having three psql's
> open, one to the front end pgpool, one to each of the backends, I can
> insert data in different orders on each backend, select it on each, and
> get a different order, but from the front end it works:
>
> on the MASTER database:
> test=# \d
> List of relations
> Schema | Name | Type | Owner
> --------+------+-------+----------
> public | test | table | postgres
> (1 row)
>
> test=# insert into test values (2);
> INSERT 11839388 1
> test=# insert into test values (1);
> INSERT 11839389 1
> test=# select * from test;
> id
> ----
> 2
> 1
> (2 rows)
>
> on the SLAVE database:
> test=# \d
> List of relations
> Schema | Name | Type | Owner
> --------+------+-------+----------
> public | test | table | postgres
> (1 row)
>
> test=# insert into test values (1);
> INSERT 13612414 1
> test=# insert into test values (2);
> INSERT 13612415 1
> test=# select * from test;
> id
> ----
> 1
> 2
> (2 rows)
>
> On the front end:
> test=# select * from test;
> id
> ----
> 2
> 1
> (2 rows)
>
> Now I add a wrong row to the slave database:
>
> test=# insert into test values (3);
> INSERT 13612416 1
>
> and I get this error from the front end:
> test=# select * from test;
> ERROR: kind mismatch between backends
> HINT: check data consistency between master and secondary
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Succeeded.
> After deleting the row, things return to normal:
> test=# delete from test where id=3;
> DELETE 1
> From the front end I get:
> test=# select * from test;
> id
> ----
> 2
> 1
> (2 rows)
>
> > AFAICS this could only work if you were doing physical rather than
> > logical replication (eg, shipping WAL logs) in which case the OIDs would
> > be just as much in sync as everything else.
>
> So, for me, the OIDs are the ONLY problem I'm getting here. Note that
> the application we're running on the front end only connects to the
> database with a single thread, and serializes in the intermediate layer
> (not my choice, but it seems to work pretty well so far...) so sequences
> also aren't an issue, as all the queries will go in one at a time.

I think in this case the row ordering problem will not hurt you.
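
If row ordering ever does become a problem (say, with more than one
session writing at a time), a simple workaround is to make the ordering
explicit in the query itself, for example:

test=# select * from test order by id;

so that both backends return the rows in the same order regardless of how
the rows happen to be stored.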

> > Basically my point is that you are proposing to do a lot of work in
> > order to solve the first problem you are running up against, but that
> > will only get you to the next problem. I'm not prepared to accept a
> > significant increase in complexity and loss of maintainability in
> > pg_dump in order to move one step closer to the dead end that you will
> > certainly hit.
>
> I'm certainly willing to do the vast majority of the work. As Greg I
> think mentioned, maybe a fresh start using the information_schema would
> make sense as a sort of non-pg specific backup tool or something.

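As for the information_schema idea, here is just a rough sketch of the
kind of query such a tool might start from (an illustration only, not a
worked-out design):

SELECT table_schema, table_name, column_name, data_type
  FROM information_schema.columns
 WHERE table_schema NOT IN ('pg_catalog', 'information_schema')
 ORDER BY table_schema, table_name, ordinal_position;

Run against each backend and diffed, something like that does not depend
on OID ordering at all.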