Re: PostgreSQL clustering VS MySQL clustering

From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: josh(at)agliodbs(dot)com
Cc: pgsql-performance(at)postgresql(dot)org, darcy(at)wavefire(dot)com, jd(at)www(dot)commandprompt(dot)com, sfrost(at)snowman(dot)net, herve(at)elma(dot)fr
Subject: Re: PostgreSQL clustering VS MySQL clustering
Date: 2005-01-22 03:01:28
Message-ID: 20050122.120128.74753619.t-ishii@sra.co.jp
Lists: pgsql-performance

> Tatsuo,
>
> > Suppose table A gets updated on the master at time 00:00. Until 00:03
> > pgpool needs to send all queries regarding A to the master only. My
> > question is, how can pgpool know a query is related to A?
>
> Well, I'm a little late to head off tangential discussion about this, but ....
>
> The systems where I've implemented something similar are for web applications.
> In the case of the web app, you don't care if most users see data which is
> 2 seconds out of date; with caching and whatnot, it's often much more than
> that!
>
> The one case where it's not permissible for a user to see "old" data is the
> case where the user is updating the data. Namely:
>
> (1) 00:00 User A updates "My Profile"
> (2) 00:01 "My Profile" UPDATE finishes executing.
> (3) 00:02 User A sees "My Profile" re-displayed
> (6) 00:04 "My Profile":UserA cascades to the last Slave server
>
> So in an application like the above, it would be a real problem if User A were
> to get switched over to a slave server immediately after the update; she
> would see the old data, assume that her update was not saved, and update
> again. Or send angry e-mails to webmaster(at)(dot)
>
> However, it makes no difference what User B sees:
>
> (1) 00:00 User A updates "My Profile"v1                       Master
> (2) 00:01 "My Profile" UPDATE finishes executing.             Master
> (3) 00:02 User A sees "My Profile"v2 displayed                Master
> (4) 00:02 User B requests "MyProfile":UserA                   Slave2
> (5) 00:03 User B sees "My Profile"v1                          Slave2
> (6) 00:04 "My Profile"v2 cascades to the last Slave server    Slave2
>
> If the web application is structured properly, the fact that UserB is seeing
> UserA's information which is 2 seconds old is not a problem (though it might
> be for web auctions, where it could result in race conditions. Consider
> memcached as a helper). This means that pgPool only needs to monitor
> "update switching" by *connection* not by *table*.
>
> Make sense?

I'm not clear on what "pgPool only needs to monitor "update switching" by
*connection* not by *table*" means. In your example:

> (1) 00:00 User A updates "My Profile"
> (2) 00:01 "My Profile" UPDATE finishes executing.
> (3) 00:02 User A sees "My Profile" re-displayed
> (6) 00:04 "My Profile":UserA cascades to the last Slave server

I think (2) and (3) are on different connections, so pgpool cannot
tell whether the SELECT in (3) should go only to the master.

To solve the problem you would need to make pgpool understand "web
sessions" rather than "database connections", and it seems impossible
for pgpool to understand "sessions".
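
To make the distinction concrete, here is a minimal sketch of routing
keyed on the web session rather than on the database connection. It
is not something pgpool does; it assumes the decision is made in the
application layer, which is the only place that knows the session.
The class name, the method names and the 4-second window (taken from
the 00:00 to 00:04 timeline above) are all hypothetical.

```python
import time


class SessionAwareRouter:
    """Send a session's reads to the master while its own write may
    not have reached every slave yet (hypothetical helper)."""

    def __init__(self, lag_window_seconds=4.0, clock=time.monotonic):
        # Assumed upper bound on replication lag, in seconds.
        self.lag_window = lag_window_seconds
        # The clock is injectable so the behaviour can be tested.
        self.clock = clock
        # Web session id -> time of that session's most recent write.
        self._last_write = {}

    def record_write(self, session_id):
        """Call after an UPDATE/INSERT/DELETE issued for this session."""
        self._last_write[session_id] = self.clock()

    def target_for_read(self, session_id):
        """Return "master" or "slave" for a SELECT from this session."""
        last = self._last_write.get(session_id)
        if last is not None and self.clock() - last < self.lag_window:
            return "master"
        return "slave"
```

With this, User A's SELECT in (3) goes to the master because her
session wrote at 00:00, while User B's SELECT may go to any slave,
no matter which database connection either request happens to use.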
--
Tatsuo Ishii
