Re: Partitioning / Clustering

From: Alex Stapleton <alexs(at)advfn(dot)com>
To: Alex Turner <armtuk(at)gmail(dot)com>
Cc: PFC <lists(at)boutiquenumerique(dot)com>, josh(at)agliodbs(dot)com, David Roussel <pgsql-performance(at)diroussel(dot)xsmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Partitioning / Clustering
Date: 2005-05-12 15:16:53
Message-ID: 360CDE65-3FAF-428A-BA25-6F6ECCAA5689@advfn.com
Lists: pgsql-performance


On 12 May 2005, at 15:08, Alex Turner wrote:

> Having local sessions is unnecessary, and here is my logic:
>
> Generally, most people have less than 100Mb of bandwidth to the
> internet.
>
> If you make the assertion that you are transferring equal or less
> session data between your session server (let's say an RDBMS) and the
> app server than you are between the app server and the client, an out
> of band 100Mb network for session information is plenty of bandwidth.
> This also represents OLTP style traffic, which PostgreSQL is pretty
> good at. You should easily be able to get over 100 TPS. 100 hits per
> second is an awful lot of traffic, more than any website I've managed
> will ever see.
>
> Why solve the complicated clustered sessions problem, when you don't
> really need to?

100 hits a second = 8,640,000 hits a day. I work on a site which does
more than 100 million dynamic pages a day. In comparison, Yahoo probably
does more than 100,000,000,000 (100 billion) views a day, if I am
interpreting Alexa's charts correctly, which is about 1,157,000 a
second.
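
As a rough check of the arithmetic above, here is a minimal Python sketch that converts between a per-second rate and a daily total. The function names are made up for illustration, and it assumes traffic is spread evenly over the day, which real sites never see (peak rates will be higher than these averages).

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def per_day(rate_per_second):
    """Daily total for a constant per-second rate."""
    return rate_per_second * SECONDS_PER_DAY


def per_second(total_per_day):
    """Average per-second rate for a daily total."""
    return total_per_day / SECONDS_PER_DAY


print(f"{per_day(100):,}")                        # 8,640,000
print(f"{per_second(100_000_000):,.0f}")          # 1,157
print(f"{per_second(100_000_000_000):,.0f}")      # 1,157,407
```

So 100 million pages a day already averages more than ten times the 100 hits per second figure, before accounting for peaks.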

Now considering the site I work on is not even in the top 1000 on
Alexa, there are a lot of sites out there which need to solve this
problem, I would assume.

There are also only so many hash table lookups a single machine can
do, even if it's a Quad Opteron behemoth.
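
One common way around the single-machine limit is to spread session lookups over several cache nodes by hashing the session ID. The sketch below is only an illustration of that idea, not how memcached or any particular client library does it; the node names and the `pick_node` and `spread` helpers are hypothetical.

```python
import hashlib
from collections import Counter


def pick_node(session_id, nodes):
    """Map a session ID to one node using a stable hash of the ID."""
    digest = hashlib.md5(session_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then take it
    # modulo the number of nodes to choose an index.
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def spread(session_ids, nodes):
    """Count how many of the given session IDs land on each node."""
    return Counter(pick_node(sid, nodes) for sid in session_ids)


# Hypothetical node names, for illustration only.
nodes = ["cache-a", "cache-b", "cache-c"]
counts = spread((f"session-{i}" for i in range(10_000)), nodes)
for node in nodes:
    print(node, counts[node])
```

Each node then only has to serve its own share of the lookups. Note that plain modulo hashing remaps most keys whenever the number of nodes changes; consistent hashing is the usual way to reduce that.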

> Alex Turner
> netEconomist
>
> On 5/11/05, PFC <lists(at)boutiquenumerique(dot)com> wrote:
>
>>> However, memcached (and for us, pg_memcached) is an excellent way to
>>> improve horizontal scalability by taking disposable data (like session
>>> information) out of the database and putting it in protected RAM.
>>>
>>
>> So, what is the advantage of such a system versus, say, a "sticky
>> sessions" system where each session is assigned to ONE application
>> server (not PHP then) which keeps it in RAM as native objects instead
>> of serializing and deserializing it on each request?
>> I'd say the sticky sessions should perform a lot better, and if one
>> machine dies, only the sessions on this one are lost.
>> But of course you can't do it with PHP as you need an app server
>> which can manage sessions. Potentially the savings are huge, though.
>>
>> At Google, their distributed system spans a huge number of PCs and it
>> has redundancy, i.e. individual PC failure is a normal thing and is a
>> part of the system; it is handled gracefully. I read a paper on this
>> matter, it's pretty impressive. The Google File System has nothing to
>> do with databases though, it's more a massive data store / streaming
>> storage.
>>
