Re: ext3 filesystem / linux 7.3

From: Kevin Brown <kevin(at)sysexperts(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Cc: Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: ext3 filesystem / linux 7.3
Date: 2003-04-08 23:22:47
Message-ID: 20030408232247.GH1847@filer
Lists: pgsql-performance

Josh Berkus wrote:
> Jeffery,
>
> > Can't we generate data? Random data stored in random formats at random
> > sizes would stress the file system wouldn't it?
>
> In my experience, randomly generated data tends to resemble real data very
> little in distribution patterns and data types. This is one of the
> limitations of PGBench.

Okay, from this it sounds like what we need is information on the data
types typically used for real world applications and information on
the distribution patterns for each type (the latter could get
quite complex and varied, I'm sure, but since we're after something
that's typical, we only need a few examples).

So perhaps the first step in this is to write something that will show
what the distribution pattern for data in a table is? With that
information, we *could* randomly generate data that would conform to
the statistical patterns seen in the real world.
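
As a rough sketch of that idea (not an existing tool), something like
the following could profile one column and then generate values that
follow the same frequencies. The function names profile_column and
generate_column are made up for illustration, and it assumes the
column's values have already been fetched into a Python list, with
None standing in for NULL. It only handles a simple frequency table;
continuous or correlated data would need something more elaborate.

```python
import random
from collections import Counter


def profile_column(values):
    """Summarize a column: fraction of NULLs and relative frequency
    of each distinct non-NULL value (hypothetical helper)."""
    total = len(values)
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    return {
        "null_fraction": (total - len(non_null)) / total if total else 0.0,
        "frequencies": {v: c / len(non_null) for v, c in counts.items()},
    }


def generate_column(profile, n, rng=None):
    """Generate n values that follow the profiled frequencies
    (hypothetical helper)."""
    rng = rng or random.Random()
    values = list(profile["frequencies"])
    weights = list(profile["frequencies"].values())
    out = []
    for _ in range(n):
        # Emit NULL with the observed probability, or always if the
        # profiled column had no non-NULL values at all.
        if not values or rng.random() < profile["null_fraction"]:
            out.append(None)
        else:
            out.append(rng.choices(values, weights=weights, k=1)[0])
    return out
```

Only the profile (not the rows themselves) would need to leave the
owner's system, e.g. profile_column(["a", "a", "b", None]) followed by
generate_column(profile, 1000) on the test machine.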

In fact, even though the databases you have access to are all
proprietary, I'm pretty sure their owners would agree to let you run a
program that would gather statistical distribution information about
the data in them. Then (as long as they agree) you could copy the
schema itself, recreate it on the test system, and randomly generate
the data.

--
Kevin Brown kevin(at)sysexperts(dot)com
