Re: multiple sampling from tables and saving output

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	David Orme <d(dot)orme(at)imperial(dot)ac(dot)uk>
Cc:	pgsql-novice(at)postgresql(dot)org
Subject:	Re: multiple sampling from tables and saving output
Date:	2005-02-07 15:54:46
Message-ID:	27273.1107791686@sss.pgh.pa.us
Lists:	pgsql-novice

David Orme <d(dot)orme(at)imperial(dot)ac(dot)uk> writes:
> The process I need to do is a loop of 1000 repetitions of the
> following:

> 1) select a random subset of the data from a table
> 2) save various summaries of the randomly selected data

> I can think of various external ways of doing this - my current plan is
> to use a shell script that repeatedly resends the same set of
> instructions using 'psql -f instruction_set.sql' - but I was wondering
> if there was a canonical way of doing this within pgsql.

If you want a sample of, say, 1% of the rows in a table, you can do

select * from mytable where random() < 0.01;

and get a genuinely unbiased sample. Keep in mind though that you can't
get an exact sample size this way --- it'll be close to 1% but probably
not spot on.
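
To see how far "not spot on" typically is, here is a minimal sketch in
Python (not SQL) that imitates the row-by-row filter above: each row is kept
independently with probability 0.01. The table size of 10,000 rows, the 200
repetitions, and the helper name bernoulli_sample are assumptions chosen for
illustration; nothing here touches a real database.

```python
import random
import statistics


def bernoulli_sample(rows, fraction, rng):
    """Keep each row independently with probability `fraction`,
    imitating what WHERE random() < fraction does row by row."""
    return [row for row in rows if rng.random() < fraction]


rng = random.Random(42)   # fixed seed so the run is repeatable
rows = range(10_000)      # hypothetical stand-in for mytable

# Draw 200 independent samples and record how many rows each one kept.
sizes = [len(bernoulli_sample(rows, 0.01, rng)) for _ in range(200)]

print("mean sample size:", statistics.mean(sizes))
print("smallest / largest:", min(sizes), max(sizes))
```

The sample size follows a binomial distribution, so for a table of n rows and
a fraction p it averages n*p with a standard deviation of sqrt(n*p*(1-p)),
roughly 10 rows around an average of 100 in this setup. If an exact sample
size matters, a different approach is needed.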

regards, tom lane

In response to

multiple sampling from tables and saving output at 2005-02-07 11:57:26 from David Orme
