Re: Gsoc2012 Idea --- Social Network database schema

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Qi Huang <huangqiyx(at)hotmail(dot)com>, "neil(dot)conway" <neil(dot)conway(at)gmail(dot)com>, daniel <daniel(at)heroku(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: Gsoc2012 Idea --- Social Network database schema
Date:	2012-03-21 15:34:58
Message-ID:	1481.1332344098@sss.pgh.pa.us
Lists:	pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Well, the standard syntax apparently aims to reduce the number of
> returned rows, which ORDER BY does not. Maybe you could do it with
> ORDER BY .. LIMIT, but the idea here I think is that we'd like to
> sample the table without reading all of it first, so that seems to
> miss the point.

I think actually the traditional locution is more like
	WHERE random() < constant
where the constant is the fraction of the table you want. And yeah,
the presumption is that you'd like it to not actually read every row.
(Though unless the sampling density is quite a bit less than 1 row
per page, it's not clear how much you're really going to win.)

regards, tom lane
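
The two points in the message above can be illustrated with a rough sketch. It is not from the original post, and it is not how PostgreSQL executes such a query; the function names and the figure of 100 rows per page are assumptions chosen only for illustration. The first function mimics `WHERE random() < constant` by keeping each row independently with the given probability. The second estimates what fraction of pages still holds at least one sampled row, assuming rows are sampled independently and every page holds the same number of rows.

```python
import random


def bernoulli_sample(rows, fraction, rng):
    """Keep each row independently with probability `fraction`.

    This mirrors the shape of WHERE random() < fraction: every row is
    still visited, and the result size varies from run to run.
    """
    return [row for row in rows if rng.random() < fraction]


def expected_fraction_of_pages_touched(fraction, rows_per_page):
    """Probability that a page contains at least one sampled row.

    Assumes independent sampling and a fixed number of rows per page
    (both are simplifying assumptions, not properties of any real table).
    """
    return 1.0 - (1.0 - fraction) ** rows_per_page


if __name__ == "__main__":
    rng = random.Random(42)
    rows = list(range(100_000))
    sample = bernoulli_sample(rows, 0.01, rng)
    print("sampled rows:", len(sample))

    # Hypothetical page density of 100 rows per page.
    for fraction in (0.1, 0.01, 0.001):
        touched = expected_fraction_of_pages_touched(fraction, 100)
        print(f"fraction={fraction}: pages touched ~ {touched:.3f}")
```

Under these assumptions, a 1% sample at 100 rows per page (an average of one sampled row per page) still lands on roughly 63% of the pages. The page count only drops substantially once the sampling density is well below one row per page, which matches the caveat in the parenthetical remark.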

In response to

Re: Gsoc2012 Idea --- Social Network database schema at 2012-03-21 15:26:17 from Robert Haas

Responses

Re: Gsoc2012 Idea --- Social Network database schema at 2012-03-21 15:49:55 from Robert Haas
Re: Gsoc2012 Idea --- Social Network database schema at 2012-03-22 16:38:17 from Kevin Grittner
