Re: TB-sized databases

From: Michael Stone <mstone+postgres(at)mathom(dot)us>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: TB-sized databases
Date: 2007-12-06 15:42:14
Message-ID: 20071206154212.GP5294@mathom.us
Lists: pgsql-performance

On Thu, Dec 06, 2007 at 09:38:16AM +0000, Simon Riggs wrote:
>The issue is that if somebody issues a "large query" then it will be a
>problem whichever plan the query takes. Forcing index scans can make a
>plan more expensive than a seq scan in many cases.

OTOH, the planner can really screw up queries on really large databases.
IIRC, the planner can use things like unique constraints to get some
idea of, e.g., how many rows will result from a join. Unfortunately,
the planner can't apply those techniques to certain constructs common in
really large db's (e.g., partitioned tables--how do you do a unique
constraint on a partitioned table?). I've got some queries that the
planner thinks will return on the order of 10^30 rows for that sort of
reason. In practice, the query may return 10^3 rows, and the difference
between the seq scan and the index scan is the difference between a
query that takes a few seconds and a query that I will never run to
completion. I know the goal would be to make the planner understand
those queries better, but for now the answer seems to be to craft the
queries very carefully and run EXPLAIN first, making sure you see index
scans in the right places.
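
To make that concrete, here's a rough sketch of the kind of check I mean
(table and column names are made up for illustration, not my actual
schema). The idea is just to run EXPLAIN on the query against the
inheritance-partitioned table and eyeball the plan before letting it
loose:

    EXPLAIN
    SELECT f.*
      FROM fact_parent f                -- parent of the partition children
      JOIN dim d ON d.id = f.dim_id     -- no unique constraint spans the partitions
     WHERE f.logdate >= DATE '2007-11-01'
       AND f.logdate <  DATE '2007-12-01';

    -- What you want to see is something like
    --   Index Scan using fact_2007_11_logdate_idx on fact_2007_11 ...
    -- on each child, rather than "Seq Scan on fact_2007_11". The row
    -- estimate on the join may still be absurdly high, because the
    -- planner has no cross-partition unique constraint to work from.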

Mike Stone
