From: | "Josh Harrison" <joshques(at)gmail(dot)com> |
---|---|
To: | "Ivan Sergio Borgonovo" <mail(at)webthatworks(dot)it> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: quick question abt pg_dump and restore |
Date: | 2008-01-09 16:46:14 |
Message-ID: | 8d89ea1d0801090846n7923b520y74200801b0f09351@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Jan 9, 2008 11:28 AM, Ivan Sergio Borgonovo <mail(at)webthatworks(dot)it> wrote:
> On Wed, 09 Jan 2008 10:54:21 -0500
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> > Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> > > Josh Harrison escribió:
> > >> Fine. I can use order by when I want to order it in terms of
> > >> some columns. But What if I want to maintain the same order as
> > >> in the database1? ie., I want my rows of TableABC in Database2
> > >> to be the same order as the rows in TableABC in Database 1 ???
> >
> > > You can't.
> >
> > According to the SQL standard, a table is an *unordered* collection
> > of rows, and the results of any query are produced in an
> > unspecified order (unless you use ORDER BY). The ambiguity about
> > row ordering is intentional and is exploited by most DBMSes
> > including Postgres to improve implementation efficiency. If you
> > assume there is such a thing as a specific ordering within a table,
> > you'll live to regret it eventually.
>
> Does it make any sense *knowing* how the implementation works to load
> records in a table in a specific order to improve performances?
>
> And yeah I know that once you start deleting/updating row you may
> lose the advantage you gained betting on some peculiarity of the
> implementation... but in case you're dealing with a mostly static
> table?
>
> eg. if I'm importing a table does it make any sense to pre-sort it
> before importing it in postgres?
>
Okay. Let me explain this again
Lets say you load the data related to a particular person sequentially into
a table in the test database. This results in all of the data for any one
person being located one or a very few sequential data blocks. Testing
access to the person's data then reveals access to be very fast. However,
the data is never loaded in that way in the production database. It is
almost always spread out across many data blocks within the database,
roughly organized by the date-time in which the data arrived.In this case
access to a particular person's data is not as fast as compared to the
previous one where the data are located close to each other.
we have this problem when we compare Oracle's performance with postgres
since Oracle has index-only scan where it can access the data just from the
index when the query involves only indexed columns. But since postgres
currently doesn't implement index-only scan and goes to the heap for
fetching the tuples it becomes very slow when the data are shuffled
Let me know if it makes sense now
Thanks
josh
From | Date | Subject | |
---|---|---|---|
Next Message | Andrew Sullivan | 2008-01-09 16:46:15 | Re: count(*) and bad design was: Experiences with extensibility |
Previous Message | Andrew Sullivan | 2008-01-09 16:39:43 | Re: quick question abt pg_dump and restore |