Re: performance modality in 7.1 for large text attributes?

From: Paul A Vixie <vixie(at)mfnx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: performance modality in 7.1 for large text attributes?
Date: 2000-12-20 00:06:35
Message-ID: 200012200006.QAA34651@redpaul.mfnx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> 1) Have you ran vacuum analyze after all these inserts to update database
> statistics? :) Without vacuum, pgsql will opt to table scan even when
> there's an index.

i hadn't, but i did, and it didn't make that particular difference:

vixie=# explain select file from rss where addr = '127.0.0.2';
NOTICE: QUERY PLAN:

Seq Scan on rss (cost=0.00..0.00 rows=1 width=12)

EXPLAIN

that sounded bad, so i

vixie=# vacuum analyze rss;
VACUUM

but when i reran the explain, it still said it was doing it sequentially:

vixie=# explain select file from rss where addr = '127.0.0.2';
NOTICE: QUERY PLAN:

Seq Scan on rss (cost=0.00..1685.10 rows=1 width=12)

EXPLAIN

i'll try remaking the table with "addr" as a unique key and see if that helps.

> 2) I'm not sure if you are executing pgcat 70k times or executing inner
> loop in pgcat 70k times. Postgres connection establishment is expensive.

it was 70K invocations, but connection establishment ought to be the same
for both "pgcat get" and "pgcat put" so this doesn't explain the difference
in the graphs.

> 3) Postgres INSERT is not very efficient if you are doing a bulk load of
> data (it has to reparse the statement every time). If you want to delete
> everything and load new data, use "COPY", which is about 5 times faster.

well, that doesn't help in my application. i'm trying to find out whether
pgsql can be used as the generic backend for MAPS RSS, and the only time i
expect to be doing bulk loads is during benchmarking and during transition.
so, the speed of a "pgcat get" really matters if i want the web server to
go fast when it gets hit by a lot of simultaneous lookups. so, even though
there are faster ways to do bulk loading, the current benchmark is accurate
for the real application's workload, which isn't about bulk loading.

> Oh...And since I have your attention, could you please resolve
> long-standing discussion between me and Tom Lane? :)
>
> Question is whether proper (standard/most-commonly-used) format for
> printing CIDR network address is 10/8 or 10.0.0.0/8 (i.e. should all
> octets be printed even if they are 0). After search of RFCs, there's
> nothing that specifies the standard, but 10.0.0.0/8 is used more often in
> examples than 10/8 form.
>
> Postgres uses 10/8 form, and I'm saying that 10.0.0.0/8 is more accepted
> by everyone else. (I.E. all software can deal with that, but not all
> software accepts 10/8).

cisco IOS just won't take 10/8 and insists on 10.0.0.0/8. you will never,
ever go wrong if you try to use 10.0.0.0/8, since everything that understands
CIDR understands that. 10/8 is a pleasant-appearing alternative format, but
it is not universally accepted and i recommend against it. (i'm not sure if
my original CIDR type implementation for pgsql output the shorthand or not;
if it did, then i apologize to one and all.)

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mikheev, Vadim 2000-12-20 00:08:00 RE: Who is a maintainer of GiST code ?
Previous Message Mikheev, Vadim 2000-12-19 23:48:14 RE: heap page corruption not easy