Re: about multiprocessingmassdata

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: about multiprocessingmassdata
Date: 2012-04-10 13:21:56
Message-ID: CAHyXU0zKZGAcU3cpdAYGFi846OoZv6my7C7rhxOQaaSMD+8eyw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Mon, Apr 9, 2012 at 6:50 PM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
> On 10.4.2012 00:37, Merlin Moncure wrote:
>> On Thu, Apr 5, 2012 at 9:47 AM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
>>> If you have slower drives, the dependency is about linear (half the
>>> speed -> twice the time). So either your drives are very slow, or
>>> there's something rotten.
>>>
>>> I still haven's seen iostat / vmstat output ... that'd tell us much more
>>> about the causes.
>>
>> geometry column can potentially quite wide.  one thing we need to see
>> is the table has any indexes -- in particular gist/gin on the
>> geometry.
>
> Yeah, but in one of the previous posts the OP posted this:
>
> relname      | relpages  |  reltuples
> -------------+----------+-------------
> poi_all_new |  2421133 | 6.53328e+06
>
> which means the table has ~ 19GB for 6.5 million rows, so it's like
> 2.8GB per 1 million of rows, i.e. ~3kB per row. I've been working with 1
> million rows and 1.3GB of data, so it's like 50% of the expected amount.
>
> But this does not explain why the SELECT COUNT(*) takes 364 seconds on
> that machine. That'd mean ~8MB/s.
>
> Regarding the indexes, the the OP already posted a description of the
> table and apparently there are these indexes:
>
> Indexes:
>    "poi_all_new_pk" PRIMARY KEY, btree (ogc_fid)
>    "poi_all_new_flname_idx" btree (flname)
>    "poi_all_new_geom_idx" btree (wkb_geometry)
>    "poi_all_new_ogc_fid_idx" btree (ogc_fid)
>    "poi_all_new_pinyin_idx" btree (pinyin)
>
> So none of them is GIN/GIST although some one of them is on the geometry
> column.

hm. well, there's a duplicate index in there: ogc_fid is indexed
twice. how much bloat is on the table (let's see an ANALYZE VERBOSE)?
what's the storage for this database?

merlin

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Eyal Wilde 2012-04-11 22:11:45 scale up (postgresql vs mssql)
Previous Message Kim Hansen 2012-04-10 09:55:46 Re: Planner selects slow "Bitmap Heap Scan" when "Index Scan" is faster