Re: Indexing large table of coordinates with GiST

From: Daniel Begin <jfd553(at)hotmail(dot)com>
To: "'Paul Ramsey'" <pramsey(at)cleverelephant(dot)ca>, "'Andy Colson'" <andy(at)squeakycode(dot)net>, 'Rémi Cura' <remi(dot)cura(at)gmail(dot)com>, "'Vick Khera'" <vivek(at)khera(dot)org>, "'Rob Sargent'" <robjsargent(at)gmail(dot)com>
Cc: "'PostgreSQL General'" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Indexing large table of coordinates with GiST
Date: 2015-01-15 20:36:28
Message-ID: COL129-DS1C1512BA86084C80BD942944E0@phx.gbl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Thank, there is a lot of potential ways to resolve this problem!

For Rob, here is a bit of context concerning my IT environment…

Windows 7 64b Desktop, running with an Intel i7 core and 16GB ram. The PostgreSQL 9.3 database is stored on a 3TB external drive (USB-3 connection with write cache enabled and backup battery) and a temp_tablespaces is pointing to a 1TB internal drive.

Now, let me answered/questioned given proposals in the order I received them…

1- Andy, I will set maintenance_work_mem as large as I can unless someone points to an important caveat.

2- Vick, partitioning the table could have been very interesting. However, I will have to query the table using both the node ID (which could have provided a nice partition criterion) and/or the node location (find nodes within a polygon). I am not familiar with table partition but I suspect I can’t create a spatial index on a table that have been partitioned (split into multiple tables that inherit from the “master" table). Am I right?

3- Rémi, so many rows does not necessarily mean either raster or points cloud (but it’s worth asking!-). As I mentioned previously, I must be able to query the table not only using nodes location (coordinates) but also using the few other fields the table contains (but mainly node IDs). So, I don’t think it could work, unless you tell me otherwise?

4- Paul, the nodes distribution is all over the world but mainly over inhabited areas. However, if I had to define a limit of some sort, I would use the dateline. Concerning spatial queries, I will want to find nodes that are within the boundary of irregular polygons (stored in another table). Is querying on irregular polygons is compatible with geohashing?

Regards,

Daniel

__________________________________________________________________

On Thu, Jan 15, 2015 at 7:44 AM, Daniel Begin <jfd553(at)hotmail(dot)com> wrote:

Hi, I'm trying to create an index on coordinates (geography type) over a
large table (4.5 billion records) using GiST...

CREATE INDEX nodes_geom_idx ON nodes USING gist (geom);

The command ran for 5 days until my computer stops because a power outage!
Before restarting the index creation, I am asking the community if there are
ways to shorten the time it took the first time :-)

Any idea?

Daniel

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Maciek Sakrejda 2015-01-15 21:02:42 sslcompression / PGSSLCOMPRESSION not behaving as documented?
Previous Message Paul Ramsey 2015-01-15 17:27:20 Re: Indexing large table of coordinates with GiST