Re: Searchable chess positions in a Postgress DB

From: Bèrto ëd Sèra <berto(dot)d(dot)sera(at)gmail(dot)com>
To: Sidney Cadot <sidney(at)jigsaw(dot)nl>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Searchable chess positions in a Postgress DB
Date: 2012-04-11 09:43:27
Message-ID: CAKwGa_8EBQ_auMKoGuuJ=ypXhV+fvRB0C7GZVw2DNgwX-pq7pA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

>
> Hi!
>

> As a hobby project, I am toying around with a database containing
> about 5 million chess games. On average, these games have about 80
> positions (~ 40 moves by both black and white), which means there are
> about 400 million chess positions in there.
>

What happens if you think of it in terms of "space of phases", so that we
define a single "position" as a record, and build a game as a path from the
common starting position to the end game position? This would, among other
things, tell you what games passed through exactly the same position. You'd
need some sort of soundex algorithm to find out similar positions, through
a ranking function. Once you have that you could analyse different players'
strategic thought.

It would probably also pretty much compress the volume of your data. It
would grow a lot in the beginning, but once most common positions are
mapped the growth should slow down a lot.

Besides, I'd use a point type to build the chessboard, rather than
a traditional chess Letter/Number notation.

Bèrto

On 11 April 2012 08:15, Sidney Cadot <sidney(at)jigsaw(dot)nl> wrote:

> Dear all,
>
> As a hobby project, I am toying around with a database containing
> about 5 million chess games. On average, these games have about 80
> positions (~ 40 moves by both black and white), which means there are
> about 400 million chess positions in there.
>
> I have written code to extract these positions, and now I want to put
> them into a Postgres database. Specifically, I want to do this in a
> way that allows *fast* lookups of positions, e.g. "give me all
> positions that have a White King on c4 and either a Black Bishop or
> White Knight on f7".
>
> Currently, my "Positions" table looks like this:
>
> Column | Type | Modifiers
> -------------------+---------+-----------
> gameindex | integer | not null
> plyindex | integer | not null
> pseudofenboard | text | not null
> fenside | text | not null
> fencastling | text | not null
> fenenpassant | text | not null
> possiblemovecount | integer | not null
> isincheck | boolean | not null
> Indexes:
> "positions_pkey" PRIMARY KEY, btree (gameindex, plyindex)
> Foreign-key constraints:
> "positions_gameindex_fkey" FOREIGN KEY (gameindex) REFERENCES
> games(gameindex)
>
> The "PseudoFenBoard" field currently holds a string describing the
> position. For example, the starting position of chess looks like this:
>
> "rnbqkbnr/pppppppp/________/________/________/________/PPPPPPPP/RNBQKBNR"
>
> This design allows me to formulate the kind of positional queries that
> I want (by using regular expression matching), but executing them will
> involve a slow, linear traversal of the 400M table rows, which is not
> desirable.
>
> I am toying around with the ugly idea to make a "Positions" table that
> has a single field for each of the squares, e.g.
>
> CREATE TABLE Position2 (
> GameIndex INTEGER NOT NULL,
> PlyIndex INTEGER NOT NULL,
> a1 "char" NOT NULL,
> a2 "char" NOT NULL,
> -- (60 fields defs omitted)
> h7 "char" NOT NULL,
> h8 "char" NOT NULL
> );
>
> This would allow the creation of indices on each of the 64 fields
> separately, which should help to achieve near-instantaneous position
> query performance, especially after gathering proper statistics for
> all the field-specific indices.
>
> I realize that this design is quite ugly, so I would be interested to
> hear if there are nicer alternatives that can perform equally well.
>
> Also, above I use the 1-byte "char" type. Is this the only type in
> PostGres that is guaranteed to be just a single byte, or are there
> better alternatives? A 13-state enum would be best (listing the 6
> white pieces, 6 black pieces, and 'empty' states for every square on
> the board) but as I understand from the documentation, enums always up
> take 4 bytes per entry.
>
> Any ideas for improvement would be greatly appreciated.
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

--
==============================
If Pac-Man had affected us as kids, we'd all be running around in a
darkened room munching pills and listening to repetitive music.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Ondrej Ivanič 2012-04-11 10:01:55 Re: Searchable chess positions in a Postgress DB
Previous Message Gianni Ciolli 2012-04-11 09:32:38 Re: Searchable chess positions in a Postgress DB