From: | Steve Crawford <scrawford(at)pinpointresearch(dot)com> |
---|---|
To: | Robert DiFalco <robert(dot)difalco(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Modeling Friendship Relationships |
Date: | 2014-11-11 23:39:48 |
Message-ID: | 54629E44.1030203@pinpointresearch.com |
Lists: | pgsql-general |
On 11/11/2014 02:38 PM, Robert DiFalco wrote:
> I have a question about modeling a mutual relationship. It seems basic
> but I can't decide, maybe it is 6 of one a half dozen of the other.
>
> In my system any user might be friends with another user, that means
> they have a reciprocal friend relationship.
>
> It seems I have two choices for modeling it.
>
> 1. I have a table with two columns userOne and userTwo. If John is
> friends with Jane there will be one row for both of them.
> 2. I have a table with two columns owner and friend. If John is
> friends with Jane there will be two rows, one that is {John, Jane} and
> another {Jane, John}.
>
> The first option has the advantage of saving table size. But queries
> are more complex because to get John's friends I have to JOIN friends
> f ON f.userA = "John" OR f.userB = "John" (not the real query, these
> would be id's but you get the idea).
>
> In the second option the table rows would be 2x but the queries would
> be simpler -- JOIN friends f ON f.owner = "John".
>
> There could be >1M users. Each user would have <200 friends.
>
> Thoughts? Do I just choose one or is there a clear winner? TIA!
What you are describing is basically an adjacency list without any
hierarchy information, i.e. there is no "John reports to Dick reports
to Jane" type of tree.
One million users at up to 200 friends each would (order-of-magnitudeish)
be 200 million rows with two rows per friendship, or about half that with
one row per friendship, which tends to argue for saving space. The
single-row format also reduces the number of rows impacted by deletes and
avoids the risk of ending up with John,Jane without a corresponding
Jane,John.
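
As a rough sketch of the single-row format, assuming hypothetical tables
users and friends with bigint ids (none of these names come from your
schema), a CHECK constraint can force each pair into one canonical order so
the same friendship cannot be stored twice:

```sql
-- Hypothetical schema: one row per friendship, stored in canonical order.
CREATE TABLE users (
    id   bigint PRIMARY KEY,
    name text NOT NULL
);

CREATE TABLE friends (
    user_a bigint NOT NULL REFERENCES users (id),
    user_b bigint NOT NULL REFERENCES users (id),
    PRIMARY KEY (user_a, user_b),
    -- The smaller id always goes in user_a, so (1, 2) and (2, 1) cannot
    -- both exist, and nobody can be friends with themselves.
    CHECK (user_a < user_b)
);

INSERT INTO users VALUES (1, 'John'), (2, 'Jane');

-- least/greatest put the pair in canonical order whichever way it arrives.
INSERT INTO friends (user_a, user_b)
VALUES (least(2, 1), greatest(2, 1));
```

Whether the ordering rule lives in a constraint like this or in application
code is a design choice; the constraint simply makes the database reject a
reversed duplicate.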
Getting John's friends isn't too complicated, but I suspect the form of
the query you gave (an OR across both columns) won't lead to optimal query
plans. For a two-column format I would imagine that ...userB as friend
where userA='John' union userA as friend where userB='John'... would
yield a better plan, assuming an index on each column.
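
Spelled out, the union form might look like the sketch below. The table
name friends, the columns user_a/user_b, the index name and the id value 1
for John are all assumptions for illustration:

```sql
-- Assumed two-column table; ids stand in for the names used above.
CREATE TABLE IF NOT EXISTS friends (
    user_a bigint NOT NULL,
    user_b bigint NOT NULL,
    PRIMARY KEY (user_a, user_b)
);

-- The primary key index already leads with user_a; user_b needs its own
-- index so the second branch of the union can use an index as well.
CREATE INDEX friends_user_b_idx ON friends (user_b);

-- John's friends, assuming John has id 1 (hypothetical).
SELECT user_b AS friend FROM friends WHERE user_a = 1
UNION
SELECT user_a AS friend FROM friends WHERE user_b = 1;
```

If each pair is guaranteed to be stored only once, UNION ALL would skip the
duplicate-elimination step. Running EXPLAIN on both this and the OR version
against realistic data is the way to confirm which plan is actually better.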
I'm guessing that you will also want to study common-table-expressions
and recursive queries (description and examples are available in the
PostgreSQL docs) so you can start to write single queries to answer
things like "list everyone who is a friend of a friend of John."
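
A minimal sketch of such a query, again assuming a friends table with
user_a/user_b columns and John having id 1 (both hypothetical). It stops at
depth 2 and lists only people who are not already direct friends, which is
one possible reading of "friend of a friend":

```sql
CREATE TABLE IF NOT EXISTS friends (
    user_a bigint NOT NULL,
    user_b bigint NOT NULL,
    PRIMARY KEY (user_a, user_b)
);

WITH RECURSIVE edges AS (
    -- View every stored pair from both sides.
    SELECT user_a AS person, user_b AS friend FROM friends
    UNION ALL
    SELECT user_b, user_a FROM friends
),
reachable (person, depth) AS (
    -- Depth 1: John's direct friends (1 = John's id, hypothetical).
    SELECT friend, 1 FROM edges WHERE person = 1
    UNION
    -- Depth 2: friends of the people found so far.
    SELECT e.friend, r.depth + 1
    FROM reachable r
    JOIN edges e ON e.person = r.person
    WHERE r.depth < 2
)
SELECT person
FROM reachable
WHERE person <> 1           -- drop John himself
GROUP BY person
HAVING min(depth) = 2;      -- drop people who are already direct friends
```

Raising the depth limit extends the search to friends of friends of
friends and so on, though the number of rows visited grows quickly at 200
friends per user.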
Cheers,
Steve