SQL query funnel analysis

From: Maks Materkov <materkov(at)gmail(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: SQL query funnel analysis
Date: 2016-05-21 16:52:06
Message-ID: CABVsJ4FnAorqnHLP4Pr_TwdFZGcKGDxwhaY16c+jmsi72KvE=w@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Hi All,

I am developing SQL query for funnel analysis. We have one big table
'events' with following structure (simplified):

id | user_id | type_id | created
-- | ------- | ------- | -------
1 | 7160 | 6148 | 2016-05-20 12:11:21
...

Currently I have this SQL query (for three steps):

SELECT
SUM(CASE WHEN sub.PERFORMED_STEP1 THEN 1 ELSE 0 END) AS STEP1_COUNT,
SUM(CASE WHEN sub.PERFORMED_STEP2 THEN 1 ELSE 0 END) AS STEP2_COUNT,
SUM(CASE WHEN sub.PERFORMED_STEP3 THEN 1 ELSE 0 END) AS STEP3_COUNT
FROM
(
SELECT
Q1.user_id,
Q1.created IS NOT NULL AS PERFORMED_STEP1,
Q2.created IS NOT NULL AS PERFORMED_STEP2,
Q3.created IS NOT NULL AS PERFORMED_STEP3
FROM
(SELECT user_id,MIN(created) as created FROM events WHERE type_id=6451
and created between '2016-04-16' and '2016-05-15' GROUP BY user_id) AS Q1

LEFT JOIN

(SELECT user_id,MIN(created) as created FROM events WHERE type_id=6782
and created between '2016-04-16' and '2016-05-15' GROUP BY user_id) AS Q2
ON Q1.user_id=Q2.user_id AND Q1.created<Q2.created

LEFT JOIN

(SELECT user_id,MIN(created) as created FROM events WHERE type_id=356 and
created between '2016-04-16' and '2016-05-15' GROUP BY user_id) AS Q3
ON Q2.user_id=Q3.user_id AND Q2.created<Q3.created

) sub

It works almost good, but unfortunately, there is an error.
We need to build a funnel for events 1,2,3. If events happened in following
order: 2,1,2,3 then this query is not working, because on step 2 we use
"MIN(created)" and first event is counted (instead of third event).

Is there any suggestions how to solve this problem?
I think that we need to some kind of filtering in inner query 2 and 3 to
filter all events, that happened before event in first level. But I have no
idea how to do this (we can't write "created > q1.created" in WHERE clause)

--
Maks

Browse pgsql-sql by date

  From Date Subject
Next Message Michael Moore 2016-05-24 22:19:12 INOUT text[],OUT text parameter handling problem
Previous Message gmb 2016-05-17 16:23:29 Re: Handle untype-cast string as VARCHAR instead of TEXT