You are viewing an older version of this section. View current production version.
PERCENTILE_CONT and Median
In MemSQL, percentile functions are only available as window functions. There are two percentile functions: PERCENTILE_CONT
and PERCENTILE_DISC.
PERCENTILE_CONT
is an inverse distribution function. It assumes a continuous distribution between values of the expression in the sort specification. Then, it interpolates the value of that expression at the given percentile, performing a linear interpolation.
For an overview of window functions in MemSQL, see Window Functions Guide.
Syntax
PERCENTILE_CONT(percentile)
WITHIN GROUP (ORDER BY (col | expr))
OVER (
[PARTITION BY (col | expr), ...]
[frame_clause]
)
or equivalently
PERCENTILE_CONT(percentile)
OVER (
ORDER BY (col | expr)
[PARTITION BY {col | expr}, ...]
[frame_clause]
)
Arguments
percentile
A numeric value between 0 and 1.
col
A column of a numeric data type.
expr
An expression that evaluates to a numeric data type.
Remarks
The OVER()
clause defines how to partition the input set. If no PARTITION BY
clause is specified, the entire input set is treated as a single partition.
Only one column or expression is allowed in the ORDER BY
clause.
Example
The following example creates a table containing SAT scores. The resulting window frame shows the hypothetical 90th percentile SAT score given the existing scores.
memsql> CREATE TABLE SAT (math INT, reading INT, writing INT);
memsql> INSERT INTO SAT VALUES (800,800,800),(650,300,500),(700,500,520);
memsql> SELECT math, reading, writing,
-> math + reading + writing AS total,
-> PERCENTILE_CONT(.9)
-> WITHIN GROUP (ORDER BY math + reading + writing)
-> OVER () AS `90th Percentile` FROM SAT;
+------+---------+---------+-------+-----------------+
| math | reading | writing | total | 90th Percentile |
+------+---------+---------+-------+-----------------+
| 650 | 300 | 500 | 1450 | 2264.0000 |
| 700 | 500 | 520 | 1720 | 2264.0000 |
| 800 | 800 | 800 | 2400 | 2264.0000 |
+------+---------+---------+-------+-----------------+
Median
Compute the median using percentile = 0.5.
PERCENTILE_CONT( 0.5 ) ...
Example:
memsql> create table t (a int);
memsql> insert into t values (1),(2),(3),(166),(1024);
memsql> select a, avg(a) over () as mean, percentile_cont(.5) within group (order by a) over () as median from t;
+------+----------+--------+
| a | mean | median |
+------+----------+--------+
| 1 | 239.2000 | 3.0000 |
| 2 | 239.2000 | 3.0000 |
| 3 | 239.2000 | 3.0000 |
| 166 | 239.2000 | 3.0000 |
| 1024 | 239.2000 | 3.0000 |
+------+----------+--------+