Now that the data is all loaded into your instance, you can start running queries on the database. A number of queries have been selected for you from the TPC-H specifications in this guide. For the complete workload of 22 queries that can be run against the TPC-H database, click here.
First, open up your SQL Editor to begin querying the data.
Query 1: Pricing Summary Report
This query reports the amount of business that was billed, shipped, and returned within 60 - 120 days of the greatest ship data contained in the database.
use tpch;
select
l_returnflag,
l_linestatus,
sum(l_quantity) as sum_qty,
sum(l_extendedprice) as sum_base_price,
sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
avg(l_quantity) as avg_qty,
avg(l_extendedprice) as avg_price,
avg(l_discount) as avg_disc,
count(*) as count_order
from
lineitem
where
l_shipdate <= date('1998-12-01' - interval '90' day)
group by
l_returnflag,
l_linestatus
order by
l_returnflag,
l_linestatus;
Run the query again and observe the execution time difference. You will see a dramatic improvement with query execution time. This is due to MemSQL’s Code Generation technology.
Query 2: Shipping Priority
This query retrieves the 10 unshipped orders with the highest value.
use tpch;
select
l_orderkey,
sum(l_extendedprice * (1 - l_discount)) as revenue,
o_orderdate,
o_shippriority
from
customer,
orders,
lineitem
where
c_mktsegment = 'BUILDING'
and c_custkey = o_custkey
and l_orderkey = o_orderkey
and o_orderdate < date('1995-03-15')
and l_shipdate > date('1995-03-15')
group by
l_orderkey,
o_orderdate,
o_shippriority
order by
revenue desc,
o_orderdate
limit 10;
Query 3: Returned Item Reporting
This query identifies top 20 customers who might be having problems with the parts that are shipped to them.
use tpch;
select
c_custkey,
c_name,
sum(l_extendedprice * (1 - l_discount)) as revenue,
c_acctbal,
n_name,
c_address,
c_phone,
c_comment
from
customer,
orders,
lineitem,
nation
where
c_custkey = o_custkey
and l_orderkey = o_orderkey
and o_orderdate >= date('1993-10-01')
and o_orderdate < date('1993-10-01') + interval '3' month
and l_returnflag = 'R'
and c_nationkey = n_nationkey
group by
c_custkey,
c_name,
c_acctbal,
c_phone,
n_name,
c_address,
c_comment
order by
revenue desc
limit 20;
Query 4: Product Type Profit Measure
This query determines how much profit is made on a given line of parts, broken out by supplier nation and year.
use tpch;
select
nation,
o_year,
sum(amount) as sum_profit
from
(
select
n_name as nation,
extract(year from o_orderdate) as o_year,
l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount
from
part,
supplier,
lineitem,
partsupp,
orders,
nation
where
s_suppkey = l_suppkey
and ps_suppkey = l_suppkey
and ps_partkey = l_partkey
and p_partkey = l_partkey
and o_orderkey = l_orderkey
and s_nationkey = n_nationkey
and p_name like '%green%'
) as profit
group by
nation,
o_year
order by
nation,
o_year desc;
Want more queries? Click here to get a full set of the 22 queries for a complete TPC-H workload.
Analyze Performance of Queries
MemSQL Studio has a convenient feature that allows you to record and visually analyze workload profiles at either the query or node level.
-
In SQL Editor, open up the Cluster icon and click Profile.
-
The Visual Explain window will open up. You can now explore the query and navigate around different operations.
Want to run a cluster with more than four license units? Create an Enterprise License trial key.