Outdated Version

You are viewing an older version of this section. View current production version.

/Data Ingest/

How To Load Data Into MemSQL min read

This guide covers various ways of loading data into the system. We recommend trying option 1, and if that is not sufficient, trying option 2 then option 3.

Option 1: Loading from a file
Option 2: Loading from S3 or HDFS using MemSQL Ops
Option 3: Loading from MySQL

Option 1: Loading Data stored in a file

MemSQL supports standard SQL loading constructs. For instance, after creating tbl_name, run:

LOAD DATA INFILE 'file_name.tsv' INTO TABLE tbl_name
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'

For more details see LOAD DATA.

Option 2: Importing data from S3 or HDFS

Warning

MemSQL Streamliner was deprecated in MemSQL 6.0. For current Streamliner users, we recommend migrating to MemSQL Pipelines instead. MemSQL Pipelines provides increased stability, improved ingest performance, and exactly-once semantics. For more information about Streamliner deprecation, see the 5.8 Release Notes. For more information about Pipelines, see the MemSQL Pipelines documentation.

If your data lives in S3 or HDFS, you can use the Data Import functionality in MemSQL Ops to download and ingest files.

Option 3: Loading from MySQL

The popular open source tool mysqldump is a straightforward way to load data from MySQL into MemSQL For example, on a machine with a MySQL database foo, issue the following commands.

mysqldump -h 127.0.0.1 -u root --databases foo > foodump.sql

You can also dump specific tables by using the --tables option:

mysqldump -h 127.0.0.1 -u root --databases foo --tables tbl_name > foodump.sql

Since MemSQL is a distributed database,, we need to tell MemSQL how to shard the data across the cluster. If all your tables have primary keys, MemSQL will automatically use the primary keys to shard them, so you can skip this next step. Otherwise, you need to open up foodump.sql in your favorite editor and either add shard keys or declare the tables as reference tables (for more details, see How to Port Your Applications to MemSQL.

For instance, suppose foodump.sql contains

CREATE TABLE `users` (
  `id` bigint(20) NOT NULL,
  `username` varchar(11) DEFAULT NULL,
  `address` varchar(10)
);

To make this a valid MemSQL table, consider changing it to

CREATE TABLE `users` (
  `id` bigint(20) NOT NULL,
  `username` int(11) DEFAULT NULL,
  `address` varchar(10),
   SHARD KEY(`id`)
);

For more information on choosing a shard key, see Primary Key as the Shard Key. Once all your tables have either shard key or primary key declarations, you can run the following on the master aggregator:

mysql -h 127.0.0.1 -u root < foodump.sql

All tables and data from users will now be loaded into MemSQL.

Was this Article Helpful?