You are viewing an older version of this section. View current production version.
Data Types
MemSQL supports the following data types. The sizes given are for in-memory rowstore tables. Columnstore (on-disk) tables are stored differently, and often compressed.
Integer Numbers
Data Type | Size | Size (Not Null) | Synonyms | Min Value | Max Value |
---|---|---|---|---|---|
BOOL* (see note below) | 2 bytes | 1 byte | BOOLEAN | -128 | 127 |
BIT | 9 bytes | 8 bytes | |||
TINYINT | 2 bytes | 1 byte | -128 | 127 | |
SMALLINT | 4 bytes | 2 bytes | -32768 | 32767 | |
MEDIUMINT | 4 bytes | 3 bytes | -8388608 | 8388607 | |
INT | 8 bytes | 4 bytes | INTEGER | -2147483648 | 2147483647 |
BIGINT | 12 bytes | 8 bytes | -2 ** 63 | (2 ** 63) - 1 |
Note:
- BOOL and BOOLEAN are synonymous with TINYINT. A value of 0 is considered FALSE, non-zero values are considered TRUE.
Each integer type can be “unsigned”, doubling the positive range of values and disallowing values under 0.
Real Numbers
Data Type | Size | Size (Not Null) | Synonyms | Precision |
---|---|---|---|---|
DOUBLE | 8 bytes | 8 bytes | REAL | 24 to 53 digits |
FLOAT | 4 bytes | 4 bytes | 23 digits | |
DECIMAL | see note | see note | DEC, FIXED, NUMERIC | 65 digits |
The space consumed by the DECIMAL
datatype varies with the precision and scale arguments.
The syntax for defining a DECIMAL
(or its synonyms DEC
, FIXED
, and NUMERIC
)
is in the form DECIMAL(P, S)
where P is the total number of digits (precision) and
S (scale) is how many of those digits appear after the decimal point. For example,
DECIMAL(10,5)
will store numbers with 5 digits before the decimal and 5 after.
The formula is ceil(P / 9) * 4
. In our example, it would be ceil(10 / 9) * 4
, or 8 bytes.
Time and Date
Data Type | Size | Size (Not Null) | Resolution | Min Value | Max Value |
---|---|---|---|---|---|
DATE | 8 bytes | 4 bytes | day | 1000-01-01 | 9999-12-31 |
TIME | 8 bytes | 4 bytes | second | -838:59:59 | 838:59:59 |
TIMESTAMP | 4 bytes | 4 bytes | second | 1970-01-01 00:00:01 | 2038-01-19 03:14:07 |
TIMESTAMP(6) | 8 bytes | 8 bytes | microsecond | 1970-01-01 00:00:01.000000 | 2038-01-19 03:14:07.999999 |
DATETIME | 12 bytes | 8 bytes | second | 1000-01-01 00:00:00 | 9999-12-31 23:59:59 |
DATETIME(6) | 12 bytes | 8 bytes | microsecond | 1000-01-01 00:00:00.000000 | 9999-12-31 23:59:59.999999 |
YEAR | 2 bytes | 1 byte | year | 1901 | 2155 |
String Types
Fixed-Length String
There are three fixed-size string types in MemSQL. The space consumed is directly proportional to their length definitions. Values of type CHAR
require 3 x LENGTH
bytes of memory, as MemSQL supports only the UTF8 character set. Values of type BINARY
require LENGTH
bytes of memory.
Data Type | Size | Size (Not Null) |
---|---|---|
CHAR(LENGTH) | 3*LENGTH + 4 bytes | 3*LENGTH bytes |
BINARY(LENGTH) | LENGTH + 4 bytes | LENGTH bytes |
Variable-Length Strings
There are several variable-length string types. The data for these variable types are stored in a memory allocation designed for them, and pointers to their locations in memory are stored in the main table structure. This means that the space consumed by a particular value is the byte length of the string, plus an 8-byte overhead for the pointer, and a 4-byte overhead for storing the length. (There is an additional 4-byte overhead if the field is nullable.) One exception is values less than 7 bytes long. Those small values are stored directly in the pointer area.
The three main types are VARCHAR
, VARBINARY
, and LONGTEXT
. While they all operate similarly for most English-language alphanumeric strings, there are several important differences in their behavior. The length limit of the VARCHAR
is measured in Unicode characters, not in bytes. VARBINARY
fields store data verbatim, regardless of whether they are valid Unicode strings. Sorting and comparing a VARBINARY
uses the exact binary representation.
Data Type | Max Length |
---|---|
VARCHAR | 21,845 characters |
VARBINARY | system RAM |
LONGTEXT | 4 GB |
There are several variants of the BLOB and TEXT types, listed below. Aside from their maximum lengths, there is no practical difference between these types in MemSQL, including performance or memory efficiency. They are implemented solely for backwards compatibility with MySQL syntax.
Name | Name | Description |
---|---|---|
LONGBLOB | 4 GB | VARBINARY |
MEDIUMBLOB | 16 MB | VARBINARY |
BLOB | 65,535 bytes | VARBINARY |
TINYBLOB | 255 bytes | VARBINARY |
MEDIUMTEXT | 16 MB | LONGTEXT |
TEXT | 65,535 bytes | LONGTEXT |
TINYTEXT | 255 bytes | LONGTEXT |
JSON Type
The JSON type can contain any valid Javascript Object Notation data, including maps, arrays, and nested types. For more information see the JSON Guide.
Data Type | Size | Size (Not Null) |
---|---|---|
JSON | 20 bytes + data size | 16 bytes + data size |
Geospatial Types
Geospatial types contain information about points, paths, and polygons on the surface of the Earth. For more information, see the Geospatial Guide.
Data Type | Size | Size (Not Null) |
---|---|---|
GEOGRAPHYPOINT | 8 bytes | 8 bytes |
GEOGRAPHY | See note |
The space consumed by the path and polygon data in Geography
columns is dependent on the complexity of the objects. Very roughly, a Geography
value consumes 24 bytes per vertex, plus a 64 byte overhead, plus some minor overhead for each “ring” in a multipolygon.
Indexing a Geography
object may also consume more space than indexes over other datatypes. The rule of thumb is 40 bytes * the value of the RESOLUTION parameter of the index, per row.
Other Types
Data Type | Size | Size (Not Null) | Max cardinality | Max values per row |
---|---|---|---|---|
ENUM | 4 bytes | 2 bytes | 65,535 unique values | 1 |
SET | 12 bytes | 8 bytes | 64 unique values | 64 |