ALTER PIPELINE
The ALTER PIPELINE statement changes an existing pipeline’s configuration.
Syntax
ALTER PIPELINE pipeline_name
[SET
[OFFSETS
[EARLIEST | LATEST | json_source_partition_offset]
]
[TRANSFORM ('uri', ['executable', 'arguments [...]'])]
[BATCH_INTERVAL milliseconds]
]
[FIELDS | COLUMNS]
[TERMINATED BY 'string'
[[OPTIONALLY] ENCLOSED BY 'char']
[ESCAPED BY 'char']
]
Each clause in an ALTER PIPELINE statement is described below.
ALTER PIPELINE SET
You can set a pipeline’s offsets, transform, or batch interval by using the SET clause.
ALTER PIPELINE pipeline_name
[SET
[OFFSETS
[EARLIEST | LATEST | json_source_partition_offset]
]
[TRANSFORM]
[BATCH_INTERVAL milliseconds]
]
ALTER PIPELINE SET OFFSETS
A pipeline’s current starting offset can be altered by using the SET OFFSETS clause. When a new offset is set, the pipeline begins extracting data from the specified offset, regardless of which offsets it has or has not already extracted. There are three offset options:
SET OFFSETS EARLIEST
: Configures the pipeline to start reading from the earliest (or oldest) available offset in the data source.
memsql> ALTER PIPELINE mypipeline SET OFFSETS EARLIEST;
Query OK, 0 rows affected (0.01 sec)
SET OFFSETS LATEST
: Configures the pipeline to start reading from the latest (or newest) available offset in the data source.
memsql> ALTER PIPELINE mypipeline SET OFFSETS LATEST;
Query OK, 0 rows affected (0.01 sec)
SET OFFSETS '{"<source-partition>": <partition-offset>}'
: Configures the pipeline to start reading from specific data source partitions and offsets. When you manually specify which source partition and offset to start extracting from, there are a few important things to consider:
- If the data source has more partitions than are specified in the JSON string, only data from the specified offsets will be extracted. No new offsets from the other partitions will be extracted.
- If the specified source partition doesn’t exist, no data will be extracted and no errors will appear. However, the partition will be present in a row of the information_schema.PIPELINES_OFFSETS table with its EARLIEST_OFFSET, LATEST_OFFSET, and LATEST_LOADED_OFFSET columns set to NULL.
memsql> ALTER PIPELINE mypipeline SET OFFSETS '{"0":100,"1":100}';
Query OK, 0 rows affected (0.01 sec)
In the example above, the data source has two partitions with IDs of 0 and 1, and the pipeline will start reading from offset 100 in both partitions.
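Because the offsets argument is a JSON string, you may prefer to generate it from a script instead of typing it by hand. The following is a minimal Python sketch, assuming the '{"<source-partition>": <partition-offset>}' form shown in the syntax above and numeric partition IDs. The helper name build_set_offsets is hypothetical and is not part of MemSQL; it only builds the statement text and does not execute it.

```python
import json
import re


def build_set_offsets(pipeline_name, offsets):
    """Return an ALTER PIPELINE ... SET OFFSETS statement as a string.

    `offsets` maps source partition IDs to the offset to start reading from,
    for example {0: 100, 1: 100}. Hypothetical helper, for illustration only.
    """
    # Accept only simple identifiers so the name can be embedded as-is.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", pipeline_name):
        raise ValueError(f"unexpected pipeline name: {pipeline_name!r}")

    payload = {}
    for partition, offset in offsets.items():
        # Assumption: partition IDs are numeric, as in the example above.
        if not str(partition).isdigit():
            raise ValueError(f"unexpected partition ID: {partition!r}")
        if isinstance(offset, bool) or not isinstance(offset, int) or offset < 0:
            raise ValueError(f"offset for partition {partition!r} must be a non-negative integer")
        # JSON object keys must be strings, so the partition ID is quoted.
        payload[str(partition)] = offset

    # Compact separators give a single object with no extra whitespace.
    body = json.dumps(payload, separators=(",", ":"))
    return f"ALTER PIPELINE {pipeline_name} SET OFFSETS '{body}';"


print(build_set_offsets("mypipeline", {0: 100, 1: 100}))
```

The printed statement can then be pasted at the memsql> prompt. Only the partitions present in the mapping are included, so the considerations listed above about unspecified and nonexistent partitions still apply.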
ALTER PIPELINE SET TRANSFORM
You can configure an existing pipeline to use a transform by using the SET TRANSFORM clause. The syntax for applying a transform to a pipeline is identical to the WITH TRANSFORM syntax that is used when creating a new pipeline.
memsql> ALTER PIPELINE mypipeline SET TRANSFORM('http://memsql.com/my-transform-tarball.tar.gz', 'my-executable.py', '-arg1 -arg1');
Query OK, 0 rows affected (0.01 sec)
SET TRANSFORM ('uri', ['executable', 'arguments [...]'])
: Each of the transform’s parameters is described below:
uri
: The transform’s URI is the location from which the executable program can be downloaded, specified as either an http:// or file:// endpoint. If the URI points to a tarball with a .tar.gz or .tgz extension, its contents will be automatically extracted. Additionally, the executable parameter must be specified if the uri is a tarball. If the URI specifies an executable file itself, the executable and arguments parameters are optional.
executable
: The filename of the transform executable to run. This parameter is required if a tarball was specified as the endpoint for the transform’s uri. If the uri itself specifies an executable, this parameter is optional.
arguments
: A series of arguments that are passed to the transform executable at runtime.
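The tarball rule described above can be checked before a statement is sent. The following is a minimal Python sketch, assuming the rule is decided purely by the .tar.gz or .tgz ending of the URI path. The function names and the file:// path are hypothetical and are not part of MemSQL.

```python
from urllib.parse import urlparse

# Extensions that the parameter descriptions above treat as tarballs.
TARBALL_SUFFIXES = (".tar.gz", ".tgz")


def executable_required(uri):
    """Return True when `uri` points to a tarball, so `executable` must be named."""
    return urlparse(uri).path.endswith(TARBALL_SUFFIXES)


def check_transform_args(uri, executable=None, arguments=None):
    """Raise ValueError if the parameters break the tarball rule.

    Returns the parameters unchanged when the combination is acceptable.
    """
    if executable_required(uri) and not executable:
        raise ValueError("uri is a tarball, so executable must be specified")
    return (uri, executable, arguments)


# A tarball URI, as in the example statement above.
print(executable_required("http://memsql.com/my-transform-tarball.tar.gz"))
# A hypothetical URI that points directly at an executable.
print(executable_required("file:///opt/transforms/my-transform.py"))
```

The first call prints True and the second prints False. The check covers only the rule stated in this section; it does not confirm that the URI is reachable or that the named file exists inside the tarball.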
ALTER PIPELINE SET BATCH_INTERVAL
You can alter the batch interval for an existing pipeline by using the SET BATCH_INTERVAL clause. A batch interval is the time, in milliseconds, between the end of one batch operation and the start of the next. The syntax for setting a batch interval is identical to the BATCH_INTERVAL syntax that is used when creating a new pipeline.
memsql> ALTER PIPELINE mypipeline SET BATCH_INTERVAL 0;
Query OK, 0 rows affected (0.01 sec)