Job files
Learn how to configure job files for data transformation.
You can optionally supply one or more job files that specify how you want to
transform the captured data before writing it to the target.
Each job file contains a YAML
configuration that controls the transformation for a particular table from the source
database. You can also add a default-job.yaml
file to provide
a default transformation for tables that don't have a specific job file of their own.
The job files have a structure like the following example. This configures a default job that:

- Writes the data to a Redis hash
- Adds a field `app_code` to the hash with a value of `foo`
- Adds a prefix of `aws` and a suffix of `gcp` to the key
```yaml
source:
  table: "*"
  row_format: full
transform:
  - uses: add_field
    with:
      fields:
        - field: after.app_code
          expression: "`foo`"
          language: jmespath
output:
  - uses: redis.write
    with:
      data_type: hash
      key:
        expression: concat(['aws', '#', table, '#', keys(key)[0], '#', values(key)[0], '#gcp'])
        language: jmespath
```
The main sections of these files are:

- `source`: This is a mandatory section that specifies the data items that you want to use. You can add the following properties here:
  - `server_name`: Logical server name (optional).
  - `db`: Database name (optional). This refers to a database name you supplied in `config.yaml`.
  - `schema`: Database schema (optional). This refers to a schema name you supplied in `config.yaml`.
  - `table`: Database table name. This refers to a table name you supplied in `config.yaml`. The default job doesn't apply to a specific table, so use `"*"` in place of the table name for this job only.
  - `row_format`: Format of the data to be transformed. This can take the values `partial` (default) to use only the payload data, or `full` to use the complete change record. See the `transform` section below for details of the extra data you can access when you use the `full` option.
  - `case_insensitive`: This applies to the `server_name`, `db`, `schema`, and `table` properties and is set to `true` by default. Set it to `false` if you need to use case-sensitive values for these properties.
- `transform`: This is an optional section describing the transformation that the pipeline applies to the data before writing it to the target. The `uses` property specifies a transformation block that will use the parameters supplied in the `with` section. See the data transformation reference for more details about the supported transformation blocks, and also the JMESPath custom functions reference. You can test your transformation logic using the dry run feature in the API.

  Note: If you set `row_format` to `full` under the `source` settings, you can access extra data from the change record in the transformation:

  - Use the `key` object to access the attributes of the key. For example, `key.id` will give you the value of the `id` column as long as it is part of the primary key.
  - Use `before.<FIELD_NAME>` to get the value of a field before it was updated in the source database.
  - Use `after.<FIELD_NAME>` to get the value of a field after it was updated in the source database.
  - Use `after.<FIELD_NAME>` when adding new fields during transformations.

  See Row Format for a more detailed explanation of the `full` format.
- `output`: This is a mandatory section to specify the data structure(s) that RDI will write to the target along with the text pattern for the key(s) that will access it. Note that you can map one record to more than one key in Redis or nest a record as a field of a JSON structure (see Data denormalization for more information about nesting). You can add the following properties in the `output` section:
  - `uses`: This must have the value `redis.write` to specify writing to a Redis data structure. You can add more than one block of this type in the same job.
  - `with`:
    - `connection`: Connection name as defined in `config.yaml` (by default, the connection named `target` is used).
    - `data_type`: Target data structure when writing data to Redis. The supported types are `hash`, `json`, `set`, `sorted_set`, `stream`, and `string`.
    - `key`: This lets you override the default key for the data structure with custom logic:
      - `expression`: Expression to generate the key.
      - `language`: Expression language, which must be `jmespath` or `sql`.
    - `expire`: Positive integer value or SQL/JMESPath expression indicating a number of seconds for the key to expire. If you don't specify this property, the key will never expire. See Set custom expiration times / TTL for more information and examples.
Note: The `transform` section is optional, but if you don't specify a `transform`, you must specify custom key logic in `output.with.key`. You can include both of these sections if you want both a custom transform and a custom key.

Another example below shows how you can rename the `fname` field to `first_name` in the table `emp` using the `rename_field` block. It also demonstrates how you can set the key of this record instead of relying on the default logic.
```yaml
source:
  server_name: redislabs
  schema: dbo
  table: emp
transform:
  - uses: rename_field
    with:
      from_field: fname
      to_field: first_name
output:
  - uses: redis.write
    with:
      connection: target
      key:
        expression: concat(['emp:fname:',fname,':lname:',lname])
        language: jmespath
```
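As another hedged sketch, a job could combine the `filter` block from the data transformation reference with an SQL expression so that only matching rows reach the target (the `emp` table and `age` column are assumptions for this example; check the reference for the exact parameters):

```yaml
source:
  schema: dbo
  table: emp
transform:
  # Keep only rows where the SQL expression evaluates to true.
  - uses: filter
    with:
      expression: age > 40
      language: sql
output:
  - uses: redis.write
    with:
      connection: target
      data_type: hash
```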
See the RDI configuration file reference for full details about the available source, transform, and target configuration options and see also the data transformation reference for details of all the available transformation blocks.
Examples
The pages listed below show examples of typical job files for different use cases.