Transformation Primitives

Transformation tasks modify the fields of each data entry.

Syntax:

- TASKNAME: ["[!]CONDITION", "opt1", "opt2", ...]

Use '' as condition to always execute. Use !COND to execute only when condition is False.
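
The condition rule can be pictured with a small Python sketch. This is not the tool's implementation: the function name `should_run` and the `conditions` dictionary are hypothetical, and treating a condition that was never set as False is an assumption.

```python
def should_run(condition: str, conditions: dict) -> bool:
    """Return True when a task guarded by `condition` should execute."""
    if condition == "":
        return True  # empty condition: always execute
    if condition.startswith("!"):
        # negated form: run only when the named condition is False
        return not conditions.get(condition[1:], False)
    # assumption: a condition that was never set counts as False
    return conditions.get(condition, False)


flags = {"IS_DONE": True}
print(should_run("", flags))          # True
print(should_run("IS_DONE", flags))   # True
print(should_run("!IS_DONE", flags))  # False
```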

Quick reference

    - field_noop:         [COND]                                : do nothing
    - field_toint:        [COND, field1, ...]                   : convert fields to int
    - field_tofloat:      [COND, field1, ...]                   : convert fields to float
    - field_tostring:     [COND, field1, ...]                   : convert to string
    - field_nospace:      [COND, field1, ...]                   : remove all whitespace from fields
    - field_regexp_sub:   [COND, field, 'pattern', 'replace']   : apply regexp/replace
    - field_set:          [COND, field, 'value']                : create field with value
    - field_copy:         [COND, field1, field2]                : copy field1 to field2
    - field_rename:       [COND, field1, field2]                : rename field1 to field2
    - field_merge:        [COND, field1, field2, field3]        : field1 + field2 > field3
    - field_delete:       [COND, field1, field2, ...]           : remove fields
    - field_keep:         [COND, field1, field2, ...]           : keep only these fields (and classname/keyname)
    - field_lower:        [COND, field1, ...]                   : lowercase
    - field_upper:        [COND, field1, ...]                   : uppercase
    - field_date_now:     [COND, field1, ...]                   : set field(s) to YYYY-MM-DD
    - field_time_now:     [COND, field1, ...]                   : set field(s) to HH:MM:SS
    - field_datetime_now: [COND, field1, ...]                   : set field(s) to YYYY-MM-DD HH:MM:SS
    - field_uuid:         [COND, field1, ...]                   : set field(s) to random UUID string (distinct)
    - field_append:       [COND, field, 'suffix']               : append a suffix to field
    - field_prepend:      [COND, field, 'prefix']               : prefix field with prefix string
    - field_md5:          [COND, dst, src1, src2, ...]          : dst = md5(concat(src1, src2, ...))
    - field_join:         [COND, sep, dst_field, src1, src2, ...] : join fields with separator
    - field_datetime:     [COND, target, source, delta]         : compute datetime into target field
    - field_lookup:       [COND, dstfield, schemaname, srcfield] : lookup entry by keyname, set dstfield or ""
    - align_subnet4:      [COND, field]                         : align IPv4/mask to subnet boundaries
    - discard:            [COND]                                : eliminate full entry
    - exit:               [COND, 'status']                      : stop processing tasks, keep entry

field_noop

tasks:
- field_noop: [CONDITION]

Performs nothing. Useful as a placeholder or for testing conditions.

field_set

tasks:
- field_set: [CONDITION, "fieldname", "AnyValue"]

Creates or overwrites fieldname and sets its value.

field_copy

tasks:
- field_copy: [CONDITION, "field1", "field2"]

Creates/overwrites field2 with the value of field1.

field_rename

tasks:
- field_rename: [CONDITION, "field1", "field2"]

Renames field1 to field2. Creates/overwrites field2 with field1’s value and removes field1.

field_delete

tasks:
- field_delete: [CONDITION, "field1", "field2", ...]

Removes all specified fields.

field_keep

tasks:
- field_keep: [CONDITION, "field1", "field2", ...]

Keeps provided fields only. Removes all other fields (except classname and keyname).
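
A rough Python equivalent of the filtering described above, as a sketch only. The helper name `field_keep` mirrors the task name for readability, and the sample fields `ip` and `owner` are made up for illustration.

```python
PROTECTED = ("classname", "keyname")  # always retained, per the description above


def field_keep(entry: dict, *fields: str) -> dict:
    """Return a copy of entry holding only the listed fields plus the protected ones."""
    keep = set(fields) | set(PROTECTED)
    return {name: value for name, value in entry.items() if name in keep}


entry = {"classname": "server", "keyname": "srv01", "ip": "10.0.0.1", "owner": "alice"}
print(field_keep(entry, "ip"))
# {'classname': 'server', 'keyname': 'srv01', 'ip': '10.0.0.1'}
```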

field_lower

tasks:
- field_lower: [CONDITION, "field", "field", ...]

Converts field value(s) to lowercase.

field_upper

tasks:
- field_upper: [CONDITION, "field", "field", ...]

Converts field value(s) to uppercase.

field_toint

tasks:
- field_toint: [CONDITION, "field", "field", ...]

Converts field value(s) to integers.

field_tofloat

tasks:
- field_tofloat: [CONDITION, "field", "field", ...]

Converts field value(s) to floating-point numbers.

field_tostring

tasks:
- field_tostring: [CONDITION, "field", "field", ...]

Converts field value(s) to strings.

field_nospace

tasks:
- field_nospace: [CONDITION, "field", "field", ...]

Removes all whitespace characters, including tabs, from field values. Faster than a general regexp task.
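
To illustrate the effect, here is a Python sketch that strips whitespace without a regexp and compares the result with the regexp form. The function name `nospace` is hypothetical, and the assumption is that the task removes every whitespace character, newlines included.

```python
import re


def nospace(value: str) -> str:
    """Drop every whitespace character; str.split() with no argument splits on any whitespace."""
    return "".join(value.split())


value = " 10.1.1.1 \t/ 24\n"
print(nospace(value))  # 10.1.1.1/24

# Same result as the regexp form, shown here only for comparison
print(nospace(value) == re.sub(r"\s+", "", value))  # True
```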

field_regexp_sub

tasks:
- field_regexp_sub: [CONDITION, "fieldname", "pattern", "replace"]

Alters fieldname’s value by replacing every match of a regexp pattern with the provided replacement. Accepts any standard Python regular expression.

Example:

tasks:
- field_regexp_sub: ['', 'field_a', 'test', 'QWERTY']

# Before : {'field_a': 'This is a test from unittest !'}
# After  : {'field_a': 'This is a QWERTY from unitQWERTY !'}
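
Since patterns follow Python's regexp syntax, the example can be reproduced with Python's `re.sub`. This sketch assumes the task behaves like `re.sub` on the field value. It also shows how a word boundary avoids the partial match inside "unittest".

```python
import re

entry = {"field_a": "This is a test from unittest !"}

# Plain pattern: every occurrence of "test" is replaced
print(re.sub("test", "QWERTY", entry["field_a"]))
# This is a QWERTY from unitQWERTY !

# Word boundaries restrict the match to the standalone word
print(re.sub(r"\btest\b", "QWERTY", entry["field_a"]))
# This is a QWERTY from unittest !
```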

field_merge

tasks:
- field_merge: [CONDITION, 'field1', 'field2', 'field3']

Combines field1 and field2 into field3 (overwritten if it already exists).

If field values are numerical, a mathematical addition is performed. If they are strings, string concatenation is performed.

Use field_toint / field_tofloat / field_tostring to handle type conversions beforehand.
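
The difference between numeric and string merging matches Python's `+` operator, which the following sketch uses as a stand-in. The helper `field_merge` is illustrative only. How the real task reacts to mixed types is not stated here, which is one more reason to convert types first.

```python
def field_merge(entry: dict, field1: str, field2: str, field3: str) -> dict:
    """Store field1 + field2 in field3: addition for numbers, concatenation for strings."""
    entry[field3] = entry[field1] + entry[field2]
    return entry


print(field_merge({"a": 1, "b": 2}, "a", "b", "c"))      # {'a': 1, 'b': 2, 'c': 3}
print(field_merge({"a": "1", "b": "2"}, "a", "b", "c"))  # {'a': '1', 'b': '2', 'c': '12'}
```

In plain Python, merging an integer with a string raises a TypeError.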

field_prepend

tasks:
- field_prepend: [CONDITION, 'field1', 'prefix']

Puts the prefix string in front of field1’s value.

field_append

tasks:
- field_append: [CONDITION, 'field1', 'suffix']

Puts the suffix string at the end of field1’s value.

field_date_now

tasks:
- field_date_now: [CONDITION, "field", "field", ...]

Sets field(s) to the current date in YYYY-MM-DD format.

Useful for automated or periodic imports to track in-sync/out-of-sync objects.

field_time_now

tasks:
- field_time_now: [CONDITION, "field", "field", ...]

Sets field(s) to the current time in HH:MM:SS format.

field_datetime_now

tasks:
- field_datetime_now: [CONDITION, "field", "field", ...]

Sets field(s) to the current datetime in YYYY-MM-DD HH:MM:SS format.

field_uuid

tasks:
- field_uuid: [CONDITION, "field", "field", ...]

Creates/overwrites field(s) with a random UUID string. Each field receives a distinct UUID.

Use field_copy if the same UUID is needed in multiple fields.

See field_md5 for a deterministic primary key.
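
A sketch of the "one distinct UUID per field" behaviour in Python. It assumes version 4 (random) UUIDs, which the text above does not specify, and the field names are hypothetical.

```python
import uuid


def field_uuid(entry: dict, *fields: str) -> dict:
    """Give each listed field its own random UUID string."""
    for name in fields:
        entry[name] = str(uuid.uuid4())  # a fresh value on every call
    return entry


entry = field_uuid({}, "id_a", "id_b")
entry["id_copy"] = entry["id_a"]  # what a later field_copy would do

print(entry["id_a"] != entry["id_b"])     # True
print(entry["id_a"] == entry["id_copy"])  # True
```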

field_md5

new v3.24

tasks:
- field_md5: [CONDITION, dst, src1, src2, ...]

Computes md5 of the concatenated values of src* fields and stores the result in dst.

If any source field is missing, no md5 is computed and dst is not created.

Useful for creating a deterministic primary key by combining multiple fields.
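
The behaviour can be approximated in Python with `hashlib`. Several details here are assumptions rather than documented facts: values are concatenated without a separator, converted with `str()`, encoded as UTF-8, and the digest is stored as a hexadecimal string.

```python
import hashlib


def field_md5(entry: dict, dst: str, *sources: str) -> dict:
    """Set dst to the md5 of the concatenated source values, if all sources exist."""
    if any(name not in entry for name in sources):
        return entry  # a source field is missing: dst is not created
    # assumptions: no separator, str() conversion, UTF-8 bytes, hex digest
    joined = "".join(str(entry[name]) for name in sources)
    entry[dst] = hashlib.md5(joined.encode("utf-8")).hexdigest()
    return entry


row = field_md5({"site": "paris", "rack": 12}, "pk", "site", "rack")
print(len(row["pk"]))  # 32

# Same inputs always give the same key
print(row["pk"] == field_md5({"site": "paris", "rack": 12}, "pk", "site", "rack")["pk"])  # True
```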

field_join

new v3.24

tasks:
- field_join: [CONDITION, separator, dst_field, src1, src2, src3, ...]

Joins multiple source fields into a destination field using a separator (similar to Python’s str.join()). All values are converted to strings.

Example:

tasks:
- field_join: ['', '-', 'result', 'field1', 'field2']

# Before : {'field1': 'hello', 'field2': 'world'}
# After  : {'field1': 'hello', 'field2': 'world', 'result': 'hello-world'}

- field_join: ['', ' ', 'fullname', 'firstname', 'lastname']

# Before : {'firstname': 'John', 'lastname': 'Doe'}
# After  : {'firstname': 'John', 'lastname': 'Doe', 'fullname': 'John Doe'}
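
For comparison, the same two results with Python's `str.join()`. The sketch assumes every source field is present, since the behaviour with missing fields is not described above.

```python
def field_join(entry: dict, sep: str, dst: str, *sources: str) -> dict:
    """Join the source values, converted to strings, into dst."""
    entry[dst] = sep.join(str(entry[name]) for name in sources)
    return entry


print(field_join({"field1": "hello", "field2": "world"}, "-", "result", "field1", "field2"))
# {'field1': 'hello', 'field2': 'world', 'result': 'hello-world'}

# Non-string values are converted before joining
print(field_join({"host": "srv", "index": 7}, "", "name", "host", "index")["name"])
# srv7
```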

field_datetime

new v3.33

tasks:
- field_datetime: [CONDITION, target_field, source, delta]

Computes a datetime value and stores it in target_field.

source (required): one of:

  • now() — current local datetime
  • a field name present in the entry — its value is parsed as a datetime
  • a datetime string in ISO format YYYY-MM-DD HH:MM:SS

delta (optional): a duration to add or subtract, in the form [+-]<int><unit>.

Supported units: second, minute, hour, day, week, month, year (plural form also accepted).

delta can also be the name of a field whose value contains the delta string.

Output format: YYYY-MM-DD HH:MM:SS

If source cannot be resolved or parsed, the target field is not set.

Examples:

tasks:
# set result to current datetime
- field_datetime: ['', result, 'now()']

# set expiry to start + 30 days
- field_datetime: ['', expiry, start, '+30day']

# subtract 3 months from a fixed date
- field_datetime: ['', result, '2025-06-01 00:00:00', '-3month']

# use fields for both source and delta
- field_datetime: ['', result, last_sync, retention_delay]
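
The source and delta rules can be sketched in Python as follows. All helper names are hypothetical and the sketch rests on assumptions: the sign in the delta is mandatory, month and year shifts clamp the day to the end of the target month, and an invalid delta leaves the target unset just like an unparsable source.

```python
import calendar
import re
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"
DELTA_RE = re.compile(r"^([+-])(\d+)(second|minute|hour|day|week|month|year)s?$")


def add_months(dt: datetime, months: int) -> datetime:
    """Shift dt by whole months, clamping the day to the target month's length."""
    year, month0 = divmod(dt.year * 12 + (dt.month - 1) + months, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return dt.replace(year=year, month=month0 + 1, day=min(dt.day, last_day))


def apply_delta(dt: datetime, delta: str) -> datetime:
    """Apply a delta such as '+30day' or '-3month' to dt."""
    match = DELTA_RE.match(delta)
    if match is None:
        raise ValueError(f"bad delta: {delta!r}")
    sign, amount, unit = match.groups()
    n = int(amount) if sign == "+" else -int(amount)
    if unit == "month":
        return add_months(dt, n)
    if unit == "year":
        return add_months(dt, 12 * n)
    return dt + timedelta(**{unit + "s": n})  # seconds, minutes, hours, days, weeks


def field_datetime(entry: dict, target: str, source: str, delta=None) -> dict:
    """Resolve source (now(), field name, or literal), apply delta, store in target."""
    try:
        if source == "now()":
            dt = datetime.now()
        else:
            # a field name present in the entry wins over a literal datetime string
            dt = datetime.strptime(str(entry.get(source, source)), FMT)
        if delta is not None:
            dt = apply_delta(dt, str(entry.get(delta, delta)))
    except ValueError:
        return entry  # unresolved or unparsable: target is not set
    entry[target] = dt.strftime(FMT)
    return entry


print(field_datetime({}, "result", "2025-06-01 00:00:00", "-3month")["result"])
# 2025-03-01 00:00:00
print(field_datetime({"start": "2025-01-01 00:00:00"}, "expiry", "start", "+30day")["expiry"])
# 2025-01-31 00:00:00
```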

field_lookup

new v3.33

tasks:
- field_lookup: [CONDITION, dstfield, schema_name, srcfield]

Looks up an entry in schema_name using the value of srcfield as the keyname. Sets dstfield to the keyname of the found entry, or to "" if no match is found.

schema_name can be:

  • _user — looks up a user by login
  • _group — looks up a group by keyname
  • _role — looks up a role by keyname
  • _permission — looks up a permission by keyname
  • any other schema name — looks up a DataInstance by keyname

Useful for validating references and resolving canonical keynames from external data.

Examples:

tasks:
# Check that a username from an import exists as a Cavaliba user
- field_lookup: ['', resolved_user, _user, raw_login]

# Before : {'raw_login': 'alice'}
# After  : {'raw_login': 'alice', 'resolved_user': 'alice'}   (if alice exists)
# Before : {'raw_login': 'nobody'}
# After  : {'raw_login': 'nobody', 'resolved_user': ''}       (if not found)

# Resolve a server from an external dataset
- field_lookup: ['', server, myserver_schema, server_name]
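
The lookup semantics can be mimicked in Python with an in-memory stand-in for the database. The `REGISTRY` dictionary and its contents are invented for this sketch. The real task queries users, groups, roles, permissions or instances instead.

```python
# Hypothetical stand-in for the database: schema name -> known keynames
REGISTRY = {
    "_user": {"alice", "bob"},
    "myserver_schema": {"srv01"},
}


def field_lookup(entry: dict, dstfield: str, schema_name: str, srcfield: str,
                 registry: dict = REGISTRY) -> dict:
    """Set dstfield to the matching keyname, or to '' when nothing matches."""
    key = entry.get(srcfield)
    found = key in registry.get(schema_name, set())
    entry[dstfield] = key if found else ""
    return entry


print(field_lookup({"raw_login": "alice"}, "resolved_user", "_user", "raw_login"))
# {'raw_login': 'alice', 'resolved_user': 'alice'}
print(field_lookup({"raw_login": "nobody"}, "resolved_user", "_user", "raw_login"))
# {'raw_login': 'nobody', 'resolved_user': ''}
```

An empty dstfield can then drive a condition, for example to discard entries whose reference did not resolve.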

align_subnet4

new v3.24

tasks:
- align_subnet4: [CONDITION, fieldname]

Aligns an IPv4 address/mask to subnet boundaries. Transforms an IP address with prefix length to the network address with the same prefix length.

Example:

tasks:
- align_subnet4: ['', subnet_field]

# Before : {'subnet_field': '10.1.1.1/24'}
# After  : {'subnet_field': '10.1.1.0/24'}

# Before : {'subnet_field': '192.168.50.100/16'}
# After  : {'subnet_field': '192.168.0.0/16'}
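
The same alignment is available in Python's standard `ipaddress` module, which gives a convenient way to check expected values. This shows the equivalent computation, not necessarily how the task is implemented.

```python
import ipaddress


def align_subnet4(value: str) -> str:
    """Return the network address, keeping the prefix length, for an IPv4 address/prefix."""
    # strict=False accepts host bits in the address and masks them out
    return str(ipaddress.IPv4Network(value, strict=False))


print(align_subnet4("10.1.1.1/24"))        # 10.1.1.0/24
print(align_subnet4("192.168.50.100/16"))  # 192.168.0.0/16
```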

discard

tasks:
- discard: [CONDITION]

Eliminates the full entry. The entry is not written to the database.

exit

new v3.33

tasks:
- exit: [CONDITION, status]

Stops processing remaining tasks for the current entry. The entry is kept (not discarded).

The optional status parameter sets the return value for other processing steps. Defaults to 'exit' if omitted.

Example:

tasks:
- set_condition: [IS_DONE, field_match, status, '^done$']
- exit: [IS_DONE, "all-done"]
- field_set: ['', needs_processing, 'yes']

# entries with status='done' skip remaining tasks and are saved as-is
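
The difference between exit and discard can be sketched as a tiny task loop in Python. Everything here is illustrative: conditions are passed in precomputed, only three tasks are modelled, and the `None` status for normal completion and the `'discard'` status string are assumptions.

```python
def run_tasks(entry: dict, tasks: list, conditions: dict):
    """Apply tasks in order; return (entry or None, status)."""
    for name, (cond, *opts) in tasks:
        wanted = not cond.startswith("!")
        if cond and conditions.get(cond.lstrip("!"), False) != wanted:
            continue  # condition not satisfied: skip this task
        if name == "discard":
            return None, "discard"  # entry is dropped entirely
        if name == "exit":
            return entry, (opts[0] if opts else "exit")  # entry kept, remaining tasks skipped
        if name == "field_set":
            entry[opts[0]] = opts[1]
    return entry, None


tasks = [
    ("exit", ["IS_DONE", "all-done"]),
    ("field_set", ["", "needs_processing", "yes"]),
]
print(run_tasks({"status": "done"}, tasks, {"IS_DONE": True}))
# ({'status': 'done'}, 'all-done')
print(run_tasks({"status": "open"}, tasks, {"IS_DONE": False}))
# ({'status': 'open', 'needs_processing': 'yes'}, None)
```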