Version

    Text manipulation steps

    Left substring

    Left substring allows you to extract specific number of characters from the beginning of text (i.e., it extracts first characters).

    Parameters
    • Input column: required, a string column containing input text.

    • Number of characters: number of characters to extract from the input string. The value must be greater than or equal to zero. You can use values greater than the length of the input string - whole input will be returned in such case (no padding is done).

    • Target column: required, configure the column which will receive the output. Output will always be of string type.

      • Write result to the current column: overwrite the input column with the result.

      • Create new column with name: create a new column with specified name. Name of the new column can contain spaces or special characters - technical column name will be created automatically. The new column will be placed right after the input column.

    Examples
    Input value Number of characters Output value Description

    "INV-20230328"

    3

    "INV"

    Extract first three characters to determine type of a document ("Invoice").

    "short"

    10

    "short"

    Using higher number of characters than the length of the input returns the whole input value.

    "INV-20230328"

    0

    ""

    Setting number of characters to 0 returns an empty string. Note that empty string is not the same as No value (which is stored as null).

    No value

    5

    No value

    Returns No value when called on No value input.

    Error

    any

    Error

    Calling the step on a cell with Error will result in an Error.

    Remarks
    • Calling the step on a No value string results in No value.

    • If number of characters specified in the parameter is greater than the length of the input string, the whole input string is returned.

    • Calling the step on a cell with Error will result in an Error.

    See also

    Right substring

    Right substring allows you to extract specific number of characters from the end of text (i.e., it extracts last characters).

    Parameters
    • Input column: required, a string column containing input text.

    • Number of characters: number of characters to extract from the input string. The value must be greater than or equal to zero. You can use values greater than the length of the input string - whole input will be returned in such case (no padding is done).

    • Target column: required, configure the column which will receive the output. Output will always be of string type.

      • Write result to the current column: overwrite the input column with the result.

      • Create new column with name: create a new column with specified name. Name of the new column can contain spaces or special characters - technical column name will be created automatically. The new column will be placed right after the input column.

    Examples
    Input value Number of characters Output value Description

    "INV-20230328"

    8

    "20230328"

    Extract date part from an invoice number - last 8 characters in this example.

    "short"

    10

    "short"

    using higher number of characters than the length of the input returns the whole input value.

    "INV-20230328"

    0

    ""

    Setting the number of characters to 0 returns an empty string.

    No value

    5

    No value

    Returns No value when called on No value input.

    Error

    any

    Error

    Calling the step on a cell with Error will result in an Error.

    Remarks
    • Calling the step on a No value string results in No value.

    • If number of characters specified in the parameter is greater than the length of the input string, the whole input string is returned.

    • Calling the step on a cell with Error will result in an Error.

    See also

    Substring

    Substring step returns specific number of characters from provided text starting with specified character position. First character has position of 0.

    Parameters
    • Input column: required, a string column containing input text.

    • Start position: required, starting position of the substring. First character in the string has position of 0, last character has position of length($Input) - 1. Negative start positions are not allowed. Start position greater than or equal to the length of the input is allowed.

    • Number of characters: required, number of characters returned to copy from the input string. Value must be greater or equal to 0. Values longer than the remainder of the string from Start position are allowed.

    • Target column: required, configure the column which will receive the output. Output will always be of string type.

      • Write result to the current column: overwrite the input column with the result.

      • Create new column with name: create a new column with specified name. Name of the new column can contain spaces or special characters - technical column name will be created automatically. The new column will be placed right after the input column.

    Examples
    Input value Starting position Number of characters Output value Description

    "INV-20230328"

    4

    4

    "2023"

    Returns 4 characters starting from the fifth one (character with index 4 is the fifth character).

    "INV-20230328"

    4

    20

    "20230328"

    If Number of characters is longer than the rest of the input after Start position, only the available characters are returned.

    "London"

    10

    5

    "" (empty string)

    Start position is after the end of the input, empty string is returned.

    "INV-20230328"

    0

    3

    "INV"

    First three characters of the input. Same result as using Left substring step with Number of characters set to 3.

    "INV-20230328"

    4

    8

    "20230328"

    Simpler solution would be reached by using Right substring step

    No value

    any

    any

    No value

    No value input produces No value output.

    Error

    any

    any

    Error

    Error input value leads to Error on output.

    Remarks
    • If Starting position is greater than the length of the input text, empty string "" is returned.

    • If Starting position is less than the length of the input text and the Number of characters exceeds the length of the remainder of the input, only characters up to the end of the input are returned (no padding is done).

    Trim

    Trim step removes leading and trailing spaces from text. Spaces within the text (i.e., spaces between words) are left without any changes.

    Parameters
    • Input column: required, a string column containing input text.

    • Target column: required, configure the column which will receive the output. Output will always be of string type.

      • Write result to the current column: overwrite the input column with the result.

      • Create new column with name: create a new column with specified name. Name of the new column can contain spaces or special characters - technical column name will be created automatically. The new column will be placed right after the input column.

    Examples
    Input value Output value Description

    "Spaces after    "

    "Spaces after"

    Trailing spaces are removed.

    "    Spaces before"

    "Spaces before"

    Leading spaces are removed.

    "    Space before and after    "

    "Space before and after"

    Both leading and trailing spaces are removed.

    "    Spaces    in    the    middle    "

    "Spaces    in    the    middle"

    Leading and trailing spaces are removed but spaces between words are left unchanged.

    "    " (spaces only)

    "" (empty text)

    Trim of text containing only spaces results in empty text - a string with zero length. Note this is different than No value.

    "" (empty text)

    "" (empty text)

    Trim of empty text produces empty text (zero-length string). Note this is different than No value.

    No value

    No value

    Trim of a No value cell results in a No value output.

    Error

    Error

    Running Trim on errors propagates error.

    Remarks
    • Trim removes variety of whitespace characters not just "regular" spaces:

      • Space character (various variants of Unicode space characters are removed)

      • New line characters and their combinations as used in Windows, Mac or Linux (line feed \n, form feed \f, carriage return \r, line separator \u2028, paragramp separator \u2029)

      • Tab (both horizontal tab \t as well as vertical tab \u000B)

      • Unicode separator character (file, group, record, unit separators)

    • Calling Trim step on a cell with Error will result in an Error.

    Pad left

    Pad left allows you to pad string values to specified minimum length by adding a designated character at the beginning of the string. If the input is longer than specified length, no change is made.

    Parameters
    • Input column: required, a string column containing input text.

    • Length: a numeric value containing minimal length of an output string. The value must be greater than 0.

    • Padding char: required, a character used as a filler character. Only a single character can be used here.

    • Target column: required, configure the column which will receive the output. Output will always be of string type.

      • Write result to the current column: overwrite the input column with the result.

      • Create new column with name: create a new column with specified name. Name of the new column can contain spaces or special characters - technical column name will be created automatically. The new column will be placed right after the input column.

    Examples
    Input value Length Padding char Output value Description

    "1234"

    6

    "0"

    "001234"

    Pad to 6 characters by adding zeroes at the beginning.

    "1234"

    2

    "0"

    "1234"

    Length value is smaller than the length of the Input value, no change to input is made.

    "1234"

    6

    "00"

    Error

    Padding character cannot contain multiple characters.

    No value

    any

    any

    No value

    Calling the step on No value input will return No value.

    Error

    any

    any

    Error

    Calling the step on an Error will return an Error.

    Remarks
    • Padding char must contain exactly one character (including whitespace characters). Longer padding string will result in an error.

    • If length of the input is longer than the specified Length, the input is returned unchanged.

    See also

    Pad right

    Pad right allows you to pad string values to specified minimum length by adding a designated character at the end of the string. If the input is longer than specified length, no change is made.

    Parameters
    • Input column: required, a string column containing input text.

    • Length: a numeric value containing minimal length of an output string. The value must be greater than 0.

    • Padding char: required, a character used as a filler character. Only a single character can be used here.

    • Target column: required, configure the column which will receive the output. Output will always be of string type.

      • Write result to the current column: overwrite the input column with the result.

      • Create new column with name: create a new column with specified name. Name of the new column can contain spaces or special characters - technical column name will be created automatically. The new column will be placed right after the input column.

    Examples
    Input value Length Padding char Output value Description

    "1234"

    8

    " " (space)

    "1234    "

    Pad to 8 characters by adding spaces at the end.

    "1234"

    2

    "0"

    "1234"

    Length value is smaller than the length of the Input value, no change to input is made.

    "1234"

    6

    "00"

    Error

    Padding character cannot contain multiple characters.

    No value

    any

    any

    No value

    Calling the step on No value input will return No value.

    Error

    any

    any

    Error

    Calling the step on an Error will return an Error.

    Remarks
    • Padding char must contain exactly one character (including whitespace characters). Longer padding string will result in an error.

    • If length of the input is longer than the specified Length, the input is returned unchanged.

    See also

    Normalize whitespaces

    Normalize whitespaces step removes tabs, new line or carriage return characters, and extra spaces from the input text and replaces them with a single space character when in between words, or removes them completely when before the first word or after the last word. This step can help you optimize your data and quickly remove unwanted characters that were entered in error or that are in an unwanted format.

    Parameters
    • Input column: required, a string column containing input text.

    • Target column: required, configure the column which will receive the output. Output will always be of string type.

      • Write result to the current column: overwrite the Input column with the result.

      • Create new column with name: create a new column with specified name. Name of the new column can contain spaces or special characters - technical column name will be created automatically. The new column will be placed right after the Input column.

    Examples
    Input value Output value Explanation

    "Text with single spaces."

    "Text with single spaces."

    No change because there are no extra whitespace characters.

    "Text with    lots of    space     characters."

    "Text with lots of space characters."

    All extra spaces are removed.

    "Text      with tabs and       
    a new line."

    "Text with tabs and a new line."

    All extra spaces, tabs, and new line/carriage return characters are removed.

    " Test word "

    "Test word"

    Extra spaces at the beginning and end are removed.

    No value

    No value

    Empty value produces empty value on output.

    Error

    Error

    Applying the step to an error results in another Error.

    Remove whitespaces

    Remove whitespaces step removes all whitespace (invisible) characters from text including leading and trailing whitespaces as well as any whitespace characters within the text.

    Parameters
    • Input column: required, a string column to remove the whitespaces from.

    • Target column: required, configure the column which will receive the output. Output will always be of string type.

      • Write result to the current column: overwrite the input column with the result.

      • Create new column with name: create a new column with specified name. Name of the new column can contain spaces or special characters - technical column name will be created automatically. The new column will be placed right after the input column.

    Examples
    Input value Output value Description

    "Text with spaces."

    "Textwithspaces."

    Any space in the string is removed.

    "   Text with lots of whitespace characters.    "

    "Textwithlotsofwhitespacecharacters."

    Any space including leading and trailing spaces are removed.

    "Text      with
    a new line."

    "Textwithanewline."

    All whitespace characters are removed; including new lines and tabs.

    "1 st"

    "1st"

    All spaces in a sentence or between words are removed.

    Remarks
    • Remove whitespaces step only works with string columns.

    • Removes variety of "invisible" spacing characters not just spaces:

      • Regular space

      • New line (both Windows and Linux type)

      • Tab

    • To remove only leading and trailing whitespace characters, use Trim step.

    Remove accents

    Remove accents removes all accents from the text by converting all accented characters into the same character without an accent.

    Parameters
    • Input column: required, a string column containing input text.

    • Target column: required, configure the column which will receive the output. Output will always be of string type.

      • Write result to the current column: overwrite the input column with the result.

      • Create new column with name: create a new column with specified name. Name of the new column can contain spaces or special characters - technical column name will be created automatically. The new column will be placed right after the input column.

    Examples
    Input value Output value Description

    Côte d’Azure

    Cote d’Azure

    "ô" → "o". Apostrophe is not removed since it is not an accent.

    Saarbrücken

    Saarbrucken

    "ü" → "u"

    No value

    No value

    Calling the step on No value returns No value.

    Error

    Error

    Calling the step on Error returns Error.

    Remarks
    • Calling the step on No value returns No value.

    • Calling the step on cell with an Error returns Error.

    Lowercase

    Lowercase step converts all letter characters to their lowercase form.

    Parameters
    • Input column: required, a string column containing input text.

    • Target column: required, configure the column which will receive the output. Output will always be of string type.

      • Write result to the current column: overwrite the input column with the result.

      • Create new column with name: create a new column with specified name. Name of the new column can contain spaces or special characters - technical column name will be created automatically. The new column will be placed right after the input column.

    Examples
    Input value Output value Description

    "CLOVERDX"

    "cloverdx"

    All uppercase letters are transformed to lowercase.

    "camelCase"

    "camelcase"

    All uppercase letters are transformed to lowercase.

    "Corporation Inc.""

    "corporation inc."

    All uppercase letters are transformed to lowercase.

    "INV07282710PH""

    "inv0728271ph"

    All uppercase letters are transformed to lowercase while numerical characters stay the same.

    No value

    No value

    Returns No value when called on No value input.

    Error

    Error

    Calling the step on a cell with Error will result in an Error.

    Remarks
    • Non-letter characters (e.g., digits, symbols, …​) remain without any changes.

    See also

    Propercase

    Propercase step capitalizes first letters of each word in string columns. Note that it does convert any of the following letters to lowercase. If you want to make sure that there are no capital letters within words, use the Lowercase step first to convert all letters to lowercase.

    Parameters
    • Input column: required, a string column containing input text.

    • Target column: required, configure the column which will receive the output. The output will always be of string type.

      • Write result to the current column: overwrite the Input column with the result.

      • Create new column with name: create a new column with specified name. Name of the new column can contain spaces or special characters - technical column name will be created automatically. The new column will be placed right after the Input column.

    Examples
    Input value Output value Explanation

    "product code name"

    "Product Code Name"

    First letters of all words are capitalized.

    "PRODUCT CODE NAME"

    "PRODUCT CODE NAME"

    No change; words in uppercase will remain in uppercase.

    "ProDuct cOde name"

    "ProDuct COde Name"

    First letters of all words are capitalized, and the uppercase letters in the middle of the words are retained.

    See also

    Uppercase

    Uppercase step converts all letter characters to uppercase form (i.e., it capitalizes all letters in the input).

    Parameters
    • Input column: required, a string column containing input text.

    • Target column: required, configure the column which will receive the output. Output will always be of string type.

      • Write result to the current column: overwrite the input column with the result.

      • Create new column with name: create a new column with specified name. Name of the new column can contain spaces or special characters - technical column name will be created automatically. The new column will be placed right after the input column.

    Examples
    Input value Output value Description

    "cloverdx"

    "CLOVERDX"

    All lowercase letters are transformed to uppercase.

    "camelCase"

    "CAMELCASE"

    All lowercase letters are transformed to uppercase.

    "Corporation Inc.""

    "CORPORATION INC."

    All lowercase letters are transformed to uppercase.

    "inv07282710ph""

    "INV0728271PH"

    All lowercase letters are transformed to uppercase while numerical characters stay the same.

    No value

    No value

    Returns No value when called on No value input.

    Error

    Error

    Calling the step on a cell with Error will result in an Error.

    Remarks
    • Non-letter characters (e.g., digits, symbols, …​) remain without any changes.

    See also

    Remove non-alphanumeric characters

    Remove non-alphanumeric characters removes whitespaces, punctuation, math operators and other special characters from the text, only letters and numbers are kept.

    Parameters
    • Input column: required, a string column containing input text.

    • Target column: required, configure the column which will receive the output. Output will always be of string type.

      • Write result to the current column: overwrite the input column with the result.

      • Create new column with name: create a new column with specified name. Name of the new column can contain spaces or special characters - technical column name will be created automatically. The new column will be placed right after the input column.

    Examples
    Input value Output value Description

    (8+4)*2

    842

    The math operators (), +, and * are removed.

    34% of books

    34ofbooks

    The math operator % is removed along with whitespaces.

    gâteau

    gâteau

    All text is alphanumeric so it is unmodified.

    おはよう日本。

    おはよう日本

    Japanese characters are not letters but are considered alpha-numeric and are not removed. The period character at the end is removed.

    No value

    No value

    Calling the step on No value returns No value.

    Error

    Error

    Calling the step on Error returns Error.

    Remarks
    • Calling the step on No value returns No value.

    • Calling the step on cell with an Error returns Error.

    Remove non-ASCII characters

    Remove non-ASCII characters removes all characters that are not in US ASCII encoding.

    Parameters
    • Input column: required, a string column containing input text.

    • Target column: required, configure the column which will receive the output. Output will always be of string type.

      • Write result to the current column: overwrite the input column with the result.

      • Create new column with name: create a new column with specified name. Name of the new column can contain spaces or special characters - technical column name will be created automatically. The new column will be placed right after the input column.

    Examples
    Input value Output value Description

    Voyez le brick géant que j’examine.

    Voyez le brick gant que j’examine

    Only the "é" is a non-US ASCII character therefore it was removed

    Příšerný žluťoučký kůň úpěl ďábelské ódy.

    Pern luouk k pl belsk dy.

    This string has many non-US ASCII characters therefore many parts were removed

    おはよう日本。

    "" (empty string)

    Japanese text does not contain any ASCII characters and is completely removed.

    No value

    No value

    Calling the step on No value returns No value.

    Error

    Error

    Calling the step on Error returns Error.

    Remarks
    • Calling the step on No value returns No value.

    • Calling the step on cell with an Error returns Error.

    Remove non-numeric characters

    Remove non-numeric characters removes all characters except for (0-9) or decimal point (period character).

    Parameters
    • Input column: required, a string column containing input text.

    • Target column: required, configure the column which will receive the output. Output will always be of string type.

      • Write result to the current column: overwrite the input column with the result.

      • Create new column with name: create a new column with specified name. Name of the new column can contain spaces or special characters - technical column name will be created automatically. The new column will be placed right after the input column.

    Examples
    Input value Output value Description

    6.125%

    6.125

    Only characters 0-9 and decimal point are kept, % character is removed.

    123 2nd Street apartment 3b

    12323

    All of the whitespaces and alphabetical characters are removed.

    No value

    No value

    Calling the step on No value returns No value.

    Error

    Error

    Calling the step on Error returns Error.

    Remarks
    • Calling the step on No value returns No value.

    • Calling the step on cell with an Error returns Error.

    Remove non-printable characters

    Remove non-printable characters removes all non-printable characters from value - tabs, line breaks, page breaks and variety of control characters. Note that regular spaces as used between words are not removed.

    Parameters
    • Input column: required, a string column containing input text.

    • Target column: required, configure the column which will receive the output. Output will always be of string type.

      • Write result to the current column: overwrite the input column with the result.

      • Create new column with name: create a new column with specified name. Name of the new column can contain spaces or special characters - technical column name will be created automatically. The new column will be placed right after the input column.

    Examples
    Input value Output value Description

    Welcome:⟶John    Doe!

    Welcome:John Doe!

    The tab (denoted by the ⟶ symbol in the example) is removed while the four regular spaces between words are kept.

    Please sign here:

    John Doe

    Please sign here:John Doe

    The spaces are kept while new line characters are removed.

    No value

    No value

    Calling the step on No value returns No value.

    Error

    Error

    Calling the step on Error returns Error.

    Remarks
    • Remove non-printable characters removes all non-printable characters. you can find a full list here.

    • Calling the step on No value returns No value.

    • Calling the step on cell with an Error returns Error.

    Replace text

    Replace text replaces all matches of a regular expression within text with the replacement text.

    Parameters
    • Input column: required, a string column containing input text.

    • Replaced text: required, a regular expression to search for in the input text. Note that the replace function is case-sensitive.

    • Replacement: optional, text which will be used as replacement for each occurrence of the Replaced text in the input. If left empty, the value in the Replaced text field will be deleted.

    • Target column: required, configure the column which will receive the output. Output will always be of string type.

      • Write result to the current column: overwrite the input column with the result.

      • Create new column with name: create a new column with specified name. Name of the new column can contain spaces or special characters - technical column name will be created automatically. The new column will be placed right after the input column.

    Examples
    Input value Replaced text Replacement Result Description

    Two boxes: red box and blue BOX

    box

    car

    Two cares: red car and blue BOX

    Simple replacements without special symbols can be done directly. Search for instances of "box" and replace with "car". Search is case-sensitive, so word "BOX" is left as is.

    Hello, Emily.

    [Ee]

    -X-

    H-X-llo, -X-mily

    Square brackets define character group. Here we define a group that contains "E" and "e"All occurrences of letters from the group are replaced with string -X-.

    Hello, Emily

    e(l+)

    a$1

    Hallo, Emily

    In this example we define capture group using round brackets and use it to capture all lower-case Ls. The group is then referenced in the replacement text using $1 syntax.

    No value

    No value

    Calling the step on No value returns No value.

    Error

    Error

    Remarks
    • Calling the step on No value returns No value.

    • Calling the step on cell with an Error returns Error.