Data conversion steps
-
Convert to boolean: convert values in string columns to boolean
-
Convert to date: convert values in string columns to date
-
Convert to decimal: convert values in string or integer columns to decimals
-
Convert to integer: convert values in decimal or string columns to integers
-
Convert to string: convert values in a column to string
-
Convert Unix time to date: convert Unix timestamp value to date and time
-
Calculate formula: convert data using custom calculation
Convert to boolean
The Convert to boolean step can be used to convert text column into a boolean column. It cannot be applied to any other column type. See below for the list of values that the step can handle.
Parameters
-
Input column: required, a string column to convert to boolean.
-
Target column: required, configure the column which will receive the output. Output will always be of boolean type.
-
Write result to the current column: overwrite the input column with the result.
-
Create new column with name: create a new column with specified name. Name of the new column can contain spaces or special characters - technical column name will be created automatically. The new column will be placed right after the input column.
-
Example
Since the step does not have any extra configuration besides input and target columns, its application is quite straightforward. Following screenshot shows possible values and how they are converted:
Remarks
-
The step can handle following values when converting to boolean. Note that the step ignores leading or trailing whitespaces and is not case sensitive (so "YES" and "yes" are treated the same).
Input Output "true", "yes", "y", "1"
true
"false", "no", "n", "0"
false
No value
No value
Any other value not mentioned above
Error
-
Converting an Error value will result in an Error.
Convert to date
The Convert to date converts string columns to date columns using a format that you specify in the step configuration.
To learn more about formatting options consult our date format documentation.
Parameters
-
Input column: required, a string column to convert to date.
-
Format: required, describes format to use when parsing values in the input column. See date formats for more details.
-
Locale: required, determines which language to use when parsing the data. This is important for example when trying to parse month names that are spelled out (e.g. "November" instead of 11).
-
Timezone: required, the timezone to be used to create the date values.
-
Target column: required, configure the column which will receive the output. Output will always be a date column.
-
Write result to the current column: overwrite the input column with the result.
-
Create new column with name: create a new column with specified name. Name of the new column can contain spaces or special characters - technical column name will be created automatically. The new column will be placed right after the input column.
-
Example
A simple example showing conversion results alongside settings step settings:
Note that for the above screenshot, the date column uses yyyy-MM-dd HH:mm:ss.SSS
display format. You can observe that:
-
The first and second rows have the same output value even though the input value is different. This is because the milliseconds are ignored since they are not part of the conversion date format.
-
The third row returns and error (with error message being "Unparseable date"). The date in that row does not use the same format as expected and therefore cannot be parsed and converted.
-
No value gets converted into No value.
Remarks
-
Converting an Erorr value will result in an Error.
Convert to decimal
The Convert to decimal step converts values in the selected column from string or integer to decimal.
Parameters
-
Input column: required, a string or integer column to convert to decimal.
-
Format: only shown if converting string to decimal. Describes format to use when parsing values in the input column. See decimal formats for more details.
-
Locale: only shown if converting string to decimal. Determines which language to use when parsing the data. Locale controls whether (for example) decimal point or comma is used, what digit grouping is allowed etc.
-
Target column: required, configure the column which will receive the output. Output will always be a decimal column.
-
Write result to the current column: overwrite the input column with the result.
-
Create new column with name: create a new column with specified name. Name of the new column can contain spaces or special characters - technical column name will be created automatically. The new column will be placed right after the input column.
-
Examples
A basic example that shows output of the conversion from string to decimal using No specific format formatting and English (United States) locale.
When parsing monetary data you’ll often need to parse values that have currency symbol - often $, € or other symbol. In such case, you can write a format string with this symbol in mind: the following screenshot shows $###,###.00
with English (United States) locale.
Percent symbol has a special meaning when parsing numerical data. It will automatically cause the data to be divided by 100 when converting to decimal. While this may be desirable in some case, sometimes you want to parse the percentage as it is. In such case, you will have to use a format string like 0.00'%'
- note the single quotes around the percent sign. The difference can be seen on the following screenshot:
For the example below, the decimal column uses #.00
format to always show two decimal places.
Notice how the values that do not fit into an integer column can still be converted to decimal properly:
-
Row 6 is too large to fit into integer, but fits into a decimal. Conversion process understands this and will handle cases like this.
-
Row 8 is a decimal number in the source CSV file. It cannot be stored in integer column so it is marked as an error. The conversion to decimal is able to convert the value so it no longer shows as error in the converted data.
Remarks
-
When converting string to decimal, format you specify will apply.
-
When converting integer to decimal, no format is necessary and you’ll see that the step will not show you the Format or Locale parameters. The step will try to convert also values that do not fit into integer column but may fit into decimal - e.g. values that are too large/small or contain non-integer numbers.
-
Converting an Error value will result in an Error unless the error is a data error as described in the previous point.
-
Keep in mind that Wrangler data type integer corresponds to CTL type
long
when researching CTL functions to use in formulas that work with integer columns.
Convert to integer
The Convert to integer step converts decimal or string values to integer. When converting string values, you can define format
Parameters
-
Input column: required, a string or decimal column to convert to integer.
-
Format: only shown if converting string to integer. Describes format to use when parsing values in the input column. See integer formats for more details.
-
Locale: only shown if converting string to integer. Determines which language to use when parsing the data. Locale controls whether what digit grouping is allowed etc.
-
Target column: required, configure the column which will receive the output. Output will always be a integer column.
-
Write result to the current column: overwrite the input column with the result.
-
Create new column with name: create a new column with specified name. Name of the new column can contain spaces or special characters - technical column name will be created automatically. The new column will be placed right after the input column.
-
Examples
A basic example that shows output of the conversion from string to integer using No specific format formatting and English (United States) locale.
Converting from decimal to integer does not require Format or Locale parameters and works by throwing away fractional part of a decimal number. Note that decimal can store larger numbers than integer and the conversion can therefore result in an error.
Remarks
-
Converting an Error value will result in an Error.
-
Converting a No value cell will result in a No value.
Convert to string
The Convert to string step converts values from any Wrangler-supported to string. Display format of the input column is preserved during the conversion.
Parameters
-
Input column: required, a column to convert to string.
-
Target column: required, configure the column which will receive the output. Output will always be a string column.
-
Write result to the current column: overwrite the input column with the result.
-
Create new column with name: create a new column with specified name. Name of the new column can contain spaces or special characters - technical column name will be created automatically. The new column will be placed right after the input column.
-
Examples
A screenshot below shows an example in which date is converted to string. Notice how errors "disappear" in the String column since string can store even invalid and malformed dates.
Remarks
-
Converting No value will result in a No value again.
-
Converting errors may in many cases "fix" the date errors since string can store values that cannot be parsed as numbers, dates etc. Runtime errors will be propagated (i.e., a cell with Error will result in Error if the error is a runtime error and not a data formatting error). See examples above for more details.
Convert Unix time to date
The Convert Unix time to date step allows you to convert from a numeric Unix timestamp value to a date type. Unix timestamp is specified as a number of milliseconds since January 1st, 1970 (also called Unix epoch). Positive numbers refer to date and time after the epoch while negative values refer to date and time before the epoch. For example, timestamp 1680620400000 refers to Tuesday, April 4th, 2023, 15:00 GMT.
Parameters
-
Input column: required, an integer column containing timestamps to convert to date.
-
Target column: required, configure the column which will receive the output. Output will always be a date column.
-
Write result to the current column: overwrite the input column with the result.
-
Create new column with name: create a new column with specified name. Name of the new column can contain spaces or special characters - technical column name will be created automatically. The new column will be placed right after the input column.
-
Examples
Note the value in the Output column in the examples will depend on formatting selected for the target column. Examples use yyyy-MM-dd HH:mm:ss.SSS
formatting to display full date and time including milliseconds and use UTC time zone.
Input value | Output value | Description |
---|---|---|
0 |
1970-01-01 00:00:00.000 |
Value of 0 corresponds to Unix epoch start - midnight of January 1st, 1970. |
1680620400000 |
2023-04-04 15:00:00.000 |
Positive values represent date and time after the Epoch start. |
-14182980000 |
1969-07-20 20:17:00.000 |
Negative values result in date and time before Epoch start. |
No value |
No value |
Remarks
-
This steps changes type of the column from integer to date. Different formatting options will be available for the column after the conversion.