DataSetReader
Short description
DataSetReader reads data from data sets stored in the Data Manager.
This component must run on CloverDX Server. It only connects to the Data Manager instance deployed in the same CloverDX Server instance where it is running.
Data source | Input ports | Output ports | Each to all outputs | Different to different outputs | Transformation | Transf. req. | Java | CTL | Auto-propagated metadata |
---|---|---|---|---|---|---|---|---|---|
Any data set in Data Manager | 0 | 1 | ⨯ | ⨯ | ✓ | ⨯ | ⨯ | ✓ | ✓ |
Ports
Port type | Number | Required | Description | Metadata |
---|---|---|---|---|
Output | 0 | ✓ | For the data read from the selected data set. | Auto-propagated based on the layout of the selected data set. |
Metadata
The DataSetReader component propagates metadata on the output port. Metadata is created based on the layout of the selected data set.
If the output port uses different metadata, the data set's fields can be mapped to the output fields via the Output mapping component attribute.
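The default Map by name behaviour can be pictured with the following Python sketch. It is not CloverDX code and does not reflect the component's implementation: the field layouts, the `map_by_name` helper, and the choice to leave unmatched output fields as `None` are all assumptions made for illustration.

```python
# Illustrative model of "map by name" between two field layouts.
# Layouts are dicts of field name -> type name; both are hypothetical.

def map_by_name(record, source_layout, target_layout):
    """Copy fields whose name and type match in both layouts.

    Unmatched target fields are left as None (an assumption of this sketch).
    """
    output = {}
    for name, type_name in target_layout.items():
        if source_layout.get(name) == type_name:
            output[name] = record.get(name)
        else:
            output[name] = None
    return output


# Hypothetical data set layout and a deliberately different output layout.
data_set_layout = {"id": "integer", "name": "string", "price": "decimal"}
output_layout = {"id": "integer", "name": "string", "price": "string", "note": "string"}

row = {"id": 7, "name": "Widget", "price": 9.5}
mapped = map_by_name(row, data_set_layout, output_layout)
print(mapped)
```

In this sketch `price` is not copied because its type differs, and `note` has no counterpart in the data set. Fields like these are the ones that would need an explicit Output mapping.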
DataSetReader attributes
Attribute | Req | Description | Possible values |
---|---|---|---|
Basic | | | |
Data Set | ✓ | Data set to read from. Clicking the select button shows all data sets available on the Server. | |
Record status | | Filters the records based on their status. The default value is Approved. | All (any status) |
Include deleted | | Configures whether to include records marked as deleted when reading from the data set. | |
Complete batches only | | Configures whether to include only batches that have all records in Approved or Committed status. | |
Output mapping | | Maps data read from the data set to the output port. By default, this is set to Map by name, and fields with matching names and types are mapped automatically. This is consistent with the common usage where the metadata on output port 0 is auto-propagated and matches the data set exactly. | |
Advanced | | | |
Max number of records | | Configures how many records to read from the data set. If the data set contains fewer matching records than specified, the component finishes once it has read all of them; it does not wait for the data set to grow. If this attribute is left empty, all records from the data set are read. | |
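The way the filtering attributes narrow down the result can be illustrated with a small Python model. This is a sketch under stated assumptions, not the component's logic: the record fields (`batch`, `status`, `deleted`), the `select_records` helper, the `"Pending"` placeholder status, and the order in which the filters are applied are all hypothetical.

```python
# Illustrative model of the DataSetReader filtering attributes.
# Field names, the "Pending" status and the filter order are assumptions.

def select_records(records, record_status="Approved", include_deleted=False,
                   complete_batches_only=False, max_records=None):
    selected = list(records)
    if complete_batches_only:
        # A batch is incomplete if any record is neither Approved nor Committed.
        incomplete = {r["batch"] for r in selected
                      if r["status"] not in ("Approved", "Committed")}
        selected = [r for r in selected if r["batch"] not in incomplete]
    if record_status != "All":
        selected = [r for r in selected if r["status"] == record_status]
    if not include_deleted:
        selected = [r for r in selected if not r["deleted"]]
    if max_records is not None:
        # Fewer matches than the limit simply yields a shorter result.
        selected = selected[:max_records]
    return selected


def ids(records):
    return [r["id"] for r in records]


rows = [
    {"id": 1, "batch": "A", "status": "Approved", "deleted": False},
    {"id": 2, "batch": "A", "status": "Approved", "deleted": True},
    {"id": 3, "batch": "B", "status": "Approved", "deleted": False},
    {"id": 4, "batch": "B", "status": "Pending", "deleted": False},
]

print(ids(select_records(rows)))
print(ids(select_records(rows, complete_batches_only=True)))
print(ids(select_records(rows, max_records=10)))
```

The last call asks for up to 10 records but only two match, so the model returns those two and stops, mirroring the described behaviour of Max number of records.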
Details
DataSetReader connects to an instance of Data Manager running on the same Server as the component and returns data from the selected data set. The component is intended for usage in “post-processing” jobs which read data from the Data Manager and load the data to the target system.
Unloading data from Data Manager
The basic pattern for reading data from Data Manager is to use the DataSetReader component, followed by any components that implement the processing logic for the data, and finally the DataSetCommit component to mark the records as fully processed.
As an example, a simple job pulling data from the Data Manager may look like this:
DataSetReader must be used together with the DataSetCommit component, which marks the records processed by the reader as Committed. If this is not done, the records will stay in Approved status and will never be purged from the Data Manager.
The job above first reads data from the specified data set with the DataSetReader, then loads the records to the warehouse and finally notifies the Data Manager that the records were successfully processed by setting their status to Committed with the DataSetCommit.
Note that DataSetCommit is in phase 5, while DataSetReader and WriteListingToDWH are both in phase 0. This two-phase approach is necessary because records read from the data set may not reach their destination: for example, an API may reject them, or the target system may be unavailable when the job runs.
In such cases, the records will not be marked as Committed in the data set and will be picked up again next time the job runs.
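The read-then-commit pattern can be modelled in a few lines of Python. This is only a simulation of the idea, not CloverDX code: the in-memory `data_set`, the `run_job` function, the loader functions, and the `TargetUnavailable` exception are hypothetical stand-ins for the data set, the job, the warehouse writer, and a load failure. The sketch also ignores partial loads.

```python
# Illustrative simulation of the two-phase read/commit pattern.
# All names here are hypothetical; nothing below is a CloverDX API.

class TargetUnavailable(Exception):
    """Raised by the hypothetical loader when the target system is down."""


def run_job(data_set, load):
    # Phase 0: read Approved records and load them to the target.
    batch = [r for r in data_set if r["status"] == "Approved"]
    try:
        for record in batch:
            load(record)
    except TargetUnavailable:
        # Phase 0 failed, so the commit phase is never reached.
        return False
    # Phase 5: reached only after phase 0 finished; mark records Committed.
    for record in batch:
        record["status"] = "Committed"
    return True


data_set = [{"id": 1, "status": "Approved"}, {"id": 2, "status": "Approved"}]
warehouse = []


def failing_load(record):
    raise TargetUnavailable("warehouse is offline")


def working_load(record):
    warehouse.append(record["id"])


first_run = run_job(data_set, failing_load)
statuses_after_failure = [r["status"] for r in data_set]

second_run = run_job(data_set, working_load)
statuses_after_success = [r["status"] for r in data_set]

print(first_run, statuses_after_failure)
print(second_run, statuses_after_success, warehouse)
```

After the failed first run both records are still Approved, so the second run picks them up again and commits them only once the load has succeeded.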
Compatibility
Version | Compatibility Notice |
---|---|
6.5 | DataSetReader introduced as an Incubation component in 6.5.0. |
6.6 | DataSetReader with expanded functionality; still an Incubation component in 6.6.0. |