
    1. CloverDX Designer tutorial

    This chapter explains the basics of CloverDX projects and shows you how to create a simple graph that reads records from a CSV file and writes them to an .xlsx file.

    Instead of reading this chapter, you can try the Tutorial that is available in the product after the first start in a new workspace or watch a video tutorial.

    Terminology

    Before creating a transformation graph, we will explain some terms used in this tutorial.

    • A workspace is a directory on your computer where you save your projects. It also contains per-workspace configuration. You chose it when you started Designer.

    • A project is a directory in the workspace. It is the location where you place data transformations and data.

    • A graph, or a transformation graph, is a recipe for a data transformation. The graph consists of components that are connected by edges.

    Creating a project

    We assume that you have downloaded and installed CloverDX Designer.

    Now is the right time to create a new project. Select File → New → CloverDX Project from the main menu:

    Figure 2. Creating a New CloverDX project

    Type the name of the project, e.g. Project_01.

    Figure 3. Selecting a name of a new project

    Creating a new data file

    Now you need a data file. You probably have some. If not, you can create an example file as shown below.

    The best practice is to place your input data into the data-in directory.

    Right-click the data-in item in the Project Explorer pane and select New → File from the context menu.

    Figure 4. Creating a flat data file

    Type a file name, e.g. input.dat. The file will be stored in the highlighted data-in subfolder.

    Figure 5. Selecting a folder for the data file

    The file will be created and opened.

    Figure 6. Still empty data file

    Enter some data records in this file; for example, copy and paste the lines below (make sure there is an empty line at the end):

    John;Smith;25000
    Peter;Brown;30000
    George;Hardy;20000
    Richard;Gordon;22000
    Mark;Taylor;40000
    Michael;Lester;18000
    George;Smith;30000
    Albert;Brown;30000

    Figure 7. Filling the file with delimited data records
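
    To make the structure of this file explicit, here is a minimal Python sketch that parses the same semicolon-delimited lines outside CloverDX. The Employee type and its field names are illustrative assumptions, not something CloverDX generates.

```python
import csv
import io
from typing import NamedTuple

# The sample records from the tutorial, one record per line.
SAMPLE = """\
John;Smith;25000
Peter;Brown;30000
George;Hardy;20000
Richard;Gordon;22000
Mark;Taylor;40000
Michael;Lester;18000
George;Smith;30000
Albert;Brown;30000
"""


class Employee(NamedTuple):
    """Hypothetical record type; the field names are our own choice."""
    first_name: str
    last_name: str
    salary: int


def parse_records(text: str) -> list[Employee]:
    """Split each non-empty line on ';' and convert the third field to int."""
    reader = csv.reader(io.StringIO(text), delimiter=";")
    return [Employee(row[0], row[1], int(row[2])) for row in reader if row]


records = parse_records(SAMPLE)
print(len(records), "records parsed")
```

    Each line is one record, and the semicolon separates its three fields. This is the structure that the metadata will later describe.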

    Remember that once you already have a CloverDX project in your workspace and have opened the CloverDX perspective, you can create further CloverDX projects in a slightly different way:

    • You can create a new CloverDX project directly from the main menu by selecting File → New → CloverDX Project, or select File → New → Project… and proceed as described above.

    • You can also right-click inside the Project Explorer pane, select either New → CloverDX Project or New → Project… from the context menu, and proceed as described above.

    Creating a graph

    After creating a new project, create a new graph: select File → New → Graph from the main menu. The graph is the recipe for your data transformation.

    Figure 8. Creating a new graph

    Give a name to the graph and choose a directory for it. We choose graph as the graph name. CloverDX Designer gives it the .grf extension automatically.

    CloverDX Designer offers the graph subfolder. It is the recommended place for graphs.

    Figure 9. Selecting a folder for the graph and a name of a new graph

    Placing components in the Graph Editor pane

    To create a graph, select the right components from the Palette of Components and place them in the Graph Editor pane. The Palette of Components is on the right side of the Graph Editor pane.

    Figure 10. Selecting components for the graph

    If the Palette is not displayed, click the arrow in the top right corner of the Graph Editor pane. The Palette will then remain open until you fold it.

    Find the FlatFileReader label in the Palette among Readers. Drag FlatFileReader from the Palette into the Graph Editor pane.

    Figure 11. Placing the first component to the Graph Editor pane

    Do the same with the SpreadsheetDataWriter component from Writers. Put these components in the Graph Editor from left to right.

    Figure 12. Placing the writer component

    If you know the component name, you can add the component using the Add Component dialog. Press Shift+Space within the Graph Editor and start typing the name.

    Figure 13. Add component dialog

    Connecting components by an edge

    Click the first output port of FlatFileReader.

    Figure 14. Connecting components by edges

    An edge appears connected to the output port of the component. Now click inside the SpreadsheetDataWriter component near its input port.

    Figure 15. Connecting components by edges

    The edge is still red and dashed because no metadata is assigned to it.

    If you miss the port, a dialog for adding a new component appears.

    In the next step, we will assign metadata to the edge.

    Extracting metadata from the input file

    Metadata is data describing the data structure.

    You can extract metadata from your flat data file or create it on your own. We will show you how to extract it from the input file.

    Right-click the first edge and select New metadata → Extract from flat file.

    Figure 16. Extracting metadata

    A wizard for metadata extraction opens. Use the Browse button to open a dialog for selecting a file.

    Figure 17. Introductory window of metadata editor

    Select the input.dat file in the data-in directory and click the OK button.

    Figure 18. Selecting data file

    The Metadata Editor fills up:

    Figure 19. Metadata Editor (introductory pane filled)

    Click Next to specify metadata fields.

    Figure 20. Metadata Editor (editing pane)

    As you can see, the wizard guessed that the records consist of three fields, and it also recognized that the values of the third field are integers.
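
    The idea behind such a guess can be sketched in a few lines of Python. This is only an illustration of type inference on delimited data under simple assumptions (two possible types, a fixed delimiter); it is not the algorithm the CloverDX wizard actually uses, and the function names are hypothetical.

```python
def guess_field_type(values):
    """Return "integer" if every value parses as an int, otherwise "string"."""
    try:
        for value in values:
            int(value)
    except ValueError:
        return "string"
    return "integer"


def guess_metadata(lines, delimiter=";"):
    """Split the lines into columns and guess a default name and type for each."""
    rows = [line.split(delimiter) for line in lines if line.strip()]
    columns = list(zip(*rows))  # transpose rows into columns
    return [
        (f"Field{index}", guess_field_type(column))
        for index, column in enumerate(columns, start=1)
    ]


sample = ["John;Smith;25000", "Peter;Brown;30000", "George;Hardy;20000"]
metadata = guess_metadata(sample)
print(metadata)
```

    For the sample lines, the sketch reports two string fields followed by one integer field, with default names in the same Field1, Field2, Field3 pattern.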

    You can replace the three default field names (Field1, Field2 and Field3) with more descriptive ones: FirstName, LastName and Salary.

    To do that, click the Field1 item and enter the new field name.

    Figure 21. Renaming a field

    Do the same with the other two field names. The result will look like this:

    Figure 22. All fields have been renamed

    Now click Finish. This way you have created metadata. The metadata has been assigned to the edge.

    You can extract metadata on edges and on input components.

    Assigning metadata to the edges

    If you have metadata assigned to the edge from the previous step, you do not have to assign it again.

    If you have any edge without metadata and you would like to assign the metadata to the edge, right-click the edge and select the Select Metadata item from the context menu.

    Figure 23. Assigning metadata to an edge

    Select the desired metadata by clicking its item. The edge with assigned metadata becomes solid.

    Setting up Readers (FlatFileReader)

    To set up the FlatFileReader, double-click this component in the Graph Editor pane. The component editor opens.

    Figure 24. Editing a Reader

    Click the File URL attribute row in the component editor. A button appears in the row.

    Figure 25. Editing a Reader

    Click the small button at the end of the row to open the File URL dialog. The input file is in the data-in directory.

    Figure 26. Selecting the file in File URL dialog

    Setting up Writers (SpreadsheetDataWriter)

    When you set up writers, the most important step is to specify the output file to which the data should be written.

    Double-click the SpreadsheetDataWriter component. Click the File URL attribute row in the component editor. A button appears in this row; click it.

    In the File URL dialog, select the output directory (data-out) and enter the file name, e.g. output.xlsx.

    Figure 27. Parameters of the project

    Click OK to use the new component configuration.

    You have created a transformation graph. Press Ctrl+S to save it. The graph is now ready to be run.

    Running the graph

    To run the graph, right-click anywhere inside the Graph Editor pane and select Run CloverDX Graph from the context menu. The graph will run.

    Figure 28. Running the graph

    In the Console tab below the Graph Editor pane, you can see the graph run report. If everything is OK, the report shows that the graph execution was successful.

    You should see the following window, with the numbers of parsed records displayed below the edges:

    Figure 29. Execution of graph was successful
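
    The record counts shown at the edges can be thought of as a counter on the connection between two components. The following Python sketch illustrates that idea with a hypothetical CountingEdge wrapper; it is a conceptual model only and says nothing about how CloverDX implements edges internally.

```python
class CountingEdge:
    """Pass records from a source to a consumer and count them on the way."""

    def __init__(self, source):
        self._source = source
        self.count = 0

    def __iter__(self):
        for record in self._source:
            self.count += 1
            yield record


lines = [
    "John;Smith;25000",
    "Peter;Brown;30000",
    "George;Hardy;20000",
    "Richard;Gordon;22000",
    "Mark;Taylor;40000",
    "Michael;Lester;18000",
    "George;Smith;30000",
    "Albert;Brown;30000",
]

# "Reader": splits each line into fields. "Writer": collects the records.
edge = CountingEdge(line.split(";") for line in lines)
written = list(edge)

print(edge.count, "records passed through the edge")
```

    With the sample input file from this tutorial, the counter ends at eight, one for each line of input.dat.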

    If you would like to see more detailed information about the graph run, double-click the Console tab. The tab will cover the whole window. You can restore the original size of the tab by double-clicking it again.

    Opening the output file

    After running a graph, the file structure of the Project Explorer pane refreshes automatically. Expand the data-out item to see the output.xlsx file.

    Figure 30. Refreshing the output folder

    Double-click the file to open it in an appropriate spreadsheet editor.

    Summary

    We have learned to

    • create a transformation graph

    • place components in a graph

    • assign metadata to an edge

    • run a graph

    • read data from a CSV file

    • write data to an Excel spreadsheet

    What to do next

    You can continue with Filtering the records or Sorting the records.

    You can also explore the built-in examples: Help → CloverDX Examples

    Filtering the records

    In this chapter we will learn how to filter records with the Filter component.

    This chapter builds on the graph from Creating a graph.

    Inserting the filter

    The component for filtering is called Filter. You can find it in the Transformers category. Drag the component onto the edge between FlatFileReader and SpreadsheetDataWriter.

    Figure 31. Filter component attributes
    Figure 32. Filter was added to the graph

    The filtering condition is not specified yet, so you can see an error on the component.

    Setting up the filter component

    Double-click the Filter component to open the component editor.

    Figure 33. Component editor - Filter component attributes

    Click the Filter expression attribute row in the component editor. The Filter Editor will open:

    Figure 34. Filter editor

    Select the salary item and double-click it. The $in.0.salary expression will appear on the blue background in the pane at the bottom.

    Figure 35. Selecting a field

    Note that CloverDX validates expressions automatically. At this point it reports an error, because the expression is not yet a complete condition.

    Click to the right of the $in.0.salary expression.

    Click the "greater than" sign. It will appear in the pane to the right of $in.0.salary.

    Figure 36. Selecting a "greater" sign

    You need to complete the expression to make it valid. Click to the right of the expression and type 24000. Once the number is entered, the expression becomes valid again.

    The number is the salary threshold used to filter the incoming data flow. Only records with a salary higher than 24000 will be sent out.

    Figure 37. Filter expression defined

    Click OK to close the Filter Editor.

    When you save the graph, you can see that the warning icon has disappeared from the Filter component:

    Figure 38. Warning icon has disappeared

    The Filter is configured and you can run the graph.
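
    To check what the condition $in.0.salary > 24000 should let through, here is a minimal Python sketch that applies the same comparison to the sample records from this tutorial. The tuple layout and the passes_filter name are our own assumptions for illustration; the Filter component itself evaluates the expression inside CloverDX.

```python
# Sample records as (first name, last name, salary) tuples.
RECORDS = [
    ("John", "Smith", 25000),
    ("Peter", "Brown", 30000),
    ("George", "Hardy", 20000),
    ("Richard", "Gordon", 22000),
    ("Mark", "Taylor", 40000),
    ("Michael", "Lester", 18000),
    ("George", "Smith", 30000),
    ("Albert", "Brown", 30000),
]

SALARY = 2  # index of the salary field in each tuple


def passes_filter(record, threshold=24000):
    """Python counterpart of the condition: salary > 24000."""
    return record[SALARY] > threshold


accepted = [record for record in RECORDS if passes_filter(record)]

for record in accepted:
    print(record)
```

    Five of the eight sample records have a salary above 24000, so this is the number of records you can expect on the edge leaving the Filter.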

    Best practices

    It’s better to filter records first and sort them afterwards than to sort first and filter afterwards, because the sort then has fewer records to process.

    If you need to split data into multiple (more than two) streams, use Partitioner.
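
    The first recommendation can be illustrated with a small Python sketch. Both orders of operations give the same output, but filtering first means fewer records reach the sort. The function names and the tuple layout are assumptions made for this example only.

```python
RECORDS = [
    ("John", "Smith", 25000),
    ("Peter", "Brown", 30000),
    ("George", "Hardy", 20000),
    ("Richard", "Gordon", 22000),
    ("Mark", "Taylor", 40000),
    ("Michael", "Lester", 18000),
    ("George", "Smith", 30000),
    ("Albert", "Brown", 30000),
]


def filter_then_sort(records, threshold):
    """Filter first; return the result and how many records were sorted."""
    kept = [r for r in records if r[2] > threshold]
    return sorted(kept, key=lambda r: r[2]), len(kept)


def sort_then_filter(records, threshold):
    """Sort first; return the result and how many records were sorted."""
    ordered = sorted(records, key=lambda r: r[2])
    return [r for r in ordered if r[2] > threshold], len(ordered)


result_a, sorted_a = filter_then_sort(RECORDS, 24000)
result_b, sorted_b = sort_then_filter(RECORDS, 24000)

print("same result:", result_a == result_b)
print("records sorted:", sorted_a, "vs", sorted_b)
```

    On the sample data the difference is only five versus eight sorted records, but the gap grows with the input size and with how selective the filter is.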

    See also

    Documentation on ExtFilter

    Two data streams

    You can use the Filter component to split a data stream into two data streams. Connect an edge to the second output port of the Filter to get the rejected records too.

    Figure 39. Splitting one data stream

    You can use the same condition as in the previous example. The records matching the filter condition will be passed to the first output port; the remaining (rejected) records will go to the second output port.
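
    The two-port behaviour can be sketched in Python as a function that returns two lists, one per output port. The name split_by_condition and the tuple layout are hypothetical and serve only to illustrate the routing.

```python
RECORDS = [
    ("John", "Smith", 25000),
    ("Peter", "Brown", 30000),
    ("George", "Hardy", 20000),
    ("Richard", "Gordon", 22000),
    ("Mark", "Taylor", 40000),
    ("Michael", "Lester", 18000),
    ("George", "Smith", 30000),
    ("Albert", "Brown", 30000),
]


def split_by_condition(records, condition):
    """Route matching records to the first list and the rest to the second."""
    accepted, rejected = [], []
    for record in records:
        if condition(record):
            accepted.append(record)
        else:
            rejected.append(record)
    return accepted, rejected


accepted, rejected = split_by_condition(RECORDS, lambda r: r[2] > 24000)

print("first port: ", len(accepted), "records")
print("second port:", len(rejected), "records")
```

    Every input record ends up on exactly one of the two ports, so the two counts always add up to the number of incoming records.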

    Sorting the records

    In this chapter we will learn how to sort records with the ExtSort component.

    This chapter builds on the graph from Creating a graph.

    Adding ExtSort

    Figure 40. Adding ExtSort component

    Setting up the ExtSort component

    Double-click the ExtSort component.

    Click the Sort key row in the component editor. A button appears in this row.

    Figure 41. Editing ExtSort component

    Use the button to open the Sort key dialog:

    Figure 42. Selecting a Sort key

    Select the key that will be used for sorting the incoming records. Drag the Salary item from the Fields pane and drop it into the Sort key pane. Do the same with the LastName and FirstName items, in this order.

    Figure 43. Sort key selection

    Click the cell in the Order column to the right of the Salary item and select Descending instead of Ascending.

    This way you have selected the fields by which the incoming records will be sorted. The records will be sorted by the Salary field values in descending order. Records with the same salary will be sorted by LastName in ascending order. Records with the same salary and last name will be sorted by FirstName in ascending order.

    Thus, any person with a salary of 25000 would be sent out after any person with a salary of 28000. Within the same salary, any Brown would be sent out through the output port before any Smith. And within the same salary and last name, any John Smith would be sent out before any Peter Smith.

    In other words, the fields located higher in the Key parts pane have higher sorting priority.

    Figure 44. Sort key selection

    After you click OK, a sequence of field names separated by semicolons will appear in the component editor:

    Figure 45. Sort key appearance

    Run the graph and see the results.

    You can see that all salaries are sorted in descending order. Note that within the same salary of 30000, both Browns lie above George Smith, and Albert Brown lies above Peter Brown.
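
    The ordering described in this section can be reproduced outside CloverDX with a multi-key sort. The Python sketch below assumes the sample records from the tutorial as tuples and negates the salary to get descending order; it is an illustration of the sort key semantics, not of how ExtSort is implemented.

```python
RECORDS = [
    ("John", "Smith", 25000),
    ("Peter", "Brown", 30000),
    ("George", "Hardy", 20000),
    ("Richard", "Gordon", 22000),
    ("Mark", "Taylor", 40000),
    ("Michael", "Lester", 18000),
    ("George", "Smith", 30000),
    ("Albert", "Brown", 30000),
]


def sort_key(record):
    """Salary descending, then last name ascending, then first name ascending."""
    first_name, last_name, salary = record
    return (-salary, last_name, first_name)


ordered = sorted(RECORDS, key=sort_key)

for first_name, last_name, salary in ordered:
    print(salary, last_name, first_name)
```

    In the output, Mark Taylor comes first, and the three records with a salary of 30000 appear in the order Albert Brown, Peter Brown, George Smith, which matches the result described above.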