This section walks you through the steps to create, configure, run, and review a data quality task using Diagramming Configuration in LakeFusion. The process is designed to ensure your data meets the required standards for accuracy, consistency, and reliability.
Step 1: Creating a Data Quality Task
Open the Data Quality card from Home or from the left navigation pane.
Click Create Quality Task.
You have two options to define your task:
Notebook Configuration: Use a predefined notebook template to configure your data quality tasks.
Diagramming: Use a visual interface to create and link data quality tasks in a flow.
Provide the following task details:
Quality Task Name (following naming standards)
Detailed task description
Choose a task type
Target dataset selection
Task execution frequency (if required)
Step 2: Configuration
After you choose the task type, the task is in Draft status. It changes to Configured once you complete this step.
Click on the created task and use the visual interface to link data quality tasks (e.g., Null Handler, String Cleaner, Filter, Value Mapper).
Click each task box to configure it. For example, in the Null Handler, select the column, choose the Replace Type, and enter the Replacement Value.
Hover over a task box to start a connection, hold down the mouse button, and drag a line to the next task box. The connections define the order of operations.
Click on Validate Flow to ensure there are no errors (e.g., cyclic errors). You can only save the flow if there are no errors.
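To illustrate what a cyclic error is and how connections determine the order of operations, here is a minimal Python sketch. It is not LakeFusion's implementation: it assumes a flow can be modeled as a list of task names plus directed connections, and the names validate_flow, null_handler, and string_cleaner, as well as the sample column "city", are hypothetical stand-ins for the task boxes described above.

```python
from collections import deque


def validate_flow(nodes, edges):
    """Return task names in execution order; raise ValueError on a cycle.

    Uses Kahn's algorithm: repeatedly pick tasks with no unprocessed
    upstream connection. Tasks left over at the end are part of a cycle
    (or sit downstream of one).
    """
    indegree = {name: 0 for name in nodes}
    downstream = {name: [] for name in nodes}
    for src, dst in edges:
        downstream[src].append(dst)
        indegree[dst] += 1

    ready = deque(name for name in nodes if indegree[name] == 0)
    order = []
    while ready:
        name = ready.popleft()
        order.append(name)
        for nxt in downstream[name]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(nodes):
        stuck = [name for name in nodes if name not in order]
        raise ValueError(f"cyclic flow involving: {stuck}")
    return order


def null_handler(rows, column, replacement):
    """Hypothetical Null Handler: replace None in `column` with `replacement`."""
    return [
        {**row, column: replacement if row[column] is None else row[column]}
        for row in rows
    ]


def string_cleaner(rows, column):
    """Hypothetical String Cleaner: trim surrounding whitespace in `column`."""
    return [{**row, column: row[column].strip()} for row in rows]


# A two-box flow: Null Handler -> String Cleaner (assumed example data).
nodes = ["null_handler", "string_cleaner"]
edges = [("null_handler", "string_cleaner")]
steps = {
    "null_handler": lambda rows: null_handler(rows, "city", "UNKNOWN"),
    "string_cleaner": lambda rows: string_cleaner(rows, "city"),
}

order = validate_flow(nodes, edges)
rows = [{"city": None}, {"city": "  Pune "}]
for name in order:
    rows = steps[name](rows)
```

If a connection were also drawn from the String Cleaner back to the Null Handler, validate_flow would raise ValueError, which corresponds to the kind of error that prevents a flow from being saved.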
Step 5: View Results
After configuration, a job is created. Its details are available in the Details tab.
View task execution results in the Results tab.
Monitor the task execution status (e.g., In Progress or Completed) in the Runs tab.