Integration Hub

Integration Task creation

  1. Navigate to the Integration Hub after completing Match Maven

  2. Configure a new pipeline with the required parameters:

  • Task Name

  • Entity selection

  • Model specification

  3. Execute task creation

  4. Access the workflow configuration via the Databricks environment redirect

From the Databricks environment, you can then inspect the entire pipeline from the initial stage through to the final stage (the email notification task).
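The task-creation steps above amount to submitting three required parameters (Task Name, Entity, Model). The following is a minimal, hypothetical sketch of assembling and validating such a task configuration; the field names and helper function are illustrative assumptions, not LakeFusion's actual API.

```python
# Hypothetical sketch: the required parameters for an Integration Hub task.
# Field names and the validation helper are illustrative only.
REQUIRED_FIELDS = ("task_name", "entity", "model")

def build_task_config(task_name: str, entity: str, model: str) -> dict:
    """Assemble a pipeline-task configuration and check required parameters."""
    config = {"task_name": task_name, "entity": entity, "model": model}
    missing = [field for field in REQUIRED_FIELDS if not config.get(field)]
    if missing:
        raise ValueError(f"Missing required parameters: {missing}")
    return config

# Example: a task for a Customer entity using a hypothetical match model name.
cfg = build_task_config("customer_sync", "Customer", "match_model_v1")
print(cfg)
```

In practice, submitting this configuration is what triggers the redirect into the Databricks workspace, where the generated workflow can be reviewed end to end.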

Having explored the comprehensive User Journey of LakeFusion, including platform navigation, data configuration, preprocessing steps, and advanced Match-Merge operations, we now transition to understanding the Data Flow within the system. The upcoming section will outline how data moves through various components of the platform.

