Data Warehouse Construction Standards - Data Modeling Standards

3.1 Data Modeling Standards

a. Horizontal Layering:

  • Explanation

    Our layered design approach is an essential byproduct of our data architecture strategy, adhering to strict standards during the modeling phase for optimal results.

  • Layer Standards

    • The Operational Data Store (ODS)
      • Design Methodology:
        • Store raw data with minimal processing in the data warehouse system, maintaining a structure that is similar to the source system.
      • Main Functions:
        • Synchronize and store foundational data in the data warehouse to address data silo issues and ensure the integrity of data integration .
        • To ensure data persistence, keeping tables and data in perfect sync.
        • Perform regular synchronization and add synchronization timestamps to the tables to capture their temporal variability.
      • Data Processing:
        • Handling of anomalies and erroneous data.
      • Naming Convention:
        • Layer abbreviation_Source System_Source System Table Name
        • Example: ods_oms_order
- The Data Warehouse Details (DWD)
    - Design Methodology:
        - A core layer of the design that obtains data from the ODS layer and establishes data models according to subjects. It cleans and processes the original data for consistency and stores detailed factual data.
    - Main Functions:
        - Data Segmentation: Categorizing **Analytical Subject** for Precise Insights
        - Cleanse the data and process it for consistency to resolve issues of poor data quality and inconsistencies in definitions, thereby ensuring the consistency, integrity, validity, and accuracy of data integration.
        - Store the most detailed granularity of metrics to address the issue of having numerous development tasks.
        - Perform regular synchronization and add synchronization timestamps to the tables to capture their temporal variability.
    - Data Processing:
        - Handle null values
        - Verify data validity
        - Standardize data formats
        - Unify data standards
        - Perform data transform
        - Clean data according to business rules.
    - Naming Convention:
        - Layer abbreviation_Subject abbreviation_Full name of stored content
        - Example: dwd_mkt_sign
    - Construction Method:
        - Select business processes
            - For example: For the supply chain subject area, select the purchase order product refund amount based on the indicator document.
        - Declare granularity
            - For example: Analyze business forms/find the base table for purchase refunds, extract the business primary key granularity of the table, return order number, product number.
        - Confirm dimensions
            - For example: Purchase order number, product, product type, supplier, time, purchaser.
        - Confirm facts
            - For example: Purchase order product refund amount, purchase order product refund quantity.
- The Data Warehouse Service (DWS)
    - Design Methodology:
    Using the **analysis object** as the modeling driver, build a summary indicator table at a common granularity based on the common requirements of the upper-level applications and products. Physically model the wide table to provide a unified calculation basis and data standard for the application layer, improving efficiency.
    - Main Functions:
        - Meet 90% of common requirements, address inconsistencies in standards and the issue of having too many development tasks.
        - Implement wide-table processing to address poor performance issues and improve analysis efficiency.
        - Regular synchronization, add table synchronization timestamps to reflect the temporal variability.
    - Data Processing:
        - Dimension Roll-up
        - Metric Aggregation
        - Metric Consolidation
    - Naming Convention
        - Layer abbreviation_Object abbreviation_Full name of stored content
        - Example: dws_order_orderdetail
- The Application Data Service (ADS)
    - Design Methodology:
        - Based on business requirements, store personalized report data of data products, which can be directly provided for query and display, ensuring efficiency.
    - Main Functions:
        - Personalized design to solve performance issues and ensure data analysis efficiency.
    - Data Processing:
        - High-level summarization, individualized processing.
    - Name convention:
        - Layer abbreviation_Subject abbreviation_Full name of form.
        - Example: ADS/DM_mkt_XSJSC
  • Data Ingestion Principal:
    • In layered data modeling, each layer builds on the previous one, starting with ingesting raw data and resulting in a final view containing clean transformed data that is highly functional.

请我喝杯咖啡吧~

支付宝
微信