Introduction
This guide is designed for data developers and data governance teams to effectively use Octopai's automated data lineage solution in managing changes to data flows within a CI/CD DataOps process. It provides best practices for setting up multiple environments, performing impact analyses, and conducting risk assessments, ensuring that changes are managed smoothly and governance standards are upheld.
Setting Up Your Environments
For data developers, it's crucial to establish multiple environments—development, QA, staging, and production—to manage changes efficiently. This setup allows you to rigorously test changes before they reach production, while also aligning with data governance protocols.
As a best practice, give each environment a distinct role within Octopai:
- Development Environment: Data developers introduce and iterate on changes here, ensuring that the changes meet initial requirements.
- QA Environment: Data governance teams typically validate changes here against governance policies and standards. Test plans are executed to ensure compliance and data integrity.
- Staging Environment: This environment mimics production and is used for final validation by both data developers and governance teams.
- Production Environment: The live environment where data is actively used by end-users and business processes, requiring strict governance oversight.
Important Note: It is essential to ensure that your organization’s contract includes licensing for multiple environments. This setup should be coordinated with the Octopai Support team as part of your licensing agreement. Having this in place allows you to fully leverage Octopai’s capabilities across all necessary environments, ensuring a smooth and compliant change management process.
Applying Impact Analysis and Risk Assessment
Understanding the potential impact of changes on data flows is critical for both data developers and governance teams.
Triggering Impact Analysis:
- Use Octopai to identify upstream and downstream dependencies that may be affected by the change.
- Ensure that the impact analysis aligns with governance policies, documenting any risks or compliance issues.
Conducting Risk Assessment:
- Evaluate the technical risks associated with the change, such as potential disruptions to dependent systems.
- Assess the risks from a compliance perspective, ensuring that all regulatory requirements are met.
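To make the idea of upstream and downstream dependencies concrete, the following is a minimal sketch of how impact can be computed from a list of lineage edges. It is not Octopai's implementation or API: the edge list, the object names, and the helper functions `build_graph` and `reachable` are all hypothetical and only illustrate the traversal that an impact analysis performs.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Return (downstream, upstream) adjacency maps from (source, target) pairs."""
    downstream, upstream = defaultdict(set), defaultdict(set)
    for source, target in edges:
        downstream[source].add(target)
        upstream[target].add(source)
    return downstream, upstream


def reachable(adjacency, start):
    """Breadth-first search: every node reachable from start, excluding start."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen and neighbor != start:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Hypothetical lineage edges (source object -> target object), for illustration only.
EDGES = [
    ("crm.customers", "stg.customers"),
    ("stg.customers", "dw.dim_customer"),
    ("erp.orders", "stg.orders"),
    ("stg.orders", "dw.fact_sales"),
    ("dw.dim_customer", "dw.fact_sales"),
    ("dw.fact_sales", "report.revenue"),
]

downstream, upstream = build_graph(EDGES)
changed = "stg.customers"
print("Downstream impact:", sorted(reachable(downstream, changed)))
print("Upstream sources:", sorted(reachable(upstream, changed)))
```

In this sketch, everything in the downstream set is a candidate for regression testing, and everything in the upstream set is a candidate cause if the changed object receives unexpected data.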
Figure: Upstream Impact Analysis
Figure: Upstream Impact Analysis comparison, QA vs. Production
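A QA-versus-production comparison of upstream lineage amounts to a set difference. The sketch below assumes the lineage of each environment is available as a list of (source, target) pairs. The edge lists and the functions `upstream_of` and `compare_upstream` are hypothetical illustrations, not part of Octopai.

```python
def upstream_of(edges, target):
    """Collect every object that feeds `target`, directly or indirectly."""
    parents = {}
    for source, dest in edges:
        parents.setdefault(dest, set()).add(source)
    seen, stack = set(), [target]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def compare_upstream(qa_edges, prod_edges, target):
    """Diff the upstream sets of `target` between QA and production."""
    qa = upstream_of(qa_edges, target)
    prod = upstream_of(prod_edges, target)
    return {
        "added_in_qa": sorted(qa - prod),
        "removed_in_qa": sorted(prod - qa),
        "unchanged": sorted(qa & prod),
    }


# Hypothetical lineage for each environment, for illustration only.
PROD_EDGES = [
    ("erp.orders", "stg.orders"),
    ("stg.orders", "dw.fact_sales"),
    ("dw.fact_sales", "report.revenue"),
]
QA_EDGES = PROD_EDGES + [
    ("crm.customers", "dw.dim_customer"),
    ("dw.dim_customer", "dw.fact_sales"),
]

diff = compare_upstream(QA_EDGES, PROD_EDGES, "report.revenue")
for label, objects in diff.items():
    print(label, objects)
```

Objects listed under `added_in_qa` are new dependencies introduced by the change. Objects under `removed_in_qa` are dependencies the change drops, which is often the riskier case to review.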
Automating Data Refresh for Testing
Automating the data refresh process in Octopai ensures that all environments reflect the most recent changes, which is essential for both effective development and governance.
Setting Up Automation:
- Using Jenkins: For teams using Jenkins in their CI/CD pipeline, Octopai can be integrated to automate the data refresh process. Data developers then work with the latest data lineage, and governance teams can verify compliance continuously.
- Using the Octopai Client: The Octopai Client offers a built-in feature for scheduling automatic data refreshes, which keeps the lineage up to date for both development and governance work.
- When to Apply Jenkins: Jenkins makes the most sense in environments with established CI/CD pipelines where rigorous testing and governance are required. Otherwise, Octopai's scheduling features can serve as a simpler alternative.
- Octopai's automatic refresh can be configured at the connection (metadata source) level.
https://octopai.zendesk.com/hc/en-us/articles/15788388136465-Admin-User-Octopai-Client
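Because refreshes are applied per connection, a pipeline step may want to confirm that every connection was refreshed recently before tests run. The sketch below shows only that freshness check. It assumes the last-refresh timestamps have already been collected by some means outside this example. The connection names, the 24-hour threshold, and the function `stale_connections` are hypothetical and do not represent an Octopai interface.

```python
from datetime import datetime, timedelta, timezone


def stale_connections(last_refreshed, now, max_age):
    """Return the names of connections whose last refresh is older than max_age."""
    return sorted(
        name for name, refreshed_at in last_refreshed.items()
        if now - refreshed_at > max_age
    )


# Hypothetical timestamps per connection (metadata source), for illustration only.
now = datetime(2024, 6, 1, 12, 0, tzinfo=timezone.utc)
last_refreshed = {
    "sqlserver_dw": now - timedelta(hours=2),
    "ssis_etl": now - timedelta(hours=30),
    "powerbi_reports": now - timedelta(days=3),
}

stale = stale_connections(last_refreshed, now, max_age=timedelta(hours=24))
if stale:
    # A CI job would typically fail here or request a refresh before continuing.
    print("Lineage is stale for:", ", ".join(stale))
else:
    print("All connections refreshed within the allowed window.")
```

A job scheduler such as Jenkins could run a check like this as an early stage so that impact analysis is never performed against outdated lineage.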
Simulating Changes in Pre-Production
Simulating changes in a pre-production environment is essential to avoid unintended consequences in production, particularly when governance standards must be met.
Simulating Impact:
- Use Octopai to simulate the change and understand its technical implications across environments.
- Ensure that the simulated impact is analysed for compliance risks, validating that the change adheres to governance policies.
Creating a Regression Test Plan:
- Develop a test plan that covers all critical data flows impacted by the change.
- Validate that the test plan includes checks for compliance and data integrity.
Use Octopai's E2E Column Lineage export capability as a foundation for your test plan.
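One way to turn a column lineage export into a test plan is to list every column downstream of the changed column, nearest first. The sketch below assumes the export is a CSV file with the columns `source_object`, `source_column`, `target_object`, and `target_column`. That layout, the sample rows, and the functions `load_edges` and `build_test_plan` are assumptions made for illustration, and the real export format may differ.

```python
import csv
import io
from collections import defaultdict, deque

# Hypothetical export content; the actual column names may differ.
EXPORT_CSV = """source_object,source_column,target_object,target_column
stg.customers,customer_id,dw.dim_customer,customer_key
stg.customers,email,dw.dim_customer,email
dw.dim_customer,customer_key,dw.fact_sales,customer_key
dw.fact_sales,customer_key,report.revenue,customer
"""


def load_edges(csv_text):
    """Map each (object, column) to the set of (object, column) pairs it feeds."""
    edges = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        source = (row["source_object"], row["source_column"])
        target = (row["target_object"], row["target_column"])
        edges[source].add(target)
    return edges


def build_test_plan(edges, changed_column):
    """List one check per column downstream of changed_column, nearest first."""
    plan, seen = [], {changed_column}
    queue = deque([(changed_column, 0)])
    while queue:
        column, depth = queue.popleft()
        for target in sorted(edges.get(column, ())):
            if target not in seen:
                seen.add(target)
                plan.append({"object": target[0], "column": target[1], "hops": depth + 1})
                queue.append((target, depth + 1))
    return plan


plan = build_test_plan(load_edges(EXPORT_CSV), ("stg.customers", "customer_id"))
for check in plan:
    print(f"Verify {check['object']}.{check['column']} ({check['hops']} hop(s) from the change)")
```

Each entry in the plan can then be paired with a concrete check, such as a row-count comparison or a null-rate threshold, agreed between developers and the governance team.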
Troubleshooting in Production
When a change leads to an issue in production, Octopai helps both data developers and governance teams quickly identify and resolve the problem.
Using Lineage Information for Troubleshooting:
- Trace the issue back to its source within the data flow, identifying the root cause.
- Ensure that the resolution process aligns with governance standards, updating documentation as needed.
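Tracing an issue back to its source can be pictured as walking the lineage backwards from the failing object until objects with no upstream are reached. The following sketch returns one shortest path per root source. The edge list and the function `trace_to_roots` are hypothetical and only illustrate the idea; they are not an Octopai feature or API.

```python
from collections import deque


def trace_to_roots(edges, failing):
    """Return one shortest path per root source, each ending at `failing`."""
    parents = {}
    for source, target in edges:
        parents.setdefault(target, []).append(source)

    paths, seen = [], {failing}
    queue = deque([[failing]])
    while queue:
        path = queue.popleft()
        upstream = parents.get(path[0], [])
        if not upstream:
            paths.append(path)  # path[0] has no upstream, so it is a root source
            continue
        for source in upstream:
            if source not in seen:
                seen.add(source)
                queue.append([source] + path)
    return paths


# Hypothetical lineage edges (source object -> target object), for illustration only.
EDGES = [
    ("erp.orders", "stg.orders"),
    ("stg.orders", "dw.fact_sales"),
    ("crm.customers", "stg.customers"),
    ("stg.customers", "dw.dim_customer"),
    ("dw.dim_customer", "dw.fact_sales"),
    ("dw.fact_sales", "report.revenue"),
]

for path in trace_to_roots(EDGES, "report.revenue"):
    print(" -> ".join(path))
```

Each printed path is a chain of objects to inspect in order, starting from the original source, which narrows the search for the step where the data first went wrong.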
Documenting the Solution:
- Document the technical steps taken to resolve the issue.
- Update governance documentation to reflect the resolution and any changes to compliance processes.
Best Practices Summary
- Environment Setup: Maintain separate development, QA, staging, and production environments to support both development and governance needs. Ensure that your Octopai license includes support for multiple environments by coordinating with the Support team.
- Impact Analysis: Conduct thorough impact analyses to understand both technical and governance implications of changes.
- Automation: Integrate Octopai into your CI/CD pipeline to automate data refreshes, using Jenkins and/or other scheduling engines.
- Pre-Production Testing: Simulate changes in a pre-production environment to validate technical and governance compliance before going live.
- Troubleshooting: Use Octopai’s lineage information for efficient troubleshooting that meets both development and governance standards.
By following these guidelines and leveraging Octopai effectively, data developers and governance teams can collaborate to manage data flow changes with confidence, minimizing risks and ensuring compliance across the board.