ADF Best Practices

Making a good Data Factory implementation.

Posted by radekrezac on January 16, 2024

ADF Best Practices Areas

We can apply best practices in the following areas:

  • Platform Setup
  • Reusable Code
  • Parallelism
  • Security
  • Monitoring & Error Handling
  • Documentation
  • Development Conventions

Platform Setup

Platform setup best practices should be applied to every development project in which scripts or code are stored as plain text.

Platform setup can be divided into these points:

  • Development connected to source control.
  • Multiple code repository branches that align to DevOps backlog features.
  • ADF debug feature to perform basic end-to-end testing.
  • Pull requests reviewed before merging into the main delivery branch and published to the development Data Factory service.
  • Test, UAT, and Production are not connected to source control.

Reusable Code

Reusing code saves time, improves consistency, and leads to higher-quality, more maintainable software. Methods for achieving reusability in ADF:

  • Dynamic Linked Services
  • Generic Datasets
  • Pipeline Hierarchies
  • Dynamic Annotation (More transparent monitoring)
  • Object variables
  • Metadata Driven Processing
  • Using Templates

Dynamic Linked Services give us the flexibility to parameterize any key/value pair in the linked service definition. In our case, we parameterize the Secret name; the connection string itself is stored in a Key Vault secret version.
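
As a rough illustration, the sketch below builds the kind of JSON definition a parameterized linked service might have and prints it. The names LS_AzureSql_Dynamic and LS_KeyVault, the SecretName parameter, and the choice of an Azure SQL Database target are assumptions made for the example, not details taken from this post.

```python
import json


def build_dynamic_linked_service(name: str, key_vault_ls: str) -> dict:
    """Return an ADF-style linked service definition with one parameter."""
    return {
        "name": name,
        "properties": {
            # Assumption: an Azure SQL Database target; the post does not name one.
            "type": "AzureSqlDatabase",
            # Parameter supplied by the caller (dataset or pipeline) at runtime.
            "parameters": {"SecretName": {"type": "String"}},
            "typeProperties": {
                # The connection string is not stored here; it is read from Key Vault.
                "connectionString": {
                    "type": "AzureKeyVaultSecret",
                    "store": {
                        "referenceName": key_vault_ls,
                        "type": "LinkedServiceReference",
                    },
                    # The secret to read is chosen by the parameter value.
                    "secretName": {
                        "value": "@linkedService().SecretName",
                        "type": "Expression",
                    },
                }
            },
        },
    }


# Hypothetical names, used only for illustration.
linked_service = build_dynamic_linked_service("LS_AzureSql_Dynamic", "LS_KeyVault")
print(json.dumps(linked_service, indent=2))
```

With this shape, one linked service can serve many databases: each caller only supplies a different Secret name. No secret version is pinned in this sketch.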

Generic Datasets use Dynamic Linked Services, to which they pass dynamic parameters. In our case, the dataset passes the Secret name on to the linked service, which uses it to read the connection string from a Key Vault secret version.
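
A generic dataset of this kind could look roughly like the definition built below. This is a minimal sketch: the dataset and linked service names, the SchemaName and TableName parameters, and the Azure SQL table type are all hypothetical and chosen only to show how a dataset parameter is forwarded to a parameterized linked service.

```python
import json


def build_generic_dataset(name: str, linked_service_name: str) -> dict:
    """Return an ADF-style dataset definition that forwards a parameter."""
    return {
        "name": name,
        "properties": {
            "linkedServiceName": {
                "referenceName": linked_service_name,
                "type": "LinkedServiceReference",
                # Forward the dataset's own parameter to the linked service.
                "parameters": {
                    "SecretName": {
                        "value": "@dataset().SecretName",
                        "type": "Expression",
                    }
                },
            },
            # Parameters the pipeline activity must supply when using the dataset.
            "parameters": {
                "SecretName": {"type": "string"},
                "SchemaName": {"type": "string"},
                "TableName": {"type": "string"},
            },
            # Assumption: an Azure SQL table; the post does not name a store type.
            "type": "AzureSqlTable",
            "typeProperties": {
                "schema": {"value": "@dataset().SchemaName", "type": "Expression"},
                "table": {"value": "@dataset().TableName", "type": "Expression"},
            },
        },
    }


# Hypothetical names, used only for illustration.
dataset = build_generic_dataset("DS_AzureSql_Generic", "LS_AzureSql_Dynamic")
print(json.dumps(dataset, indent=2))
```

Because nothing in the dataset is hard-coded, a single definition can stand in for any table reachable through the linked service; the pipeline decides the secret, schema, and table at run time.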