Exploring Modern Data Integration Scenarios With Azure Cloud
In our time working with Azure cloud solutions, we have identified three key data integration scenarios that we want to share with you. Understanding and planning your architecture is a key part of adopting any piece of the Azure Cloud or Azure Data Stack. We hope that the outlines below will help you take the next steps in your Azure solution!
The three most common data integration situations we have encountered are:
Building a Modern Business Intelligence Solution
Building a Modern Analytics Solution for Apps
Lift & Shift Existing SSIS Packages to the Cloud
There are many variations of the above, and some solutions mix these three scenarios, but our experience is that most of our customers fall into one of them.
In all of our engagements the core product is Azure Data Factory, Microsoft's flagship Azure product for building analytics solutions.
Azure Data Factory Value Proposition
Azure Data Factory is a managed Data Integration Service in Azure. It provides four main value propositions:
Flexible Pipeline Model. Activities can be chained together and configured to run on the success, failure, or completion of a preceding activity. Pipelines support loops and iteration, and can be triggered on demand, on a schedule, or on an event such as a file arriving in Azure Blob Storage.
Data Movement as a Service. This service is truly elastic and serverless: you provide the configuration, and Data Factory creates the infrastructure in the back end, scaling up based on your data movement needs and your configuration.
SSIS Package Execution. Data Factory allows us to lift and shift SSIS packages. If you have invested heavily in SSIS, you can deploy your packages to Azure and execute them there using the full power of the cloud.
Authoring and Monitoring. You can use a visual drag-and-drop interface to author and monitor Azure Data Factory pipelines.
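To make the flexible pipeline model concrete, here is a minimal sketch of a Data Factory pipeline definition, expressed as Python dicts mirroring the pipeline JSON. The activity and pipeline names (IngestDailyFiles, CopyRawData, and so on) are invented for illustration.

```python
import json

# Sketch of an ADF pipeline definition as plain Python dicts mirroring the
# pipeline JSON. Activity names here are illustrative, not from a real factory.
pipeline = {
    "name": "IngestDailyFiles",
    "properties": {
        "activities": [
            {
                "name": "CopyRawData",
                "type": "Copy",
                # Source/sink details omitted for brevity.
                "typeProperties": {},
            },
            {
                "name": "TransformData",
                "type": "DatabricksNotebook",
                # Runs only if the copy succeeded.
                "dependsOn": [
                    {"activity": "CopyRawData",
                     "dependencyConditions": ["Succeeded"]}
                ],
            },
            {
                "name": "SendFailureAlert",
                "type": "WebActivity",
                # Runs only if the copy failed.
                "dependsOn": [
                    {"activity": "CopyRawData",
                     "dependencyConditions": ["Failed"]}
                ],
            },
        ]
    },
}

# Collect which upstream activity each step waits on, and under what condition.
deps = {
    a["name"]: [(d["activity"], d["dependencyConditions"])
                for d in a.get("dependsOn", [])]
    for a in pipeline["properties"]["activities"]
}
print(json.dumps(deps, indent=2))
```

The `dependencyConditions` values (`Succeeded`, `Failed`, `Completed`, `Skipped`) are what give the pipeline model its branching behavior.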
At the top we have the Azure Data Factory service running in Azure, along with the UX and SDKs that allow us to author two things: pipelines and SSIS packages.
The Azure Data Factory service itself is responsible for scheduling, orchestration, and monitoring.
In addition, Data Factory has Integration Runtime. You can think of this as the heart of Azure Data Factory. This is the execution engine and it provides three capabilities:
Data movement
Pipeline activity execution
SSIS package execution
There are two types of Integration Runtime. The first is the Azure Integration Runtime, which runs fully managed in the cloud. The second is the Self-Hosted Integration Runtime, software that you install on a machine in your own network to connect your on-premises environment with Azure. We need this for hybrid integration, and it can be scaled out across a group of machines.
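As a sketch of how the self-hosted runtime shows up in practice, a linked service for an on-premises data store references it through a `connectVia` block. The names below (OnPremSqlServer, MySelfHostedIR, the connection string) are made up for illustration.

```python
import json

# Illustrative linked service definition for an on-premises SQL Server.
# The "connectVia" block routes traffic through a self-hosted Integration
# Runtime; names like "MySelfHostedIR" are invented for this sketch.
linked_service = {
    "name": "OnPremSqlServer",
    "properties": {
        "type": "SqlServer",
        "typeProperties": {
            "connectionString": "Server=myserver;Database=Sales;Integrated Security=True"
        },
        "connectVia": {
            "referenceName": "MySelfHostedIR",
            "type": "IntegrationRuntimeReference",
        },
    },
}

print(json.dumps(linked_service, indent=2))
```

Any copy activity that uses this linked service will then execute its on-premises side on the self-hosted runtime rather than in the cloud.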
Let’s explore the actual BI solutions and their architectures.
Scenario #1: Building a Modern Business Intelligence Solution
Of our three scenarios, the most common is building a Modern Business Intelligence Solution. Usually, customers are looking to modernize their existing data warehouse and business intelligence solution. Lots of factors play into this need, such as increases in data volume, velocity, and variety, cost of ownership, and functionality.
Have a look at the diagram below to see a typical architecture for this scenario.
From the left we have structured data from OLTP (transactional) databases, unstructured data such as log files and social media, as well as business and marketing apps like AdWords, Facebook, Salesforce, and others.
This story is about hybrid data integration, because we have data both in the cloud and on-premises. In this particular architecture we are using Azure Data Factory to acquire source data. Then we perform transformations and computations with Azure Databricks (Spark) via Azure Blob Storage, and load the results into Azure SQL Data Warehouse. Once we have completed this, we use Azure Analysis Services to build a business semantic model. Finally, we connect Power BI to the semantic model to allow our business end users to get the most out of our new architecture.
This entire pipeline can be orchestrated using Azure Data Factory, which also allows us to monitor and schedule the whole process.
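The orchestration described above can be sketched as a two-step Data Factory pipeline: a Databricks notebook activity for the transformation, followed by a copy activity that loads the results into Azure SQL Data Warehouse. All resource names (ModernBiLoad, the notebook path, and so on) are illustrative.

```python
import json

# Sketch of the scenario-1 orchestration as ADF pipeline JSON: transform in
# Databricks, then load into Azure SQL Data Warehouse. Names are illustrative.
pipeline = {
    "name": "ModernBiLoad",
    "properties": {
        "activities": [
            {
                "name": "TransformSales",
                "type": "DatabricksNotebook",
                "typeProperties": {"notebookPath": "/etl/transform_sales"},
            },
            {
                "name": "LoadWarehouse",
                "type": "Copy",
                # The load runs only after the transformation succeeds.
                "dependsOn": [
                    {"activity": "TransformSales",
                     "dependencyConditions": ["Succeeded"]}
                ],
                "typeProperties": {
                    "source": {"type": "BlobSource"},
                    # PolyBase enables a fast, parallel load into SQL DW.
                    "sink": {"type": "SqlDWSink", "allowPolyBase": True},
                },
            },
        ]
    },
}

print(json.dumps(pipeline, indent=2))
```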
Scenario #2: Building Modern Analytics Solution for Apps
The second scenario to touch on is a Modern Analytics Solution for Software as a Service (SaaS) apps. The need to develop these SaaS apps is common in the tech sector, and our customers in this realm are looking to modernize their data warehouse in order to build their SaaS application.
Here is a typical architecture for this scenario:
This is quite similar to the previous one. From the left side, we have almost the same sources: structured, semi-structured, and so on. We load the data into storage, then transform or compute with Spark. Then, instead of utilizing a data warehouse, we load the result set into the app storage that serves as the backend for our SaaS app. Lastly, customers render those SaaS apps in a browser or on mobile devices.
Azure Data Factory allows us to consume large volumes of different data sets and orchestrate the whole load-and-transform process. It is efficient and reliable.
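Since SaaS data often arrives continuously, this kind of load is typically kicked off by an event trigger rather than a fixed schedule. Here is a sketch of a blob event trigger that starts the load pipeline when a new file lands in storage; the container path and pipeline name are invented for illustration.

```python
import json

# Sketch of an ADF event trigger that starts the SaaS load pipeline whenever
# a new blob is created. Path and pipeline names are illustrative.
trigger = {
    "name": "OnNewAppData",
    "properties": {
        "type": "BlobEventsTrigger",
        "typeProperties": {
            # Fire only for files under this container/prefix.
            "blobPathBeginsWith": "/app-data/blobs/",
            "events": ["Microsoft.Storage.BlobCreated"],
        },
        "pipelines": [
            {"pipelineReference": {"referenceName": "LoadSaasAppStorage",
                                   "type": "PipelineReference"}}
        ],
    },
}

print(json.dumps(trigger, indent=2))
```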
Scenario #3: Lift & Shift Existing SSIS Packages to the Cloud
The final use case is also common across our customers. This is popular among customers who are invested in Microsoft infrastructure. They have used Microsoft SSIS ETL packages for the last decade and are looking for a way forward in Azure. In this case a lift and shift of existing packages to the cloud avoids rewriting existing logic and saves valuable time. They want to continue to use their familiar tools but don’t want to be in the infrastructure business anymore. They want all the benefits of the cloud, including pay-as-you-go pricing, and they want to take their SSIS packages and execute them in Azure. This is possible in Azure Data Factory and avoids the headache of rebuilding existing systems.
Let’s look at the potential architecture for this case:
Here, our customers have their own on-premises infrastructure. This usually consists of SQL Server as their data warehouse platform. They have relied heavily on Microsoft's Integration Services as their ETL tool for many years, loading from on-premises data sources such as transactional databases, logs, and so on.
In this scenario, they can use the SQL Server Integration Services (SSIS) Feature Pack for Azure. It is an extension that provides components for SSIS to connect to Azure services, transfer data between Azure and on-premises data sources, and process data stored in Azure. Moreover, they can copy data from SQL Server to Azure SQL Data Warehouse. Finally, they can use Azure Analysis Services and Power BI as an accessible reporting layer.
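Once the packages are deployed to Azure, a Data Factory pipeline runs them with an Execute SSIS Package activity, pointing at the package in the catalog and at the SSIS Integration Runtime. The package path and runtime name below are invented for illustration.

```python
import json

# Sketch of an Execute SSIS Package activity, which runs an existing package
# deployed to the SSIS catalog in Azure. The package path and the Integration
# Runtime name ("MyAzureSsisIR") are illustrative.
activity = {
    "name": "RunNightlyEtl",
    "type": "ExecuteSSISPackage",
    "typeProperties": {
        "packageLocation": {"packagePath": "ETL/NightlyLoad.dtsx"},
        "loggingLevel": "Basic",
        # Points at the SSIS Integration Runtime that hosts the execution.
        "connectVia": {
            "referenceName": "MyAzureSsisIR",
            "type": "IntegrationRuntimeReference",
        },
    },
}

print(json.dumps(activity, indent=2))
```

This is what makes the lift-and-shift story work: the package itself is unchanged, and only the execution environment moves to the cloud.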
There are many ways that the Azure Cloud can be used to tackle modern data dilemmas. We hope these common scenarios help illuminate your decisions and enable you to take the next step in your BI solution.
Author: Dmitry Anoshin
Power BI Developer
Has obtained certifications across many BI platforms.
A recognized expert in building and implementing Business/Digital Intelligence solutions.
Team FreshBI is an innovation machine, and the blogs that Dmitry writes mostly emanate from the innovations that FreshBI does for its customers.