What are Integration Runtimes?
An Integration Runtime (IR) is the compute infrastructure that Azure Data Factory uses to provide data integration capabilities such as Data Flows and Data Movement. Depending on its type, an IR can reach resources in public networks only, or in hybrid scenarios that span public and private networks.
Integration Runtimes are specified in each Linked Service, under Connections.
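As a rough illustration of how a Linked Service points at an Integration Runtime, the sketch below builds a linked service definition as a Python dictionary and prints it as JSON. The `connectVia` reference is the part that names the IR. The linked service name, the IR name and the connection string placeholder are hypothetical values chosen for this example.

```python
import json


def linked_service_with_ir(name, ir_name, connection_string):
    """Build a linked service definition that names the IR it should use."""
    return {
        "name": name,
        "properties": {
            "type": "AzureSqlDatabase",
            "typeProperties": {"connectionString": connection_string},
            # connectVia selects the Integration Runtime for this connection.
            "connectVia": {
                "referenceName": ir_name,
                "type": "IntegrationRuntimeReference",
            },
        },
    }


# All three argument values below are placeholders for illustration only.
definition = linked_service_with_ir(
    "ExampleSqlLinkedService",
    "ExampleSelfHostedIR",
    "<connection string placeholder>",
)
print(json.dumps(definition, indent=2))
```

If `connectVia` is left out, Data Factory should fall back to the default Azure Integration Runtime, so the reference mainly matters when a Self-hosted or Azure-SSIS IR is needed.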
There are three types to choose from.
Azure Integration Runtime is managed by Microsoft. All the patching, scaling and maintenance of the underlying infrastructure are taken care of. The IR can only access data stores and services in public networks.
Self-hosted Integration Runtimes use infrastructure and hardware managed by you. You’ll need to handle all the patching, scaling and maintenance yourself. The IR can access resources in both public and private networks.
Azure-SSIS Integration Runtimes are VMs running the SSIS engine, which allow you to natively execute SSIS packages. They are managed by Microsoft, so all the patching, scaling and maintenance are taken care of. The IR can access resources in both public and private networks.
Integration Runtime Scenarios
- Azure automatically provisions an Integration Runtime which can connect to Azure resources (Azure SQL, Azure Synapse Analytics, Storage Accounts) without any extra setup.
- You can perform data integration securely in a private network environment, shielded from the public cloud environment. For that you need to install a Self-hosted IR inside your virtual private network. The Self-hosted IR only makes outbound HTTP-based connections to the open internet.
- You can also perform data integration securely in an on-premises environment. For that you need to install a Self-hosted IR behind your corporate firewall in your on-premises environment.
- You can natively execute SSIS packages by creating an Azure-SSIS Integration Runtime, which creates an Integration Services Catalog in Azure SQL Database where the packages are stored. An ADF pipeline run sends commands to the Azure-SSIS IR, which executes the SSIS packages.
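The four scenarios above can be condensed into a small decision helper. This is only a sketch of the reasoning in this post, not an official selection rule; the function name and its two parameters are made up for illustration.

```python
def choose_integration_runtime(runs_ssis_packages, needs_private_network):
    """Suggest an IR type from the two questions the scenarios above turn on."""
    if runs_ssis_packages:
        # Native SSIS package execution needs the SSIS engine.
        return "Azure-SSIS"
    if needs_private_network:
        # Private virtual networks and on-premises stores sit behind a firewall.
        return "Self-hosted"
    # Public Azure resources are covered by the Microsoft-managed IR.
    return "Azure"


print(choose_integration_runtime(runs_ssis_packages=False, needs_private_network=True))
```

Checking the SSIS question first reflects that the Azure-SSIS IR is the only type described here that runs SSIS packages, regardless of where the data lives.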
Are Integration Runtimes Secure?
Data Store Credentials
On-premises data store credentials can either be stored within Data Factory or be referenced by Data Factory via Key Vault at runtime. When credentials are stored within Data Factory, they are always stored and encrypted on the Self-hosted IR machine.
Storing credentials locally can be done with or without flowing credentials through Azure backend service to the Self-hosted IR machine. Both options allow secure encryption.
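For the Key Vault option, a linked service can hold a reference to a secret instead of the secret itself. The sketch below shows the general shape of such a reference as a Python dictionary. The linked service names, the secret name and the IR name are hypothetical, and the connection string is left as a placeholder.

```python
import json


def key_vault_secret_reference(key_vault_linked_service, secret_name):
    """Point at a secret in Key Vault rather than embedding its value."""
    return {
        "type": "AzureKeyVaultSecret",
        "store": {
            "referenceName": key_vault_linked_service,
            "type": "LinkedServiceReference",
        },
        "secretName": secret_name,
    }


# Hypothetical on-premises SQL Server linked service using the reference.
on_prem_sql = {
    "name": "ExampleOnPremSql",
    "properties": {
        "type": "SqlServer",
        "typeProperties": {
            "connectionString": "<connection string without password>",
            # The password is resolved from Key Vault at runtime.
            "password": key_vault_secret_reference(
                "ExampleKeyVault", "example-sql-password"
            ),
        },
        "connectVia": {
            "referenceName": "ExampleSelfHostedIR",
            "type": "IntegrationRuntimeReference",
        },
    },
}
print(json.dumps(on_prem_sql, indent=2))
```

With this shape, the definition contains no password value at all, which makes it safer to keep in source control.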
Encryption in Transit
All data transfers use a secure channel (HTTPS and TLS over TCP) to prevent man-in-the-middle attacks during communication with Azure services.
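To make the man-in-the-middle point concrete, here is a minimal client-side sketch using Python's standard `ssl` module. It is not how the Integration Runtime itself is implemented. It only shows the two checks that let a TLS client reject an impostor: certificate verification and hostname checking. The TLS 1.2 floor is an assumption made for this example.

```python
import ssl


def strict_tls_context():
    """Create a client TLS context that verifies the server's identity."""
    # The default context already requires a valid certificate chain
    # and checks that the certificate matches the requested hostname.
    context = ssl.create_default_context()
    # Assumed floor for this sketch: refuse anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = strict_tls_context()
print(context.verify_mode, context.check_hostname, context.minimum_version)
```

A connection made with this context fails during the handshake if the server cannot present a trusted certificate for the expected hostname.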
You can also use IPSec VPN or Azure ExpressRoute to further secure the communication channel between your on-premises network and Azure.
Virtual Network Service Endpoint
Using Virtual Network Service Endpoints to restrict Azure SQL Database access to only the specified Virtual Network (VNet) adds an extra layer of security. Service Endpoints enable private IP addresses in the VNet to reach the endpoint of an Azure service without needing a public IP address on the VNet.
Once you enable service endpoints in your VNet, you can add a VNet rule to secure the Azure service resources to your VNet. The rule provides improved security by fully removing public internet access to resources and allowing traffic only from your VNet.
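A VNet rule on an Azure SQL server is essentially a pointer to one subnet. The sketch below assembles the subnet resource ID and a rule definition in the shape of a `Microsoft.Sql/servers/virtualNetworkRules` resource. The subscription ID, resource group, VNet, subnet, server and rule names are all placeholders, and the API version is deliberately omitted.

```python
import json


def subnet_resource_id(subscription_id, resource_group, vnet, subnet):
    """Build the full resource ID of a subnet."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Network/virtualNetworks/{vnet}"
        f"/subnets/{subnet}"
    )


def sql_vnet_rule(server, rule_name, subnet_id):
    """Describe a rule that allows traffic to the server from one subnet."""
    return {
        "type": "Microsoft.Sql/servers/virtualNetworkRules",
        "name": f"{server}/{rule_name}",
        "properties": {"virtualNetworkSubnetId": subnet_id},
    }


# Placeholder identifiers for illustration only.
subnet_id = subnet_resource_id(
    "00000000-0000-0000-0000-000000000000",
    "example-rg",
    "example-vnet",
    "example-ir-subnet",
)
rule = sql_vnet_rule("example-sql-server", "allow-ir-subnet", subnet_id)
print(json.dumps(rule, indent=2))
```

The subnet named in the rule needs the SQL service endpoint enabled first, which matches the order of steps described above.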
In order for the Azure-SSIS IR to access the SQL Database, it needs to be joined to the same VNet and Subnet as illustrated by the above diagram (scenario 4). In this way, only this Subnet can access the SQL Database.
With that in place, the next step is to turn off “Allow Azure Services to Access Server”. Both the IR and the Azure SQL DB now operate within the context of a VNet and can communicate using private IP addresses, which is more secure.
In this blog we’ve looked at the three Integration Runtimes and examined how they can be made secure. Thank you for your attention.