As I mentioned in the previous post, the main reason I wanted to connect a custom VM to Azure Arc was to start collecting telemetry. So today let’s do just that. This post relies on the resources we created last time, so if something looks like it came out of nowhere – check out the first post of this series.
VM Telemetry in Azure
Azure’s approach to VM telemetry collection is called VM Insights. It’s actually far more powerful than a mere telemetry collector, but at this stage we just need a few CPU/disk graphs to be happy.
Enabling VM Insights through the UI is easy. We’d go to the Insights blade of the VM, hit a friendly “Enable” button, and after 10 minutes or so we’d discover some dashboards with beautiful graphs on them. However, hitting buttons is not the Jedi way. Terraform is.
Building blocks of VM Insights
It is still worth hitting that “Enable” button at least once, to see what’s going on under the hood. It is interesting, because at least four things happen:
- Azure creates a Log Analytics (LA) workspace resource, which will act as a database for our telemetry.
- It also adds a Data Collection Rule (DCR), which specifies what kind of data we are interested in (Insights metrics, performance counters) and where we want to send it (the Log Analytics workspace).
- The virtual machine we enabled Insights for receives an extension called Azure Monitor Agent (AMA), whose sole responsibility is to extract the metrics and send them to whatever Data Collection Rule it is associated with.
- Finally, Azure adds an association between the collection rule and the agent, completing the path from metrics to their storage.
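In Terraform terms, those four pieces map roughly onto the following artifacts – a sketch of what’s ahead, where the first two are regular resources and the last two will be handled by policy:

```hcl
# 1. Telemetry database -- a regular resource we declare ourselves.
resource "azurerm_log_analytics_workspace" "this" { /* ... */ }

# 2. What to collect and where to send it -- also declared upfront.
resource "azurerm_monitor_data_collection_rule" "this" { /* ... */ }

# 3 & 4. The AMA extension and the rule-to-agent association are not
# declared directly: built-in Azure policies will create them on each
# Arc machine as it appears.
```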
Configuring VM Insights with Terraform
Knowing the building blocks, automating them doesn’t require much brain power. However, there is a catch. We can create the LA workspace and DCR upfront, no problem. Creating the AMA extension and the association is a completely different story, though. Most likely there will be more than one Arc virtual machine to monitor, and we have no control over when exactly they will appear in Azure. All that makes “the right time and place” to create the AMA extension and DCR association quite unknown.
But there is a solution – Azure Policy. A policy defines the desired configuration of resources, and Azure will constantly look for violations and rectify them (if told to). In other words, if we create policies requiring every Arc machine to have the AMA extension and a DCR association, then whenever a new VM comes along, Azure will detect the missing pieces and configure them automatically. Profit!
In fact, we don’t even need to create those policies. Azure already has them:
- “Configure Linux Arc-enabled machines to run Azure Monitor Agent” for AMA.
- “Configure Linux Machines to be associated with a Data Collection Rule or a Data Collection Endpoint” for DCR.
We just need to assign them to the resource group with future Arc machines, and the science will happen.
Steps 1-2: Create Log Analytics Workspace and Data Collection Rule
Creating the LA workspace is trivial:
```hcl
resource "azurerm_log_analytics_workspace" "this" {
  name                = "Arc-Telemetry"
  resource_group_name = azurerm_resource_group.this.name
  location            = azurerm_resource_group.this.location
  sku                 = "PerGB2018"
  retention_in_days   = 30
}
```
We put the LA workspace into the same resource group as the Arc machines, since they should share the same region.
The Data Collection Rule is more complex, mainly because it’s surprisingly hard to find proper documentation about its settings. I ended up creating a DCR manually, examining its properties in JSON, and then replicating the findings in Terraform.
```hcl
resource "azurerm_monitor_data_collection_rule" "this" {
  name                = "Arc-Telemetry-Collection-Rule"
  resource_group_name = azurerm_resource_group.this.name
  location            = azurerm_resource_group.this.location

  destinations {
    log_analytics {
      workspace_resource_id = azurerm_log_analytics_workspace.this.id
      name                  = "VMInsightsPerf-Logs-Dest"
    }
  }

  data_flow {
    streams      = ["Microsoft-InsightsMetrics"]
    destinations = ["VMInsightsPerf-Logs-Dest"]
  }

  data_sources {
    performance_counter {
      streams                       = ["Microsoft-InsightsMetrics"]
      sampling_frequency_in_seconds = 60
      counter_specifiers            = ["\\VmInsights\\DetailedMetrics"]
      name                          = "VMInsightsPerfCounters"
    }
  }
}
```
Most of the configuration actually makes sense. `destinations` → `log_analytics` points to the LA workspace we declared above. That’s logical. The `data_flow` block is also more or less straightforward – we just pass data from `streams` to `destinations` as is. `data_sources` would have made perfect sense, too, if it were backed by some documentation. Well, at least creating it manually and “reverse engineering” still works.
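As a side note, newer Terraform versions (1.5+) can help with that reverse engineering: an import block plus `terraform plan -generate-config-out=...` will emit HCL for a manually created resource. The resource ID below is purely illustrative – substitute your own subscription, resource group, and DCR name:

```hcl
# Illustrative only: point Terraform at the hand-made DCR and let it
# generate matching configuration.
import {
  to = azurerm_monitor_data_collection_rule.manual
  id = "/subscriptions/<subscription-id>/resourceGroups/<rg-name>/providers/Microsoft.Insights/dataCollectionRules/<dcr-name>"
}
```

Running `terraform plan -generate-config-out=dcr.tf` then writes the discovered settings into `dcr.tf` for inspection.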
Either way, this configuration is valid, and we can move on to the next steps – the policies that will connect these resources to Arc VMs.
Steps 3-4: Create Policies for AMA and DCR Association
It’s time for a little bit of magic! First, let’s add references to existing monitoring policies.
```hcl
data "azurerm_policy_definition" "monitoring-agent" {
  display_name = "Configure Linux Arc-enabled machines to run Azure Monitor Agent"
}

data "azurerm_policy_definition" "data-collection-rule" {
  display_name = "Configure Linux Machines to be associated with a Data Collection Rule or a Data Collection Endpoint"
}
```
Then, assign those policies to our resource group with the Arc machines. The first policy will install the Azure Monitor Agent extension on machines that don’t have it:
```hcl
resource "azurerm_resource_group_policy_assignment" "monitoring-agent" {
  name                 = "Enable Azure Monitor on Arc Machines"
  resource_group_id    = azurerm_resource_group.this.id
  policy_definition_id = data.azurerm_policy_definition.monitoring-agent.id

  identity {
    type = "SystemAssigned"
  }

  location = local.azure_arc_region
  enforce  = true
}
```
And the other will create a Data Collection Rule Association between the VM and the DCR:
```hcl
resource "azurerm_resource_group_policy_assignment" "data-collection-rule" {
  name                 = "Associate Data Collection Rule with Arc Machines"
  resource_group_id    = azurerm_resource_group.this.id
  policy_definition_id = data.azurerm_policy_definition.data-collection-rule.id

  parameters = <<PARAMS
{
  "dcrResourceId": {
    "value": "${azurerm_monitor_data_collection_rule.this.id}"
  }
}
PARAMS

  identity {
    type = "SystemAssigned"
  }

  location = local.azure_arc_region
  enforce  = true
}
```
In addition to the usual policy attributes, these two assignments have “SystemAssigned” managed identities. Since the policies will actually change the state of resources, they need permissions, and permissions imply an identity to grant them to. Hence the `identity {}` block.
To update an Arc VM, both policies need the “Azure Connected Machine Resource Administrator” role. The DCR association policy additionally needs “Monitoring Contributor”:
```hcl
resource "azurerm_role_assignment" "monitoring-agent" {
  scope                = azurerm_resource_group.this.id
  role_definition_name = "Azure Connected Machine Resource Administrator"
  principal_id         = azurerm_resource_group_policy_assignment.monitoring-agent.identity[0].principal_id
}

resource "azurerm_role_assignment" "data-collection-rule" {
  for_each = toset([
    "Azure Connected Machine Resource Administrator",
    "Monitoring Contributor"
  ])

  scope                = azurerm_resource_group.this.id
  role_definition_name = each.key
  principal_id         = azurerm_resource_group_policy_assignment.data-collection-rule.identity[0].principal_id
}
```
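One caveat worth hedging: with “DeployIfNotExists” policies, Azure evaluates new and changed resources automatically, but machines that existed before the assignment may need an explicit remediation task. If you don’t want to wait for automatic evaluation to kick in, a remediation can be declared in Terraform too; the sketch below assumes the `azurerm_resource_group_policy_remediation` resource from the azurerm provider:

```hcl
# A sketch: explicitly remediate existing Arc machines against the AMA policy.
resource "azurerm_resource_group_policy_remediation" "monitoring-agent" {
  name                 = "remediate-monitoring-agent"
  resource_group_id    = azurerm_resource_group.this.id
  policy_assignment_id = azurerm_resource_group_policy_assignment.monitoring-agent.id
}
```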
A Test Drive
I put the new resources into an `arc-policies.tf` file, alongside `main.tf` and `arc.tf` from last time, and ran `terraform apply`:
Policy assignments appear right away, but it takes some time for the Azure Policy engine to discover that our existing Arc machine is in violation. Within 5–30 minutes, though, you’ll notice that the Azure Monitor Agent extension has appeared in the Arc VM’s extensions list, and a few minutes later there will be some data in the VM’s built-in dashboard. Victory!
Conclusion
Hopefully by now you are convinced that hybrid clouds are actually useful. Of course, Digital Ocean, whose VM we connected to Azure Arc, has its own monitoring tooling, so monitoring it with Azure didn’t add much. But I have two comments on that.
First, Azure monitoring is not the only thing we could have enabled. Just as easily, we could have enabled logging, desired state configuration, update management, and even real-time protection for that VM. I don’t believe you could do that with vanilla Digital Ocean.
And second, I chose Digital Ocean just for the demonstration. The cloud provider I actually used in real life doesn’t support built-in monitoring, so the AMA extension alone was a huge win. If we were pursuing, for example, FedRAMP certification, I’d just point my approved collection of Azure extensions at that third-party Arc VM, and it would magically become compliant in minutes.
So yes, hybrid cloud and Azure Arc in particular is a pretty useful thing. I have spoken.