Azure Shared Image Gallery¶
In recent years, Azure has made it possible to share VM images across regions, so you can build a golden image once and share it, either publicly with the community or privately within your organization.
However, neither the AzureRM OpenTofu provider nor the Azure documentation has a clear working example you can refer to. That is why I am sharing my struggle, so that you don't have to go through the same.
Creating the Linux VM¶
First things first, we need to create the virtual machine. I create the Linux VM using the example provided in the OpenTofu Registry.
# ref: https://registry.terraform.io/providers/hashicorp/azurerm/3.91.0/docs/resources/linux_virtual_machine
resource "azurerm_resource_group" "example" {
  name     = "example-resources"
  location = "West Europe"
}

resource "azurerm_virtual_network" "example" {
  name                = "example-network"
  address_space       = ["10.0.0.0/16"]
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
}

resource "azurerm_subnet" "example" {
  name                 = "internal"
  resource_group_name  = azurerm_resource_group.example.name
  virtual_network_name = azurerm_virtual_network.example.name
  address_prefixes     = ["10.0.2.0/24"]
}

resource "azurerm_network_interface" "example" {
  name                = "example-nic"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name

  ip_configuration {
    name                          = "internal"
    subnet_id                     = azurerm_subnet.example.id
    private_ip_address_allocation = "Dynamic"
  }
}

resource "azurerm_linux_virtual_machine" "example" {
  name                = "example-machine"
  resource_group_name = azurerm_resource_group.example.name
  location            = azurerm_resource_group.example.location
  size                = "Standard_F2"
  admin_username      = "adminuser"
  network_interface_ids = [
    azurerm_network_interface.example.id,
  ]

  admin_ssh_key {
    username   = "adminuser"
    public_key = file("~/.ssh/id_rsa.pub")
  }

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
  }

  source_image_reference {
    publisher = "Canonical"
    offer     = "0001-com-ubuntu-server-jammy"
    sku       = "22_04-lts"
    version   = "latest"
  }
}
This setup works fine, except that the VM has no public IP address, so I won't be able to SSH into the machine at all.
Public access also requires a proper firewall rule.
On top of that, it requires a public SSH key for authentication.
That's why the modified version looks like the following.
resource "azurerm_resource_group" "example" {
  name     = "example-resources"
  location = "West Europe"
}

resource "azurerm_virtual_network" "example" {
  name                = "example-network"
  address_space       = ["10.0.0.0/16"]
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
}

resource "azurerm_subnet" "example" {
  name                 = "internal"
  resource_group_name  = azurerm_resource_group.example.name
  virtual_network_name = azurerm_virtual_network.example.name
  address_prefixes     = ["10.0.2.0/24"]
}

resource "azurerm_public_ip" "example" {
  name                = "example-public-ip"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
  sku                 = "Standard"
  ip_version          = "IPv4"
  allocation_method   = "Static"
}

resource "azurerm_network_interface" "example" {
  name                = "example-nic"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name

  ip_configuration {
    name                          = "internal"
    subnet_id                     = azurerm_subnet.example.id
    private_ip_address_allocation = "Dynamic"
    public_ip_address_id          = azurerm_public_ip.example.id
  }
}

resource "tls_private_key" "example" {
  algorithm = "RSA"
  rsa_bits  = 3072
}

resource "azurerm_ssh_public_key" "example" {
  name                = "example-ssh-public-key"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
  public_key          = tls_private_key.example.public_key_openssh
}

resource "azurerm_linux_virtual_machine" "example" {
  name                = "example-machine"
  resource_group_name = azurerm_resource_group.example.name
  location            = azurerm_resource_group.example.location
  size                = "Standard_F2"
  admin_username      = "adminuser"
  network_interface_ids = [
    azurerm_network_interface.example.id,
  ]

  admin_ssh_key {
    username   = "adminuser"
    public_key = azurerm_ssh_public_key.example.public_key
  }

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
  }

  source_image_reference {
    publisher = "Canonical"
    offer     = "0001-com-ubuntu-server-jammy"
    sku       = "22_04-lts"
    version   = "latest"
  }
}

# Looks up the public IP of the machine running tofu, to restrict SSH to it
data "http" "my_ip" {
  url    = "https://ifconfig.me"
  method = "GET"
}

resource "azurerm_network_security_group" "example" {
  name                = "example-nsg"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name

  security_rule {
    name                   = "SSH"
    priority               = 1000
    direction              = "Inbound"
    access                 = "Allow"
    protocol               = "Tcp"
    source_port_range      = "*"
    destination_port_range = "22"
    # response_body replaces the deprecated body attribute in the http provider 3.x;
    # chomp drops a trailing newline, if any
    source_address_prefix      = chomp(data.http.my_ip.response_body)
    destination_address_prefix = "*"
  }
}

resource "azurerm_network_interface_security_group_association" "example" {
  network_interface_id      = azurerm_network_interface.example.id
  network_security_group_id = azurerm_network_security_group.example.id
}
Perfect! Now I have a VM in my Azure account that I can SSH into for further customization before creating the image.
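To log in by hand, it helps to expose the address and the generated key as outputs. This is a minimal sketch rather than part of the original stack; the output names vm_public_ip and vm_private_key are my own choice.

```hcl
# Hypothetical output names, used here for illustration only.
output "vm_public_ip" {
  description = "Public IPv4 address of the example VM."
  value       = azurerm_public_ip.example.ip_address
}

output "vm_private_key" {
  description = "PEM-encoded private key generated by tls_private_key."
  value       = tls_private_key.example.private_key_pem
  sensitive   = true # keeps the key out of the plan and apply output
}
```

After an apply, tofu output -raw vm_private_key prints the key so it can be saved to a file with 0400 permissions and passed to ssh -i, together with the address from tofu output -raw vm_public_ip.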
Customize the VM¶
To keep things simple, let's just install MongoDB Community Edition on it and be done with it.
I am using Ansible here, but you're free to SSH directly into the machine and run ad-hoc commands.
Before I can run Ansible against the target machine, I need to create my inventory.
locals {
  cwd          = path.cwd
  key_filepath = "${path.cwd}/azure_vm.key"
}

resource "local_sensitive_file" "ssh_private_key" {
  content         = tls_private_key.example.private_key_pem
  filename        = local.key_filepath
  file_permission = "0400"
}

resource "local_file" "inventory" {
  content = <<-EOT
    azure:
      hosts:
        azure-vm0:
          ansible_host: ${azurerm_public_ip.example.ip_address}
          ansible_user: adminuser
          ansible_ssh_private_key_file: ${local.key_filepath}
          ansible_ssh_common_args: '-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null'
  EOT
  filename        = "${local.cwd}/inventory.yml"
  file_permission = "0640"
}
And now, I can either use a null resource or run ansible-playbook from the CLI. I prefer the former, since it is reproducible across runs.
resource "null_resource" "bootstrap" {
  connection {
    type        = "ssh"
    host        = azurerm_public_ip.example.ip_address
    user        = "adminuser"
    private_key = tls_private_key.example.private_key_pem
  }

  provisioner "local-exec" {
    # Give cloud-init time to finish on the newly created VM
    command = "sleep 120"
  }

  provisioner "local-exec" {
    # Point Ansible at the generated inventory explicitly
    command = "cd ${local.cwd} && ansible-playbook -i inventory.yml bootstrap.yml"
  }

  provisioner "remote-exec" {
    # Remove machine-specific data and the provisioning user
    inline = [
      "sudo waagent -deprovision+user -force",
    ]
  }

  provisioner "local-exec" {
    command = "az vm deallocate --resource-group ${azurerm_resource_group.example.name} --name ${azurerm_linux_virtual_machine.example.name}"
  }

  provisioner "local-exec" {
    command = "az vm generalize --resource-group ${azurerm_resource_group.example.name} --name ${azurerm_linux_virtual_machine.example.name}"
  }

  triggers = {
    vm_id = azurerm_linux_virtual_machine.example.id
  }

  depends_on = [
    local_file.inventory,
    # Ansible reads the private key from disk, so the key file must exist first
    local_sensitive_file.ssh_private_key,
    azurerm_linux_virtual_machine.example,
    azurerm_public_ip.example,
  ]
}
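The fixed sleep 120 is a guess at how long cloud-init takes. As a hedged alternative, the sketch below blocks until cloud-init reports completion by running cloud-init status --wait over SSH. The resource label wait_for_cloud_init is hypothetical, and the sketch assumes the image ships the cloud-init CLI, as the Canonical Ubuntu images do.

```hcl
# Hypothetical helper resource; not part of the original stack.
resource "null_resource" "wait_for_cloud_init" {
  triggers = {
    vm_id = azurerm_linux_virtual_machine.example.id
  }

  connection {
    type        = "ssh"
    host        = azurerm_public_ip.example.ip_address
    user        = "adminuser"
    private_key = tls_private_key.example.private_key_pem
  }

  provisioner "remote-exec" {
    inline = [
      # Returns once cloud-init has finished; a non-zero exit fails the apply
      "cloud-init status --wait",
    ]
  }

  # SSH is only reachable once the NSG rule is attached to the NIC
  depends_on = [
    azurerm_network_interface_security_group_association.example,
  ]
}
```

With something like this in place, null_resource.bootstrap could list it in depends_on and drop the sleep step.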
Installing MongoDB¶
One last piece to customize the VM is to install the dependencies we need. Here's the playbook I am using.
# Play header assumed for completeness; the hosts value matches the inventory group above
- name: Bootstrap the VM with MongoDB
  hosts: azure
  become: true
  tasks:
    - name: Install curl & gnupg
      ansible.builtin.apt:
        name: "{{ item }}"
        state: present
        update_cache: true
      with_items:
        - curl
        - gnupg

    - name: Install Mongo dependencies
      block:
        - name: Add jammy-security repository to sources.list.d
          ansible.builtin.lineinfile:
            path: /etc/apt/sources.list.d/jammy-security.list
            line: "deb http://security.ubuntu.com/ubuntu jammy-security main"
            create: true
            state: present
            mode: "0644"

        - name: Install MongoDB GPG key
          ansible.builtin.get_url:
            url: https://pgp.mongodb.com/server-6.0.asc
            dest: /usr/share/keyrings/mongodb-server-6.0.asc
            mode: "0644"

        - name: Add MongoDB repository to sources.list.d
          ansible.builtin.lineinfile:
            path: /etc/apt/sources.list.d/mongodb-org-6.0.list
            line: "deb [ arch=amd64,arm64 signed-by=/usr/share/keyrings/mongodb-server-6.0.asc ] https://repo.mongodb.org/apt/ubuntu jammy/mongodb-org/6.0 multiverse"
            create: true
            state: present
            mode: "0644"

    - name: Install MongoDB community version
      ansible.builtin.apt:
        name: mongodb-org
        state: present
        update_cache: true

    - name: Hold MongoDB packages
      ansible.builtin.dpkg_selections:
        name: "{{ item }}"
        selection: hold
      with_items:
        - mongodb-org
        - mongodb-org-database
        - mongodb-org-server
        - mongodb-mongosh
        - mongodb-org-mongos
        - mongodb-org-tools

    - name: Set ulimit
      ansible.builtin.copy:
        # limits.d entries use "<domain> <type> <item> <value>"; "-" sets both the soft and the hard limit
        content: |
          # fsize: file size, cpu: cpu time, as: virtual memory size,
          # memlock: locked-in-memory size, nofile: open files, nproc: processes/threads
          mongodb - fsize unlimited
          mongodb - cpu unlimited
          mongodb - as unlimited
          mongodb - memlock unlimited
          mongodb - nofile 64000
          mongodb - nproc 64000
        dest: /etc/security/limits.d/99-mongodb-nproc.conf
        mode: "0644"

    - name: Set configuration
      ansible.builtin.copy:
        content: |
          storage:
            dbPath: "/var/lib/mongodb"
            directoryPerDB: true
          systemLog:
            destination: file
            path: "/var/log/mongodb/mongod.log"
            logAppend: true
          processManagement:
            fork: true
          net:
            bindIp: 127.0.0.1
            port: 27017
          setParameter:
            enableLocalhostAuthBypass: true
          security:
            authorization: enabled
        dest: /etc/mongod.conf
        mode: "0644"
        owner: mongodb
        group: mongodb

    - name: Start service
      ansible.builtin.systemd:
        name: mongod
        state: started
        enabled: true
        daemon_reload: true
That's it. After applying this stack with tofu apply, I will have a generalized VM ready to capture a VM image from.
Whether to generalize is something you should decide for yourself, as there are pros and cons to both generalized and specialized images. For the purpose of this article, I am using a generalized VM image because there is nothing special about my image, and none of the conditions that would rule out a generalized image apply to me.
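For comparison, if you go with a specialized image instead, the image definition has to say so. The sketch below assumes the gallery resource introduced in the next section; the resource label, image name, and sku value are placeholders of mine, and the waagent deprovision and az vm generalize steps would be skipped in that case.

```hcl
# Hypothetical image definition for a specialized (non-generalized) VM.
resource "azurerm_shared_image" "specialized" {
  name                = "my-specialized-image"
  gallery_name        = azurerm_shared_image_gallery.example.name
  resource_group_name = azurerm_resource_group.example.name
  location            = azurerm_resource_group.example.location
  os_type             = "Linux"

  # Marks the OS state as specialized; the default is false (generalized)
  specialized = true

  identifier {
    publisher = "PublisherName"
    offer     = "OfferName"
    sku       = "ExampleSkuSpecialized" # must differ from other definitions in the gallery
  }
}
```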
Create the Image¶
Running the stack so far creates a generalized VM with my dependencies installed. Now I am ready to create an image from it.
One requirement here is that I want to be able to use this image in other Azure regions. At the time of writing, Azure has recently introduced the Azure Compute Gallery (formerly Shared Image Gallery), which lets you replicate the same image across regions.
The alternative is to create the same image in each region, which is an obvious waste of resources and money.
Let's create the image with the following resources.
# ref: https://registry.terraform.io/providers/hashicorp/azurerm/3.91.0/docs/resources/shared_image
resource "azurerm_resource_group" "example" {
  name     = "example-resources"
  location = "West Europe"
}

resource "azurerm_shared_image_gallery" "example" {
  name                = "example_image_gallery"
  resource_group_name = azurerm_resource_group.example.name
  location            = azurerm_resource_group.example.location
  description         = "Shared images and things."

  tags = {
    Hello = "There"
    World = "Example"
  }
}

resource "azurerm_shared_image" "example" {
  name                = "my-image"
  gallery_name        = azurerm_shared_image_gallery.example.name
  resource_group_name = azurerm_resource_group.example.name
  location            = azurerm_resource_group.example.location
  os_type             = "Linux"

  identifier {
    publisher = "PublisherName"
    offer     = "OfferName"
    sku       = "ExampleSku"
  }
}
Now this is where it gets tricky, because so far this only creates the gallery and an image definition. It doesn't give you the image, nor does it allow you to create VM instances from it later on.
For that, you need to create an image version.
resource "azurerm_resource_group" "example" {
  name     = "example-resources"
  location = "West Europe"
}

resource "azurerm_shared_image_gallery" "example" {
  name                = "example_image_gallery"
  resource_group_name = azurerm_resource_group.example.name
  location            = azurerm_resource_group.example.location
  description         = "Shared images and things."

  tags = {
    Hello = "There"
    World = "Example"
  }
}

resource "azurerm_shared_image" "example" {
  name                = "my-image"
  gallery_name        = azurerm_shared_image_gallery.example.name
  resource_group_name = azurerm_resource_group.example.name
  location            = azurerm_resource_group.example.location
  os_type             = "Linux"

  identifier {
    publisher = "PublisherName"
    offer     = "OfferName"
    sku       = "ExampleSku"
  }
}

resource "azurerm_shared_image_version" "example" {
  name                = "0.0.1"
  gallery_name        = azurerm_shared_image.example.gallery_name
  image_name          = azurerm_shared_image.example.name
  resource_group_name = azurerm_shared_image.example.resource_group_name
  location            = azurerm_shared_image.example.location

  target_region {
    name                   = azurerm_shared_image.example.location
    regional_replica_count = 1
    storage_account_type   = "Standard_LRS"
  }
}
Now, you might be happy with that and call it a day. But this will throw the following error.
│ "managed_image_id": one of `blob_uri,managed_image_id,os_disk_snapshot_id`
│ must be specified
Troubleshooting¶
What does this mean in plain English?
It means that the "version" you are trying to create is really just a tag. Think of Docker tags if that helps with the analogy.
The whole point of this article is that you will not get through without giving the version an actual source, which in my case is an azurerm_image resource (the error lists blob_uri and os_disk_snapshot_id as the other options). That is the real image created underneath. Without it, you cannot have an image version.
Again, if it helps with the analogy, imagine trying to create a Docker tag without having the image in the first place.
That's what this whole thing is about.
To get around it, you need to create the image as well, as shown below.
resource "azurerm_resource_group" "example" {
  name     = "example-resources"
  location = "West Europe"
}

resource "azurerm_shared_image_gallery" "example" {
  name                = "example_image_gallery"
  resource_group_name = azurerm_resource_group.example.name
  location            = azurerm_resource_group.example.location
  description         = "Shared images and things."

  tags = {
    Hello = "There"
    World = "Example"
  }
}

resource "azurerm_shared_image" "example" {
  name                = "my-image"
  gallery_name        = azurerm_shared_image_gallery.example.name
  resource_group_name = azurerm_resource_group.example.name
  location            = azurerm_resource_group.example.location
  os_type             = "Linux"

  identifier {
    publisher = "PublisherName"
    offer     = "OfferName"
    sku       = "ExampleSku"
  }
}
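
resource "azurerm_image" "example" {
  name     = "exampleimage"
  location = azurerm_linux_virtual_machine.example.location
  # The VM's resource group, not the VM's name
  resource_group_name       = azurerm_linux_virtual_machine.example.resource_group_name
  source_virtual_machine_id = azurerm_linux_virtual_machine.example.id

  # The VM must be deallocated and generalized before it can be captured
  depends_on = [null_resource.bootstrap]
}
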
resource "azurerm_shared_image_version" "example" {
  name                = "0.0.1"
  gallery_name        = azurerm_shared_image.example.gallery_name
  image_name          = azurerm_shared_image.example.name
  resource_group_name = azurerm_shared_image.example.resource_group_name
  location            = azurerm_shared_image.example.location
  managed_image_id    = azurerm_image.example.id

  target_region {
    name                   = azurerm_shared_image.example.location
    regional_replica_count = 5
    storage_account_type   = "Standard_LRS"
  }
}
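With a version in place, the two things the gallery was meant for become possible: replicating to more regions and booting VMs from the image. The sketch below is one way this could look, not a tested part of the original stack. The variable replica_regions, the choice of North Europe, the 0.0.2 version number, and the replicated and from_image resource labels are all assumptions made for illustration.

```hcl
# Hypothetical variable: regions to replicate to. The region holding the
# source image ("West Europe" here) has to be one of them.
variable "replica_regions" {
  type    = list(string)
  default = ["West Europe", "North Europe"]
}

resource "azurerm_shared_image_version" "replicated" {
  name                = "0.0.2"
  gallery_name        = azurerm_shared_image.example.gallery_name
  image_name          = azurerm_shared_image.example.name
  resource_group_name = azurerm_shared_image.example.resource_group_name
  location            = azurerm_shared_image.example.location
  managed_image_id    = azurerm_image.example.id

  # Generates one target_region block per entry in var.replica_regions
  dynamic "target_region" {
    for_each = toset(var.replica_regions)
    content {
      name                   = target_region.value
      regional_replica_count = 1
      storage_account_type   = "Standard_LRS"
    }
  }
}

# A separate NIC in the existing subnet for the VM booted from the gallery image
resource "azurerm_network_interface" "from_image" {
  name                = "from-image-nic"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name

  ip_configuration {
    name                          = "internal"
    subnet_id                     = azurerm_subnet.example.id
    private_ip_address_allocation = "Dynamic"
  }
}

resource "azurerm_linux_virtual_machine" "from_image" {
  name                = "from-image-machine"
  resource_group_name = azurerm_resource_group.example.name
  location            = azurerm_resource_group.example.location
  size                = "Standard_F2"
  admin_username      = "adminuser"
  network_interface_ids = [
    azurerm_network_interface.from_image.id,
  ]

  admin_ssh_key {
    username   = "adminuser"
    public_key = tls_private_key.example.public_key_openssh
  }

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
  }

  # Boots from the gallery image version instead of a marketplace image
  source_image_id = azurerm_shared_image_version.replicated.id
}
```

The VM here stays in West Europe to reuse the existing network. Launching it in North Europe would work the same way, given a resource group, network, and NIC in that region.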
Versions¶
To help with reproducibility, I will include the versions of the providers in this post.
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.92"
    }
    tls = {
      source  = "hashicorp/tls"
      version = "~> 4.0"
    }
    local = {
      source  = "hashicorp/local"
      version = "~> 2.4"
    }
    null = {
      source  = "hashicorp/null"
      version = "~> 3.2"
    }
  }

  required_version = "< 2"
}
Source Code¶
The code for this post is available from the following link.
Conclusion¶
That pretty much solves everything. I would never have imagined having to do it this way. But hey, this is Azure cloud we're talking about.
The things I've seen in Azure are the kind that I haven't seen elsewhere.
In no particular order, and in a non-exhaustive list, here are some horror stories:
- Creating a parent and a child resource, then updating the parent in a way that forces a replacement, after which the provider complains that it cannot delete the parent because the child still references it. I mean, isn't the whole point of IaC that you create, update, and delete resources while the underlying provider takes care of the ugly work for you!?
- The Azure Kubernetes module creates a child resource group for you, and for any other node-pool you want to add to the cluster, you can't create a separate resource group, but rather, you gotta reference the same resource group to create the new node-pool.
Some of these would have been fine if we hadn't been promised that IaC tools such as OpenTofu would spare you from going into the Azure portal and doing the manual chores yourself, the same chores the provider should've done for you.
But that's the whole point. We were promised that it would all be the responsibility of the underlying provider. That's wrong! At least in the case of Azure.