
Build Your Own ACR Retention Policy

Costs on Azure can add up, and before you know it, the bill hurts. Here is my take on managing container image retention in an Azure Container Registry.

The Need

I worked on a Python project deployed to Azure and the app was containerized.

After a few months, we noticed that, among our container registry, container apps, key vaults and storage accounts, the container registry was the most expensive resource.

Originally, I wrote a Bash script to manually clean up all the images tagged with a build number.

The usage was simple:

  • you open the Azure portal on the company tenant.

  • you open Cloud Shell (to get the Azure CLI) using the first icon on the left of the top-right menu.

  • you create a shell script with Nano and paste the content of the versioned script into it.

  • you adjust the constant RETENTION_COUNT, if needed. It’s set to 10 in the versioned script.

  • run it with a dry run first:

    bash acr_custom_retention_policy.sh <registry_name> --dry-run
    
  • run it without the dry run to permanently delete the images:

    bash acr_custom_retention_policy.sh <registry_name>
    

However, I wanted to challenge this by automating it.

Possible Solutions

Several solutions exist, though not all of them may work for you, depending on your permissions on Microsoft Azure:

  1. Azure DevOps Pipeline: you can define a new pipeline in DevOps with a few manageable variables (ACR name and dry run option in my use case) and use the schedules to run the pipeline on a recurring basis.
  2. Azure Automation: this service enables you to schedule and run PowerShell scripts or Python runbooks.
  3. Azure Logic Apps: this could work, but might be overly complex for a simple script execution, as you’d need to create a Logic App with a recurrence trigger and use the Azure CLI action to run the script.
  4. Azure Container Instances: you could containerize the script and schedule it to run periodically using Azure Container Instances.

I chose Azure Automation over an Azure DevOps Pipeline since I lacked the permissions required for that first option (in particular, the Service connection creation). The other two seemed overkill for my need.

The best option depends on factors like the type of script you’re running, its complexity, and your specific requirements.

Create the Automation Account

On the Azure portal, search for Automation Accounts and create a new one following the naming guideline from Microsoft.

Apart from the name and the region, which matched the values used for the existing resources of the project, I left the settings at their defaults.

Once it is created, you need to grant the AcrDelete and Reader roles to the system-assigned managed identity (Object ID) of the Automation Account.

Configure Permissions Between the Container Registry and the Automation Account

Now, go to the Container Registry and under the Access control (IAM) blade, add a role:

  • search for AcrDelete role and select it
  • click the Members tab, select Managed identity and click Select members
  • on the right pane, you will see a form with a managed identity input. From the drop-down, select All system-assigned managed identities, since a system-assigned identity is the default when you create the Automation Account.
  • select the Automation Account you created earlier and confirm the selection with the Select button
  • finally, click Review + assign twice.
  • repeat the steps above to add Reader role.

Craft Your Script Before You Set Up The Runbook With It

In my case, I went to the documentation page for **Get-AzContainerRegistryTag** and selected Open Cloud Shell in the example section.

You will need to set up an account or log in to an existing one and complete the Microsoft Learn form.

Then, when the cloud shell opens, you’ll want to switch to PowerShell (see top left of the cloud shell window) instead of using Bash.

There you can test a single command or upload your crafted .ps1 from your computer. It was helpful for me.

Create the Runbook With PowerShell

Back in the Automation Account, navigate to the blade Process Automation > Runbooks. Then create a PowerShell runbook like so:

  • give it a name (e.g., “ACR-Retention-Policy”)
  • choose “PowerShell” as the runbook type
  • choose version 7.2 or above
  • confirm creation with a click on Review + Create

Code the Runbook Script With PowerShell

Once the runbook is created, go to the Overview blade and select Edit > Edit in Portal. A code editor opens.

Paste your script, save it and publish it.

Before putting a schedule on it, run it by clicking Start from the Overview blade.

Once you’re happy that the script ran as expected:

  • go to the Resources > Schedules blade while in your Runbook resource.
  • click Add a schedule
  • click Link your runbook to a schedule
  • click Add a schedule again (not the same as the first instance…)
  • on the right pane that opens, name your schedule, describe it, and set its start date, time and recurrence.
  • click Create, then select the schedule and click OK at the bottom left to link it to the runbook.

Now, wait for the schedule to run and check the run result.

Caveat About the PowerShell Option

When I finally tested the runbook set up above, it turned out that the Remove-AzContainerRegistryTag command doesn’t actually delete the image with the numeric tag…

The only programmatic way was to use the Azure CLI command. But how could I run the job so that it could work just like the Bash script?

I went and created a second runbook using Python as the runtime engine.
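
For reference, here is a minimal sketch of what calling the Azure CLI from Python could look like. It assumes the CLI command in question is `az acr repository delete` (the article does not show the Bash script's content), that `az` is installed and logged in wherever the code runs, and the helper names `build_delete_command` and `delete_image` are hypothetical.

```python
import subprocess


def build_delete_command(registry_name, repository, tag):
    """Build the argument list for deleting one image by tag with the Azure CLI."""
    return [
        "az", "acr", "repository", "delete",
        "--name", registry_name,
        "--image", f"{repository}:{tag}",
        "--yes",  # skip the interactive confirmation prompt
    ]


def delete_image(registry_name, repository, tag, dry_run=True):
    """Print the command on a dry run, otherwise execute it and fail loudly on error."""
    command = build_delete_command(registry_name, repository, tag)
    if dry_run:
        print("Would run:", " ".join(command))
    else:
        subprocess.run(command, check=True)
    return command


if __name__ == "__main__":
    # Placeholder names; a dry run only prints the command.
    delete_image("myregistry", "my-app", "42", dry_run=True)
```

Keeping the command as a list avoids shell quoting issues, and the dry-run default mirrors the cautious first run of the Bash script.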

Code the Runbook With Python

Back in the Automation Account, navigate to the blade Process Automation > Runbooks. Then create a Python runbook like so:

  • give it a name (e.g., “ACR-Retention-Policy”)
  • choose “Python” as the runbook type
  • choose version 3.8 or above (at the time of writing, 3.8 was recommended).
  • confirm creation with a click on Review + Create

Configure the Runbook With Necessary Packages

The equivalent script in Python uses a few packages from Microsoft. Judging by the script’s imports, they are:

  • azure-identity
  • azure-mgmt-containerregistry
  • azure-containerregistry
  • azure-core

To install the packages, download the .whl file for each one, then go to the Shared Resources > Python Packages blade.

Click Add a Python package.

Then select the .whl file and select the same Runtime version as the one selected on the runbook creation.

Importing the package will take time, so go get a coffee or take a short walk and come back.
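
Once the import finishes, a quick way to confirm that the packages are visible to the runtime is a small test runbook like the sketch below. The module names are taken from the import statements of the script in this article, and the helper name `missing_modules` is hypothetical.

```python
from importlib.util import find_spec

# Module names taken from the import statements of the runbook script.
REQUIRED_MODULES = [
    "azure.identity",
    "azure.mgmt.containerregistry",
    "azure.containerregistry",
    "azure.core",
]


def missing_modules(names):
    """Return the module names that cannot be found in the current runtime."""
    missing = []
    for name in names:
        try:
            found = find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec raises when a parent package (e.g. "azure") is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

If the output lists a module, its package is probably still importing or was uploaded for a different runtime version.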

Code the Runbook Script With Python

I converted the manual and original Bash script to a Python script.

#!/usr/bin/env python3

import sys
from azure.identity import ManagedIdentityCredential
from azure.mgmt.containerregistry import ContainerRegistryManagementClient
from azure.containerregistry import ContainerRegistryClient
from azure.core.exceptions import ResourceNotFoundError

def is_numeric(string):
    return string.isdigit()

def cleanup_acr(registry_name, retention_count=3, dry_run=False):
    print(f"Processing Azure Container Registry: {registry_name}")
    if dry_run:
        print("DRY RUN: No actual deletions will occur")

    # Create a Managed Identity credential
    credential = ManagedIdentityCredential()

    # Create a management client
    mgmt_client = ContainerRegistryManagementClient(credential, subscription_id)

    try:
        # Get the registry
        registry = mgmt_client.registries.get(resource_group_name, registry_name)
    except ResourceNotFoundError:
        print(f"Error: Registry '{registry_name}' not found or you don't have access to it.")
        return
    except Exception as e:
        print(f"An error occurred: {str(e)}")
        return

    # Create a ContainerRegistryClient
    registry_client = ContainerRegistryClient(f"https://{registry_name}.azurecr.io", credential)

    # List repositories
    for repository in registry_client.list_repository_names():
        print(f"Processing repository: {repository}")

        # Get all tags for the repository
        tags = list(registry_client.list_tag_properties(repository))

        # Sort tags by creation time in descending order
        tags.sort(key=lambda x: x.created_on, reverse=True)

        numeric_count = 0

        for tag in tags:
            print(f"Processing tag: {repository}:{tag.name}")
            if is_numeric(tag.name):
                numeric_count += 1

                if numeric_count > retention_count:
                    if dry_run:
                        print(f"Would delete tag: {repository}:{tag.name}")
                    else:
                        print(f"Deleting tag: {repository}:{tag.name}")
                        registry_client.delete_tag(repository, tag.name)
            else:
                print(f"Keeping non-numeric tag: {repository}:{tag.name}")

# Azure Automation entry point
if __name__ == "__main__":
    # You need to set these variables
    subscription_id = "[guid-of-your-subscription]"
    resource_group_name = "[resource-group-name]"

    # Parse automation variables
    registry_name = "[container-registry-name]"  # or use parameter 1: sys.argv[1]
    retention_count = 2  # or use parameter 2: int(sys.argv[2]) if len(sys.argv) > 2 else 3
    dry_run = False  # or use parameter 3: sys.argv[3].lower() == 'true' if len(sys.argv) > 3 else False

    cleanup_acr(registry_name, retention_count, dry_run)
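
The retention rule in the loop above (keep the newest numeric tags, never touch non-numeric ones) can also be isolated into a pure function, which makes it testable without a registry. This is a sketch, not part of the script: `select_tags_to_delete` is a hypothetical helper that takes plain `(name, created_on)` pairs instead of SDK objects.

```python
from datetime import datetime, timezone


def select_tags_to_delete(tags, retention_count):
    """Return the numeric tag names to delete, newest first.

    tags: iterable of (name, created_on) pairs.
    Non-numeric tags (e.g. "latest") are never selected.
    """
    if retention_count < 0:
        raise ValueError("retention_count must be zero or positive")
    numeric = [(name, created_on) for name, created_on in tags if name.isdigit()]
    # Newest first, so the first `retention_count` entries are the ones kept.
    numeric.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in numeric[retention_count:]]


if __name__ == "__main__":
    sample = [
        ("101", datetime(2024, 1, 1, tzinfo=timezone.utc)),
        ("102", datetime(2024, 2, 1, tzinfo=timezone.utc)),
        ("103", datetime(2024, 3, 1, tzinfo=timezone.utc)),
        ("latest", datetime(2024, 3, 1, tzinfo=timezone.utc)),
    ]
    print(select_tags_to_delete(sample, retention_count=2))  # ['101']
```

With a retention count of 2, tags 103 and 102 are kept, 101 is selected for deletion, and "latest" is ignored.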

The runbook could accept parameters, but it seems that you need to provide them manually, both on each manual run and when linking a schedule. I didn’t want to spend the time researching how to provide them automatically.

Setting local variables within the script was sufficient.
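
If you do want parameters later, the comments in the script hint at reading them from `sys.argv`. Here is a minimal sketch of that idea, assuming the runbook receives its parameters as positional command-line arguments in the order registry name, retention count, dry run. The helper name `parse_runbook_args` and its defaults are hypothetical.

```python
import sys


def parse_runbook_args(argv, default_registry, default_retention=3, default_dry_run=False):
    """Read positional parameters, falling back to defaults when they are absent.

    argv[0] is the script name, so the parameters start at index 1.
    """
    registry_name = argv[1] if len(argv) > 1 else default_registry
    retention_count = int(argv[2]) if len(argv) > 2 else default_retention
    dry_run = (argv[3].lower() == "true") if len(argv) > 3 else default_dry_run
    return registry_name, retention_count, dry_run


if __name__ == "__main__":
    # The default registry name below is a placeholder.
    registry_name, retention_count, dry_run = parse_runbook_args(
        sys.argv, default_registry="[container-registry-name]"
    )
    print(f"registry={registry_name} retention={retention_count} dry_run={dry_run}")
```

With defaults in place, a scheduled run without parameters behaves like the hard-coded version, while a manual run can still override the values.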

Conclusion

I now have a clean and efficient Container Registry that keeps only the minimum number of container images in each repository, which optimizes the costs for my employer.

Follow me

Thanks for reading this article. You liked it? Make sure to follow me on X, subscribe to my Substack publication and bookmark my blog to read more in the future.

Photo by Daniel Tuttle on Unsplash.
