Supercharge Monorepo CI/CD: Unlock Selective Builds

A monorepo stores all your code in a single repository, which can be beneficial for code sharing, dependency management, and version control.

However, there is no free lunch! As your codebase grows, managing builds becomes unavoidably complex and time-consuming. That build time is billed to your organization, and it can get quite costly.

In this blog post, we'll explore the challenges of building only changed applications in a monorepo and discuss strategies to optimize your workflow with selective builds.

If this gets you excited, let's dive in!

Introduction

In modern software development, where it is claimed that software is eating the world, it's important to stay sharp and ahead of the game, and not fall victim to our own success.

A monorepo is one strategy for managing the codebase of multiple applications while keeping them in a single repository, squeezing every ounce of collaboration and teamwork out of your team.

It comes with its own benefits and challenges, of course. Let's explore them in more detail.

What is a Monorepo and Why Use It?

A monorepo is a single repository that contains the source code for multiple projects or applications. This approach is commonly used by organizations to manage large codebases, share code between projects, and simplify dependency management.

flowchart LR
    subgraph Monorepo
        Auth
        Inventory
        Order
        Payment
    end

It allows standardizing organization-wide practices for packaging, building, dependency management, and deployment. It also enables a more accessible codebase where audits are done faster, dependencies can be upgraded simultaneously, and the overall development experience is more streamlined.

The Challenge of Building Only Changed Applications

As the codebase in a monorepo grows, it becomes clear that builds will cost more compute time and resources. This is because the build process often involves compiling, testing, and packaging all applications in the repository, even if only a few of them have changed.

Full focus at a coffee shop

This inefficiency can lead to longer build times, increased resource consumption, and slower feedback loops for developers. To address this challenge, developers need to implement strategies for building only the applications that have changed since the last build.

Key Advantages of Using a Monorepo Structure

Monorepos come with a handful of very appealing advantages:

  • Code Sharing: Developers can easily share code between projects and applications, reducing duplication and improving consistency.
  • Dependency Management: Dependencies can be managed at the repository level, ensuring that all applications use the same versions of libraries and packages.
  • Version Control: Changes to multiple applications can be tracked and managed in a single repository, simplifying version control and code reviews.
  • Collaboration: Teams can work together more effectively by sharing code, reviewing changes, and coordinating releases in a single repository.
  • Consistency: Standardized build processes, testing frameworks, and deployment pipelines can be applied across all applications in the repository.

Downsides of Monorepo Structure

However, there are some disadvantages to using a monorepo structure:

  • Complexity: Managing a large codebase with multiple applications can be complex and challenging, especially as the number of projects grows. This will require a lot of discipline and organization.
  • Build Performance: Building all applications in the repository can be time-consuming and resource-intensive, leading to longer build times and slower feedback loops for developers.
  • Dependency Conflicts: Dependencies between applications can lead to conflicts and compatibility issues, requiring careful management and coordination between teams.
  • Security Risks: A single repository can be a single point of failure for security breaches, so it's important to implement strong access controls and security measures to protect the codebase.
  • Bigger Size at Scale: As the team grows, so does the size of the repository. As a result, even a single git status might take minutes.

The Problem with Full Rebuilds

When you make changes to a single application in a monorepo, the traditional build process often involves rebuilding all applications in the repository. This can be inefficient and time-consuming, especially if only a few applications have changed.

Structured poorly, this can lead to a lot of wasted time and resources that could be better spent on more productive tasks. Imagine paying for an AWS Lambda function that runs for 10 minutes when only 1 minute is spent on the actual work!

That's the main reason why optimizing the build process is crucial to benefit from the advantages of a monorepo structure, while minimizing the drawbacks.

Why Full Rebuilds are Inefficient

Full rebuilds are inefficient for several reasons:

  • Resource Consumption: Building all applications in the repository requires more compute resources, leading to higher costs and longer build times.
  • Build Time: Rebuilding unchanged applications is a waste of time and can slow down the feedback loop for developers.
  • Developer Productivity: Waiting for full rebuilds to complete can reduce developer productivity and hinder collaboration between teams.
  • CI/CD Pipelines: Full rebuilds can overload CI/CD pipelines and increase the risk of build failures and bottlenecks.

Strategies for Building Only Changed Applications

To address the challenges of full rebuilds in a monorepo, developers can implement strategies for building only the applications that have changed since the last build. This can help optimize build performance, reduce resource consumption, and improve developer productivity.

There are various strategies to employ, each fitting different organizations and teams differently. Let's explore some of them.

Using Dependency Graphs to Identify Changes

One approach to selective builds is to use dependency graphs to identify the applications that have changed and need to be rebuilt. By analyzing the dependencies between applications, developers can determine which applications are affected by a change and only rebuild those applications.
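
To make that concrete, here's a minimal, hypothetical sketch (the dependency map and app names are made up for illustration; this is not the approach we implement later in this post): a fixed-point walk over a hand-maintained graph collects every application that transitively depends on a changed one.

// deps.js -- a hypothetical sketch of reverse-dependency lookup.
// "application -> applications it depends on", maintained by hand.
const dependsOn = {
  order: ["auth", "inventory"],
  payment: ["auth"],
  auth: [],
  inventory: [],
};

// Given the apps whose files changed, keep expanding the affected
// set until no new dependents show up (a fixed-point walk).
function affectedApps(changed) {
  const affected = new Set(changed);
  let grew = true;
  while (grew) {
    grew = false;
    for (const [app, deps] of Object.entries(dependsOn)) {
      if (!affected.has(app) && deps.some((d) => affected.has(d))) {
        affected.add(app);
        grew = true;
      }
    }
  }
  return [...affected];
}

console.log(affectedApps(["auth"])); // [ 'auth', 'order', 'payment' ]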

Implementing Selective Builds with Build Tools

Another strategy is to use build tools that support selective builds, such as Bazel or Lerna. These tools allow developers to define build targets for specific applications and only rebuild those targets when changes are detected.
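
For illustration, Lerna can list the packages that changed since the last release, and Bazel can query the reverse dependencies of a target; the exact invocations depend on your workspace layout, so treat these as rough sketches rather than copy-paste recipes:

# Lerna: list local packages changed since the last tagged release
npx lerna changed

# Bazel: every target that (transitively) depends on //auth
bazel query "rdeps(//..., //auth/...)"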

Leveraging CI/CD Pipelines for Optimized Builds

The third approach to selective builds is to employ CI/CD pipelines to optimize builds in a monorepo by triggering builds only for the applications that have changed.

By setting up automated pipelines that monitor changes in the repository, developers can ensure that only the necessary applications are rebuilt.

This is the approach we will explore in this article, implementing an efficient CI/CD pipeline that will trigger the build for only the applications that have changed since the last build.

Selective Builds in Practice

We have done a lot of talking so far. Let's get hands-on a bit. 🤓

Imagine having a monorepo with the following code structure:

monorepo/
├── auth/
├── inventory/
├── order/
└── payment/

We need a way to find out which applications in the repository have changed. As explained earlier, there are different ways to do this.

To keep things simple, we will follow a naive approach: an application is considered changed when the contents of the files within its directory change.

If we can somehow map the current contents of the files down to a single unique string (i.e., a hash), we can compare the future state of the files within that directory against the earlier computed hash and determine whether the application needs a rebuild.

Chances are, collisions are close to nonexistent, so we won't have false negatives, i.e., skipping the build of an application that actually needs one. 🤞

flowchart LR
    subgraph Monorepo
      Auth
      Inventory
      Order
      Payment
    end
    subgraph Hashes
      abcd1234
      efgh1234
      ijkl1234
      mnop1234
    end
    Auth --> abcd1234
    Inventory --> efgh1234
    Order --> ijkl1234
    Payment --> mnop1234

With the hashes collected in our first run, running the same hash function on any future push to the repository should either:

  1. Generate the same hash output, meaning no changes were made to the files within the application directory, 🔒 or
  2. Generate a different hash output, meaning changes were made to the files and we need a rebuild of that application to reflect the new state of the app. 🔀

Determine Current State for Selective Builds

What if there were a way to map our directory's contents to a hash?

That's the question we'll cover in this section: computing a single, finite string that represents the current state of the files within a directory.

This function has to have knowledge of the contents within each of the files in that directory because any future changes to those files should change the output of this hash function.

Incidentally, the hash function is a one-way function, meaning we can't reverse-engineer the contents of the files from the hash output. This may or may not be a compliance requirement for your organization, yet it is good to know it already holds if the need arises.

# Steps: (1) list the top-level directories, (2) hash every file in each
# directory and reduce those file hashes to one digest per directory,
# (3) hash the per-directory lines into a single digest, and
# (4) let awk strip the trailing "-" that sha256sum appends.
find . -maxdepth 1 -type d -print0 | \
   while IFS= read -r -d '' dir; do
     # Hash each file, then hash the list of file hashes (one digest per dir).
     find "$dir" -type f -exec sha256sum {} \; | \
     sha256sum - | \
     awk '{print $1}' | \
     xargs -I {} echo "$dir: {}"
   done | \
   sha256sum - | \
   awk '{print $1}'

There are four steps happening in this hash function. Let's break it down:

1⃣ We first find all the directories at the root of the current working directory and pass that to the next step.

2⃣ For each of the found directories, we calculate the SHA256 hash of the contents of the files within that directory and pass it to the next pipe.

Still within step 2⃣, the next pipe gathers all the per-file hashes of each directory and reduces them to another single SHA256 hash per directory.

Here's a visual diagram of everything that's happened so far.

flowchart LR
    A["./auth/\n├── Dockerfile\n└── main.py"] -->|the hash function| B[abcd1234]
    style A text-align:start
    style B text-align:start

And here's a sample output of the script:

./:          0bb5ea223d6c1d8a01694e98d1e8c5da12eb7e45d7530276d85a77a2466779d1
./order:     58013061b734d86cb1794a7d0542db559b4dd4c55201a7cb6cd1c8a331ce1180
./auth:      ad39a5acefb8599684aa7761c3713c5bf4a67e217f0b00514abe19af58d74668
./payment:   50addb41c31a91220677a981eb1ac9b818c024d066997ad59f3a61ca0fa65aed
./inventory: 6e061f94f199396d210ce2be5ffc9ec88a87bd78d5b1895e9055791fdf34944e

3⃣ We grab all these per-directory lines and run them through our SHA256 hash function once more to get a single hash output. This is the digest of the entire tree, which we'll use later to compare the current state against any future changes. Here's a sample output:

2c3d76bc29e8f1123c2e07b85d4f7796c3516604f1558cfd0c549aa255f92990  -

4⃣ As you can see in the last step's output, there is a redundant - at the end. The awk in the final step removes it, leaving us with only the hash output.

The output of the last step is our final value for getting a unique hash string for an entire directory of files. We'll use this in our following steps.

Now that we have the idea, let's put it into a script. To make the process programmatic, we'll use JavaScript, both for its flexibility and because it is natively supported in GitHub Actions.

The following snippet is probably not the best in class; yet, as there are usually multiple ways to get the job done in programming languages, let's just see it to completion for now.

index.js
import fs from "fs";
import path from "path";
import { createHash } from "crypto";
import core from "@actions/core";
import { createClient } from "redis";

function calculateFileHash(filePath) {
  var fileBuffer = fs.readFileSync(filePath);
  var hashSum = createHash("sha256");
  hashSum.update(fileBuffer);
  return hashSum.digest("hex");
}

We will now store the hash for all the directories of a given root path.

index.js
function* findFiles(directory) {
  var items = fs.readdirSync(directory);

  for (var item of items) {
    var fullPath = path.join(directory, item);
    if (fs.statSync(fullPath).isDirectory()) {
      yield* findFiles(fullPath);
    } else {
      yield fullPath;
    }
  }
}

function calculateDirectoryHash(directory) {
  var hashSum = createHash("sha256");

  for (var file of findFiles(directory)) {
    var fileHash = calculateFileHash(file);
    hashSum.update(fileHash);
  }

  return hashSum.digest("hex");
}

function calculateAllHashes(appRootPath) {
  var applications = fs.readdirSync(appRootPath).filter(function isDir(file) {
    return fs.statSync(`${appRootPath}/${file}`).isDirectory();
  });

  var directoryHashes = {};

  applications.forEach(function hashDir(appDir) {
    var rootPath = appRootPath.replace(/\/$/, "");
    directoryHashes[`${rootPath}/${appDir}`] = calculateDirectoryHash(
      `${rootPath}/${appDir}`
    );
  });

  return directoryHashes;
}
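
As a quick sanity check, running it against the repository root should yield one digest per application directory, something along these lines (hash values illustrative, borrowed from the shell output above):

console.log(calculateAllHashes("."));
// {
//   "./auth": "ad39a5ac...",
//   "./inventory": "6e061f94...",
//   "./order": "58013061...",
//   "./payment": "50addb41..."
// }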

Comparison for Selective Builds on Changes

In our naive approach, we will consider an application changed if any of the files within it has changed contents, e.g. through additions or deletions.

To be able to determine the change, we need to store the "state" somewhere, that is, the hash function output of previous runs. That's how we'll later be able to compare the hashes and decide if a rebuild is needed.

Furthermore, for our datastore, among many available options, we'll pick Redis for its simplicity and ease of use.
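
Concretely, everything will live in a single Redis hash: the field is the application directory, the value its digest. Here's a rough sketch of the round-trip, assuming a connected node-redis client like the one our entrypoint creates later:

// Write one field per app, then read the whole hash back.
await store.hSet("app-caches", { "./auth": "abcd1234" });
var hashes = await store.hGetAll("app-caches");
// hashes => { './auth': 'abcd1234' }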

index.js
async function getCurrentAppHashes(store, storeKey) {
  return await store.hGetAll(storeKey);
}

function compareHashes(oldHashes, newHashes) {
  if (!oldHashes) {
    return Object.keys(newHashes);
  }

  var changedApps = [];
  for (var app in newHashes) {
    if (!oldHashes[app] || oldHashes[app] != newHashes[app]) {
      changedApps.push(app);
    }
  }
  return changedApps;
}
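
For example, if only the auth application's digest differs from the stored state, only that directory comes back marked for rebuild:

var oldHashes = { "./auth": "abcd1234", "./order": "ijkl1234" };
var newHashes = { "./auth": "dcba4321", "./order": "ijkl1234" };
console.log(compareHashes(oldHashes, newHashes)); // [ './auth' ]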

Mark Changed Services for Rebuild

At this stage we're ready to put it all together. We'll use the code from the previous sections to determine the changes in the repository and mark an application for rebuild if the contents of any file in its directory have changed.

Since this script will be used in a GitHub Actions workflow, we should write the output to the special temporary file whose path is exposed in the $GITHUB_OUTPUT environment variable.
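
Under the hood, core.setOutput appends key=value lines to that file, the same thing a shell step would do by hand:

# Shell equivalent of the two core.setOutput calls below:
echo 'matrix={"directory": ["./auth"]}' >> "$GITHUB_OUTPUT"
echo 'length=1' >> "$GITHUB_OUTPUT"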

index.js
async function markChanges(store, newHashes, storeKey) {
  var oldHashes = await getCurrentAppHashes(store, storeKey);
  return compareHashes(oldHashes, newHashes);
}

function githubOutput(changedApps) {
  var numChangedApps = changedApps.length;

  var stringifyApps = JSON.stringify({ directory: changedApps });

  core.info(`Changed apps: ${stringifyApps}`);
  core.info(`Number of changed apps: ${numChangedApps}`);

  // e.g. matrix: '{"directory": ["./auth"]}'
  core.setOutput("matrix", stringifyApps);
  // e.g. length: '1'
  core.setOutput("length", numChangedApps);
}

async function mark(store, newHashes, storeKey) {
  var changedApps = await markChanges(store, newHashes, storeKey);

  githubOutput(changedApps);
}

JavaScript GitHub Actions

To run this script as a custom action, we provide the required action.yml file so it can be referenced from a GitHub Actions workflow.

action.yml
name: Selective Builds
description: Calculate the changed services comparing the contents of files to the previous run.
author: Meysam Azad
branding:
  icon: sliders
  color: orange
inputs:
  path:
    description: The directory to look for applications
    required: true
    default: "."
  redis-host:
    description: The host of the redis server
    required: true
  redis-port:
    description: The port of the redis server
    required: false
    default: "6379"
  redis-password:
    description: The password of the redis server
    required: true
  redis-ssl:
    description: Whether to use SSL for the redis connection
    required: false
    default: "false"
  mode:
    description: Whether to capture the changes or to submit them to datastore (mark|submit)
    required: false
    default: "mark"
  exclusions:
    description: A newline-separated list of patterns to exclude when finding apps
    required: false
    default: ""
  store-key:
    description: The key to store the changes in the redis server
    required: false
    default: app-caches
outputs:
  matrix:
    description: A JSON-serialized list of application directories that need a rebuild.
  length:
    description: The number of applications that need a rebuild.
runs:
  using: node20
  main: dist/index.js
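
One practical note: runs.main points at dist/index.js, not our index.js directly. JavaScript actions are expected to ship a self-contained bundle; a common way to produce one (an assumed build step, not shown in this repository) is @vercel/ncc:

# Bundle index.js and its node_modules into a single dist/index.js
npx @vercel/ncc build index.js -o dist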

For the Redis server, we'll use the free tier of Upstash, a managed Redis service.

DISCLOSURE

This post is NOT sponsored by Upstash. I'm just a happy user of their services.

GitHub Actions Workflow for Selective Builds

We are now ready to take this script and use it in our CI/CD pipeline. We have already covered how to use a dynamic matrix in GitHub Actions in a previous post, so we'll use that knowledge to build our pipeline.

In our monorepo, we can define a GitHub Actions workflow that has three jobs:

  1. The first job will walk the first depth of directories, read the contents of each directory/app, and calculate the hash of those contents. The output of this first job is the list of directories whose contents have changed since the last run and are in need of a rebuild.
  2. The second job will use the technique of GitHub Actions dynamic matrix to run the build for only the applications that have changed.
  3. The third job will wait for the completion of the build step, and once done successfully, it will update the Redis server with the new hash values for the directories.

Here's what each of the three jobs will look like in a sample monorepo:

.github/workflows/ci.yml
name: ci

on:
  push:
    branches:
      - main
  schedule:
    - cron: "0 0 * * *"

jobs:
  prepare:
    runs-on: ubuntu-latest
    permissions:
      contents: read
    outputs:
      matrix: ${{ steps.matrix.outputs.matrix }}
      length: ${{ steps.matrix.outputs.length }}
    steps:
      - uses: actions/checkout@v4
        name: Checkout
      - id: matrix
        name: Discover changed services
        uses: developer-friendly/selective-builds-actions@v1
        with:
          redis-host: ${{ secrets.REDIS_HOST }}
          redis-port: ${{ secrets.REDIS_PORT }}
          redis-password: ${{ secrets.REDIS_PASSWORD }}
          redis-ssl: ${{ secrets.REDIS_SSL }}
          exclusions: |
            .git
            .github

  build:
    needs: prepare
    runs-on: ubuntu-latest
    if: needs.prepare.outputs.length > 0
    strategy:
      fail-fast: false
      matrix: ${{ fromJson(needs.prepare.outputs.matrix) }}
    permissions:
      contents: read
      packages: write
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Login to GitHub Container Registry
        uses: docker/login-action@v3
        with:
          password: ${{ secrets.GITHUB_TOKEN }}
          registry: ghcr.io
          username: ${{ github.actor }}
      - name: Pre-process image name
        id: image-name
        run: |
          name=$(echo ${{ matrix.directory }} | sed 's/.*\///')
          echo "name=$name" >> $GITHUB_OUTPUT
      - id: meta
        name: Docker metadata
        uses: docker/metadata-action@v5
        with:
          images: ghcr.io/${{ github.repository }}/${{ steps.image-name.outputs.name }}
      - id: build-push
        name: Build and push
        uses: docker/build-push-action@v6
        with:
          cache-from: type=gha
          cache-to: type=gha,mode=max
          context: ${{ matrix.directory }}
          labels: ${{ steps.meta.outputs.labels }}
          platforms: linux/amd64,linux/arm64
          push: true
          tags: |
            ${{ steps.meta.outputs.tags }}

  finalize:
    runs-on: ubuntu-latest
    needs: build
    permissions:
      contents: read
    steps:
      - uses: actions/checkout@v4
        name: Checkout
      - id: matrix
        name: Discover changed services
        uses: developer-friendly/selective-builds-actions@v1
        with:
          redis-host: ${{ secrets.REDIS_HOST }}
          redis-port: ${{ secrets.REDIS_PORT }}
          redis-password: ${{ secrets.REDIS_PASSWORD }}
          redis-ssl: ${{ secrets.REDIS_SSL }}
          mode: submit
          exclusions: |
            .git
            .github

Notice a few important points in the workflow:

  • There are two triggers for this CI/CD workflow. One fires on pushes to main, the default branch of the repository, understandably so, since we want to tie our live state to the HEAD of the repository. The other is scheduled to run daily; it exists to keep the Redis instance active so Upstash doesn't remove a seemingly unused instance as part of their resource management.
  • This workflow will run the three jobs in sequential order, waiting for the completion of one before starting the other. However, the build of the applications will be in parallel, thanks to the dynamic matrix feature of GitHub Actions.
  • The second job, named build, expects two outputs from the first job. It is important to highlight that each output value is a string, while strategy.matrix expects an object; that's why we pass it through fromJson. Deserialized, it amounts to something like this:
    strategy:
      matrix:
        directory:
          - ./auth
          - ./inventory
          - ./order
          - ./payment
    
  • There are ways we could've avoided writing three separate jobs for the same requirement; however, if we want to leverage GitHub Actions' parallel builds, we have to split the work into separate jobs: one for identifying changes, one for the actual build, and the last one to update the Redis server with the new hashes.

NOTE: As smart as it may sound to merge the last two jobs, doing so would result in duplicate work, with every parallel build leg re-submitting and overwriting the same hashes in the Redis server. This may or may not be an issue for your use case, yet it is clearly redundant and not a good idea!

Save the Hashes of the Changes to the Upstash Redis Server

To be able to compare the current state of the repo against old states later, we have to write the new hashes at the end of every workflow run, after the build.

To do that in our JavaScript code, we'll simply use the Redis API.

index.js
async function submit(store, newHashes, storeKey) {
  await store.hSet(storeKey, newHashes);
}
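
One caveat: hSet only adds or overwrites fields, so an application directory that gets deleted from the monorepo leaves a stale field behind in Redis. If that matters to you, clearing the key first, as the Python version at the end of this post does, is a one-line addition:

async function submit(store, newHashes, storeKey) {
  await store.del(storeKey); // drop stale fields for removed apps
  await store.hSet(storeKey, newHashes);
}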

Entrypoint to the JavaScript GitHub Actions

The main entrypoint of this script, fetching all the inputs and producing the expected outputs, is as follows. Notice the heavy usage of the GitHub Actions SDK (@actions/core) to access the inputs.

index.js
try {
  var host = core.getInput("redis-host");
  var port = core.getInput("redis-port");
  var password = core.getInput("redis-password");
  var tls = core.getBooleanInput("redis-ssl");
  var mode = core.getInput("mode");
  var appRootPath = core.getInput("path");
  var exclusions = core.getMultilineInput("exclusions");
  var storeKey = core.getInput("store-key");

  core.info(`Mode: ${mode}`);
  core.info(`App root path: ${appRootPath}`);
  core.info(`Exclusions: ${exclusions}`);
  core.info(`Store key: ${storeKey}`);

  var store = createClient({
    username: "default",
    password,
    socket: {
      host,
      port: Number(port), // inputs always arrive as strings
      tls,
    },
  });
  await store.connect();
  var ping = await store.ping();

  core.info(`Redis ping: ${ping}`);

  var newHashes = calculateAllHashes(appRootPath);

  core.info(`New hashes: ${JSON.stringify(newHashes)}`);

  newHashes = Object.fromEntries(
    Object.entries(newHashes).filter(function getInclusions([key]) {
      return !exclusions.some(function isExcluded(exclusion) {
        return key.includes(exclusion);
      });
    })
  );

  core.info(`New hashes after exclusions: ${JSON.stringify(newHashes)}`);

  if (mode == "mark") {
    await mark(store, newHashes, storeKey);
  } else if (mode == "submit") {
    await submit(store, newHashes, storeKey);
  }
} catch (error) {
  core.setFailed(error.message);
  core.error(error.stack);
} finally {
  // Guard against a failed connection leaving `store` undefined.
  if (store && store.isOpen) await store.quit();
}

The first run of this script will trigger a build on all the directories as expected, since the Redis server doesn't have any previous state to compare against.

Build All Applications on First Run

As soon as the first run is completed, the Redis server will have the hashes of the directories stored. This will allow the script to compare the current state against the previous state in the next run.

Redis Stored Hashes

As a result, any future change to a single application will trigger a build for that app only, instead of an expensive and unnecessary full rebuild of all the applications.

Selective Build on Changes

Considerations

While the proposed method works great for some teams and processes, it's good to be aware of the following considerations:

  • Build Process May Vary: Some applications may have different build processes, dependencies, or requirements. In such cases, a selective build approach may not be suitable for all applications in the monorepo. The proposed method uses the same build pipeline for all applications, which may not be ideal for every use case.
  • Human Collaboration: As much as the proposed method is a streamlined approach to building projects and producing artifacts, it is important to highlight that no amount of technology can replace the chemistry and collaboration between team members. If one of the teams decides to change the build of a specific application, the end result may break every other app's build process!

It's best to tackle this in consultation with your team members. If the majority of the team is happy with the process in place, overall productivity will benefit!

Conclusion

Here's a crisp recap of what we covered in this blog post:

  • Selective builds in monorepos improve CI/CD efficiency
  • Hash functions identify changed applications for targeted rebuilds
  • JavaScript and GitHub Actions automate the process
  • Redis stores hashes to track repository state
  • Benefits include faster build times, reduced rebuilds, and improved productivity

To close up, consider implementing selective builds with JavaScript, GitHub Actions, and Redis to optimize your monorepo's development workflow, especially for large codebases with multiple applications.

Further Reading

If you want to find out more about monorepo and how other players in the industry are using it, here are some resources to check out:

  1. Microsoft DevOps Blog - Insights into using Azure DevOps for monorepo management and selective builds.

  2. GitHub Blog - Articles on monorepo strategies and build optimization.

  3. LinkedIn Engineering Blog - LinkedIn's practices for managing large codebases with monorepos.

  4. Facebook Engineering Blog - Facebook's experiences with monorepos and incremental builds.

  5. Atlassian Developer Blog - Insights into monorepo architecture and efficient build practices.

  6. ThoughtWorks Insights - Articles on continuous integration, deployment, and monorepo strategies.

  7. Medium Articles - Community-driven insights on monorepos and selective builds.

  8. Stack Overflow - Discussions and Q&A on monorepo best practices.

  9. Guide to Monorepos for Front-end Code by Toptal

  10. Monorepos in Git by Atlassian

Bonus: Python Equivalent Script

If you have made it thus far, you deserve a goodie! 🍬

Here is the equivalent Python script for the JavaScript script we have discussed so far. It does the same thing, albeit in Python.

main.py
# -*- coding: utf-8 -*-
import hashlib
import json
import os
import subprocess
import sys

import redis

REDIS_HOST = os.environ["REDIS_HOST"]
REDIS_PORT = int(os.getenv("REDIS_PORT", "6379"))
REDIS_PASSWORD = os.getenv("REDIS_PASSWORD")
REDIS_SSL = os.getenv("REDIS_SSL", "false") == "true"


def calculate_directory_hash(directory) -> str:
    output = subprocess.check_output(
        ["find", directory, "-type", "f", "-exec", "sha256sum", "{}", ";"],
    )
    return hashlib.sha256(output).hexdigest()


def calculate_all_hashes(app_root_path) -> dict:
    applications = []
    for app_dir in os.scandir(app_root_path):
        if app_dir.is_dir():
            applications.append(app_dir.path)

    directory_hashes = {}

    for app_dir in applications:
        directory_hashes[app_dir] = calculate_directory_hash(app_dir)

    return directory_hashes


def get_current_app_hashes(store: redis.Redis, store_key: str) -> dict:
    return store.hgetall(store_key)


def compare_hashes(old_hashes: dict, new_hashes: dict) -> list[str]:
    changed_apps = []
    for app, new_hash in new_hashes.items():
        if old_hashes.get(app) != new_hash:
            changed_apps.append(app)
    return changed_apps


def mark_changes(store: redis.Redis, new_hashes: dict, store_key: str):
    old_hashes = get_current_app_hashes(store, store_key)
    changed_apps = compare_hashes(old_hashes, new_hashes)
    return changed_apps


def github_output(changed_apps: list[str]):
    num_changed_apps = len(changed_apps)

    github_output_file = os.environ["GITHUB_OUTPUT"]

    # Mirror the JavaScript action: a JSON-serialized matrix plus its length.
    with open(github_output_file, "a") as f:
        f.write(f"matrix={json.dumps({'directory': changed_apps})}\n")
        f.write(f"length={num_changed_apps}\n")


def write_changed_hashes(store: redis.Redis, new_hashes: dict, store_key: str):
    # Clear the key first so removed apps don't leave stale fields behind.
    store.delete(store_key)
    store.hset(store_key, mapping=new_hashes)


if __name__ == "__main__":
    store = redis.Redis(
        host=REDIS_HOST,
        port=REDIS_PORT,
        password=REDIS_PASSWORD,
        ssl=REDIS_SSL,
        # Return str instead of bytes so stored hashes compare cleanly.
        decode_responses=True,
    )
    store_key = "app-hashes"

    if len(sys.argv) > 2:
        app_root_path = sys.argv[2]
    else:
        app_root_path = "."

    new_hashes = calculate_all_hashes(app_root_path)
    changed_apps = mark_changes(store, new_hashes, store_key)

    match sys.argv[1]:
        case "mark":
            github_output(changed_apps)
        case "submit":
            write_changed_hashes(store, new_hashes, store_key)
        case _:
            raise ValueError(f"Unknown action: {sys.argv[1]}")
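
Judging by the __main__ block, the script takes the mode (mark or submit) as its first argument and an optional applications root path as the second, with the Redis connection configured through environment variables. A local invocation could look something like this (values are placeholders):

export REDIS_HOST=example-12345.upstash.io
export REDIS_PASSWORD='<your-password>'
export GITHUB_OUTPUT=/tmp/github-output  # the "mark" mode writes here
python main.py mark .    # print the changed apps to $GITHUB_OUTPUT
python main.py submit .  # store the new hashes in Redis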

Happy hacking and until next time 🫡, ciao. 🐧 🦀

If you enjoyed this blog post, consider sharing it with these buttons 👇. Please leave a comment for us at the end; we read & love 'em all. ❣
