While you could tackle the installation manually on every machine, if you have AWX (or Semaphore, or plain Ansible) deployed in your home lab, you can create a playbook and roll it out to all your hosts at once, which is elegant and future-proof.
- If you do not have such automation in place and find the idea interesting, why not set up AWX using my previous guide? See Deploy Ansible AWX to automate OS patching.
- For AWX, you will need your own source control repo to host the playbook. I’m using a locally hosted Gitea.
Playbook structure
Here is the structure of the repo; it follows standard Ansible conventions for role-based playbooks. It may feel like a lot of files and folders, but each one has a distinct purpose and is kept as short as possible, so do not feel discouraged. Rather, let’s dive into each.
alloy-deploy/
├── playbook.yml
├── inventory/
│   └── group_vars/
│       └── all.yml
├── roles/
│   └── alloy/
│       ├── tasks/
│       │   └── main.yml
│       ├── templates/
│       │   └── config.alloy.j2
│       ├── handlers/
│       │   └── main.yml
│       └── files/
│           └── alloy.service
Initial playbook
In this playbook, we confirm the OS family of the host being processed.
- If it is one of the supported families, the alloy role is applied and the tasks under the roles/alloy/ folder are triggered.
- If the OS family is not supported (or the host is excluded), the playbook reports it and does not proceed further for that host.
- Replace the monitoring_server IP with the IP of the VM on which you installed the Docker containers.
# playbook.yml
---
- name: Deploy Grafana Alloy monitoring agent
  hosts: all
  become: true

  vars:
    monitoring_server: "1.2.3.4"
    alloy_version: "1.8.2"

  pre_tasks:
    - name: Gather OS facts if not already present
      ansible.builtin.setup:
        gather_subset:
          - '!all'
          - 'min'  # the minimal subset already includes ansible_os_family
      when: ansible_os_family is not defined

    - name: Check if host is supported
      ansible.builtin.set_fact:
        alloy_supported: "{{ ansible_os_family == 'Debian' and 'no_monitoring' not in group_names }}"

    - name: Skip unsupported or excluded hosts
      ansible.builtin.debug:
        msg: >-
          Skipping {{ inventory_hostname }} -
          {% if 'no_monitoring' in group_names %}excluded via no_monitoring group
          {% else %}OS family {{ ansible_os_family }} not supported{% endif %}
      when: not alloy_supported

  roles:
    - role: alloy
      when: alloy_supported
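For reference, the pre-check above relies on inventory groups (no_monitoring here, plus site1/site2/site3 and docker later in the role). Here is a minimal sketch of a matching inventory – all hostnames, IPs, and group members below are placeholders for illustration:

# inventory/hosts.yml (illustrative example – hostnames and IPs are placeholders)
all:
  children:
    site1:
      hosts:
        web01:
          ansible_host: web01.home.lab
    docker:
      hosts:
        web01:
    no_monitoring:
      hosts:
        legacybox:
          ansible_host: 10.0.1.99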
The main tasks
After the pre-check has passed, the main tasks file of the role is executed.
- The variable host_site is derived from group membership – if you do not have such groups defined, either remove this task or modify it to your liking.
- The tasks verify whether an installation of the agent is needed at all; if yes, the release archive is downloaded and unzipped, the binary is installed with the correct permissions, and Alloy is added to systemd so it is managed as a service (daemon) and starts at boot. Lastly, the downloaded archive and the extracted files are deleted.
# roles/alloy/tasks/main.yml
---
- name: Check if Alloy is already installed
  ansible.builtin.stat:
    path: /usr/local/bin/alloy
  register: alloy_binary

- name: Get installed Alloy version
  ansible.builtin.command: /usr/local/bin/alloy --version
  register: alloy_installed_version
  changed_when: false
  failed_when: false
  when: alloy_binary.stat.exists

- name: Set install required fact
  ansible.builtin.set_fact:
    alloy_install_required: "{{ not alloy_binary.stat.exists or (alloy_version not in (alloy_installed_version.stdout | default(''))) }}"

- name: Install dependencies
  ansible.builtin.apt:
    name:
      - unzip
      - curl
    state: present
    update_cache: true
  when: alloy_install_required

- name: Download Alloy
  ansible.builtin.get_url:
    url: "https://github.com/grafana/alloy/releases/download/v{{ alloy_version }}/alloy-linux-amd64.zip"
    dest: "/tmp/alloy-linux-amd64.zip"
    mode: '0644'
  when: alloy_install_required

- name: Extract Alloy binary
  ansible.builtin.unarchive:
    src: "/tmp/alloy-linux-amd64.zip"
    dest: "/tmp/"
    remote_src: true
  when: alloy_install_required

- name: Install Alloy binary
  ansible.builtin.copy:
    src: "/tmp/alloy-linux-amd64"
    dest: "/usr/local/bin/alloy"
    mode: '0755'
    remote_src: true
  when: alloy_install_required
  notify: Restart Alloy

- name: Create config directory
  ansible.builtin.file:
    path: /etc/alloy
    state: directory
    mode: '0755'

- name: Create data directory
  ansible.builtin.file:
    path: /var/lib/alloy
    state: directory
    mode: '0755'

- name: Determine site from group membership
  ansible.builtin.set_fact:
    host_site: >-
      {%- if 'site1' in group_names -%}site1
      {%- elif 'site2' in group_names -%}site2
      {%- elif 'site3' in group_names -%}site3
      {%- else -%}unknown
      {%- endif -%}

- name: Deploy Alloy configuration
  ansible.builtin.template:
    src: config.alloy.j2
    dest: /etc/alloy/config.alloy
    mode: '0644'
  notify: Restart Alloy

- name: Deploy systemd service
  ansible.builtin.copy:
    src: alloy.service
    dest: /etc/systemd/system/alloy.service
    mode: '0644'
  notify:
    - Reload systemd
    - Restart Alloy

- name: Enable and start Alloy
  ansible.builtin.systemd:
    name: alloy
    enabled: true
    state: started
    daemon_reload: true

- name: Clean up downloaded files
  ansible.builtin.file:
    path: "{{ item }}"
    state: absent
  loop:
    - /tmp/alloy-linux-amd64.zip
    - /tmp/alloy-linux-amd64
  when: alloy_install_required
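Since every install step is guarded by the alloy_install_required fact, re-running the role on an up-to-date host should report no changes. If you want to test outside AWX first, a check-mode dry run against a single host is a cheap sanity check – a sketch, assuming the placeholder inventory path and host name from earlier:

# dry run against one host; nothing is changed, proposed diffs are printed
ansible-playbook -i inventory/hosts.yml playbook.yml --check --diff --limit web01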
Deploy the config file for Alloy
Apart from installing Alloy on each host, we also want to deploy a config file so that the agent knows where the Prometheus server is and which hostname Loki should use (we derive it from the host’s name in Ansible).
Apart from the host_site variable defined in the previous tasks file, there is one more variable that needs to be defined for each host:
- ansible_host – either the IP or the FQDN. Note that the template derives the short instance label by splitting this value on dots, so an FQDN yields a neat short name, while a bare IP would be truncated to its first octet.
If you do not have it defined, check out my previous tutorial, try adding it, or adjust the script below – whatever works for you 😇
Feel free to add or remove services per your own environment.
See below for the content of the config.alloy.j2 file:
// Grafana Alloy Configuration
// Managed by Ansible - do not edit manually
// Host: {{ ansible_host }}
// Site: {{ host_site }}
prometheus.exporter.unix "local" {
enable_collectors = ["cpu", "diskstats", "filesystem", "loadavg", "meminfo", "netdev", "systemd", "pressure"]
systemd {
enable_restarts = true
unit_include = "(mariadb|mysql|nginx|apache2|docker|sshd|alloy|proxmox-backup-proxy|pveproxy|pvedaemon|corosync|gitea|postfix|dovecot|fail2ban|syncthing@.*)\\.service"
}
}
prometheus.exporter.process "default" {
matcher {
{% raw %}
name = "{{.Comm}}"
{% endraw %}
cmdline = [".+"]
}
}
discovery.relabel "unix" {
targets = prometheus.exporter.unix.local.targets
rule {
target_label = "instance"
replacement = "{{ ansible_host.split('.')[0] }}"
}
}
discovery.relabel "process" {
targets = prometheus.exporter.process.default.targets
rule {
target_label = "instance"
replacement = "{{ ansible_host.split('.')[0] }}"
}
}
prometheus.scrape "unix" {
targets = discovery.relabel.unix.output
forward_to = [prometheus.remote_write.default.receiver]
scrape_interval = "30s"
job_name = "integrations/unix"
}
prometheus.scrape "process" {
targets = discovery.relabel.process.output
forward_to = [prometheus.remote_write.default.receiver]
scrape_interval = "30s"
job_name = "integrations/process"
}
{% if 'docker' in group_names %}
prometheus.exporter.cadvisor "docker" {
docker_host = "unix:///var/run/docker.sock"
docker_only = true
}
discovery.relabel "docker" {
targets = prometheus.exporter.cadvisor.docker.targets
rule {
target_label = "instance"
replacement = "{{ ansible_host.split('.')[0] }}"
}
}
prometheus.scrape "docker" {
targets = discovery.relabel.docker.output
forward_to = [prometheus.remote_write.default.receiver]
scrape_interval = "30s"
job_name = "integrations/docker"
}
{% endif %}
prometheus.remote_write "default" {
endpoint {
url = "http://{{ monitoring_server }}:9090/api/v1/write"
}
external_labels = {
host = "{{ ansible_host }}",
site = "{{ host_site }}",
}
}
loki.source.journal "systemd" {
forward_to = [loki.process.add_labels.receiver]
relabel_rules = loki.relabel.journal.rules
labels = { job = "systemd-journal" }
}
loki.relabel "journal" {
forward_to = []
rule {
source_labels = ["__journal__systemd_unit"]
target_label = "unit"
}
rule {
source_labels = ["__journal_priority_keyword"]
target_label = "level"
}
}
loki.source.file "varlogs" {
targets = [
{ __path__ = "/var/log/*.log", job = "varlogs" },
{ __path__ = "/var/log/**/*.log", job = "varlogs" },
]
forward_to = [loki.process.add_labels.receiver]
}
loki.process "add_labels" {
forward_to = [loki.write.default.receiver]
stage.static_labels {
values = {
host = "{{ ansible_host }}",
instance = "{{ ansible_host.split('.')[0] }}",
site = "{{ host_site }}",
}
}
}
loki.write "default" {
endpoint {
url = "http://{{ monitoring_server }}:3100/loki/api/v1/push"
}
}
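If you want to sanity-check the rendered configuration on a host before (re)starting the service, alloy fmt parses the file and exits non-zero on syntax errors – a quick smoke test rather than a full semantic validation:

# parse the rendered config; prints nothing and succeeds if the syntax is valid
/usr/local/bin/alloy fmt /etc/alloy/config.alloy > /dev/null && echo "config parses OK"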
Systemd unit file for Alloy
The content of the alloy.service file used for systemd:
# roles/alloy/files/alloy.service
[Unit]
Description=Grafana Alloy
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/alloy run /etc/alloy/config.alloy --storage.path=/var/lib/alloy
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
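After a run, you can spot-check any host with the standard systemd tooling to confirm the service came up and is logging:

systemctl status alloy --no-pager
journalctl -u alloy -n 20 --no-pager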
Service restart handlers
This short handlers file takes care of reloading the systemd daemon and restarting the alloy service whenever the tasks above notify it.
# roles/alloy/handlers/main.yml
---
- name: Reload systemd
  ansible.builtin.systemd:
    daemon_reload: true

- name: Restart Alloy
  ansible.builtin.systemd:
    name: alloy
    state: restarted
Monitoring server info
This is optional but recommended – these are default variables shared across all hosts (the monitoring server IP and the Alloy agent version).
# inventory/group_vars/all.yml
---
monitoring_server: "your_actual_vm_ip"
alloy_version: "1.8.2"  # Replace this with your current version
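One caveat: play-level vars – the hardcoded block in playbook.yml above – take precedence over group_vars. If you want all.yml to be the single source of truth, drop the vars: block from the play; a minimal sketch:

# playbook.yml – variant that takes its variables from inventory/group_vars/all.yml
---
- name: Deploy Grafana Alloy monitoring agent
  hosts: all
  become: true
  # no vars: block – monitoring_server and alloy_version now come from group_vars
  pre_tasks:
    # ... same pre_tasks and roles as shown earlier ...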
Sync the playbook and add a template
Once all the files are pushed to your preferred source control software, go to Projects and click the sync button so that AWX picks up the newest changes.
Then head to Templates → Add button → Add job template.
- Name: Deploy Alloy agent
- Inventory: your inventory – best to have it all under one unified inventory with all sites
- Project: your source-controlled repo
- Execution Environment: your preferred EE that has at least the ansible.netcommon collection; see my previous guide on how to set up your own EE
- Playbook: choose the playbook we created above
- Credentials: all your hosts need to be reachable via SSH, so choose a matching credential
- Tick the box for Privilege Escalation
Run the job against just one or a few hosts (the Limit field on the job template is handy for this) before running it on the whole fleet. Then watch and enjoy 😇
Ensure data is flowing to Grafana
Let’s confirm that the data is actually present in Grafana.
Go to Explore → select Prometheus → run this query (use the ‘code’ view to simply copy-paste the query below):
count by (instance) (up{job="integrations/unix"})
The result should not surprise you – one series per host, in line with the number of successful job runs in AWX:
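You can run a similar check for logs – switch the data source to Loki and query the journal stream of one of your hosts (the host label value below is a placeholder):

{job="systemd-journal", host="your_host_fqdn"}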
With hosts now reporting in, we can move on to customizing our dashboards to show exactly what we want to see.

