Until recently, two agents were commonly used to collect OS-level metrics and system logs: node_exporter and promtail. However, promtail has now entered long-term support with no active development (see its GitHub repo); instead, Grafana suggests using Alloy.
- If you have been using promtail and this change catches you by surprise, see the SUSE blog post from December 2025 for more background.
- As explained on GitHub, Alloy is an ‘open source OpenTelemetry Collector distribution with built-in Prometheus pipelines and support for metrics, logs, traces, and profiles’. It therefore collects both metrics and logs while still maintaining connectivity with existing Prometheus and Loki backends.
- The one difference with Alloy is that it pushes data to Prometheus (rather than Prometheus pulling from it), so a remote write receiver has to be enabled on the Prometheus container (this was done already in the previous step; a minimal sketch of that flag follows below).
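For reference, enabling that receiver boils down to one flag on the Prometheus command. The sketch below uses a plain docker run with placeholder paths and names; it only illustrates the flag and is not the exact setup from the previous step.
# Minimal sketch: Prometheus started with the remote write receiver enabled
# (container name, port mapping, and config path are placeholders)
docker run -d --name prometheus -p 9090:9090 \
  -v /path/to/prometheus.yml:/etc/prometheus/prometheus.yml \
  prom/prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --web.enable-remote-write-receiver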
Manually deploy on one host
Before we deploy it ‘en masse’ to all hosts, pick a host of your choice and let’s test that the connection works. If it does, we will then transfer the setup into an Ansible playbook.
Note: Replace `YOUR_MONITORING_SERVER_IP`, `YOUR_HOSTNAME`, and `YOUR_SITE` with your actual values in the config below.
# Download the binary
cd /tmp
curl -LO https://github.com/grafana/alloy/releases/download/v1.8.2/alloy-linux-amd64.zip
unzip alloy-linux-amd64.zip
# Install Alloy
sudo mv alloy-linux-amd64 /usr/local/bin/alloy
sudo chmod +x /usr/local/bin/alloy
# Create config directory
sudo mkdir -p /etc/alloy
# Create config file
sudo tee /etc/alloy/config.alloy > /dev/null <<'EOF'
// ============================================
// METRICS: System metrics collection
// ============================================
prometheus.exporter.unix "local" {
enable_collectors = ["cpu", "diskstats", "filesystem", "loadavg", "meminfo", "netdev", "systemd", "pressure"]
systemd {
enable_restarts = true
unit_include = "(mariadb|mysql|nginx|apache2|docker|sshd|alloy|proxmox-backup-proxy|pveproxy|pvedaemon|corosync|gitea|postfix|dovecot|fail2ban|syncthing@.*)\\\\.service"
}
}
prometheus.exporter.process "default" {
matcher {
name = "{{.Comm}}"
cmdline = [".+"]
}
}
discovery.relabel "unix" {
targets = prometheus.exporter.unix.local.targets
rule {
target_label = "instance"
replacement = "YOUR_HOSTNAME"
}
}
discovery.relabel "process" {
targets = prometheus.exporter.process.default.targets
rule {
target_label = "instance"
replacement = "YOUR_HOSTNAME"
}
}
prometheus.scrape "unix" {
targets = discovery.relabel.unix.output
forward_to = [prometheus.remote_write.default.receiver]
scrape_interval = "30s"
job_name = "integrations/unix"
}
prometheus.scrape "process" {
targets = discovery.relabel.process.output
forward_to = [prometheus.remote_write.default.receiver]
scrape_interval = "30s"
job_name = "integrations/process"
}
// ============================================
// REMOTE WRITE: Push metrics to Prometheus
// ============================================
prometheus.remote_write "default" {
endpoint {
url = "http://YOUR_MONITORING_SERVER_IP:9090/api/v1/write"
}
external_labels = {
host = "YOUR_HOSTNAME",
site = "YOUR_SITE",
}
}
// ============================================
// LOGS: Systemd journal collection
// ============================================
loki.source.journal "systemd" {
forward_to = [loki.process.add_labels.receiver]
relabel_rules = loki.relabel.journal.rules
labels = { job = "systemd-journal" }
}
loki.relabel "journal" {
forward_to = []
rule {
source_labels = ["__journal__systemd_unit"]
target_label = "unit"
}
rule {
source_labels = ["__journal_priority_keyword"]
target_label = "level"
}
}
// ============================================
// LOGS: File-based log collection
// ============================================
loki.source.file "varlogs" {
targets = [
{ __path__ = "/var/log/*.log", job = "varlogs" },
{ __path__ = "/var/log/**/*.log", job = "varlogs" },
]
forward_to = [loki.process.add_labels.receiver]
}
// ============================================
// LOGS: Label enrichment + push to Loki
// ============================================
loki.process "add_labels" {
forward_to = [loki.write.default.receiver]
stage.static_labels {
values = {
host = "YOUR_HOSTNAME",
instance = "YOUR_HOSTNAME",
site = "YOUR_SITE",
}
}
}
loki.write "default" {
endpoint {
url = "http://YOUR_MONITORING_SERVER_IP:3100/loki/api/v1/push"
}
}
EOF
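# Optional: fill in the placeholders in one go instead of editing by hand.
# The IP and site values below are examples only; adjust them to your environment.
sudo sed -i \
  -e 's/YOUR_MONITORING_SERVER_IP/192.0.2.10/g' \
  -e "s/YOUR_HOSTNAME/$(hostname)/g" \
  -e 's/YOUR_SITE/home/g' \
  /etc/alloy/config.alloy
# Optional: validate the syntax before starting the service (alloy fmt
# doubles as a syntax check, see the troubleshooting section below)
/usr/local/bin/alloy fmt /etc/alloy/config.alloy > /dev/null && echo "config OK"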
# Create systemd service
sudo tee /etc/systemd/system/alloy.service > /dev/null <<EOF
[Unit]
Description=Grafana Alloy
After=network.target
[Service]
Type=simple
ExecStart=/usr/local/bin/alloy run /etc/alloy/config.alloy --storage.path=/var/lib/alloy
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
EOF
# Create storage directory
sudo mkdir -p /var/lib/alloy
# Start it
sudo systemctl daemon-reload
sudo systemctl enable --now alloy
# Verify
sudo systemctl status alloy
Watch the logs to see if there are any errors (see the troubleshooting section below if you find any).
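To follow the logs live, use journalctl; as a bonus check you can poke Alloy’s built-in HTTP server on its default port 12345 (the /-/ready path is an assumption on my part and may differ between Alloy versions):
# Follow the Alloy service logs live
sudo journalctl -u alloy -f
# Optional: check the built-in HTTP server on the default debug port;
# the /-/ready path is an assumption and may vary between versions
curl -s http://localhost:12345/-/ready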
If all goes well, you will be able to go to Grafana under Drilldown → Metrics, type instance in the label filter (press Enter), select the = (equals) operator, and then type your hostname. This will show graphs for the host you installed the Alloy agent on.
Alternatively, you can go to ‘Explore’ and build your query there from the available options. For example, you can look at the available RAM on your host using the node_memory_MemAvailable_bytes metric. If you prefer to store and copy-paste your queries, click the ‘Code’ button (instead of using the ‘Builder’) and enter the full query, then click the blue button to run it:
# Memory available on your host
node_memory_MemAvailable_bytes{instance="your_hostname"}
# CPU time counters per core and mode (wrap in rate() to get usage)
node_cpu_seconds_total{instance="your_hostname"}
# Disk space free
node_filesystem_avail_bytes{instance="your_hostname"}
# All metrics from your host
{instance="your_hostname"}
For queries such as those related to drive space available versus used, you can change the graph style to ‘Stacked lines’. I recommend playing around in Grafana for a while.
Troubleshooting connectivity with Alloy
Issue 1: Alloy service not starting or reporting errors
Check the logs first:
sudo journalctl -u alloy -f --no-pager
Common causes:
- Config syntax errors: Alloy’s River syntax is strict. Run a syntax check with `/usr/local/bin/alloy fmt /etc/alloy/config.alloy`. If it reformats without errors, the syntax is valid.
- Permission denied: Alloy needs read access to the files under `/var/log/` and to the systemd journal. If running as a non-root user, ensure it has the `systemd-journal` and `adm` groups: `sudo usermod -aG systemd-journal alloy` and `sudo usermod -aG adm alloy`.
- Port already in use: Alloy’s default HTTP debug UI runs on port 12345. If another service uses it, add `--server.http.listen-addr=0.0.0.0:12346` to the ExecStart line in the systemd unit file (see the snippet after this list to check which process holds the port).
- Storage path doesn’t exist: Ensure `/var/lib/alloy` exists and is writable.
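Two quick checks for the last two causes above, assuming ss and standard coreutils are available on the host:
# Which process (if any) is already listening on Alloy's default debug port?
sudo ss -tlnp | grep 12345
# Does the storage path exist, and who owns it?
ls -ld /var/lib/alloy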
Issue 2: Nothing appearing in Grafana
Query Prometheus directly, bypassing Grafana: open the Prometheus web UI on your monitoring server (port 9090) and try the following queries in the query box to see whether it has data for the host:
# Any Unix-based hosts
{job="integrations/unix"}
# Summary of available memory for Unix hosts:
node_memory_MemAvailable_bytes{job="integrations/unix"}
# CPU info for a specific host:
node_cpu_seconds_total{instance="your_host"}
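The same checks can also be run from a shell against the Prometheus HTTP API, which is handy when you are already SSH’d into the monitoring server. The hostname and instance values below are placeholders, and the second query is just one common way of turning the raw CPU counters into a utilisation figure:
# Are any hosts reporting under the unix integration job?
curl -s "http://your_monitoring_server:9090/api/v1/query" \
  --data-urlencode 'query=up{job="integrations/unix"}'
# Rough CPU utilisation for one host over the last 5 minutes
curl -s "http://your_monitoring_server:9090/api/v1/query" \
  --data-urlencode 'query=1 - avg(rate(node_cpu_seconds_total{instance="your_host", mode="idle"}[5m]))'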
Issue 3: No data flowing to Loki
Check if logs are arriving. From the Docker host, run the following:
curl -s "http://your_docker_host_ip:3100/loki/api/v1/labels"
If you see labels like host, job, unit, then logs are arriving. If empty, check:
- Alloy logs for Loki push errors
- Firewall rules between the host and the monitoring server on port 3100
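To rule out Loki itself, you can push a single test log line straight to its API, bypassing Alloy entirely; the job label value below is arbitrary and only there so you can find the line again in Grafana:
# Push one test log line directly to Loki (an empty 204 response means it was accepted)
curl -s -X POST "http://your_docker_host_ip:3100/loki/api/v1/push" \
  -H "Content-Type: application/json" \
  --data-raw "{\"streams\": [{\"stream\": {\"job\": \"curl-test\"}, \"values\": [[\"$(date +%s%N)\", \"hello from curl\"]]}]}"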
Issue 4: Alloy is running but Prometheus shows no data
Verify the remote write endpoint is reachable from the host:
curl -s -o /dev/null -w "%{http_code}" http://your_monitoring_server:9090/api/v1/write
Any HTTP status code in the response means the endpoint is reachable (a plain GET like the one above will typically return 405, since remote write only accepts POST requests); a connection timeout points to a firewall or routing issue.
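You can also ask Alloy itself whether its remote write queue is failing, via the metrics it exposes on its own debug port (12345 by default). The exact metric names vary between versions, so the grep below is deliberately broad:
# Look for failed/retried/dropped samples in Alloy's self-reported remote write metrics
curl -s http://localhost:12345/metrics | grep -iE 'remote_write.*(failed|retried|dropped)'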
Issue 5: Firewall blocking connections
If you use OPNsense, ensure you have firewall rules allowing traffic from your monitored hosts to the monitoring server on ports 9090 (Prometheus), 3100 (Loki), and 3000 (Grafana).
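A quick way to test all three ports from a monitored host, assuming netcat (nc) is installed there:
# Test TCP connectivity to the monitoring server (3-second timeout per port)
for port in 9090 3100 3000; do
  nc -zvw3 your_monitoring_server "$port"
done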
Issue 6: Labels are missing or incorrect in Grafana
If the instance, site, or host labels aren’t appearing in Grafana, check the Alloy config on the affected host:
cat /etc/alloy/config.alloy
Verify the external_labels block in prometheus.remote_write and the stage.static_labels block in loki.process, add any missing labels, and then restart the agent:
sudo systemctl restart alloy
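A quick way to eyeball just those two blocks without scrolling through the whole file (the -A line counts below are a rough guess; widen them if your blocks are longer):
# Show the external_labels block (metrics) and the static_labels stage (logs)
grep -A 4 'external_labels' /etc/alloy/config.alloy
grep -A 6 'stage.static_labels' /etc/alloy/config.alloy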


