Testing Prometheus Metrics In Integration Tests In Golang
Integration tests are the ones that test how our application integrates with external dependencies. To do so, we usually use Docker containers that emulate those services.
Examples of external services are:
- Database
- Third-party API
- Pub/Sub Service
- Mailing Service
- …
For this kind of test I usually check the important outcome that the application produced, for example an HTTP response, a message in a queue, or an email.
I usually avoid checking metrics, since they are meant to help but are not domain requirements, at least in the businesses I’ve worked in so far.
Still, sometimes there is no other way to check the outcome of the flow under test than checking a registered metric. The case I faced recently was a background job that checks the integrity of our data by matching it against a third-party service.
This process is key for the business, since it alerts the developers when we have missed a transaction from the third-party service. Checking the metrics therefore became necessary.
In that case I needed to get the initial state of the metric and then check that it was modified as expected. This lets us work with increments instead of fixed values, which makes the tests more robust.
Get Prometheus values from a server
To get the current metric value of a counter I prepared this function:
import (
	"fmt"
	"io"
	"net/http"
	"regexp"
	"strconv"
	"testing"

	"github.com/stretchr/testify/require"
)

// GetPrometheusCounter returns the current integer value of a counter.
// metricTags is the label list without braces, e.g. `app="myapp"`.
func GetPrometheusCounter(t *testing.T, serverURL, metricName, metricTags string) int {
	// we call the server to get the metrics
	resp, err := http.Get(serverURL + "/metrics")
	require.NoError(t, err)
	defer func() {
		require.NoError(t, resp.Body.Close())
	}()
	// we read all the content of the response body
	body, err := io.ReadAll(resp.Body)
	require.NoError(t, err)
	// we prepare a regex that searches for the counter at the start of a line;
	// name and tags are escaped so they are matched literally
	re := regexp.MustCompile(fmt.Sprintf(`(?m)^%s\{%s\} (\d+)`,
		regexp.QuoteMeta(metricName), regexp.QuoteMeta(metricTags)))
	// we run the regex
	matches := re.FindStringSubmatch(string(body))
	if len(matches) < 2 {
		// a counter that was never incremented may not be exposed yet, so we treat it as 0
		t.Logf("could not find metric %s{%s}", metricName, metricTags)
		return 0
	}
	// we convert the string result to integer
	i, err := strconv.Atoi(matches[1])
	require.NoError(t, err)
	return i
}
With that we can easily check the current value:
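// initial request count, read before exercising the code
irc := GetPrometheusCounter(t, serverURL, "myapp_requests_counter", `app="myapp"`)
// run the code we want to test here
// current request count, read afterwards
crc := GetPrometheusCounter(t, serverURL, "myapp_requests_counter", `app="myapp"`)
// we assert on the increment, not on an absolute value
require.Equal(t, 2, crc-irc)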
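One caveat with the `(\d+)` capture: sample values in the text exposition format are floats, so a large counter can be rendered in exponent notation such as `1e+06`, and an integer-only pattern would then read just the leading `1`. The following is a minimal sketch of a float-aware parser, not part of the original helper. `ParseCounterValue` is a hypothetical name, it works on the already-downloaded body, and it assumes the labels are passed without braces and in the same order as the server prints them.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// ParseCounterValue is a hypothetical helper. It looks for a sample line like
//
//	myapp_requests_counter{app="myapp"} 1e+06
//
// in a text exposition body and returns its value as a float64.
// found is false when no matching sample line exists.
func ParseCounterValue(body, metricName, metricLabels string) (value float64, found bool, err error) {
	// Name and labels are escaped, so the pattern is always valid
	// and both are matched literally at the start of a line.
	pattern := fmt.Sprintf(`(?m)^%s\{%s\} (\S+)`,
		regexp.QuoteMeta(metricName), regexp.QuoteMeta(metricLabels))
	re := regexp.MustCompile(pattern)

	matches := re.FindStringSubmatch(body)
	if len(matches) < 2 {
		return 0, false, nil
	}
	// ParseFloat accepts both plain integers ("2") and exponent notation ("1e+06").
	value, err = strconv.ParseFloat(matches[1], 64)
	if err != nil {
		return 0, false, fmt.Errorf("parsing value %q of %s: %w", matches[1], metricName, err)
	}
	return value, true, nil
}
```

In a test, the returned float can be compared as an increment in the same way as the integer version, for example by asserting that `after-before` equals `2`.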
Using prometheus/client_golang/prometheus/testutil
With this function we can already write a complete test, as we’ve seen in the previous example.
While looking into how to do that, I found the package github.com/prometheus/client_golang/prometheus/testutil, which contains some helpers to check Prometheus metrics.
In our case testutil.ScrapeAndCompare looked very interesting, since it scrapes the values from the server and compares them with the content of a reader.
To use it I prepared the following function:
import (
	"fmt"
	"strings"
	"testing"
	"time"

	"github.com/prometheus/client_golang/prometheus/testutil"
	"github.com/stretchr/testify/require"
)

// ExpectPrometheusCounter waits until the counter reaches expectedValue.
// metricTags is the label list without braces, e.g. `app="myapp"`.
func ExpectPrometheusCounter(t *testing.T, serverURL, metricName, metricTags string, expectedValue int) {
	// Eventuallyf allows us to retry for a limited amount of time
	// We'll retry every 100 milliseconds for up to 1 second
	require.Eventuallyf(t, func() bool {
		// this is the expected content
		expected := fmt.Sprintf(`
# TYPE %s counter
%s{%s} %d
`, metricName, metricName, metricTags, expectedValue)
		// ScrapeAndCompare will check the expected content against the current server metrics,
		// restricted to the metric name we pass as last argument
		if err := testutil.ScrapeAndCompare(serverURL+"/metrics", strings.NewReader(expected), metricName); err != nil {
			t.Log(err.Error())
			return false
		}
		return true
	}, time.Second, 100*time.Millisecond, "could not find metric %s with tags %s at value %d", metricName, metricTags, expectedValue)
}
Then the initial example would end up like this:
irc := GetPrometheusCounter(t, serverURL, "myapp_requests_counter", `app="myapp"`)
// run the code we want to test here
ExpectPrometheusCounter(t, serverURL, "myapp_requests_counter", `app="myapp"`, irc+2)
This option has the advantage that it retries every 100 milliseconds for up to one second, which helps when the metric is updated asynchronously, as with a background job.
Conclusion
In conclusion, while metrics may not always be considered primary business requirements, testing them in integration scenarios proves invaluable for maintaining the overall health and reliability of our applications.