Prometheus

Prometheus is a widely used open-source software package for monitoring and alerting. It collects metrics from configured targets at given intervals and stores them in its internal time series database. PromQL queries can then be run against the time series data to display results. Rules let you specify conditions for generating alarm events, which the Alertmanager module can then handle, e.g. by sending notifications or silencing alarms.

Integration

Sipfront provides "exporter" API endpoints for Prometheus to scrape test results and metrics. This allows you to integrate specific Sipfront tests into your own Prometheus, display them in your existing dashboards and alert using your existing alerting mechanisms.

Concept

The Prometheus exporter consists of HTTP API endpoints for monitoring individual Sipfront tests as well as whole projects containing multiple tests.

Test level endpoints

A test level endpoint allows you to scrape the measures and result status of the recurring runs of one particular test over time. There are three different types of endpoints:

  1. Gauges: scrape measure value/avg
    • https://app.sipfront.com/prometheus/tests/<testid>/gauges
  2. Summaries: scrape measure quantiles (median, p95, ...)
    • https://app.sipfront.com/prometheus/tests/<testid>/summaries
  3. Status: scrape result status (passed/failed)
    • https://app.sipfront.com/prometheus/tests/<testid>/status/hour
    • https://app.sipfront.com/prometheus/tests/<testid>/status/minute
    • https://app.sipfront.com/prometheus/tests/<testid>/status/day

Gauges

Gauge export endpoints allow you to capture all measure values over time in Prometheus and plot graphs in your Grafana instance to see how certain metrics evolve over the course of a day or week. As with the graphs in the Sipfront WebApp UI, you can quickly spot edges and anomalies, such as those that appear after configuration or infrastructure changes in your systems.

A gauge provides the measure value together with the time of the most recent run of a particular test:

  • visqol: Recorded Audio MOS (Visqol)
  • transcribe: Speech-to-text transcription confidence

The following gauges for RTP measures provide the average value across all test agent instances involved in the test run:

  • rtp_rtt: Round trip time
  • rtp_mos: Network MOS

The following gauges for RTP and SIPP measures provide the sum over all test agent instances involved in the test run:

  • rtp_byte_per_sec: Sent and received bytes per second
  • rtp_jitter: Jitter
  • rtp_lost: Lost packets count
  • rtp_lost_per_sec: Lost packets per second
  • rtp_pkt_per_sec: Sent and received packets per second
  • sipp_call_rate: Calls per second
  • sipp_concurrent_calls: Concurrent call load
  • sipp_failed_calls: Failed call count

A measure value can be present multiple times, i.e. once for each involved call party. Value dimensions are therefore provided, depending on the measure:

  • role for call party:
    • caller
    • callee
    • A, B, C, ...
  • dir for call direction:
    • in
    • out
  • confidence for transcription confidence:
    • min
    • max
    • avg

Scraped metric names and dimensions can be explored in the Prometheus or Grafana UI. The exported metric names for Prometheus follow a canonical format:

sipfront_Internal_Tests_v6_iotcore_sipfront_a_b:rtp_mos

  • sipfront: namespace prefix
  • Internal_Tests: sanitized project name 'Internal Tests'
  • v6_iotcore_sipfront_a_b: sanitized test name 'v6_iotcore_sipfront_a_b'
  • rtp_mos: measure
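
To illustrate how a metric name in this format can be used, here is a minimal sketch of a Prometheus rule file that alerts when the network MOS of the example test drops below a threshold. The group name, alert name, threshold of 3.5 and the 10 minute hold time are assumptions chosen for illustration; only the metric name is taken from the example above.

```yaml
# Sketch of a rule file that would be listed under rule_files in prometheus.yml.
groups:
  - name: sipfront_gauges                # hypothetical group name
    rules:
      - alert: SipfrontLowNetworkMos     # hypothetical alert name
        # Metric name from the example above; 3.5 is an assumed threshold.
        expr: 'sipfront_Internal_Tests_v6_iotcore_sipfront_a_b:rtp_mos < 3.5'
        for: 10m                         # assumed: condition must hold for 10 minutes
        labels:
          severity: warning
        annotations:
          summary: "Network MOS of a Sipfront test is below 3.5"
```

Because the expression has no label matchers, it evaluates every series of the metric, so each dimension value (for example each call party) can fire separately.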

Figure: exploring a gauge endpoint in Grafana (graph of rtp_mos averages scraped from a Prometheus gauge endpoint for a Sipfront test)

NOTE: Prometheus normally uses the scrape time as the measurement time. It accepts the exported test start time as the timestamp only if that time lies within the last hour. Gauges are therefore suitable for tests that recur at intervals of less than an hour.

Summaries

While the gauges for rtp_rtt, rtp_mos, rtp_byte_per_sec, rtp_jitter, rtp_lost, rtp_lost_per_sec, rtp_pkt_per_sec, sipp_call_rate, sipp_concurrent_calls and sipp_failed_calls report a single aggregated value (the average or sum described above) per test run, there are also corresponding summary endpoints which include the pre-calculated median, p75, p90 and p95 quantiles as an additional dimension.

The exported summary metric names for Prometheus follow a canonical format:

sipfront_Internal_Tests_v6_iotcore_sipfront_a_b:rtp_mos_summary

  • sipfront: namespace prefix
  • Internal_Tests: sanitized project name 'Internal Tests'
  • v6_iotcore_sipfront_a_b: sanitized test name 'v6_iotcore_sipfront_a_b'
  • rtp_mos: measure
  • summary: summary measure suffix
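
A summary metric can be used in rules in the same way as a gauge. The sketch below alerts when the lowest exported quantile of the example summary falls below a threshold. It deliberately aggregates over all series with min() instead of selecting one quantile, because the name of the label that carries the quantile is not stated here and should be looked up in the Prometheus UI. The group name, alert name and threshold are assumptions.

```yaml
# Sketch of a rule file; names and threshold are illustrative assumptions.
groups:
  - name: sipfront_summaries             # hypothetical group name
    rules:
      - alert: SipfrontLowMosQuantile    # hypothetical alert name
        # min() collapses all quantile series of the summary into one value,
        # so no quantile label name has to be assumed.
        expr: 'min(sipfront_Internal_Tests_v6_iotcore_sipfront_a_b:rtp_mos_summary) < 3.0'
        for: 10m                         # assumed hold time
        labels:
          severity: warning
        annotations:
          summary: "At least one MOS quantile of a Sipfront test is below 3.0"
```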

Figure: exploring a summary endpoint in Grafana (graph of rtp_mos summaries scraped from a Prometheus summary endpoint for a Sipfront test)

Status

Status endpoints provide you with the number of successful and failed test runs over the last minute, hour or day, depending on the endpoint URL. This allows you to track test results over time and set alarms if the number of failed test runs during a specific time period exceeds a certain threshold.

The status endpoint is a gauge that reports the run count values using dimensions:

  • total: total number of runs started during the last full minute, hour or day
  • passed: number of runs passing the Sipfront test conditions, started during the last full minute, hour or day
  • failed: number of runs failing the Sipfront test conditions, started during the last full minute, hour or day

The exported metric names for Prometheus follow a canonical format:

sipfront_Selenium_Tests_Selenium_Test_UDP_v4_minute

  • sipfront: namespace prefix
  • Selenium_Tests: sanitized project name 'Selenium Tests'
  • Selenium_Test_UDP_v4: sanitized test name 'Selenium Test UDP v4' (n/a for project level status endpoint)
  • minute: endpoint period (minute, hour or day)

NOTE: The project level status endpoints provide the cumulative results of all tests of the project (sum of total/passed/failed runs).

Figure: exploring a status endpoint in Grafana (graph of the number of test runs scraped from a Prometheus status endpoint for a Sipfront test project)

NOTE: The total number of runs is stable, while the passed and failed counts need time to settle until the run has finished. Make sure the duration of a run is shorter than the period of the endpoint (minute, hour or day).
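
The threshold alarm described above could look roughly like the following rule file sketch. Note that the label name status used to select the failed count is a guess made for this example; the text only says that total, passed and failed are exposed as dimensions, so check the actual label name in the Prometheus UI first. The group name, alert name, threshold and hold time are likewise assumptions.

```yaml
# Sketch of a rule file; the label name "status" is an ASSUMPTION.
groups:
  - name: sipfront_status                # hypothetical group name
    rules:
      - alert: SipfrontTestRunsFailing   # hypothetical alert name
        # Fires when more than 2 runs failed during the last full minute.
        # Replace status="failed" with the label actually exported.
        expr: 'sipfront_Selenium_Tests_Selenium_Test_UDP_v4_minute{status="failed"} > 2'
        for: 5m                          # assumed hold time
        labels:
          severity: critical
        annotations:
          summary: "Sipfront test 'Selenium Test UDP v4' has failing runs"
```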

Project level endpoints

A project level endpoint allows you to scrape the measures and result status of the recurring runs of all tests in a project over time. The same three types of endpoints are available:

  1. Gauges: scrape measure value/avg of each test

    • https://app.sipfront.com/prometheus/projects/<projectid>/gauges

      For a project containing 3 tests, this is equivalent to scraping the 3 individual test level gauge endpoints, i.e.
      https://app.sipfront.com/prometheus/tests/<test1id>/gauges
      https://app.sipfront.com/prometheus/tests/<test2id>/gauges
      https://app.sipfront.com/prometheus/tests/<test3id>/gauges

  2. Summaries: scrape measure quantiles (median, p95) of each test

    • https://app.sipfront.com/prometheus/projects/<projectid>/summaries

      For a project containing 3 tests, this is equivalent to scraping the 3 individual test level summaries endpoints, i.e.
      https://app.sipfront.com/prometheus/tests/<test1id>/summaries
      https://app.sipfront.com/prometheus/tests/<test2id>/summaries
      https://app.sipfront.com/prometheus/tests/<test3id>/summaries

  3. Status: scrape cumulative result status (passed/failed) of all project tests

    • https://app.sipfront.com/prometheus/projects/<projectid>/status/hour
    • https://app.sipfront.com/prometheus/projects/<projectid>/status/minute
    • https://app.sipfront.com/prometheus/projects/<projectid>/status/day

      For a project containing 3 tests, this is equivalent to scraping the 3 individual test level status endpoints, i.e.
      https://app.sipfront.com/prometheus/tests/<test1id>/status/minute
      https://app.sipfront.com/prometheus/tests/<test2id>/status/minute
      https://app.sipfront.com/prometheus/tests/<test3id>/status/minute
      and querying the sum using a PromQL expression (sipfront_project_test1_minute + sipfront_project_test2_minute + sipfront_project_test3_minute).
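
If you scrape the test level status endpoints individually, the sum could be kept as a recording rule, sketched below under a few assumptions. The three metric names are the placeholder names from the expression above, and the recorded name is made up for this example. The sketch uses an aggregation instead of the + operator because series scraped by separate jobs carry different job labels.

```yaml
# Sketch of a rule file; metric and rule names are illustrative placeholders.
groups:
  - name: sipfront_project_status        # hypothetical group name
    rules:
      - record: sipfront_project:runs_minute:sum   # hypothetical recorded name
        # Selects the three per-test status metrics by name and sums them,
        # dropping the job and instance labels so the series line up.
        # All remaining labels (the total/passed/failed dimension) are kept.
        expr: 'sum without (job, instance) ({__name__=~"sipfront_project_test[123]_minute"})'
```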

Configuration

Integrating Sipfront tests into your Prometheus setup requires two things:

  1. A Sipfront API key
  2. The Sipfront test ID or project ID

Obtain an API key

In order to scrape the test results, you need an API key to authenticate with the Sipfront API. You can create an API key in the Sipfront web interface under Account -> API Keys:

Create new API key


Once created, copy the public key (the username for Prometheus) and the secret key (the password) and use them in the configuration below.

Figure: store the new API credentials for use in the Prometheus config
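
If you prefer not to write the secret key into prometheus.yml, Prometheus can read the basic auth password from a file via password_file. The following is a minimal sketch; the file path is a hypothetical location, and the job settings mirror a test level gauges job for test ID 99.

```yaml
scrape_configs:
  - job_name: sipfront_test_99_gauges
    scheme: https
    metrics_path: /prometheus/tests/99/gauges
    basic_auth:
      username: your-public-api-key
      # password_file replaces the inline password option (use one or the other).
      # The path below is a hypothetical example; the file holds the secret key.
      password_file: /etc/prometheus/secrets/sipfront_api_secret
    static_configs:
      - targets: ['app.sipfront.com']
```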

Configure Prometheus

To configure Prometheus to scrape Sipfront test results and metrics, you need to add scrape jobs to your prometheus.yml as shown in the examples below.

NOTE: Prometheus scrape jobs let you specify the scrape interval, which needs to be aligned with the frequency of your Sipfront test. To avoid losing information because of aliasing, configure a scrape interval that is at most half of the test recurrence interval.
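
As a worked example of this rule, assume a test that recurs every 5 minutes (300 seconds). Half of that is 150 seconds, so any scrape interval up to 150s satisfies the rule. The sketch below uses 120s; the recurrence interval and the chosen value are assumptions for illustration.

```yaml
scrape_configs:
  - job_name: sipfront_test_99_gauges
    scheme: https
    # Assumed test recurrence: 300s. Half of it is 150s, so 120s is within the limit.
    scrape_interval: 120s
    scrape_timeout: 50s      # must not exceed scrape_interval
    metrics_path: /prometheus/tests/99/gauges
    basic_auth:
      username: your-public-api-key
      password: your-secret-api-key
    static_configs:
      - targets: ['app.sipfront.com']
```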

Configure a test level gauges scrape job

  - job_name: sipfront_test_99_gauges
    scheme: https
    scrape_interval: 60s
    scrape_timeout: 50s
    metrics_path: /prometheus/tests/99/gauges
    basic_auth:
      username: your-public-api-key
      password: your-secret-api-key
    static_configs:
      - targets: ['app.sipfront.com']

Configure a project level gauges scrape job

  - job_name: sipfront_project_77_gauges
    scheme: https
    scrape_interval: 60s
    scrape_timeout: 50s
    metrics_path: /prometheus/projects/77/gauges
    basic_auth:
      username: your-public-api-key
      password: your-secret-api-key
    static_configs:
      - targets: ['app.sipfront.com']

Configure a test level summaries scrape job

  - job_name: sipfront_test_99_summaries
    scheme: https
    scrape_interval: 60s
    scrape_timeout: 50s
    metrics_path: /prometheus/tests/99/summaries
    basic_auth:
      username: your-public-api-key
      password: your-secret-api-key
    static_configs:
      - targets: ['app.sipfront.com']

Configure a project level summaries scrape job

  - job_name: sipfront_project_77_summaries
    scheme: https
    scrape_interval: 60s
    scrape_timeout: 50s
    metrics_path: /prometheus/projects/77/summaries
    basic_auth:
      username: your-public-api-key
      password: your-secret-api-key
    static_configs:
      - targets: ['app.sipfront.com']

Configure a test level status scrape job

  - job_name: sipfront_test_99_status
    scheme: https
    scrape_interval: 60s
    scrape_timeout: 50s
    metrics_path: /prometheus/tests/99/status/minute
    basic_auth:
      username: your-public-api-key
      password: your-secret-api-key
    static_configs:
      - targets: ['app.sipfront.com']

Configure a project level status scrape job

  - job_name: sipfront_project_77_status
    scheme: https
    scrape_interval: 60s
    scrape_timeout: 50s
    metrics_path: /prometheus/projects/77/status/minute
    basic_auth:
      username: your-public-api-key
      password: your-secret-api-key
    static_configs:
      - targets: ['app.sipfront.com']
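
The job entries above are list items that belong under the scrape_configs key of prometheus.yml. As a rough sketch of how the pieces fit together, a minimal file combining two of the jobs might look like this. The global default and the rule file name are assumptions for illustration, and the rule_files entry can be dropped if you do not use alerting or recording rules.

```yaml
# prometheus.yml (sketch)
global:
  scrape_interval: 60s            # assumed default for jobs without their own interval

rule_files:
  - sipfront_rules.yml            # hypothetical file with alerting/recording rules

scrape_configs:
  - job_name: sipfront_test_99_gauges
    scheme: https
    scrape_interval: 60s
    scrape_timeout: 50s
    metrics_path: /prometheus/tests/99/gauges
    basic_auth:
      username: your-public-api-key
      password: your-secret-api-key
    static_configs:
      - targets: ['app.sipfront.com']

  - job_name: sipfront_project_77_status
    scheme: https
    scrape_interval: 60s
    scrape_timeout: 50s
    metrics_path: /prometheus/projects/77/status/minute
    basic_auth:
      username: your-public-api-key
      password: your-secret-api-key
    static_configs:
      - targets: ['app.sipfront.com']
```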