Aggregated Datasources for API Queries
  • 12 Mar 2024

Article Summary

This topic explains how data rolls up to aggregate datasources.

Multi-Resolution Data Store (MRDS)

Multi-Resolution Data Store (MRDS) improves performance for large amounts of data, for example, high session volume or fine granularity (per-second data). MRDS aggregates data at ingestion so that smaller aggregate datasources can serve queries over longer intervals, which reduces the space required to retain data and improves query speed.

Each MRDS datasource aggregates raw metrics into new min, max, avg, sum, and count columns per raw metric.
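As an illustration of that roll-up, the following is a minimal Python sketch, not the MRDS implementation. It assumes samples arrive as (epoch seconds, value) pairs for a single metric; the function name `rollup` and the output layout are hypothetical.

```python
from collections import defaultdict


def rollup(samples, bucket_seconds):
    """Group (epoch_seconds, value) samples into fixed-width buckets and
    compute min, max, avg, sum, and count for each bucket.

    Hypothetical helper for illustration only.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        # Align each timestamp to the start of its bucket.
        buckets[ts - ts % bucket_seconds].append(value)
    return {
        start: {
            "min": min(values),
            "max": max(values),
            "avg": sum(values) / len(values),
            "sum": sum(values),
            "count": len(values),
        }
        for start, values in sorted(buckets.items())
    }


# Four raw samples rolled up into 1-minute buckets.
samples = [(0, 2.0), (20, 4.0), (40, 6.0), (60, 10.0)]
one_minute = rollup(samples, 60)
```

Keeping sum and count alongside avg means a coarser average can later be derived from the finer aggregates (total sum divided by total count) without returning to the raw samples.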

Default Configuration

When MRDS is enabled, these aggregate datasources are typically created (datasources are configurable):

  • raw
  • 1-minute
  • 15-minute
  • 1-hour

Retention Policy

Each datasource has a different retention period. The standard retention policy for each aggregated datasource is documented in: What is my performance data's retention?

Datasource Selection

When querying data from Analytics, the datasource that serves the data is determined at query time by the duration of the query interval. The APIs do not allow the client to directly select the datasource from which the data will be retrieved; the datasource selection algorithm of Analytics ensures queries are returned as quickly as possible.

For more information, see the Examples section of this topic.

API Response

API responses include a meta section describing which datasource was selected, and the rules that were used to make the selection.

{
    "datasource": "ef333a1d-05bd-4d8e-b312-62005f7db6e2-npav-ts-metrics",
    "datasourceSelectionRules": [
        "Rule Context: [Target Datapoints] | Tenant [ef333a1d-05bd-4d8e-b312-62005f7db6e2] | Interval [2024-02-07T10:48:24.449Z/2024-02-07T18:48:24.449Z] | Context [250]",
        "Processed Rule: [Queryable Datasources] | Tenant [ef333a1d-05bd-4d8e-b312-62005f7db6e2] | Interval [2024-02-07T10:48:24.449Z/2024-02-07T18:48:24.449Z] | Input Candidates | [] | Output Candidates [raw] | Info [Tenant Queryable Datasource: [raw]]",
        "Processed Rule: [Query Interval] | Tenant [ef333a1d-05bd-4d8e-b312-62005f7db6e2] | Interval [2024-02-07T10:48:24.449Z/2024-02-07T18:48:24.449Z] | Input Candidates | [raw] | Output Candidates [raw] | Info [Tenant Datasource Query Durations: map[raw:PT744H]]",
        "Processed Rule: [Loaded Period] | Tenant [ef333a1d-05bd-4d8e-b312-62005f7db6e2] | Interval [2024-02-07T10:48:24.449Z/2024-02-07T18:48:24.449Z] | Input Candidates | [raw] | Output Candidates [raw] | Info [Tenant Loaded Period By Datasource: map[1h:[RangeStart: 2024-01-08 00:00:00 +0000 UTC - RangeEnd: 2024-02-07 18:48:31.854152299 +0000 UTC] raw:[RangeStart: 2024-01-08 18:00:00 +0000 UTC - RangeEnd: 2024-02-07 18:48:31.854168171 +0000 UTC]]]",
        "Rule Context: [Final Datasource Granularity] | Tenant [ef333a1d-05bd-4d8e-b312-62005f7db6e2] | Interval [2024-02-07T10:48:24.449Z/2024-02-07T18:48:24.449Z] | Context [raw]"
    ],
    "queryID": "34b40472-3ba6-4742-9bfc-ccc2e93894c9"
}

The above example illustrates how Analytics determines the optimal datasource from which to satisfy a query.

In this example we can see:

  1. datasource → The datasource used to execute the query.
  2. datasourceSelectionRules → A series of rule evaluations that led to the selection of the datasource. Rules listed in the example:
    1. Target Datapoints → The number of points the query should attempt to produce.
    2. Queryable Datasources → Datasources that have the granularity to satisfy the query.
    3. Query Interval → Determines if the interval is aligned with the retention period for a queryable datasource. Intervals outside of the retention period exclude a queryable datasource from being able to satisfy the query.
    4. Loaded Period → The intervals currently available for each datasource.
    5. Final Datasource Granularity → The datasource granularity selected after evaluation.
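A client that wants to log or inspect these rule evaluations could pull out the rule name and its outcome from each string. The sketch below is a rough Python illustration; the string layout is inferred only from the sample response above and may not be a stable format, and the function name `summarize_rule` is hypothetical.

```python
import re

# Patterns inferred from the sample response; treat them as assumptions.
HEADER = re.compile(r"^(Rule Context|Processed Rule): \[([^\]]+)\]")
OUTPUT = re.compile(r"Output Candidates \[([^\]]*)\]")
CONTEXT = re.compile(r"\| Context \[([^\]]*)\]")


def summarize_rule(line):
    """Return the rule kind, rule name, and outcome found in one rule string.

    Returns None when the line does not start with a recognized prefix.
    """
    header = HEADER.match(line)
    if header is None:
        return None
    kind, name = header.groups()
    summary = {"kind": kind, "rule": name}
    output = OUTPUT.search(line)
    if output:
        # Kept as a plain string: the separator between several
        # candidates is not shown in the sample.
        summary["output_candidates"] = output.group(1)
    context = CONTEXT.search(line)
    if context:
        summary["context"] = context.group(1)
    return summary


rules = [
    "Rule Context: [Target Datapoints] | Tenant [t] | Interval [i] | Context [250]",
    "Processed Rule: [Query Interval] | Tenant [t] | Interval [i] | "
    "Input Candidates | [raw] | Output Candidates [raw] | "
    "Info [Tenant Datasource Query Durations: map[raw:PT744H]]",
]
summaries = [summarize_rule(rule) for rule in rules]
```

The tenant and interval values are shortened to `t` and `i` here for readability; real responses carry the full tenant ID and ISO 8601 interval.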

Examples

For all of the following examples the following preconditions apply:

  1. The customer is sending data at a PT1M frequency.
  2. The raw PT1M data is being stored in the “raw” datasource, with a 3-month retention.
  3. Aggregated PT15M data is being stored in the “PT15M” datasource, with a 6-month retention.
  4. Aggregated PT1H data is being stored in the “PT1H” datasource, with a 12-month retention.
  5. Target datapoints for the query is set to 250.
  6. Target intervals for each datasource are:
    1. raw - Queries less than 1 day in duration
    2. PT15M - Queries less than 30 days in duration
    3. PT1H - Queries less than 365 days in duration
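The preconditions above can be approximated by a small selection routine. This is a simplified Python sketch and not the actual Analytics algorithm: it assumes the finest datasource that passes a granularity check, a target-interval check, and a retention check wins. The table layout, the function name `select_datasource`, and the conversion of months to 90, 180, and 365 days are assumptions.

```python
from datetime import timedelta

# (name, resolution, target query duration limit, retention).
# Values come from the preconditions; months are approximated in days.
DATASOURCES = [
    ("raw", timedelta(minutes=1), timedelta(days=1), timedelta(days=90)),
    ("PT15M", timedelta(minutes=15), timedelta(days=30), timedelta(days=180)),
    ("PT1H", timedelta(hours=1), timedelta(days=365), timedelta(days=365)),
]


def select_datasource(duration, oldest_point_age, bucket=None):
    """Return the first (finest) datasource that passes every check.

    duration         -- length of the query interval
    oldest_point_age -- how far back the start of the interval lies
    bucket           -- requested bucket size; None models granularity ALL
    """
    for name, resolution, max_duration, retention in DATASOURCES:
        if bucket is not None and resolution > bucket:
            continue  # datasource is coarser than the requested buckets
        if duration >= max_duration:
            continue  # query is longer than this datasource targets
        if oldest_point_age > retention:
            continue  # start of the interval is past the retention period
        return name
    return None


four_hours = timedelta(hours=4)
recent = select_datasource(four_hours, four_hours, timedelta(minutes=1))
old = select_datasource(four_hours, timedelta(days=60, hours=4), timedelta(minutes=1))
month = select_datasource(timedelta(days=31), timedelta(days=31), timedelta(hours=6))
everything = select_datasource(timedelta(days=2), timedelta(days=2))
```

Under these assumptions the four calls return raw, raw, PT1H, and PT15M. The target datapoints limit is not modeled here; it is a separate check.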

Example: Last 4 hours

  • Query Description: Fetch the MAX value of MetricA in 1-minute buckets over the past 4 hours (240 datapoints)
    • Since the requested granularity is PT1M, the 240 datapoints fit within the target datapoints, and the data is available in the raw datasource, this query will be executed against the raw datasource.

Example: 4 hours, 2 months ago

  • Query Description: Fetch the SUM value of MetricA in 1-minute buckets over a 4 hour period that took place 60 days ago (240 datapoints)
    • Since the requested granularity is PT1M, the 240 datapoints fit within the target datapoints, and the data is still available in the raw datasource (90 days retention), this query will be executed against the raw datasource.

Example: Last 31 Days

  • Query Description: Fetch the AVG value of MetricA in 6-hour buckets over the last 31 days (4 datapoints per day * 31 days = 124 datapoints)
    • Since the requested granularity is PT6H, the query fits within the target datapoints, and the data is available in the PT1H datasource (which is more granular than the PT6H bucket requirement), this query will be executed against the PT1H datasource. The PT15M datasource is not used because the 31-day duration exceeds its target interval of less than 30 days.

Example: Last day, Granularity 5 minutes

  • Query Description: Fetch the COUNT value of MetricA in 5-minute buckets over the last 24 hours (12 datapoints per hour * 24 hours = 288 datapoints).
    • Since the granularity is at PT5M resolution but the query duration is too large to fit into the target datapoints (288 > 250), the query will be rejected with a response message indicating either a smaller time interval (say 20 hours = 12 * 20 = 240 datapoints) or a larger query granularity (say PT10M = 6 datapoints per hour * 24 hours = 144 datapoints) should be used.
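The datapoint arithmetic in this example can be checked before sending a query. The following Python sketch assumes the count is the interval duration divided by the bucket size, rounded up, and that a count equal to the target is accepted; the source only shows that 288 is rejected and 240 is acceptable. The names `datapoint_count` and `fits_target` are hypothetical.

```python
from datetime import timedelta

TARGET_DATAPOINTS = 250  # value taken from the preconditions


def datapoint_count(duration, bucket):
    """Number of buckets needed to cover the duration, rounded up."""
    # Ceiling division on timedeltas: negate, floor-divide, negate again.
    return -(-duration // bucket)


def fits_target(duration, bucket, target=TARGET_DATAPOINTS):
    """True when the query would produce no more than `target` datapoints.

    Accepting a count exactly equal to the target is an assumption.
    """
    return datapoint_count(duration, bucket) <= target


day = timedelta(hours=24)
rejected = fits_target(day, timedelta(minutes=5))
shorter = fits_target(timedelta(hours=20), timedelta(minutes=5))
coarser = fits_target(day, timedelta(minutes=10))
```

Here `rejected` is False (288 datapoints), while `shorter` (240 datapoints) and `coarser` (144 datapoints) are both True, matching the two remedies suggested in the example.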

Example: Last 2 days, Granularity All

  • Query Description: Fetch the MIN value of MetricA over the last 48 hours (1 single datapoint capturing the absolute min of MetricA over the 48 hours).
    • Since the granularity is ALL, a single datapoint will be returned, so the query fits within the target datapoints. Since all 3 datasources have the full dataset necessary to perform the query, Analytics will choose the datasource that most closely aligns with the interval of the metricQueryConfig. In this case the PT15M datasource would be selected, as it targets queries longer than 1 day but shorter than 30 days.

© 2024 Accedian Networks Inc. All rights reserved. Accedian®, Accedian Networks®,  the Accedian logo™, Skylight™, Skylight Interceptor™ and per-packet intel™, are trademarks or registered trademarks of Accedian Networks Inc. To view a list of Accedian trademarks visit: http://accedian.com/legal/trademarks/. 

