
Improve staleness handling #398

Closed

Description

brian-brazil (Contributor)

Currently Prometheus considers a time series stale when it has no data points more recent than 5m (configurable).

This causes dead instances' data to hang around for longer than it should, makes longer scrape intervals difficult, makes advanced self-referential use cases harder, and prevents the timestamp from propagating from the Pushgateway.

I propose that we consider a time series stale if it wasn't present in the most recent scrape of the target we get it from.

Activity

discordianfish (Member) commented on Jul 15, 2014

👍 It seems more reasonable to me to assume that a metric ceased to exist once it's gone from the source. Instead of looking at the metric itself, the decision whether a metric is 'gone' should be based on the actual scraping results.

I'm personally just affected by the Pushgateway issue: my batch job runs every 10 minutes and pushes a metric with a timestamp to the Pushgateway. Prometheus scrapes it and can reach it just fine, so one would expect it to persist that data point. But since there are more than 5 minutes between the data points, it considers them stale.

juliusv (Member) commented on Jul 15, 2014

👍 in general, but this is hard to implement given the current design. Right now we have no way of tracking which time series we got from a target during a previous scrape that are now no longer present (so that we could mark them stale). The scraper sends all scraped samples into a channel, and on the other end they just get appended to storage.

If you have a good idea how to redesign the current scraping/storage integration to allow for this (efficiently), let me know!

juliusv (Member) commented on Jul 15, 2014

@discordianfish Regarding your issue with the Pushgateway: the expectation is that you normally wouldn't explicitly assign timestamps to pushed samples; that is only for power users. Just send the sample value, and Prometheus will attach the current timestamp to the pushed sample value on every scrape. You then need to get used to the semantics that the timestamp is from the last scrape, not the last push, but this is the expected use case.

beorn7 (Member) commented on Jul 15, 2014

@discordianfish Yup. The timestamp field is in most cases not what you want. In the typical use case, where you want to report something like the time of completion of a batch job via the Pushgateway, you would create a metric last_completion_time_seconds and put the Unix time into it as the value. The timestamp in the exchange format really means "scrape time", and you really need to know what you are doing if you want to manipulate it.
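
For illustration, a minimal sketch of this pattern using the Python client (prometheus_client); the Pushgateway address and job name are assumptions, not from this thread:

```python
from prometheus_client import CollectorRegistry, Gauge, push_to_gateway

registry = CollectorRegistry()
last_completion = Gauge(
    'last_completion_time_seconds',
    'Unix time at which the batch job last completed',
    registry=registry,
)

# ... run the batch job ...

# Record the completion time as the sample *value*. No explicit timestamp is
# pushed, so Prometheus attaches the scrape time on every scrape, and the
# series stays fresh for as long as the Pushgateway exports it.
last_completion.set_to_current_time()
push_to_gateway('localhost:9091', job='my_batch_job', registry=registry)
```

An alert on time() - last_completion_time_seconds exceeding the expected job interval then catches a batch job that has stopped pushing.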

discordianfish (Member) commented on Jul 15, 2014

Well, you can't tell whether a metric is up to date but just '42' each time, or whether the thing pushing to the Pushgateway stopped working and the '42' is just the latest result. I thought I could just add timestamps for that, raise the stalenessDelta, and alert if the metric is gone. But something like last_completion_time_seconds will do the job as well.

brian-brazil (Contributor, Author) commented on Oct 21, 2015

I was just thinking about this after the rate discussion, and I have a sketch of a solution.

The two basic cases we want to solve are:

  1. If a Pushgateway exports a sample with a timestamp, it should be considered fresh for as long as it is exported.
  2. If a target scrape fails or a target is removed, we no longer want to consider its time series fresh. This is the sum(some_gauge) case.

In addition we want something that'll produce the same results now as back in time, and across Prometheus restarts.

My idea is to add two new values that a sample can have: "fresh" and "stale". These would be persisted like normal samples, but not directly exposed to users.

When we get a scrape with a timestamp set on a sample for a series we don't have yet, we'd add a sample for the exported value, and a second one with the "fresh" value at the scrape time. If we get a sample for a series we already have, then we just add a new sample with the "fresh" value at the scrape time. When querying at a given time with an instant selector, if we hit a "fresh" then we walk back until we get the actual sample. For a range selector we'd ignore that sample.

When a scrape fails, an evaluation no longer produces certain time series, or a target is removed, we'd add a "stale" on all affected time series. For an instant selector, if the first sample we hit is stale, we stop and return nothing. For a range selector we'd ignore the stale samples and return the other samples as usual (irate needs special handling here as its semantics are more instant-like; could we change it to an instant selector? We'll also need to be careful with SD blips).

There are various corner cases and details to be worked out, but this seems practical and has the right semantics.
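
A rough, non-authoritative sketch of how the marker semantics above might look at query time; FRESH/STALE, the sample layout, and the function names here are illustrative assumptions, not proposed code:

```python
FRESH = object()  # marker: series was present in a scrape at this timestamp
STALE = object()  # marker: scrape failed / series disappeared at this timestamp

def instant_value(samples, t):
    """Instant selector: samples is a list of (timestamp, value), sorted by timestamp."""
    for ts, value in reversed(samples):
        if ts > t:
            continue
        if value is STALE:
            return None      # first sample hit is stale: return nothing
        if value is FRESH:
            continue         # walk back past the marker to the actual sample
        return value         # ordinary sample: this is the result
    return None

def range_values(samples, start, end):
    """Range selector: markers are ignored, ordinary samples returned as usual."""
    return [(ts, v) for ts, v in samples
            if start <= ts <= end and v is not FRESH and v is not STALE]
```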

beorn7 (Member) commented on Oct 23, 2015

When a scrape fails, an evaluation no longer produces certain time series, or a target is removed, we'd add a "stale" on all affected time series.

I believe the actual challenge in staleness handling is to identify which time series are "affected" in the above sense. To do that, we would need to track which target exported which time series in its respective last scrape. All the updates and lookups required are not trivial to get right and fast, especially in a highly concurrent, high-throughput scenario.
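
As a sketch of the bookkeeping this would require (all names and storage hooks here are hypothetical): keep, per target, the set of series fingerprints seen in its last scrape, and diff it against the current scrape:

```python
prev_series = {}  # target -> set of series fingerprints seen in its last scrape

def ingest_scrape(target, scraped_samples, storage, now):
    """scraped_samples: iterable of (fingerprint, value) pairs from one scrape."""
    current = set()
    for fingerprint, value in scraped_samples:
        current.add(fingerprint)
        storage.append(fingerprint, now, value)
    # Anything exported last time but missing now gets a "stale" marker.
    for fingerprint in prev_series.get(target, set()) - current:
        storage.mark_stale(fingerprint, now)
    prev_series[target] = current
```

Doing that diff and the stale writes without a global lock, and without keeping large per-target sets in memory, is exactly the hard part.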

56 remaining items
