delayed updates


delayed updates

Mike Perham-3-3
We have a monitoring service which collects metrics.  If that service
is down, we queue up the stats to be pushed into RRD later.  Once the
service comes back online, we want to push the latest metric data
along with the older data (not all at once so as to not overwhelm the
service once it is back online) so we can fill in the gap in RRD.  My
question is about rrdupdate: it does not like updating an RRD file with
older data.  If we push current data into an RRD file, is it possible
to push older data into it since the lastupdated timestamp will be
greater?  Is it possible to fill in the gap without having to send all
the old data at once?

mike



Re: delayed updates

Jason Fesler
> older data.  If we push current data into an RRD file, is it possible
> to push older data into it since the lastupdated timestamp will be
> greater?  Is it possible to fill in the gap without having to send all
> the old data at once?

Nope.  Send the oldest data first.  Maybe don't send it all at once - but
record into rrd from your queue oldest first.

If the data is too much, you might consider a max amount that you'll keep
in the queue, and if you queue more, drop older entries (creating a gap)
but once your system is back online and catches up you'll have recent
history.
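
The capped queue described above can be sketched roughly as follows. This is only an illustration under assumptions: the class name `BoundedBacklog` and its methods are hypothetical, and samples are assumed to be pushed in timestamp order.

```python
from collections import deque


class BoundedBacklog:
    """Hypothetical capped queue: when full, the oldest samples are discarded."""

    def __init__(self, max_items):
        # deque(maxlen=...) silently drops from the left (oldest) side when full.
        self._items = deque(maxlen=max_items)

    def push(self, timestamp, value):
        self._items.append((timestamp, value))

    def pop_batch(self, n):
        """Return up to n samples, oldest first, removing them from the queue."""
        batch = []
        while self._items and len(batch) < n:
            batch.append(self._items.popleft())
        return batch


backlog = BoundedBacklog(max_items=3)
for ts in (100, 160, 220, 280):      # the sample at ts=100 gets pushed out
    backlog.push(ts, ts / 10)
first_batch = backlog.pop_batch(2)   # oldest surviving samples come out first
```

Dropping from the old end creates one gap at the start of the outage but keeps everything after it replayable in order.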



Re: delayed updates

Dan Cech
In reply to this post by Mike Perham-3-3
Mike Perham wrote:

> We have a monitoring service which collects metrics.  If that service
> is down, we queue up the stats to be pushed into RRD later.  Once the
> service comes back online, we want to push the latest metric data
> along with the older data (not all at once so as to not overwhelm the
> service once it is back online) so we can fill in the gap in RRD.  My
> question is about rrdupdate: it does not like updating an RRD file with
> older data.  If we push current data into an RRD file, is it possible
> to push older data into it since the lastupdated timestamp will be
> greater?  Is it possible to fill in the gap without having to send all
> the old data at once?

You need to queue your updates and send them to rrdtool in order.  It is
not possible to go back and 'fill in' the old data.

From the sounds of your setup you already have a queue mechanism, so I
would recommend having the collector just push data onto the queue, then
use a separate process to feed the elements to rrdtool in order.
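
A minimal sketch of such a feeder, assuming the queue can be read as a list of (timestamp, value) pairs for a single-DS RRD file. The function names, batch size and pause are hypothetical. The only rrdtool behaviour relied on is that `rrdtool update` accepts several `timestamp:value` arguments in one call, as long as they are in increasing time order.

```python
import subprocess
import time


def build_update_cmd(rrd_path, samples):
    """Build one `rrdtool update` call carrying several samples, oldest first."""
    ordered = sorted(samples)  # tuples sort by timestamp first
    return ["rrdtool", "update", rrd_path] + [f"{ts}:{val}" for ts, val in ordered]


def feed(rrd_path, queue, batch_size=50, pause_s=1.0, run=subprocess.run):
    """Drain queued (timestamp, value) pairs in time order, in throttled batches.

    `run` is injectable so the logic can be exercised without rrdtool installed.
    """
    pending = sorted(queue)
    for i in range(0, len(pending), batch_size):
        run(build_update_cmd(rrd_path, pending[i:i + batch_size]), check=True)
        if i + batch_size < len(pending):
            time.sleep(pause_s)  # crude throttle between batches
```

Batching several samples into one call cuts process start-up overhead, and the pause keeps the catch-up from saturating the link.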

Dan



Re: delayed updates

Mike Perham-3-3
In reply to this post by Jason Fesler
Jason, thanks.  We considered that but it's not ideal:

1) Metric data is less valuable over time.  If it takes us 2 days to
recover, starting with the oldest data, our customers will be
screaming to know how their systems are performing NOW.
2) The amount of metric data is potentially huge so we need to
throttle the catch-up process or it will be very easy to spike our
Internet connection.

No way around this limitation?

On 11/13/07, Jason Fesler <[hidden email]> wrote:

> > older data.  If we push current data into an RRD file, is it possible
> > to push older data into it since the lastupdated timestamp will be
> > greater?  Is it possible to fill in the gap without having to send all
> > the old data at once?
>
> Nope.  Send the oldest data first.  Maybe don't send it all at once - but
> record into rrd from your queue oldest first.
>
> If the data is too much, you might consider a max amount that you'll keep
> in the queue, and if you queue more, drop older entries (creating a gap)
> but once your system is back online and catches up you'll have recent
> history.
>
>



Re: delayed updates

Jason Fesler
> No way around this limitation?

How good is your C programming? :-)

Consider only replaying the last *hour* of data?

(I'm very aware of the value of time-based data.  In my case only the last
hour is critical; for the rest I can afford gaps, as it is trend only.)



Re: delayed updates

Mike Perham-3-3
Well, since we're a commercial service, we don't have as much
flexibility in our data retention policy.  I'll discuss with the team
here - I certainly don't want to muck with RRDtool's innards.

On 11/13/07, Jason Fesler <[hidden email]> wrote:

> > No way around this limitation?
>
> How good is your C programming? :-)
>
> Consider only replaying the last *hour* of data?
>
> (I'm very aware of the value of time-based data.  In my case only the last
> hour is critical; for the rest I can afford gaps, as it is trend only.)
>
>



Re: delayed updates

Jason Fesler
> Well, since we're a commercial service, we don't have as much
> flexibility in our data retention policy.  I'll discuss with the team
> here - I certainly don't want to muck with RRDtool's innards.

If you're too far behind, you might consider a ramdisk.  Copy .rrd to
ramdisk, then do all the updates, then copy .rrd back to real disk.  Only
do this if you're > N updates behind, since you'll spend time and I/O on
the copy.
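
The ramdisk idea could look roughly like this. It is a sketch under assumptions: `scratch_dir` stands for a memory-backed mount (which one is up to the system), `apply_updates` is a hypothetical callback that performs the actual rrdtool updates on the given path, and `min_backlog` plays the role of N.

```python
import os
import shutil


def update_via_scratch(rrd_path, scratch_dir, pending, apply_updates, min_backlog=1000):
    """Apply updates on a scratch copy when the backlog is large, else in place.

    Assumes nothing else writes to rrd_path while this runs; such writes
    would be lost when the scratch copy is moved back.
    """
    if len(pending) <= min_backlog:
        apply_updates(rrd_path, pending)
        return "in-place"
    scratch = os.path.join(scratch_dir, os.path.basename(rrd_path))
    shutil.copy2(rrd_path, scratch)   # disk -> scratch (e.g. ramdisk)
    apply_updates(scratch, pending)   # the many small writes stay off the real disk
    tmp = rrd_path + ".new"
    shutil.copy2(scratch, tmp)        # scratch -> disk, next to the original
    os.replace(tmp, rrd_path)         # swap in atomically (same filesystem)
    os.remove(scratch)
    return "scratch"
```

Copying back to a temporary name and renaming means a reader never sees a half-written .rrd file.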

If you're required to have ALL the data, then you are limited on options
(get dirty in C, or add more capacity to make catch-up easier).

Another option, which involves a bit of complexity: if you're too
far behind, make a new .rrd file entirely for temporary use (it takes
the current data while the original is being backfilled).  Have
your graphs show both .rrd files.  Once you're caught up on backfilling your
original rrd, nuke the temp rrd file.  Doable, but a few too many moving
parts for my taste.
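
One way the "graphs show both" part might be wired up is to load the same DS from both files and let a CDEF fall back to the temporary file wherever the original is still unknown. This is a sketch, not a tested graph definition: the DS name `value`, the colour and the legend are assumptions, and file paths containing colons would need escaping.

```python
def graph_args(png, main_rrd, temp_rrd, ds="value", cf="AVERAGE", start="-1d"):
    """Build an `rrdtool graph` command line that merges two RRD files.

    The RPN expression reads: if `main` is unknown, use `temp`, else use `main`.
    """
    return [
        "rrdtool", "graph", png, "--start", start,
        f"DEF:main={main_rrd}:{ds}:{cf}",
        f"DEF:temp={temp_rrd}:{ds}:{cf}",
        "CDEF:merged=main,UN,temp,main,IF",
        "LINE1:merged#0000FF:metric",
    ]


args = graph_args("cpu.png", "cpu.rrd", "cpu-temp.rrd")
```

Once the original file has caught up, the `temp` DEF and the CDEF can be dropped again along with the temporary file.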



Re: delayed updates

Sam Umbach
Mike,

One workaround I've seen for the huge volume of metric data is to
break your large RRD files with multiple DS's into multiple RRD files
with a single (or small number of) DS.  If you must avoid a spike in
your traffic, catching up with all the data at once is impossible (i.e.,
you must throw something away or delay more recent data in order to
catch up with historical data).  You could discard historical
information for some DS's for which historical information is less
important and gaps are acceptable, while catching up w/ historical and
current data for the more important DS's.  With some more complex
queuing logic, you may not have to discard any data.
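
A rough sketch of that prioritisation, assuming one queue per single-DS RRD file and a fixed budget of updates per tick. The function `plan_tick` and its arguments are hypothetical. Less important series give up their history and send only their newest sample; important series drain oldest first, since each RRD file still needs its updates in time order.

```python
def plan_tick(backlogs, important, budget):
    """Pick the updates to send this tick.

    backlogs:  dict of series name -> list of (ts, value), oldest first
               (mutated in place).
    important: list of series names whose history must be kept.
    Returns a list of (name, ts, value).
    """
    plan = []
    # Low-priority series: discard history, keep only the newest sample.
    for name, q in backlogs.items():
        if name not in important and q:
            ts, val = q[-1]
            q.clear()
            plan.append((name, ts, val))
    # Important series: spend what is left round-robin, oldest first.
    remaining = budget - len(plan)
    progressed = True
    while remaining > 0 and progressed:
        progressed = False
        for name in important:
            q = backlogs.get(name)
            if q and remaining > 0:
                ts, val = q.pop(0)
                plan.append((name, ts, val))
                remaining -= 1
                progressed = True
    return plan


backlogs = {"cpu": [(1, 10), (2, 11), (3, 12)], "fan": [(1, 5), (2, 6)]}
plan = plan_tick(backlogs, important=["cpu"], budget=3)
```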

Another solution is to perform some data consolidation on the queued
data in order to reduce the number of data points you need to update
in your RRDs.  Unfortunately, if you have different RRAs using
different consolidation functions, consolidating the data before
passing it to rrdtool may (probably will) result in a loss of
information.  If it would be better to show only average numbers
(without min and max) instead of a gap in the graph, this may be a
reasonable trade-off.
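
Pre-consolidation of the queue might be sketched like this, assuming GAUGE-style values where averaging is meaningful (for counters one would more likely keep the last reading of each bucket). The function name and bucket size are hypothetical. Only complete buckets should be passed in, because each output sample is stamped with the end of its bucket and later updates must not be older than that.

```python
from collections import defaultdict


def consolidate(samples, bucket_s):
    """Average (timestamp, value) samples into buckets of bucket_s seconds.

    Each output sample is stamped with the end of its bucket, so the
    results stay in increasing time order.
    """
    buckets = defaultdict(list)
    for ts, val in samples:
        buckets[ts // bucket_s].append(val)
    return [((b + 1) * bucket_s, sum(v) / len(v)) for b, v in sorted(buckets.items())]


queued = [(0, 1.0), (60, 3.0), (300, 10.0), (360, 20.0)]
reduced = consolidate(queued, bucket_s=300)
```

As noted above, MIN and MAX archives fed with these averages will no longer show the true extremes within each bucket.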

-Sam Umbach



Re: delayed updates

Fabien Wernli
In reply to this post by Jason Fesler
On Tue, Nov 13, 2007 at 09:28:13AM -0800, Jason Fesler wrote:
> Nope.  Send the oldest data first.  Maybe don't send it all at once - but
> record into rrd from your queue oldest first.

What we do is drop one entry out of two, so we don't get any gaps (as
long as the interval between updates stays below our heartbeat=4*step).
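
The thinning approach can be illustrated with a small sketch. The 60-second step and the function names are assumptions; the heartbeat of 4*step comes from the post above. The check at the end is the important part: the spacing of the surviving samples has to stay below the heartbeat, and the same applies to the distance between the RRD's last update and the first surviving sample, which is not checked here.

```python
def thin(samples, keep_every=2):
    """Keep every keep_every-th sample of an oldest-first list."""
    return samples[keep_every - 1::keep_every]


def max_gap(samples):
    """Largest distance in seconds between consecutive samples (0 if fewer than two)."""
    ts = [t for t, _ in samples]
    return max((b - a for a, b in zip(ts, ts[1:])), default=0)


step = 60                 # assumed step, in seconds
heartbeat = 4 * step
queued = [(t, float(t)) for t in range(0, 600, step)]  # 10 samples, 60 s apart
thinned = thin(queued)    # half the updates to send
ok = max_gap(thinned) < heartbeat
```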