Using a windowed counter

Marcelo Pasin
Hi,

I am a newcomer, so I am probably asking something silly here.

So, I want to use rrdtool to manage my weather station data. I read data at 5-minute intervals, but my station only reports the rain counter for the last hour. So the number I get is the sum of my rain counter deltas over the last 12 intervals.

For instance, suppose that over the last 5 hours I get the values in the first line below (one character per 5-minute interval). That means the actual rainfall in each 5-minute interval was as shown in the second line.

011111244445678888889998767777765555553444335888889899987677
010000120001121000121000012000110000010100012300011010001210

Is there a way to specify that the values to store in the RRD would be calculated as:

stored[now] = input[now] - input[now - 1] + stored[now - 12]
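
The recurrence above can be checked against the two sample lines. What follows is only a minimal Python sketch, assuming the one-hour window is empty before the first reading; the function name recover_deltas is made up for illustration and is not part of rrdtool.

```python
def recover_deltas(window_sums, window=12):
    """Recover per-interval rainfall from rolling-window sums.

    Assumption: no rain fell in the hour before the first reading, so
    nothing drops out of the window until it has filled up.
    """
    deltas = []
    previous = 0
    for i, current in enumerate(window_sums):
        # Rain that just left the window (0 while the window is filling).
        leaving = deltas[i - window] if i >= window else 0
        # stored[now] = input[now] - input[now - 1] + stored[now - 12]
        deltas.append(current - previous + leaving)
        previous = current
    return deltas


readings = [int(c) for c in
            "011111244445678888889998767777765555553444335888889899987677"]
expected = [int(c) for c in
            "010000120001121000121000012000110000010100012300011010001210"]

print(recover_deltas(readings) == expected)
```

If the station had already seen rain before the first reading, the first 12 recovered values would be off by whatever later drops out of the window, so a real script would need a warm-up hour.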

Thanks a lot!

MP
_______________________________________________
rrd-users mailing list
[hidden email]
https://lists.oetiker.ch/cgi-bin/listinfo/rrd-users

Re: Using a windowed counter

Alex van den Bogaerdt-5
> So, I want to use rrdtool to manage my weather station data. I read data
> in 5 minute intervals, but my station only tells me the rain counter for
> the last hour. So the number I get is a summation of my rain counter
> deltas for the last 12 intervals.
>
> For instance, suppose in the last 5 hours I get the values the first line
> below (one char for each 5-min interval), it means that, for every 5
> minutes, actual rainfall has been as in the second line.
>
> 011111244445678888889998767777765555553444335888889899987677
> 010000120001121000121000012000110000010100012300011010001210
>
> Is there a way to specify that the values to store in the RRD would be
> calculated as:
>
> stored[now] = input[now] - input[now - 1] + stored[now - 12]

This is not a complete answer but hopefully it helps you to tackle this
problem.

Short answer: no, unless something can be done with the COMPUTE DS type,
which I do not know well enough.

RRDtool does not work with values. It works with rates. After processing
its input, the resulting rate may be further processed. The original input
is not kept.

This said: your 'values' are actually rates: rainfall in the past hour. It
probably means you will have to use the GAUGE data source type. And then
your 'values' are in the database, as rates.
Make sure you understand that rates are <something> per second. Multiply
a per-second rate by 3600 if you want it per hour.

Before anything else:
You will probably end up doing some trial and error. It would be a great
help, both to you and to the members of this list, to record the actual
values given to rrdtool and the times at which they were given, so that
you can recreate the same conditions.
This also means you will have to use real time stamps, not 'now'.

To make things easier, it would be a very good idea to query your weather
station not just every 5 minutes but, more precisely, at time stamps which
are whole multiples of 300 seconds.  Thus: 12:05, 12:10, 12:15 and not
12:07, 12:12, 12:17. Again this means using real time stamps, not 'now'.
Read about normalization and consolidation:
http://rrdtool.vandenbogaerdt.nl/process.php to understand why this helps.
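
One way to act on the advice about aligned time stamps is to compute the next multiple of 300 seconds, wait until then, and pass that explicit time stamp to rrdtool instead of 'now'. This is only a sketch: the helper names next_boundary and update_argument are invented here, and the only rrdtool detail relied on is the timestamp:value form that rrdtool update accepts.

```python
STEP = 300  # seconds; the 5-minute interval used in this thread


def next_boundary(now, step=STEP):
    """First time stamp strictly after `now` that is a whole multiple of `step`."""
    return (int(now) // step + 1) * step


def update_argument(timestamp, value):
    """Build a 'timestamp:value' string with an explicit time stamp, not 'now'."""
    return f"{int(timestamp)}:{value}"


# Example with small made-up numbers: at t=727 the next query is due at t=900.
due = next_boundary(727)
print(due, update_argument(due, 7))
```

A polling loop would sleep for next_boundary(time.time()) - time.time() seconds, read the station, and then hand the resulting string to rrdtool update.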

Your graphs should also start and end on nice numbers. That means you will
have a known number of intervals in your graph. Beware: there have been,
are, and probably will be off-by-one errors. Sometimes they are fixed,
sometimes they pop up again. While debugging your solution always keep
this in mind and modify your times accordingly.
One example: start of graph is 12:00, end of graph is 13:00, number of
5-minute intervals should be 12, but actually was 13 because the interval
13:00 to 13:05 was also included. Another time with the same start and end
times the last interval, 12:55 to 13:00, was not included and I ended up
with only 11 intervals.

https://en.wikipedia.org/wiki/Off-by-one_error#Fencepost_error

You may need to change your start, or end time for the graph to compensate
for this.
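
The fencepost issue is easy to reproduce with plain arithmetic. In this sketch (helper names are illustrative, times are seconds since midnight and assumed to be aligned to 300 seconds), the hour from 12:00 to 13:00 has 13 boundaries but only 12 intervals:

```python
STEP = 300  # seconds per 5-minute interval


def boundaries(start, end, step=STEP):
    """All step-spaced time stamps from start to end, both ends included."""
    return list(range(start, end + 1, step))


def intervals(start, end, step=STEP):
    """(begin, finish) pairs for the intervals lying between start and end."""
    marks = boundaries(start, end, step)
    return list(zip(marks, marks[1:]))


noon = 12 * 3600    # 12:00
one_pm = 13 * 3600  # 13:00

print(len(boundaries(noon, one_pm)))  # 13 fence posts
print(len(intervals(noon, one_pm)))   # 12 fence sections
```

Comparing the number of rows a graph or fetch actually returns with len(intervals(start, end)) is a quick way to see whether an extra or missing slot has crept in.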


Some ideas to investigate:

* just write a program (C, bash, perl, whatever suits you) that does the
processing as you described above. Feed the result to RRDtool.
* use CDEFs with some PREVs and see where that leads you
* what happens if you just record the data as is, and look at long term
stats, e.g. one hour per pixel column, after RRDtool has averaged 12
5-minute rates into 1 1-hour rate. You can have more than one RRA in your
database. Define an RRA which collects 12 5-minute intervals per bucket,
consolidation function AVERAGE.
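
For the last idea, a database with one 5-minute RRA and one hourly RRA could be created roughly as sketched below. The file name rain.rrd, the data source name rain, the 600-second heartbeat and the row counts are all assumptions chosen for illustration; only the general DS:name:type:heartbeat:min:max and RRA:CF:xff:steps:rows forms come from rrdtool itself. The function only builds the argument list, it does not run anything.

```python
def create_command(path, start, step=300):
    """Argument list for an `rrdtool create` call (built, not executed)."""
    return [
        "rrdtool", "create", path,
        "--start", str(start),
        "--step", str(step),
        # GAUGE data source: heartbeat 600 s, minimum 0, no maximum.
        "DS:rain:GAUGE:600:0:U",
        # One row per 5-minute step; 600 rows cover 50 hours.
        "RRA:AVERAGE:0.5:1:600",
        # 12 steps averaged into one hourly row; 720 rows cover 30 days.
        "RRA:AVERAGE:0.5:12:720",
    ]


command = create_command("rain.rrd", start=1497312000)
print(" ".join(command))
```

Passing the list to subprocess.run(command, check=True) would create the file, provided rrdtool is installed.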

Some random tips I can think of right now:

* start with an empty database and fill the first 12 time slots with zero.
This helps when using PREV.
You can do so by specifying a start time at least one hour before your
first entry. Then feed rate 0 to RRDtool. Either set heartbeat high enough
to allow you to do this with a single update of rate 0, or actually do 12
updates 5 minutes apart.

* keep it simple. Your task is hard enough without all those extra
features. Add those later when so desired. Focus now on getting the
numbers right.

* make your graphs big. E.g. 400 pixels, showing just 40 slots of 5
minutes worth of data (--width 400 --end <some timestamp> --start
end-12000). Are the rates the same as you put in? If not, investigate.
Logic error, or fencepost problem?

* In your first few tries, send a rate, dump the database, make sure that
the resulting rate is what you expect. Unless you find a bug (which I
doubt, at this point for this part in the process) there is an error in
your reasoning.

* make sure to use http://oss.oetiker.ch/rrdtool/doc/index.en.html and so on.

* keep discussions/questions on-list.
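
The first tip above (fill the first 12 slots with zero) can be sketched as a list of update strings, one per 5-minute slot before the first real sample. The helper name and the example time stamp are made up; the database would have to be created with a start time earlier than the first time stamp returned here.

```python
def zero_priming_updates(first_real_timestamp, slots=12, step=300):
    """'timestamp:0' strings for the `slots` steps before the first real sample."""
    start = first_real_timestamp - slots * step
    return [f"{start + i * step}:0" for i in range(slots)]


# Example: the first real reading is taken at t=7200 (a multiple of 300).
updates = zero_priming_updates(7200)
print(updates[0], updates[-1], len(updates))
```

Each string can then be passed to rrdtool update in order, followed by the first real reading at first_real_timestamp.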

HTH
Alex



Re: Using a windowed counter

Donovan Baarda
A far simpler solution is; it's just a counter that resets every hour.

Provided you set sensible min and max rates, rrd will just see the hourly reset as a counter reset and record an UNKNOWN rate (NaN) for that 5min (since rrd cannot know how much rain fell between the reset and the sample taken before it). 

If you want to reduce or eliminate the NaNs you can sample at a faster rate (at least 2x faster) and set the xff on your rra's to a reasonable 0.5. This will "average out" the small unknown period using the known periods to give you a reasonably accurate estimated rate for the 5min period.

On 13 Jun. 2017 8:18 pm, "Alex van den Bogaerdt" <[hidden email]> wrote:
> [...]

Re: Using a windowed counter

Alex van den Bogaerdt-5
> A far simpler solution is; it's just a counter that resets every hour.

No it's not.

>> > 011111244445678888889998767777765555553444335888889899987677
>> > 010000120001121000121000012000110000010100012300011010001210
                  ------------
                   ------------

Look approximately halfway:  98767
Then look below: 00012

The "counter" goes from 7 to 6, as a result of rainfall 1 being added and
2 being removed from the window.

If you treat this as a counter being reset every hour, you would get
negative rainfall, or counter overflow with the resulting huge flood, or
rates becoming unknown because of sanity checks.
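
The point can be verified on the sample data: simply differencing consecutive readings, as a counter-style data source would, yields negative steps wherever more rain leaves the window than enters it. A small sketch (the function name is illustrative):

```python
def naive_differences(readings):
    """Difference of consecutive readings, as if they came from a plain counter."""
    return [later - earlier for earlier, later in zip(readings, readings[1:])]


readings = [int(c) for c in
            "011111244445678888889998767777765555553444335888889899987677"]

steps = naive_differences(readings)
negative_positions = [i + 1 for i, step in enumerate(steps) if step < 0]

# Index 25 is the reading that drops from 7 to 6 in the example above.
print(steps[24], negative_positions)
```

At that position the true rainfall was 1, yet the naive difference is -1, because a 2 dropped out of the window at the same time.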






Re: Using a windowed counter

Marcelo Pasin
Thanks everyone for the answers, especially Alex for the long one. :-)

I will then write my own (window) interpreter for my rain gauge values.

Thanks! MP



> On 14 Jun 2017, at 01:07, Alex van den Bogaerdt <[hidden email]> wrote:
> [...]

Re: Using a windowed counter

Donovan Baarda
Uh... my bad, I misread the details. You are right, it's a rainfall-per-hour rate.

I think a pre-filter script to convert your hourly rate back into a counter before feeding it to rrd would be the best option. A little python scripting would probably do it pretty easily.
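
Such a pre-filter might look roughly like the sketch below. It is not a tested tool: the class name WindowToCounter is invented, it assumes the window is empty before the first reading, and it does no handling of missed or faulty samples. It keeps the last 12 recovered deltas and emits an ever-increasing total, which could then be fed to rrd through a counter-style data source type.

```python
from collections import deque


class WindowToCounter:
    """Turn rolling one-hour rain sums into a monotonically increasing counter."""

    def __init__(self, window=12):
        # Deltas currently inside the window, oldest first (all zero at start).
        self.recent = deque([0] * window, maxlen=window)
        self.previous = 0
        self.total = 0

    def feed(self, reading):
        leaving = self.recent[0]  # oldest delta, about to drop out of the window
        delta = reading - self.previous + leaving
        self.recent.append(delta)  # maxlen discards the oldest entry
        self.previous = reading
        self.total += delta
        return self.total


readings = [int(c) for c in
            "011111244445678888889998767777765555553444335888889899987677"]
deltas = [int(c) for c in
          "010000120001121000121000012000110000010100012300011010001210"]

converter = WindowToCounter()
counter_values = [converter.feed(r) for r in readings]
print(counter_values[-1] == sum(deltas))
```

Because the state lives in the object, the same class works when readings arrive one at a time every 5 minutes, as long as the process keeps running or the state is saved between runs.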

On 14 June 2017 at 09:07, Alex van den Bogaerdt <[hidden email]> wrote:
> [...]

--
Donovan Baarda <[hidden email]>
