Diurnal average


Helios
Hi,

Is there an easy way to plot a diurnal average?
I.e. the average from all days at 00:00, 00:01, 00:02 etc. so that I
have a plot of an average day?

My only idea is to dump the database and calculate the average for each
minute using a bash script. But it's nasty and extremely slow.
I hope there is a better (rrdtool internal?) way. Any ideas are appreciated.

Thanks

_______________________________________________
rrd-users mailing list
[hidden email]
https://lists.oetiker.ch/cgi-bin/listinfo/rrd-users

Re: Diurnal average

Simon Hobson
Helios Solaris <[hidden email]> wrote:

> Is there an easy way to plot a diurnal average?
> I.e. the average from all days at 00:00, 00:01, 00:02 etc. so that I
> have a plot of an average day?

Over how many days ?
You might be able to do it with some of the time shift functions. If, for example, you wanted to plot one day as the average over the last 7 days, then I think you might do it like this:

Assuming START and END are already defined variables which give the start and end of our 7th day, and writing "sort of shell script" (I'm used to dynamically generating graphs from Bash scripts and piping the resulting command through rrdtool) ...

graph ... --start $START --end $END
DEF:day7=datafile.rrd:data:AVERAGE
DEF:day6=datafile.rrd:data:AVERAGE:start=$START-1day:end=$END-1day
...
DEF:day1=datafile.rrd:data:AVERAGE:start=$START-6days:end=$END-6days

CDEF:dayave=day1,day2,day3,day4,day5,day6,day7,+,+,+,+,+,+,7,/


You should now have a CDEF whose values are the averages of the corresponding time slots over the last 7 days. I would suggest having a good read of the docs as the syntax above may not be completely accurate.
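For anyone who would rather generate these arguments than type them, here is a minimal Python sketch of the same idea. It is not tested against rrdtool. The function name, the AVERAGE consolidation function, the use of plain epoch seconds for start/end and the SHIFT lines are all my assumptions, not something from the post above; SHIFT is included on the assumption that each earlier day has to be moved onto the last day's time axis before the CDEF can add them up. It also treats a day as exactly 86400 seconds.

```python
def diurnal_average_args(rrd_file, ds_name, start, end, days, cf="AVERAGE"):
    """Build DEF/SHIFT/CDEF arguments that average `days` consecutive days.

    `start` and `end` are epoch seconds delimiting the most recent day.
    All names here are illustrative, not part of rrdtool.
    """
    if days < 1:
        raise ValueError("days must be at least 1")
    args = []
    names = []
    for back in range(days - 1, -1, -1):  # oldest day first
        name = f"day{days - back}"
        names.append(name)
        if back == 0:
            # Most recent day: use the graph's own time range.
            args.append(f"DEF:{name}={rrd_file}:{ds_name}:{cf}")
        else:
            offset = back * 86400
            args.append(
                f"DEF:{name}={rrd_file}:{ds_name}:{cf}"
                f":start={start - offset}:end={end - offset}"
            )
            # Assumption: move the older day forward onto the last day's axis.
            args.append(f"SHIFT:{name}:{offset}")
    # RPN: push every day, add them pairwise, divide by the number of days.
    rpn = names + ["+"] * (days - 1) + [str(days), "/"]
    args.append("CDEF:dayave=" + ",".join(rpn))
    return args


if __name__ == "__main__":
    for arg in diurnal_average_args("datafile.rrd", "data",
                                    1500000000, 1500086400, 7):
        print(arg)
```

The printed lines can be appended to a graph command or fed to rrdtool in pipe mode. Note that a plain "+" makes a time slot unknown as soon as one of the days is unknown.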



Re: Diurnal average

Helios
On 07/14/2017 09:51 PM, Simon Hobson wrote:
> Over how many days ?

Over the entire database (2 years).
Yes, using time shift may work, but it's not really flexible when I want
to change parameters.


Re: Diurnal average

Alex van den Bogaerdt-5
>> Over how many days ?
>
> Over the entire database (2 years).
> Yes, using time shift may work, but it's not really flexible when I want
> to change parameters.

Not just that. If it's 2 years' worth of data, this means 730 or so days
with an equal number of DEFs. You may run into performance problems or even
limits. Perhaps it just won't fit in a script (hint: pipe mode may still
work).

First of all make sure you have those 2 years in the resolution required
("standard" examples speak of 2 years, but only part of it is hi-res).

Then, perhaps you can generate your numbers using rrdtool in pipe mode,
xport each day, and process the results outside rrdtool using e.g. perl?
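The "process the results outside rrdtool" step could look roughly like the Python sketch below (Python rather than perl, purely as an illustration). It assumes the exported data has already been parsed into (epoch seconds, value) pairs; the function name and the slot_seconds parameter are made up for this example. Unknown values (None or NaN) are skipped, and slots are counted from midnight UTC.

```python
import math
from collections import defaultdict


def diurnal_average(samples, slot_seconds=60):
    """Average values by time-of-day slot, counted from midnight UTC.

    `samples` is an iterable of (epoch_seconds, value) pairs.
    Returns {slot_start_in_seconds_since_midnight: mean_value}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ts, value in samples:
        if value is None or math.isnan(value):
            continue  # skip unknown samples instead of counting them as 0
        slot = (int(ts) % 86400) // slot_seconds * slot_seconds
        sums[slot] += value
        counts[slot] += 1
    return {slot: sums[slot] / counts[slot] for slot in sorted(sums)}


if __name__ == "__main__":
    demo = [(0, 1.0), (86400, 3.0), (60, 10.0), (172860, 20.0)]
    for slot, mean in diurnal_average(demo).items():
        print(f"{slot // 3600:02d}:{slot % 3600 // 60:02d} {mean:.2f}")
```

The result is one value per slot, which can then be handed to whatever plotting tool is preferred.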

Another possible unexpected detail: depending on where you live, you may
find that you are adding up different "wall clock" times. Make sure you
understand timezones and daylight saving / summer time.
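To make the wall clock point concrete, here is a small Python sketch (the function name is made up). Taking the timestamp modulo 86400 gives seconds since midnight UTC; binning by local time means converting first. The demo uses fixed UTC offsets so the numbers are easy to check; for a real zone with daylight saving one would pass something like zoneinfo.ZoneInfo("Europe/Zurich") instead, which needs Python 3.9+ and a timezone database on the machine.

```python
from datetime import datetime, timedelta, timezone


def seconds_since_local_midnight(ts, tz=timezone.utc):
    """Wall-clock seconds since midnight for epoch `ts` in timezone `tz`."""
    local = datetime.fromtimestamp(ts, tz)
    return local.hour * 3600 + local.minute * 60 + local.second


if __name__ == "__main__":
    ts = 1500000000
    print(ts % 86400)                        # seconds since midnight UTC
    print(seconds_since_local_midnight(ts))  # same thing, via datetime
    # The same sample lands in a different slot on a UTC+2 wall clock.
    print(seconds_since_local_midnight(ts, timezone(timedelta(hours=2))))
```

With a daylight-saving zone the offset changes twice a year, so samples taken at the same UTC time end up in different wall-clock slots in summer and winter.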



Re: Diurnal average

Simon Hobson
In reply to this post by Helios
Helios Solaris <[hidden email]> wrote:

>> Over how many days ?
>
> Over the entire database (2 years).

Ouch - as already mentioned by Alex, that could be a LOT of processing and also prevents you using consolidation.

> Yes, using time shift may work, but it's not really flexible when I want
> to change parameters.

Depends on what you want to change. I would script it so that the parameters are easy to change - that's how most of my graphs work, passing (eg) start, end, and step values that are generated by a case statement in a shell script that takes "day", "week", "month", "year" as a parameter.
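The case statement described above might be sketched like this in Python. The preset table and the function name are hypothetical, and the step values are only plausible guesses, not the ones from the actual scripts.

```python
# Hypothetical presets: period name -> graph start (relative to end) and step.
PRESETS = {
    "day": {"start": "end-1day", "step": 300},
    "week": {"start": "end-1week", "step": 1800},
    "month": {"start": "end-1month", "step": 7200},
    "year": {"start": "end-1year", "step": 86400},
}


def graph_time_args(period, end="now"):
    """Return the --start/--end/--step arguments for a named period."""
    try:
        preset = PRESETS[period]
    except KeyError:
        raise ValueError(f"unknown period: {period!r}") from None
    return ["--start", preset["start"], "--end", end,
            "--step", str(preset["step"])]


if __name__ == "__main__":
    print(" ".join(graph_time_args("week")))
```

Changing a parameter then means editing one table entry rather than every graph definition.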

I do have another thought ...
How many time periods do you want during the day ? If it's not too many (where "too many" is somewhat subjective), then you might want to look at turning it around - store (say) midnight-1am in one rrd file, 1am to 2am in another, and so on (in this example, using 24 rrd files). Each rrd file can then use consolidation.
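A rough sketch of how an update script could pick the file for a sample under the 24-file layout. The file naming scheme and the function name are invented for illustration, and hours are counted in UTC here.

```python
def rrd_file_for(ts, prefix="data_hour_"):
    """Return the per-hour rrd file name for a sample at epoch `ts` (UTC)."""
    hour = (int(ts) % 86400) // 3600
    return f"{prefix}{hour:02d}.rrd"


if __name__ == "__main__":
    # Each sample would be written to the file covering its hour of the day.
    print(rrd_file_for(1500000000))
```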

And (I'm sort of typing as my thought process flows here), I'm led to another option - but which limits consolidation again.
For midnight to 1am, use a time of day function to set a CDEF to the stored value if the function is true and to either zero or unknown if not. Then another time function and cdef for 1am to 2am, and so on. Then you can use a VDEF to get a consolidation across the whole CDEF to get a single value - which you can then print with a PRINT (not GPRINT) statement.
However, a quick look at the docs suggests that there isn't a simple "time of day" function, which complicates matters somewhat.
I **think** this may do it !
DEF:x=somefile.rrd...
CDEF:tod=TIME,DUP,86400,/,FLOOR,86400,*,-
# get time of sample, divide by 86400 (1 day), take integer part, multiply by 86400 (get time value of midnight at start of day) and subtract from time of sample to get the number of seconds since midnight.

CDEF:h0=tod,0,GE,tod,3600,LT,x,[0|UNKN],IF,[0|UNKN],IF
# If 0<=tod<3600 then get x else get [0|UNKN]
VDEF:h0ave=h0,AVERAGE
# Get an average value.
PRINT:h0ave:"%6.2lf"

CDEF:h1=tod,3600,GE,tod,7200,LT,x,[0|UNKN],IF,[0|UNKN],IF
and so on ...


This is the sort of thing I'd just script in Bash - almost trivial to generate an arbitrary number of statements covering appropriate timescales
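As an illustration of how little scripting that takes, here is a Python sketch (rather than Bash) that emits one CDEF/VDEF/PRINT triple per time slot. It assumes x and tod are already defined as in the snippet above, uses UNKN outside each window, and tests tod against both ends of the window. The function name and the s0, s1, ... variable names are made up.

```python
def slot_statements(slot_seconds=3600, source="x", tod="tod", fmt="%6.2lf"):
    """Generate CDEF/VDEF/PRINT statements for each time-of-day slot."""
    if slot_seconds <= 0 or 86400 % slot_seconds != 0:
        raise ValueError("slot_seconds must divide 86400 evenly")
    statements = []
    for i, lo in enumerate(range(0, 86400, slot_seconds)):
        hi = lo + slot_seconds
        name = f"s{i}"
        # Keep the value if lo <= tod < hi, otherwise unknown.
        statements.append(
            f"CDEF:{name}={tod},{lo},GE,{tod},{hi},LT,"
            f"{source},UNKN,IF,UNKN,IF"
        )
        statements.append(f"VDEF:{name}ave={name},AVERAGE")
        statements.append(f"PRINT:{name}ave:{fmt}")
    return statements


if __name__ == "__main__":
    print("\n".join(slot_statements(3600)))
```

Calling slot_statements(300) instead gives the 288 five-minute slots, three statements each.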

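It's so long since I've used the TIME function, dunno if it should be tod=TIME... or tod=x,TIME... (i.e. get x, then get the TIME value for the current sample). If rrdtool complains that the expression contains no DEF or CDEF variable, I think tod=x,POP,TIME,... (push x, discard it, then push TIME) should satisfy it.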

Whether you use zero or unknown for the times outside of each window depends a little on your requirements. I suspect that unknown (UNKN) is probably the correct one to use, since zeros would be counted in the AVERAGE and drag it down, whereas unknown values are left out.

Whatever happens, I don't think you can calculate and graph in one go. You'll have a "graph" going back an arbitrary time to generate just one day's worth of data - and I don't think you can have a graph covering (say) 2 years of data but only drawing a line for one day. Something like gnuplot might be a better fit for that.

I'd also expect this to be "quite slow" and resource intensive. Let's consider the case where you do it only by hours. You are creating 1 duplicate of your dataset with time of day calculated, then another 24 duplicates. So that's effectively 25 copies of your database in memory at once.
Go down to (say) 5 minute periods, and it then means 289 (1 + 288) copies of the database in memory at once !
AND you must store all the data you want to use. That means, if you want 5 minute slots for a full 2 years, you need to store 5 minute consolidated data points for 2 years - that's 210240 samples to store (though I guess that's not so much data).
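The arithmetic behind those figures, as a small Python check. The function names are made up, and 2 years is taken as 730 days.

```python
SECONDS_PER_DAY = 86400


def slots_per_day(slot_seconds):
    """Number of time-of-day slots of the given length in one day."""
    return SECONDS_PER_DAY // slot_seconds


def series_in_memory(slot_seconds):
    """One time-of-day series plus one masked copy per slot."""
    return 1 + slots_per_day(slot_seconds)


def stored_samples(step_seconds, days):
    """Samples needed to keep `days` of data at `step_seconds` resolution."""
    return days * (SECONDS_PER_DAY // step_seconds)


if __name__ == "__main__":
    print(series_in_memory(3600))    # hourly slots
    print(series_in_memory(300))     # 5 minute slots
    print(stored_samples(300, 730))  # 5 minute data kept for 2 years
```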
