Hello,
I am currently working on a swimming pool control. I store values every 15 min (precise timestamps: 16:15:00, 16:30:00, etc.), and for every 15 min I store the "heat_flag", which is either "0" (heating off) or "1" (heating on).

# rrdtool create temp_pool.rrd --step 900 --start "20140101 00:00" \
# DS:pump_flag:GAUGE:1200:0:1 \
# DS:heat_flag:GAUGE:1200:0:1 \
# DS:device:GAUGE:1200:0:90 \
# RRA:AVERAGE:0.5:1:103680 \
# RRA:MIN:0.5:96:3650 \
# RRA:MAX:0.5:96:3650 \
# RRA:AVERAGE:0.5:96:3650

I would like to calculate the total heating time per day. For that, I would like to count the number of "1" values per day and then multiply by 15 min to get the total heating time.

I tried:

CDEF:heattime2=heat_flag,900,*,60,/ \
VDEF:totalheattime=heattime2,AVERAGE'
GPRINT:totalheattime:"Total\: %2.2lf h \n"'

but I get wrong results. I probably have to do the calculation completely differently.

Can anyone help me out here? I should get perfectly aligned values as a result: 0.25 h, 0.50 h, etc. How can I achieve this?

Thanks a million,
spo
spock <[hidden email]> wrote:
> I tried:
> CDEF:heattime2=heat_flag,900,*,60,/ \
> VDEF:totalheattime=heattime2,AVERAGE'
> GPRINT:totalheattime:"Total\: %2.2lf h \n"'
> but I get wrong results. I probably have to do the calculation completely
> differently.
>
> Can anyone help me out here?
>
> I should get perfectly aligned values as a result: 0.25 h; 0.50h; etc.
> How can I achieve this?

Give us a clue to go on: what values do you get? Is it that you expected (say) 15 but got 14.99, or did you get something completely different?

Your approach is right; I suspect rounding errors come into it.

_______________________________________________
rrd-users mailing list
[hidden email]
https://lists.oetiker.ch/cgi-bin/listinfo/rrd-users
Hi Simon, thanks for helping me out.
I will include a more detailed example here:

The graph shows 24h; the database has step 900.
Let's take the grey area: we have "1" values from 09:30 to 19:45 = 10.25h.

Relevant graph parameters:

DEF:heat_flag=temp_pool.rrd:heat_flag:AVERAGE \
DEF:pump_flag=temp_pool.rrd:pump_flag:AVERAGE \
CDEF:pumptime=pump_flag,1,EQ,INF,UNKN,IF \
CDEF:heattime=heat_flag,1,EQ,INF,UNKN,IF \
CDEF:heattime2=heat_flag,900,*,60,/ \
CDEF:pumptime2=pump_flag,900,*,60,/ \
VDEF:totalpumptime=pumptime2,AVERAGE \
VDEF:totalheattime=heattime2,AVERAGE'
AREA:pumptime#10101010 \
AREA:heattime#FF000015 \
LINE2:pump_flag#000000 \
GPRINT:totalpumptime:"Filterzeit\: %2.2lf h " \
GPRINT:totalheattime:"Heizzeit\: %2.2lf h \n"'

As you can see, the calculated totalpumptime is 6.49h, although it should be 10.25h. The LINE2:pump_flag clearly shows that the data is there: all "1" values in that period, and "0" values for the rest of the day.

Thanks again,
spock
----- Original Message -----
From: "spock" <[hidden email]>
To: <[hidden email]>
Sent: Saturday, August 23, 2014 9:57 PM
Subject: Re: [rrd-users] How to calculate desired value?

> Hi Simon, thanks for helping me out.
> I will include a more detailed example here:
> <http://rrd-mailinglists.937164.n2.nabble.com/file/n7582382/tempday1.png>
>
> The graph shows 24h; database has step 900
> Let's take the grey area:
> we have "1" values from 09:30 to 19:45 = 10.25h.

I think it is 09:15, and 10.5 hours.

Relevant parameters include start and end time, and number of pixels. You should have
"--end {some timestamp equal to midnight} --start end-24h --width {any number being a whole multiple of 96}"
If not, something has to give, and question #1 (see below) is answered.

You have a rate of 1 from 09:15 to 19:45: 10.5 hours.
You have a rate of 0 from 00:00 to 09:15 and from 19:45 to 24:00: 9.25 hours and 4.25 hours, total 13.5 hours.

You compute the average rate, which is 10.5/24 = 0.4375. You multiply that by 900 and divide by 60, so the answer should be 6.5625.

So the two questions are:
1: Why do you get a different outcome?
2: Why do you think you should multiply by 15 (*900/60 is the same as *15) to get hours?

The answer to question #1 could be as Simon suggested: rounding errors. I suspect the amount of time covered is not exactly 24 hours.

The solution to question #2 is to think it over again. Assuming your start and end times are exactly midnight, you are averaging over a 24-hour period. That gives you an average "pump on" ratio, which you should multiply by 24 to get the number of hours. The answer you should get would be 10.5 (or 10.25 if I see things wrong), except for that (relatively small) error from question #1.

Further testing:

Fake some data: have the (fake) pump on for exactly 1 hour, from 23:00 the day before until midnight, and at no other times. Then print the average pump time. It should be zero.

In a different testing round, do the same from midnight at the end of that day until 01:00. Again the average pump time should be zero. If it is not zero in either test, then you know there is a problem with the start time or the end time.

In a third test, have the pump on from 12:00 to 15:00. The average rate should be exactly 0.125.

Just print the average of pump_flag to debug, no CDEFs involved. Look at the output of rrdtool dump and verify that it matches what you expect.

Oh, and by the way: I got confused because you started talking about heat_flag and then switched to pump_flag. I know how it is when debugging; it is easy to confuse yourself in similar ways. Whenever a value is not what you expect, take a step back and look at the problem again.

HTH
Alex
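The arithmetic in the reply above can be checked with a short Python sketch. It is only an illustration under stated assumptions: the flags are exactly 0 or 1, the window holds exactly 96 intervals of 900 s, and the names `average_rate`, `flags`, `times_15` and `times_24` are made up here and are not part of RRDtool.

```python
STEP_SECONDS = 900
INTERVALS_PER_DAY = 24 * 3600 // STEP_SECONDS  # 96 intervals of 15 min


def average_rate(flags):
    """Mean of the 0/1 flags over the window (no unknown samples assumed)."""
    return sum(flags) / len(flags)


# Hypothetical day: off 00:00-09:15 (37 intervals), on 09:15-19:45 (42),
# off 19:45-24:00 (17).
flags = [0] * 37 + [1] * 42 + [0] * 17

rate = average_rate(flags)             # 42 / 96
times_15 = rate * STEP_SECONDS / 60    # the original CDEF: average * 900 / 60
times_24 = rate * 24                   # average * hours in the window

print(len(flags) == INTERVALS_PER_DAY)  # True
print(rate)      # 0.4375
print(times_15)  # 6.5625
print(times_24)  # 10.5
```

Multiplying the average by 15 only scales the on-ratio by the interval length in minutes; multiplying by the window length in hours is what turns the ratio into hours.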
Hi Alex,
OK, I did some more tests with artificial data. See script: testdata.sh

Sorry for the confusion between heat_flag and pump_flag; both values follow the same concept (value 0 = off; value 1 = on; no other values allowed) and I want to calculate the total hours.

I am not sure if I should be more confused than before.

First of all, thanks for the correct formula. If I use
CDEF:heattime2=heat_flag,24,*
VDEF:totalheattime=heattime2,AVERAGE
I get the desired value - most of the time.

Then I played around with some scenarios. My timeframe is:
-s 1408917600 -e 1409004000
Mon Aug 25 00:00:00 CEST 2014
Tue Aug 26 00:00:00 CEST 2014

I found out that the very first value (1408917600) will NOT be taken into account for the calculation. This is probably by design. The very last value (1409004000) is taken into account; this is also by design.

My mistake was that I did not always use a complete 24h timeframe for the graph creation. The other mistake was that the calculation does not work if you have missing values within the timeframe. For that reason, I substituted missing values with "0" using:
CDEF:heattimeclean=heat_flag,UN,0,heat_flag,IF \
and then used the cleaned time series for the calculation:
CDEF:heattime2=heattimeclean,24,* \
VDEF:totalheattimeclean=heattime2,AVERAGE \

Question (1):

Unfortunately this introduces (rounding?) errors. In my artificial test data I have 17 x "1" values for heat_flag, which corresponds to 4.25h of heattime. The totalheattime is calculated as "4.25" - perfect! But the totalheattimeclean is calculated as 4.21 - is this a rounding error? I do not understand this, because heat_flag,UN,0,heat_flag,IF should not touch the original values, only unknowns.

How do I avoid the rounding?

Question (2):

My other problem is: on my original "production" database, I always have the above-described "rounding" error - but for both values! If I re-create the very same day in my artificial database, I get the rounding errors only for the cleaned values.

We have 8.25h of heating time.
"Production" database: for this day (Aug. 21) with 8.25h heattime I get: totalheattimeclean = totalheattime = 8.16h
"Artificial" database: for this day (Aug. 21) with 8.25h heattime I get: totalheattimeclean = 8.16; totalheattime = 8.25h

I thought, well, I am comparing apples with oranges. I verified:
- parameters are identical (copy & paste any and all parameters into the testdata script)
- used the same environment (e.g. export LANG=de_DE.UTF8)
- used the same data

I took the data from my "production" rrd database and wrote it with the enclosed script into a new rrd database. I then did a fetch with
rrdtool fetch /testtemp/temp_pool_debug.rrd -s 1408572000 -e 1408658400 AVERAGE
on both databases and compared the resulting files - no difference except for the timestamp 1408659300, which is outside of the timeframe. Or is this the key?

The resulting graphs from the production and the artificial database look completely identical - except for the calculated heattime.

Do you have an explanation for that?

Regards,
Spock
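Why missing values matter for the average can be sketched in Python. This is a minimal illustration, not RRDtool itself: it assumes that the average is taken over known samples only (so unknowns shrink the divisor), uses NaN as a stand-in for an unknown sample, and the names `clean` and `hours_on` as well as the example day are hypothetical.

```python
import math

UNKNOWN = math.nan  # stand-in for an unknown sample (assumption of this sketch)


def clean(values):
    """Replace unknown samples with 0, in the spirit of x,UN,0,x,IF."""
    return [0.0 if math.isnan(v) else v for v in values]


def hours_on(values, window_hours=24):
    """Average over the known samples only, scaled to hours."""
    known = [v for v in values if not math.isnan(v)]
    return sum(known) * window_hours / len(known)


# Hypothetical day of 96 intervals: 17 on, 71 off, 8 unknown.
day = [1.0] * 17 + [0.0] * 71 + [UNKNOWN] * 8

raw_hours = hours_on(day)             # 17 * 24 / 88: divisor shrinks to 88
clean_hours = hours_on(clean(day))    # 17 * 24 / 96: divisor stays at 96

print(f"{raw_hours:.2f} {clean_hours:.2f}")  # 4.64 4.25
```

Under that assumption, substituting 0 for unknowns keeps the divisor equal to the number of intervals in the window, so the result depends on exactly how many rows the window contains.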
> I am not sure, if I should be more confused than before.
>
> First of all, thanks for the correct formula. If I use the
> CDEF:heattime2=heat_flag,24,*
> VDEF:totalheattime=heattime2,AVERAGE
> I get the desired value - most of the time.

So focus on those other times and see what's different there.

> Then I played around with some scenarios:
> My timeframe is:
> -s 1408917600 -e 1409004000
> Mon Aug 25 00:00:00 CEST 2014
> Tue Aug 26 00:00:00 CEST 2014
>
> I found out, that the very first value (1408917600)
> will NOT be taken into account for the calculation.
> This is probably by design.

It is. Each interval has a timestamp which denotes the end of that interval. Consider one interval of 15 minutes. It starts at 00:00 and it ends at 00:15. If RRDtool also included the interval stamped "00:00", you would get two intervals, for a total of 30 minutes, from 23:45 to 00:15.

> The very last value is taken into account (1409004000),
> this is also by design.

Yup. I asked because sometimes bugs fly into the memory and then there are off-by-one errors.

> My mistake was, that I did not always use a complete 24h timeframe for the
> graph creation.
> The other mistake was, that the calculation does not work, if you have
> missing values within
> the timeframe.

And if, for some unknown reason, you get values which are not exactly 0 or 1, you will also see problems.

> For that reason, I substituted missing values with "0" using:
> CDEF:heattimeclean=heat_flag,UN,0,heat_flag,IF \
> and then used the cleaned time series for the calculation.
> CDEF:heattime2=heattimeclean,24,* \
> VDEF:totalheattimeclean=heattime2,AVERAGE \

> Question (1):
>
> Unfortunately this introduces (rounding?) errors.
> In my artificial testdata I have 17 x "1" values for heat_flag; which
> corresponds to 4.25h of heattime.
> The totalheattime is calculated as "4.25" - perfect!
> But the totalheattimeclean is calculated as 4.21 - is this a rounding
> error?

Right now I have no clue what causes this. Your script is a bit big, and my time is a little limited. Sorry.

> I do not understand this, because the heat_flag,UN,0,heat_flag,IF should
> not
> touch the original values, only unknowns.

I agree.

>
> How do I avoid the rounding?

Make sure you are working with exactly 96 intervals: 24 hours times four 15-minute intervals per hour. Don't assume everything is working as designed; test and verify. The problem may be in your script (I don't think so at first glance) or in RRDtool (it does sometimes happen; try another version), or there is still an error in your logic which you and I haven't spotted yet.

> Question (2)
> My other problem is:
> On my original "production" database, I have always the above described
> "rounding" error - but for both values!
> If I re-create the very same day in my artificial database, I get the
> rounding errors only for the cleaned values.

Which sounds very unlikely. My guess is that "very same" is not true.

> We have 8.25h of heating time
> "Production" database:
> For this day (Aug. 21) with 8.25h heattime I get:
> totalheattimeclean = totalheattime = 8.16h

33 periods are "1", 63 others are 0.
Suppose you are looking at 97, not 96, intervals:
33/97 * 24 = 8.164948453608247422680412371134
which "%2.2lf" prints as 8.16.
A bug may have crept in somewhere.

> "Artificial" database:
> For this day (Aug. 21) with 8.25h heattime I get:
> totalheattimeclean = 8.16; totalheattime = 8.25h
>
>
> I thought, well, I am comparing apples with oranges.
> I verified:
> - parameters are the identical (copy & paste any and all parameter into
> testdata script)
> - used same environment (e.g. export LANG=de_DE.UTF8)
> - used the same data
>
> I took the data from my "production" rrd database and wrote it with
> enclosed
> script into a new rrd database.
> I then did a fetch with
> rrdtool fetch /testtemp/temp_pool_debug.rrd -s 1408572000 -e 1408658400
> AVERAGE
> on both databases and compared the resulting file - no difference except
> for
> the timestamp
> 1408659300, which is outside of the timeframe. Or is this the key?

You ask for "--end 1408658400", yet RRDtool gives you the interval beyond that. If memory serves me right, this came up a couple of years ago; for some unknown reason it was decided not to fix this bug, and the workaround was to use "--end 1408658399".

HTH
Alex
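The interval counting and the 97-interval suspicion from the reply above can be checked with a few lines of Python. This is a sketch under stated assumptions, not a description of RRDtool internals: each row is stamped with the end of its interval, a clean 24 h window therefore holds the stamps with start < t <= end, and the helper names `interval_end_stamps` and `printed_hours` are invented for this illustration.

```python
STEP = 900  # seconds per interval


def interval_end_stamps(start, end, step=STEP):
    """End-of-interval timestamps t with start < t <= end."""
    first = (start // step + 1) * step
    return list(range(first, end + 1, step))


def printed_hours(ones, intervals, factor=24):
    """Average of 0/1 flags times a factor, formatted like "%2.2lf"."""
    return f"{ones * factor / intervals:.2f}"


start, end = 1408572000, 1408658400  # the fetch window quoted in the thread
stamps = interval_end_stamps(start, end)
print(len(stamps), stamps[0] - start, stamps[-1] == end)  # 96 900 True
print(end + STEP)  # 1408659300, the extra row seen in the fetch output

# Figures reported in the thread, with 96 versus 97 rows in the divisor:
print(printed_hours(33, 96), printed_hours(33, 97))  # 8.25 8.16
print(printed_hours(17, 96), printed_hours(17, 97))  # 4.25 4.21
print(printed_hours(42, 97, factor=15))              # 6.49
```

All three unexpected figures from the thread (8.16, 4.21 and the earlier 6.49 with the factor 15) are what one extra row in the divisor would print. If that extra row were unknown in the artificial database but 0 in production, only the zero-substituted series would count it there, which would match both reports; the thread itself does not confirm this.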