This one of our supplements to our whitepaper on open-source NMEC tools.

In this post, I’ll address some common questions about the occupancy schedule detection algorithm used in RMV2.0 and other popular NMEC tools, using code to find the answers.

Most of these questions are answered by interacting with the widget at the end of the page.

The previous supplement discussed the basic TOWT model. This post will not discuss the GBM model, which is the focus of the next supplement.

In this notebook, we try to visualize the automatic occupancy detection algorithm used in the RMV2.0 time-of-week temperature (TOWT) model. This article does not promote use of the algorithm for all cases. You should determine whether it is appropriate for your building and usage history.

Intro

The occupancy detection is done by findOccUnocc(). Comments in the code explain the proccess:

Define ‘occupied’ and ‘unoccupied’ based on a regression of load on outdoor temperature: times of week that the regression usually underpredicts the load will be called ‘occupied’, the rest are ‘unoccupied’ This is not foolproof but usually works well.
If the regression underpredicts the load more than 65% of the time then assume it’s an occupied period.

Some details are important to clarify:

Here is the call stack:

Recovering the data

Can we recover enough data to show the process, from a saved project file? If not, at what point in the code do we need to store some more data?

# Load the 'project' file saved when you used the RMV2.0 add-in.
rds_file <- "C:/RMV2.0 Workshop/something/Project_02.19.rds"
Project <- readRDS(rds_file)
i = 1

What next? Here, we recreate the function call to towt_baseline.

pre_Data_i <- Project$Data_pre[[i]]
timescaleDays <- Project$model_obj_list$models_list[[i]]$towt_model$timescaleDays
intervalMinutes <- Project$Data_pre_summary[1,6]
fahrenheit <- Project$fahrenheit

res_baseline <- towt_baseline(train_Data = pre_Data_i,
                              pred_Data = pre_Data_i,
                              timescaleDays = timescaleDays,
                              intervalMinutes = intervalMinutes,
                              fahrenheit = fahrenheit,
                              )
## Loading required package: shiny
## For time weights, tcenter =  2006-01-01 01:00:00 MST
## In fitLBNLregress, reduced tempKnots:
## [1]  4.444444 12.777778 18.333333 26.666667
## For time weights, tcenter =  2006-03-15 01:00:00 MST
## In fitLBNLregress, reduced tempKnots:
## [1]  4.444444 12.777778 18.333333 26.666667
## For time weights, tcenter =  2006-05-27 MST
## In fitLBNLregress, reduced tempKnots:
## [1]  4.444444 12.777778 18.333333 26.666667
## For time weights, tcenter =  2006-08-08 MST
## In fitLBNLregress, reduced tempKnots:
## [1]  4.444444 12.777778 18.333333 26.666667
## For time weights, tcenter =  2006-10-19 23:00:00 MST
## In fitLBNLregress, reduced tempKnots:
## [1]  4.444444 12.777778 18.333333 26.666667
## For time weights, tcenter =  2006-12-31 23:00:00 MST
## In fitLBNLregress, reduced tempKnots:
## [1]  4.444444 12.777778 18.333333 26.666667

Now, to illustrate some of the data structures used in the next step …

train <- Project$model_obj_list$models_list[[i]]$train
train
train$time <- as.POSIXlt(train$time, format="%m/%d/%y %H:%M")
head(train$time)
## [1] "2006-01-01 01:00:00 MST" "2006-01-01 02:00:00 MST"
## [3] "2006-01-01 03:00:00 MST" "2006-01-01 04:00:00 MST"
## [5] "2006-01-01 05:00:00 MST" "2006-01-01 06:00:00 MST"
pred <- Project$model_obj_list$models_list[[i]]$pred
pred
pred$time <- as.POSIXlt(pred$time, format="%m/%d/%y %H:%M")
head(pred$time)
## [1] "2006-01-01 01:00:00 MST" "2006-01-01 02:00:00 MST"
## [3] "2006-01-01 03:00:00 MST" "2006-01-01 04:00:00 MST"
## [5] "2006-01-01 05:00:00 MST" "2006-01-01 06:00:00 MST"

What next? Here, we recreate the function call to makeBaseline. The function loops over a list of timestamps spaced by timescaleDays (the hyperparameter set by the user in the GUI, from 15 to 90 days). At each step, it sets up weights centered at the timestamp and calls fitLBNLregress to create a mini-regression. It then bundles all these regressions together.

verbosity=5
towt_model <- makeBaseline(train$time,
                             train$eload,
                             train$Temp,
                             pred$time,
                             pred$Temp,
                             intervalMinutes=intervalMinutes,
                             timescaleDays=timescaleDays,
                             fahrenheit=fahrenheit,
                             verbose=verbosity)
## [1] "starting makeBaseline()"
## [1] "running regression at 6 steps"
## [1] "starting model run number 1"
## For time weights, tcenter =  2006-01-01 01:00:00 MST
## [1] "starting fitLBNLregress()"
## [1] "done determining occupied hours"
## In fitLBNLregress, reduced tempKnots:
## [1]  4.444444 12.777778 18.333333 26.666667
## [1] "done setting up temperature matrix"
## [1] "fitting regression for occupied periods"
## [1] "done with prediction for occupied periods"
## [1] "fitting regression for unoccupied periods"
## [1] "done with prediction for unoccupied periods"
## [1] "leaving fitLBNLregress()"
## [1] "starting model run number 2"
## For time weights, tcenter =  2006-03-15 01:00:00 MST
## [1] "starting fitLBNLregress()"
## [1] "done determining occupied hours"
## In fitLBNLregress, reduced tempKnots:
## [1]  4.444444 12.777778 18.333333 26.666667
## [1] "done setting up temperature matrix"
## [1] "fitting regression for occupied periods"
## [1] "done with prediction for occupied periods"
## [1] "fitting regression for unoccupied periods"
## [1] "done with prediction for unoccupied periods"
## [1] "leaving fitLBNLregress()"
## [1] "starting model run number 3"
## For time weights, tcenter =  2006-05-27 MST
## [1] "starting fitLBNLregress()"
## [1] "done determining occupied hours"
## In fitLBNLregress, reduced tempKnots:
## [1]  4.444444 12.777778 18.333333 26.666667
## [1] "done setting up temperature matrix"
## [1] "fitting regression for occupied periods"
## [1] "done with prediction for occupied periods"
## [1] "fitting regression for unoccupied periods"
## [1] "done with prediction for unoccupied periods"
## [1] "leaving fitLBNLregress()"
## [1] "starting model run number 4"
## For time weights, tcenter =  2006-08-08 MST
## [1] "starting fitLBNLregress()"
## [1] "done determining occupied hours"
## In fitLBNLregress, reduced tempKnots:
## [1]  4.444444 12.777778 18.333333 26.666667
## [1] "done setting up temperature matrix"
## [1] "fitting regression for occupied periods"
## [1] "done with prediction for occupied periods"
## [1] "fitting regression for unoccupied periods"
## [1] "done with prediction for unoccupied periods"
## [1] "leaving fitLBNLregress()"
## [1] "starting model run number 5"
## For time weights, tcenter =  2006-10-19 23:00:00 MST
## [1] "starting fitLBNLregress()"
## [1] "done determining occupied hours"
## In fitLBNLregress, reduced tempKnots:
## [1]  4.444444 12.777778 18.333333 26.666667
## [1] "done setting up temperature matrix"
## [1] "fitting regression for occupied periods"
## [1] "done with prediction for occupied periods"
## [1] "fitting regression for unoccupied periods"
## [1] "done with prediction for unoccupied periods"
## [1] "leaving fitLBNLregress()"
## [1] "starting model run number 6"
## For time weights, tcenter =  2006-12-31 23:00:00 MST
## [1] "starting fitLBNLregress()"
## [1] "done determining occupied hours"
## In fitLBNLregress, reduced tempKnots:
## [1]  4.444444 12.777778 18.333333 26.666667
## [1] "done setting up temperature matrix"
## [1] "fitting regression for occupied periods"
## [1] "done with prediction for occupied periods"
## [1] "fitting regression for unoccupied periods"
## [1] "done with prediction for unoccupied periods"
## [1] "leaving fitLBNLregress()"
## [1] "leaving makeBaseline()"

What next? Here, we recreate the function call to fitLBNLregress.

dataTime <- train$time
dataLoad <- train$eload
dataTemp <- train$Temp
predTime <- pred$time
predTemp <- pred$Temp

tempKnots = (c(40, 55, 65, 80, 90)-32)*5/9
doTemperatureModel<-T
verbose<-5

npoints = length(dataLoad)
t0 = min(dataTime,na.rm=T)
t1 = max(dataTime,na.rm=T)
deltaT = as.numeric(difftime(t1,t0,units="days"))
nsegments = max(1,ceiling(deltaT/timescaleDays))
segmentwidth = (npoints-1)/nsegments
pointlist = floor(sort(npoints-segmentwidth*(0:nsegments))+0.001)
nModelRuns = max(1,length(pointlist))

#for (irun in 1:nModelRuns)
irun <- 1
tcenter = dataTime[pointlist[irun]]
tDiff = as.numeric(difftime(tcenter,dataTime,units="days"))
tDiffPred = as.numeric(difftime(tcenter,predTime,units="days"))
weightvec = timescaleDays^2/(timescaleDays^2 + tDiff^2)

regOut = fitLBNLregress(dataTime, dataLoad, dataTemp, predTime, predTemp,
            tempKnots = tempKnots, weightvec=weightvec,
            intervalMinutes=intervalMinutes,fahrenheit=fahrenheit,
            doTemperatureModel=doTemperatureModel,verbose=verbose)
## [1] "starting fitLBNLregress()"
## [1] "done determining occupied hours"
## In fitLBNLregress, reduced tempKnots:
## [1]  4.444444 12.777778 18.333333 26.666667
## [1] "done setting up temperature matrix"
## [1] "fitting regression for occupied periods"
## [1] "done with prediction for occupied periods"
## [1] "fitting regression for unoccupied periods"
## [1] "done with prediction for unoccupied periods"
## [1] "leaving fitLBNLregress()"

What next? Here, we recreate the function call to findOccUnocc (from the first step inside fitLBNLregress). Note:

timeVec <- dataTime
loadVec <- dataLoad
tempVec <- dataTemp
#predTime
#predTemp
#tempKnots
#weightvec
#intervalMinutes
#fahrenheit
#doTemperatureModel
#verbose

minuteOfWeek = 24*60*timeVec$wday+60*timeVec$hour + timeVec$min
intervalOfWeek = 1+floor(minuteOfWeek/intervalMinutes)

# If we have temperature data then fit the time-of-week-and-temperature model

if (fahrenheit) {
    # temperature vector is already in fahrenheit
    tempVecF = tempVec
    tempVec = (tempVec-32)*5/9
    tempVecPredF = predTemp
    tempVecPred = (predTemp-32)*5/9
} else {
    tempVecF = (tempVec*9/5)+32
    tempVecPredF = (predTemp*9/5)+32
    tempVecPred = predTemp
}

# findOccUnocc requires Fahrenheit temperatures; everywhere else we can use either
#  Celsius or Fahrenheit, as long as temperature knots are set appropriately
#
# base occupied/unoccupied decision only on cases where we have load data:
okload = !is.na(loadVec)
occInfo = findOccUnocc(intervalOfWeek[okload],loadVec[okload],tempVecF[okload])
head(occInfo,40)
##       uTOW okocc
##  [1,]    2     0
##  [2,]    3     0
##  [3,]    4     0
##  [4,]    5     0
##  [5,]    6     0
##  [6,]    7     0
##  [7,]    8     0
##  [8,]    9     0
##  [9,]   10     0
## [10,]   11     0
## [11,]   12     0
## [12,]   13     0
## [13,]   14     0
## [14,]   15     0
## [15,]   16     0
## [16,]   17     0
## [17,]   18     0
## [18,]   19     0
## [19,]   20     0
## [20,]   21     0
## [21,]   22     0
## [22,]   23     0
## [23,]   24     0
## [24,]   25     0
## [25,]   26     0
## [26,]   27     0
## [27,]   28     0
## [28,]   29     0
## [29,]   30     0
## [30,]   31     0
## [31,]   32     0
## [32,]   33     1
## [33,]   34     1
## [34,]   35     1
## [35,]   36     1
## [36,]   37     1
## [37,]   38     1
## [38,]   39     1
## [39,]   40     1
## [40,]   41     1
tail(occInfo,20)
##        uTOW okocc
## [149,]  150     0
## [150,]  151     0
## [151,]  152     0
## [152,]  153     1
## [153,]  154     1
## [154,]  155     1
## [155,]  156     1
## [156,]  157     1
## [157,]  158     1
## [158,]  159     0
## [159,]  160     0
## [160,]  161     0
## [161,]  162     0
## [162,]  163     0
## [163,]  164     0
## [164,]  165     0
## [165,]  166     0
## [166,]  167     0
## [167,]  168     0
## [168,]    1     0

What next? Here, we demonstrate the occupancy detection algorithm within findOccUnocc. Note:

intervalOfWeek2 <- intervalOfWeek[okload]
loadVec2 <- loadVec[okload]
TempF <- tempVecF[okload]
#intervalMinutes
#verbose

# Figure out which times of week a building is in one of two modes
#  (called 'occupied' or 'unoccupied')

# RMV2.0 does not sort this vector. Although that doesn't matter,
# I prefer to have it sorted now.
#uTOW = unique(intervalOfWeek2)
uTOW = sort(unique(intervalOfWeek2))
nTOW = length(uTOW)

# Define 'occupied' and 'unoccupied' based on a regression
# of load on outdoor temperature: times of week that the regression usually
# underpredicts the load will be called 'occupied', the rest are 'unoccupied'
# This is not foolproof but usually works well.
#
TempF50 = TempF-50
TempF50[TempF > 50] = 0
TempF65 = TempF-65
TempF65[TempF < 65] = 0

if (verbose > 4) {
    cat("Fitting temperature regression...\n")
}
## Fitting temperature regression...
amod = lm(loadVec2 ~ TempF50+TempF65,na.action=na.exclude)

okocc = rep(0,nTOW)
cat("Detecting occupancy ...\n")
## Detecting occupancy ...
for (itow in 1:nTOW) {
    okTOW = intervalOfWeek2==uTOW[itow]
    # if the regression underpredicts the load more than 65% of the time
    # then assume it's an occupied period
    if ( sum(residuals(amod)[okTOW]>0,na.rm=T) > 0.65*sum(okTOW) ) {
        okocc[itow]=1
    }
    if (itow < 40) {
  cat(glue('[{format(t,width=3)}] {format(nunder,width=2)}/{format(ntotal,width=2)} -> {occ}',
           t=uTOW[itow],
           nunder=sum(residuals(amod)[okTOW]>0,na.rm=T),
           ntotal=sum(okTOW),
           occ=okocc[itow]
           ),'\n')
    }
}
## [  1]  0/52 -> 0 
## [  2]  0/53 -> 0 
## [  3]  0/52 -> 0 
## [  4]  0/53 -> 0 
## [  5]  0/53 -> 0 
## [  6]  0/53 -> 0 
## [  7]  0/53 -> 0 
## [  8]  0/53 -> 0 
## [  9]  0/53 -> 0 
## [ 10]  0/53 -> 0 
## [ 11]  0/53 -> 0 
## [ 12]  0/53 -> 0 
## [ 13]  0/53 -> 0 
## [ 14]  0/53 -> 0 
## [ 15]  0/53 -> 0 
## [ 16]  0/53 -> 0 
## [ 17]  0/53 -> 0 
## [ 18]  0/53 -> 0 
## [ 19]  0/53 -> 0 
## [ 20]  0/53 -> 0 
## [ 21]  0/53 -> 0 
## [ 22]  0/53 -> 0 
## [ 23]  0/53 -> 0 
## [ 24]  0/53 -> 0 
## [ 25]  0/52 -> 0 
## [ 26]  0/52 -> 0 
## [ 27]  0/52 -> 0 
## [ 28]  0/52 -> 0 
## [ 29]  0/52 -> 0 
## [ 30]  0/52 -> 0 
## [ 31]  0/52 -> 0 
## [ 32] 12/52 -> 0 
## [ 33] 40/52 -> 1 
## [ 34] 46/52 -> 1 
## [ 35] 46/52 -> 1 
## [ 36] 46/52 -> 1 
## [ 37] 46/52 -> 1 
## [ 38] 46/52 -> 1 
## [ 39] 46/52 -> 1
cat("Etc.\n")
## Etc.
occInfo = cbind(uTOW,okocc)

Visualization

To visualize the occupancy detection algorithm, I am using a customized HTML widget, mostly scripted in Javascript with Charts.js. (This is more interactive than an animation, and works better because it doesn’t try to animate the point cloud in the background.)

Try pointing the mouse over the time-of-week grid:

# Step 1. Export the data to the widget.
mydata=data.frame(x=TempF,y=loadVec2,ypred=predict(amod),tow=intervalOfWeek2,timeVec=timeVec)
#mydata[mydata$tow<10,]
dataByTOW=uTOW %>% purrr::map(~ mydata[mydata$tow==.x,])

#fitLine = unique(mydata[with(mydata,order(x)),c('x','ypred')])
# Using the magrittr pipe for a more object-oriented writing style.
fitLine = data.frame(x=TempF,y=predict(amod)) %>% unique() %>% arrange(x)

# Step 2. Display the widget.
library(mywidget)
occupancy_widget("hello, world!",dataByTOW, fitLine, 'r', mydata,
                 width='auto', height='auto')

Time of week

clock clock
Sunday
Monday
Tuesday
Wednesday
Thursday
Friday
Saturday

65%
Hello world

End of supplement 2

To summarize, we explored the call stack, showing how RMV2.0 invokes the automatic detection algorithm for occupancy schedules. We saw that there is a very simple linear regression (not yet the regression used in our final model) being used to draw a dividing line through our data, classifying it into low and high usage regimes. And we saw that a large number of holidays or unhandled daylight savings time shifts can confuse the algorithm.

Thank you for reading. Go back to article or supplements.