Forecasting Water Demands: Part II

In Part I of this series by John Cook and Edwin Roehl of ADMI, we learned that sensitivity analysis quantifies the relationships between a dependent variable of interest and its causal variables; for example, we know that demand depends in some way on ambient temperature and precipitation. Computing sensitivities requires defining the relationships between variables through modeling. Models generally fall into one of two categories: deterministic and empirical. Deterministic models are created from first-principles equations, while empirical models adapt generalized mathematical functions to fit a line or surface through data from two or more variables. Calibrating either type of model is an attempt to fit that line or surface to the observed data as closely as possible. Calibration is difficult when the data has substantial measurement error or is incomplete, and the variables for which data is available may explain only part of the observed variability. The principal advantages of empirical models over deterministic models are that they can be developed much faster and are more accurate when the modeled systems are well characterized by data. However, empirical models are prone to problems when poorly applied. Overfitting, and multicollinearity caused by correlated input variables, can lead to invalid mappings between input and output variables (Roehl et al. 2003).
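As a rough illustration of the multicollinearity problem, the sketch below computes the correlation r between two candidate inputs and the resulting variance inflation factor (VIF), which for a model with exactly two inputs equals 1/(1 - r^2). The function names and every data value are made up for illustration and do not come from the authors' work. A large VIF suggests that the two inputs carry nearly the same information, so a fitted model cannot reliably separate their effects.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def vif_two_inputs(a, b):
    """Variance inflation factor for a model with exactly two inputs."""
    r = pearson(a, b)
    if r * r >= 1.0:
        return math.inf  # inputs are perfectly collinear
    return 1.0 / (1.0 - r * r)

# Hypothetical daily values (assumptions, not measured data).
temp_c = [18.0, 21.0, 25.0, 30.0, 27.0, 22.0]   # air temperature
temp_2 = [18.5, 20.5, 25.5, 29.5, 27.5, 21.5]   # a second, nearly redundant sensor
rain_mm = [0.0, 12.0, 3.0, 0.0, 5.0, 1.0]       # rainfall

vif_redundant = vif_two_inputs(temp_c, temp_2)  # large: inputs nearly duplicate each other
vif_distinct = vif_two_inputs(temp_c, rain_mm)  # close to 1: inputs are weakly correlated
```

In a setting like this, one would typically drop or combine one of the redundant inputs before calibrating the model.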

The most common empirical approach to demand forecasting is ordinary least squares (OLS), which relates variables using straight lines. When more variables are involved, planes or hyperplanes are used, whether the actual relationships are linear or not. Ballard (2003) suggests, “Given the changing nature of technology and the globalization of business and financial markets, it is becoming increasingly important to be able to more quickly and accurately predict trends and patterns in data in order to maintain competitiveness. More specifically, it is becoming increasingly important for forecasting models today to be able to detect nonlinear relationships while allowing for high levels of noisy data and chaotic components.” The issue, then, is being able to model chaotic behavior so that forecasts can capture more complex relationships. Accordingly, Ballard reviewed the use of artificial neural networks (ANNs), a “machine learning” technique from the field of AI, in several financial prediction applications including securities management, fraud detection, risk modeling, stock price forecasting, and forecasting macroeconomic variables such as GDP[1]. Charytoniuk et al. (2000) described how ANNs can be used to forecast electric power demand in markets transitioning to deregulation. Their approach created different customer classes, modeled the historical demand of each class with ANNs, and then aggregated the predicted demands to produce a total demand estimate. Jensen (1994) provides details of the “multi-layer perceptron” (MLP) ANN, the type used in the applications described by Ballard and by Charytoniuk et al., and in the approach used by the authors. MLP ANNs can synthesize functions to fit high-dimension, non-linear multivariate data. Devine et al. (2003) and Conrads and Roehl (2004) describe their use in multiple applications to model and control combined man-made and natural systems, including disinfection byproduct formation, industrial air emissions monitoring, and surface water systems impacted by point and non-point source pollution.
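To make the limitation of straight-line fitting concrete, here is a minimal sketch that fits an OLS line to a deliberately curved relationship and inspects the residuals. The data is synthetic (y = x^2) and the function name `fit_line` is an assumption chosen for this example; nothing here is taken from the cited studies.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic, deliberately nonlinear data (y = x^2); not real demand data.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [xi ** 2 for xi in x]

slope, intercept = fit_line(x, y)

# Residual = observed value minus the value predicted by the straight line.
residuals = [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]
```

The residuals come out positive at both ends of the range and negative in the middle. That systematic pattern is the signature of a curved relationship that a straight line cannot follow, and it is the kind of structure a nonlinear function such as an ANN is meant to capture.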

The “chaotic components” mentioned by Ballard allude to the dynamic nature of variable relationships that change in time. Chaos Theory provides a conceptual framework called “state space reconstruction” (SSR) for representing dynamic relationships. Data collected at a point in time can be organized as a vector of measurements, e.g., element one of the vector might be the demand, element two the rainfall, and so on. Engineers will say that a process evolves from one state to another in time and that a vector of measurements (a.k.a. a “state vector”) represents the process’ state at the moment the measurements were taken. A sequence of state vectors represents a “state history.” Mathematicians will say that the state vector is a point in a “state space” having a number of dimensions equal to the number of elements in the vector, e.g., eight vector elements correspond to eight dimensions. Empirical modeling is the fitting of a multidimensional surface to the points arrayed in state space.
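The vocabulary above can be sketched in a few lines. The variable names, units, and values below are hypothetical and serve only to show how parallel measurement series become state vectors, a state history, and a state-space dimension count.

```python
# Hypothetical daily measurements (illustrative values only).
demand_mgd = [10.2, 11.0, 12.4, 11.8]   # water demand
rain_mm = [0.0, 5.0, 0.0, 2.0]          # rainfall
temp_c = [24.0, 22.0, 27.0, 26.0]       # air temperature

def state_history(*series):
    """Return one state vector per time step, with one element per variable."""
    if len({len(s) for s in series}) != 1:
        raise ValueError("all series must cover the same time steps")
    return [list(values) for values in zip(*series)]

history = state_history(demand_mgd, rain_mm, temp_c)

# Each state vector is a point in state space; the number of elements
# in the vector is the number of dimensions of that space.
dimensions = len(history[0])
```

Here `history[1]` is the state vector for the second day, and the three variables place every point in a three-dimensional state space.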

Chaos Theory proposes that a process can be optimally represented (reconstructed) by a collection of state vectors Y(t), each built from an optimal number of measurements, equal to the “local dimension” dL, that are spaced in time by integer multiples of an optimal time delay td (Abarbanel 1996)[2]. For a multivariate process of k independent variables:

Y(t) = {[x1(t), x1(t – td1), …, x1(t – (dL1 – 1)td1)], …, [xk(t), xk(t – tdk), …, xk(t – (dLk – 1)tdk)]}         eq. 1

where each x(t,tdi) represents a different dimension in state space, and therefore a different element in a state vector. Values of dL and td are estimated analytically or experimentally from the data. The mathematical formulations for models are derived from those for state vectors. To forecast, that is, to predict a dependent variable of interest y(t) from prior measurements of k independent variables (Roehl et al. 2000):

y(t) = F{[x1(t – tp1), x1(t – tp1 – td1), …, x1(t – tp1 – (dM1 – 1)td1)], …, [xk(t – tpk), xk(t – tpk – tdk), …, xk(t – tpk – (dMk – 1)tdk)]}         eq. 2

where F is an empirical function such as an ANN, each x(t,tpi,tdi) is a different input to F, and tpi is yet another time delay. For each variable, tpi is either: 1) constrained to the time delay at which an input variable becomes uncorrelated to all other inputs, but can still provide useful information about y(t); or, 2) constrained to the time delay of the most recent available measurement of xi; or, 3) the time delay at which an input variable is most highly correlated to y(t). Here, the state space local dimension dL of Equation 1 is replaced with a model input variable dimension dM, which is determined experimentally. In general dM ≤ dL, and dM tends to decrease with increasing k.
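The following is a minimal sketch, under stated assumptions, of how the inputs of Equation 2 might be assembled and how option 3 for choosing tp could be approximated. The function names (`lagged_inputs`, `input_vector`, `best_delay`), the integer time-step indexing, and all data values are assumptions made for this example, and the empirical function F itself (for example an ANN) is not implemented. With tp = 0 and dM = dL, `lagged_inputs` yields one variable's block of the state vector in Equation 1.

```python
import math

def lagged_inputs(series, t, tp, td, dM):
    """One variable's inputs: x(t - tp), x(t - tp - td), ..., x(t - tp - (dM - 1) * td)."""
    idx = [t - tp - j * td for j in range(dM)]
    if min(idx) < 0 or max(idx) >= len(series):
        raise IndexError("not enough history for the requested delays")
    return [series[i] for i in idx]

def input_vector(variables, t, settings):
    """Concatenate the lagged inputs of all k variables; settings[name] = (tp, td, dM)."""
    vec = []
    for name, series in variables.items():
        tp, td, dM = settings[name]
        vec.extend(lagged_inputs(series, t, tp, td, dM))
    return vec

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def best_delay(x, y, max_delay):
    """Delay tp (0..max_delay) at which x(t - tp) is most strongly correlated with y(t).

    max_delay must leave at least two overlapping samples.
    """
    best_tp, best_r = 0, 0.0
    for tp in range(max_delay + 1):
        r = pearson(x[:len(x) - tp], y[tp:])  # pairs x(t - tp) with y(t)
        if abs(r) > abs(best_r):
            best_tp, best_r = tp, r
    return best_tp, best_r

# Hypothetical data: demand is constructed to follow temperature two steps later.
temp = [20.0, 22.0, 25.0, 31.0, 28.0, 24.0, 21.0, 26.0, 30.0, 27.0]
rain = [0.0, 0.0, 5.0, 0.0, 0.0, 12.0, 3.0, 0.0, 0.0, 1.0]
demand = [50.0, 54.0] + [2.0 * v + 10.0 for v in temp[:-2]]

tp_best, r_best = best_delay(temp, demand, max_delay=4)

# Inputs to F at the last time step, using assumed (tp, td, dM) per variable.
vec = input_vector({"temp": temp, "rain": rain}, t=9,
                   settings={"temp": (tp_best, 1, 3), "rain": (0, 1, 1)})
```

Because the demand series was built as a two-step-delayed function of temperature, the search selects a delay of two steps. With real data the correlation peak is usually less sharp, and the choice among the three options for tp remains a modeling judgment.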

The ability to model chaotic behavior is critical to accurate demand forecasting because weather plays a major role in water demand, and weather behaves chaotically. The next article will demonstrate how the theory is applied in practice.


Abarbanel, H.D.I., 1996, Analysis of Observed Chaotic Data, Springer-Verlag New York, Inc., New York, 4-12, 39.

Ballard, R., 2003, “Forecasting with Neural Networks – A Review,” National Social Science J., Feb. 24, 2003.

Conrads, P.A. and Roehl, E.A., 2004, “Integration of Data Mining Techniques with Mechanistic Models to Determine the Impacts of Non-Point Source Loading on Dissolved Oxygen in Tidal Waters,” In Proc. South Carolina Environmental Conference, Myrtle Beach, March 2004.

Charytoniuk, W., Box, E.D., Lee, W.J., Chen, M.S., Kotas, P., and Van Olinda, P., 2000, “Neural-Network-Based Demand Forecasting in a Deregulated Environment,” In IEEE Transactions on Industry Applications, 36(3).

Devine, T.W., Roehl, E.A., and Busby, J.B., 2003, “Virtual Sensors – Cost Effective Monitoring,” In Proc. Air and Waste Management Association Annual Conference, June 2003.

Jensen, B.A., 1994, Expert Systems – Neural Networks, Instrument Engineers’ Handbook Third Edition, Chilton, Radnor PA.

Roehl, E.A., Conrads, P.A., and Roehl, T.A., 2000, “Real-Time Control of the Salt Front in a Complex, Tidally Affected River Basin,” Proceedings of the Artificial Neural Networks in Engineering Conference, St. Louis, 947-954.

Roehl, E.A., Conrads, P.A., and Cook, J.B., 2003, “Discussion of Using Complex Permittivity and Artificial Neural Networks for Contaminant Prediction,” J. Env. Engineering., Nov. 2003, pp. 1069-1071.

Weiss, S.M. and Indurkhya, N., 1997, Predictive Data Mining: A Practical Guide, Morgan Kaufmann.

[1] Many types of ANNs are used to solve different kinds of problems, and are part of the data mining toolkit.

[2] In Chaos Theory, dL and td are called “dynamical invariants”, and are analogous to the amplitude, frequency, and phase angle of periodic time series.



