data analytics, Data to Insight, Resistance to Instrumentation, Smart Water Grid, Water Technology Innovations

Part IV: Data management and the phenomenon of data richness versus information poverty

In this week’s edition of the Resistance to the Effective Use of Instrumentation series, Oliver Grievson examines data management and the phenomenon of data richness versus information poverty, and how it creates resistance to the effective use of instrumentation.

The modern water utility, depending upon its size, will produce a vast plethora of data from its instrumentation every day. Let us take the example of water company A. It has 500 wastewater treatment works, each of them with monitoring on the final effluent for BOD, suspended solids, ammonia and flow. Converting this into numbers, this is 2,000 instruments recording at a frequency of every 15 minutes (or 96 times a day). The number of individual readings is a staggering 192,000 per day. In practice, of course, not every treatment works is monitored with an instrument, and not at a 15-minute interval, but some treatment works have a lot more than four instruments. The point of this quick calculation is that the water industry worldwide produces individual pieces of data numbering into the billions every day of the year. It would be fair to say that the industry is “data rich.”
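The back-of-the-envelope arithmetic above can be sketched in a few lines of Python (the figures are the illustrative ones used in the example for the hypothetical water company A, not real telemetry):

```python
# Back-of-the-envelope data volume for the hypothetical "water company A"
# described above (illustrative figures, not real telemetry).
works = 500                        # wastewater treatment works
instruments_per_works = 4          # BOD, suspended solids, ammonia, flow
readings_per_day = 24 * 60 // 15   # one reading every 15 minutes = 96

total_instruments = works * instruments_per_works      # 2,000 instruments
daily_readings = total_instruments * readings_per_day  # 192,000 readings/day

print(total_instruments, daily_readings)  # 2000 192000
```

Scale the same sum across every water company worldwide and the "billions of pieces of data a day" figure follows quickly.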

The practicality of this is that it is virtually impossible to economically analyse all of this data! So what is the solution?

Quite simply: tailor the data and convert it into the information that you, and your end customer within the business, need. So what does this look like?

The diagram shows typical stakeholders. It is not meant to be all-inclusive, but it includes groups of people who need to see different amounts of data and different levels of information. The billing department needs to measure consumers’ consumption and bill them appropriately. The operators at the treatment works need to see how the process is working and make adjustments as necessary.

In real terms though what does this actually mean?

Let’s take a typical wastewater treatment works with an activated sludge plant. The treatment process would typically have an aeration source, MLSS measurement, dissolved oxygen measurement, RAS pumps, RAS suspended solids, flow measurement and maybe some quality measurement in the main reactor as well. If the settlement tanks are included, then rotation sensors and sludge blanket detection can be added, and if the bellmouths are actuated, the degree to which each actuator is open. Additionally, add the amount of sludge wasted.

Using the exercise that we performed earlier, our activated sludge plant has 10-15 instruments measuring every 15 minutes, or 960-1,440 different pieces of data a day, and this is an underestimate! If the operators had to analyse all of it, data analysis would become the majority of their day. In reality, what can be done? With a bit of data manipulation there are only a few pieces of information that the operator actually needs, reducing the data to a far more manageable level.



Data (raw readings):
- Inlet flow
- RAS flow
- SAS flow
- MLSS concentration
- RAS concentration
- DO concentration
- DO control valve position
- Settlement tank blanket levels
- Effluent ammonia concentrations
- Effluent solids concentrations
- and so on…

Information (derived):
- Average & peak flow (inlet & RAS)
- Sludge age
- Average mixed liquor concentrations
- Low, average & high DO concentrations
- Amount of sludge wasted (kg)
- Average & peak sludge blanket level
- SSVI3.5
- Solids flux ratio (actual v design)
- Total & average load

Amount of data: 1,440 figures. Pieces of information: 15.
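As a rough illustration of that data-to-information step, the sketch below (hypothetical field names, with the day's readings simulated rather than pulled from a real SCADA historian) collapses 192 raw 15-minute DO and flow readings into five summary figures of the kind listed above:

```python
import statistics

# Hypothetical: one day (96 intervals of 15 minutes) of readings from two
# instruments. In practice these would come from the telemetry historian.
do_readings = [1.8 + 0.4 * (i % 5) for i in range(96)]    # DO, mg/l (simulated)
flow_readings = [120 + 30 * (i % 8) for i in range(96)]   # inlet flow, l/s (simulated)

# 192 raw figures collapse into 5 pieces of information for the operator.
info = {
    "do_low": min(do_readings),
    "do_average": round(statistics.mean(do_readings), 2),
    "do_high": max(do_readings),
    "flow_average": round(statistics.mean(flow_readings), 1),
    "flow_peak": max(flow_readings),
}
print(info)
```

The same pattern (min/mean/max plus derived quantities such as sludge age or solids flux) repeats across every instrument on the works, so the operator reads a handful of figures rather than a thousand.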

This is a grossly simplified example, but the point is that the difference between the amount of data produced and the amount of information needed to run the treatment process is vast.

Converting the data into information is an incredibly large task, especially when tailoring it to the different stakeholders, and it is the enormity of converting the data produced by instrumentation into information that creates the resistance to the effective use of instrumentation.

Some companies in the industry are attempting this task at this very time, updating their systems to take into account the different needs of the stakeholders and, of course, the company, and it is these companies that will hold the advantage in the future.


About noahmorgenstern

Entrepreneurial Warlock, mCouponing evangelist, NFC Rabbi, Innovation and Business Intelligence Imam, Secular World Shaker, and General All Around Good Guy


7 thoughts on “Part IV: Data management and the phenomenon of data richness versus information poverty”

  1. Wonderful article! I like to think of data as musical notes and information as a symphony; or data as nucleotides and information as the human genome.

    We are presently working on a project that has well over 20 GB of data but virtually no information. While data is critical, we make decisions (hopefully!) based upon information and information assimilated into knowledge; hence, knowledgeable (information-rich) decision-making.

    Posted by John B Cook | February 17, 2012, 1:26 pm
  2. Sorry Noah, but I have to disagree on a number of points. Coming from a biotech background, the amount of data that is typically dealt with in this industry is minuscule. I’m not trying to under-rate the importance of that data, but merely to point out that there are many other “data rich” industries, dwarfing this one, which run into the same challenges of how to extract information from data. A single next-generation mass spectrometer can produce a terabyte of biological data in a month; a single next-generation genome sequencer can generate that in a single run.

    Like most of these challenges, if you can create an adoptable standard around the gathering of data, then the challenge you are looking at becomes creating automated and easily replicated processes to deal with that data, which in turn create your information.

    Some fantastic “turn data into usable information” products that I have personal experience with are: a commercial offering, Pipeline Pilot from Accelrys, and a very good open-source product called KNIME.



    Posted by Sheldon Foisy | February 17, 2012, 4:30 pm
  3. I agree with John. This was a good article. Data collection today is becoming more important as time progresses. The advancements that we have seen in both instrumentation and general SCADA in the last few years have enabled us to provide the types of end data users can find most useful.

    The great thing I have found is to utilize the technology in the field end and have it (remote data loggers) pre-process and consolidate data before transmission back to the central location.

    You often find that not all of your stakeholders for the data are at the central location; some can also be located at the remote site. The nice thing is that “big picture” data can be transferred to the remote site via peer-to-peer communications. Data products can range from different logged data collections to real-time displays on the remote end.

    Designing for the needs of the stakeholders and having the ability to leverage it within the system often gives us unique programming challenges to solve in supporting the system. Oliver’s article covered it quite well.

    Posted by Dave Gunderson | February 17, 2012, 7:46 pm
  4. @Sheldon –

    First, the credit for the article goes to Oliver Grievson. I am simply the vehicle for its dissemination.

    The water industry is definitely lagging behind other data-rich industries, that’s for certain. I think a better comparison would be to the energy industry. That industry was highly successful in speeding up the commercialization and utilization of such data-to-knowledge technologies because it was able to band together and, like you said, (a) standards were discussed early on, but also (b) the financial motivation for optimization and efficiency was recognized early, (c) it was quantified (giving an ROI to managers), and (d) change managers were used to make sure the internal constituents understood the need from their own perspectives. Everyone could therefore “buy in” to the system, so that a successful implementation of ICT, smart technologies, and process automation and control would be realized, rather than some sort of resistance being the break in the chain that caused the technology’s potential to fail.

    Taking that comparison a bit further, you already see vast amounts of instrumentation available that can provide this data, and technologies are arriving daily that are “sensor agnostic,” able to capture and log the data regardless of make, model or data format. I think these types of data management, analytics, and enterprise management tools are going to be the first to be utilized by early adopters. Specifically, the energy industry recognized the need, once telemetry was in place, to use software tools for alarm rationalization or severity management to optimize resource utilization. It also recognized that pattern-recognition software using advanced algorithms to detect anomalies would greatly improve early warnings of critical errors in the plant. When you look at that industry today, complete expert systems have replaced legacy IT systems to unify data management and transfer between divisions, facilities, and constituents. Even further, you see full automation and processes handled through this unification of data, correlation, situation analysis, and automated procedures recommended or initiated to optimize operating activities.

    @John – Through your efforts I think it’s fairly clear to all our blog readers that the mathematical possibility exists to optimize most processes in a water treatment works and WWTP, so long as you can build the correct model and utilize it in a properly designed closed/open-loop system that takes the data, analyses it, and automates and controls processes. I think the potential for reductions in opex using these techniques is endless.

    @Dave – you bring up a great point. If the field teams and mobile workforce are not taken into account, the potential of these new technologies is not being realized. Like you said, through peer-to-peer transfer technologies, information can be shared in real time on a PDA or other mobile device and relayed for an even more accurate “big picture” view.

    Noah Morgenstern

    If you would like to find information on Whitewater’s data driven technologies you can download the data sheets in the widget on the blog in the right hand toolbar:

    -BlueBox – event detection system (anomaly recognition) alarm rationalization, and analytics

    -WaterWall – enterprise operational management software (Situation awareness and operations)

    Posted by noahmorgenstern | February 18, 2012, 7:43 am
  5. You can also ask why the water industry hasn’t developed as the energy industry, or indeed the biotechnology industry, has. This comes down to the perceived low value of water, and especially of wastewater treatment and supply. For the price of a packet of aspirin (let’s say $1) you “buy” 1,000 L of water; both are remarkable products that save or maintain lives, but there is quite a cost differential there.

    The value of water is under-rated, especially considering that a poll in 2007 by the British Medical Journal found sanitation to be the most important breakthrough since 1840 and the days of Dr Chadwick & Dr Snow, directly responsible for increasing life expectancies by approximately 30 years!

    Yes, the water industry is behind most other industries because, to my mind at least, it is the great “hidden industry” (and rightly so): people open their taps and clean water comes out; they flush their toilets and their sewage disappears.

    Posted by Oliver Grievson | February 18, 2012, 8:43 am
  6. Oliver makes a great point:

    Water is not valued at market level in many places around the world. Why is it that water is viewed as a right and not as a commodity (i.e. oil)? He also hit on water as a “buried asset”: it’s out of sight, out of mind.

    Seth Johnstone is creating a great series directly highlighting these topics. Please check out his first post if you have not already:

    Outta Sight Outta Mind or Outta Mind Outta Site: Clean Water and Why We Forget About It

    Posted by noahmorgenstern | February 18, 2012, 9:11 am
  7. @Sheldon,

    You make a good point, and one that is not in disagreement with Oliver’s, that there are indeed many industries that have copious amounts of data but obtain less-than-optimal information from them. You have given another excellent example of how easily data “explodes” as the technology becomes more sophisticated. But, of course, Oliver was particularly addressing the problem in the water/wastewater industry.

    I see this problem in other industries, but, generally speaking, they have established ways in which the data is transformed into information and put to a greater level of application. That is the chief difference in what I have observed in practice. But you are correct that all industries have the ability with current technology to generate more data than can be optimally leveraged, and some industries generate far more than others.

    Posted by John B Cook | February 20, 2012, 5:20 pm
