FAQs – The Long Answers

Following is more in-depth information about NOAH’s primary modeling techniques.
What is Artificial Intelligence?
Artificial Intelligence, or AI, is the term coined for the seminal summer workshop convened at Dartmouth College in 1956, which brought together the pioneers of this new and exciting area of research. Participating luminaries included Marvin Minsky and John McCarthy, who later founded the AI laboratories at MIT and Stanford University, respectively. Today, AI refers to the use of computers to perform tasks that would otherwise require human intelligence, processing and analyzing far larger amounts of data more quickly and effectively than humans can.
The theoretical foundations of AI trace back to the seminal work of the brilliant mathematician Alan Turing, who famously helped crack the German Enigma code during World War II. While Turing’s work was theoretical in nature, scientists have, beginning with the pioneering work of McCulloch and Pitts in 1943, developed mathematical structures and learning algorithms that attempt to mimic human learning.
From that time forward, AI has evolved through a series of theoretical and practical breakthroughs spanning many decades. The underlying theory of AI is predicated on the principle that human learning can be replicated by a brain-like, interconnected mathematical structure that learns a given behavior by processing representative data patterns through its architecture.
The mathematical roots of the most fundamental AI learning algorithm, back-propagation, date to the 1960s. It was not until the 1980s, however, that the theoretical foundation was established showing that AI could, in principle, learn even the most complex non-linear behavior.

Still, it took the recent convergence of automatic data collection systems (i.e., Big Data) with incredibly powerful computing technology for AI to truly come of age. Today, AI technology is embedded in virtually all sectors, from the most obvious, like science and technology, to the far less obvious, like the social sciences and even art. Although various AI structures have been developed for specific types of applications, the multi-layered artificial neural network, commonly referred to as the ANN, is the most widely used AI structure and is presented in some detail below.

How are ANN models different from traditional physics-based and statistical-based models?
Unlike physics-based models such as numerical groundwater flow models, ANNs do not rely upon governing physical laws (e.g., Conservation of Momentum). Consequently, difficult-to-estimate physical parameters (e.g., hydraulic conductivity, streambed thickness, etc.) are typically not required for their development and operation. Instead, more easily measurable and less uncertain variables like water levels and weather conditions can be used as inputs or predictor variables.
Additionally, unlike physics-based and statistical models, ANNs are not constrained by simplifying mathematical assumptions (e.g., linear system, normal distribution, etc.) and/or physical assumptions (e.g., laminar flow). Because of their powerful non-linear modeling capability (see below), ANNs can accurately model highly non-linear and complex phenomena. In addition, unlike advanced physics-based numerical models, ANNs can be initialized to real-time conditions, improving prediction accuracy.

What is the mathematical foundation for ANNs?

What distinguishes ANNs from regression is the famous Kolmogorov theorem (more formally, the Kolmogorov–Arnold representation theorem). In its neural-network interpretation, the theorem asserts that any continuous function of n input variables and m output variables can be represented exactly by a three-layer feedforward neural network with n elements in the input layer, 2n+1 elements in the hidden layer, and m elements in the output layer. This mathematical result establishes the singular power of AI for learning and accurately simulating complex systems to a degree of accuracy previously unattainable. Furthermore, the shared connections (arcs) in the ANN architecture allow it to identify important interrelationships that may exist among input and output variables, providing critical insights into system behavior that would otherwise remain unknown.
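For reference, one common statement of the Kolmogorov–Arnold representation theorem (written here in standard notation rather than taken from the original text) is that any continuous function f on the n-dimensional unit cube can be expressed as a finite composition of continuous one-variable functions:

f(x_1, \ldots, x_n) \;=\; \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)

where the \Phi_q and \phi_{q,p} are continuous univariate functions; the 2n+1 outer terms correspond to the 2n+1 hidden-layer elements noted above.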

How much data is required for robust ANN development?

Robust ANN development is dependent upon the quantity and quality of the data used to train the models. The appropriate training set size used for ANN learning depends upon a number of factors, including the required prediction accuracy, the probability distribution of behavior, the level of “noise” in the system, the complexity of the system, and the size of the ANN model. 
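As a purely illustrative sketch (using synthetic data and the scikit-learn library, neither of which is part of NOAH’s actual tooling), one practical way to judge whether a training set is large enough is to watch how validation error changes as more data are added:

# Learning-curve check: train on growing fractions of the data and track validation error.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Hypothetical data: X = predictors (e.g., rainfall, pumping), y = observed water levels.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 4))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.1 * rng.normal(size=2000)

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

for frac in (0.1, 0.25, 0.5, 1.0):
    n = int(frac * len(X_train))
    model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
    model.fit(X_train[:n], y_train[:n])
    err = mean_squared_error(y_val, model.predict(X_val))
    print(f"training fraction {frac:.2f}: validation MSE {err:.3f}")

# If validation error has stopped improving as data are added, the set is probably adequate
# for this model size; if it is still falling, more data may help.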

What other functions can ANNs perform besides prediction?

ANN models can perform a number of important tasks. These include, but are not limited to, automatic quality assurance/quality control (flagging erroneous data), improving system understanding via sensitivity analyses, filling in missing data, improving data collection strategies, filtering noise, and serving as “meta-models” for more complex physics-based models for both prediction and optimization.
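As one hedged illustration of the QA/QC idea (a toy example with synthetic data and scikit-learn, not NOAH’s production method), readings can be flagged when they deviate too far from what an ANN predicts from related measurements:

# Flag suspect sensor readings by comparing them with an ANN prediction from neighboring wells.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
neighbors = rng.normal(size=(500, 3))                      # hypothetical neighboring-well levels
target = neighbors @ np.array([0.5, -0.2, 0.8]) + 0.05 * rng.normal(size=500)
target[::50] += 1.0                                        # inject a few bad readings for illustration

model = MLPRegressor(hidden_layer_sizes=(8,), max_iter=3000, random_state=0)
model.fit(neighbors, target)

residuals = target - model.predict(neighbors)
threshold = 3 * residuals.std()                            # flag anything beyond 3 standard deviations
flags = np.abs(residuals) > threshold
print(f"flagged {flags.sum()} of {len(target)} readings for review")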

What are some of the prediction and data analyses problems to which NOAH has applied AI?

As our select case studies in part show, we have applied AI to a variety of challenging water resources prediction and management problems.
Applications include accurately predicting and simulating hydraulic states and water quality conditions in groundwater, surface water, and water distribution systems. We have also used AI to accurately predict water and energy demand for large regional areas. From system characterization and engineering analysis perspectives, NOAH personnel have also used AI to filter out tidally induced hydraulic effects from groundwater data and to help identify defects in a multi-million-dollar cut-off wall installed to protect a river from contamination. NOAH has also used AI to correctly flag erroneous data, identify important cause and effect relationships, increase system understanding, improve data collection strategies, and perform rigorous mathematical optimization.

What is mathematical optimization?

Optimization is a branch of mathematics, recognized by a Nobel Prize in Economics, used to solve highly complex resource allocation problems. In mathematics, computer science, and economics, optimization, or mathematical programming, refers to selecting the best element or solution from a set of available alternatives, which in continuous cases is infinite.
Optimization expresses a management problem explicitly within a logical and transparent mathematical formulation that can be solved using optimization algorithms. The formulation consists of an objective function and a constraint set, each expressed at least in part in terms of decision variables whose optimal values are unknown. The decision variables are not only the most basic component of the formulation but also the motivation for solving the problem: they constitute the human controls for which the decision maker seeks optimal values, such as the pumping rates that minimize energy consumption or maximize delivered water quality. The constraints represent resource and management limits and, when properly posed, restrict the solutions to a feasible space.
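In general form (standard textbook notation, not specific to any NOAH application), such a formulation can be written as:

\min_{x} \; f(x) \quad \text{subject to} \quad g_i(x) \le b_i, \;\; i = 1, \ldots, k, \qquad x \ge 0

where x is the vector of decision variables, f is the objective function, and the constraints g_i(x) \le b_i encode the resource and management limits.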
The optimization program not only computes the optimal values of the decision variables but also generates a sensitivity analysis showing how the optimal solution changes with different imposed constraint limits and coefficients; for example, how much additional energy cost reduction can be achieved per unit increase in the allowable groundwater level decline at a particular location. In short, the optimal solution provides the values of the decision variables that collectively produce the lowest feasible objective function value for a minimization problem (e.g., minimize costs) or the highest feasible objective function value for a maximization problem (e.g., maximize profits) without violating any constraints.
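As a hedged, self-contained sketch (a toy two-well pumping problem with invented coefficients, solved with SciPy’s linprog rather than any NOAH software), a minimal linear program might look like this:

# Choose pumping rates q1, q2 (decision variables) to minimize energy cost
# while meeting total demand and respecting per-well capacity limits.
from scipy.optimize import linprog

cost = [3.0, 5.0]              # objective: energy cost per unit pumped at wells 1 and 2 (invented)
A_ub = [[-1.0, -1.0]]          # -q1 - q2 <= -demand, i.e., q1 + q2 >= demand
b_ub = [-100.0]                # total demand of 100 units (invented)
bounds = [(0, 70), (0, 80)]    # capacity limits for each well (invented)

res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
print("optimal pumping rates:", res.x)   # the cheaper well is used to its capacity first
print("minimum energy cost:", res.fun)

# Modern LP solvers also report shadow prices (dual values), which quantify the kind of
# constraint sensitivity described above.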

How did mathematical optimization originate and develop?

The origins of mathematical optimization date back to Carl Friedrich Gauss (1777-1855), considered by many the greatest mathematician in history. Gauss developed the method of steepest descent, an algorithm in which a local minimum of a mathematical function is located by iteratively stepping along the function’s negative gradient.
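As a minimal illustrative sketch (a generic gradient descent on an invented quadratic function, not tied to any particular NOAH model), the idea looks like this:

# Steepest (gradient) descent on f(x, y) = (x - 2)^2 + 3*(y + 1)^2, whose minimum is at (2, -1).
import numpy as np

def grad(p):
    x, y = p
    return np.array([2.0 * (x - 2.0), 6.0 * (y + 1.0)])   # gradient of f

p = np.array([0.0, 0.0])   # arbitrary starting point
step = 0.1                 # fixed step size for simplicity
for _ in range(200):
    p = p - step * grad(p)  # move against the gradient, i.e., downhill

print("approximate minimizer:", p)  # should be close to (2, -1)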
Because many resource allocation and decision-making problems seek to minimize an objective (e.g., cost) or maximize an objective (e.g., profit) subject to various constraints, the basis of Gauss’s algorithm has been used to solve many classes of optimization problems. 
A number of other prominent mathematicians have contributed to this rich and important field. Leonid Vitaliyevich Kantorovich, the famous Russian mathematician, developed the theory and techniques of optimal resource allocation, the earliest form of linear programming, for which he was awarded the Nobel Prize in Economics. George Dantzig, who earned his doctorate at the University of California, Berkeley, independently invented and advanced linear programming (the Simplex Method). John von Neumann, considered by many the greatest mathematician of the 20th century, developed the duality theorem for linear programming. Other noteworthy contributors to the field include Richard Ernest Bellman, who invented dynamic programming, and Albert Tucker and Harold Kuhn, who made seminal contributions to non-linear programming.

Why is mathematical optimization superior to typical trial and error approaches?

So-called “optimal solutions” to complex management problems are often far from optimal. Rather than being identified by sophisticated optimization algorithms, they are frequently obtained through the vastly inferior trial and error approach, whereby the decision variables in a model are changed by hand until a solution deemed acceptable is selected.
Not only is trial and error highly inefficient and time-consuming, it may succeed only in identifying the least poor of the limited number of solutions or scenarios simulated by the modeler. As the problem grows more complicated, with more decision variables, constraints, and multiple or even conflicting objectives, the system becomes far too complex for human experience, intuition, and analysis to deduce potentially good solutions.
Mathematically, there often exists an infinite number of possible solutions within the feasible decision space. Mathematical optimization uses sophisticated algorithms to search this space efficiently, converging to local optima (for non-convex, non-linear problems) if not the global optimum (for linear or convex problems). In addition, multi-objective optimization generates the formal trade-off curve for multiple and conflicting objectives, from which the optimal compromise solution can be identified using a variety of methods.
Studies show that optimization achieves substantial improvements in solutions, including cost savings. For example, the United States Environmental Protection Agency conducted a formal optimization study of Superfund sites undergoing groundwater contaminant remediation with recovery wells. It concluded that a “20 percent reduction of the objective function (i.e., cost) is typical” and that “improved pumping strategies could yield millions of dollars in life-cycle cost savings at some sites.”

What is multi-objective optimization?

There are often management decision or resource allocation problems with multiple and even conflicting objectives for which the decision maker is interested in identifying the optimal trade-off or compromise solution.  A typical example is to maximize water quality to the extent possible while minimizing water treatment costs. 
Multi-objective optimization generates a formal trade-off curve, or Pareto frontier, between multiple and conflicting objectives. Once the Pareto frontier is generated, the optimal compromise solution or trade-off point can be identified using a variety of methods in accordance with the preferences and priorities of the decision makers.
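As a hedged illustration (a toy two-objective problem with invented functions, traced with the simple weighted-sum method in SciPy rather than any NOAH tool), a Pareto frontier can be approximated as follows:

# Toy trade-off: f1(x) = treatment cost, f2(x) = water-quality penalty (both invented).
# Sweeping the weight w traces an approximate Pareto frontier for this convex problem.
import numpy as np
from scipy.optimize import minimize_scalar

def f1(x):  # cost rises with treatment intensity x
    return x ** 2

def f2(x):  # quality penalty falls with treatment intensity x
    return (x - 3.0) ** 2

for w in np.linspace(0.0, 1.0, 5):
    res = minimize_scalar(lambda x: w * f1(x) + (1.0 - w) * f2(x), bounds=(0.0, 3.0), method="bounded")
    x = res.x
    print(f"weight on cost {w:.2f}: x = {x:.2f}, cost = {f1(x):.2f}, quality penalty = {f2(x):.2f}")

# Each line is one point on the trade-off curve; decision makers then select the preferred compromise.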
The Toms River, New Jersey case study provides a real-world water management problem solved by multi-objective optimization, whereby rigorous decision-making methodologies were applied to identify the optimal trade-off between conflicting objectives in accordance with the preferences and priorities of the stakeholders and decision makers.

What are some of the problems to which NOAH has applied optimization?

As our select case studies in part show, we have applied formal optimization to a diverse range of challenging water management problems. Assets to which we have applied optimization include public supply wellfields; conjunctively managed systems combining groundwater, reverse osmosis, and water blending; water distribution systems; and groundwater contaminant control and recovery systems. Applications include minimizing energy costs, maximizing water supplies, optimizing the water quality delivered to consumers, minimizing salt-water degradation of a regional aquifer, minimizing wellfield vulnerability to groundwater contamination, and ensuring a sufficient water supply under uncertainty. In a number of these cases, multiple objectives were simultaneously optimized.

Are there different versions of NOAH’s decision support system?

While the most advanced version of our system integrates AI and optimization with real-time data streams, it can easily be customized to fit your specific needs. In some instances, only AI analysis and predictions are necessary. In other cases, optimization may be the feature of choice. Models can be integrated with real-time data streams, or stand alone with data initialization and model deployment as necessary. Furthermore, as discussed below, other models may be integrated into the decision support system as needed.

Does NOAH Global Solutions have additional modeling expertise?

NOAH’s modeling expertise is not limited to AI and optimization. Our expertise spans a wide variety of modeling and mathematical methodologies, including but not limited to physics-based numerical and analytical models, statistics, fuzzy logic (another form of AI), dynamic systems theory, and multi-objective analysis.
We also have extensive experience with industry-standard modeling software programs.
Both theory and practice demonstrate that the best model is not necessarily the most sophisticated or advanced model. Often, as we have demonstrated, a simpler model is best for the problem at hand.
Model selection depends on many factors, including the physical and mathematical nature of the problem, data availability, modeling and prediction needs, time, and resources. In some cases, multiple models of different levels of complexity can be used in combination to both validate and support each other, while in other cases one model is all that is needed.