Three of the most common myths about using reinforcement learning-based optimization technology in closed loop. Busted.
By Allison Buenemann, Product Marketing Manager at Imubit
The application of reinforcement learning (RL) in the optimization and control of industrial processes, while proven successful, is still relatively new in an industrial software market where incumbent closed loop technologies have been deployed for decades. This relative newness, coupled with the AI hype driven by ChatGPT and other forms of generative AI, has created a lot of confusion about the different types of AI and the roles they can play in the critical process industries.
This blog post doesn’t attempt to define the entire industrial AI landscape (that’s a full white paper – Demystifying Industrial AI: From Mainstream Approaches to State-of-the-Art), and there are certainly well over three myths to be busted with regard to the use of RL in closed loop process optimization. But the journey must begin somewhere, so here are three of the most commonly encountered myths, busted.
Myth #1: Data-First = Data-Only
It’s true that many AI-first technologies begin by building a robust process model from a plant’s actual historical process data, rather than from kinetic or first principles models. But data-first is only the beginning. Just as a traditional model becomes a hybrid model when AI is layered on top of a first principles model, first principles can be incorporated into an AI-based process model. By taking the data-first approach, models aren’t foundationally burdened with limiting assumptions or human biases. The models are, however, guided by first principles throughout development: every modeling step involves first principles choices and decisions by process industry SMEs, based on their chemical engineering background and real-world experience. The result is just enough guardrails to keep the model on track, without constraining the trajectory that the actual process data reveals.
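As a toy sketch of the guardrail idea, consider fitting a model purely from historical data and then enforcing a known first-principles constraint on the result. All variable names, numbers, and the constraint itself are illustrative assumptions, not Imubit’s actual method:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical plant historian data: reactor temperature (degC) and
# conversion (%). The underlying relationship and noise are made up.
T = rng.uniform(300.0, 400.0, 500)
conversion = 0.15 * (T - 300.0) + rng.normal(0.0, 2.0, 500)

# Data-first step: ordinary least-squares fit, no mechanistic assumptions.
slope, intercept = np.polyfit(T, conversion, 1)

# First-principles guardrail (assumed for illustration): in this operating
# window, conversion cannot decrease as temperature rises, so the fitted
# slope is projected into the physically feasible region.
slope = max(slope, 0.0)

def predict(temp):
    """Guardrailed data-driven model."""
    return slope * temp + intercept
```

The data does the modeling; the first-principles knowledge only fences off predictions an SME knows to be physically impossible.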
Myth #1 Busted: A data-first approach in AI models allows for the flexibility to incorporate first principles, combining the best of both worlds without limiting assumptions.
Myth #2: RL models don’t handle model mismatch or prediction error, requiring frequent, time-intensive retrains
When an offline RL-trained optimizer is used rather than an online solver, you’re working within the known space of the trained model. For a large neural network, however, that predefined space can span the entire operational history of the process unit. During model training, noise is deliberately injected, allowing the model to learn not only from the actual historical scenarios but also from millions of subtle variations of that history. This breadth of training data makes the RL model inherently more robust to prediction error than traditional model predictive control (MPC). Given this full (and then some) historical context, changed conditions are often close enough to something the model experienced in training that it can infer what action to take without retraining. In fact, the models deployed in the refining industry encounter an entirely unknown scenario no more often than the maintenance interval of a traditional APC or optimization solution. And when wildly new operations are encountered, retraining requires only a few hours overnight, and the model wakes up with a whole new span of knowledge.
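The noise-injection idea can be sketched as simple data augmentation. The variables, noise level, and helper function below are illustrative assumptions, not the vendor’s actual training pipeline:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical historian snapshots: (feed_rate, temperature, product_yield).
history = np.array([
    [100.0, 350.0, 80.0],
    [110.0, 355.0, 82.0],
    [ 95.0, 345.0, 78.0],
])

def augment(history, n_variants=1000, noise_frac=0.01):
    """Inject small Gaussian perturbations around each historical snapshot,
    so training covers subtle variations of what the plant has actually
    seen (the 1% noise scale is an illustrative assumption)."""
    scale = noise_frac * np.abs(history)  # per-variable noise scale
    perturbed = history[None, :, :] + rng.normal(
        0.0, 1.0, (n_variants, *history.shape)) * scale
    return perturbed.reshape(-1, history.shape[1])

training_set = augment(history)  # 1000 variants of 3 snapshots = 3000 rows
```

Each augmented row stays close to real operation, so the enlarged training set teaches robustness to small mismatches without inventing operating regions the plant has never visited.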
Myth #2 Busted: RL models don’t always require frequent, time-intensive retrains; they are robust to model mismatch, can adapt to similar conditions without retraining, and when retraining is needed, it’s quick and efficient.
Myth #3: Support is limited to the RL technology vendor
When RL was brand new to the industrial landscape, this myth was certainly true! However, as the buzz around AI & RL has grown, so has the number of parties interested in understanding, implementing, and supporting it. While AI intrigue is one source of second- and third-party practitioner recruitment, ease of use is another. A no-code user interface goes a long way toward lowering the barrier to entry for third-party or in-house customer support teams. Building cutting-edge algorithmic technology into a modern software platform lets you leverage your existing employee base of domain experts – process engineers, controls, P&E, and operations – with no need to hire additional AI competency to support the technology.
Myth #3 Busted: Several of the largest global energy companies are currently training their own RL closed loop models without hiring AI experts. This is democratizing AI in their organizations and leveraging the intrigue of modern technology to attract and retain talent.
We hope that you leave this page with a clearer understanding of the details behind a few of the common misconceptions associated with reinforcement learning.
And if you’ve still got questions, check out Imubit CTO Nadav Cohen’s talk, “A Taxonomy of Process Control and Optimization: From Model Predictive Control to Reinforcement Learning,” at Transcend Houston September 11, 2024.