You’re ready for AI, but is your data?

Dec 20, 2024

Implementation experts share the data readiness best practices that will ensure your AI Optimization (AIO) project is successful.

By Jennifer Shine, Principal Solution Engineer, Imubit

Artificial intelligence (AI) projects in the process industries require high-quality data to deliver on their value potential. For AI used in closed loop process optimization and control, that quality threshold is even higher. The good news? Through proactive planning, early identification of potential issues, and a well-structured strategy, you can keep AI projects on time and on budget and ensure quality KPIs are met. Drawing on experience from nearly a hundred Closed Loop AI Optimization (AIO) project implementations, we’ve assembled the minimum data requirements.

 

1. Data Volume

“The first thing I learned from my academic mentor is that the only thing better than data is more data,” said Imubit CTO Prof. Nadav Cohen in our recent Demystifying Industrial AI webinar series. The more data an AI model is provided to learn from, the broader the experience set the model gains and the better it will perform. To support robust modeling and achieve meaningful results, we recommend 6+ months of unit process data and at least 200 lab samples to support inferentials. A dataset of this size typically captures a wide range of operating scenarios. We most often see data stored at 1-minute intervals on a business network historian. Neural networks thrive on large, high-quality, and diverse datasets, as their performance and accuracy depend on the richness and reliability of the data used.
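As a rough illustration, the thresholds above can be turned into a quick self-check. This is a minimal sketch, not an Imubit tool: the function name `check_data_volume`, the use of the median sampling gap, and the approximation of six months as 182 days are all assumptions made for the example.

```python
from datetime import datetime, timedelta

# Thresholds taken from the recommendations above.
MIN_HISTORY = timedelta(days=182)            # assumption: "6+ months" ~ 182 days
MIN_LAB_SAMPLES = 200
MAX_MEDIAN_INTERVAL = timedelta(minutes=1)   # 1-minute storage frequency


def check_data_volume(timestamps, lab_sample_count):
    """Return pass/fail flags for one tag's historized timestamps.

    `timestamps` is any iterable of datetime objects exported from the
    historian; `lab_sample_count` is the number of lab results available.
    """
    ts = sorted(timestamps)
    lab_ok = lab_sample_count >= MIN_LAB_SAMPLES
    if len(ts) < 2:
        # Not enough points to measure a span or an interval.
        return {"history_ok": False, "interval_ok": False, "lab_samples_ok": lab_ok}

    span = ts[-1] - ts[0]
    gaps = sorted(later - earlier for earlier, later in zip(ts, ts[1:]))
    median_gap = gaps[len(gaps) // 2]
    return {
        "history_ok": span >= MIN_HISTORY,
        "interval_ok": median_gap <= MAX_MEDIAN_INTERVAL,
        "lab_samples_ok": lab_ok,
    }
```

Using the median gap rather than the maximum keeps a single outage from failing the interval check; gaps themselves are a data quality question rather than a volume one.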

 

2. Data Compression

Data compression can impact the accuracy of what the model learns from historical data. Often, historization and compression settings are not considered when instrumentation is added or modified. For instance, re-ranging an instrument without adjusting the default compression settings can significantly reduce how much data movement is visible. It’s important to validate compression settings on your historian as part of the site’s change management process. We recommend changing data compression settings to record a data point every minute for all process PVs and control loop data. The sooner you change those settings, the sooner your AIO project can start adding value!
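To see why re-ranging matters, consider a simplified deadband filter that stores a point only when the value moves by more than a percentage of the instrument span. This is an illustrative sketch under that assumption; real historians use their own compression algorithms and settings, and the function `deadband_compress` and the numbers below are hypothetical.

```python
def deadband_compress(values, span, deadband_pct):
    """Keep a point only when it differs from the last stored point
    by more than deadband_pct percent of the instrument span.

    A simplified stand-in for historian compression, for illustration only.
    """
    threshold = span * deadband_pct / 100.0
    stored = [values[0]]
    for value in values[1:]:
        if abs(value - stored[-1]) > threshold:
            stored.append(value)
    return stored


# Hypothetical signal drifting by 0.5 engineering units per minute.
signal = [50.0 + 0.5 * i for i in range(10)]

# Original range 0-100 with a 0.2% deadband.
kept_before = deadband_compress(signal, span=100.0, deadband_pct=0.2)

# Same instrument re-ranged to 0-1000, deadband percentage left unchanged.
kept_after = deadband_compress(signal, span=1000.0, deadband_pct=0.2)

print(len(kept_before), "points stored before re-ranging")
print(len(kept_after), "points stored after re-ranging")
```

Because the deadband scales with the span, the same percentage setting stores far fewer points after the re-range, even though the process moved exactly as before.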

 

3. Data Extraction

High-frequency historized data is often challenging to extract and export for use in cloud computing infrastructures. This limitation is an unfortunate hurdle encountered early in the adoption of promising new AI technologies. The problem is twofold: legacy systems, including historians and computing hardware, lack seamless export capabilities, and outdated data backup strategies make retrieval harder still. To address these limitations, organizations should invest in basic database retrieval tools and modernize data backup strategies to enable retrieval of this highly valuable process data. Establishing a process to access your data effectively now will prepare you for projects in 2025 and beyond.

 

4. Data Quality

Concerns about data quality typically come up early in project discussions, as missing or imperfect data can create challenges during model building and training. Luckily, modern data analytics practices make it quick to locate these problematic periods and apply data cleansing techniques. Combining data cleansing with robust pre-processing steps results in data that consistently meets the requirements for constructing machine learning models.
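Two of the most common problem periods are gaps (no data recorded) and flatlines (a value frozen for a long stretch). The sketch below shows one simple way to locate both in a list of `(timestamp, value)` pairs. The function names and the default thresholds of 5 minutes and 30 samples are assumptions chosen for the example, not recommended settings.

```python
from datetime import datetime, timedelta


def find_gaps(samples, max_gap=timedelta(minutes=5)):
    """Return (start, end) timestamp pairs where consecutive samples
    are further apart than max_gap. `samples` must be time-ordered."""
    return [
        (earlier[0], later[0])
        for earlier, later in zip(samples, samples[1:])
        if later[0] - earlier[0] > max_gap
    ]


def find_flatlines(samples, min_run=30):
    """Return (start, end) timestamp pairs where the value stays
    exactly unchanged for at least min_run consecutive samples."""
    runs = []
    run_start = 0
    for i in range(1, len(samples) + 1):
        run_ended = i == len(samples) or samples[i][1] != samples[run_start][1]
        if run_ended:
            if i - run_start >= min_run:
                runs.append((samples[run_start][0], samples[i - 1][0]))
            run_start = i
    return runs
```

Periods flagged this way can then be excluded or repaired before training. Whether a flatline is a genuine fault or a steady process is a judgment call for someone who knows the unit.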


5. Instrumentation Considerations

When designing AI models for closed loop optimization and control applications, it’s important to include data that shows how each instrument arrived at its process value. Verify the completeness of your historized dataset by ensuring all components of primary control loops, such as .PV, .SPT (.SP), .OUT (.OP), and .Mode (.MD), are properly recorded. If advanced automation systems are in use, ensure their key parameters are captured, including error/status codes, upper limits, targets, and ON/OFF statuses. Additionally, confirm that product prices and lab samples are historized. Lab sample data are most helpful when backdated to the timestamp when the sample was pulled. A proactive instrumentation maintenance and repair program will ensure instrumentation is appropriately ranged and calibrated.
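A completeness check of this kind is easy to script against a tag list exported from the historian. The sketch below assumes tags are named `<loop><suffix>` with the suffixes .PV, .SP, .OP, and .MD; naming conventions differ between control systems, so the suffixes, the function name, and the loop names in the usage note are all hypothetical.

```python
# Assumed suffix convention; adjust to match your control system.
REQUIRED_ATTRIBUTES = (".PV", ".SP", ".OP", ".MD")


def missing_loop_tags(loops, historized_tags):
    """Return {loop: [missing suffixes]} for every loop that lacks
    one or more of the required attributes in the historian."""
    historized = set(historized_tags)
    report = {}
    for loop in loops:
        missing = [
            attr for attr in REQUIRED_ATTRIBUTES
            if loop + attr not in historized
        ]
        if missing:
            report[loop] = missing
    return report
```

For example, calling `missing_loop_tags(["FIC101", "TIC202"], tags)` with a tag list that contains only `TIC202.PV` and `TIC202.OP` for the second loop would report `.SP` and `.MD` as missing for `TIC202`.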


6. Existing Controls Infrastructure

Readiness of time series data should be assessed in tandem with base layer control system readiness. First, evaluate control loop operation by determining whether primary control loops are operated in automatic/cascade or manual mode. Next, assess manual field manipulations to identify any control loops requiring manual intervention. Review regulatory control performance to confirm the base layer is performing acceptably. Finally, analyze data movement to confirm there is sufficient variability in key variables and that tolerances are set appropriately.
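Two of these checks, time spent in automatic/cascade and data movement, lend themselves to a quick calculation on historized data. This is a minimal sketch assuming evenly spaced samples and mode labels of "AUTO", "CAS", and "MAN"; the labels, function names, and any thresholds you pass in are assumptions, since mode encodings vary by control system and acceptable variability depends on the variable.

```python
from statistics import pstdev

# Assumed mode labels; control system vendors encode modes differently.
AUTOMATIC_MODES = {"AUTO", "CAS"}


def time_in_auto(mode_samples):
    """Fraction of evenly spaced mode samples in automatic or cascade."""
    if not mode_samples:
        return 0.0
    in_auto = sum(mode in AUTOMATIC_MODES for mode in mode_samples)
    return in_auto / len(mode_samples)


def has_movement(pv_samples, min_std):
    """True when the PV's population standard deviation is at least
    min_std, a caller-supplied threshold in engineering units."""
    return len(pv_samples) > 1 and pstdev(pv_samples) >= min_std
```

A loop that spends most of its time in manual, or a key variable that barely moves, is worth discussing before modeling begins.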

Lastly, focus on understanding and addressing latency and synchronization issues within business IT and OT networks by identifying delays in data collection or processing. Take steps to ensure all servers are accurately synchronized with a master timekeeper.

 

You and your data are ready for your AI journey!

By meeting these data requirements and addressing common issues, you set the foundation for successful implementation of AIO in your plant. Imubit’s expertise and tools can help you navigate data challenges and ensure your project achieves its goals. 

To learn more about Closed Loop AI Optimization (AIO) and how it’s revolutionizing the process optimization industry, visit the AIO resource center.
