Author: Partha Deka and Rohit Mittal
Today’s trend toward Artificial Intelligence (AI) and increased automation in manufacturing allows firms to flexibly connect assets and improve productivity through data-driven insights that were not possible before. As more automation is used in manufacturing, maintenance issues must be addressed faster, and automated decisions about which option is best from an economic standpoint are becoming more complex.
Prescriptive Maintenance is a paradigm shift from a strict dependence on planned events to real-time action based on actual events. It uses advanced software to identify patterns, explicitly diagnose root causes, and then indicate precise and timely actions to change the outcome. Industrial firms are using prescriptive maintenance and analytics software to reduce or eliminate unplanned downtime and to maximize profitability and equipment reliability. AI-enabled Prescriptive Maintenance is unique in that, instead of just predicting impending failure, it strives to produce outcome-focused recommendations for operations and maintenance from analytics.
Driving Up Productivity:
Productivity is one of the three basic elements that manufacturers seek, along with cost and quality. Today, there’s an estimated $500 billion worth of machine tools in place to help firms manage their industrial equipment. However, modern firms are looking to go beyond preventive maintenance and enable prescriptive maintenance systems. For example, reducing downtime is important for industrial equipment and machinery in order to drive up productivity and overall equipment effectiveness (OEE). The OEE gold standard is only about 80%. Exceeding this level has proven difficult without manual effort or costly processes.
A key aspect for implementing prescriptive maintenance and analytics is to include next-generation digital core technologies including AI, machine learning, IoT connectivity, collaboration and advanced analytics. These tools must be flexible, scalable, and easily integrated into legacy IT infrastructure. They allow organizations to integrate business process and transactional data from back office systems with massive amounts of structured and unstructured data from various sources. Advanced analytics can then be embedded in the data, on a timely basis, across the digital core enabling the organization to derive new insights, such as predicting outcomes or proposing new actions.
Simulation and Flexible Modeling:
Aggregating and ingesting data across a unified digital core is just the first requirement for achieving digital transformation. To optimize the outcomes of the transformation, the firm needs to be able to collaborate across all areas of its organization. Virtual collaboration is increasingly becoming the standard for developing and maintaining excellence in industrial operational efficiency and productivity. Real-time and offline simulation and modeling need to be extended to all participants in the data value chain, from interactive capture of technician input up to the reasoning and modeling used by process and production managers, in order to optimize maintenance decisions for physical objects.
Annually, in excess of $75 billion worth of repair is done on machine tools. A key aspect of maintaining these tools and systems is the ability to individually model recipes, processes, equipment, metrics and the customer-specific syntax of the manufacturing process. The modeling tools must allow for managing metadata, establishing data manifests that support fast and reliable processing of process and equipment data, and injecting data classification and feature extraction to enable AI/ML algorithms.
At the heart of prescriptive maintenance systems is the ability to act and react quickly across application and device components, using advanced software that supports time-aware, synchronized execution for optimization and control of cycle times. Today’s systems are largely rule-based and too inflexible for seamless integration of AI/ML results. Aligning data collection and mapping with machine learning results for actionable insights, and completing automated, controlled actions in milliseconds, has not been possible before. Advanced analytics and control software unifies functionality from embedded systems, autonomous systems and platform integration while retaining industrial-grade reliability and safety.
Artificial intelligence elements need to be embedded and synchronized on a deterministic basis in the data across the digital core, enabling the firm to derive new insights, such as predicting outcomes, and to automate actions that optimize those outcomes.
The AI/ML system is the brain of a prescriptive maintenance platform. ML models for production machines are designed to detect anomalous behavior during production. Training data helps to develop specific models for individual recipe steps or specific equipment types, and these models establish the bounds of normal behavior. Timely detections and prescriptions are accomplished by continuously checking whether each data point falls outside these bounds; if it does, it is flagged as an anomaly and reported. The capability to train a new model and deploy it on demand is key. In doing so, the model is able to learn and adapt over time.
Visualization of the large volume of data is essential to support analytics and rapid decision-making for continuous monitoring and prescriptive maintenance. The visualization tool must synthesize multi-dimensional, often fused data and information in order to support assessment, planning and prognosis. The tool needs to be open, dynamic, real-time and available to all participants so that they can collaborate and access data at various levels of abstraction and network connectivity.
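The bounds check described above can be illustrated with a minimal sketch. It assumes the simplest possible "model": a mean plus or minus k standard deviations band learned per recipe step from normal training readings. The function names, the step names and the sample values are hypothetical and are not taken from the authors' system.

```python
import statistics


def fit_bounds(training, k=3.0):
    """Learn (low, high) bounds per recipe step from normal training readings."""
    bounds = {}
    for step, values in training.items():
        mu = statistics.fmean(values)
        sigma = statistics.stdev(values)  # sample standard deviation
        bounds[step] = (mu - k * sigma, mu + k * sigma)
    return bounds


def detect_anomalies(stream, bounds):
    """Return (index, step, value) for every reading outside its step's bounds."""
    flagged = []
    for i, (step, value) in enumerate(stream):
        low, high = bounds[step]
        if not low <= value <= high:
            flagged.append((i, step, value))
    return flagged


# Hypothetical vibration amplitudes recorded during normal operation.
training = {
    "align": [1.0, 1.1, 0.9, 1.0, 1.05, 0.95],
    "bond": [5.0, 5.2, 4.8, 5.1, 4.9, 5.0],
}
bounds = fit_bounds(training)

# Hypothetical live readings, tagged with the recipe step that produced them.
stream = [("align", 1.02), ("align", 2.5), ("bond", 5.05), ("bond", 1.0)]
for index, step, value in detect_anomalies(stream, bounds):
    print(f"anomaly at reading {index}: step={step} value={value}")
```

Retraining on demand then amounts to calling `fit_bounds` again on fresh data and swapping in the new bounds.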
Our manufacturing process goes through multiple sequential steps. Our manufacturing assembly consists of multiple moving parts, most notably motion stages and actuators. One of the most elusive yield issues is intermittent motor failures that occur randomly over time. There is no recognizable consistency in the failures in terms of time, frequency, components, recipes, etc. Shutting down the equipment and testing it until a failure event happens to be caught live is often too time-consuming and hurts our factory throughput. On the other hand, if the equipment stays online, it continues to manufacture faulty products intermittently, wasting downstream factory capacity, consumables and labor, and causing high revenue loss. We previously relied on a manual condition-based monitoring solution and a preventive maintenance program, which limited our ability to improve yield and productivity.
Predictive and prescriptive maintenance has recently been adopted in the heavy manufacturing industry, for example for gas turbines, vacuum pumps and aircraft engines. These predictive maintenance solutions are highly custom, depending on the machinery, its operation and its domain. Further, the sensors used in these cases are geared towards large and heavy machines and are tuned for large-magnitude speeds, accelerations, rotations and inclinations. Our manufacturing processes are quite different from those in the heavy machinery industry. Not only are our domain and manufacturing processes different, but our sensor requirements for data acquisition are different as well. The motors used in our manufacturing assembly are small, and their motions are shorter in range and less sudden in order to handle delicate and small components.
Off-the-shelf IoT sensors currently available (those used in the heavy manufacturing industry for motion sensing) are not suitable for our manufacturing, as they are either too bulky to mount or not accurate and sensitive enough. For example, movements that cause excessive vibration, potentially leading to misalignment of a tiny component, may not be detected by off-the-shelf sensors.
We proposed a solution through an extensive solution requirements document detailing technical design specs such as performance, scalability, modularity, security, high availability, connectivity, a flexible application programming interface, CPU/memory/network/hard disk usage, interoperability, configurability, reliability, latency, ML/AI algorithms, database performance, visualization, extensibility and sensor requirements.
We partnered with a vendor that has the technical expertise to meet all our solution design specs. Together with the vendor, we developed a predictive maintenance framework using custom machine learning techniques, software engineering principles and sensors better suited to our manufacturing, chosen based on form factor, power, communications protocol, etc. Overall, the framework performs real-time detection, visualization and alert creation, and recommends fixes at different stages of our manufacturing process. It preemptively and autonomously predicts and detects anomalous vibrations during our manufacturing processes, alerts personnel and recommends fixes before any actual disruption of the manufacturing process occurs. This has enabled predictive and prescriptive scheduled maintenance of our manufacturing processes, reducing sudden, unplanned downtime and disruption and optimizing our factory capacity, consumables, labor and cost. The platform integration of AI/ML, a time-aware runtime system and a flexible authoring tool allowed much better prediction of process and equipment issues and enabled precise maintenance of our equipment assembly, avoiding intermittent faulty manufacturing and minimizing revenue loss.
Major benefits we realized:
Boosted worker productivity
Reduced unplanned downtime
Overview of ML algorithms:
What is Anomaly detection?
Anomaly detection is about finding patterns (such as outliers, exceptions, peculiarities, etc.) that deviate from expected behavior within a dataset or datasets. It is therefore similar to noise removal and novelty detection, with one difference: in anomaly detection the anomalous patterns are themselves of interest, whereas the sole purpose of noise detection is to remove the noise. As with most data science projects, the ultimate goal of anomaly detection is not just an algorithm or a working model. Instead, it’s about the value of the insight the anomalies provide, i.e. for the business, the money saved by preventing equipment damage. In the manufacturing sector, we want to use anomaly detection to achieve predictive and prescriptive maintenance proactively, before a fault actually damages the equipment. This gives advance warning and enables “scheduled maintenance”, avoiding sudden downtime, which usually leads to heavy revenue loss.
Supervised vs Unsupervised:
There are two primary architectures for building anomaly detection systems:
Supervised anomaly detection: used when we have a labeled dataset in which we know whether each data point is normal or anomalous
Unsupervised anomaly detection: used when the dataset is unlabeled, i.e. whether each data point is an anomaly is unreliable or unknown
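To make the distinction concrete, here is a small sketch that contrasts the two settings on a single feature. Both detectors are deliberately simple stand-ins chosen for illustration, not the algorithms used in the authors' platform: the supervised one learns a threshold halfway between the labeled class means (assuming anomalies have larger values), and the unsupervised one scores points by their distance from the median in units of the median absolute deviation. All names and sample values are hypothetical.

```python
import statistics


def fit_supervised_threshold(values, is_anomaly):
    """Supervised: use the labels to place a threshold between the two classes."""
    normal = [v for v, flag in zip(values, is_anomaly) if not flag]
    anomalous = [v for v, flag in zip(values, is_anomaly) if flag]
    return (statistics.fmean(normal) + statistics.fmean(anomalous)) / 2


def predict_supervised(values, threshold):
    """Label a reading anomalous if it exceeds the learned threshold."""
    return [v > threshold for v in values]


def detect_unsupervised(values, cutoff=3.5):
    """Unsupervised: no labels, flag points far from the median (robust z-score)."""
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return [False] * len(values)
    return [0.6745 * abs(v - median) / mad > cutoff for v in values]


# Hypothetical vibration amplitudes; the last one is a known fault.
values = [0.9, 1.0, 1.1, 1.0, 0.95, 1.05, 4.0]
labels = [False, False, False, False, False, False, True]

threshold = fit_supervised_threshold(values, labels)
print(predict_supervised([3.0, 1.2], threshold))
print(detect_unsupervised(values))
```

The supervised path needs trustworthy labels up front; the unsupervised path needs none, but its cutoff has to be chosen and validated some other way.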
Our ML Pipeline:
Our state-of-the-art ML software framework acquires data in real time from sensors strategically placed in various parts of our manufacturing assembly. Also in real time, the framework extracts statistical features from the acquired sensor signals, derives the principal components, detects anomalous clusters (data points) using supervised and unsupervised learning techniques, performs preemptive time-series prediction on the sensor signals with confidence-interval guard bands, and detects extreme vibrations. Following is an overview:
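To make the first stage of this pipeline concrete, here is a minimal sketch of windowed statistical feature extraction from a raw vibration signal. The choice of features (mean, standard deviation, RMS, peak-to-peak, crest factor), the window size and the function names are assumptions made for illustration; the source does not say which features the framework actually computes. The later stages (principal components, clustering, forecasting) would consume feature vectors like these.

```python
import math
import statistics


def sliding_windows(signal, size, step):
    """Yield consecutive windows of `size` samples, advancing by `step` samples."""
    for start in range(0, len(signal) - size + 1, step):
        yield signal[start:start + size]


def extract_features(window):
    """Compute a few common time-domain vibration features for one window."""
    rms = math.sqrt(statistics.fmean([x * x for x in window]))
    peak = max(abs(x) for x in window)
    return {
        "mean": statistics.fmean(window),
        "std": statistics.pstdev(window),
        "rms": rms,
        "peak_to_peak": max(window) - min(window),
        "crest_factor": peak / rms if rms else 0.0,
    }


# Hypothetical accelerometer samples.
signal = [3.0, -3.0, 3.0, -3.0, 3.0, -3.0]
feature_rows = [extract_features(w) for w in sliding_windows(signal, size=4, step=2)]
for row in feature_rows:
    print(row)
```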
A Sample Unsupervised Algorithm detecting various clusters:
Mean Shift clustering aims to discover clusters in a smooth density of data points. It is a centroid-based algorithm that starts by considering each data point as a cluster center. Depending on the bandwidth parameter provided, each data point builds an imaginary sphere of interest. The algorithm then updates the centroid of the sphere to the mean of the data points within the sphere, builds a new sphere of interest around the new mean (the new centroid), and again updates the centroid to the mean of the data points within that sphere. This process continues iteratively until it converges, meaning until the centroid no longer moves. This is followed by a filtering post-processing stage that eliminates near-duplicates to form the final set of centroids. The data points whose spheres converge to the same centroid are considered members of the same cluster.
The optimal number of clusters for each technique is chosen by maximizing the Silhouette coefficient. The Silhouette coefficient is calculated using the mean intra-cluster distance (a) and the mean nearest-cluster distance (b) for each sample. The Silhouette coefficient for a sample is (b - a) / max(a, b). To clarify, b is the mean distance between a sample and the points of the nearest cluster that the sample is not a part of. The best value is 1 and the worst value is -1. Values near 0 indicate overlapping clusters. Negative values generally indicate that a sample has been assigned to the wrong cluster, as a different cluster is more similar.
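The iteration described above can be sketched in plain Python. This is a simplified illustration with a flat (uniform) kernel, not the implementation used in the authors' framework; the merge radius of half the bandwidth for removing near-duplicate centroids is an arbitrary assumption, and the sample points are made up.

```python
import math


def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift: returns (centroids, labels) for a list of tuples."""
    centroids = [tuple(p) for p in points]  # every point starts as a centroid
    for _ in range(max_iter):
        largest_move = 0.0
        updated = []
        for c in centroids:
            # Points inside the sphere of radius `bandwidth` around the centroid.
            inside = [p for p in points if math.dist(c, p) <= bandwidth]
            if not inside:  # defensive guard; keep the centroid where it is
                updated.append(c)
                continue
            mean = tuple(sum(axis) / len(inside) for axis in zip(*inside))
            largest_move = max(largest_move, math.dist(c, mean))
            updated.append(mean)
        centroids = updated
        if largest_move < tol:  # converged: no centroid moved
            break

    # Post-processing: merge centroids that ended up (almost) in the same place.
    final = []
    for c in centroids:
        if all(math.dist(c, kept) > bandwidth / 2 for kept in final):
            final.append(c)

    # Each point joins the cluster of the final centroid its own centroid reached.
    labels = [
        min(range(len(final)), key=lambda k: math.dist(c, final[k]))
        for c in centroids
    ]
    return final, labels


# Two hypothetical, well-separated groups of 2-D feature vectors.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centroids, labels = mean_shift(points, bandwidth=1.0)
print(centroids)
print(labels)
```

Note that the number of clusters is never passed in; it falls out of the bandwidth, which is why the bandwidth is the parameter worth tuning.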
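As a worked illustration of the formula above, the following sketch computes the Silhouette coefficient directly from its definition using Euclidean distance. The function names and the toy data are hypothetical; assigning a score of 0 to a sample that is alone in its cluster is a common convention assumed here.

```python
import math


def silhouette_samples(points, labels):
    """Per-sample silhouette (b - a) / max(a, b) using Euclidean distance."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # convention for single-member clusters
            continue
        # a: mean distance to the other members of the sample's own cluster.
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(point, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return scores


def silhouette_score(points, labels):
    """Mean silhouette over all samples."""
    scores = silhouette_samples(points, labels)
    return sum(scores) / len(scores)


# Hypothetical 1-D features with an obvious two-cluster structure.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
good = silhouette_score(points, [0, 0, 1, 1])  # matches the structure
bad = silhouette_score(points, [0, 1, 0, 1])   # mixes the two groups
print(good, bad)
```

Comparing the score across candidate clusterings (for example, different bandwidths) and keeping the highest one is the selection rule the paragraph describes.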
Following is a plot where Mean shift clustering detects various vibration stages of one of our manufacturing assembly components:
In the plot above, the cross marks indicate the ground truth labels for the vibration stages and the colored dots are the vibration stages predicted by the mean-shift unsupervised learning technique. Here, the unsupervised algorithm is able to detect all six different vibration stages.
Our Software Architecture:
We have a flexible software architecture that provides full capabilities at the edge and switches on demand to a hybrid edge-cloud architecture based on data and ML compute needs. Our framework also provides a flexible API to update our ML models as needed. Following is our sample high-level software architecture with a Raspberry Pi edge computer. We used open technologies such as MQTT (a real-time publish/subscribe messaging protocol), InfluxDB (an open source time series database for real-time analytics) and Grafana (for real-time visualizations).
Because of various computational limitations, from real-time data processing to ML training and prediction, we adopted a flexible software architecture that switches seamlessly to a hybrid form. The edge-cloud framework continuously archives data into a historical database in the cloud over a rolling window, for example maintaining a week or a month of historical data up to the current date; the amount of data archived can vary with the use case. The framework includes a rule-based model health monitoring system that periodically (as well as on demand) checks the performance statistics of the ML model. The statistics used, such as F1 score or R squared for supervised learners and the Silhouette coefficient for unsupervised learners, can vary with the use case and the ML algorithms. If the model performance statistics fall below their threshold values, the framework triggers the (online) ML training pipeline and deploys the newly trained model once the threshold criteria are met. Following is our sample hybrid architecture:
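As a rough sketch of how a sensor reading might travel through such a stack, the snippet below checks an MQTT-style topic against a subscription filter and formats the reading as an InfluxDB line-protocol string. It uses only the Python standard library and does not talk to a real broker or database. The topic layout, measurement name, tag names and field names are all hypothetical, the wildcard matching is simplified, and only float fields are handled.

```python
def topic_matches(topic_filter, topic):
    """Simplified MQTT filter match: '+' is one level, '#' is all remaining levels."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True
        if i >= len(topic_levels) or (level != "+" and level != topic_levels[i]):
            return False
    return len(filter_levels) == len(topic_levels)


def _escape(text):
    """Escape the characters that are special in line-protocol tags and keys."""
    return str(text).replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one point as: measurement,tag=value field=value timestamp."""
    tag_part = "".join(f",{_escape(k)}={_escape(v)}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{_escape(k)}={float(v)!r}" for k, v in fields.items())
    return f"{measurement}{tag_part} {field_part} {timestamp_ns}"


# Hypothetical reading published by an edge device.
topic = "factory/line1/stage3/vibration"
if topic_matches("factory/+/+/vibration", topic):
    line = to_line_protocol(
        "vibration",
        tags={"sensor": "s1", "stage": "motion stage"},
        fields={"rms": 0.42},
        timestamp_ns=1700000000000000000,
    )
    print(line)
```

In a real deployment an MQTT client library and an InfluxDB client would handle the subscription and the write; Grafana would then query the stored series for display.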
Prescriptive Maintenance real-time visualization platform:
We built a state-of-the-art, easy-to-use intelligent monitoring platform for prescriptive maintenance. An operator can monitor anomalies, drill down to the root cause and receive smart recommendations to fix a faulty situation during the manufacturing process in real time. The ML-based smart recommendations are based on anomaly behaviors, sensor locations, sensor data, time of occurrence, equipment types and properties, etc. With this platform, an operator can also effortlessly visualize and monitor CPU, memory and network usage on the various manufacturing assembly servers. Our visualization platform also provides look-back options with a custom date range. On demand, and using the look-back options, the model health monitoring system can direct an anomaly detection algorithm to retrain itself in the cloud on the archived history to better predict the future.
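The retraining loop described above (a rolling archive, threshold rules on model statistics, then retrain and redeploy) can be sketched as follows. This is only an outline under stated assumptions: the class and function names, the metric names and the threshold values are hypothetical, and the `train` and `evaluate` callables stand in for whatever training pipeline and scoring the real platform uses.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class HealthRule:
    metric: str       # e.g. "f1" or "silhouette"
    threshold: float  # the model is unhealthy if the metric drops below this


class RollingArchive:
    """Keeps only the records that fall inside the most recent time window."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.records = deque()

    def append(self, timestamp, value):
        self.records.append((timestamp, value))
        cutoff = timestamp - self.window_seconds
        while self.records and self.records[0][0] < cutoff:
            self.records.popleft()


def failing_metrics(stats, rules):
    """Names of the metrics that are missing or below their threshold."""
    return [r.metric for r in rules if stats.get(r.metric, float("-inf")) < r.threshold]


def maybe_retrain(current_model, stats, rules, archive, train, evaluate):
    """Retrain on archived data if unhealthy; deploy only if the candidate passes."""
    if not failing_metrics(stats, rules):
        return current_model
    candidate = train(list(archive.records))
    if failing_metrics(evaluate(candidate), rules):
        return current_model  # candidate is not good enough, keep the old model
    return candidate


# Hypothetical usage with stand-in training and evaluation callables.
rules = [HealthRule("f1", 0.8)]
archive = RollingArchive(window_seconds=10)
for ts in (0, 5, 12):
    archive.append(ts, 1.0)

deployed = maybe_retrain(
    "model-v1", {"f1": 0.7}, rules, archive,
    train=lambda data: "model-v2",
    evaluate=lambda model: {"f1": 0.9},
)
print(deployed)
```

Keeping the rules as data rather than code makes it straightforward to vary metrics and thresholds per use case, as the text suggests.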
Summary & Conclusion:
We developed an end-to-end ML-powered prescriptive maintenance system that is scalable, secure, modular, extensible, interoperable and configurable. With this platform, we realized substantial improvement in yield for our manufacturing processes. We are proactively detecting faults and recommending fixes on our manufacturing assembly, substantially reducing unplanned downtime and optimizing factory capacity, consumables, labor and operating cost. Machine learning for anomaly detection is constantly evolving, and new, previously unseen anomalies arise during machine operations. The flexible ML API offered by our platform enables us to update our anomaly detection algorithms as needed.