A deep reinforcement learning approach for maintenance planning

Document Type

Conference Proceeding

Publication Date

1-1-2021

Abstract

Meeting customer requirements is the main goal of production units, which translates into maximizing system uptime. Even planned maintenance actions typically create downtime and reduce production capacity, but they can prevent far more costly unplanned failures. In this study, we focus on the decision of whether to perform or postpone (to a future planning period) planned maintenance actions over a finite planning horizon, based on sensor data acquired from Industrial IoT (IIoT) devices equipped with condition monitoring. The finite horizon requires a mechanism for learning how to "look ahead" in order to optimize the maintenance policy over the long run, and Deep Reinforcement Learning (DRL) provides such a mechanism. The state observed by the agent combines the equipment's health state, inferred from IIoT sensor data, with current demand and product inventory level; based on this state, the agent chooses the best maintenance action, either performing planned maintenance now or postponing it to a later time. The goal of DRL in this study is to find the optimal maintenance policy, the one that maximizes reward, where the reward function is defined to minimize total cost over the long run; the deep Q-network (DQN) algorithm is used to learn this policy. We simulate equipment failures in a small production system using wear-type failure distributions under stationary stochastic demand over a six-week planning horizon.
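The maintain-versus-postpone decision described above can be sketched in miniature. The paper itself uses a deep Q-network over a richer state (health, demand, inventory); the sketch below substitutes plain tabular Q-learning on a deliberately tiny state space (accumulated wear and week only), and every cost, probability, and state definition is invented for illustration, not taken from the paper.

```python
# Hypothetical miniature of the maintain-vs-postpone problem: tabular
# Q-learning stands in for the paper's DQN. All numbers are invented.
import random

HORIZON = 6           # six-week planning horizon, as in the abstract
ACTIONS = (0, 1)      # 0 = postpone maintenance, 1 = perform maintenance
MAINT_COST = 3.0      # cost (including planned downtime) of maintenance
FAIL_COST = 20.0      # cost of an unplanned failure

def fail_prob(wear):
    """Wear-type failure behavior: risk grows with accumulated wear."""
    return min(0.9, 0.05 * wear)

def step(wear, action, rng):
    """One weekly transition; returns (next_wear, reward). Rewards are
    negative costs, so maximizing reward minimizes total cost."""
    if action == 1:                      # maintain: pay cost, wear resets
        return 0, -MAINT_COST
    if rng.random() < fail_prob(wear):   # postponed and the machine failed
        return 0, -FAIL_COST
    return wear + 1, 0.0                 # survived the week; wear accumulates

def train(episodes=20000, alpha=0.1, gamma=1.0, eps=0.1, seed=0):
    """Epsilon-greedy Q-learning over the finite horizon."""
    rng = random.Random(seed)
    Q = {}  # (wear, week, action) -> estimated value
    for _ in range(episodes):
        wear = 0
        for week in range(HORIZON):
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: Q.get((wear, week, x), 0.0))
            nxt, r = step(wear, a, rng)
            best_next = 0.0
            if week + 1 < HORIZON:  # horizon end is terminal
                best_next = max(Q.get((nxt, week + 1, x), 0.0) for x in ACTIONS)
            key = (wear, week, a)
            old = Q.get(key, 0.0)
            Q[key] = old + alpha * (r + gamma * best_next - old)
            wear = nxt
    return Q

Q = train()
# Greedy decision for a fresh machine at week 0 (0 = postpone, 1 = maintain)
print(max(ACTIONS, key=lambda a: Q.get((0, 0, a), 0.0)))
```

The learned policy postpones maintenance on a fresh machine and maintains only once accumulated wear makes the expected failure cost exceed the planned-maintenance cost, which is the "look ahead" behavior the finite horizon demands. A DQN replaces the Q-table with a neural network so the same idea scales to continuous sensor, demand, and inventory signals.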

Publication Source (Journal or Book title)

IISE Annual Conference and Expo 2021

First Page

932

Last Page

937

