An Intelligent Decision-making Scheme in a Dynamic Multi-objective Environment using Deep Reinforcement Learning
posted on 2023-08-30, 17:41 authored by Md Mahmudul Hasan
Real-life problems are dynamic and involve decision-making among multiple options. Some of these dynamic decision-making problems must be solved through optimisation, and they become particularly challenging when trade-offs between multiple parameters are required, especially in a dynamic environment. However, with the help of artificial intelligence (AI), these problems can be solved effectively. This research investigates the development of an intelligent decision-making scheme for a dynamic multi-objective environment using a deep reinforcement learning (DRL) algorithm. This includes developing a benchmark in the area of dynamic multi-objective optimisation in reinforcement learning (RL) settings, which motivated the development of an improved testbed based on the conventional deep-sea treasure (DST) benchmark. The proposed testbed is created by changing the optimal Pareto front (PF) and Pareto set (PS). To the best of my knowledge, this is the first dynamic multi-objective testbed for RL settings. Moreover, a framework is proposed to handle multiple objectives in a dynamic environment; it maintains an equilibrium between the different objectives to provide a compromise solution that is close to the true PF. To prove the concept, the proposed model has been applied to a real-world scenario: predicting vulnerable zones based on water quality resilience in São Paulo, Brazil.
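As a rough illustration of what a dynamic DST-style testbed could look like, the Python sketch below defines a small grid world that returns a two-element reward vector (treasure value, time penalty) and periodically rescales its treasure values so that the optimal PF and PS shift over time. The grid layout, treasure values, and change schedule are assumptions made for illustration only, not the testbed specification from the thesis.

```python
import numpy as np

class DynamicDST:
    """Sketch of a dynamic deep-sea treasure style environment.

    Each step returns a 2-element reward vector (treasure value, -1 time
    penalty). Treasure values are perturbed every `change_period` episodes,
    shifting the optimal Pareto front and Pareto set over time.
    """

    def __init__(self, change_period=50, seed=0):
        self.rng = np.random.default_rng(seed)
        self.change_period = change_period
        self.episode = 0
        # Treasure locations (row, col) -> value; values are illustrative.
        self.treasures = {(1, 0): 1.0, (2, 1): 2.0, (3, 2): 3.0,
                          (4, 3): 5.0, (4, 4): 8.0, (4, 5): 16.0}
        self.reset()

    def reset(self):
        self.pos = (0, 0)
        self.episode += 1
        if self.episode % self.change_period == 0:
            # Environment change: rescale treasure values, moving the PF/PS.
            scale = self.rng.uniform(0.5, 1.5)
            self.treasures = {k: v * scale for k, v in self.treasures.items()}
        return self.pos

    def step(self, action):
        # Actions: 0=up, 1=down, 2=left, 3=right; position clamped to the grid.
        moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]
        dr, dc = moves[action]
        r = min(max(self.pos[0] + dr, 0), 4)
        c = min(max(self.pos[1] + dc, 0), 5)
        self.pos = (r, c)
        treasure = self.treasures.get(self.pos, 0.0)
        done = treasure > 0.0
        # Reward vector: (treasure objective, time objective).
        return self.pos, np.array([treasure, -1.0]), done, {}
```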
The proposed algorithm, namely the parity-Q deep Q-network (PQDQN), is successfully implemented and tested, and the agent outperforms the comparison methods in terms of achieving the goal (i.e. the obtained rewards). Although the agent requires a longer elapsed time (i.e. a larger number of steps) to be trained than the multi-objective Monte Carlo tree search (MO-MCTS) agent in a particular event, its accuracy in finding Pareto-optimal solutions is significantly better than that of the multi-policy DQN (MPDQN) and multi-Pareto Q-learning (MPQ) algorithms.
The outcome reveals that the proposed algorithm can find optimal solutions in a dynamic environment. It allows a new objective to be accommodated without any retraining or behaviour tuning of the agent, and it governs which policy is selected. As far as the dynamic DST testbed is concerned, it offers researchers a new dimension for their work and enables them to test their algorithms on problems that are dynamic in nature.
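The abstract does not describe the mechanism PQDQN uses to select among policies. As a generic point of reference only, one standard way to pick a compromise action from per-objective value estimates is linear scalarisation; the short sketch below shows that technique and should not be read as the PQDQN selection rule itself. The array values and equal weights are arbitrary examples.

```python
import numpy as np

def select_action(q_values, weights):
    """q_values: shape (n_actions, n_objectives), one Q-estimate per objective.
    weights: preference vector over objectives (non-negative, summing to 1).
    Returns the action with the highest linearly scalarised value."""
    scalarised = q_values @ weights  # combine the objectives into one score
    return int(np.argmax(scalarised))

# Example: two objectives (treasure, time) weighted equally.
q = np.array([[3.0, -5.0],    # action 0
              [8.0, -9.0],    # action 1
              [1.0, -3.0]])   # action 2
print(select_action(q, np.array([0.5, 0.5])))  # -> 1, the best trade-off here
```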
History
Institution
Anglia Ruskin University
File version
- Accepted version
Language
- eng
Thesis name
- PhD
Thesis type
- Doctoral