Zero-Trust Based Distributed Collaborative Dynamic Access Control Scheme with Deep Multi-Agent Reinforcement Learning

Qiuqing Jin; Liming Wang

Research Article

Cite (BibTeX):

@ARTICLE{10.4108/eai.25-6-2021.170246,
    author={Qiuqing Jin and Liming Wang},
    title={Zero-Trust Based Distributed Collaborative Dynamic Access Control Scheme with Deep Multi-Agent Reinforcement Learning},
    journal={EAI Endorsed Transactions on Security and Safety},
    volume={8},
    number={27},
    publisher={EAI},
    journal_a={SESA},
    year={2020},
    month={12},
    keywords={Zero-Trust, Insider Threats, Dynamic Access Control, Reinforcement Learning},
    doi={10.4108/eai.25-6-2021.170246}
}

Qiuqing Jin^1,2^,*, Liming Wang

1: Institute of Information Engineering, Chinese Academy of Sciences
2: University of Chinese Academy of Sciences

*Contact email: jinqiuqing@iie.ac.cn

Abstract

The vast majority of organizations and companies depend heavily on intranets with access control to achieve secure data access and authorized resource sharing across departments and networks. However, traditional boundary defense struggles to mitigate the growing number of threats and attacks, most of which originate from insiders. Common insider threat solutions decouple detection from defense, so they require domain knowledge and human intervention to carry out mitigation afterwards. Moreover, these static methods cannot dynamically monitor diverse anomalous events and take corresponding protective measures. In this paper, we present a Zero-Trust based collaborative dynamic access control scheme that rebuilds a secure network architecture from the traffic scheduling perspective to mitigate insider threats. The scheme tightly integrates anomaly detection and mitigation execution: it constructs a dynamically updated user trust profile that serves as the evidence for access control, and it collaboratively adjusts the mitigation policy in response to even subtle changes in requirements and environment, in a scalable, distributed way. We use Multi-Agent Deep Deterministic Policy Gradient (MADDPG) to optimize the traffic allocation policy, yielding an adaptive and automatic collaborative management scheme that takes network security, network environment, and user requirements into account. We analyze the performance of the scheme with a network simulator; the results are promising for applying deep reinforcement learning (DRL) to threat mitigation.
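To make the idea of a dynamically updated user trust profile serving as access control evidence more concrete, the following is a minimal Python sketch. It is not the scheme from the paper: the abstract does not specify the trust model, so the `TrustProfile` class, the exponential-moving-average update, the anomaly score in [0, 1], and the two thresholds are all assumptions chosen for illustration. The MADDPG-based traffic allocation is not covered here.

```python
from dataclasses import dataclass


@dataclass
class TrustProfile:
    """Hypothetical per-user trust record (an assumption, not the paper's structure)."""
    user: str
    score: float = 0.5  # assumed range [0, 1]; 0.5 is a neutral starting trust

    def update(self, anomaly: float, rate: float = 0.3) -> float:
        """Blend the newest observation into the trust score.

        `anomaly` is an assumed detector output in [0, 1], where 1 means
        highly anomalous. The exponential moving average used here is an
        illustrative choice only.
        """
        if not 0.0 <= anomaly <= 1.0:
            raise ValueError("anomaly must be in [0, 1]")
        self.score = (1.0 - rate) * self.score + rate * (1.0 - anomaly)
        return self.score


def decide(profile: TrustProfile, allow_at: float = 0.6, deny_below: float = 0.3) -> str:
    """Map the current trust score to an access decision (assumed thresholds)."""
    if profile.score >= allow_at:
        return "allow"
    if profile.score < deny_below:
        return "deny"
    return "restrict"  # intermediate trust, e.g. limit the user's traffic


if __name__ == "__main__":
    profile = TrustProfile("alice")
    # Two normal observations followed by three increasingly anomalous ones.
    for anomaly in (0.0, 0.0, 0.9, 1.0, 1.0):
        profile.update(anomaly)
        print(f"trust={profile.score:.3f} decision={decide(profile)}")
```

In this sketch the decision is re-evaluated after every observation, so access can be tightened or relaxed without a separate, manually triggered mitigation step, which is the coupling of detection and mitigation that the abstract describes.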

Keywords: Zero-Trust, Insider Threats, Dynamic Access Control, Reinforcement Learning

Received: 2020-11-12
Accepted: 2020-12-13
Published: 2020-12-16
Publisher: EAI

DOI: http://dx.doi.org/10.4108/eai.25-6-2021.170246

Copyright © 2020 Qiuqing Jin et al., licensed to EAI. This is an open access article distributed under the terms of the Creative Commons Attribution license, which permits unlimited use, distribution and reproduction in any medium so long as the original work is properly cited.