Differentially Private High-Dimensional Data Publication via Markov Network

Wei Zhang; Jingwen Zhao; Fengqiong Wei; Yunfang Chen

Research Article


@ARTICLE{10.4108/eai.29-7-2019.159626,
    author={Wei Zhang and Jingwen Zhao and Fengqiong Wei and Yunfang Chen},
    title={Differentially Private High-Dimensional Data Publication via Markov Network},
    journal={EAI Endorsed Transactions on Security and Safety},
    volume={6},
    number={19},
    publisher={EAI},
    journal_a={SESA},
    year={2019},
    month={1},
    keywords={Differential privacy, High-dimensional, Data publication, Markov network},
    doi={10.4108/eai.29-7-2019.159626}
}

Wei Zhang^1,2, Jingwen Zhao^1, Fengqiong Wei^1, Yunfang Chen^1,*

1: School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
2: Jiangsu Key Laboratory of Big Data Security and Intelligent Processing, Nanjing University of Posts and Telecommunications, Nanjing 210023, China

*Contact email: chenyf@njupt.edu.cn

Abstract

Differentially private data publication has recently received considerable attention. However, publishing high-dimensional data under differential privacy remains challenging because of complex attribute relationships, high computational complexity, and data sparsity. We therefore propose PrivMN, a novel method for publishing high-dimensional data with a differential privacy guarantee. We first use a Markov network to represent the mutual relationships between attributes, which addresses the problem that the direction of the relationship between variables cannot be determined in practical applications. We then use approximate inference to calculate the joint distribution of the high-dimensional data under differential privacy, which addresses the computational and space complexity of exact inference. Extensive experiments on real datasets demonstrate that our solution publishes high-dimensional synthetic datasets more efficiently under the guarantee of differential privacy.
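As a rough illustration of the kind of building block such a method relies on, the sketch below estimates the joint distribution of two attributes under differential privacy with the standard Laplace mechanism. It is not PrivMN itself: the abstract does not specify the noise mechanism, so the function name `noisy_marginal`, its parameters, and the add-or-remove-one-record neighbouring definition (which gives the histogram an L1 sensitivity of 1) are all assumptions made for this example.

```python
import random
from collections import Counter
from itertools import product


def noisy_marginal(records, i, j, domain_i, domain_j, epsilon, rng=None):
    """Estimate the joint distribution of attributes i and j with Laplace noise.

    Hypothetical helper for illustration only; not taken from the paper.
    Assumes neighbouring datasets differ by adding or removing one record,
    so the histogram of counts has L1 sensitivity 1.
    """
    rng = rng or random.Random()
    counts = Counter((r[i], r[j]) for r in records)
    scale = 1.0 / epsilon  # Laplace scale = sensitivity / epsilon

    noisy = {}
    for cell in product(domain_i, domain_j):
        # The difference of two i.i.d. exponentials with rate 1/scale
        # is Laplace-distributed with mean 0 and the given scale.
        noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        # Clamping and normalising are post-processing steps, which do not
        # weaken the differential privacy guarantee.
        noisy[cell] = max(counts[cell] + noise, 0.0)

    total = sum(noisy.values())
    if total == 0.0:
        # Fall back to a uniform distribution if every cell was clamped to 0.
        return {cell: 1.0 / len(noisy) for cell in noisy}
    return {cell: value / total for cell, value in noisy.items()}


# Toy dataset with two binary attributes.
records = [(0, 1), (1, 1), (0, 0), (1, 1)]
dist = noisy_marginal(records, 0, 1, [0, 1], [0, 1], epsilon=1.0,
                      rng=random.Random(0))
```

The cost of releasing one such table grows with the product of the attribute domain sizes, which is why a full joint distribution over many attributes is impractical and why a graphical model built from low-dimensional marginals is attractive.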

Keywords: Differential privacy, High-dimensional, Data publication, Markov network

Received: 2018-12-12
Accepted: 2019-01-21
Published: 2019-01-29
Publisher: EAI

DOI: http://dx.doi.org/10.4108/eai.29-7-2019.159626

Copyright © 2019 Wei Zhang et al., licensed to EAI. This is an open access article distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/3.0/), which permits unlimited use, distribution and reproduction in any medium so long as the original work is properly cited.