Efficient and Privacy-Preserving Near-Duplicate Detection in Cloud Computing

Changhee Hahn; Hyung June Shin; Junbeom Hur

Vol. 44, No. 10, pp. 1112-1123, Oct. 2017

10.5626/JOK.2017.44.10.1112

Near-duplicate detection

Searchable encryption

Cloud computing

Privacy

Abstract

As content providers offload more content-centric services to the cloud, data retrieval over the cloud typically returns many redundant items because near-duplicate content is prevalent on the Internet. Simply fetching all data from the cloud severely degrades efficiency in terms of resource utilization and bandwidth. In addition, data may be encrypted by multiple content providers under different keys to preserve privacy. Privacy-preserving retrieval therefore depends on the ability to deduplicate redundant search results and return the best matches without decrypting the data. To this end, we propose an efficient near-duplicate detection scheme for encrypted data in the cloud. Our scheme has the following benefits. First, a single query suffices to locate near-duplicate data even when they are encrypted under different keys of multiple content providers. Second, storage, computation, and communication costs are lower than in existing schemes, while the same level of search accuracy is achieved. Third, scalability is significantly improved by a novel and efficient two-round detection that locates near-duplicate candidates over large quantities of data in the cloud. An experimental analysis with real-world data demonstrates the applicability of the proposed scheme to a practical cloud system; on average, the proposed scheme is 70.6% faster than an existing scheme.
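
The abstract does not say what the two detection rounds do, so the following Python sketch only illustrates the general idea of a cheap candidate round followed by a precise verification round. It is not the authors' scheme: it works on plaintext, offers no privacy protection, and uses SimHash fingerprints with band-based bucketing as stand-in techniques. All names (`simhash`, `bands`, `find_near_duplicates`) and parameters (`max_distance`, `n_bands`) are hypothetical.

```python
import hashlib
from collections import defaultdict

BITS = 64  # fingerprint length (an assumption of this sketch)


def simhash(text: str, bits: int = BITS) -> int:
    """Plaintext SimHash over lowercase whitespace tokens (stand-in technique)."""
    weights = [0] * bits
    for token in text.lower().split():
        h = int.from_bytes(hashlib.sha256(token.encode()).digest()[:8], "big")
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if weights[i] > 0)


def bands(fp: int, n_bands: int = 4, bits: int = BITS):
    """Split a fingerprint into (band index, band value) keys for bucketing."""
    width = bits // n_bands
    mask = (1 << width) - 1
    return [(b, (fp >> (b * width)) & mask) for b in range(n_bands)]


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def find_near_duplicates(query, corpus, max_distance=3, n_bands=4):
    """Return ids of corpus items whose fingerprint is close to the query's.

    Round 1 keeps only items sharing at least one band with the query.
    Round 2 checks the exact Hamming distance of those candidates.
    Choosing max_distance < n_bands means round 1 cannot miss a true match,
    because at most max_distance bands can differ.
    """
    index = defaultdict(set)
    fingerprints = {}
    for doc_id, text in corpus.items():
        fingerprints[doc_id] = simhash(text)
        for key in bands(fingerprints[doc_id], n_bands):
            index[key].add(doc_id)

    q = simhash(query)

    # Round 1: coarse candidate selection via shared band buckets.
    candidates = set()
    for key in bands(q, n_bands):
        candidates |= index.get(key, set())

    # Round 2: precise check on the (usually much smaller) candidate set.
    return sorted(d for d in candidates if hamming(q, fingerprints[d]) <= max_distance)
```

In a setting like the one the abstract describes, the fingerprints would have to be protected so that the cloud can compare them without learning the underlying data; how the proposed scheme achieves that is not covered by this sketch.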

Cite this article

[IEEE Style]

C. Hahn, H. J. Shin, J. Hur, "Efficient and Privacy-Preserving Near-Duplicate Detection in Cloud Computing," Journal of KIISE, JOK, vol. 44, no. 10, pp. 1112-1123, 2017. DOI: 10.5626/JOK.2017.44.10.1112.

[ACM Style]

Changhee Hahn, Hyung June Shin, and Junbeom Hur. 2017. Efficient and Privacy-Preserving Near-Duplicate Detection in Cloud Computing. Journal of KIISE, JOK, 44, 10, (2017), 1112-1123. DOI: 10.5626/JOK.2017.44.10.1112.

[KCI Style]

Changhee Hahn, Hyung June Shin, Junbeom Hur, "Efficient and Privacy-Preserving Near-Duplicate Detection in Cloud Computing," Journal of KIISE, Vol. 44, No. 10, pp. 1112~1123, 2017. DOI: 10.5626/JOK.2017.44.10.1112.

Journal of KIISE

ISSN : 2383-630X(Print)
ISSN : 2383-6296(Electronic)
KCI Accredited Journal
