TY - JOUR
T1 - Ensemble of Sentence Interaction and Graph Based Models for Document Pair Similarity Estimation
AU - Choi, Seonghwan
AU - Son, Donghyun
AU - Lee, Hochang
JO - Journal of KIISE, JOK
PY - 2021
DA - 2021/1/14
DO - 10.5626/JOK.2021.48.11.1184
KW - news clustering
KW - document similarity
KW - text similarity
KW - interaction-based
KW - graph-based
AB - Deriving the similarity between two documents, such as news articles, is one of the most important factors in clustering documents. Sequence similarity models, one of the existing deep-learning-based approaches to document clustering, do not reflect the entire context of documents. To address this issue, this paper uses interaction-based and graph-based approaches to construct document pair similarity models suitable for news clustering. This paper proposes four interaction-based models that measure the similarity between two documents by aggregating the similarity information obtained from interactions between their sentences. The experimental results demonstrated that two of the four proposed models outperformed SVM and HAN. Ablation studies were conducted on the graph-based model through experiments on the depth of the model’s neural network and its input features. Through error analysis and an ensemble of interaction-based and graph-based models, this paper showed that the two approaches can be complementary because of differences in their prediction tendencies.