TY  - JOUR
T1  - Automatic Construction of Reduced Dimensional Cluster-based Keyword Association Networks using LSI
AU  - Yoo, Han-mook
AU  - Kim, Han-joon
AU  - Chang, Jae-young
JO  - Journal of KIISE, JOK
PY  - 2017
DA  - 2017/1/14
DO  - 10.5626/JOK.2017.44.11.1236
KW  - latent semantic indexing
KW  - mutual information
KW  - maximal spanning tree
KW  - clustering
KW  - keyword extraction
KW  - text mining
AB  - In this paper, we propose LSI-based ClusterTextRank, a novel method for producing keyword networks that extracts significant keywords from a set of clusters using a mutual information metric and constructs an association network using latent semantic indexing (LSI). The proposed method reduces the dimensionality of the documents through LSI, partitions the documents into multiple clusters through k-means clustering, and represents the words within each cluster as a maximal spanning tree. Significant keywords are identified by evaluating their mutual information within clusters. The method then calculates the similarities between the extracted keywords using the term-concept matrix and represents the results as a keyword association network. In an evaluation on travel-related blog data, the proposed method outperformed the existing TextRank algorithm by about 14% in accuracy.
ER  - 