Search : [ keyword: graph ] (98)

GPT-2 for Knowledge Graph Completion

Sang-Woon Kim, Won-Chul Shin

http://doi.org/10.5626/JOK.2021.48.12.1281

Knowledge graphs have become an important resource in many artificial intelligence (AI) tasks, and many studies aim to complete incomplete knowledge graphs. Among them, interest is growing in knowledge completion through link prediction and relation prediction. The most widely discussed language models in natural language processing include BERT and GPT-2, and KG-BERT addresses knowledge completion with BERT. In this paper, we address knowledge completion with GPT-2, which has recently drawn the most attention among AI language models. We propose and describe two methods that use the GPT-2 language model: triple-based knowledge completion and path-triple-based knowledge completion. We call the proposed model KG-GPT2 and evaluate its knowledge completion performance by comparing the link prediction and relation prediction results of TransE, TransR, KG-BERT, and KG-GPT2. For link prediction, the WN18RR, FB15k-237, and UMLS datasets were used, and for relation prediction, FB15K was used. In the experiments, path-triple-based knowledge completion with KG-GPT2 achieved the best link prediction performance on all datasets except UMLS, and it also achieved the best relation prediction performance on the FB15K dataset.

Knowledge Graph Completion using Hyper-class Information and Pre-trained Language Model

Daesik Jang, Youngjoong Ko

http://doi.org/10.5626/JOK.2021.48.11.1228

Link prediction is a task that aims to predict missing links in knowledge graphs. Recently, several link prediction models have been proposed to complete knowledge graphs and have achieved meaningful results. However, previous models used only the triples' internal information in the training data, which may lead to overfitting. To address this problem, we propose Hyper-class Information and Pre-trained Language Model (HIP), which performs hyper-class prediction and link prediction through multi-task learning. HIP learns not only the contextual relationships of triples but also the abstract meanings of entities. As a result, it learns general information about the entities and forces entities connected to the same hyper-class to have similar embeddings. Experimental results show significant improvement in Hits@10 and Mean Rank (MR) compared to KG-BERT and MTL-KGC.
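
As an illustration of the two metrics named in the abstract, the sketch below computes Mean Rank (MR) and Hits@k from the rank of the correct entity among scored candidates. It is not taken from the paper: the function names, the tie-handling convention, and the example numbers are assumptions made for this example only.

```python
from typing import Sequence


def rank_of_true(scores: Sequence[float], true_index: int) -> int:
    """Rank (1 = best) of the correct candidate; higher score means more plausible.

    Assumption: ties are resolved optimistically (only strictly higher scores
    push the correct candidate down).
    """
    true_score = scores[true_index]
    return 1 + sum(1 for s in scores if s > true_score)


def mean_rank(ranks: Sequence[int]) -> float:
    """Average rank of the correct entity over all test triples (lower is better)."""
    return sum(ranks) / len(ranks)


def hits_at_k(ranks: Sequence[int], k: int = 10) -> float:
    """Fraction of test triples whose correct entity is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks for five test triples.
example_ranks = [1, 3, 12, 7, 50]
print(mean_rank(example_ranks))      # average of the five ranks
print(hits_at_k(example_ranks, 10))  # share of ranks that are <= 10
```

MR is sensitive to a few very poorly ranked triples, whereas Hits@10 only counts whether the correct entity reaches the top ten, which is why the two are commonly reported together.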

Knowledge Completion System using Neuro-Symbolic-based Rule Induction and Inference Engine

Won-Chul Shin, Hyun-Kyu Park, Young-Tack Park

http://doi.org/10.5626/JOK.2021.48.11.1202

Recently, there have been several studies on knowledge completion methods aimed at solving the problem of incomplete knowledge graphs. Methods such as the Neural Theorem Prover (NTP), which combines the advantages of deep learning methods and logic systems, have outperformed existing methods. However, NTP faces challenges in processing large-scale knowledge graphs because all the triples of the knowledge graph are involved in the computation needed to obtain a prediction for a single input. In this paper, we propose a system that integrates deep learning and logical inference: it learns vector representations of symbols and induces rules with a model that improves on the computational complexity of NTP, and it performs knowledge inference from the induced rules using an inference engine. To verify the rule-induction performance of the rule generation model, we compared its inference ability on test data with that of NTP, using the induced rules on the Nations, Kinship, and UMLS datasets. Knowledge inference experiments with the inference engine on Kdata and WiseKB resulted in a 30% increase for Kdata and a 95% increase for WiseKB compared to the knowledge graphs used in the experiments.

Ensemble of Sentence Interaction and Graph Based Models for Document Pair Similarity Estimation

Seonghwan Choi, Donghyun Son, Hochang Lee

http://doi.org/10.5626/JOK.2021.48.11.1184

Deriving the similarity between two documents, such as news articles, is one of the most important factors in clustering documents. Sequence similarity models, one of the existing deep-learning-based approaches to document clustering, do not reflect the entire context of documents. To address this issue, this paper uses interaction-based and graph-based approaches to construct document pair similarity models suitable for news clustering. This paper proposes four interaction-based models that measure the similarity between two documents by aggregating similarity information from the interactions between sentences. The experimental results demonstrated that two of these four proposed models outperformed SVM and HAN. Ablation studies were conducted on the graph-based model through experiments on the depth of the model’s neural network and its input features. Through error analysis and an ensemble of interaction-based and graph-based models, this paper showed that the two approaches can be complementary because of the differences in their prediction tendencies.

Improving Subgraph Isomorphism with Pruning by Bipartite Matching

Yunyoung Choi, Kunsoo Park

http://doi.org/10.5626/JOK.2021.48.9.973

In recent years, it has become increasingly important to efficiently solve NP-hard graph problems. One of the fundamental problems in graph analysis is subgraph isomorphism. Given a query graph and a data graph, the subgraph isomorphism problem is to determine whether there is an embedding of the query graph in the data graph. Although many practical algorithms have been developed for the problem, existing algorithms show limited running-time scalability on many real-world graphs. In this paper, we propose a new pruning technique based on bipartite matching, which enables us to capture and remove redundancies in the search space. We also conduct experiments on several real datasets to show the effectiveness of our technique.
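
To make the problem statement concrete, the sketch below shows a plain backtracking search for a (non-induced) embedding of a query graph in a data graph, combined with a simple neighborhood filter based on bipartite matching. This is a generic, well-known style of candidate filtering and is not the pruning technique proposed in the paper; the graph representation (adjacency sets keyed by integer vertex IDs), the unlabeled-graph setting, and all function names are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set

Graph = Dict[int, Set[int]]  # vertex -> set of adjacent vertices (undirected)


def has_saturating_matching(left: List[int], options: Dict[int, List[int]]) -> bool:
    """True if every left vertex can be matched to a distinct right vertex
    (augmenting-path bipartite matching)."""
    match_of_right: Dict[int, int] = {}

    def try_assign(u: int, seen: Set[int]) -> bool:
        for v in options[u]:
            if v in seen:
                continue
            seen.add(v)
            # v is free, or its current partner can be re-routed elsewhere.
            if v not in match_of_right or try_assign(match_of_right[v], seen):
                match_of_right[v] = u
                return True
        return False

    return all(try_assign(u, set()) for u in left)


def passes_neighbor_filter(query: Graph, data: Graph, u: int, v: int) -> bool:
    """Necessary condition for mapping u -> v: the neighbors of u must be
    matchable to distinct neighbors of v with at least the same degree."""
    if len(data[v]) < len(query[u]):
        return False
    options = {
        qn: [dn for dn in data[v] if len(data[dn]) >= len(query[qn])]
        for qn in query[u]
    }
    return has_saturating_matching(list(query[u]), options)


def find_embedding(query: Graph, data: Graph) -> Optional[Dict[int, int]]:
    """Return one injective, edge-preserving mapping of query into data, or None."""
    order = sorted(query, key=lambda u: -len(query[u]))  # high degree first
    candidates = {
        u: [v for v in data if passes_neighbor_filter(query, data, u, v)]
        for u in query
    }
    mapping: Dict[int, int] = {}
    used: Set[int] = set()

    def extend(i: int) -> bool:
        if i == len(order):
            return True
        u = order[i]
        for v in candidates[u]:
            if v in used:
                continue
            # Every already-mapped neighbor of u must be adjacent to v.
            if all(mapping[w] in data[v] for w in query[u] if w in mapping):
                mapping[u] = v
                used.add(v)
                if extend(i + 1):
                    return True
                del mapping[u]
                used.discard(v)
        return False

    return dict(mapping) if extend(0) else None
```

The filter is only a necessary condition, so it never discards a valid embedding; it shrinks the candidate sets before the exponential backtracking step begins.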

Community Detection Using Link Attribute-Based Classification

Jeongseon Kim, Soohwan Jeong, Sungsu Lim

http://doi.org/10.5626/JOK.2021.48.8.959

Attempts to discover knowledge through data are becoming increasingly diverse as a way to understand a fast-changing and complex world. Graph data analysis, which models and analyzes correlated data as graphs, is drawing much attention as it is combined with the latest machine learning techniques. In this work, we propose a novel methodology for discovering graph community structures. We analyze similarity-based and curvature-based link attributes so that links inside and outside a community take different attribute values, and we exploit these attributes to design and analyze algorithms that eliminate the links that affect the community structure less, in order to find better community structures on sparse graphs.

EFA-DTI: Prediction of Drug-Target Interactions Using Edge Feature Attention

Erkhembayar Jadamba, Sooheon Kim, Hyeonsu Lee, Hwajong Kim

http://doi.org/10.5626/JOK.2021.48.7.825

Drug discovery is a high-level field of research requiring the coordination of disciplines ranging from medicinal chemistry, systems biology, and structural biology to, increasingly, artificial intelligence. In particular, drug-target interaction (DTI) prediction is central to the process of screening for and optimizing candidate substances to treat disease from a nearly infinite set of compounds. Recently, as computer performance has improved dramatically, studies using artificial neural networks have been actively conducted to reduce the cost and increase the efficiency of DTI prediction. This paper proposes a model that predicts an interaction value between a given molecule and protein using a learned molecule representation via Edge Feature Attention-applied Graph Net Embedding with Fixed Fingerprints and a protein representation using pre-trained protein embeddings. The paper describes the architectures, experimental methods, and findings. The model demonstrated higher performance than DeepDTA and GraphDTA, which had previously demonstrated the best performance in DTI studies.

Knowledge Completion System through Learning the Relationship between Query and Knowledge Graph

Min-Sung Kim, Min-Ho Lee, Wan-Gon Lee, Young-Tack Park

http://doi.org/10.5626/JOK.2021.48.6.649

A knowledge graph is a network comprising relationships between entities. Knowledge graphs suffer from missing or incorrect relationships between specific entities. Numerous studies have proposed learning methods that use artificial neural networks based on natural language embedding to solve the problem of incomplete knowledge graphs, and various knowledge graph completion systems using these methods are being studied. In this paper, we propose a system that infers missing knowledge using specific queries and knowledge graphs. First, a topic is automatically extracted from a query, and the topic embedding is obtained from the knowledge graph embedding module. Next, a new triple is inferred by learning the relationship between the topic from the knowledge graph and the query, using the query embedding and the knowledge graph embedding. With this method, the missing knowledge was inferred, and the predicate embedding of the knowledge graph related to a specific query was used to achieve good performance. An experiment was also conducted using the MetaQA dataset to compare the performance of the proposed method with that of existing methods. For the experiment, we used a knowledge graph in the movie domain. Assuming both a complete knowledge graph and an incomplete one, we experimented on a knowledge graph in which 50% of the triples were randomly omitted. The proposed method performed better than the existing method.

An Explainable Knowledge Completion Model Using Explanation Segments

Min-Ho Lee, Wan-Gon Lee, Batselem Jagvaral, Young-Tack Park

http://doi.org/10.5626/JOK.2021.48.6.680

Recently, a large number of studies have used deep learning to predict new links in incomplete knowledge graphs. However, link prediction using deep learning has a major limitation: the inferred results cannot be explained. We propose a high-utility knowledge graph prediction model that yields explainable inference paths supporting the inference results. We extract paths to the object from the knowledge graph using a path ranking algorithm and define them as explanation segments. Then, the generated explanation segments are embedded using a convolutional neural network (CNN) and a bidirectional long short-term memory (BiLSTM) network. The link prediction model is then trained by applying an attention mechanism based on the semantic similarity between the embedded explanation segments and the candidate predicates to be inferred. The explanation segment suitable for explaining the link prediction is selected based on the measured attention scores. To evaluate the performance of the proposed method, we perform a link prediction comparison experiment and an accuracy verification experiment that measures the proportion of explanation segments suitable for explaining the link prediction results. We used the benchmark datasets NELL-995, FB15K-237, and Countries for the experiments, and the accuracy verification experiments showed accuracies of 89%, 44%, and 97%, respectively. Compared with the existing method, performance on NELL-995 and FB15K-237 was higher by 35 and 21 percentage points on average, respectively.
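
The selection step described in the abstract (attention scores derived from the similarity between embedded explanation segments and a candidate predicate) can be illustrated with a minimal sketch. It assumes the segment and predicate embeddings are already available as plain vectors, uses cosine similarity followed by a softmax as the attention function, and uses hypothetical names throughout; the paper's actual embedding networks (CNN and BiLSTM) and attention formulation are not reproduced here.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def attention_weights(
    segment_vecs: Sequence[Sequence[float]], predicate_vec: Sequence[float]
) -> List[float]:
    """Softmax over segment-to-predicate similarities (assumed attention form)."""
    scores = [cosine(s, predicate_vec) for s in segment_vecs]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def best_segment(
    names: Sequence[str],
    segment_vecs: Sequence[Sequence[float]],
    predicate_vec: Sequence[float],
) -> Tuple[str, List[float]]:
    """Return the segment with the highest attention weight, plus all weights."""
    weights = attention_weights(segment_vecs, predicate_vec)
    best = max(range(len(weights)), key=weights.__getitem__)
    return names[best], weights
```

In this form the weights sum to one, so they can be read directly as the relative support each explanation segment gives to the candidate predicate.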

An Effective Detection Method of Anomalous Sequences Considering the Occurrence Order and Time Interval of the Elements

Jooyeon Lee, Ki Yong Lee

http://doi.org/10.5626/JOK.2021.48.4.469

Recently, sequence data consisting of elements that occur over time have been generated rapidly in various applications. Although various methods for detecting anomalous sequences among given sequences have been actively studied, most of them consider only the occurrence order of the elements. In this paper, we propose an effective anomalous sequence detection method that considers not only the occurrence order of the elements but also the time interval between the elements. The proposed method uses a model that combines two autoencoders. The first is an LSTM autoencoder, which learns the features of the occurrence order of elements, and the second is a graph autoencoder, which learns the features of the time interval between the elements. After training, each sequence is input to the trained model and reconstructed by it. If the occurrence order and time interval of elements in the reconstructed sequence differ greatly from those in the original sequence, the sequence is determined to be anomalous. Through various experiments using synthetic data, we confirmed that the proposed method detects anomalous sequences more effectively than a method that uses an RNN autoencoder to learn the occurrence order of the elements, methods that use a single LSTM autoencoder, and a method that does not use a deep learning model.
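
The final decision step described in the abstract (flag a sequence when its reconstruction differs greatly from the original in order or in time interval) can be sketched as follows. The autoencoders themselves are omitted: the reconstructed sequences are taken as given inputs, and the error definitions, the threshold values, and the function names are assumptions chosen for illustration rather than the paper's formulation.

```python
from typing import Sequence


def order_error(original: Sequence[str], reconstructed: Sequence[str]) -> float:
    """Fraction of positions where the reconstructed element differs.

    Assumption: both sequences have the same, non-zero length.
    """
    mismatches = sum(1 for a, b in zip(original, reconstructed) if a != b)
    return mismatches / len(original)


def interval_error(original: Sequence[float], reconstructed: Sequence[float]) -> float:
    """Mean absolute difference between original and reconstructed time gaps."""
    return sum(abs(a - b) for a, b in zip(original, reconstructed)) / len(original)


def is_anomalous(
    elems: Sequence[str],
    recon_elems: Sequence[str],
    gaps: Sequence[float],
    recon_gaps: Sequence[float],
    order_threshold: float = 0.5,     # hypothetical threshold
    interval_threshold: float = 1.0,  # hypothetical threshold, same unit as gaps
) -> bool:
    """Flag the sequence if either reconstruction error exceeds its threshold."""
    return (
        order_error(elems, recon_elems) > order_threshold
        or interval_error(gaps, recon_gaps) > interval_threshold
    )
```

In practice the thresholds would typically be chosen from the distribution of reconstruction errors on normal training sequences.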


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr