Search : [ keyword: sentence embedding ] (4)

SBERT-PRO: Predicate Oriented Sentence Embedding Model for Intent and Event Detection

Dongryul Ko, Jeayun Lee, Dahee Lee, Yuri Son, Sangmin Kim, Jaeeun Jang, Munhyeong Kim, Sanghyun Park, Jaieun Kim

http://doi.org/10.5626/JOK.2024.51.2.165

Intent detection is a crucial task in conversational systems for understanding user intentions. Event detection, in turn, is vital for identifying important events in various texts, including news articles, social media posts, and reports. Among diverse approaches, the sentence embedding similarity-based method has been widely adopted to solve open-domain classification tasks. However, conventional embedding models tend to focus on specific keywords within a sentence, which makes them ill-suited to tasks that require a high-level semantic understanding of the whole sentence rather than a narrow focus on specific details. This limitation becomes particularly evident in intent detection, which requires a broader understanding of the intention of a sentence, and in event detection, which requires an emphasis on the actual events within a sentence. In this paper, we construct a training dataset suitable for intent and event detection using entity attribute information and entity relation information. Our approach is motivated by the importance of emphasizing the embedding of predicates, which unfold the content of a sentence, rather than the entity attributes within it. Furthermore, we suggest an adaptive learning strategy for the existing sentence embedding model and demonstrate that our proposed model, SBERT-PRO (PRedicate Oriented), outperforms conventional models.
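
The embedding similarity-based classification mentioned in this abstract can be illustrated with a minimal sketch. It is not the SBERT-PRO implementation: the label names and the 3-dimensional vectors below are hypothetical stand-ins for real sentence embeddings, and the helper names `cosine` and `classify` are assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def classify(query_vec, label_vecs):
    """Return the label whose embedding is most similar to the query embedding."""
    return max(label_vecs, key=lambda label: cosine(query_vec, label_vecs[label]))

# Hypothetical toy vectors; a real system would obtain these from an embedding model.
label_vecs = {
    "book_flight": [0.9, 0.1, 0.0],
    "cancel_order": [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.3, 0.1]
predicted = classify(query_vec, label_vecs)
print(predicted)
```

Because the decision is a nearest-neighbour lookup, new intent or event labels can be added by embedding one more description, without retraining a classifier. Its quality depends entirely on what the embedding model emphasizes.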

A Contrastive Learning Method for Automated Fact-Checking

Seonyeong Song, Jejun An, Kunwoo Park

http://doi.org/10.5626/JOK.2023.50.8.680

As online misinformation proliferates, the importance of automated fact-checking, which enables real-time evaluation, has been emphasized. In this study, we propose a contrastive learning method for automated fact-checking in Korean. To determine the authenticity of a given claim, the proposed method treats a sentence similar to the evidence as a positive sample. In evaluation experiments, we found that the proposed method was more effective than previous methods, such as a fine-tuned pretrained language model and SimCSE, in the sentence selection step of finding evidence sentences for a given claim. This study shows the potential of contrastive learning for automated fact-checking.
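
As a rough illustration of contrastive learning with positive samples, the sketch below computes an InfoNCE-style loss for one claim. The abstract does not state which loss the authors use, so the objective, the temperature value, and the toy 2-dimensional vectors are all assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def info_nce(anchor, positive, negatives, temperature=0.05):
    """InfoNCE-style loss: -log softmax of the positive among all candidates."""
    pos = cosine(anchor, positive) / temperature
    logits = [pos] + [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - pos

# Hypothetical toy embeddings: a claim, an evidence sentence, and unrelated sentences.
claim = [1.0, 0.0]
evidence = [0.9, 0.1]
unrelated = [[0.0, 1.0], [-1.0, 0.2]]

loss_good = info_nce(claim, evidence, unrelated)
loss_bad = info_nce(claim, unrelated[0], [evidence, unrelated[1]])
print(loss_good < loss_bad)
```

The loss is small when the claim embedding lies closer to its positive than to the negatives, so minimizing it pulls claims toward evidence-like sentences, which is the property sentence selection relies on.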

Automatic Extraction of Sentence Embedding Features for Question Similarity Analysis in Dialogues

Kyo-Joong Oh, Dongkun Lee, Chae-Gyun Lim, Ho-Jin Choi

http://doi.org/10.5626/JOK.2019.46.9.909

This paper describes a method for the automatic extraction of feature vectors that can be used to analyze the similarity among natural language sentences. Similarity analysis among sentences is a necessary aspect of measuring semantic or structural similarity in natural language understanding. The analysis results can be used to find answers in Question and Answer (Q&A) systems and dialogue systems. The similarity analysis uses sentence vectors extracted by two deep learning models: the Recurrent Neural Network (RNN) to reflect sequential information of expressions such as syllables and semantic morphemes, and the Convolutional Neural Network (CNN) for characterizing the appearance patterns of similar expressions such as words or phrases. In this paper, we examine the accuracy and quality of the method using sentence vectors that are automatically extracted by the models from dialogues related to banking service. This method can find more similar questions and answers in FAQs than existing methods. The automatic feature extraction method can be used to analyze the similarity of Korean sentences across various application domains and systems.

Document Summarization Using TextRank Based on Sentence Embedding

Seok-won Jeong, Jintae Kim, Harksoo Kim

http://doi.org/10.5626/JOK.2019.46.3.285

Document summarization is the task of creating a short version of a document that maintains the main content of the original. Extractive summarization has been actively studied because, by copying a large amount of text from the original document, it guarantees a basic level of grammaticality and a high level of accuracy. However, TextRank, a typical extractive summarization method, computes the edges of its graph from word frequencies, which makes it difficult to take the meaning of sentences into account. To address this drawback, we propose a new TextRank that uses sentence embeddings. Through experiments, we confirmed that the proposed method reflects the meaning of sentences better than the existing method.
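
The idea of running TextRank over embedding-based edges can be sketched as follows. This is a minimal illustration, not the authors' implementation: using cosine similarity as the edge weight, clipping negative weights to zero, the damping factor of 0.85, and the toy vectors are all assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def textrank(vectors, damping=0.85, iterations=50):
    """Score sentences by weighted PageRank over an embedding-similarity graph."""
    n = len(vectors)
    # Edge weight = cosine similarity between distinct sentences, clipped at zero.
    weights = [[max(cosine(vectors[i], vectors[j]), 0.0) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on a share of its score proportional to the edge weight.
            rank = sum(weights[j][i] / out_sum[j] * scores[j]
                       for j in range(n) if out_sum[j] > 0)
            new_scores.append((1 - damping) / n + damping * rank)
        scores = new_scores
    return scores

# Hypothetical toy embeddings: three related sentences and one off-topic sentence.
sentence_vecs = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.0, 1.0]]
scores = textrank(sentence_vecs)
print(scores)
```

An extractive summary would then keep the highest-scoring sentences in their original order. In this toy input the off-topic sentence receives the lowest score because it has only weak edges to the rest of the graph.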


Journal of KIISE

  • ISSN : 2383-630X(Print)
  • ISSN : 2383-6296(Electronic)
  • KCI Accredited Journal

Editorial Office

  • Tel. +82-2-588-9240
  • Fax. +82-2-521-1352
  • E-mail. chwoo@kiise.or.kr