Search : [ keyword: 생성 모델 (generative model) ] (6)

A Diffusion-based Trajectory Prediction Model for Flight Vehicles Considering Pull-up Maneuvers

Seonggyun Lee, Joonseong Kang, Jeyoon Yeom, Dongwg Hong, Youngmin Kim, Kyungwoo Song

http://doi.org/10.5626/JOK.2025.52.3.241

This paper proposes a new multivariate time-series model for predicting the nonlinear trajectories of flight vehicles that perform pull-up maneuvers. To achieve this, aircraft trajectories were predicted using CSDI (Conditional Score-based Diffusion Models for Imputation), a state-of-the-art generative AI model. Specifically, because the flight distance and trajectory shape vary significantly depending on whether a pull-up maneuver is performed, the data were separated into subsets with and without these maneuvers, and a separate model was trained on each. Experimental results demonstrated that the model predicted trajectories very close to the actual ones and outperformed existing deep learning models on the MAE, RMSE, and CRPS metrics. This study not only improves the accuracy of aircraft trajectory prediction but also suggests the potential for more sophisticated predictions through future integration with classifier-guided diffusion models.
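
As a rough illustration of the diffusion-based imputation idea, the sketch below shows one conditional denoising training step, with separate model instances intended for the pull-up and non-pull-up subsets. This is a generic sketch, not the authors' CSDI code; the noise schedule, the tensor shapes, the model call signature, and the TrajectoryDenoiser network are all assumptions.

import torch

T = 50                                    # number of diffusion steps (assumed)
betas = torch.linspace(1e-4, 0.5, T)      # noise schedule (assumed)
alpha_bar = torch.cumprod(1.0 - betas, dim=0)

def train_step(model, optimizer, x, cond_mask):
    """One denoising step: noise the unobserved part, predict the noise.
    x: (batch, length, features) trajectory; cond_mask: 1 = observed."""
    t = torch.randint(0, T, (x.size(0),))
    noise = torch.randn_like(x)
    a = alpha_bar[t].view(-1, 1, 1)
    x_noisy = a.sqrt() * x + (1 - a).sqrt() * noise
    # Observed values are passed through as conditioning, not noised.
    x_in = cond_mask * x + (1 - cond_mask) * x_noisy
    pred = model(x_in, t, cond_mask)       # hypothetical denoiser signature
    loss = ((1 - cond_mask) * (pred - noise)).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Separate models for the two regimes, mirroring the paper's data split:
# model_pullup = TrajectoryDenoiser(); model_level = TrajectoryDenoiser()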

Multi-Level Attention-Based Generation Model for Long-Term Conversation

Hongjin Kim, Bitna Keum, Jinxia Huang, Ohwoog Kwon, Harksoo Kim

http://doi.org/10.5626/JOK.2025.52.2.117

Research into developing more human-like conversational models that use persona memory to generate responses is actively underway. Many existing studies employ a separate retrieval model to identify relevant personas in memory, which slows down the overall system and makes it cumbersome. Moreover, these studies have focused primarily on the ability to generate responses that reflect a persona well, whereas the ability to determine whether referencing a persona is necessary at all should come first. In this paper, we therefore propose a model that does not use a retriever: the need to reference memory is determined through multi-level attention operations within the generation model itself. If a reference is deemed necessary, the response reflects the relevant persona; otherwise, the response focuses on the conversational context. Experimental results confirm that the proposed model operates effectively in long-term conversations.
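
A minimal sketch of the gating idea follows; the shapes, the single attention level, and the fixed threshold are assumptions for illustration, whereas the paper's multi-level attention is more elaborate.

import torch
import torch.nn.functional as F

def gated_memory_context(hidden, memory, threshold=0.5):
    """hidden: (batch, d) decoder state; memory: (batch, n_persona, d)."""
    scores = torch.bmm(memory, hidden.unsqueeze(-1)).squeeze(-1)   # (b, n)
    weights = F.softmax(scores, dim=-1)
    # The strongest persona match decides whether memory is referenced.
    need_memory = torch.sigmoid(scores.max(dim=-1).values) > threshold
    persona_ctx = torch.bmm(weights.unsqueeze(1), memory).squeeze(1)
    # Fall back to a zero context (pure dialogue context) when not needed.
    return torch.where(need_memory.unsqueeze(-1), persona_ctx,
                       torch.zeros_like(persona_ctx))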

Improvement of Background Inpainting using Binary Masking of a Generated Image

Jihoon Lee, Chan Ho Bae, Seunghun Lee, Myung-Seok Choi, Ryong Lee, Sangtae Ahn

http://doi.org/10.5626/JOK.2024.51.6.537

Recently, image generation technology has advanced rapidly in deep learning, and generating images from text prompts has become one of the most effective approaches, with current models producing outstanding results. However, it is difficult to naturally change specific parts of an image using text prompts alone, a well-known limitation of conventional image generation models. In this study, we therefore developed a background inpainting technique that extracts text for each region of an image and uses it to seamlessly change the background while preserving the objects in the image. In particular, the background inpainting technique developed in this study can rapidly transform not only a single image but also multiple images. The proposed text prompt-based image style transfer can therefore be applied in fields with limited training data, where it can enhance model performance through image augmentation.
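
For intuition, a binary-masked background inpainting call with an off-the-shelf diffusion inpainting pipeline might look as follows. This is a sketch, not the authors' pipeline: the checkpoint name, file paths, and prompt are placeholders, and the per-region text extraction step is not shown.

import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

image = Image.open("photo.png").convert("RGB")
# White pixels = region to repaint (background); black = object to preserve.
mask = Image.open("background_mask.png").convert("L")

result = pipe(prompt="a snowy mountain landscape at dusk",
              image=image, mask_image=mask).images[0]
result.save("inpainted.png")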

Polyphonic Music Generation with Sequence Generative Adversarial Networks

Sang-gil Lee, Uiwon Hwang, Seonwoo Min, Sungroh Yoon

http://doi.org/10.5626/JOK.2024.51.1.78

In this paper, we propose an application of sequence generative adversarial networks (SeqGAN) to the generation of polyphonic musical sequences. We introduce a representation of polyphonic MIDI files that encapsulates both chords and melodies with dynamic timings, condensing the duration, octave, and key of both melodies and chords into a single word-vector representation. Our generator, composed of recurrent neural networks, was trained to predict distributions over these musical word sequences. We also employed the least-squares loss function for the discriminator to stabilize training. The resulting model creates musically coherent sequences and shows improvements in both quantitative and qualitative measures.
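
The least-squares discriminator objective mentioned in the abstract is the LSGAN loss; a minimal sketch, assuming a discriminator that scores real and generated sequences, is:

import torch

def lsgan_d_loss(d_real, d_fake):
    """d_real/d_fake: discriminator outputs on real and generated sequences.
    Least-squares targets: 1 for real, 0 for fake."""
    return 0.5 * ((d_real - 1.0) ** 2).mean() + 0.5 * (d_fake ** 2).mean()

def lsgan_g_loss(d_fake):
    # The generator pushes generated sequences toward the "real" target 1.
    return 0.5 * ((d_fake - 1.0) ** 2).mean()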

Response-Considered Query Token Importance Weight Calculator with Potential Response for Generating Query-Relevant Responses

So-Eon Kim, Choong Seon Hong, Seong-Bae Park

http://doi.org/10.5626/JOK.2022.49.8.601

The conversational response generator (CRG) has made great progress through the sequence-to-sequence model, but it often generates over-general responses that could answer any query, or responses that are irrelevant to the query. Some efforts have modified the traditional loss function to address the former problem, and others have reduced query-irrelevant responses by addressing the CRG's lack of background knowledge, but none has solved both problems. This paper proposes a query token importance calculator, on the grounds that unrelated and overly general responses arise because the CRG fails to capture the core of the query. In addition, based on the theory that a questioner designs an utterance to induce a specific response from the listener, this paper proposes using the gold response to capture the core meaning of the query. A qualitative evaluation confirmed that the response generator using the proposed model generated more query-relevant responses than the model without it.
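
One hypothetical way to realize this idea, not the paper's exact formulation, is to derive a per-token importance weight from how strongly the gold response attends to each query token during training:

import torch
import torch.nn.functional as F

def query_token_importance(query_emb, response_emb):
    """query_emb: (q_len, d) query token embeddings;
    response_emb: (r_len, d) gold response token embeddings."""
    scores = response_emb @ query_emb.T        # (r_len, q_len)
    attn = F.softmax(scores, dim=-1)           # response attends to query
    # Each row sums to 1, so the mean is a distribution over query tokens.
    return attn.mean(dim=0)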

Sentence Generation from Knowledge Base Triples Using Attention Mechanism Encoder-decoder

Garam Choi, Sung-Pil Choi

http://doi.org/10.5626/JOK.2019.46.9.934

In this paper, we have investigated the generation of natural language sentences from structured knowledge base triples. To generate a sentence that expresses a triple, an LSTM (Long Short-Term Memory) encoder-decoder is used together with an attention mechanism. On the test data, the model achieved BLEU scores of 42.264 (BLEU-1), 32.441 (BLEU-2), 26.820 (BLEU-3), and 24.446 (BLEU-4) and a ROUGE score of 47.341, an improvement of 0.8% (based on BLEU-1) over the comparison model. In addition, the average BLEU-1 score over the top 10 test samples was 99.393.
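
A rough sketch of the described architecture follows; the vocabulary and hidden sizes are placeholders, and dot-product attention is chosen for brevity, so the paper's attention variant may differ.

import torch
import torch.nn as nn

class TripleToText(nn.Module):
    def __init__(self, vocab=10000, d=256):
        super().__init__()
        self.emb = nn.Embedding(vocab, d)
        self.enc = nn.LSTM(d, d, batch_first=True)
        self.dec = nn.LSTM(d, d, batch_first=True)
        self.out = nn.Linear(2 * d, vocab)

    def forward(self, triple_ids, target_ids):
        # Encode the linearized triple (subject, relation, object) tokens.
        enc_out, state = self.enc(self.emb(triple_ids))
        dec_out, _ = self.dec(self.emb(target_ids), state)
        # Dot-product attention: each decoder step attends over triple tokens.
        attn = torch.softmax(dec_out @ enc_out.transpose(1, 2), dim=-1)
        ctx = attn @ enc_out
        return self.out(torch.cat([dec_out, ctx], dim=-1))   # token logits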

