Spatio-Temporal Modeling via Adaptive Frequency Filtering for Video Action Recognition
Minji Kim, Taehoon Kim, Jonghyeon Seon, Bohyung Han
http://doi.org/10.5626/JOK.2024.51.12.1078
Modeling long-term spatio-temporal dependencies in video data is challenging because CNNs, with their local receptive fields, often struggle to capture global context. To address this problem, we propose an efficient global spatio-temporal modeling method that integrates easily with existing CNN models. Our approach uses the Discrete Cosine Transform (DCT) to transform information into the frequency domain, where two adaptive filtering paths operate complementarily: one removes redundant frequencies while preserving essential information, and the other enhances frequencies that are important for spatio-temporal modeling. We also introduce DynamicMNIST, a lightweight dataset featuring various digit behaviors such as shifting, rotating, and scaling. Our evaluations on three public benchmarks and DynamicMNIST demonstrate that the proposed module improves activity recognition performance across different CNN models with minimal additional parameters and computational cost.
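The frequency-domain filtering described in this abstract can be illustrated with a minimal sketch. It assumes a single feature channel sampled over a few frames and filters it along the temporal axis only. The function names, the fixed `keep_mask`, and the fixed `gains` are hypothetical: in the paper the filters are adaptive, and the abstract does not state how the two paths are combined, so the sketch simply applies both element-wise to the DCT coefficients.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis; row k is the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * t + 1) * k / (2 * n))
                     for t in range(n)])
    return rows


def dct(signal):
    """Project a temporal signal onto the cosine basis."""
    basis = dct_matrix(len(signal))
    return [sum(b * x for b, x in zip(row, signal)) for row in basis]


def idct(coeffs):
    """Inverse transform; the basis is orthonormal, so its transpose inverts it."""
    n = len(coeffs)
    basis = dct_matrix(n)
    return [sum(basis[k][t] * coeffs[k] for k in range(n)) for t in range(n)]


def filter_frequencies(signal, keep_mask, gains):
    """Apply two illustrative filters in the frequency domain.

    keep_mask: 0/1 per frequency, standing in for the path that removes
               redundant frequencies (assumption: a fixed mask, not learned).
    gains:     multiplier per frequency, standing in for the path that
               enhances important frequencies (assumption: fixed values).
    """
    coeffs = dct(signal)
    filtered = [c * m * g for c, m, g in zip(coeffs, keep_mask, gains)]
    return idct(filtered)


if __name__ == "__main__":
    # One feature value per frame over 8 frames (made-up numbers).
    frames = [0.2, 0.4, 0.9, 1.0, 0.8, 0.5, 0.3, 0.1]
    keep_mask = [1, 1, 1, 1, 0, 0, 0, 0]                 # drop the 4 highest frequencies
    gains = [1.0, 1.5, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]     # boost one low frequency
    print(filter_frequencies(frames, keep_mask, gains))
```

Because every DCT coefficient depends on all frames, a single multiplication in the frequency domain affects the whole sequence at once, which is one way to obtain global temporal context without enlarging a convolution's receptive field.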
Ensemble Modeling with Convolutional Neural Networks for Application in Visual Object Tracking
Minji Kim, Ilchae Jung, Bohyung Han
http://doi.org/10.5626/JOK.2021.48.2.211
In computer vision, visual object tracking aims to estimate the state of a target object from an input video stream, a task broadly applicable to domains such as surveillance and the military. Recently, deep learning-based tracking algorithms have improved significantly by using tracking-by-detection or template-based approaches. However, each of these approaches still suffers from limitations inherent to its strategy. In this paper, we propose a novel method for building ensemble trackers that fuses the two strategies, tracking-by-detection and template-based tracking. We report significantly enhanced performance on widely adopted visual object tracking benchmarks: OTB100, UAV123, and LaSOT.
Leveraging the Physical Properties of Real Objects to Manage Digital Photography in Augmented Reality
Han Joo Chae, Youli Chang, Minji Kim, Gwanmo Park, Jinwook Seo
http://doi.org/10.5626/JOK.2020.47.10.900
We introduce the concept of physical-object-oriented interaction, which provides a natural user experience by leveraging the physical properties of real objects, and present ARphy, a tangible interface that enables people to manage and interact with digital photographs using real physical objects in augmented reality (AR). Unlike traditional mobile photo applications, ARphy uses the physical attributes and affordances of real objects to make interaction more intuitive. For example, people can hang travel photos on a souvenir, keep meaningful photos inside a box, or delete photos by putting them into a trash can. We designed the architecture of ARphy for use on various types of AR devices (e.g., mobile devices and headsets). Our qualitative user evaluation demonstrated that ARphy was intuitive, immersive, and fun to use, and well suited to managing digital photos in an AR environment.
Journal of KIISE
- ISSN : 2383-630X(Print)
- ISSN : 2383-6296(Electronic)
- KCI Accredited Journal