Digital Library[ Search Result ]
An Improved Algorithm of Finding a Maximal Common Subsequence
Hyeonjun Shin, Joong Chae Na, Jeong Seop Sim
http://doi.org/10.5626/JOK.2023.50.9.737
A maximal common subsequence (MCS) of two strings is a common subsequence that cannot be extended by inserting any character. Unlike the longest common subsequence (LCS), the length of MCS can vary as the longest MCS is an LCS. Although LCS is commonly used to compare similarities of two sequences, computing can take a significant amount of time. Hence, finding a longer MCS is important, as it can be computed faster than the LCS. An algorithm was proposed to compute one of the MCSs of two strings X and Y of total length n using O(kn) space and O(n√(logn/loglogn)) time. Improved algorithms were also proposed. In this paper, we present an algorithm that can check for more characters to compute an MCS. The algorithm proposed in this paper runs in O(kn) space and O(n√(logn/loglogn)) time for a given constant k. Experimental results using both real and randomly generated data showed that the length of the MCS computed by the algorithm proposed in this paper could be up to 6.31 times longer than those computed by previous algorithms.
Algorithms for the k-Scaled Order-Preserving Pattern Matching Problem
Kyung Bin Park, Youngho Kim, Joong Cha Na, Jeong Seop Sim
http://doi.org/10.5626/JOK.2022.49.8.585
Two strings of the same length are order-isomorphic if the relative orders of their characters are the same. Given text T of length n and pattern P of length m, the order-preserving pattern matching problem is to find all substrings of T that are order-isomorphic to P. Order-preserving pattern matching can be used to analyze time-series data such as stock indices and melodies. In this paper, we defined the k-scaled order-preserving pattern matching problem and proposed an O(n+mlogm)-time algorithm for the problem. We also proposed a parallel algorithm for the problem, which runs in O(m+k) time using O(n+m) threads.
Parallel Implementation of the Order-Preserving Multiple Pattern Matching Algorithm using the Karp-Rabin Algorithm
Kyung Bin Park, Youngho Kim, Jeong Seop Sim
http://doi.org/10.5626/JOK.2021.48.3.249
When two strings of the same length have the same relative orders, they are called order-isomorphic. Given a text T(|T|=n)and a set of patterns P̃={P₁,P₂,...,Pk}, the order-preserving multiple pattern matching problem is to find all substrings of T that are order-isomorphic to any pattern in P̃ . Let m be the length of the shortest pattern, m̅ be the length of the longest pattern, and M be the sum of lengths of all the patterns in P̃. When M is polynomial with respect to m, an order-preserving multiple pattern matching algorithm using the Karp-Rabin algorithm was presented whose searching step runs in O(n) time on average. In this paper, we implemented an order-preserving multiple pattern matching algorithm using the Karp-Rabin algorithm in parallel. It runs in O(m̅) time on average using O(M) threads in the preprocessing step and runs in O(m) time on average using O(n) threads in the searching step. The experimental results for random strings as input showed that our parallel implementation performed approximately 201.5 times faster than the existing sequential algorithm when n=1,000,000, k=1,000, m=5.
Parallel Computation of Order-Preserving Periods and Order-Preserving Borders of a Set of Strings
http://doi.org/10.5626/JOK.2019.46.12.1232
Given two strings of the same length over an integer alphabet, those two strings are order-isomorphic when they have the same relative ranks. When strings order-isomorphic to T[1..p](1 ≤ p ≤ n) are periodically repeated in T, a representation of the order relations of T[1..p] is referred to as an order-preserving period of T. When a prefix T[1..q] (1 ≤ q ≤ n) of T is order-isomorphic to a suffix T[n-q+1..n] of T, a representation of the order relations of T[1..q] is called an order-preserving border of T. The lengths of all order-preserving periods (resp. all order-preserving borders) of T can be computed in O(n logn) time using the Z-function. Given a set Ŝ={S₁, S₂,..., Sr}of strings of length n over an integer alphabet, we propose parallel algorithms computing the lengths of all order-preserving periods and all order-preserving borders of Ŝ using O(rn) threads in O(n) time by the Z-function. When compared to each sequential algorithm for Dow Jones Industrial Average, the experimental results show that our parallel algorithm for computing the lengths of all order-preserving periods (resp. all order-preserving borders) of Ŝ runs approximately 3.47 (resp. 3.41) times faster when r =1,000, n =10,000.
Parallel Algorithms for the Boxed-Mesh Permutation Pattern Matching Problem
Jihyo Choi, Youngho Kim, Joong Chae Na, Jeong Seop Sim
http://doi.org/10.5626/JOK.2019.46.4.299
Given a text T(|T|= n) and a pattern P(|P|= m), the boxed-mesh permutation pattern matching problem asks to find all boxed subsequences of T that are order-isomorphic to P. In this paper we present two parallel algorithms for the problem. We first propose an O(nm) -time parallel algorithm using O(n) threads and then propose an O(n)-time algorithm using O(nm) threads. The experimental results for Daw Jones Industrial Average show that our first and second algorithms run approximately 7.2 times and 20.6 times, respectively, faster compared to the sequential algorithm using order-statistics trees when n = 36,364 and m = 30.
An Order-Preserving Pattern Matching Algorithm using Fingerprints of Two q-grams
Gwangmo Yoo, Youngho Kim, Jeong Seop Sim
http://doi.org/10.5626/JOK.2018.45.11.1111
Given a text T of length n and a pattern P of length m, the order-preserving pattern matching problem is to find all substrings in T that have the same relative orders as P. Recently, an O(nm+nqlogq+q!)-time order-preserving pattern matching algorithm was proposed that uses fingerprints of q-grams. In this paper, we propose an order-preserving pattern matching algorithm using the fingerprints of two q-grams to improve execution times. The experimental results for randomly generated T (n = 5,000,000) and P (m = 5,10,15) show that our algorithm runs up to approximately 12% faster than the previous algorithm. Also, for T using Dow Jones Industrial Average (n = 34,658) and P (m = 5,10,15) randomly extracted from T, our algorithm runs up to approximately 10% faster than the previous algorithm.
Parallel Computation of Z-Function for Order-Preserving Pattern Matching and Order-Preserving Multiple Pattern Matching
Youkun Shin, Youngho Kim, Jeong Seop Sim
http://doi.org/10.5626/JOK.2018.45.8.778
Given a text T of length n and a pattern P of length m, the order-preserving pattern matching problem is to find all substrings in T which are order-isomorphic to P. Given a text T of length n and a set of patterns W={P₁, P₂,…, Pb}, the order-preserving multiple pattern matching problem is to find all substrings in T which are order-isomorphic to patterns of W. In this paper, we present two parallel algorithms based on the Z-function. The first algorithm for the order-preserving pattern matching problem runs in O(m) time using O(n+hm) threads and the second algorithm for the order-preserving multiple pattern matching problem runs in O(n+M) time using O(b(n+M)) threads, where h is the number of blocks and M is the length of the longest pattern in W. Experimental results show that our parallel algorithm for the order-preserving pattern matching problem is approximately 71.2 times faster than the sequential algorithm when m=10 and n=1,000,000, and that our parallel algorithm for the order-preserving multiple pattern matching problem is approximately 12.2 times faster than the sequential algorithm when b=1,000, m=10, and n=1,000.
Parallel Algorithms for Finding δ-approximate Periods and γ-approximate Periods of Strings over Integer Alphabets
http://doi.org/10.5626/JOK.2017.44.8.760
Repetitive strings have been studied in diverse fields such as data compression, bioinformatics and so on. Recently, two problems of approximate periods of strings over integer alphabets were introduced, finding minimum δ-approximate periods and finding minimum γ-approximate periods. Both problems can be solved in O(n²) time when n is the length of the string. In this paper, we present two parallel algorithms for solving the above two problems in O(n²) time using O(n²) threads, respectively. The experimental results show that our parallel algorithms for finding minimum δ-approximate (resp. γ-approximate) periods run approximately 19.7 (resp. 40.08) times faster than the sequential algorithms when n = 10,000.
δ-approximate Periods and γ-approximate Periods of Strings over Integer Alphabets
(δ, γ)-matching for strings over integer alphabets can be applied to such fields as musical melody and share prices on stock markets. In this paper, we define δ-approximate periods and γ-approximate periods of strings over integer alphabets. We also present two O(n²) - time algorithms, each of which finds minimum δ-approximate periods and minimum γ-approximate periods, respectively. Then, we provide the experimental results of execution times of both algorithms.
A Hashing-Based Algorithm for Order-Preserving Multiple Pattern Matching
Munseong Kang, Sukhyeun Cho, Jeong Seop Sim
Given a text Tand a pattern P, the order-preserving pattern matching problem is to find all substrings in T which have the same relative orders as P. The order-preserving pattern matching problem has been studied in terms of finding some patterns affected by relative orders, not by their absolute values. Given a text T and a pattern set ℙ, the order-preserving multiple pattern matching problem is to find all substrings in T which have the same relative orders as any pattern in ℙ. In this paper, we present a hashing-based algorithm for the order-preserving multiple pattern matching problem.
Search

Journal of KIISE
- ISSN : 2383-630X(Print)
- ISSN : 2383-6296(Electronic)
- KCI Accredited Journal
Editorial Office
- Tel. +82-2-588-9240
- Fax. +82-2-521-1352
- E-mail. chwoo@kiise.or.kr