Volume 6 Number 12 (Dec. 2011)
Home > Archive > 2011 > Volume 6 Number 12 (Dec. 2011) >
JSW 2011 Vol.6(12): 2391-2398 ISSN: 1796-217X
doi: 10.4304/jsw.6.12.2391-2398

Multi-pattern Matching with Wildcards

Meng Zhang1, Yi Zhang2, Jijun Tang3, Xiaolong Bai1

1College of Computer Science and Technology, Jilin University, Changchun, China
2Department of Computer Science, Jilin Business and Technology College, Changchun, China
3Department of Computer Science & Engineering, Univ. of South Carolina, USA


Abstract—Multi-pattern matching with wildcards is to find all the occurrences of a set of patterns with wildcards in a text. This problem arises in various fields, such as computational biology and network security. But the problem is not extensively studied as the single pattern case and there is no efficient algorithm for this problem. In this paper, we present efficient algorithms based on the fast Fourier transform. Let P = {p1, . . . , pk} be a set of patterns with wildcards where the total length of patterns is |P|, and a text t of length n over alphabet a1, . . . , a. We present three algorithms for this problem where patterns are matched simultaneously. The first algorithm finds the matches of a small set of patterns in the text in O(n log |P| + occ log k) time where occ is the total number of occurrences of P in t. The words used in the algorithm are of size kd2 lg e+Pk i=1dlg |pi|e bits. The second algorithm is based on a prime number encoding. It runs in time O(n logm + occ log k) where m is the length of the longest pattern in P. The algorithm uses words with kdlg(2m2 +k2)e bits. The third one finds the occurrences of patterns in the text in time O(n log |P| log  + occ log k) by computing the Hamming distance between patterns and the text. The algorithm uses words with Pk i=1dlg |pi|e bits. Moreover, we demonstrate an FFT implementation based on the modular arithmetic for machines with 64-bit word. Finally, we show that these algorithms can be easily parallelized, and the parallelized algorithms are given as well.

Index Terms—Algorithm; Multi-pattern matching; Wildcards; FFT.

[PDF]

Cite: Meng Zhang, Yi Zhang, Jijun Tang, Xiaolong Bai, "Multi-pattern Matching with Wildcards," Journal of Software vol. 6, no. 12, pp. 2391-2398, 2011.

General Information

  • ISSN: 1796-217X (Online)

  • Abbreviated Title: J. Softw.

  • Frequency:  Quarterly

  • APC: 500USD

  • DOI: 10.17706/JSW

  • Editor-in-Chief: Prof. Antanas Verikas

  • Executive Editor: Ms. Cecilia Xie

  • Abstracting/ Indexing: DBLP, EBSCO,
           CNKIGoogle Scholar, ProQuest,
           INSPEC(IET), ULRICH's Periodicals
           Directory, WorldCat, etc

  • E-mail: jsweditorialoffice@gmail.com

  • Oct 22, 2024 News!

    Vol 19, No 3 has been published with online version   [Click]

  • Jan 04, 2024 News!

    JSW will adopt Article-by-Article Work Flow

  • Apr 01, 2024 News!

    Vol 14, No 4- Vol 14, No 12 has been indexed by IET-(Inspec)     [Click]

  • Apr 01, 2024 News!

    Papers published in JSW Vol 18, No 1- Vol 18, No 6 have been indexed by DBLP   [Click]

  • Jun 12, 2024 News!

    Vol 19, No 2 has been published with online version   [Click]