Find All Occurrences of Multiple Patterns in Java

In this article, we'll look at how to find all occurrences of multiple patterns in Java. In Java, the 'Matcher' class applies a constructed regular expression pattern to input text and performs different matching operations. It is retrieved by using the 'matcher()' method with a 'Pattern' object. The 'Matcher' class includes methods like 'find()','start()', 'end()', and others that allow developers to determine the occurrences and positions of pattern matches in the supplied text.

Example 1:

Input: String text = "This is a sample text with multiple patterns to match."

String[] patterns = {"sample", "patterns", "match"}

Output: [1, 1, 1]

Explanation:

"sample" appears at index 10-16.

The pattern "sample" appears in the input text.

The match begins with index 10, which corresponds to the letter 's' in the word "sample."

The match ends at index 16, which corresponds to the letter 'e' in the word "sample."

As a result, the input text's substring from index 10 to 16 is "sample."

The pattern "sample" appears once in the input text.

"patterns" appears at index 28-36.

The pattern "patterns" appears in the input text.

The match begins with index 28, which corresponds to the letter 'p' in the word "patterns."

The match ends at index 36, which corresponds to the letter 's' in the word "patterns."

As a result, the input text's substring from index 28 to 36 is "patterns."

The pattern "patterns" appears once in the input text.

"match" appears at index 42-47.

The pattern "match" appears in the input text.

The match begins with index 42, which corresponds to the letter 'm' in the word "match."

The match ends at index 47, which corresponds to the letter 'h' in the word "match."

As a result, the input text's substring from index 42 to 47 is "match."

The pattern "match" appears once in the input text.

As a result, Final output is [1,1,1].

Approach: Iterative

The code iterates through a series of predetermined patterns, converts each pattern into a regular expression, and then counts the number of times these patterns appear in a given text. The counts are saved in an array.

FileName: MultiPatternMatcher.java

import java.util.regex.Matcher;

import java.util.regex.Pattern;

public class MultiPatternMatcher {

    public static void main(String[] args) {

        String text = "This is a sample text with multiple patterns to match.";

        // Define multiple patterns

        String[] patterns = {"sample", "patterns", "match"};

        // Initialize an array to store counts

        int k = patterns.length;  // Number of patterns

        int[] patternCounts = new int[k];

        // Length of the input text

        int n = text.length();

        // Loop through each pattern

        for (int i = 0; i < k; i++) {

            // Length of the current pattern

            int m = patterns[i].length();

            Pattern pattern = Pattern.compile(patterns[i]);

            Matcher matcher = pattern.matcher(text);

            //iterator approach to count occurrences

            while (matcher.find()) {

                patternCounts[i]++;

            }

        }

        // Display the counts as an array

        System.out.print("[");

        for (int i = 0; i < k; i++) {

            System.out.print(patternCounts[i]);

            if (i < k - 1) {

                System.out.print(", ");

            }

        }

        System.out.println("]");

    }

}

Output:

[1, 1, 1]

Complexity Analysis:

The provided Java code has a time complexity of O(k * m * n), where k represents the number of patterns, m the length of each pattern, and n the length of the input text. The space complexity is O(k + m + n), owing mostly to the array patternCounts and the storage requirements for Pattern and Matcher objects.

Approach: Parallel Processing

Parallel processing is used to count the occurrences of patterns in the input text. IntStream.range(0, patterns.length).parallel().forEach(...) method parallelizes the loop, allowing many threads to process various patterns at the same time. When occurrences of a certain pattern are detected, each thread generates a Matcher for it and increments the associated count in the patternCounts array.

FileName: MultiPatternMatcher1.java

import java.util.regex.Matcher;

import java.util.regex.Pattern;

import java.util.concurrent.atomic.AtomicInteger;

import java.util.stream.IntStream;

public class MultiPatternMatcher1 {

    public static void main(String[] args) {

        String text = "This is a sample text with multiple patterns to match.";

        // Define multiple patterns

        String[] patterns = {"sample", "patterns", "match"};

        // Initialize an array to store counts

        AtomicInteger[] patternCounts = new AtomicInteger[patterns.length];

        IntStream.range(0, patterns.length).forEach(i -> patternCounts[i] = new AtomicInteger(0));

        // Compile patterns outside the loop

        Pattern[] compiledPatterns = IntStream.range(0, patterns.length)

                .mapToObj(i -> Pattern.compile(patterns[i]))

                .toArray(Pattern[]::new);

        // Loop through each pattern using parallel processing

        IntStream.range(0, patterns.length).parallel().forEach(i -> {

            Matcher matcher = compiledPatterns[i].matcher(text);

            while (matcher.find()) {

                patternCounts[i].incrementAndGet();

            }

        });

        // Display the counts as an array

        System.out.print("[");

        IntStream.range(0, patterns.length).forEach(i -> {

            System.out.print(patternCounts[i].get());

            if (i < patterns.length - 1) {

                System.out.print(", ");

            }

        });

        System.out.println("]");

    }

}

Output:

[1, 1, 1]

Complexity Analysis:

The code has a time complexity of O(k * n), where k is the number of patterns and n is the input text's length. The space complexity is O(k + k * m), with m representing the average length of the patterns. The method effectively counts the occurrences of numerous patterns through parallel processing, which can improve speed by exploiting several processor cores.