Register for our webinar

How to Nail your next Technical Interview

1 hour
Loading...
1
Enter details
2
Select webinar slot
*Invalid Name
*Invalid Name
By sharing your contact details, you agree to our privacy policy.
Step 1
Step 2
Congratulations!
You have registered for our webinar
check-mark
Oops! Something went wrong while submitting the form.
1
Enter details
2
Select webinar slot
*All webinar slots are in the Asia/Kolkata timezone
Step 1
Step 2
check-mark
Confirmed
You are scheduled with Interview Kickstart.
Redirecting...
Oops! Something went wrong while submitting the form.
close-icon
Iks white logo

You may be missing out on a 66.5% salary hike*

Nick Camilleri

Head of Career Skills Development & Coaching
*Based on past data of successful IK students
Iks white logo
Help us know you better!

How many years of coding experience do you have?

Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.
Iks white logo

FREE course on 'Sorting Algorithms' by Omkar Deshpande (Stanford PhD, Head of Curriculum, IK)

Thank you! Please check your inbox for the course details.
Oops! Something went wrong while submitting the form.
closeAbout usWhy usInstructorsReviewsCostFAQContactBlogRegister for Webinar
Our June 2021 cohorts are filling up quickly. Join our free webinar to Uplevel your career
close

Indices Of Words In Text String Problem

Indices Of Words In Text String Problem Statement

Given some text and a bunch of words, find where each of the words appear in the text.

Function must return a list of lists of integers. One list for each one of the words. i-th list must contain all indices of characters in text where i-th word starts, in the ascending order. If i-th word isn’t found in the text at all, then i-th list must be [-1].

Example

{
"text": "you are very very smart",
"words": ["you", "are", "very", "handsome"]
}

Output:

[ [0], [4], [8, 13], [-1] ]

"you" is found in the given text starting at the index 0.\ "are" is found in the given text starting at the index 4.\ "very" is found in the given text two times, starting at the indices 8 and 13.\ "handsome" isn’t found in the given text.

Notes

Constraints:

  • text and words may contain a-z, A-Z, 0-9, "\$", "#", "@", "?", ";".
  • text may contain spaces, too, but never two or more spaces consecutively. Spaces separate words in the text string.
  • text won’t start or end with a space.
  • Indexing of characters in text is zero-based.
  • words list will contain unique strings.
  • 1 <= number of characters in text <= 1000000
  • 1 <= number of words <= 100000
  • 1 <= length of any word in words or in text <= 10

We provided three solutions. We will refer to the number of words in text as n, number of words in words as w and the maximum length of a word as l.

Indices Of Words In Text String Solution 1: Brute Force

We literally compare each word from words with every word from the text. When the two are equal we take note of the start index of the word in the text.

Time Complexity

O(n * w * l).

Processing each pair of words takes O(l) time as we compare them character by character. We compare each of n words with every one of w words for the total time complexity of O(n *w * l).

Auxiliary Space Used

O(n + w).

The list of lists that we return takes O(w + n) space. Since words are unique, any given word from text can match at most one word from words so the total number of indexes in the returned list of lists won’t exceed n and we know that the outer list has exactly w lists, giving us a total of O(w + n).

Space Complexity

O((n + w) * l).

Adding up O(w * l) of words, O(n * l) of text and O(w + n) of the auxiliary space we get O((n + w) * l).

Code For Indices Of Words In Text String Solution 1: Brute Force

    /*
    * Asymptotic complexity in terms of number of words in \`text\` \`n\`, number of words in \`words\` \`w\`,
        and the maximum length of a word \`l\`:
    * Time: O(n * w * l).
    * Auxiliary space: O(n + w).
    * Total space: O((n + w) * l).
    */

    static ArrayList<ArrayList<Integer>> find_words(String text, ArrayList<String> words) {
        ArrayList<ArrayList<Integer>> answer = new ArrayList<>();
        String[] wordsInText = text.split(" ");
        for (String word : words) {
            ArrayList<Integer> indexes = new ArrayList<>();
            int index = 0;
            for (String s : wordsInText) {
                if (s.compareTo(word) == 0) {
                    indexes.add(index);
                }
                index += s.length() + 1;
            }
            if (indexes.isEmpty()) {
                indexes.add(-1);
            }
            answer.add(indexes);
        }
        return answer;
    }

Indices Of Words In Text String Solution 2: 1St Optimal

In this solution we preprocess text and create its index, see textIndex variable. By the time that’d done, each word from the text has the list of its starting indices pre-compiled. All that’s left is to look up those lists of indexes for every word from words.

Time Complexity

O((n + w) * l).

It takes O(I) time to calculate hashcode or to compare two strings up to l characters long. Thus populating the hashmap with n words will take O(n * l), making w searches in that hashmap will take O(w * l). Total time is the sum of those: O(n * l) + O(w * l) = O((n + w) * l).

Auxiliary Space Used

O((n * l) + w).

Hashmap which we pre-compute takes O(n * l) space.

The list of lists that we return takes O(w + n) space (see explanation in Auxiliary space section for bruteforcesolution.java). Summing up the two gives O(n * l) + O(w + n) = O((n * l) + w).

Space Complexity

O((n + w) * l).

text input takes O(n * l) and words take O(w * l). Adding up those two and the auxiliary space we get the total space complexity: O(n * l) + O(w * l) + O((n * l) + w) = O((n + w) * l).

Code For Indices Of Words In Text String Solution 2: 1St Optimal

    /*
    * Asymptotic complexity in terms of number of words in \`text\` \`n\`, number of words in \`words\` \`w\`,
        and the maximum length of a word \`l\`:
    * Time: O((n + w) * l).
    * Auxiliary space: O((n * l) + w).
    * Total space: O((n + w) * l).
    */

    static ArrayList<ArrayList<Integer>> find_words(String text, ArrayList<String> words) {
        String[] wordsInText = text.split(" ");

        // {word -> [index1, index2]}
        HashMap<String, ArrayList<Integer>> textIndex = new HashMap<>();
        int currentIndex = 0;
        for (String word : wordsInText) {
            ArrayList<Integer> indexes = textIndex.get(word);
            if (indexes == null) {
                indexes = new ArrayList<>();
            }
            indexes.add(currentIndex);
            textIndex.put(word, indexes);
            currentIndex += word.length() + 1;
        }

        ArrayList<ArrayList<Integer>> answer = new ArrayList<>();
        for (String word : words) {
            ArrayList<Integer> indexes = textIndex.get(word);
            if (indexes == null) {
                indexes = new ArrayList<>(Collections.singleton(-1));
            }
            answer.add(indexes);
        }

        return answer;
    }

Indices Of Words In Text String Solution 3: 2Nd Optimal

In this solution we use a trie (prefix tree). First we insert all words from the text into the trie. Then we look up every word from words in the trie.

Although this solution has the same worst case time and space complexity as the hashmap based optimal_solution1.java, it will utilize less space when many words share common prefixes.

In the actual interview many interviewers will prefer to hear the trie based solution to the hashmap based one.

Time Complexity

O((n + w) * l).

Insert and search operations in the trie take O(l) time each.

The algorithm makes n insertions and w searches.

Auxiliary Space Used

O(n * l + w).

Trie takes O(n * l) space and the list of lists that we return takes O(w + n) (see explanation in Auxiliary space section for bruteforcesolution.java). Adding that up we get O(n * l + w).

Space Complexity

O((n + w) * l).

text input takes O(n * l) and words take O(w * l). Adding auxiliary space to that we get the total space complexity: O(n * l) + O(w * l) + O(n * l + w) = O((n + w) * l).

Code For Indices Of Words In Text String Solution 3: 2Nd Optimal

    /*
    * Asymptotic complexity in terms of number of words in \`text\` \`n\`, number of words in \`words\` \`w\`,
        and the maximum length of a word \`l\`:
    * Time: O((n + w) * l).
    * Auxiliary space: O(n * l + w).
    * Total space: O((n + w) * l).
    */

    static class TrieNode {
        final HashMap<Character, TrieNode> children = new HashMap<>();

        /**
         * For every word in the text that *ends* in this node,
         * index of where that word *starts* in the text will be in this list.
         */
        final ArrayList<Integer> indexes = new ArrayList<>();
    }

    static ArrayList<ArrayList<Integer>> find_words(String text, ArrayList<String> words) {
        TrieNode root = new TrieNode(); // Root ~ empty string => no associated character.
        String[] wordsInText = text.split(" ");
        int index = 0;
        for (String word : wordsInText) {
            insertIntoTrie(root, word, index);
            index += word.length() + 1;
        }

        ArrayList<ArrayList<Integer>> answer = new ArrayList<>();
        for (String word : words) {
            answer.add(findWord(root, word));
        }
        return answer;
    }

    static void insertIntoTrie(TrieNode root, String word, int index) {
        TrieNode head = root;
        for (char c : word.toCharArray()) {
            TrieNode node = head.children.get(c);
            if (node != null) {
                head = node;
            } else {
                node = new TrieNode();
                head.children.put(c, node);
                head = node;
            }
        }
        head.indexes.add(index);
    }

    static ArrayList<Integer> findWord(TrieNode root, String word) {
        TrieNode head = root;
        for (char c : word.toCharArray()) {
            TrieNode node = head.children.get(c);
            if (node != null) {
                head = node;
            } else {
                return new ArrayList<>(Collections.singleton(-1));
            }
        }
        return head.indexes.isEmpty() ?
                new ArrayList<>(Collections.singleton(-1)) : head.indexes;
    }

We hope that these solutions to indices of words in text string problem have helped you level up your coding skills. You can expect problems like these at top tech companies like Amazon and Google.

If you are preparing for a tech interview at FAANG or any other Tier-1 tech company, register for Interview Kickstart's FREE webinar to understand the best way to prepare.

Interview Kickstart offers interview preparation courses taught by FAANG+ tech leads and seasoned hiring managers. Our programs include a comprehensive curriculum, unmatched teaching methods, and career coaching to help you nail your next tech interview.

We offer 18 interview preparation courses, each tailored to a specific engineering domain or role, including the most in-demand and highest-paying domains and roles, such as:

‍To learn more, register for the FREE webinar.

Try yourself in the Editor

Note: Input and Output will already be taken care of.

Indices Of Words In Text String Problem

Indices Of Words In Text String Problem Statement

Given some text and a bunch of words, find where each of the words appear in the text.

Function must return a list of lists of integers. One list for each one of the words. i-th list must contain all indices of characters in text where i-th word starts, in the ascending order. If i-th word isn’t found in the text at all, then i-th list must be [-1].

Example

{
"text": "you are very very smart",
"words": ["you", "are", "very", "handsome"]
}

Output:

[ [0], [4], [8, 13], [-1] ]

"you" is found in the given text starting at the index 0.\ "are" is found in the given text starting at the index 4.\ "very" is found in the given text two times, starting at the indices 8 and 13.\ "handsome" isn’t found in the given text.

Notes

Constraints:

  • text and words may contain a-z, A-Z, 0-9, "\$", "#", "@", "?", ";".
  • text may contain spaces, too, but never two or more spaces consecutively. Spaces separate words in the text string.
  • text won’t start or end with a space.
  • Indexing of characters in text is zero-based.
  • words list will contain unique strings.
  • 1 <= number of characters in text <= 1000000
  • 1 <= number of words <= 100000
  • 1 <= length of any word in words or in text <= 10

We provided three solutions. We will refer to the number of words in text as n, number of words in words as w and the maximum length of a word as l.

Indices Of Words In Text String Solution 1: Brute Force

We literally compare each word from words with every word from the text. When the two are equal we take note of the start index of the word in the text.

Time Complexity

O(n * w * l).

Processing each pair of words takes O(l) time as we compare them character by character. We compare each of n words with every one of w words for the total time complexity of O(n *w * l).

Auxiliary Space Used

O(n + w).

The list of lists that we return takes O(w + n) space. Since words are unique, any given word from text can match at most one word from words so the total number of indexes in the returned list of lists won’t exceed n and we know that the outer list has exactly w lists, giving us a total of O(w + n).

Space Complexity

O((n + w) * l).

Adding up O(w * l) of words, O(n * l) of text and O(w + n) of the auxiliary space we get O((n + w) * l).

Code For Indices Of Words In Text String Solution 1: Brute Force

    /*
    * Asymptotic complexity in terms of number of words in \`text\` \`n\`, number of words in \`words\` \`w\`,
        and the maximum length of a word \`l\`:
    * Time: O(n * w * l).
    * Auxiliary space: O(n + w).
    * Total space: O((n + w) * l).
    */

    static ArrayList<ArrayList<Integer>> find_words(String text, ArrayList<String> words) {
        ArrayList<ArrayList<Integer>> answer = new ArrayList<>();
        String[] wordsInText = text.split(" ");
        for (String word : words) {
            ArrayList<Integer> indexes = new ArrayList<>();
            int index = 0;
            for (String s : wordsInText) {
                if (s.compareTo(word) == 0) {
                    indexes.add(index);
                }
                index += s.length() + 1;
            }
            if (indexes.isEmpty()) {
                indexes.add(-1);
            }
            answer.add(indexes);
        }
        return answer;
    }

Indices Of Words In Text String Solution 2: 1St Optimal

In this solution we preprocess text and create its index, see textIndex variable. By the time that’d done, each word from the text has the list of its starting indices pre-compiled. All that’s left is to look up those lists of indexes for every word from words.

Time Complexity

O((n + w) * l).

It takes O(I) time to calculate hashcode or to compare two strings up to l characters long. Thus populating the hashmap with n words will take O(n * l), making w searches in that hashmap will take O(w * l). Total time is the sum of those: O(n * l) + O(w * l) = O((n + w) * l).

Auxiliary Space Used

O((n * l) + w).

Hashmap which we pre-compute takes O(n * l) space.

The list of lists that we return takes O(w + n) space (see explanation in Auxiliary space section for bruteforcesolution.java). Summing up the two gives O(n * l) + O(w + n) = O((n * l) + w).

Space Complexity

O((n + w) * l).

text input takes O(n * l) and words take O(w * l). Adding up those two and the auxiliary space we get the total space complexity: O(n * l) + O(w * l) + O((n * l) + w) = O((n + w) * l).

Code For Indices Of Words In Text String Solution 2: 1St Optimal

    /*
    * Asymptotic complexity in terms of number of words in \`text\` \`n\`, number of words in \`words\` \`w\`,
        and the maximum length of a word \`l\`:
    * Time: O((n + w) * l).
    * Auxiliary space: O((n * l) + w).
    * Total space: O((n + w) * l).
    */

    static ArrayList<ArrayList<Integer>> find_words(String text, ArrayList<String> words) {
        String[] wordsInText = text.split(" ");

        // {word -> [index1, index2]}
        HashMap<String, ArrayList<Integer>> textIndex = new HashMap<>();
        int currentIndex = 0;
        for (String word : wordsInText) {
            ArrayList<Integer> indexes = textIndex.get(word);
            if (indexes == null) {
                indexes = new ArrayList<>();
            }
            indexes.add(currentIndex);
            textIndex.put(word, indexes);
            currentIndex += word.length() + 1;
        }

        ArrayList<ArrayList<Integer>> answer = new ArrayList<>();
        for (String word : words) {
            ArrayList<Integer> indexes = textIndex.get(word);
            if (indexes == null) {
                indexes = new ArrayList<>(Collections.singleton(-1));
            }
            answer.add(indexes);
        }

        return answer;
    }

Indices Of Words In Text String Solution 3: 2Nd Optimal

In this solution we use a trie (prefix tree). First we insert all words from the text into the trie. Then we look up every word from words in the trie.

Although this solution has the same worst case time and space complexity as the hashmap based optimal_solution1.java, it will utilize less space when many words share common prefixes.

In the actual interview many interviewers will prefer to hear the trie based solution to the hashmap based one.

Time Complexity

O((n + w) * l).

Insert and search operations in the trie take O(l) time each.

The algorithm makes n insertions and w searches.

Auxiliary Space Used

O(n * l + w).

Trie takes O(n * l) space and the list of lists that we return takes O(w + n) (see explanation in Auxiliary space section for bruteforcesolution.java). Adding that up we get O(n * l + w).

Space Complexity

O((n + w) * l).

text input takes O(n * l) and words take O(w * l). Adding auxiliary space to that we get the total space complexity: O(n * l) + O(w * l) + O(n * l + w) = O((n + w) * l).

Code For Indices Of Words In Text String Solution 3: 2Nd Optimal

    /*
    * Asymptotic complexity in terms of number of words in \`text\` \`n\`, number of words in \`words\` \`w\`,
        and the maximum length of a word \`l\`:
    * Time: O((n + w) * l).
    * Auxiliary space: O(n * l + w).
    * Total space: O((n + w) * l).
    */

    static class TrieNode {
        final HashMap<Character, TrieNode> children = new HashMap<>();

        /**
         * For every word in the text that *ends* in this node,
         * index of where that word *starts* in the text will be in this list.
         */
        final ArrayList<Integer> indexes = new ArrayList<>();
    }

    static ArrayList<ArrayList<Integer>> find_words(String text, ArrayList<String> words) {
        TrieNode root = new TrieNode(); // Root ~ empty string => no associated character.
        String[] wordsInText = text.split(" ");
        int index = 0;
        for (String word : wordsInText) {
            insertIntoTrie(root, word, index);
            index += word.length() + 1;
        }

        ArrayList<ArrayList<Integer>> answer = new ArrayList<>();
        for (String word : words) {
            answer.add(findWord(root, word));
        }
        return answer;
    }

    static void insertIntoTrie(TrieNode root, String word, int index) {
        TrieNode head = root;
        for (char c : word.toCharArray()) {
            TrieNode node = head.children.get(c);
            if (node != null) {
                head = node;
            } else {
                node = new TrieNode();
                head.children.put(c, node);
                head = node;
            }
        }
        head.indexes.add(index);
    }

    static ArrayList<Integer> findWord(TrieNode root, String word) {
        TrieNode head = root;
        for (char c : word.toCharArray()) {
            TrieNode node = head.children.get(c);
            if (node != null) {
                head = node;
            } else {
                return new ArrayList<>(Collections.singleton(-1));
            }
        }
        return head.indexes.isEmpty() ?
                new ArrayList<>(Collections.singleton(-1)) : head.indexes;
    }

We hope that these solutions to indices of words in text string problem have helped you level up your coding skills. You can expect problems like these at top tech companies like Amazon and Google.

If you are preparing for a tech interview at FAANG or any other Tier-1 tech company, register for Interview Kickstart's FREE webinar to understand the best way to prepare.

Interview Kickstart offers interview preparation courses taught by FAANG+ tech leads and seasoned hiring managers. Our programs include a comprehensive curriculum, unmatched teaching methods, and career coaching to help you nail your next tech interview.

We offer 18 interview preparation courses, each tailored to a specific engineering domain or role, including the most in-demand and highest-paying domains and roles, such as:

‍To learn more, register for the FREE webinar.

Worried About Failing Tech Interviews?

Attend our free webinar to amp up your career and get the salary you deserve.

Ryan-image
Hosted By
Ryan Valles
Founder, Interview Kickstart
blue tick
Accelerate your Interview prep with Tier-1 tech instructors
blue tick
360° courses that have helped 14,000+ tech professionals
blue tick
100% money-back guarantee*
Register for Webinar
All Blog Posts