Given a string, find the length of its longest substring with no repeating characters.
Input: abcdab
Output: 4
The given string has three substrings of length 4 with no repeating characters. They are the substrings: “abcd”, “bcda” and “cdab”. It also does not have any substring of a greater length consisting of unique characters.
Example Two
Input: aaaaa
Output: 1
• 1
• The string may contain the following ASCII characters only: a..z, A..Z, 0..9
We've provided three solutions.
We will first start with a brute force idea and will then try improving upon it using some general observations. Throughout this editorial, we are going to use str to name the input string and n for its length.
For all the solutions, we will need a container to track whether a given character has occurred in the substring or not. We will need this to make sure that no character is repeated. For this, we can use a boolean array to store the occurrences of each character in the substring (other alternatives could be unordered_map/unordered_set) as it will facilitate the O(1) look-up and update for each character.
We can one-by-one consider all the substrings of the given string. Let us keep track of the length of the longest substring with no repetitions and call it max_len.
If the length of the input string is n, then there will be n - start number of substrings starting from each index start.
Then, for every substring starting at an index start and ending at an index end (start
For example, suppose we have been given a string str = “abcdab”. Then, we will try generating all of its substrings “a”, “ab”, abc”, “abcd”, “abcda” and so on. Then, for each substring we will check if all the characters are unique in the current substring and if they are, we will check and update the value of max_len.
O(n^3).
As you might have noted, there are O(n^2) numbers of substrings for any given string of length n and traversing through each substring and checking for the unique characters using a boolean array takes O(n) time. Hence, the overall time complexity of this approach is O(n^3).
O(1).
Although we used a boolean array to store the occurrences of characters in the string, the size of that array is restricted to 256 irrespective of the size of the input string.
O(n).
Space used for input: O(n).
Auxiliary space used: O(1).
Space used for output: O(1).
So, total space complexity: O(n).
We can improve this O(n^3) solution to O(n) by observing the fact that the answer cannot be greater than 256. Reason being, the number of unique characters in any substring cannot be greater than 256. The end index of any substring will now be less than or equal to min (n - 1, start + 256 - 1). So, starting from any index start, we will just have to iterate 256 characters in the worst case. Note that we are not using the index end here. Thus, from any index start, we will do 256 iterations at most. Here we can again use a boolean array to store the occurrences of each character while iterating through any substring.
O(n).
Starting at any index start, we will iterate through 256 characters in the worst case. Therefore, the time complexity of this approach is O(n*256) = O(n).
O(1).
Space used by the Boolean array: O(1).
O(n).
Space used for input: O(n).
Auxiliary space used: O(1).
Space used for output: O(1).
So, total space complexity: O(n).
We can also use the sliding window approach to solve this problem in O(n) time. The previous solution had a constant factor of around 256 in the time complexity. We can reduce this constant factor considerably using the sliding window technique.
We maintain two endpoints start and end of the substring under consideration (start = end = 0 initially).
We move the index end to its right while there is no repetition (we will check for the repetition through the boolean array that we have been maintaining similar to the first two solutions).
Once there is a repetition, we will move the start index until that repetition is removed. Basically, here, str[end] will point to a character that is found elsewhere in the current substring. Therefore, to move this repetition out of our substring, we will squeeze the substring from the left until we remove the first occurrence of str[end] from our substring. Reason being, without removing this repetition, we cannot move ahead as we consider only the substring with all unique characters.
Let us now see a simulation of this idea for str = “abcdab”.
For the sake of simplicity, we will not show the complete boolean array of size 256. Let us call that array character_set. Here, we will only denote the characters for which the character_set holds a true value.
Step 0: Initially, we have start = 0 and end = 0. Also, the character set is initially empty as it has all the values as false. Hence, character_set = {}. Also, the result max_len = 0 initially.
Step 1: We check that str[end] is not present in the character set, hence we include it in the character set and widen up our window from the right. So, we increment end by 1.
Currently, we have:
[ab]cdab
character_set = {‘a’}
max_len = 1
Here, we represent the sliding window with a set of matching brackets. The starting bracket is placed before the index start and ending bracket is placed after index end.
Step 2:
[abc]dab
character_set = {‘a’, ‘b’}
max_len = 2
Again, str[end] is not present in the character_set, hence we include it in the character_set and widen up our window from the right.
Step 3:
[abcd]ab
character_set = {‘a’, ‘b’, ‘c’}
max_len = 3
Again, str[end] is not present in the character_set, hence we include it in the character_set and widen up our window from the right.
Step 4:
[abcda]b
character_set = {‘a’, ‘b’, ‘c’, ‘d’}
max_len = 4
Again, str[end] is not present in the character_set, hence we include it in the character_set and widen up our window from the right.
Step 5:
a[bcdab]
character_set = {‘a’, ‘b’, ‘c’, ‘d’}
max_len = 4
Here, str[end] = ‘a’ was already present in the character_set. So, we squeeze our window from the left and its first occurrence is removed from the string. After that, we again widen up our window from the right to check for the next character.
Step 6:
ab[cdab]
character_set = {‘a’, ‘b’, ‘c’, ‘d’}
max_len = 4
Here, str[end] = ‘b’ was already present in the character_set. So, we squeeze our window from the left and its first occurrence is removed from the string.
Now, end = 6 and thus, we have considered the complete string as we are no longer left with a different ending index of the substring. Therefore, we stop and report the final result as max_len = 4.
O(n).
As we will iterate through any character at most twice. Once, via start and once via end.
O(1).
As we used only some additional integer variables.
O(n).
Space used for input: O(n).
Auxiliary space used: O(1).
Space used for output: O(1).
So, total space complexity: O(n).