Round 1:
IBSSoftware interview question: Given String input = "jaavvaaaawooord", count the frequency of each character using Java streams.
To count the frequency of each character in a string like "jaavvaaaawooord" using Java streams, convert the string to a stream of characters, then collect it into a map in which each key is a character and each value is that character's frequency. Here's how you can do it:
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class CharacterFrequency {
    public static void main(String[] args) {
        String input = "jaavvaaaawooord";
        Map<Character, Long> frequencyMap = input.chars() // Creates an IntStream
                .mapToObj(c -> (char) c) // Converts the IntStream to Stream<Character>
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting())); // Groups by character and counts them
        frequencyMap.forEach((character, count) -> System.out.println(character + ": " + count));
    }
}
In this code:
- input.chars() creates an IntStream representing the string's characters as integers.
- .mapToObj(c -> (char) c) converts each integer to its corresponding character, resulting in a Stream<Character>.
- .collect(Collectors.groupingBy(Function.identity(), Collectors.counting())) collects the characters into a Map, where each key is a character and its value is the count of that character in the string.
- Function.identity() is used as the classifier function, which returns its input argument, and Collectors.counting() counts the number of elements in each group.
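To make the grouping step less magical, here is a rough loop-based equivalent of what groupingBy with counting() produces. It is only a sketch for comparison, not how the collector is implemented internally; the class name LoopFrequency and the method countWithLoop are made up for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

class LoopFrequency {
    // Hypothetical helper: builds the same character-to-count map with a plain loop.
    static Map<Character, Long> countWithLoop(String input) {
        Map<Character, Long> counts = new HashMap<>();
        for (char c : input.toCharArray()) {
            // Start at 1 for a new key, otherwise add 1 to the existing count.
            counts.merge(c, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        countWithLoop("jaavvaaaawooord")
                .forEach((character, count) -> System.out.println(character + ": " + count));
    }
}
```

The stream version expresses the same idea declaratively: the classifier picks the key, and the downstream collector decides how the values for each key are combined.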
Map<Character, Long> frequencyMap = input.chars() // Creates an IntStream
        .mapToObj(c -> (char) c) // Converts the IntStream to Stream<Character>
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
Why not use Map<Character, Integer> frequencyMap?
In the provided code snippet, Long is the value type of frequencyMap because Collectors.counting(), the downstream collector passed to groupingBy, produces a Long for each key (character). Declaring the result as Map<Character, Integer> therefore does not compile with this collector.
Collectors.counting() is designed to count the elements of any stream, and since a stream can be very large, it uses Long to avoid integer overflow. Even when the input is small enough to fit within the range of an Integer, the collector still returns Long for consistency.
An Integer count would overflow if the count for a particular key exceeded the maximum value an Integer can hold (2^31 - 1, approximately 2.1 billion). Long has a much larger range (2^63 - 1, approximately 9.2 quintillion), so overflow is not a practical concern. A single String cannot reach the Integer limit, since it holds at most 2^31 - 1 chars, but the collector is written for streams in general.
In general, Long is a reasonable default for counting operations, especially with potentially large data sets or streams. Stream.count() also returns long, so Long counts stay consistent with the rest of the Stream API.
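If Integer values are genuinely needed, one option is to swap the downstream collector. The sketch below is not from the original answer: it replaces Collectors.counting() with Collectors.summingInt(c -> 1), which adds 1 per element and yields an Integer. The class name IntegerFrequency and the method countAsInt are hypothetical, and this variant gives up the overflow headroom discussed above.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class IntegerFrequency {
    // Hypothetical helper: same grouping, but the values are Integer instead of Long.
    static Map<Character, Integer> countAsInt(String input) {
        return input.chars()
                .mapToObj(c -> (char) c)
                // summingInt adds 1 for every element in the group, producing an Integer total
                .collect(Collectors.groupingBy(Function.identity(), Collectors.summingInt(c -> 1)));
    }

    public static void main(String[] args) {
        countAsInt("jaavvaaaawooord")
                .forEach((character, count) -> System.out.println(character + ": " + count));
    }
}
```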
Running the CharacterFrequency program prints the frequency of each character in the string "jaavvaaaawooord": a occurs 6 times, o 3 times, v twice, and j, w, r and d once each.
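One detail worth knowing in an interview: the two-argument groupingBy makes no promise about the order of the returned map, so the lines may not print in the order the characters appear in the string. A possible variation, assuming first-appearance order is what you want, passes LinkedHashMap::new as the map factory. The class name OrderedCharacterFrequency and the method countInOrder are invented for this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class OrderedCharacterFrequency {
    // Hypothetical helper: keeps keys in the order they are first seen in the input.
    static Map<Character, Long> countInOrder(String input) {
        return input.chars()
                .mapToObj(c -> (char) c)
                // Three-argument groupingBy: classifier, map factory, downstream collector
                .collect(Collectors.groupingBy(Function.identity(), LinkedHashMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Prints {j=1, a=6, v=2, w=1, o=3, r=1, d=1}
        System.out.println(countInOrder("jaavvaaaawooord"));
    }
}
```

Passing TreeMap::new instead would sort the keys alphabetically.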
Why use mapToObj instead of map?
The choice between map and mapToObj depends on the desired output type:
mapToObj:
- Converts each int value (a UTF-16 code unit produced by chars()) to a Character object (boxed type), producing an object stream.
- Useful when you want to work with objects (e.g., for further stream operations or collecting results).
- In this case, we need a Stream<Character> to collect the characters into a map.
map:
- On an IntStream, map takes an IntUnaryOperator and maps each int to another int, so the result is still an IntStream. (mapToLong and mapToDouble target the other primitive streams.)
- Useful when you want to keep working with primitive values directly.
- If you used map here, the (char) cast would be widened straight back to int and you would still have an IntStream. IntStream has no collect(Collector) overload, so groupingBy could not be applied without boxing first.
In summary, we use mapToObj here because we want a Stream<Character> to build the frequency map. If we used map, we'd end up with an IntStream, which wouldn't serve our purpose in this case.
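The difference is easiest to see by putting the two calls side by side. This is only an illustrative sketch; the class name MapVersusMapToObj and both method names are made up for the comparison.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class MapVersusMapToObj {
    // map on an IntStream stays primitive: the char result is widened back to int.
    static int[] withMap(String input) {
        IntStream stillInts = input.chars().map(c -> (char) c);
        return stillInts.toArray();
    }

    // mapToObj boxes each value, giving an object stream that accepts any Collector.
    static List<Character> withMapToObj(String input) {
        Stream<Character> characters = input.chars().mapToObj(c -> (char) c);
        return characters.collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(withMap("java"))); // [106, 97, 118, 97]
        System.out.println(withMapToObj("java"));             // [j, a, v, a]
    }
}
```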
Another Approach
str.chars()
        .mapToObj(c -> (char) c)
        .collect(Collectors.groupingBy(c -> c, Collectors.counting()))
        .forEach((character, count) -> System.out.println(character + ":" + count));
In Java, you can't call .stream() directly on a String to achieve the same result because String does not provide a stream of its characters that way. String is not a Collection and declares no stream() method, so nothing on the class directly produces a Stream<Character>.
However, the chars() method of the String class returns an IntStream consisting of the string's char values, widened to int. These are UTF-16 code units rather than full Unicode code points; the separate codePoints() method returns the latter. The mapToObj operation can then be used to convert these values back into Character objects, creating a Stream<Character>.
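The distinction only matters for characters outside the Basic Multilingual Plane (many emoji, for example), which chars() splits into two surrogate values. As a hedged sketch of how a frequency count could handle them, the version below groups by code point and uses String keys, since a single Character cannot hold such a value. The class name CodePointFrequency and the method countCodePoints are assumptions made for this example.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class CodePointFrequency {
    // Hypothetical helper: counts whole code points, keyed by their String form.
    static Map<String, Long> countCodePoints(String input) {
        return input.codePoints()
                // Character.toChars turns one code point into one or two chars
                .mapToObj(cp -> new String(Character.toChars(cp)))
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        String input = "a\uD83D\uDE00a"; // 'a', U+1F600 (a surrogate pair), 'a'
        System.out.println(input.chars().count());           // 4 UTF-16 code units
        System.out.println(input.codePoints().count());      // 3 code points
        System.out.println(countCodePoints(input).get("a")); // 2
    }
}
```

For plain ASCII input such as "jaavvaaaawooord", chars() and codePoints() yield the same values, so the simpler Character-based version is enough.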
If you're looking for a method that more explicitly represents a sequence of characters as a stream, you can use the chars() method as shown in the previous example, or use third-party libraries or utility methods that might offer a more straightforward way to obtain a Stream<Character> from a string.
For completeness, here is how you might implement a utility method to directly obtain a Stream<Character> from a String, which under the hood still utilizes the chars() method:
import java.util.stream.Stream;

public class StreamUtils {
    // Wraps the chars() + mapToObj conversion so callers get a Stream<Character> directly
    public static Stream<Character> characterStream(String s) {
        return s.chars().mapToObj(c -> (char) c);
    }
}
And then use it in your code:
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class CharacterFrequencyWithStream {
    public static void main(String[] args) {
        String input = "jaavvaaaawooord";
        Map<Character, Long> frequencyMap = StreamUtils.characterStream(input)
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        frequencyMap.forEach((character, count) -> System.out.println(character + ": " + count));
    }
}
This approach encapsulates the conversion logic but essentially does the same thing under the hood, using chars() to get the stream of character codes and then converting those codes into a Stream<Character>.