Java 8 Streams for String manipulation

I want to perform multiple tasks on a single string.
I need to get a string and extract different sub-strings using a delimiter ("/"), then reverse the list of sub-strings and finally join them using another delimiter (".") such that /tmp/test/hello/world/ would turn into: world.hello.test.tmp
Using Java 7 the code is as follows:
String str ="/tmp/test/";
List<String> elephantList = new ArrayList<String>(Arrays.asList(str.split("/")));
StringBuilder sb = new StringBuilder();
for (int i=elephantList.size()-1; i>-1; i--) {
String a = elephantList.get(i);
if (a.equals(""))
{
elephantList.remove(i);
}
else
{
sb.append(a);
sb.append('.');
}
}
sb.setLength(sb.length() - 1);
System.out.println("result" + elephantList + " " + sb.toString());
I was wondering how I could do the same thing using Java 8 streams and the join function it has for Strings

The most straightforward way is to collect the terms into a list, reverse the list and join on the new delimiter:
import java.util.*;
import java.util.regex.Pattern;
import static java.util.stream.Collectors.toCollection;
List<String> terms = Pattern.compile("/")
.splitAsStream(str)
.filter(s -> !s.isEmpty())
.collect(toCollection(ArrayList::new));
Collections.reverse(terms);
String result = String.join(".", terms);
You can do it without collecting into an intermediate list but it will be less readable and not worth the trouble for practical purposes.
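For reference, here is the list-based version wrapped into a small runnable class. This is only a sketch: the class name ReverseJoinDemo and the method name reverseJoin are made up for illustration, and the second input is an extra case added to show that repeated slashes are filtered out as well.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical demo class; the names are not from the original post.
class ReverseJoinDemo {
    private static final Pattern SLASH = Pattern.compile("/");

    // Splits on "/", drops empty segments, reverses the list and joins with ".".
    static String reverseJoin(String path) {
        List<String> terms = SLASH.splitAsStream(path)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toCollection(ArrayList::new));
        Collections.reverse(terms);
        return String.join(".", terms);
    }

    public static void main(String[] args) {
        System.out.println(reverseJoin("/tmp/test/hello/world/")); // world.hello.test.tmp
        System.out.println(reverseJoin("//a///b/"));               // b.a
    }
}
```

An input without any non-empty segment (such as "/") simply yields an empty string here, because String.join over an empty list returns "".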
Another issue to consider is that your strings appear to be paths. It is usually better to use the Path class rather than splitting by "/" manually. Here's how you would do this, assuming a static import of Collectors.joining (this approach also demonstrates how to use an IntStream over indexes to stream over a list backwards):
Path p = Paths.get(str);
result = IntStream.rangeClosed(1, p.getNameCount())
.map(i -> p.getNameCount() - i) // becomes a stream of count-1 to 0
.mapToObj(p::getName)
.map(Path::toString)
.collect(joining("."));
This will have the advantage of being OS-independent.

If you do not want an intermediate list and just want to join the strings in reverse order:
String delimiter = ".";
Optional<String> result = Pattern.compile("/")
.splitAsStream(str)
.filter(s -> ! s.isEmpty())
.reduce((s, s2) -> String.join(delimiter, s2, s));
Or just use .reduce((s1, s2) -> s2 + '.' + s1); as it is probably as readable as String.join(".", s2, s1); (thanks Holger for the suggestion).
From then on you could do one of the following:
result.ifPresent(System.out::println); // print the result
String resultAsString = result.orElse(""); // get the value or default to empty string
resultAsString = result.orElseThrow(() -> new RuntimeException("not a valid path?")); // get the value or throw an exception
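Put together, the reduce-based variant could look like the sketch below. The class and method names are hypothetical. The point it tries to show is that the Optional is empty when the input contains no non-empty segment, which is why one of the three calls above is needed.

```java
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are not from the original post.
class ReduceReverseDemo {
    // Each new segment is placed in front of what has been accumulated so far.
    static Optional<String> reverseJoin(String path) {
        return Pattern.compile("/")
                .splitAsStream(path)
                .filter(s -> !s.isEmpty())
                .reduce((s1, s2) -> s2 + '.' + s1);
    }

    public static void main(String[] args) {
        System.out.println(reverseJoin("/tmp/test/hello/world/").orElse("")); // world.hello.test.tmp
        System.out.println(reverseJoin("/").isPresent());                     // false
    }
}
```

The function passed to reduce is associative (both groupings of a, b, c give c.b.a), which is what reduce requires.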
Another way using StreamSupport and Spliterator (inspired by Misha's suggestion to use a Path):
Optional<String> result = StreamSupport.stream(Paths.get(str).spliterator(), false)
.map(Path::getFileName)
.map(Path::toString)
.reduce((s, s2) -> s2 + '.' + s);
Of course you can simplify it by omitting the intermediate Optional variable and calling your desired method immediately:
stream(get(str).spliterator(), false)
.map(Path::getFileName)
.map(Path::toString)
.reduce((s, s2) -> s2 + '.' + s)
.ifPresent(out::println); // orElse... orElseThrow
In the last example you would add the following static imports:
import static java.lang.System.out;
import static java.nio.file.Paths.get;
import static java.util.stream.StreamSupport.stream;

Your Java 7 code isn’t what I’d call a straightforward solution.
This is how I would implement it in Java 7:
String str = "/tmp/test/";
StringBuilder sb = new StringBuilder(str.length()+1);
for(int s=str.lastIndexOf('/'), e=str.length(); e>=0; e=s, s=str.lastIndexOf('/', e-1)) {
if(s+1<e) sb.append(str, s+1, e).append('.');
}
if(sb.length()>0) sb.setLength(sb.length() - 1);
System.out.println("result " + sb);
And thinking about it again, this is also how I’d implement it in Java 8, as using the Stream API doesn’t really improve this operation.

You can write it like this:
String newPath = Arrays.stream(path.split("/"))
.filter(x -> !x.isEmpty())
.reduce("", (cc, ss) -> {
if (cc.isEmpty()) return ss;
if (ss.isEmpty()) return cc;
return ss + "." + cc;
});
The filter removes the empty strings produced by leading or repeated slashes (split already drops trailing ones), and the reduce function checks for an empty operand so that no stray "." is added.

Related

Building a binary tree using Java Streams. Is it possible to sort while reducing in a Java Stream?

I want to build a Huffman tree from input string using Java Stream.
This is how I do it right now.
Class MyNode with all needed Constructors:
public static class MyNode {
Character value;
MyNode left;
MyNode right;
long freq;
...
}
Reading a line and getting List of MyNodes:
Scanner scan = new Scanner(System.in);
String input = scan.next();
List<MyNode> listOfNodes = input.chars().boxed()
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
.entrySet()
.stream().sorted(Comparator.comparingLong(Map.Entry::getValue))
.map(x -> new MyNode((char)x.getKey().intValue(), x.getValue()))
.collect(Collectors.toList());
This while loop I want to replace with something from Stream:
while (listOfNodes.size() > 1) {
MyNode first = listOfNodes.get(0);
MyNode second = listOfNodes.get(1);
listOfNodes.remove(first);
listOfNodes.remove(second);
listOfNodes.add(new MyNode(first.freq + second.freq, first, second));
listOfNodes.sort(Comparator.comparingLong(MyNode::getFreq));
}
The while loop builds the tree by repeatedly merging the two lowest-frequency nodes.
My first idea was to use Stream.reduce, but then I would need to sort the resulting list after every reduction step.
This is not a task that benefits from using the Stream API. Still, there are ways to improve it.
Sorting the entire list just to insert a single element bears unnecessary overhead. Since the list is sorted to begin with, you can use binary search to efficiently find the correct insertion position so that the list stays sorted:
while(listOfNodes.size() > 1) {
MyNode first = listOfNodes.remove(0), second = listOfNodes.remove(0);
MyNode newNode = new MyNode(first.freq + second.freq, first, second);
int pos = Collections.binarySearch(listOfNodes, newNode,
Comparator.comparingLong(MyNode::getFreq));
listOfNodes.add(pos<0? -pos-1: pos, newNode);
}
Note that you could make this code more efficient by reversing the order so that you will remove from the end of the list (which will be an ArrayList in practice).
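A rough sketch of that reversed variant follows. The Node class is a hypothetical stand-in for MyNode (whose body is elided in the question), and the list is kept in descending order of frequency so that the two smallest nodes always sit at the end of the ArrayList. It assumes a non-empty input string.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical demo class; names are illustrative only.
class HuffmanFromEnd {
    // Stand-in for the question's MyNode; value is null for internal nodes.
    static final class Node {
        final Character value;
        final long freq;
        final Node left, right;
        Node(char value, long freq) {
            this.value = value; this.freq = freq; this.left = null; this.right = null;
        }
        Node(long freq, Node left, Node right) {
            this.value = null; this.freq = freq; this.left = left; this.right = right;
        }
        long getFreq() { return freq; }
    }

    // Builds the tree; assumes input is not empty.
    static Node build(String input) {
        Comparator<Node> descending = Comparator.comparingLong(Node::getFreq).reversed();
        List<Node> nodes = input.chars().boxed()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
                .entrySet().stream()
                .map(e -> new Node((char) e.getKey().intValue(), e.getValue()))
                .sorted(descending)
                .collect(Collectors.toCollection(ArrayList::new));
        while (nodes.size() > 1) {
            // Removing the last element of an ArrayList does not shift any elements.
            Node first = nodes.remove(nodes.size() - 1);
            Node second = nodes.remove(nodes.size() - 1);
            Node merged = new Node(first.freq + second.freq, first, second);
            // The search must use the same (descending) order as the list.
            int pos = Collections.binarySearch(nodes, merged, descending);
            nodes.add(pos < 0 ? -pos - 1 : pos, merged);
        }
        return nodes.get(0);
    }

    public static void main(String[] args) {
        System.out.println(build("aaabbc").getFreq()); // 6, the length of the input
    }
}
```

The insertion itself can still shift elements, so this only removes the cost of the two remove(0) calls; the PriorityQueue approach avoids both.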
But the better alternative is to use a data structure which is sorted to begin with, e.g.
PriorityQueue<MyNode> queueOfNodes = input.chars().boxed()
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
.entrySet().stream()
.map(x -> new MyNode((char)x.getKey().intValue(), x.getValue()))
.collect(Collectors.toCollection(
() -> new PriorityQueue<>(Comparator.comparingLong(MyNode::getFreq))));
MyNode result = queueOfNodes.remove();
while(!queueOfNodes.isEmpty()) {
MyNode second = queueOfNodes.remove();
queueOfNodes.add(new MyNode(result.freq + second.freq, result, second));
result = queueOfNodes.remove();
}

Producing histogram Map for IntStream raises compile-time-error

I'm interested in building a Huffman Coding prototype. To that end, I want to begin by producing a histogram of the characters that make up an input Java String. I've seen many solutions on SO and elsewhere that depend on using the collect() method for Streams as well as static imports of Function.identity() and Collectors.counting() in a very specific and intuitive way.
However, when using a piece of code eerily similar to those solutions:
private List<HuffmanTrieNode> getCharsAndFreqs(String s){
Map<Character, Long> freqs = s.chars().collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
return null;
}
I receive a compile-time error from IntelliJ which essentially tells me that there are no arguments to collect that conform to a Supplier type, as required by its signature.
Unfortunately, I'm new to the Java 8 Stream hierarchy and I'm not entirely sure what the best course of action for me should be. In fact, going the Map way might be too much boilerplate for what I'm trying to do; please advise if so.
The problem is that s.chars() returns an IntStream, a primitive specialization of Stream, and it does not have a collect that takes a single argument; its collect takes three arguments. You can use boxed(), which transforms the IntStream into a Stream<Integer>:
Map<Integer, Long> map = yourString.codePoints()
.boxed()
.collect(Collectors.groupingBy(
Function.identity(),
Collectors.counting()));
But now the problem is that you have counted code points and not chars. If you know for certain that your String consists only of characters in the BMP (Basic Multilingual Plane), you can safely cast to char as shown in the other answer. If you are not certain, things get trickier.
In that case you need to represent each Unicode code point as a character, but it might not fit into a Java char: a char is 2 bytes (one UTF-16 code unit), while a code point outside the BMP needs two of them (a surrogate pair, 4 bytes).
In that case your map should be Map<String, Long> and not Map<Character, Long>.
In Java 9, with the introduction of support for \X in regular expressions (and Scanner#findAll), this is fairly easy to do:
String sample = "A" + "\uD835\uDD0A" + "B" + "C";
Map<String, Long> map = new Scanner(sample).findAll("\\X")
.map(MatchResult::group)
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
System.out.println(map); // {A=1, B=1, C=1, 𝔊=1}
In Java 8 this would be a bit more verbose:
String sample = "AA" + "\uD835\uDD0A" + "B" + "C";
Map<String, Long> map = new HashMap<>();
Pattern p = Pattern.compile("\\P{M}\\p{M}*+");
Matcher m = p.matcher(sample);
while (m.find()) {
map.merge(m.group(), 1L, Long::sum);
}
System.out.println(map); // {A=2, B=1, C=1, 𝔊=1}
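If counting per code point is enough (so a base letter and a following combining mark are counted separately, unlike the patterns above), a stream-based Java 8 version is possible as well. This is a minimal sketch with made-up class and method names:

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical demo class; the names are not from the original answer.
class CodePointHistogram {
    // Keys are Strings because a code point outside the BMP needs two chars.
    static Map<String, Long> histogram(String s) {
        return s.codePoints()
                .mapToObj(cp -> new String(Character.toChars(cp)))
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        // Contains A (twice), one supplementary code point, B and C.
        System.out.println(histogram("AA" + "\uD835\uDD0A" + "B" + "C"));
    }
}
```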
The String.chars() method returns an IntStream. You probably want to convert it to a Stream<Character> via:
s.chars().mapToObj(c -> (char)c)
As already pointed out, you can box the stream of primitives into a stream of objects:
s.chars().boxed()
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
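For completeness, the three-argument collect that IntStream does offer can build the histogram directly, without boxing the stream first. The following is only a sketch (class and method names are invented for illustration). It counts UTF-16 chars, so the BMP caveat discussed above applies.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; the names are not from the original answers.
class ThreeArgCollectDemo {
    static Map<Character, Long> histogram(String s) {
        return s.chars().collect(
                HashMap::new,                                             // supplier: creates the result map
                (map, c) -> map.merge((char) c, 1L, Long::sum),           // accumulator: counts one char
                (a, b) -> b.forEach((k, v) -> a.merge(k, v, Long::sum))); // combiner: merges partial counts
    }

    public static void main(String[] args) {
        System.out.println(histogram("aab")); // {a=2, b=1}
    }
}
```

The combiner is only invoked for parallel streams, but it should still add the counts together rather than overwrite them.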

Filtering a non integer in a list of strings

I am reading a file and forming an integer list.
Example file:
1 1 2 3 4
2 2 5 abc
4 2 8
Running the code below fails because "abc" cannot be converted to an Integer.
Could you please let me know whether it is possible to filter out the non-integer fields in a cleaner way in Java 8, e.g. using filters?
try (BufferedReader br = new BufferedReader(new InputStreamReader(
new FileInputStream(file)))) {
List<Integer> allValues = new ArrayList<>();
br.lines().forEach(
strLine -> {
List<String> wordsList = Arrays.asList(strLine.trim().split(" "));
List<Integer> routes = wordsList.stream()
.filter(e -> e != null && !e.isEmpty())
.map(Integer::valueOf)
.collect(Collectors.toList());
allValues.addAll(routes);
});
allValues.forEach(str -> System.out.print(str));
}
You don’t need the FileInputStream > InputStreamReader > BufferedReader detour to get a stream of lines. Even if you need a BufferedReader, there’s Files.newBufferedReader.
Don’t manipulate an existing collection within forEach; if you fall back to that, you had better stay with the ordinary loop. For Streams, there is flatMap to process nested items, e.g. tokens within a line.
The tokens themselves can be filtered with a simple regular expression: [0-9]+ requires at least one digit, which also sorts out empty strings. Using " +" as the split pattern rather than " " also keeps runs of spaces from producing empty strings between tokens. null never occurs as a result of the split operation.
…
List<Integer> allValues;
try(Stream<String> lines=Files.lines(file.toPath())) {
allValues=lines.flatMap(Pattern.compile(" +")::splitAsStream)
.filter(s -> s.matches("[0-9]+"))
.map(Integer::valueOf)
.collect(Collectors.toList());
}
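One caveat with a digits-only filter is that a very long run of digits still passes it and then fails in Integer.valueOf. A possible alternative, sketched below with hypothetical names, is to let the parser decide and turn a failure into an empty stream. Note the assumptions: it splits on " +" (one or more spaces), it reads from an in-memory Stream instead of a file to stay self-contained, and unlike [0-9]+ it also accepts signed values such as "-5".

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class; the names are not from the original answer.
class IntegerTokens {
    private static final Pattern SPACES = Pattern.compile(" +");

    // One-element stream for a parsable int, empty stream otherwise.
    static Stream<Integer> tryParse(String token) {
        try {
            return Stream.of(Integer.valueOf(token));
        } catch (NumberFormatException e) {
            return Stream.empty();
        }
    }

    static List<Integer> parseAll(Stream<String> lines) {
        return lines.flatMap(SPACES::splitAsStream)
                .flatMap(IntegerTokens::tryParse)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The example file from the question, one string per line.
        System.out.println(parseAll(Stream.of("1 1 2 3 4", "2 2 5 abc", "4 2 8")));
        // [1, 1, 2, 3, 4, 2, 2, 5, 4, 2, 8]
    }
}
```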

Aggregate pipes together in Hadoop

So I have a giant pipe in my cascade that looks like this:
K1 - V1
K1 - V2
K1 - V3
K2 - V4
K2 - V5
K2 - V6
Is there any way to aggregate them using an Every pipe such that the output looks something along these lines:
K1 - {V1, V2, V3}
K2 - {V4, V5, V6}
Thank you!
Edit:
My code so far:
I am calling the Every Pipe
OutputPipe = new Every(OutputPipe, Fields.ALL, new SomeBuffer());
And I am overriding the operate method in the buffer:
@Override
public void operate( FlowProcess flowProcess, BufferCall bufferCall )
{
TupleEntry group = bufferCall.getGroup();
// get all the current argument values for this grouping
Iterator<TupleEntry> arguments = bufferCall.getArgumentsIterator();
// create a Tuple to hold our result values
String result = "";
String key = "";
if (arguments.hasNext()) {
TupleEntry argument = arguments.next();
key = argument.getString("key") + "\t";
}
while (arguments.hasNext()) {
TupleEntry argument = arguments.next();
result += argument.getString("value") + "\t";
}
bufferCall.getOutputCollector().add(new Tuple(key, result));
}
The output I get is kind of strange though. I keep getting strange results from reading in the file, so I'm guessing my logic in the every pipe is wrong.
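One thing that stands out in the posted operate method is that the first arguments.next() call reads only the key, so the value of that first tuple is never appended to the result. To make the intended aggregation concrete, here is a plain-Java sketch of it. This is not Cascading code; the class name, the method name and the String[][] input format are assumptions made purely for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical demo class, independent of Cascading.
class GroupValuesDemo {
    // Each pair is {key, value}; the key and the value are read from every pair.
    static Map<String, List<String>> group(String[][] pairs) {
        return Arrays.stream(pairs).collect(Collectors.groupingBy(
                pair -> pair[0],
                TreeMap::new,
                Collectors.mapping(pair -> pair[1], Collectors.toList())));
    }

    public static void main(String[] args) {
        String[][] pairs = {
            {"K1", "V1"}, {"K1", "V2"}, {"K1", "V3"},
            {"K2", "V4"}, {"K2", "V5"}, {"K2", "V6"}
        };
        System.out.println(group(pairs)); // {K1=[V1, V2, V3], K2=[V4, V5, V6]}
    }
}
```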

Generate a unique ID of length 6

I need to generate a unique ID limited to 6 characters. The ID should contain upper-case letters and numbers.
How can I do it? I thought of using the date, but I failed.
More details...
The ID just must not repeat, but it should be generated on its own, without being based on the last value of a sequence.
I can do this in any language.
I wrote a blog post about generating a strong SQL Server password. You can adapt this to what you need.
We have the requirement that our passwords have to change every 90 days. I wanted to automate this and it sounded pretty easy to do, but it wasn’t that easy. Why? Because there are a lot of rules for passwords.
First, I have to store it in the web.config. So no XML special characters.
quot "
amp &
apos '
lt <
gt >
Next, If used in an OLE DB or ODBC connection string, a password must not contain the following characters: [] {}() , ; ? * ! #.
Finally, strong passwords must contain characters from at least three of the following categories:
English uppercase characters (A through Z)
English lowercase characters (a through z)
Base 10 digits (0 through 9)
Nonalphabetic characters (for example: !, $, #, %)
So, keeping all of these rules in mind, I created a simple class that I can just call:
public class PasswordGenerator
{
private static string CHARS_LCASE = "abcdefgijkmnopqrstwxyz";
private static string CHARS_UCASE = "ABCDEFGHJKLMNPQRSTWXYZ";
private static string CHARS_NUMERIC = "23456789";
private static string CHARS_SPECIAL = "-+_%/"; // '*' left out: it is on the OLE DB/ODBC list of disallowed characters above
private static string CHARS_ALL = CHARS_LCASE + CHARS_UCASE + CHARS_NUMERIC + CHARS_SPECIAL;
public static string GeneratePassword(int length)
{
char[] chars = new char[length];
Random rand = new Random();
for (int i = 0; i < length; i++)
{
switch (i)
{
case 0:
chars[i] = CHARS_LCASE[rand.Next(0, CHARS_LCASE.Length)];
break;
case 1:
chars[i] = CHARS_UCASE[rand.Next(0, CHARS_UCASE.Length)];
break;
case 2:
chars[i] = CHARS_NUMERIC[rand.Next(0, CHARS_NUMERIC.Length)];
break;
case 3:
chars[i] = CHARS_SPECIAL[rand.Next(0, CHARS_SPECIAL.Length)];
break;
default:
chars[i] = CHARS_ALL[rand.Next(0, CHARS_ALL.Length)];
break;
}
}
return new string(chars);
}
}
So now I just simply call this for a new password:
PasswordGenerator.GeneratePassword(13)
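Adapted to the question (six characters, upper-case letters and digits only), the same idea might look like the sketch below. It is written in Java, since the question allows any language, and all names in it are hypothetical. Random characters alone do not guarantee uniqueness, so this version remembers the IDs it has already handed out and retries on a collision. That only covers a single running instance; IDs that must stay unique across restarts would need a persistent check, for example a unique constraint in a database.

```java
import java.security.SecureRandom;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; not taken from the blog post above.
class ShortIdGenerator {
    private static final String ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    private static final int LENGTH = 6;

    private final SecureRandom random = new SecureRandom();
    private final Set<String> issued = new HashSet<>();

    // Draws random candidates until one has not been issued by this instance.
    synchronized String next() {
        String id;
        do {
            char[] chars = new char[LENGTH];
            for (int i = 0; i < LENGTH; i++) {
                chars[i] = ALPHABET.charAt(random.nextInt(ALPHABET.length()));
            }
            id = new String(chars);
        } while (!issued.add(id)); // add returns false if the ID was already issued
        return id;
    }

    public static void main(String[] args) {
        ShortIdGenerator generator = new ShortIdGenerator();
        System.out.println(generator.next());
        System.out.println(generator.next());
    }
}
```

With 36 possible characters per position there are 36^6 = 2,176,782,336 possible IDs, so retries are rare until a large share of them has been issued.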
