How to increment by group - java-8

I have a table to which I have just added a new column: sort_num int default 0
id level sort_num
1 1 0
2 1 0
3 2 0
4 2 0
5 2 0
6 3 0
7 3 0
8 3 0
9 3 0
Now I want to set the sort_num values as shown below:
id level sort_num
1 1 1
2 1 2
3 2 1
4 2 2
5 2 3
6 3 1
7 3 2
8 3 3
9 3 4
The Java code that implements this requirement is:
int sortNum = 0;
int currentLevel = fooList.get(0).getLevel();
for (RuleConf foo : fooList) {
    if (currentLevel != foo.getLevel()) {
        sortNum = 0;
        currentLevel = foo.getLevel();
    }
    foo.setSortNum(++sortNum);
}
I want to know whether Java 8 could simplify the above code.
PS. The same requirement implemented in MySQL:
set @index:=0; update t set sort_num = (@index:=@index+1) where level = 1 order by id;
set @index:=0; update t set sort_num = (@index:=@index+1) where level = 2 order by id;
set @index:=0; update t set sort_num = (@index:=@index+1) where level = 3 order by id;

The best approach is to stick to your plain enhanced for loop. I don't think it is possible to come up with a single Stream solution, since you need to have intermediate values. Like:
Map<Integer, List<RuleConf>> levels = fooList.stream()
        .collect(Collectors.groupingBy(RuleConf::getLevel));
levels.values().forEach(v ->
        IntStream.range(0, v.size()).forEach(i -> v.get(i).setSortNum(i + 1))
);

If you keep track of the next order numbers yourself, you can do it with one stream. This solution is thread safe as well, and because forEachOrdered preserves encounter order it should also work with parallel streams:
Map<Integer, AtomicInteger> orders = new ConcurrentHashMap<>();
fooList.stream().forEachOrdered(foo -> {
    // Create the counter for this level on first sight, then take the next number.
    orders.putIfAbsent(foo.getLevel(), new AtomicInteger());
    foo.setSortNum(orders.get(foo.getLevel()).incrementAndGet());
});
It should outperform the other stream solutions, because it needs to iterate over the list only once.
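The per-level counter idea above can also be tried in isolation. The following is a minimal, self-contained sketch, not code from the original posts: the class name GroupCounter and the method sortNums are hypothetical, and plain int levels stand in for RuleConf objects. Unlike the loop in the question, it does not assume the input is sorted by level.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: numbers the items 1, 2, 3, ... within each level.
final class GroupCounter {

    static int[] sortNums(int[] levels) {
        Map<Integer, Integer> next = new HashMap<>();
        int[] result = new int[levels.length];
        for (int i = 0; i < levels.length; i++) {
            // merge stores and returns the updated count for this level
            // (1 the first time the level is seen).
            result[i] = next.merge(levels[i], 1, Integer::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] levels = {1, 1, 2, 2, 2, 3, 3, 3, 3};
        System.out.println(Arrays.toString(sortNums(levels)));
    }
}
```

With real RuleConf objects the loop body would become foo.setSortNum(next.merge(foo.getLevel(), 1, Integer::sum)).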

Related

How to make SORTKEY for irregular observations

I'd like to create a "SORTKEY" variable like the one below. The number of observations per key is not constant.
Normally each SORTKEY covers 3 observations, but an observation with flg=1 is added to the current SORTKEY.
In this example, that means SORTKEY = 2 has 4 observations and every other SORTKEY has 3.
Is there a way to build the SORTKEY manually? If you have a good idea, please give me some advice.
I want the following dataset, built from the "test" dataset.
/*
SORTKEY NO FLG
1    1  0
1    2  0
1    3  0
2    4  0
2    5  0
2    6  0
2    7  1
3    8  0
3    9  0
3    10 0
*/
data test;
input no flg;
cards;
1 0
2 0
3 0
4 0
5 0
6 0
7 1
8 0
9 0
10 0
;
run;
Use a sequence counter to track the 3-rows-per-sortkey requirement.
Example:
data want;
  set test;
  retain sortkey 1;
  seq+1;
  /* start a new sortkey after 3 rows, unless this row is flagged */
  if seq > 3 and flg ne 1 then do;
    seq = 1;
    sortkey+1;
  end;
run;
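For readers who want to trace the counter logic outside SAS, here is a rough Java port of the data step above. It is only a sketch: the class name SortKeys and the method assign are hypothetical, and the flg column is passed in as an int array.

```java
import java.util.Arrays;

// Hypothetical port of the data step: one sortkey per 3 rows,
// except that a row with flg = 1 stays in the current sortkey.
final class SortKeys {

    static int[] assign(int[] flags) {
        int[] keys = new int[flags.length];
        int key = 1; // mirrors "retain sortkey 1"
        int seq = 0; // mirrors the "seq+1" sum statement
        for (int i = 0; i < flags.length; i++) {
            seq++;
            if (seq > 3 && flags[i] != 1) {
                seq = 1;
                key++;
            }
            keys[i] = key;
        }
        return keys;
    }

    public static void main(String[] args) {
        int[] flg = {0, 0, 0, 0, 0, 0, 1, 0, 0, 0};
        System.out.println(Arrays.toString(assign(flg)));
    }
}
```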

How can I improve my Countdown Numbers Solver algorithm to find more solutions?

As my school project, I need to build a solver for the Countdown Numbers and Letters rounds. I wanted to develop a structure that I can use to build both solvers, and I developed the Numbers solver first. However, before reusing this solution for Letters, I need to improve my current algorithm. I think I'm going wrong somewhere, because I don't get the same results as the other tools I am comparing my program against. Here is the code for my solver:
/// numbers_game_solver.dart
import 'dart:collection';
import 'package:trotter/trotter.dart';
/* Import statements were package-based; I turned them into relative paths for the question. */
import 'number_generator.dart';
import 'operation.dart';
import 'solution.dart';
import 'solutions.dart';
/* Will try to combine numbers with operations, as shown below;
* List<List<Operation>> operations = <Operations>[a, ,b, ,c, ,d, ,e, ,f
* + - + * / ]
* Then if last operations result is equal to target, will result it.
* If not will show closest result.
*/
const List<String> kOperators = const <String>[kOpAdd, kOpDiv, kOpMul, kOpSub];
class NumbersGameSolver {
NumbersGameSolver()
: this.solutions = Solutions(_expectedResult);
/* TODO: Do tests with smaller numbers and targets. */
final List<int> _numbers = const <int>[1, 2, 3, 4]; // NumberGenerator.numbers;
static final int _expectedResult = 15; //NumberGenerator.expectedResult;
final Solutions solutions;
void solve() {
/* All permutations of operators with replacement, which will be inserted between numbers. */
final Set<List<String>> amalgamsOperators = Amalgams<String>(_numbers.length - 1, kOperators)().toSet();
/* Duplicates may occur in the numbers list; because of this, numbers are mapped
using permutations of indices. */
final List<int> indices = List<int>.generate(_numbers.length, (int index) => index);
final Iterable<List<int>> permutationsIndices = Permutations<int>(indices.length, indices)();
final Set<List<int>>
permutationsNumbers = permutationsIndices.map(
(List<int> listPerm) => listPerm.map(
(int index) => _numbers[index]
).toList()
).toSet();
for (final List<int> numbers in permutationsNumbers) {
for (final List<String> operators in amalgamsOperators) {
Queue<int> stackNums = Queue<int>.from(numbers);
Queue<String> stackOprts = Queue<String>.from(operators);
Solution tempSolution = Solution(_expectedResult);
do {
int left = stackNums.removeFirst(), right = stackNums.removeFirst();
Operation tempOperation = Operation(stackOprts.removeFirst(), left, right);
/* Record solutions current state. */
SolutionState solutionState = tempSolution.addOperation(tempOperation);
if (solutionState == SolutionState.currentlyValid) {
/* If valid, add result to the current numbers stack. */
stackNums.addFirst(tempOperation.result);
} else if (solutionState == SolutionState.lastOperationRedundant) {
/* If operation is redundant, dispose it and continue. */
continue;
} else if (solutionState == SolutionState.lastResultInvalid) {
/* If results is invalid at any stage, dispose whole solution. */
break;
}
if (solutions.addSolution(tempSolution) == true) break;
} while (stackNums.length > 1);
}
}
/* Will show only accurate solutions.
* If there is no accurate solutions, will show solutions which results
* are closest to the expected result.
*/
solutions.showSolutions();
}
}
There are 5 classes; to shorten the question I added them in this Gist.
My algorithm is as follows:
The rules for this project are: the program must randomly generate five single-digit numbers, one two-digit number where twoDigitNumber % 10 == 0, and a three-digit number as the target.
I generate the permutations of the 4 operators and of the numbers that will be used in the operations (using the trotter package).
For each permutation of numbers, I apply each permutation of operators using the Operation class, and add the operations to a Solution instance for that permutation.
I skip some redundant operations in each iteration, and if there is an invalid result at any stage, I discard that solution and continue. (I'm using this DataGenetics blog post on the topic as a reference.)
To test my algorithm I used the numbers 1, 2, 3, 4 and set the target to 15. The results from the dcode.fr solver are as follows:
15 (2 op.)
4 + 1 = 5
5 x 3 = 15
15 (3 op.)
4 + 3 = 7
7 x 2 = 14
14 + 1 = 15
15 (3 op.)
4 x 3 = 12
12 + 2 = 14
14 + 1 = 15
15 (3 op.)
4 x 3 = 12
2 + 1 = 3
12 + 3 = 15
15 (3 op.)
3 + 2 = 5
4 - 1 = 3
5 x 3 = 15
15 (3 op.)
4 x 3 = 12
12 + 1 = 13
13 + 2 = 15
15 (3 op.)
4 - 1 = 3
3 + 2 = 5
5 x 3 = 15
15 (3 op.)
4 + 2 = 6
6 - 1 = 5
5 x 3 = 15
15 (3 op.)
2 + 1 = 3
4 x 3 = 12
12 + 3 = 15
15 (3 op.)
2 - 1 = 1
4 + 1 = 5
5 x 3 = 15
(A total of 10 solutions.)
and the solutions my program found are as follows:
> SOLUTION 1 ~
4 - 1 = 3
3 + 2 = 5
5 x 3 = 15
> SOLUTION 2 ~
4 + 1 = 5
5 x 3 = 15
(A total of 2 solutions.)
Can you tell me where my thinking is wrong? Why can't I find all the solutions? What alternative approaches can I take to solve this problem? Is there anything I'm missing?
Thank you for taking the time.
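One thing that may be worth checking: the loop in the question always combines the first two entries of the number queue and pushes the result back to the front, so each candidate is a single left-to-right chain. A solution such as 4 x 3 = 12, 2 + 1 = 3, 12 + 3 = 15 needs two independent intermediate results, which a single chain cannot express. A common alternative is to pick any two of the remaining numbers, replace them with the result of one operation, and recurse. The sketch below illustrates that idea in Java under the usual Countdown assumptions (positive integers only, exact division); the class name CountdownSketch and the method reachable are hypothetical, and it only reports whether the target is reachable rather than listing every solution.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a pairwise-combination search for Countdown Numbers.
final class CountdownSketch {

    static boolean reachable(List<Integer> nums, int target) {
        if (nums.contains(target)) {
            return true;
        }
        for (int i = 0; i < nums.size(); i++) {
            for (int j = i + 1; j < nums.size(); j++) {
                int a = Math.max(nums.get(i), nums.get(j));
                int b = Math.min(nums.get(i), nums.get(j));

                // Every number except the chosen pair.
                List<Integer> rest = new ArrayList<>();
                for (int k = 0; k < nums.size(); k++) {
                    if (k != i && k != j) {
                        rest.add(nums.get(k));
                    }
                }

                // Results allowed by the rules: positive integers only.
                List<Integer> results = new ArrayList<>(List.of(a + b, a * b));
                if (a > b) {
                    results.add(a - b);
                }
                if (a % b == 0) {
                    results.add(a / b);
                }

                for (int r : results) {
                    rest.add(r);
                    if (reachable(rest, target)) {
                        return true;
                    }
                    rest.remove(rest.size() - 1); // undo and try the next result
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(reachable(List.of(1, 2, 3, 4), 15));
    }
}
```

Because any pair may be chosen at each level, intermediate results from different pairs can later be combined with each other. Collecting the steps instead of returning a boolean would give the full solution list.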

How do we Construct LCP-LR array from LCP array?

To find the number of occurrences of a given string P (length m) in a text T (length N),
we use binary search against the suffix array of T.
The issue with using standard binary search ( without the LCP information ) is that in each of the O(log N) comparisons you need to make, you compare P to the current entry of the suffix array, which means a full string comparison of up to m characters. So the complexity is O(m*log N).
The LCP-LR array helps improve this to O(m+log N).
know more
How do we precompute the LCP-LR array from the LCP array?
And how does LCP-LR help in finding the number of occurrences of a pattern?
Please explain the algorithm with an example.
Thank you
// note that arrSize is O(n)
// int arrSize = 2 * 2 ^ (log(N) + 1) + 1; // start from 1
// LCP = new int[N];
// fill the LCP...
// LCP_LR = new int[arrSize];
// memset(LCP_LR, maxValueOfInteger, arrSize);
//
// init: buildLCP_LR(1, 1, N);
// LCP_LR[1] == [1..N]
// LCP_LR[2] == [1..N/2]
// LCP_LR[3] == [N/2+1 .. N]
// rangeI = LCP_LR[i]
// rangeILeft = LCP_LR[2 * i]
// rangeIRight = LCP_LR[2 * i + 1]
// ..etc
void buildLCP_LR(int index, int low, int high)
{
if(low == high)
{
LCP_LR[index] = LCP[low];
return;
}
int mid = (low + high) / 2;
buildLCP_LR(2*index, low, mid);
buildLCP_LR(2*index+1, mid + 1, high);
LCP_LR[index] = min(LCP_LR[2*index], LCP_LR[2*index + 1]);
}
Reference: https://stackoverflow.com/a/28385677/1428052
I don't have enough reputation to comment, so I'm posting this as an answer. Has anybody been able to create the LCP-LR array using @Abhijeet Ashok Muneshwar's solution? For example, for the text mississippi the suffix array is
0 1 2 3 4 5 6 7 8 9 10
10 7 1 4 0 9 8 3 6 2 5
The LCP array will be
0 1 2 3 4 5 6 7 8 9 10
1 1 4 0 0 1 0 2 1 3 0
And LCP-LR will be
0 1 2 3 4 5 6 7 8 9 10
1 1 0 4 0 0 0 0 0 1 3
But the LCP-LR obtained using the code is not the same as above.
I am passing index=0, low=0, high=n to the buildLCP_LR method.
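Two details of that call may be worth checking. With index=0 the children 2*index and 2*index+1 are nodes 0 and 1, so the root would be its own left child, whereas the comments in the answer start the tree at index 1. Also, with a 0-based array of length n, high=n lets the leaf case read LCP[n], which is out of bounds. The following Java sketch shows the same recursive range-minimum construction with a root at node 1 over the 0-based range 0..n-1. The class name LcpRangeMin and its methods are hypothetical, and the query method is an addition for illustration, not part of the referenced answer.

```java
// Hypothetical sketch: a range-minimum tree over an LCP array,
// built the same way as buildLCP_LR but with the root at node 1.
final class LcpRangeMin {
    private final int[] tree;
    private final int n;

    // Assumes lcp is non-empty.
    LcpRangeMin(int[] lcp) {
        n = lcp.length;
        tree = new int[4 * n]; // generous upper bound on the node count
        build(lcp, 1, 0, n - 1);
    }

    private void build(int[] lcp, int node, int low, int high) {
        if (low == high) {
            tree[node] = lcp[low];
            return;
        }
        int mid = (low + high) / 2;
        build(lcp, 2 * node, low, mid);
        build(lcp, 2 * node + 1, mid + 1, high);
        tree[node] = Math.min(tree[2 * node], tree[2 * node + 1]);
    }

    // Minimum of lcp[from..to], both ends inclusive.
    int min(int from, int to) {
        return query(1, 0, n - 1, from, to);
    }

    private int query(int node, int low, int high, int from, int to) {
        if (to < low || high < from) {
            return Integer.MAX_VALUE; // no overlap with this node's range
        }
        if (from <= low && high <= to) {
            return tree[node];        // node's range is fully covered
        }
        int mid = (low + high) / 2;
        return Math.min(query(2 * node, low, mid, from, to),
                        query(2 * node + 1, mid + 1, high, from, to));
    }

    public static void main(String[] args) {
        // LCP values quoted above for "mississippi".
        int[] lcp = {1, 1, 4, 0, 0, 1, 0, 2, 1, 3, 0};
        LcpRangeMin rmq = new LcpRangeMin(lcp);
        System.out.println(rmq.min(0, 2));
    }
}
```

Note that the nodes of such a tree are indexed by tree position, not by suffix-array position, so its contents are not expected to line up one-to-one with a table indexed 0..10.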

Count the frequency of matrix values including 0

I have a vector
A = [ 1 1 1 2 2 3 6 8 9 9 ]
I would like to write a loop that counts the frequencies of the values in my vector within a range I choose; this should include values that have a frequency of 0.
For example, if I chose the range of 1:9 my results would be
3 2 1 0 0 1 0 1 2
If I picked 1:11 the result would be
3 2 1 0 0 1 0 1 2 0 0
Is this possible? Ideally I would also need to do this for giant matrices and vectors, so the fastest way to calculate this would be appreciated.
Here's an alternative suggestion to histcounts, which appears to be ~8x faster on Matlab 2015b:
A = [ 1 1 1 2 2 3 6 8 9 9 ];
maxRange = 11;
N = accumarray(A(:), 1, [maxRange,1])';
N =
3 2 1 0 0 1 0 1 2 0 0
Comparing the speed:
K>> tic; for i = 1:100000, N1 = accumarray(A(:), 1, [maxRange,1])'; end; toc;
Elapsed time is 0.537597 seconds.
K>> tic; for i = 1:100000, N2 = histcounts(A,1:maxRange+1); end; toc;
Elapsed time is 4.333394 seconds.
K>> isequal(N1, N2)
ans =
1
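The accumarray call above amounts to a single counting pass over the data. As a language-neutral illustration, here is a small Java sketch of that pass. The class name ValueCounts and the method count are hypothetical, and it assumes positive integer values. Unlike accumarray, it silently ignores values outside 1..maxRange.

```java
import java.util.Arrays;

// Hypothetical sketch: frequency of each value in 1..maxRange, zeros included.
final class ValueCounts {

    static int[] count(int[] values, int maxRange) {
        int[] counts = new int[maxRange]; // counts[v - 1] holds the frequency of v
        for (int v : values) {
            if (v >= 1 && v <= maxRange) {
                counts[v - 1]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] a = {1, 1, 1, 2, 2, 3, 6, 8, 9, 9};
        System.out.println(Arrays.toString(count(a, 11)));
    }
}
```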
As per the loop request, here's a looped version, which should not be too slow since the latest engine overhaul:
A = [ 1 1 1 2 2 3 6 8 9 9 ];
maxRange = 11; %// your range
output = zeros(1,maxRange); %// initialise output
for ii = 1:maxRange
tmp = A==ii; %// temporary storage
output(ii) = sum(tmp(:)); %// find the number of occurrences
end
which would result in
output =
3 2 1 0 0 1 0 1 2 0 0
Faster and non-looping would be @beaker's suggestion to use histcounts:
[N,edges] = histcounts(A,1:maxRange+1);
N =
3 2 1 0 0 1 0 1 2 0 0
where the +1 makes sure the last entry is included as well.
Assuming the input A to be a sorted array and the range starts from 1 and goes until some value greater than or equal to the largest element in A, here's an approach using diff and find -
%// Inputs
A = [2 4 4 4 8 9 11 11 11 12]; %// Modified for variety
maxN = 13;
idx = [0 find(diff(A)>0) numel(A)]+1;
out = zeros(1,maxN); %// OR for better performance : out(maxN) = 0;
out(A(idx(1:end-1))) = diff(idx);
Output -
out =
0 1 0 3 0 0 0 1 1 0 3 1 0
This can be done very easily with bsxfun.
Let the data be
A = [ 1 1 1 2 2 3 6 8 9 9 ]; %// data
B = 1:9; %// possible values
Then
result = sum(bsxfun(@eq, A(:), B(:).'), 1);
gives
result =
3 2 1 0 0 1 0 1 2

R - Making loops faster

This little code snippet is supposed to loop through a sorted data frame. It keeps a count of how many successive rows have the same information in columns aIndex and cIndex and also bIndex and dIndex. If these are the same, it deposits the count and increments it for the next time around, and if they differ, it deposits the count and resets it to 1 for the next time around.
for (i in 1:nrow(myFrame)) {
if (myFrame[i, aIndex] == myFrame[i, cIndex] &
myFrame[i, bIndex] == myFrame[i, dIndex]) {
myFrame[i, eIndex] <- count
count <- (count + 1)
} else {
myFrame[i, eIndex] <- count
count <- 1
}
}
It's been running for a long time now. I understand that I'm supposed to vectorize whenever possible, but I'm not really seeing it here. What am I supposed to do to make this faster?
Here's what an example few rows should look like after running:
aIndex bIndex cIndex dIndex eIndex
1 2 1 2 1
1 2 1 2 2
1 2 4 8 3
4 8 1 4 1
1 4 1 4 1
I think this will do what you want; the tricky part is that the count resets after the difference, which effectively puts a shift on the eIndex.
There (hopefully) is an easier way to do this, but this is what I came up with.
tmprle <- rle(((myFrame$aIndex == myFrame$cIndex) &
(myFrame$bIndex == myFrame$dIndex)))
myFrame$eIndex <- c(1,
unlist(ifelse(tmprle$values,
Vectorize(seq.default)(from = 2,
length = tmprle$lengths),
lapply(tmprle$lengths,
function(x) {rep(1, each = x)})))
)[-(nrow(myFrame)+1)]
which gives
> myFrame
aIndex bIndex cIndex dIndex eIndex
1 1 2 1 2 1
2 1 2 1 2 2
3 1 2 4 8 3
4 4 8 1 4 1
5 1 4 1 4 1
Maybe this will work. I have reworked the rle and sequence bits.
dat <- read.table(text="aIndex bIndex cIndex dIndex
1 2 1 2
1 2 1 2
1 2 4 8
4 8 1 4
1 4 1 4", header=TRUE, as.is=TRUE,sep = " ")
dat$eIndex <-NA
#identify rows where a=c and b=d, multiply by 1 to get a numeric vector
dat$id<-(dat$aIndex==dat$cIndex & dat$bIndex==dat$dIndex)*1
#identify sequence
runs <- rle(dat$id)
#create sequence, multiply by id to keep only identicals, +1 at the end
count <-sequence(runs$lengths)*dat$id+1
#shift sequence down one notch, start with 1
dat$eIndex <-c(1,count[-length(count)])
dat
aIndex bIndex cIndex dIndex eIndex id
1 1 2 1 2 1 1
2 1 2 1 2 2 1
3 1 2 4 8 3 0
4 4 8 1 4 1 0
5 1 4 1 4 1 1
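To make the shift described in the answers concrete, here is the counting rule from the question written as a plain loop in Java. It is only a sketch: the class name RunCounter and the method count are hypothetical, the row comparison is reduced to a boolean per row, and the counter is assumed to start at 1, which the question does not show but the expected output implies.

```java
import java.util.Arrays;

// Hypothetical sketch: each row receives the current count; the count then
// grows if the row matched (a == c and b == d) and resets to 1 otherwise.
final class RunCounter {

    static int[] count(boolean[] matches) {
        int[] eIndex = new int[matches.length];
        int count = 1; // assumed starting value
        for (int i = 0; i < matches.length; i++) {
            eIndex[i] = count;
            count = matches[i] ? count + 1 : 1;
        }
        return eIndex;
    }

    public static void main(String[] args) {
        // Match flags for the five example rows.
        boolean[] matches = {true, true, false, false, true};
        System.out.println(Arrays.toString(count(matches)));
    }
}
```

Each row's value depends only on the rows before it, which is why the vectorised R versions compute the run lengths first and then shift the result down by one row.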
