How to dynamically configure a processor in Apache NiFi from a flow file? - apache-nifi

I have a use case like the following pseudo code (attached image):
Generate a flow file (#1) with n attributes;
Process the flow file (#1) through an initial route, A; nodes: 1, 2, 3, 4, 5, 6, 7, 8, 3, 4, 5, 9;
Generate new flow files (#2, #3, #x) with n attributes;
Process the new flow files (#2, #3, #x) through an alternate route, B; nodes: 1, 2, 3, 4, 5, 9;
Other info:
Node 4: UpdateAttribute, contains a number attribute (property) which is set to 0.
Node 4: contains an advanced rule (attached image):
Conditions (expressions): ${execution.status:le(0)}; ${number:le(0)};
Actions: set the attribute number to the value ${number:plus(1)};
Node 5: RouteOnAttribute, routes based on the number value (0 / 1); an illustrative property sketch follows this list;
Node 7: adds the 'execution.status' attribute with value 0 (flow file #1);
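For illustration, the routing in node 5 could be expressed as RouteOnAttribute dynamic properties like the following (the property names are hypothetical; each property name becomes a relationship and each value is an Expression Language boolean):
```
route.initial   = ${number:equals(0)}
route.alternate = ${number:gt(0)}
```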
I'm able to route flow file #1 as in step 4, but after that all the new flow files (#2, #3, #x) are not routed as I expected in step 4.
My question is, how can I dynamically configure a processor in Apache NiFi from a flow file?
Thank you.

Related

How to remap group indices of a sparse set to a compact set?

Assume we have a list of data, for example
data [0] = /* employee 0 name, age, DOB, position */
data [1] = /* employee 1 name, age, DOB, position */
...
data [n-1] = /* employee n-1 name, age, DOB, position */
We also have a list of groups/teams, which is a list of lists:
group [0] = {0, 1, 72}
group [1] = {38, 1, 40}
...
group [k] = {0, 70, 72, 90}
Groups can have any nonzero number of indices. Indices can be repeated any number of times.
The input guarantees that every index from 0 to n-1 is present in at least one group.
You are given an arbitrary list of groups to delete; for example, remove {1, 6, 7, 8} means remove the groups at indices 1, 6, 7, 8 of the group list.
Assume you do remove the groups.
You now potentially have indices in the data that belong to no group.
You want to remove any such datum, but you also want to keep indices contiguous. So for example if the input is
data has 4 elements
group[0] = {0, 1, 2}
group[1] = {0, 2, 3}
group[2] = {2, 3}
And you are to remove group 0, then datum 1 must be removed, meaning you must shift the indices stored in groups 1 and 2.
The new data would look like:
data has 3 elements
group[1] = {0, 1, 2}
group[2] = {1, 2}
I want to implement this in an efficient way.
My current solution is to delete the listed groups, iterate over the data checking for entries without an assigned group, and create a permutation map for each surviving index.
Then I copy all surviving groups using the permutation map.
This is very slow for large data. Is there a way to do this without using O(n) additional memory? Or, at the bare minimum, with a data structure that has better cache performance than a map?
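One possible direction (a rough sketch, not a full answer): replace the per-index map with flat arrays, which keeps the O(n) extra memory but is usually far more cache-friendly than a hash map. The function and variable names below are made up for illustration:
```python
# Sketch only: mark which data indices survive, build a flat old->new remap
# array, then rewrite the surviving groups. Still O(n) extra memory, but the
# remap is a plain array rather than a map.
def compact(data, groups, groups_to_delete):
    doomed = set(groups_to_delete)
    kept_groups = [g for gi, g in enumerate(groups) if gi not in doomed]

    used = [False] * len(data)            # data indices still referenced
    for g in kept_groups:
        for idx in g:
            used[idx] = True

    remap = [0] * len(data)               # remap[old_index] -> new_index
    new_index = 0
    for old_index, keep in enumerate(used):
        if keep:
            remap[old_index] = new_index
            new_index += 1

    new_data = [d for d, keep in zip(data, used) if keep]
    new_groups = [[remap[idx] for idx in g] for g in kept_groups]
    return new_data, new_groups

# Example from the question: removing group 0 drops datum 1 and shifts indices.
print(compact(["d0", "d1", "d2", "d3"],
              [[0, 1, 2], [0, 2, 3], [2, 3]],
              [0]))
# (['d0', 'd2', 'd3'], [[0, 1, 2], [1, 2]])
```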

What number of probes are needed to avoid collision in hashing?

I have placed even values (i.e., 0, 2, 4, 6, .., 19996, 19998) in my hash table such that:
Value 0 is stored at home address 0, value 2 is stored at home address 2. Similarly, value 16,382 is stored at home address 16,382, but values 16,384 to 19,998 will have collisions.
Now, to resolve the collisions, how many probes are needed to search for each of the target values 0, 1, 2, 3, 4, ..., 19999?
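The post does not say how big the table is or which probe sequence is used, so any exact count depends on those details. As a rough sketch for experimenting, the code below assumes a table of 16,384 slots, home address = value mod 16,384, and linear probing; adjust those assumptions to whatever your assignment specifies:
```python
# Hedged simulation: the table size and linear probing are assumptions, not givens.
TABLE_SIZE = 16_384
table = [None] * TABLE_SIZE

def insert(value):
    slot = value % TABLE_SIZE
    while table[slot] is not None:        # linear probing on collision
        slot = (slot + 1) % TABLE_SIZE
    table[slot] = value

def probes_to_search(target):
    """Slots examined until the search finds the target or hits an empty slot."""
    slot = target % TABLE_SIZE
    probes = 1
    while table[slot] is not None and table[slot] != target:
        slot = (slot + 1) % TABLE_SIZE
        probes += 1
    return probes

for v in range(0, 20_000, 2):             # the even values 0, 2, ..., 19998
    insert(v)

total = sum(probes_to_search(t) for t in range(20_000))
print("total probes for targets 0..19999:", total)
```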

Interview Question: How to fulfill max number of Moving Requests

There are N buildings on the site, ranging from 0 to N-1. Every employee has an office space in one of the buildings. An employee may make a request to move from their current building X to another building Y. A moving request is denoted by
class Request {
    String employeeName;
    int fromBuilding;
    int toBuilding;
}
Initially all buildings are full. A request to move from building X to building Y is achievable only if someone in building Y makes an achievable request to move out, thereby creating a vacancy. Given a wish list of requests, help us plan the best way of doing the building swaps. A plan that fulfills the maximum number of requests is considered the best.
Example 1:
Input:
["Alex", 1, 2]
["Ben", 2, 1]
["Chris", 1, 2]
["David", 2, 3]
["Ellen", 3, 1]
["Frank", 4, 5]
Output: [["Alex", "Ben"], ["Chris", "David", "Ellen"]]
Example 2:
Input:
["Adam", 1, 2]
["Brian", 2, 1]
["Carl", 4, 5]
["Dan", 5, 1]
["Eric", 2, 3]
["Fred", 3, 4]
Output: [["Adam", "Eric", "Fred", "Carl", "Dan"]]
This question was taken from LeetCode here:
https://leetcode.com/discuss/interview-question/325840/amazon-phone-screen-moving-requests
I am trying to do this in Python. I figured that creating a dictionary that represents the graph would be a good start, but I am not sure what to do next.
```
def findMovers(buildReqs):
    graph = {}                       # fromBuilding -> list of toBuildings
    for req in buildReqs:            # each request looks like ["Alex", 1, 2]
        if req[1] not in graph:
            graph[req[1]] = [req[2]]
        else:
            graph[req[1]].append(req[2])
```
Make a bipartite graph with current offices on one side and future offices on the other.
Draw edges with score 0 for people staying in their current office, and edges with score 1 for people moving into any desired new office.
Find the maximum weight bipartite matching: https://en.wikipedia.org/wiki/Assignment_problem
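As a rough sketch of that matching idea (assumptions: SciPy is available, requests are given as (name, from, to) tuples, and the helper name plan_moves is made up): match each requester either to their own office slot (cost 0) or to the slot of another requester who currently sits in the building they asked for (cost -1), then solve the assignment problem and see who moved:
```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def plan_moves(requests):                     # requests: list of (name, frm, to)
    n = len(requests)
    FORBIDDEN = n + 1                         # worse than any achievable plan
    cost = np.full((n, n), FORBIDDEN, dtype=float)
    for i, (_, _, to_i) in enumerate(requests):
        cost[i, i] = 0                        # staying in your own office
        for j, (_, frm_j, _) in enumerate(requests):
            if i != j and frm_j == to_i:
                cost[i, j] = -1               # i moves into j's vacated slot
    row_ind, col_ind = linear_sum_assignment(cost)   # min-cost perfect matching
    # everyone matched to someone else's slot has their request fulfilled
    return [requests[r][0] for r, c in zip(row_ind, col_ind) if r != c]

reqs = [("Adam", 1, 2), ("Brian", 2, 1), ("Carl", 4, 5),
        ("Dan", 5, 1), ("Eric", 2, 3), ("Fred", 3, 4)]
print(plan_moves(reqs))   # the five movers from Example 2 (Brian stays put)
```
Grouping the fulfilled requests into the chains shown in the expected output would take one more pass over the resulting permutation.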

How to fix the Conquer step of this iterative quicksort implementation following Khan Academy's example?

For educational purposes I am trying to learn the quicksort algorithm. Instead of checking out an implementation on the web or trying to implement it directly from the pseudocode on Wikipedia, I am trying a "hard way" approach.
I watched this lecture from CS50 https://www.youtube.com/watch?v=aQiWF4E8flQ&t=305s in order to understand how the numbers move while being "quick sorted". My implementation, which I will show below, works perfectly for the example provided in the video, whose initial unsorted array is [6, 5, 1, 3, 8, 4, 7, 9, 2].
This is my code in Python 3:
len_seq = int(input())
print("len_seq", len_seq)
seq_in = list(map(int, input().split()))
print("seq_in", seq_in)

def my_quick_sort(seq):
    wall_index = 0
    pivot_corect_final_index_list = []
    while wall_index < len_seq:
        pivot = seq[-1]
        print("pivot", pivot)
        print("wall_index", wall_index)
        for i in range(wall_index, len_seq):
            print("seq[i]", seq[i])
            if seq[i] < pivot:
                print("before inside swap,", seq)
                seq[wall_index], seq[i] = seq[i], seq[wall_index]
                print("after inside swap,", seq)
                wall_index = wall_index + 1
                print("wall_index", wall_index)
        print("before outside swap,", seq)
        seq[wall_index], seq[-1] = seq[-1], seq[wall_index]
        print("after outside swap,", seq)
        pivot_corect_final_index = wall_index
        print("pivot correct final index", pivot_corect_final_index)
        pivot_corect_final_index_list.append(pivot_corect_final_index)
        print("pivot_corect_final_index_list", pivot_corect_final_index_list)
        wall_index = wall_index + 1
        print("wall_index", wall_index)
    return seq

print(my_quick_sort(seq_in))
To use Harvard's CS50 example in my code, you need to input this:
9
6 5 1 3 8 4 7 9 2
The algorithm works fine and returns the correct output:
[1, 2, 3, 4, 5, 6, 7, 8, 9]
Continuing my study, I tried to implement Khan Academy's example: https://www.khanacademy.org/computing/computer-science/algorithms/quick-sort/a/overview-of-quicksort
The unsorted list in this case is:
[9, 7, 5, 11, 12, 2, 14, 3, 10, 6]
You need to input the following in my code in order to run it:
10
9 7 5 11 12 2 14 3 10 6
Unlike the Harvard example, in this case my implementation does not work perfectly. It returns:
[5, 2, 3, 6, 7, 9, 10, 11, 12, 14]
As you can see, all the numbers that I treated as pivots end up in the correct position. However, some numbers behind the pivots are not right.
Reading Khan Academy's article, it seems that my implementation is right on the partition step but not on the conquer step. I am trying to avoid looking at a final solution; I am trying to improve what I have built so far. I am not sure if this is the best method, but that's what I am trying right now.
How can I fix the conquer step? Is it necessary to introduce a recursive approach? How can I do that within the iterative process I already have?
And should that step be introduced after successfully placing each pivot?
Thanks for the patience of reading this long post.
Can't comment, not enough reputation.
In the first pass of your algorithm, you correctly place all elements smaller than the pivot to the left of the pivot. However, since your value of wall_index increases (e.g. from 0 to 1), you ignore the leftmost element with index 0 (it might not be in the correct position, so it should not be ignored).
In the Khan Academy test case, the number 5 gets placed at the leftmost index in the first pass, and then gets ignored by subsequent passes, thus it gets stuck on the left. Similarly, trying this modification of the Harvard example
9
6 5 1 3 8 4 7 2 9
yields
[6, 5, 1, 3, 8, 4, 7, 2, 9]
After the first partitioning, you have to make sure to apply quicksort to both the arrays to the left and to the right of the pivot. For example, after the first pivot (6) is placed in the correct position for the Khan example (what you labeled as the outside swap),
[5, 2, 3, 6, 12, 7, 14, 9, 10, 11]
<--1--> p <--------2--------->
you have to apply the same quicksort to both subarrays 1 and 2 in the diagram above. I suggest you try out the recursive implementation first, which will give you a good idea of the algorithm, then try to implement it iteratively.
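For reference, here is a minimal recursive sketch of that conquer step, reusing the same last-element pivot and "wall" partition as the code in the question; it is one possible shape, not the only one:
```python
def quick_sort(seq, lo=0, hi=None):
    """Sort seq[lo..hi] in place with a last-element pivot (Lomuto partition)."""
    if hi is None:
        hi = len(seq) - 1
    if lo >= hi:                                # 0 or 1 elements: nothing to do
        return seq
    pivot = seq[hi]
    wall = lo
    for i in range(lo, hi):
        if seq[i] < pivot:
            seq[wall], seq[i] = seq[i], seq[wall]
            wall += 1
    seq[wall], seq[hi] = seq[hi], seq[wall]     # the "outside swap"
    quick_sort(seq, lo, wall - 1)               # conquer the left subarray
    quick_sort(seq, wall + 1, hi)               # conquer the right subarray
    return seq

print(quick_sort([9, 7, 5, 11, 12, 2, 14, 3, 10, 6]))
# [2, 3, 5, 6, 7, 9, 10, 11, 12, 14]
```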

Union of two sets given a certain ordering in O(n) time

[Note: I am hoping this problem can be solved in O(n) time. If not, I am looking for a proof that it cannot be solved in O(n) time. If I get the proof, I'll try to implement a new algorithm to reach the union of these sorted sets in a different way.]
Consider the sets:
(1, 4, 0, 6, 3)
(0, 5, 2, 6, 3)
The resultant should be:
(1, 4, 0, 5, 2, 6, 3)
Please note that the problem of the union of sorted sets is easy. These are also sorted sets, but the ordering is defined by some other properties from which these indices have been resolved. The ordering (whatever it is) is valid for both sets, i.e. for any i, j ∈ set X, if i comes before j in X, then i also comes before j in set Y (when both are present in Y).
EDIT: I am sorry, I missed something very important that I covered in one of the comments below: the intersection of the two sets is not a null set, i.e. the two sets have common elements.
Insert each item in the first set into a hash table.
Go through each item in the second set, looking up that value.
If not found, insert that item into the resulting set.
If found, insert all items from the first set after the last item we inserted, up to and including this value.
At the end, insert all remaining items from the first set into the resulting set.
Running time
Expected O(n).
Side note
With the constraints given, the union is not necessarily unique.
For e.g. (1) (2), the resulting set can be either (1, 2) or (2, 1).
This answer will pick (2, 1).
Implementation note
Obviously looping through the first set to find the last inserted item is not going to result in an O(n) algorithm. Instead we must keep an iterator into the first set (not the hash table), and then we can simply continue from the last position that iterator had.
Here's some pseudo-code, assuming both sets are arrays (for simplicity):
for i = 0 to input1.length - 1
    hashTable.insert(input1[i])

i = 0   // this will be our 'iterator' into the first set
for j = 0 to input2.length - 1
    if hashTable.contains(input2[j])
        do
            value = input1[i]
            output.append(value)
            i++
        while value != input2[j]
    else
        output.append(input2[j])

while i < input1.length
    output.append(input1[i])
    i++
The do-while loop inside the for loop may look suspicious, but note that every iteration of that loop increases i, so across the whole algorithm it can run a total of input1.length times.
Example
Input:
(1, 4, 0, 6, 8, 3)
(0, 5, 2, 6, 3)
Hash table: (1, 4, 0, 6, 8, 3)
Then, go through the second set.
Look up 0, found, so insert 1, 4, 0 into the resulting set
(no item from first set inserted yet, so insert all items from the start until we get 0).
Look up 5, not found, so insert 5 into the resulting set.
Look up 2, not found, so insert 2 into the resulting set.
Look up 6, found, so insert 6 into the resulting set
(last item inserted from first set is 0, so only 6 needs to be inserted).
Look up 3, found, so insert 8, 3 into the resulting set
(last item inserted from first set is 6, so insert all items from after 6 until we get 3).
Output: (1, 4, 0, 5, 2, 6, 8, 3)
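For completeness, here is a rough, runnable Python version of the same idea (a set plays the role of the hash table; the function name ordered_union is made up):
```python
def ordered_union(input1, input2):
    in_first = set(input1)              # the "hash table"
    output = []
    i = 0                               # cursor into input1
    for value in input2:
        if value in in_first:
            # copy from input1 up to and including this common value
            while True:
                output.append(input1[i])
                i += 1
                if output[-1] == value:
                    break
        else:
            output.append(value)
    output.extend(input1[i:])           # whatever remains of the first set
    return output

print(ordered_union([1, 4, 0, 6, 8, 3], [0, 5, 2, 6, 3]))
# [1, 4, 0, 5, 2, 6, 8, 3]
```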
We have two ordered sets of indices A and B, which are ordered by some function f(). So we know that f(A[i]) < f(A[j]) iff i < j, and the same holds for set B.
From here we have a mapping to "sorted" linear sequences, which reduces the problem to the union of sorted sets.
This also doesn't have the best space complexity, but you can try:
a = [1, 2, 3, 4, 5]
b = [4, 2, 79, 8]
union = {}
for each in a:
    union[each] = 1
for each in b:
    union[each] = 1
for each in union:
    print each, ' ',   # Python 2 print syntax
Output:
>>> 1 2 3 4 5 8 79
