How to load value from dynamically specified parameter in NiFi - apache-nifi

I have several processes with almost the same flow: get some parameters, extract data from a database according to them, and upload the data to a target. The parameters and the targets vary slightly across processes, but most of the process is the same. I would like to extract those differences into a parameter context and load them dynamically. My idea is to define a group of parameters for each process and then use them.
So the core of the question is:
How do I dynamically choose which parameter group to load and use?
Having several parameter contexts with same-named, different-valued parameters and switching between them dynamically would probably be best, but that is not possible as far as I know.
Duplicating flows is also off the table. Any error correction would be spread over several places and maintenance would be a nightmare.
Moreover, I know I can do it like this: in GenerateFlowFile for process A set value1=#{A_value1}, and in GenerateFlowFile for process B set value1=#{B_value1}. But this is tedious, error-prone and scales badly, not to mention the situation where I have dozens of parameters and several processes. It is also a kind of hardcoding, not configuring...
I was hoping for something like defining group=A and then using it like value1=#{ ${ group:append('_value1') } }, but this does not work: it is evaluated as a parameter literally named ${ group:append('_value1') }.

TL;DR: Use evaluateELString().
The actual solution is to set group=A in the GenerateFlowFile processor and, in the next UpdateAttribute processor, set the following:
value1=${ group:prepend('hash{ '):append('_value1 }'):replace('hash', '#'):evaluateELString() }
The magic here is: take the value of group, wrap it in #{ and _value1 } to make it a valid NiFi Expression Language statement, and then evaluate it. (Note: the word hash and the replace function are there because I didn't manage to escape the # character right before {.)
If you would like value1 at the beginning of the statement, you can use the following code. The result is the same and it is easier to use (the often-changed value value1 is at the beginning of the statement), but it is less obvious what is really going on.
value1=${ literal('value1'):prepend('_'):prepend(${ group }):prepend('hash{ '):append(' }'):replace('hash', '#'):evaluateELString() }
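To make the string manipulation easier to follow, here is a rough Python emulation of what the expression builds before evaluateELString() runs. It is only a sketch: the PARAMETERS dictionary, its values, and the resolve() helper are hypothetical stand-ins, and the regex lookup merely imitates parameter resolution rather than reproducing how NiFi evaluates the string.

```python
import re

# Hypothetical parameter context using the <group>_<name> naming from the question.
PARAMETERS = {"A_value1": "alpha", "B_value1": "beta"}


def resolve(group: str, name: str, parameters: dict) -> str:
    """Build '#{ <group>_<name> }' the way the EL chain does, then look it up."""
    # Same steps as the expression: wrap with 'hash{ ' and ' }', then swap 'hash' for '#'.
    reference = ("hash{ " + group + "_" + name + " }").replace("hash", "#")
    # Stand-in for evaluateELString(): extract the parameter name and resolve it.
    match = re.fullmatch(r"#\{\s*(\w+)\s*\}", reference)
    if match is None:
        raise ValueError(f"not a parameter reference: {reference!r}")
    return parameters[match.group(1)]


value1 = resolve("A", "value1", PARAMETERS)
```

One thing the emulation makes visible: because the placeholder is swapped with a plain replace, a group or parameter name that itself contains the text hash would be altered as well.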

Related

Shell help text syntax for repeatable group of arguments

I'm writing a help output for a Bash script. Currently it looks like this:
dl [m|r]… (<file>|<URL> [m|r|<index>]…)…
The meaning that I'm trying to convey (and elsewhere describe with words) is that (after a potential "m" and/or "r") there can be an endless list of sets of arguments. The first argument in each set is always a file or URL and the further arguments can each be "m", "r" or a number. After that, it starts over with a file or URL and so on.
In my special case, I could just write this:
dl [m|r]… (<file>|<URL>) (<file>|<URL>|m|r|<index>)…
This works, because listing a URL and then another URL with nothing in between is allowed, as well as listing an arbitrarily long chain of "m"s (it's just useless to do so) and pretty much any other combination.
But what if that wasn't the case? What if I had for example a command like this:
change (<from> <to>)…
…which would be used e.g. like this:
change from1 to1 from2 to2 from3 to3
Would the bracket syntax be correct here? I just guessed it based on the grouping of (a|b), but I wasn't able to find any documentation that uses this for multiple, non-exclusive arguments that belong together. Is there even a standard for this?
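For illustration, the meaning intended by change (<from> <to>)… can be written down as a small Python sketch. It assumes the trailing … means "one or more repetitions of the whole group"; parse_pairs() is a hypothetical helper, not part of any real script, and it says nothing about whether the notation is standard.

```python
def parse_pairs(args):
    """Group a flat argument list into (from, to) pairs."""
    # At least one pair is required, and no argument may be left unpaired.
    if not args or len(args) % 2 != 0:
        raise ValueError("expected one or more <from> <to> pairs")
    return list(zip(args[0::2], args[1::2]))


pairs = parse_pairs(["from1", "to1", "from2", "to2", "from3", "to3"])
```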

Referencing Two-Word Variables in Applescript?

I'm attempting to get some data from an app called "Timing", which is local to my computer, and post it to a URL to notify a webhook, from which some process automation will occur.
According to Timing's AppleScript integration, there is a time summary object that's returned from a command which I've successfully executed. When displayed as an alert, that data looks like this:
Can't get |times per project| of {id:5C6CD8C8-357F-4EE7-890C-5946DC03BBB9", overall total:1.18092493622303E+4, times per project:{Maintenance:81.091759443283, Youtube:4820.38001298904, |self improvement effors|:876.930474758148, Homework:2383.20326805115, |(no project)|:3647.64384698868}, overall total without tasks:1.18092493622303E+4, productivity score:0.388005592511, times per project without tasks:{Maintenance:81.091759443283, Youtube:4820.38001298904, |self improvement efforts|:876.930474758148, Homework:2383.20326805115, |(no project)|:3647.64384698868}, class:time summary}.
As you can see above, there is a property called productivity score, which is two words.
I attempt to get this data point from the object (which I will use to notify the webhook) like this:
set newnewVar to productivity score of newvar
display alert newvar
Obviously this won't work, because the variable name is two words. I've tried surrounding the name in quotes and with other characters, but nothing seems to work, and the documentation for getting specific properties only has examples with one-word names.
What's the solution to this problem?
In AppleScript, user-defined variables cannot generally have spaces. Typically they start with a letter or underscore, and then can contain only letters, numbers, or underscores. A user-defined variable can only contain spaces if it is contained within vertical pipes. So all of the following are valid variable forms: alphaUnit, slideRow3, _tempItem, |my variable|, left_hand_vector
However, any application or script that creates and uses a scripting definition can create commands and classes and properties that have multi-word names. For instance, if you look at the System Events app, you'll see that the Disk-Folder-File Suite has a class named disk item with properties like creation date. The reason this works is that these multi-word names are actually represented by a numeric (four-char) code: disk item is actually 'ditm' and creation date is 'ascd'. You often see these codes pop up in error strings like so:
"cannot make class ≪ditm≫ into..."
Make sure you have the scope right to invoke the dictionary — i.e. be within a tell block for the app or script that invokes the scripting dictionary — and the multi-word names should 'just work'. After compiling, you'll see them highlighted in a purple color that's just a bit different from the red of uncompiled text. You do not need to enclose dictionary terms in vertical pipes; if you do, they will be treated as user-defined variables and lose their special scripting purposes.

Correlating multiple dynamic values

How can I get the values of important id and ValueType?
I have tried using web_save_param_regexp, but unfortunately I don't fully understand how the function works.
I have also tried using web_save_param (with the help of offset and length).
Unfortunately, once again I cannot get an accurate value, because some values change in length, especially since the total amount values change dynamically per run.
<important id=\"insertsomevalueshere\" record=\"1\" nucTotal=\"NUC609.40\"><total amount=\"68.75\" currency=\"USD\"/><total amount=\"609.40\" currency=\"USD\"/><out avgsomecost=\"540.65\" ValueType=\"insertsomevalueshere\" containsawesomeness=\"1\" Score=\"-97961\" somedatatype=\"1\" typeofData=\"VAL\" web=\"1\">
Put these lines of code before the line of code which does your web request:
web_reg_save_param_regexp("ParamName=importantid","Regexp=<important id=\\\"(.*?)\\\"",LAST);
web_reg_save_param_regexp("ParamName=ValueType","Regexp= ValueType=\\\"(.*?)\\\"",LAST);
You will then have two stored parameters 'importantid' and 'ValueType'
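To see why the non-greedy (.*?) boundaries cope with values of changing length, here is a Python sketch of the same two captures. It is an emulation, not LoadRunner code: correlate() is a hypothetical helper, the sample values abc123 and XYZ are placeholders, and the optional backslash in the patterns is an assumption so that the sketch works whether or not the quotes in the response are backslash-escaped.

```python
import re

# Left boundary, lazy capture, right boundary; a backslash before each quote is optional.
IMPORTANT_ID = re.compile(r'<important id=\\?"(.*?)\\?"')
VALUE_TYPE = re.compile(r' ValueType=\\?"(.*?)\\?"')


def correlate(body):
    """Return (important id, ValueType) captured from a response body."""
    id_match = IMPORTANT_ID.search(body)
    type_match = VALUE_TYPE.search(body)
    if id_match is None or type_match is None:
        raise ValueError("boundary text not found in response")
    return id_match.group(1), type_match.group(1)


# Shortened response in the shape shown in the question, with placeholder values.
sample = r'<important id=\"abc123\" record=\"1\"><total amount=\"68.75\" currency=\"USD\"/><out avgsomecost=\"540.65\" ValueType=\"XYZ\" web=\"1\">'
important_id, value_type = correlate(sample)
```

Because each capture stops at the first closing quote, the length of the value and of anything between the two attributes does not matter.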
Dynamic number of elements to correlate? Your path for resubmission is through web_custom_request(). You will need to build the string you need dynamically with the name:value pairs for all of the data which needs to be included.
This path will place a premium on your string manipulation skills in the language of the tool. The default path is through C, but you have other language options if your skills are more refined in another language.
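As a rough illustration of the string building described above, the following Python sketch collects every total element, however many there are, and joins them into one string. The output layout (amount1=...&currency1=...) is purely hypothetical; the real request body depends on what the application expects, and in a LoadRunner script the equivalent would be written in the script's own language.

```python
import re

# Matches each <total .../> element; a backslash before each quote is optional.
TOTAL = re.compile(r'<total amount=\\?"(.*?)\\?" currency=\\?"(.*?)\\?"')


def build_pairs(body):
    """Join all captured totals into a single name=value string."""
    totals = TOTAL.findall(body)  # list of (amount, currency) tuples
    return "&".join(
        f"amount{i}={amount}&currency{i}={currency}"
        for i, (amount, currency) in enumerate(totals, start=1)
    )


body = '<total amount="68.75" currency="USD"/><total amount="609.40" currency="USD"/>'
pairs_string = build_pairs(body)
```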

confirm conditional statement applies to >0 observations in Stata

This is something that has puzzled me for some time and I have yet to find an answer.
I am in a situation where I am applying a standardized data cleaning process to (supposedly) similarly structured files, one file for each year. I have a statement such as the following:
replace field="Plant" if field=="Plant & Machinery"
This line was written for the year 1 data file. I then generalized the code to loop through the years of data. The problem arises if, in year 3, the analogous value in that variable was coded as "Plant and MachInery ": the line above would not make the intended change because the text string differs, yet it would not raise an error to alert me that the change was not made.
What I am after is some sort of confirmation that >0 observations actually satisfied the condition each time the code is executed in the loop, and an error otherwise. Any combination of trimming, removing spaces, and standardizing the text case is not a workaround option. At the same time, I don't want to add a count if and then an assert statement before every conditional replace, as that becomes quite bulky.
Aside from going to the raw files to ensure the variable values are standardized, is there any way to do this validation "on the fly" as I have tried to describe? Maybe just write a custom program that combines a count if, assert and replace?
The idea has surfaced occasionally that replace should return the number of observations changed, but there are good reasons why it does not, notably that it is not an r-class or e-class command anyway, and it's quite important not to change the way it works because that could break innumerable programs and do-files.
So, I think the essence of any answer is that you have to set up your own monitoring process, counting how many values have been (or would be) changed.
One pattern is -- when working on a current variable:
gen was = .
foreach ... {
    ...
    replace was = current
    replace current = ...
    qui count if was != current
    <use the result>
}
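The "count, then fail if nothing changed" idea can also be shown outside Stata. The Python sketch below is only an analogy for the wrapper program the question suggests: replace_checked() is a hypothetical helper, and rows stored as dictionaries stand in for observations.

```python
def replace_checked(rows, field, old, new):
    """Replace old with new in row[field]; raise if no row matched."""
    changed = 0
    for row in rows:
        if row[field] == old:
            row[field] = new
            changed += 1
    # Fail loudly instead of silently doing nothing.
    if changed == 0:
        raise ValueError(f"no observations with {field} == {old!r}")
    return changed


year1 = [{"field": "Plant & Machinery"}, {"field": "Land"}]
n_changed = replace_checked(year1, "field", "Plant & Machinery", "Plant")
```

Calling it on data where the value is coded as "Plant and MachInery " raises the error rather than passing silently, which is the behaviour being asked for.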

In Yahoo! Pipes how can I return a single value from a loop that's built with values from every item?

For example, I have a list of items and each item has a name. I want to build a single string that contains a comma-separated list of all the names. In most programming languages, I would loop over the items and append to a value outside the list/array. But, I can't figure out any combination of Yahoo! Pipes modules to do it. Maybe I'm missing something obvious, but I also find nothing relevant from Google.
How do I append loop item values to a single value outside the loop?
Or how can I return a single value from a loop that's built with values from every item?
Or what is the correct method to accomplish this in Pipes if it's neither of those?
The best method I've come up with based on help from the Yahoo! group, is to use an Item Builder (item.string = default) --> Loop ( assign all to item.string ). Using another pipe inside the Loop to provide the values to concatenate was also very helpful.
Unfortunately, the modules available in Yahoo! Pipes alone cannot perform the task you are aiming at. The only solution currently available is to use the "web service" module to call an externally hosted script (say, in PHP); the entire pipe content will be sent to the script as the POST field "data". You can code the script so that it loops through all items, appends each name to a single string, and returns that string after processing.
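The loop such an external script would run is short. The sketch below shows it in Python rather than PHP and deliberately leaves out the HTTP handling and the decoding of the "data" field, since the exact payload format is not described here; join_names() and the sample items are hypothetical.

```python
def join_names(items, separator=", "):
    """Build one comma-separated string from the name of every item."""
    return separator.join(item["name"] for item in items)


items = [{"name": "first"}, {"name": "second"}, {"name": "third"}]
names = join_names(items)
```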
