My knowledge of SQL and Oracle does not go beyond querying, I'm afraid, but what I'm looking to do is run some kind of script that produces multiple reports from one execution. I have a whole bunch of code, and at the moment I'm rerunning it about 30 times, each time replacing the following in the WHERE clause with a different "SUBJECT_ID":
SELECT ...
FROM ...
WHERE ...
AND (v.SUBJECT_ID LIKE 'B%')
...
I'm thinking I should be able to run some kind of loop script and output all 30 reports in one go...? I am hoping to get some training on this quite soon, but any help would be greatly appreciated!
Assuming by
each time replacing the following in the WHERE clause with a different "SUBJECT_ID"
you mean v.subject_id like '<some value>', where <some value> is different for each pass through the code, I would do the following if I were you:
Replace the v.subject_id like '<some value>' with v.subject_id like '&&subj_id.' (the . is necessary to mark the end of the parameter name) and save that as its own script, e.g. report.sql.
Then I'd create a new script and do:
define subj_id = <some value>
@@report.sql
define subj_id = <some other value>
@@report.sql
...
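With around 30 values, the driver script itself can be generated rather than typed by hand. The following is a minimal Python sketch, assuming the report lives in report.sql as described above; the function name build_driver_script and the example patterns are hypothetical and only illustrate the idea.

```python
def build_driver_script(patterns, report_script="report.sql"):
    """Return SQL*Plus driver-script text: one DEFINE plus one @@ call per pattern."""
    lines = []
    for pattern in patterns:
        # Set the substitution variable that report.sql reads as '&&subj_id.'
        lines.append(f"define subj_id = {pattern}")
        # @@ runs a script located in the same directory as the calling script
        lines.append(f"@@{report_script}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical patterns; replace with the real SUBJECT_ID patterns.
    print(build_driver_script(["B%", "C%"]), end="")
```

Redirect the printed text to a file and run that file from SQL*Plus in the same directory as report.sql.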
Related
I have several processes with almost the same flow: "Get some parameters, extract data from the database according to them, and upload the data to a target". The parameters and the targets vary slightly across processes, but most of the process is the same. I would like to extract those differences into a parameter context and load them dynamically. My idea is to have the parameters defined in the following way and then use them.
So core of question is:
How can I dynamically choose which parameter group to load and use?
Having several parameter contexts with same-named, different-valued parameters and switching between them dynamically would probably be best, but it is not possible as far as I know.
Duplicating flows is also off the table. Any error correction would be spread over several places, and maintenance would be a nightmare.
Moreover, I know I can do it like this: in GenerateFlowFile for process A set value1=#{A_value1}, and in GenerateFlowFile for process B set value1=#{B_value1}. But this is tedious, error-prone, and scales badly, especially when there are dozens of parameters and several processes. It is also a kind of hardcoding, not configuring...
I was hoping for something like defining group=A and then using it like value1=#{ ${ group:append('_value1') } }, but this does not work: it is evaluated as a parameter literally named ${ group:append('_value1') }.
TL;DR: Use evaluateELString().
The actual solution is to set group=A in the GenerateFlowFile processor and then, in the next UpdateAttribute processor, set the following:
value1=${ group:prepend('hash{ '):append('_value1 }'):replace('hash', '#'):evaluateELString() }
The magic here is: "Take the value of group, wrap it in #{ and _value1 } to make it a valid NiFi Expression Language statement, and then evaluate it." (Note: the word hash and the replace function are there because I didn't manage to escape the # character right before {.)
If you would like to have value1 at the beginning of the statement, you can use the following code. The result is the same. It is easier to use, because the often-changed value value1 sits at the beginning of the statement, but it is less readable in terms of what is really going on.
value1=${ literal('value1'):prepend('_'):prepend(${ group }):prepend('hash{ '):append(' }'):replace('hash', '#'):evaluateELString() }
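The two-step idea (build the reference as text, then evaluate that text) can be modelled outside NiFi. The sketch below is plain Python, not NiFi code: the PARAMETERS dictionary stands in for a parameter context, and all names and values in it are hypothetical.

```python
import re

# Hypothetical parameter context: same-named values prefixed per process.
PARAMETERS = {"A_value1": "alpha", "B_value1": "beta"}


def evaluate_parameter_reference(expression, parameters):
    """Resolve a '#{ name }' style reference against a parameter dict."""
    match = re.fullmatch(r"#\{\s*(\w+)\s*\}", expression)
    if match is None:
        raise ValueError(f"not a parameter reference: {expression!r}")
    return parameters[match.group(1)]


def lookup(group, name, parameters=PARAMETERS):
    # Step 1: build the reference text, e.g. '#{ A_value1 }'.
    expression = "#{ " + group + "_" + name + " }"
    # Step 2: evaluate that text as a reference (the role evaluateELString plays).
    return evaluate_parameter_reference(expression, parameters)
```

Calling lookup("A", "value1") builds the text #{ A_value1 } first and only then resolves it, which mirrors why the nested #{ ${ ... } } form in the question fails while the build-then-evaluate form works.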
I am stuck on a requirement. I have tried a lot of alternatives found through Google, but none of them serves my purpose.
Below is my requirement:
tablename=$1
This is a variable whose value I get from the command line.
My requirement is to create a variable whose name is the value I received in tablename.
I then want to use that variable in a loop.
For Example :
Say,
tablename = dept1
Now my loop will look something like this:
for dept1 in .... do
...
done
Can someone please help me with this?
I want to see the number of affected rows of my target table. For that, I can write a shell script to which I pass a parameter as $PM#numAffectedRows. However, if my target table name is parameterized and I want to pass that to the same shell script, how can I do that?
Eg.
$ParamTgtTable=myTable
When I pass $PM'$ParamTgtTable'#numAffectedRows in the shell script, it echoes myTable#numAffectedRows. If I pass the same without the quotes, as $PM$ParamTgtTable#numAffectedRows, I get $ParamTgtTable#numAffectedRows as my output.
Is there any workaround for this? I appreciate your help.
Pass the two parameters separately, like this:
$ParamTgtTable=myTable
$PM#numAffectedRows=your_count
Now create a third parameter as X=$ParamTgtTable$PM#numAffectedRows
If that doesn't work, try using single quotes (I don't have access to UNIX to test this right now).
I'm trying to use the map clause with Hive but I'm tripping over syntax and not finding many examples of my use case around. I used the map clause before when I had to process one of the columns of a table using an external script.
I had a python script called, say, run, that took one command line parameter and spit out three space separated values. So I just did:
FROM(MAP
tablename.columnName
USING
'run' AS
result1, result2, result3
FROM
tablename
) map_output
INSERT OVERWRITE TABLE results SELECT *;
Now I have a Python script that receives a lot more parameters. I tried a few things that didn't work and couldn't find examples of this. I did the obvious thing:
FROM
(MAP
numAgents, alpha, beta, burnin, nsteps, thin
USING
'runAuthorityMCMC' AS numAgents, alpha, beta, energy, avgDegree, maxDegree, accept
FROM
parameters
) map_output
INSERT OVERWRITE TABLE results SELECT *;
But I got an error A user-supplied transfrom script has exited with error code 2 instead of 0. When I run runAuthorityMCMC, with 6 command line parameters sampled from that table, it works perfectly well.
It seems to me it's trying to run the script without passing the parameters at all. In one of the error messages I got exactly the output I expected if this was the case. What is the correct syntax to do what I'm trying to do?
EDIT:
Confirming - this was part of the error message:
usage: runAuthorityMCMC [-h]
numAgents normalizedBrainCapacity ecologicalPressure
burnInSteps monteCarloSteps thiningRatio
runAuthorityMCMC: error: too few arguments
Which is exactly the output I'd expect with too few arguments. The script should take six arguments.
OK, perhaps there is a difference in vocabulary here, but Hive doesn't pass the values to the script as command-line arguments. The script reads them from standard input, which is not the same thing. You can also try sending the data to /bin/cat to see what is actually being sent to the script. If my memory serves me right, the values are sent tab-separated, and the result emitted by the script is also expected to be tab-separated.
Try printing debug output to stderr from your script (stdout is read by Hive as the script's result rows); you will see it in your JobTracker logs, which will help you debug.
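To make the stdin point concrete, here is a minimal Python sketch of a transform script, assuming (as recalled above) that each row arrives as one tab-separated line and that output rows are expected in the same form. The function names and the placeholder computation are hypothetical; a real script would run its own computation on the fields.

```python
import sys


def transform_row(line):
    """Split one tab-separated input row and return one tab-separated output row."""
    fields = line.rstrip("\n").split("\t")
    # Placeholder: echo the fields back and append the field count.
    # A real script would run its computation on `fields` here.
    return "\t".join(fields + [str(len(fields))])


def main(stdin=sys.stdin, stdout=sys.stdout, stderr=sys.stderr):
    for line in stdin:
        # Debug output goes to stderr so it does not end up in the result rows.
        print(f"received: {line!r}", file=stderr)
        print(transform_row(line), file=stdout)


if __name__ == "__main__":
    main()
```

A script that insists on command-line arguments, like the one in the question, would need a wrapper of this shape that reads each row from stdin and hands the fields to the existing code.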
I often have a task a bit like this: insert a large number of users into the users table with similar properties. It is not always that simple, but in general it is: list of strings -> list of corresponding SQL statements.
My usual solution is this: with the list of usernames in Excel, use a formula to generate a load of insert statements
=concatenate("insert into users values(username .......'",A1,"'.....
and then I fill down the formula to get all the insert rows.
This works, but sometimes the statement is long, sometimes it includes a few different steps for each user, and cramming it all into an Excel formula and getting all the wrapping quotes right is a pain.
I'm wondering if there is a better way. What I really want is to be able to have a template file, template.txt:
insert into users
([username],
[company] ...
)
values('<template tag1>...
and then using some magic command line tool, to simply be able to type something like
command_line> make_big_file_using_template template.txt /values [username1 username2]
/output: bigfile.txt
and this gives me a big file with the template repeated for each username value, with the tag replaced by the username.
So does such a command exist, or are my expectations of command-line tools too high? Any freely available Windows tool will do. I could whip up a C# program to do this in not too much time, but I feel like there must be an easy-to-use tool out there already.
This is trivial using a PowerShell script. PowerShell allows inline variables in strings, so you could do something like:
$Tag1 = 'blah'
$Tag2 = 'foo'
$SQLHS = @"
INSERT INTO users
([username],
[company],...)
VALUES
('$tag1', '$tag2'...)
"@
set-content 'C:\Mynewfile.txt' -value $SQLHS
The @"...."@ is a here-string, which makes it very easy to write readable code without escaping quotes and such.
The above could very easily be modified to accept parameters for the various tags and another for the output file, or to run for a set of values read from another .txt or .csv file.
EDIT:
To modify it to accept parameters, you can just add a param() block at top:
param($outfile, $tag1, $tag2, $tag3)
Then use those $variables in your script:
set-content "$outfile" -value $SQLHS
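For comparison, the "repeat a template for each value" step asked about in the question can also be sketched in Python with the standard library's string.Template. This is an alternative to the PowerShell approach, not part of it; the template text, the company value, and the usernames below are hypothetical.

```python
from string import Template

# Hypothetical template; $username is the tag that gets replaced.
TEMPLATE = Template(
    "insert into users ([username], [company])\n"
    "values ('$username', 'ExampleCo');\n"
)


def render_all(template, usernames):
    """Repeat the template once per username and return the combined text."""
    # Doubling single quotes keeps a name like O'Brien from breaking the SQL literal.
    return "".join(
        template.substitute(username=name.replace("'", "''")) for name in usernames
    )


if __name__ == "__main__":
    print(render_all(TEMPLATE, ["alice", "bob"]), end="")
```

Reading the template from template.txt and the names from a second file would be a small extension of the same loop.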