How to save a filename in a variable in Nifi? - apache-nifi

I'm new to NiFi and I'm trying to get a file name and save it in a variable to be used later in the flow.
Basically I have a file (data_yyyyMMdd.tar.gz) which contains two .txt files (1.txt and 2.txt). Before unpacking this file, I want to save its name to a variable and then use this variable to add content to the unpacked files.
Original content of the files:
1.txt
id|name
1|apple
2|orange
Content of the files after being updated with the filename:
id|name|filename
1|apple|data_yyyyMMdd.tar.gz
2|orange|data_yyyyMMdd.tar.gz
I managed to unpack the file successfully, but I'm not able to save the .tar.gz filename in a variable and add its value to the content of each file.
Could you guys help me?

Depending on what processor you used to get the tar.gz file, you likely already have a FlowFile attribute called filename set to the name of the tar.gz file. After unpacking you may find that the filename attribute is overwritten (not sure though), so before unpacking, copy the filename attribute into some other attribute using UpdateAttribute. For example you can add a property in UpdateAttribute named original.filename and set its value to ${filename}.
After unpacking you can use UpdateRecord to add the original filename as a field in each record, I think by setting the Replacement Value Strategy to Literal Value and adding a property /filename set to ${original.filename}. I haven't tried this so I don't know if these are exactly the right settings, but the approach should work.

Related

How to read data from csv file in jmeter whose location is not fixed and might change in the future

I have a CSV file whose location is going to change later. Using JMeter, how can I still read the file even if the location changes?
You can set the location of the CSV file as a JMeter property and pass it on the command line or through the user.properties file.
1. Set the file name with a property in the CSV Data Set Config element
${__P(full-path-to-file,/Users/hansi/Documents/test-data/test-data-users.csv)}
Note: the second argument, /Users/hansi/Documents/test-data/test-data-users.csv, is the default value used when the property is not defined.
2. Define the value
2.1 In the user.properties file
full-path-to-file=/Users/hansi/Documents/test-data/test-data-users.csv
or
2.2 Set the property when the JMeter test is executed from the command line
./jmeter.sh -n -t test-plan.jmx -Jfull-path-to-file="/Users/hansi/Documents/test-data/test-data-users.csv"
If the CSV file is going to change location, you could store the test plan that uses it alongside the file, so that the two move together. In JMeter, relative file names are resolved with respect to the path of the active test plan (according to the official documentation), so you only need to specify the name of the file in the CSV Data Set Config Filename property.

Search the File Pattern from File Name

I have a file Pattern_File.txt which stores lines like the ones below:
ABC|ABC_[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9].dat|8|,|70|NAME
ABC|ABC_[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9].dat|9|,|70|PLACE
XYZ|XYZ_[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9].dat|23|,|70|SSN
XYZ|XYZ_[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9].dat|33|,|70|DOB
MNO|MNO_SUMMIT.dat|40|,|70|ADDRESS
MNO|MNO_SUMMIT.dat|5|,|70|COUNTRY
So Pattern_File.txt stores some information about the actual files, but when a file name contains a date, it is stored as a pattern rather than as the actual name.
I need a command to which I can pass an actual file name such as "ABC_20200408.dat" and which returns all the related lines from this file. Can someone please help?
The command below works, but with it I have to try each pattern one by one to see which one matches.
echo "ABC_20200408.dat"|grep ABC_[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9].dat
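One way to avoid trying each pattern by hand is to loop over the pattern file and test the file name against the second field of every line. The following is only a minimal sketch, assuming the patterns are plain shell globs as in the sample lines and that the fields are always separated by |. The function name match_patterns and the mp_ variable names are made up for illustration.

```shell
# Print every line of a pipe-delimited pattern file whose second field,
# read as a shell glob pattern, matches the given file name.
# match_patterns is a hypothetical helper name, not an existing command.
match_patterns() {
    mp_name=$1
    mp_file=$2
    while IFS= read -r mp_line; do
        # Isolate field 2: drop everything up to the first '|',
        # then everything from the next '|' onwards.
        mp_rest=${mp_line#*|}
        mp_pattern=${mp_rest%%|*}
        # An unquoted variable used as a case pattern is matched as a glob.
        case $mp_name in
            $mp_pattern) printf '%s\n' "$mp_line" ;;
        esac
    done < "$mp_file"
}
```

With the sample file from the question, match_patterns "ABC_20200408.dat" Pattern_File.txt should print the two ABC lines (NAME and PLACE) and nothing else.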

How to get the full name of a folder if I only know the beginning

I am receiving an input from the user which looks like follows:
echo +++Your input:+++
read USER_INPUT
I need to use it to retrieve the full name of a folder whose name starts with that input but contains other characters right after it. All I know is that the folder is unique.
For example:
User input
123456
Target folder
/somepath/someotherpath/123456-111-222
What I need
MYNEED=123456-111-222
I was thinking of retrieving this with MYNEED=$(ls /somepath/someotherpath/$USER_INPUT*), but if I do this I get the content of /somepath/someotherpath/123456-111-222 instead, because that is the only folder matching the name, so ls lists what is inside it.
How can I retrieve the value 123456-111-222 into a variable that I will need to use afterwards?
basename extracts the filename from the whole path so this will do it:
MYNEED=$(basename /somepath/someotherpath/123456*)
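If you want to plug in the value read from the user and also cope with the case where nothing matches, the same idea can be wrapped in a small function. This is a sketch under assumptions: find_dir_name and the fd_ variable names are hypothetical, and only the first matching directory is returned, since the question says the folder is unique.

```shell
# Print the name of the first directory under a base path whose name
# starts with the given prefix. Returns non-zero if there is no match.
# find_dir_name is a hypothetical helper name.
find_dir_name() {
    fd_base=$1
    fd_prefix=$2
    # The trailing slash makes the glob match directories only.
    for fd_path in "$fd_base/$fd_prefix"*/; do
        # If nothing matched, the glob stays literal and this test fails.
        [ -d "$fd_path" ] || continue
        # basename ignores the trailing slash and keeps the last component.
        basename "$fd_path"
        return 0
    done
    return 1
}
```

It could then be called as MYNEED=$(find_dir_name /somepath/someotherpath "$USER_INPUT"), checking the exit status before using MYNEED.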

SQLLDR file path argument

I have more than 30 data files to load.
The path in those files changes at every run, for example:
INFILE "/home/dmf/Cycle7Data/ITEM_IMAGE.csv"
INFILE "/home/dmf/Cycle8Data/ITEM_IMAGE.csv"
The file name differs in every control file (e.g. SUPPLIER.csv).
Is there any way to pass the file path in a variable, or to set an environment variable,
so that the control files do not have to be edited every time?
You can pass the data file name on the command line; from the documentation:
DATA specifies the name of the data file containing the data to be loaded. If you do not specify a file extension or file type, then the default is .dat.
If you specify a data file on the command line and also specify data files in the control file with INFILE, then the data specified on the command line is processed first. The first data file specified in the control file is ignored. All other data files specified in the control file are processed.
So pass the relevant file name with each invocation, e.g.
sqlldr user/passwd control=myfile.ctl data=/home/dmf/Cycle7Data/ITEM_IMAGE.csv
If you have lots of files to load from a directory you could have a shell script that loops over the directory contents and passes each file name in turn to an SQL*Loader session.
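Such a loop might look roughly like the sketch below. It rests on assumptions that are not in the question: that every data file FOO.csv has a control file named FOO.ctl in a separate directory, and that the user/passwd placeholder from the example above is replaced with real credentials. The function name load_cycle and the SQLLDR override variable are made up here, the latter only so the loop can be dry-run with echo.

```shell
# Run one SQL*Loader session per CSV file in a data directory.
# Assumes (hypothetically) that FOO.csv is loaded with FOO.ctl from ctl_dir.
# Set SQLLDR=echo to print the commands instead of running sqlldr.
load_cycle() {
    lc_data_dir=$1
    lc_ctl_dir=$2
    for lc_data_file in "$lc_data_dir"/*.csv; do
        # Skip the literal pattern if the directory has no .csv files.
        [ -f "$lc_data_file" ] || continue
        # Strip the directory and the .csv suffix to get the base name.
        lc_name=$(basename "$lc_data_file" .csv)
        "${SQLLDR:-sqlldr}" user/passwd \
            control="$lc_ctl_dir/$lc_name.ctl" \
            data="$lc_data_file"
    done
}
```

A run for one cycle would then be something like load_cycle /home/dmf/Cycle7Data /path/to/ctl, where the second path is wherever the control files live.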

How can I specify the file location to write and read from in Ruby?

So, I have a function that creates an object specifying user data. Then, using the Ruby YAML gem and some code, I put the object to a YAML file and save it. This saves the YAML file to the location where the Ruby script was run from. How can I tell it to save to a certain file directory? (A simplified version of) my code is this
print "Please tell me your name: "
$name=gets.chomp
$name.capitalize!
print "Please type in a four-digit PIN number: "
$pin=gets.chomp
I also have a function that enforces that the pin be a four-digit integer, but that is not important.
Then, I add this to an object
new_user=Hash.new (false)
new_user["name"]=$name
new_user["pin"]=$pin
and then add it to a YAML file and save it. If the YAML file doesn't exist, one is created. It creates it in the same file directory as the script is run in. Is there a way to change the save location?
The script to save the object to a YAML file is this.
def put_to_yaml (new_user)
File.write("#{new_user["name"]}.yaml", new_user.to_yaml)
end
put_to_yaml(new_user)
Ultimately, the question is this: How can I change the save location of the file? And when I load it again, how can I tell it where to get the file from?
Thanks for any help
Currently when you use File.write it takes your current working directory, and appends the file name to that location. Try:
puts Dir.pwd # Will print the location you ran ruby script from.
You can specify the absolute path if you want to write it in a specific location every time:
File.write("/home/chameleon/different_location/#{new_user["name"]}.yaml", new_user.to_yaml)
Or you can specify a relative path to your current working directory:
# write one level above your current working directory
File.write("../#{new_user["name"]}.yaml", new_user.to_yaml)
You can also specify a path relative to the Ruby file that is currently executing:
file_name = "#{new_user["name"]}.yaml"
file_path = File.expand_path(File.dirname(__FILE__))
absolute_path = File.join(file_path, file_name)
File.write(absolute_path, new_user.to_yaml)
You are supplying a partial pathname (a mere file name), so we read and write from the current directory. Thus you have two choices:
Supply a full absolute pathname (personally, I like to use the Pathname class for this); or
Change the current directory first (with Dir.chdir)
