NiFi: how to use a file filter for fetching files from Hadoop? [closed] - hadoop

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I want to fetch files from a Hadoop directory based on their filename. Logically the pattern looks like ${filename}.* (I have several files with similar names, such as 2011-01-01.1, 2011-01-01.2, etc.). I tried ListHDFS + FetchHDFS, but I couldn't make them match this logic.
Can you give me a better idea of how to do this inside the NiFi environment?
Is it possible to do this with Groovy code inside an ExecuteScript processor?
How can I connect to an HDFS directory from Groovy code?
After getting these files I need to put them in a flowfile list, and I must not transfer the flowfiles until the list size matches the value of the count attribute (set on the flowfile).

Salome,
ListHDFS lists every file present in the HDFS directory.
Afterwards you can use RouteOnAttribute with the expression below to keep only the files whose names match the pattern:
${filename:matches('\d{4}-\d{2}-\d{2}\.\d+')}
Flowfiles whose filename matches go to the matched route (the dot is escaped so it is taken literally, and \d+ also allows multi-digit suffixes such as .10).
Next, connect the matched route of RouteOnAttribute to FetchHDFS.
Only files whose names match the pattern \d{4}-\d{2}-\d{2}\.\d+ reach the fetch step,
so it will fetch your required files only.
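To check the pattern outside NiFi, here is a minimal Python sketch of the same filter. It is not NiFi code: it assumes that `matches` requires the whole filename to match (mirrored here with `re.fullmatch`), and the sample file names and the `select_matching` helper are made up for illustration.

```python
import re

# Date-like name followed by a literal dot and a numeric suffix, e.g. 2011-01-01.1
PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}\.\d+")


def select_matching(filenames):
    """Return only the names that match PATTERN in full."""
    return [name for name in filenames if PATTERN.fullmatch(name)]


# Hypothetical directory listing, for illustration only.
listing = ["2011-01-01.1", "2011-01-01.2", "2011-01-01x1", "readme.txt", "2011-01-01.10"]
matched = select_matching(listing)
print(matched)
```

With an unescaped dot, a name like 2011-01-01x1 would also pass, which is why the dot is escaped in the pattern.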

Related

bash: Parse multiple arrays from multiple config files and write to one file [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 9 days ago.
I have multiple arrays like the ones below spread across multiple config files (abc1.conf, abc2.conf, etc.)
which I want to consolidate into one file. So essentially: take all CALCS and LOOPS from all config files and generate one config file containing all the arrays. Can anyone help out here?
ex. abc1.conf contains the below
CALCS=( SUM CASES VALUE )
LOOPS=( SERVICE INSERT )
Edit: filter out any duplicates encountered in CALCS and LOOPS, so that only unique values are saved.
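One possible approach, sketched in Python rather than bash: read every `NAME=( ... )` line, append values to a per-name list, and skip values already seen. This is only a sketch under assumptions not stated in the question: each array sits on a single line, values contain no spaces, and the contents of abc2.conf below are invented for the demo.

```python
import re

# Matches lines such as: CALCS=( SUM CASES VALUE )
ARRAY_LINE = re.compile(r"^\s*(\w+)=\(\s*(.*?)\s*\)\s*$")


def merge_arrays(config_texts):
    """Merge same-named arrays from several config texts, keeping first-seen order."""
    merged = {}
    for text in config_texts:
        for line in text.splitlines():
            match = ARRAY_LINE.match(line)
            if not match:
                continue  # ignore anything that is not a one-line array
            name, body = match.groups()
            values = merged.setdefault(name, [])
            for item in body.split():
                if item not in values:  # drop duplicates
                    values.append(item)
    return merged


def render(merged):
    """Format the merged arrays back into the NAME=( ... ) syntax."""
    return "\n".join(f"{name}=( {' '.join(values)} )" for name, values in merged.items())


abc1 = "CALCS=( SUM CASES VALUE )\nLOOPS=( SERVICE INSERT )\n"
abc2 = "CALCS=( SUM AVG )\nLOOPS=( INSERT DELETE )\n"  # hypothetical second file
output = render(merge_arrays([abc1, abc2]))
print(output)
```

In practice the texts would come from reading the real abc*.conf files, and `output` would be written to the consolidated config file.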

Create Batch Script to Scan Multiple Files [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 months ago.
I'm looking to make a batch script. What I have is a ton of different logs in a ton of different places, all with a bit of important info and a lot of useless info.
Some of these logs are directly named, and some are named by date and time.
I'm trying to make a script that can basically scan them for me so I don't have to go through each folder and log and scan manually.
Some examples:
OmegaManager\logs\omega.log - scan for "State:" and "has been updated"
OmegaManager\logs\instance-0.log - scan for "shutdown"
KRBanking\PlayerDatabase*logs are numbers* - copy all information or scan for changes since last time?
KRBanking\ServerLogs*logs are dates and times* - scan for "Deposited" and "Withdrawed"
And then output the whole line.
Is this even moderately possible? Thanks in advance.
Yes, this is possible. Batch files support loops, conditions, and filtering/searching.
A FOR loop allows you to iterate through files or directories; you'll find more information in this SO post.
To find strings in a file, you have several options, i.e. find and findstr, see riptutorial.com:
FIND can scan large files line-by-line to find a certain string with no wildcard support.
FINDSTR has more features and supports regular expressions with wildcards in the search string.
Super simple example for one of your use cases
The batch script must be placed where your files are. Otherwise, you need to add a full path instead of just "omega.log".
FINDSTR /L /C:"State:" /C:"has been updated" omega.log
To start with batch programming, you can read the findstr documentation and some tutorials.
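For comparison, the same literal, case-sensitive search can be sketched in Python. This is an illustration, not part of the batch solution: the `scan_file` helper and the log contents are made up, and the demo writes a throwaway omega.log so the snippet runs on its own.

```python
import tempfile
from pathlib import Path


def scan_file(path, needles):
    """Return every line of `path` that contains at least one of `needles`."""
    hits = []
    with open(path, "r", errors="replace") as handle:
        for line in handle:
            if any(needle in line for needle in needles):
                hits.append(line.rstrip("\n"))
    return hits


# Demo on a temporary log with invented content.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "omega.log"
    log.write_text("State: running\nnothing to see\nconfig has been updated\n")
    hits = scan_file(log, ["State:", "has been updated"])

print(hits)
```

Calling `scan_file` once per log with its own list of search strings covers the other cases from the question, such as "shutdown" or "Deposited".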

How to extract a specific list of files from a folder in Windows? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I have a folder of around 3,000 music files of all the same type (.flac).
I made an excel (and .txt) list of around 1,000 files in that folder that I want to move to a different folder.
Is there a way to accomplish this without having to manually move each file by referencing the list?
Thank you!
First make a backup of everything!
Yes this is indeed possible, I wrote a little python script for you:
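import os

# Read the list of file names, one per line, skipping blank lines.
with open("whichtomove.txt", "r") as f:
    filelist = [line.strip() for line in f if line.strip()]

# Move each listed file from the source folder to the target folder.
for name in filelist:
    os.rename(os.path.join("fromhere", name), os.path.join("tohere", name))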
You just have to change the folder paths and I assumed the list is in this format:
file1.flac
tihs.flac
If your list uses a different format, you just have to change how the entries are split, e.g. split on ";" if the list entries are separated by ';'.

Scanning for a child directory and then changing into that directory? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
I have a large program in Ruby that's being distributed over several projects. The only problem is the other several projects all have separate members on them that have their own way of coding, and the program is set up to be alongside a specific path, which was for the first project that I used it w/. I've had so many errors to correct that were simply mistaken paths. What I want to do now is scan an entire project for an individual directory (as the program's overhead directory is constant in every instance of its usage) and then set the path to that directory. To keep things simplistic, let's say it's w/ a Rails project, so Rails.root can be the overhead, and the directory to search for is myawesomedir. Any help is greatly appreciated!
You can use the find stdlib:
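require 'find'

# Walk the project tree and change into the first directory named 'myawesomedir'.
Find.find(Rails.root.to_s) do |path|
  if File.directory?(path) && File.basename(path) == 'myawesomedir'
    Dir.chdir(path)
    break
  end
end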

How to convert .nii to .nii.gz file? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
I have lots of .nii files. I want to know how to convert a .nii file to a .nii.gz file.
Thanks
As far as I know, there is nothing special about zipping NIfTI files. In MATLAB, you could simply do:
gzip('niftifilename.nii') % creates niftifilename.nii.gz next to the original
gzip('*.nii') % compresses every .nii file in the current folder, one .nii.gz per file
To work with the file again, you can unzip it, using gunzip. I've tried this on my Mac (don't know if this will work on Windows).
Typically, they are volume data, and hence take up a fair bit of disk space. Zipping it is purely for reducing the size of the file, and should not modify data.
You can simply do:
gzip({'*.nii'},outputdir)
This compresses each of your .nii files into its own .nii.gz file and places the results in outputdir.
From the documentation:
To gzip all .m and .mat files in the current directory and store the
results in the directory archive, type:
gzip({'*.m','*.mat'},'archive')
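If MATLAB is not available, the same per-file compression can be sketched with Python's standard library. This is an illustrative sketch: the `gzip_nii_files` helper, the folder names, and the dummy file contents are assumptions made for the demo, and the originals are left in place.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def gzip_nii_files(src_dir, out_dir):
    """Compress each *.nii in src_dir to out_dir/<name>.nii.gz, keeping the originals."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for src in sorted(Path(src_dir).glob("*.nii")):
        dst = out_dir / (src.name + ".gz")
        with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
            shutil.copyfileobj(fin, fout)  # stream the data, no full read into memory
        created.append(dst.name)
    return created


# Demo with two dummy files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "scans"
    src.mkdir()
    (src / "a.nii").write_bytes(b"\x00" * 1024)
    (src / "b.nii").write_bytes(b"\x01" * 1024)
    created = gzip_nii_files(src, Path(tmp) / "archive")
    with gzip.open(Path(tmp) / "archive" / "a.nii.gz", "rb") as f:
        roundtrip = f.read()

print(created)
```

Reading a compressed file back with `gzip.open` returns the original bytes, so the image data itself is not modified.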
