Get directory name with grep and remove it - bash

Is there a simple way to get only the NAME column from the lines where DATE is more than 5 days ago, and then call another command, rm, on each of those lines with NAME as the argument?
I have the following output from mega-ls path/ -l (mega.nz) command:
FLAGS VERS SIZE DATE NAME
d--- - - 06Feb2020 05:00:01 bk_20200206050000
d--- - - 07Feb2020 05:00:01 bk_20200207050000
d--- - - 08Feb2020 05:00:01 bk_20200208050000
d--- - - 09Feb2020 05:00:01 bk_20200209050000
d--- - - 10Feb2020 05:00:01 bk_20200210050000
d--- - - 11Feb2020 05:00:01 bk_20200211050000
I tried grep, sort and other ways e.g. mega-ls path/ -l | head -n 5 but I don't know how to search these lines based on the date.
Thank you a lot.

I tried to find a simple way to do what you asked ;)
mega-ls path/ -l | head -n 5 | tr -s ' ' | cut -d ' ' -f6 | grep -v -e '^$' | grep '^bk_20200206.*' | xargs rm -f
Part 1: This is your command (it returns the folder list together with the extra columns)
mega-ls path/ -l | head -n 5
Part 2: Squeeze the repeated spaces in the output of part 1
tr -s ' '
Part 3: Use cut to split the output of part 2 on spaces and keep only the NAME column
cut -d ' ' -f6
Part 4: Remove the empty line that part 3 produces for the header line
grep -v -e '^$'
Part 5: Select the folder names by date in yyyymmdd format, for example 20200206 (replace 20200206 with the date you need)
grep '^bk_20200206.*'
Part 6 (very important!): Add this part only if you really want to delete the matching folders
xargs rm -f
Best Regards
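The pipeline above selects one fixed date. A minimal sketch of the "older than 5 days" filter the question asks for is below. It assumes that the timestamp embedded in each folder name (bk_YYYYMMDDhhmmss) matches the DATE column, and that the cutoff is passed in as YYYYMMDD. The function name old_backups is made up for illustration, `date -d` is GNU date syntax, and the commented mega-rm call assumes MEGAcmd's remote rm command; note that a plain rm acts on local files only.

```shell
# Print the names of bk_YYYYMMDDhhmmss folders dated before the cutoff.
# $1: cutoff as YYYYMMDD; a listing in `mega-ls -l` format is read from stdin.
old_backups() {
    awk -v cutoff="$1" '
        # skip the header line, keep only names that look like backups
        NR > 1 && $NF ~ /^bk_[0-9]+$/ && substr($NF, 4, 8) + 0 < cutoff + 0 {
            print $NF
        }'
}

# Usage sketch (not executed here; assumes GNU date and MEGAcmd):
#   cutoff=$(date -d '5 days ago' +%Y%m%d)
#   mega-ls path/ -l | old_backups "$cutoff" |
#       while IFS= read -r name; do echo mega-rm -r "path/$name"; done
# Remove the echo once the printed commands look right.
```

Comparing the date inside the name avoids parsing the 06Feb2020 format of the DATE column.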

Related

Bash: concatenated variables derived from text file using grep gives confused output

In my directory, I have multiple nifti files (e.g., WIP944_mp2rage-0.75iso_TR5.nii) from my MRI scanner, accompanied by text files (e.g., WIP944_mp2rage-0.75iso_TR5_info.txt) containing information on the acquisition parameters (e.g., "Series description: WIP944_mp2rage-0.75iso_TR5_INV1_PHS_ND"). Based on these parameters (e.g., INV1_PHS_ND), I need to change the names of the nifti files, which are stored in $niftibase. I used grep to do this. When I echo the variables individually, I get what I want, but when I try to concatenate them into one filename, the variables are mixed together instead of being delimited by dots.
I tried multiple forms of sed to cut away potentially invisible characters and identified the source of the problems: the "INV1_PHS_ND" part of 'series description' gives me troubles, which is the $struct component, potentially due to the fact that this part varies in how many fields are extracted. Sometimes this is 3 (in the case of INV1_PHS_ND), but it can be 2 as well (INV1_ND). When I introduce this variable into the filename, everything goes haywire.
for infofile in ${PWD}/*.txt; do
# General characteristics of subjects (i.e., date of session, group number, and subject number)
reco=$(grep -A0 "Series description:" ${infofile} | cut -d ' ' -f 3 | cut -d '_' -f 1)
date=$(grep -A0 "Series date:" ${infofile} | cut -c 16-21)
group=$(grep -A0 "Subject:" ${infofile} | cut -d '^' -f 2 | cut -d '_' -f 1 )
number=$(grep -A0 "Subject:" ${infofile} | cut -d '^' -f 2 | cut -d '_' -f 2)
ScanNr=$(grep -A0 "Series number:" ${infofile} | cut -d ' ' -f 3)
# Change name if reco has structural prefix
if [[ $reco = *WIP944* ]]; then
struct=$(grep -A0 "Series description: WIP944" ${infofile} | cut -d '_' -f 4,5,6)
niftibase=$(basename $infofile _info.txt).nii
#echo ${subStudy}.struct.${date}.${group}.${protocol}.${paradigm}.nii
echo ${subStudy}.struct.${struct}.${date}.${group}.${protocol}${number}.${paradigm}.n${ScanNr}.nii
#mv ${niftibase} ${subStudy}.struct.${struct}.${date}.${group}.${protocol}${number}.${paradigm}.n${ScanNr}.nii
fi
done
This gives me output like this:
.niit47.n4lot.Noc002
.niit47.n5lot.Noc002D
.niit47.n6lot.Noc002
.niit47.n8lot.Noc002
.niit47.n9lot.Noc002
.niit47.n10ot.Noc002
.niit47.n11ot.Noc002D
for all 7 WIP944 files. However, it needs to be in the direction of this:
H1.struct.INV2_PHS_ND.190523.Pilot.Noc001.Heat47.n11.nii, where H1, Noc, and Heat47 are loaded in from a setup file.
EDIT: I tried to use awk in the following way:
reco=$(awk 'FNR==8 {print;exit}' $infofile | cut -d ' ' -f 3 | cut -d '_' -f 1)
date=$(awk 'FNR==2 {print;exit}' $infofile | cut -c 15-21)
group=$(awk 'FNR==6 {print;exit}' $infofile | cut -d '^' -f 2 | cut -d '_' -f 1 )
number=$(awk 'FNR==6 {print;exit}' $infofile | cut -d '^' -f 2 | cut -d '_' -f 2)
ScanNr=$(awk 'FNR==14 {print;exit}' $infofile | cut -d ' ' -f 3)
which again gave me the correct output when echoing the variables individually, but not when I tried to combine them: .niit47.n11022_PHS_ND.
I used echo "$struct" | tr -dc '[:print:]' | od -c to see if there were hidden characters due to line endings, which resulted in:
0000000 I N V 2 _ P H S _ N D
0000013
EDIT: This is what the text file looks like:
Series UID: 1.3.12.2.1107.5.2.34.18923.2019052316005066316714852.0.0.0
Study date: 20190523
Study time: 153529.718000
Series date: 20190523
Series time: 160111.750000
Subject: MDC-0153,pilot_003^pilot_003
Subject birth date: 19970226
Series description: WIP944_mp2rage-0.75iso_TR5_INV1_PHS_ND
Image type: ORIGINAL\PRIMARY\P\ND
Manufacturer: SIEMENS
Model name: Investigational_Device_7T
Software version: syngo MR B17
Study id: 1
Series number: 5
Repetition time (ms): 5000
Echo time[1] (ms): 2.51
Inversion time (ms): 900
Flip angle: 7
Number of averages: 1
Slice thickness (mm): 0.75
Slice spacing (mm):
Image columns: 320
Image rows: 320
Phase encoding direction: ROW
Voxel size x (mm): 0.75
Voxel size y (mm): 0.75
Number of volumes: 1
Number of slices: 240
Number of files: 240
Number of frames: 0
Slice duration (ms) : 0
Orientation: sag
PixelBandwidth: 248
I have one of these for each nifti file. subStudy is hardcoded in a setup file, which is loaded in prior to running the for loop. When I echo this, it shows the correct value. I need to change the names of multiple files with a specific prefix, which are stored in $reco.
As confirmed in comments, the input files have DOS carriage returns, which are basically invalid in Unix files. Also, you should pay attention to proper quoting.
As a general overhaul, I would recommend replacing the entire Bash script with a simple Awk script, which is both simpler and more idiomatic.
for infofile in ./*.txt; do    # no need to use ${PWD}
    # Pre-filter with a simple grep: skip anything that is not a WIP944 series
    grep -q '^Series description: [^ _]*WIP944' "$infofile" || continue
    # Still here? Means we want to rename
    suffix="$(awk -F : '
        BEGIN { n = split("Series description:Series date:Subject:Series number", k, /:/)
                for (i = 1; i <= n; i++) f[k[i]] = 1 }    # wanted keys as array indices
        { sub(/\r$/, "") }    # get rid of pesky DOS carriage return
        $1 in f { x[$1] = substr($0, length($1) + 3) }    # value after "key: "
        END {
            n = split(x["Series description"], t, /_/)
            struct = t[4]    # fields 4-6, of which only 2 or 3 may exist
            for (i = 5; i <= n && i <= 6; i++) struct = struct "_" t[i]
            date = substr(x["Series date"], 3, 6)    # 20190523 -> 190523
            split(x["Subject"], t, /\^/); split(t[2], tt, /_/)
            group = tt[1]; number = tt[2]
            ScanNr = x["Series number"]
            ### FIXME: protocol and paradigm are still undefined
            print struct "." date "." group "." protocol number "." paradigm ".n" ScanNr ".nii"
        }' "$infofile")"
    echo mv "${infofile%_info.txt}.nii" "$subStudy.struct.$suffix"
done
This probably still requires some tweaking (at least "protocol" and "paradigm" are still undefined). Once it seems to print the correct values, you can remove the echo before mv and have it actually rename files for you.
(Probably still better test on a copy of your real data files first!)
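The carriage-return problem can be reproduced in isolation. The sketch below builds a value with a trailing carriage return, as a field cut from a DOS-formatted file would have, and shows one way to strip it. The names struct, H1 and 190523 come from the question; the variable names cr, clean and name are only for illustration. Note that the check in the question piped the value through tr -dc '[:print:]' before od, which deletes the carriage return and so hides the very character being looked for.

```shell
# A value as it would come out of a DOS-formatted info file: it ends in a carriage return.
cr=$(printf '\r')
struct="INV1_PHS_ND$cr"

# Without tr -dc '[:print:]' in front of it, od shows the trailing \r.
printf '%s' "$struct" | od -c

# Strip the carriage return before building the file name.
clean=$(printf '%s' "$struct" | tr -d '\r')
name="H1.struct.${clean}.190523.nii"
echo "$name"
```

When the value still contains the carriage return, the terminal moves the cursor back to the start of the line, so the text printed after it overwrites the beginning of the name. That is what produces output like .niit47.n4lot.Noc002.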

Run script on all files in dir sharing a common id

I have some files in a dir:
A573R25.file_1.txt
A573R25.file_2.txt
A573R25.file_3.txt
A573R27.file_1.txt
A573R27.file_2.txt
A573R29.file_1.txt
A573R29.file_2.txt
A573R29.file_3.txt
A573R31.file_1.txt
A573R31.file_2.txt
A573R31.file_3.txt
A573R33.file_1.txt
A573R33.file_2.txt
A573R33.file_3.txt
I want to run a script on all files sharing a common id (but with varying text separating the id (e.g. A573R25) and .txt). For example:
perl my_script.pl A573R25*.txt
However, I want to do this for all files in the dir in a bash script.
Here's what I've tried:
samples+=$(ls -1 *.txt | cut -d '.' -f 1)
for ((i=0;i<${#samples[@]};++i))
do
ls -1 ${samples[i]}*.txt
done
But in each case I get (e.g.):
ls: A573R25: No such file or directory
My expected output for the first id is:
A573R25.file_1.txt
A573R25.file_2.txt
A573R25.file_3.txt
What am I doing wrong?
You need a sort -u when collecting the samples, and the assignment has to be an array assignment (note the parentheses):
samples+=( $( ls -1 *.txt | cut -d '.' -f 1 | sort -u ) )
Here is full code and results:
$ unset samples
$ samples+=( $(ls -1 *.txt | cut -d '.' -f 1 | sort -u ) )
$ for ((i=0;i<${#samples[@]};++i)); do ls -1 ${samples[i]}*.txt; done
A573R25.file_1.txt
A573R25.file_2.txt
A573R25.file_3.txt
A573R27.file_1.txt
A573R27.file_2.txt
A573R29.file_1.txt
A573R29.file_2.txt
A573R29.file_3.txt
A573R31.file_1.txt
A573R31.file_2.txt
A573R31.file_3.txt
A573R33.file_1.txt
A573R33.file_2.txt
A573R33.file_3.txt
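The same grouping can be sketched without parsing the output of ls, by letting the shell expand the glob and cutting each name at its first dot. The function name list_ids is made up for illustration, and the sketch assumes the id is everything before the first dot, as in the file names above.

```shell
# Print each distinct id (the text before the first dot) among the .txt files in directory $1.
list_ids() {
    for f in "$1"/*.txt; do
        [ -e "$f" ] || continue          # the glob matched nothing
        name=${f##*/}                    # strip the directory part
        printf '%s\n' "${name%%.*}"      # keep the text before the first dot
    done | sort -u
}

# Usage sketch (not executed here):
#   list_ids . | while IFS= read -r id; do
#       perl my_script.pl "$id".*.txt
#   done
```

Matching on "$id".*.txt rather than "$id"*.txt keeps an id such as A573R2 from also picking up the A573R25 files.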

How to properly use the grep command to grab and store integers?

I am currently building a bash script for class, and I am trying to use the grep command to grab the values from a simple calculator program and store them in the variables I assign, but I keep receiving a syntax error message when I try to run the script. Any advice on how to fix it? my script looks like this:
#!/bin/bash
addanwser=$(grep -o "num1 + num2" Lab9 -a 5 2)
echo "addanwser"
subanwser=$(grep -o "num1 - num2" Lab9 -s 10 15)
echo "subanwser"
multianwser=$(grep -o "num1 * num2" Lab9 -m 3 10)
echo "multianwser"
divanwser=$(grep -o "num1 / num2" Lab9 -d 100 4)
echo "divanwser"
modanwser=$(grep -o "num1 % num2" Lab9 -r 300 7)
echo "modawser"
You want to grep the output of a command.
grep searches either a file or standard input. So you can use any of these equivalent forms:
grep X file # 1. from a file
... things ... | grep X # 2. from stdin
grep X <<< "content" # 3. using here-strings
For this case, you want to use the last one, so that you execute the program and its output feeds grep directly:
grep <something> <<< "$(Lab9 -s 10 15)"
Which is the same as saying:
Lab9 -s 10 15 | grep <something>
So that grep will act on the output of your program. Since I don't know how Lab9 works, let's use a simple example with seq, that returns numbers from 5 to 15:
$ grep 5 <<< "$(seq 5 15)"
5
15
grep is usually used for finding matching lines of a text file. To actually grab a part of the matched line other tools such as awk are used.
Assuming the output looks like "num1 + num2 = 54" (i.e. fields are separated by space), this should do your job:
addanwser=$(Lab9 -a 5 2 | awk '{print $NF}')
echo "$addanwser"
Make sure you don't miss the '$' sign before addanwser when echo'ing it.
$NF selects the last field. You may select nth field using $n.
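A self-contained sketch of this approach follows. Since the real Lab9 program is not shown, lab9_demo is a hypothetical stand-in, and its output format ("num1 + num2 = 7") is an assumption taken from the answer above.

```shell
# Hypothetical stand-in for the real Lab9 program; the output format is assumed.
lab9_demo() {
    case $1 in
        -a) echo "num1 + num2 = $(( $2 + $3 ))" ;;
        -s) echo "num1 - num2 = $(( $2 - $3 ))" ;;
    esac
}

# Run the program, then keep only the last field of its output.
addanswer=$(lab9_demo -a 5 2 | awk '{print $NF}')
subanswer=$(lab9_demo -s 10 15 | awk '{print $NF}')

echo "$addanswer"
echo "$subanswer"
```

With the real program, lab9_demo would be replaced by the path to Lab9 (for example ./Lab9 if it sits in the current directory).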

Get the first real number from a series of files

I am trying to extract the first number from each file.dat, which has the form:
5.01 1 56.413481000 -0.00063400 0.00095770
5.01 2 61.193808800 0.00102170 0.00078280
5.01 3 65.974136600 -0.00108170 0.00102620
5.01 4 70.754464300 0.00082490 0.00103630
and then use this number (5.01) as the title of a .png file.
I use a bash script and I know the command line=$(head -n 1 $f), as found in a question here, but this gives me the whole first line of the file $f.
In this case the spaces in the line are kept as well, and the .png file title becomes:
plot 5.01 1 56.413481000 -0.00063400 0.00095770.png
Is there a way to take only 5.01 and get a trimmed title for the plot?
Thanks to all.
I'd probably just do it with perl:
VAL=$( echo "$line" | perl -pe 's/^[^\d]+//g;s/[^\d\.].*$//' )
Something like that anyway.
Should remove:
anything that isn't a digit from the start of line.
Anything not-digit or not . to the end of line.
Or with grep:
grep -o "[0-9]*\.[0-9]*" file.dat | head -1
Edit:
Testing without the head -1 for a oneline input:
echo " 5.01 2 61.193808800 0.00102170 0.00078280" | grep -o "[0-9]*\.[0-9]*"
5.01
61.193808800
0.00102170
0.00078280
Using head -1 will return the first match on the first line.
When you know the match will be on the first line, you can ignore files whose first line is malformed, and avoid grepping through whole files, by reading only the first line:
Make a two-headed monster:
head -1 file.dat | grep -o "[0-9]*\.[0-9]*" | head -1
To extract the first field, assuming they are tab separated:
val=$(head -n 1 $f | cut -f 1)
or, if they are space separated instead:
val=$(head -n 1 $f | cut -f 1 -d ' ')
OR you can avoid calling any extra processes and keep all data manipulation in the bash shell with
while read -r realNum restOfLine ; do
break
done < "$f"
echo "$realNum"
This grabs the first "word" and puts the remaining into "restOfLine".
The break ensures that you only read the first line of the file.
IHTH
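Putting the pieces together, a small sketch of the whole task could look like this. The function name first_field is made up for illustration, and the plotting step itself is left out because the question does not show it. awk splits on any run of blanks, so leading spaces and tab separators are both handled.

```shell
# Print the first whitespace-separated field of the first line of file $1.
first_field() {
    awk 'NR == 1 { print $1; exit }' "$1"
}

# Usage sketch (not executed here):
#   for f in *.dat; do
#       title="plot $(first_field "$f").png"
#       echo "$title"
#   done
```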

Get variable value by finding keyword in unix environment

In UNIX environment, I have a file.txt that contains following details:
Data recording started:
0001100 Matched at 412090
0001101 Mismatched at 414798
0001102 Matched at 420007
0001103 Mismatched at 420015
Job completed
How can I get the value from the first "Matched" line (line 2 here) and from the first "Mismatched" line (line 3 here),
find the difference between them, and store it in a variable, "dif"?
The result is Matched minus Mismatched. I cannot pick the values by line number, because a Mismatched line may come first, as in the following:
Data recording started:
0001100 Mismatched at 412090
0001101 Matched at 414798
0001102 Mismatched at 420007
0001103 Matched at 420015
Job completed
One way:
echo $((
$(grep Matched input | head -1 | sed 's/.*at //')
- $(grep Mismatched input | head -1 | sed 's/.*at //')
))
or using only sed:
echo $((
$(sed -n 's/.*Matched.*at //p' input | head -1)
- $(sed -n 's/.*Mismatched.*at //p' input | head -1)
))
Output
-2708
We can use grep -m 1 to stop at the first match, which makes head unnecessary.
dif=$((
$(grep -m 1 'Matched' a.txt | sed 's/.*at \([0-9]*\).*/\1/')
- $(grep -m 1 'Mismatched' a.txt | sed 's/.*at \([0-9]*\).*/\1/')
))
echo $dif
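The same result can be sketched in a single awk pass, which reads the file only once and stops as soon as both values have been seen. The function name first_diff is made up for illustration, and the sketch assumes the layout shown in the question: the keyword is the second field and the value is the last field.

```shell
# Print (first Matched value) - (first Mismatched value) for file $1.
first_diff() {
    awk '
        $2 == "Matched"    && !m  { matched = $NF;    m = 1 }
        $2 == "Mismatched" && !mm { mismatched = $NF; mm = 1 }
        m && mm { exit }                      # both found: stop reading
        END { if (m && mm) print matched - mismatched }
    ' "$1"
}

# Usage sketch (not executed here):
#   dif=$(first_diff file.txt)
#   echo "$dif"
```

Comparing the second field exactly also avoids depending on grep being case-sensitive to tell "Matched" apart from "Mismatched".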
