Suppose I have a directory with this structure:
I am writing a script that removes the letter "x" from directory names that end with it, in order to get this:
So I was thinking of using a for loop that visits each directory and takes the last part of the folder path:
for d in Fruits_and_Vegetables/*/
do
(cd "$d" && Fruit_or_Vegetable=`basename "$PWD"`)
done
The problem is that I am not sure how to tell it to take only the last directory.
Then I was thinking of modifying the string:
echo $Fruit_or_Vegetable | awk '{print substr($1, 1, length($1)-1)}' # substr(s, a, b) : it returns b number of chars from string s, starting at position a.
# The parameter b is optional, in which case it means up to the end of the string.
The problem is that I don't know how to tell AWK to treat "Moscato Giallox" as a single word: when I execute the command it returns "Moscat Giallox" instead of "Moscato Giallo". I also guess I need an if statement to check whether the last letter is x, and only run the command in those cases.
Could you give me some suggestions? Thanks.
shopt and Parameter Expansion parsing can make that pretty easy. Starting with your directory structure above:
$: find @(Fruits|Vegetables)
Fruits
Fruits/Grapes
Fruits/Grapes/Muskat Grape
Fruits/Grapes/Muskat Grape/Muscat Ottonel
Fruits/Grapes/Muskat Grape/Muscato Gallox
Fruits/Mangoes
Fruits/Mangoes/Ataulfo Mango
Fruits/Mangoes/Tommy Atkins Mangox
Vegetables
Vegetables/Potatoes
Vegetables/Potatoes/Ratte Potatox
Vegetables/Potatoes/Yukon Gold Potato
Code for the changes you want.
shopt -s extglob # allow fancy @(...) construct to specify dirs
shopt -s globstar # add double-asterisk for flexible depth
for d in @(Fruits|Vegetables)/**/*x/ # trailing / matches directories only
do echo "mv \"$d\" \"${d%x/}/\" " # show the command first
mv "$d" "${d%x/}/" # rename the dirs
done
Output from the echo statements:
mv "Fruits/Grapes/Muskat Grape/Muscato Gallox/" "Fruits/Grapes/Muskat Grape/Muscato Gallo/"
mv "Fruits/Mangoes/Tommy Atkins Mangox/" "Fruits/Mangoes/Tommy Atkins Mango/"
mv "Vegetables/Potatoes/Ratte Potatox/" "Vegetables/Potatoes/Ratte Potato/"
Result:
$: find @(Fruits|Vegetables)
Fruits
Fruits/Grapes
Fruits/Grapes/Muskat Grape
Fruits/Grapes/Muskat Grape/Muscat Ottonel
Fruits/Grapes/Muskat Grape/Muscato Gallo
Fruits/Mangoes
Fruits/Mangoes/Ataulfo Mango
Fruits/Mangoes/Tommy Atkins Mango
Vegetables
Vegetables/Potatoes
Vegetables/Potatoes/Ratte Potato
Vegetables/Potatoes/Yukon Gold Potato
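For the sub-question about stripping the "x" without AWK, parameter expansion alone also copes with names that contain spaces. A minimal sketch, assuming a hypothetical helper name strip_trailing_x (it is not part of the answer above): ${name%x} removes a single trailing "x" only when one is present, so no explicit if statement is needed.

```shell
# Hypothetical helper: print the last path component without a trailing "x".
strip_trailing_x() {
    name=${1%/}                  # drop one trailing slash, if any
    name=${name##*/}             # keep only the last path component
    printf '%s\n' "${name%x}"    # remove one trailing "x" if present
}

strip_trailing_x "Fruits/Grapes/Muskat Grape/Muscato Gallox/"   # prints: Muscato Gallo
strip_trailing_x "Fruits/Mangoes/Ataulfo Mango"                 # prints: Ataulfo Mango
```

Because the argument is always quoted, the space in "Muscato Gallox" is never treated as a field separator.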
I need to find pairs of files with a specific pattern in one directory:
HU_IP_number_something.bam & HU_inp_number_something.bam
NOC_IP_number_something.bam & NOC_inp_number_something.bam
Numbers are 1...N for each pair
I have a solution but it works only for one set of files HU_* or NOC_* in one directory.
How can I improve it to find pairs, when both HU_* and NOC_* are in one directory?
for ip in *IP*.bam
do
num=$(echo $ip | sed 's/[^0-9]//g')
input=$(find -name *_inp_${num}*.bam)
echo ip sample: $ip
echo input sample: $input
done
Examples of files in one directory:
HU_inp_1-sorted.bam
HU_IP_1-sorted.bam
NOC_inp_1-sorted.bam
NOC_IP_1-sorted.bam
for 1,2,3,...N
The following builds an array, $a for each iteration of a for loop.
$ for f in *IP*.bam; do s=${f#*_}; a=( *${s} ); declare -p a; done
declare -a a=([0]="HU_IP_number_something.bam" [1]="NOC_IP_number_something.bam")
declare -a a=([0]="HU_IP_number_something.bam" [1]="NOC_IP_number_something.bam")
This steps through all the files you've specified in your filespec, stripping off the first "field" (as denoted by the underscore separator), and using globbing to collect the relevant files in the array.
You can test for the length of the array (${#a[@]}) to make sure you have two entries.
If you want to group by the second field instead of the first, you need a little more processing:
$ for f in *IP*.bam; do s1=${f%%_*}; s2=${f#*_}; s2=${s2#*_}; a=( ${s1}*${s2} ); declare -p a; done
declare -a a=([0]="HU_IP_number_something.bam" [1]="HU_inp_number_something.bam")
declare -a a=([0]="NOC_IP_number_something.bam" [1]="NOC_inp_number_something.bam")
The technique here, using ${var#pattern} and ${var%pattern}, is called Parameter Expansion, and you can find more details about it in the bash man page.
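The same expansions can also build the partner file name directly instead of globbing for it. A rough sketch, assuming the sample file names from the question and a scratch directory; the variable names pre, rest and pairs are illustrative only:

```shell
# Work in a scratch directory with sample files named as in the question.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
touch HU_IP_1-sorted.bam HU_inp_1-sorted.bam NOC_IP_1-sorted.bam NOC_inp_1-sorted.bam

pairs=""
for ip in *_IP_*.bam; do
    pre=${ip%%_*}              # text before the first underscore, e.g. "HU"
    rest=${ip#*_IP_}           # text after "_IP_", e.g. "1-sorted.bam"
    input="${pre}_inp_${rest}" # expected name of the matching input sample
    if [ -e "$input" ]; then
        echo "pair: $ip $input"
        pairs="$pairs$ip:$input "
    else
        echo "no input sample for $ip" >&2
    fi
done
```

Checking with [ -e ... ] makes a missing partner visible instead of silently producing an empty value.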
Do you only want to match HU to HU and NOC to NOC? If so, add a line
pre=$(echo $ip | awk -F "_" '{print $1}')
then change your input to
input=$(find . -name "${pre}_inp_${num}*.bam")
The braces matter: $pre_inp_ would be read as a variable named pre_inp_.
Here is my script:
#!/bin/bash
list="a b c"
for i in $list; do
echo $i
done
This works:
➜ ~ ./lol.sh
a
b
c
This doesn't:
➜ ~ . ./lol.sh
a b c
Why does splitting not work with the dot command, and how can I fix it?
Lists should never be represented as strings. Use array syntax.
list=( a b c )
for i in "${list[@]}"; do
echo "$i"
done
There are several reasons this is preferable.
In ZSH:
ZSH breaks POSIX by not performing string-splitting at all on unquoted expansions unless it is explicitly requested. You can make this request by either running setopt sh_word_split, or using the parameter expansions ${=list} or ${(ps: :)list}.
In other Bourne-derived shells:
String-splitting is dependent on the value of IFS, which cannot be guaranteed to be at defaults, especially when sourced from a separate script (which may have changed it locally).
Unquoted expansion also performs globbing, which can have different results depending on which files are in the current working directory (for instance, if your list contains hello[world], this will behave in an unexpected manner if your current directory contains files named hellow, helloo, or otherwise matching the glob).
Avoiding the globbing step is not only more correct, but also more efficient.
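The two hazards described for Bourne-derived shells can be reproduced in a scratch directory. This is only an illustrative sketch for bash or a POSIX sh (zsh behaves differently, as noted above); the file name hellow and the variable names are assumptions made for the demonstration:

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
touch hellow                  # a file that happens to match the glob below

list="hello[world] b"

set -- $list                  # unquoted: split on IFS, then globbed
default_count=$#              # 2 words with the default IFS
first_word=$1                 # "hellow", not "hello[world]"

IFS=:                         # simulate a caller that changed IFS
set -- $list
custom_ifs_count=$#           # 1: spaces no longer split the string
unset IFS                     # restore default splitting behaviour

echo "$default_count $first_word $custom_ifs_count"   # prints: 2 hellow 1
```

An array expanded as "${list[@]}" avoids both effects, because each element is passed through unchanged.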
Whilst I note the comment regarding lists by Charles Duffy, this was my solution/test.
#!/bin/zsh
function three()
{
first=$1
second=$2
third=$3
echo "1: $first 2: $second 3:$third"
}
setopt sh_word_split
set "1 A 2" "2 B 3" "3 C 4" "4 D 5"
for i;do
three $i;
done
This will output
1: 1 2: A 3:2
1: 2 2: B 3:3
1: 3 2: C 3:4
1: 4 2: D 3:5
I want to find all files that have multiple new line characters at the end of their content.
How is this possible?
This bash command prints all files in current directory and its subdirectories that terminate with at least one empty line at the end after a sequence of one or more lines (i.e. at least a sequence of two \n):
find . -type f -print | while IFS= read -r a; do tail -2 "$a" | ( read x && read y && [ x"$y" = x ] && echo "$a" ); done
FYI: It's possible to search for this with PHPStorm using RegEx search term \n+\n\Z
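An alternative sketch that looks at the last two bytes of each file rather than the last two lines. It assumes a hypothetical helper name ends_with_blank_line and uses tail -c and od; the sample files are created only for the demonstration:

```shell
# Hypothetical helper: succeed if the file's last two bytes are both newlines.
ends_with_blank_line() {
    # Hex-dump the last two bytes; two newlines show up as "0a0a".
    [ "$(tail -c 2 "$1" | od -An -tx1 | tr -d ' \n')" = "0a0a" ]
}

tmp=$(mktemp -d)
printf 'text\n'   > "$tmp/one.txt"    # single trailing newline
printf 'text\n\n' > "$tmp/two.txt"    # two trailing newlines

matches=""
for f in "$tmp"/*.txt; do
    if ends_with_blank_line "$f"; then
        matches="$matches${f##*/} "   # remember the base name
        echo "$f"
    fi
done
```

Only two.txt is reported here. Unlike the line-based test, this check does not treat a last line made of spaces as empty.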
For example:
((
extract everything here, ignore the rest
))
I know how to ignore everything within, but I don't know how to do the opposite. Basically, it'll be a file and it needs to extract the data between the two points and then output it to another file. I've tried countless approaches, and all seem to tell me the indentation I'm stating doesn't exist in the file, when it does.
If somebody could point me in the right direction, I'd be grateful.
If your data are "line oriented", so the marker is alone (as in the example), you can try some of the following:
function getdata() {
cat - <<EOF
before
((
extract everything here, ignore the rest
someother text
))
after
EOF
}
echo "sed - with two seds"
getdata | sed -n '/((/,/))/p' | sed '1d;$d'
echo "Another sed solution"
getdata | sed -n '1,/((/d; /))/,$d;p'
echo "With GNU sed"
getdata | gsed -n '/((/{:a;n;/))/b;p;ba}'
echo "With perl"
getdata | perl -0777 -pe "s/.*\(\(\s*\\n(.*)?\)\).*/\$1/s"
P.S.: yes, it looks like a dance of crazy toothpicks.
Assuming you want to extract the string inside (( and )):
VAR="abc((def))ghi"
echo "$VAR"
VAR=${VAR##*((}
VAR=${VAR%%))*}
echo "$VAR"
## cuts away the longest match from the beginning; # cuts away the shortest match from the beginning; %% cuts away the longest match at the end; % cuts away the shortest match at the end.
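The difference between the four operators only shows when the markers occur more than once. A small sketch with a made-up nested string; the markers are single-quoted inside the expansion as a precaution, so they stay literal even if an option such as bash's extglob is enabled:

```shell
VAR="a((b((c))d))e"

after_first_open=${VAR#*'(('}      # shortest prefix removed: b((c))d))e
after_last_open=${VAR##*'(('}      # longest prefix removed:  c))d))e
before_last_close=${VAR%'))'*}     # shortest suffix removed: a((b((c))d
before_first_close=${VAR%%'))'*}   # longest suffix removed:  a((b((c

printf '%s\n' "$after_first_open" "$after_last_open" \
    "$before_last_close" "$before_first_close"
```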
The file :
$ cat /tmp/l
((
extract everything here, ignore the rest
someother text
))
The script
$ awk '$1=="((" {p=1;next} $1=="))" {p=0;next} p' /tmp/l
extract everything here, ignore the rest
someother text
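Since the question asks for the result to go to another file, the same flag technique can be redirected. A sketch under stated assumptions: the paths in.txt and out.txt are hypothetical, and this variant compares the whole line ($0) instead of the first field:

```shell
tmp=$(mktemp -d)
printf '%s\n' 'before' '((' 'extract everything here, ignore the rest' \
    'someother text' '))' 'after' > "$tmp/in.txt"

# p is a flag: set on the "((" line, cleared on the "))" line.
# The marker lines themselves are skipped with next.
awk '$0 == "((" { p = 1; next }
     $0 == "))" { p = 0; next }
     p' "$tmp/in.txt" > "$tmp/out.txt"

cat "$tmp/out.txt"
```

The output file then holds only the two lines between the markers.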
sed -n '/^((/,/^))/ { /^((/b; /^))/b; p }'
Brief explanation:
/^((/,/^))/: range addressing (inclusive)
{ /^((/b; /^))/b; p }: sequence of 3 commands
1. skip line with ^((
2. skip line with ^))
3. print
The line skipping is required to make the range selection exclusive.
I have a trivial problem with regular expressions in bash.
#!/bin/bash
FNAME=$1
echo ${FNAME//.*\/tests\//}
I want to remove everything before /tests/, including /tests/ itself. For some reason ".*" doesn't work.
$ ./eclipse/unittest.sh /foo/tests/bar
/foo/tests/bar
How do I select anything in bash reg exp?
You can use # followed by a pattern to remove everything up to and including the pattern. It will use the shortest match:
function f {
echo ${1#*/tests/}
}
$ f /foo/tests/bar
bar
$ f /foo/tests/bar/tests/last
bar/tests/last
If you want to use the longest match, you can use ##:
function f {
echo ${1##*/tests/}
}
$ f /foo/tests/bar
bar
$ f /foo/tests/bar/tests/last
last
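The attempt in the question fails because ${var//pattern/} takes a glob pattern, not a regular expression, so ".*" means a literal dot followed by anything. If a real regular expression is wanted, one option is to hand the string to sed. A minimal sketch, assuming a hypothetical function name strip_through_tests:

```shell
# Hypothetical helper: remove everything up to and including the last "/tests/".
strip_through_tests() {
    # "|" is used as the s command delimiter so the slashes need no escaping.
    printf '%s\n' "$1" | sed 's|.*/tests/||'
}

strip_through_tests /foo/tests/bar              # prints: bar
strip_through_tests /foo/tests/bar/tests/last   # prints: last
```

The regex .* is greedy, so this behaves like the ## form; the parameter expansion avoids starting an extra process.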