Script to Remove a Letter at the End of some Directories

Let's suppose I have a directory with this structure:
I am writing a script that removes the letter "x" from directory names that end with it, in order to have this:
So I was thinking of using a for loop that visits each directory and takes the last part of the folder path:
for d in Fruits_and_Vegetables/*/
do
(cd "$d" && Fruit_or_Vegetable=`basename "$PWD"`)
done
The problem is that I am not sure how to make it take only the last directory.
Then I was thinking of modifying the string:
echo $Fruit_or_Vegetable | awk '{print substr($1, 1, length($1)-1)}' # substr(s, a, b) : it returns b number of chars from string s, starting at position a.
# The parameter b is optional, in which case it means up to the end of the string.
The problem is that I don't know how to tell awk to treat "Moscato Giallox" as a single word: when I execute the command, it returns "Moscat Giallox" instead of "Moscato Giallo". I also guess I need an if statement that checks whether the last letter is x and runs the command only in those cases.
Could you give me some suggestions? Thanks.

shopt options and parameter expansion make this pretty easy. Starting with your directory structure above:
$: find @(Fruits|Vegetables)
Fruits
Fruits/Grapes
Fruits/Grapes/Muskat Grape
Fruits/Grapes/Muskat Grape/Muscat Ottonel
Fruits/Grapes/Muskat Grape/Muscato Gallox
Fruits/Mangoes
Fruits/Mangoes/Ataulfo Mango
Fruits/Mangoes/Tommy Atkins Mangox
Vegetables
Vegetables/Potatoes
Vegetables/Potatoes/Ratte Potatox
Vegetables/Potatoes/Yukon Gold Potato
Code for the changes you want:
shopt -s extglob # allow the @(...) construct to specify dirs
shopt -s globstar # enable ** for flexible depth
for d in @(Fruits|Vegetables)/**/*x/ # trailing / matches directories only
do echo "mv \"$d\" \"${d%x/}/\" " # show the command first
mv "$d" "${d%x/}/" # rename the dir
done
Output from the echo statements:
mv "Fruits/Grapes/Muskat Grape/Muscato Gallox/" "Fruits/Grapes/Muskat Grape/Muscato Gallo/"
mv "Fruits/Mangoes/Tommy Atkins Mangox/" "Fruits/Mangoes/Tommy Atkins Mango/"
mv "Vegetables/Potatoes/Ratte Potatox/" "Vegetables/Potatoes/Ratte Potato/"
Result:
$: find @(Fruits|Vegetables)
Fruits
Fruits/Grapes
Fruits/Grapes/Muskat Grape
Fruits/Grapes/Muskat Grape/Muscat Ottonel
Fruits/Grapes/Muskat Grape/Muscato Gallo
Fruits/Mangoes
Fruits/Mangoes/Ataulfo Mango
Fruits/Mangoes/Tommy Atkins Mango
Vegetables
Vegetables/Potatoes
Vegetables/Potatoes/Ratte Potato
Vegetables/Potatoes/Yukon Gold Potato
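
On the if statement the question asks about: the % expansion only removes the suffix when it is actually there, so no explicit test is needed. A minimal sketch of that behaviour on plain strings (the function name strip_trailing_x is made up for this illustration and is not part of the answer above):

```shell
#!/usr/bin/env bash
# strip_trailing_x is a hypothetical helper, not part of the answer above.
# ${name%x} removes one trailing "x" if present; otherwise it expands to
# the unchanged value. Spaces inside the name are preserved.
strip_trailing_x() {
  local name="$1"
  printf '%s\n' "${name%x}"
}

strip_trailing_x "Moscato Giallox"   # prints: Moscato Giallo
strip_trailing_x "Muscat Ottonel"    # prints: Muscat Ottonel
```

Because the value stays quoted throughout, a name with spaces is handled as one string, which is the part the awk attempt struggled with.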

Related

Find pairs of files in one directory with a specific pattern

I need to find pairs of files with a specific pattern in one directory:
HU_IP_number_something.bam & HU_inp_number_something.bam
NOC_IP_number_something.bam & NOC_inp_number_something.bam
Numbers are 1...N for each pair
I have a solution but it works only for one set of files HU_* or NOC_* in one directory.
How can I improve it to find pairs, when both HU_* and NOC_* are in one directory?
for ip in *IP*.bam
do
num=$(echo $ip | sed 's/[^0-9]//g')
input=$(find -name *_inp_${num}*.bam)
echo ip sample: $ip
echo input sample: $input
done
Examples of files in one directory:
HU_inp_1-sorted.bam
HU_IP_1-sorted.bam
NOC_inp_1-sorted.bam
NOC_IP_1-sorted.bam
for 1,2,3,...N

The following builds an array, a, on each iteration of a for loop.
$ for f in *IP*.bam; do s=${f#*_}; a=( *${s} ); declare -p a; done
declare -a a=([0]="HU_IP_number_something.bam" [1]="NOC_IP_number_something.bam")
declare -a a=([0]="HU_IP_number_something.bam" [1]="NOC_IP_number_something.bam")
This steps through all the files matched by your filespec, strips off the first "field" (as delimited by the underscore separator), and uses globbing to collect the matching files in the array.
You can test the length of the array (${#a[@]}) to make sure you have two entries.
If you want to group by the second field instead of the first, you need a little more processing:
$ for f in *IP*.bam; do s1=${f%%_*}; s2=${f#*_}; s2=${s2#*_}; a=( ${s1}*${s2} ); declare -p a; done
declare -a a=([0]="HU_IP_number_something.bam" [1]="HU_inp_number_something.bam")
declare -a a=([0]="NOC_IP_number_something.bam" [1]="NOC_inp_number_something.bam")
The technique here, using ${var#pattern} and ${var%pattern}, is called parameter expansion; you can find more details about it in the bash man page.
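
The same expansions can also build the partner file's name directly, instead of collecting candidates with a glob. A minimal sketch, assuming every IP file name contains the marker _IP_ and its input file differs only in having _inp_ in that position; input_name_for is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# input_name_for is a hypothetical helper (not from the answers here).
# Assumption: the IP file and its input differ only in "_IP_" vs "_inp_".
input_name_for() {
  local ip="$1"
  local prefix="${ip%%_IP_*}"   # text before the first "_IP_", e.g. HU
  local rest="${ip#*_IP_}"      # text after the first "_IP_", e.g. 1-sorted.bam
  printf '%s_inp_%s\n' "$prefix" "$rest"
}

for ip in *_IP_*.bam; do
  [ -e "$ip" ] || continue      # the glob matched nothing
  input=$(input_name_for "$ip")
  if [ -e "$input" ]; then
    echo "pair: $ip $input"
  else
    echo "no input sample for: $ip" >&2
  fi
done
```

Since the prefix is carried over, HU files are only ever paired with HU files and NOC files with NOC files.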

Do you only want to match HU to HU and NOC to NOC? If so, add a line
pre=$(echo "$ip" | awk -F "_" '{print $1}')
then change your input to
input=$(find -name "${pre}_inp_${num}*.bam")
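
One detail worth checking in a pattern like this: an underscore is a valid character in a variable name, so $pre_inp_ is read as a variable named pre_inp_, not as $pre followed by the text _inp_. A small illustration with made-up values:

```shell
#!/usr/bin/env bash
pre=HU
num=1

# Without braces the shell looks up a variable called "pre_inp_",
# which is unset here, so only the number survives.
without_braces="$pre_inp_${num}"

# With braces the name ends at "pre" and the literal "_inp_" is kept.
with_braces="${pre}_inp_${num}"

printf '%s\n' "$without_braces" "$with_braces"
```

The first value comes out as 1 and the second as HU_inp_1, so the braces are what keep the prefix in the search pattern.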

Zsh - split string by spaces when using dot operator

Here is my script:
#!/bin/bash
list="a b c"
for i in $list; do
echo $i
done
This works:
➜ ~ ./lol.sh
a
b
c
This doesn't:
➜ ~ . ./lol.sh
a b c
Why doesn't splitting work with the dot command, and how can I fix it?

Lists should never be represented as strings. Use array syntax.
list=( a b c )
for i in "${list[@]}"; do
echo "$i"
done
There are several reasons this is preferable.
In ZSH:
ZSH breaks POSIX by not performing string-splitting at all on unquoted expansions unless explicitly requested. You can make this request either by running setopt sh_word_split or by using the parameter expansions ${=list} or ${(ps: :)list}.
In other Bourne-derived shells:
String-splitting is dependent on the value of IFS, which cannot be guaranteed to be at defaults, especially when sourced from a separate script (which may have changed it locally).
Unquoted expansion also performs globbing, which can have different results depending on which files are in the current working directory (for instance, if your list contains hello[world], this will behave in an unexpected manner if your current directory contains files named hellow, helloo, or otherwise matching the glob).
Avoiding the globbing step is not only more correct, but also more efficient.
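
The IFS dependence can be seen in a short bash sketch (bash rather than zsh, since zsh does not split unquoted expansions by default). count_args is a hypothetical helper that only reports how many words it received:

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints the number of arguments.
count_args() { printf '%s\n' "$#"; }

IFS=$' \t\n'            # start from the default separators
list_string="a b c"
list_array=(a b c)

# Unquoted string: the word count depends on the current IFS.
default_count=$(count_args $list_string)
comma_count=$(IFS=,; count_args $list_string)

# Quoted array expansion: one word per element, whatever IFS is.
array_count=$(IFS=,; count_args "${list_array[@]}")

echo "string, default IFS: $default_count"
echo "string, IFS=comma:   $comma_count"
echo "array,  IFS=comma:   $array_count"
```

The string yields 3 words with the default IFS but only 1 once IFS is a comma, while the array yields 3 in both cases.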

Whilst I note the comment regarding lists by Charles Duffy, this was my solution/test.
#!/bin/zsh
function three()
{
first=$1
second=$2
third=$3
echo "1: $first 2: $second 3:$third"
}
setopt sh_word_split
set "1 A 2" "2 B 3" "3 C 4" "4 D 5"
for i;do
three $i;
done
This will output
1: 1 2: A 3:2
1: 2 2: B 3:3
1: 3 2: C 3:4
1: 4 2: D 3:5

How to find files that contain MULTIPLE newlines at their end?

I want to find all files that have multiple new line characters at the end of their content.
How is this possible?

This bash command prints all files in the current directory and its subdirectories that end with at least one empty line after one or more other lines (i.e. files whose content ends with at least two \n in a row):
find . -type f -print | while IFS= read -r a; do tail -n 2 "$a" | ( read -r x && read -r y && [ x"$y" = x ] && echo "$a" ); done
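
A byte-level variant of the same check, as a minimal sketch: it looks at the last two bytes of each file instead of reading lines. Both function names are hypothetical, and the -print0 / read -d '' pairing assumes a find that supports -print0 (GNU and BSD find do) together with bash:

```shell
#!/usr/bin/env bash
# ends_with_two_newlines is a hypothetical helper name.
# It dumps the last two bytes of the file as hex; "0a0a" means the file
# ends with two newline characters, i.e. at least one empty line at the end.
ends_with_two_newlines() {
  local last_two
  last_two=$(tail -c 2 -- "$1" | od -An -tx1 | tr -d ' \n')
  [ "$last_two" = "0a0a" ]
}

# find_trailing_blank_lines DIR: print every regular file under DIR whose
# content ends with two or more newlines. NUL-delimited names keep file
# names with spaces or newlines intact.
find_trailing_blank_lines() {
  find "$1" -type f -print0 |
    while IFS= read -r -d '' file; do
      if ends_with_two_newlines "$file"; then
        printf '%s\n' "$file"
      fi
    done
}
```

Calling find_trailing_blank_lines . searches the current directory tree.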

FYI: It's possible to search for this with PHPStorm using RegEx search term \n+\n\Z

Bash - extracting a string between two points

For example:
((
extract everything here, ignore the rest
))
I know how to ignore everything within, but I don't know how to do the opposite. Basically, it'll be a file and it needs to extract the data between the two points and then output it to another file. I've tried countless approaches, and all seem to tell me the indentation I'm stating doesn't exist in the file, when it does.
If somebody could point me in the right direction, I'd be grateful.

If your data are "line oriented", so that each marker is alone on its line (as in the example), you can try some of the following:
function getdata() {
cat - <<EOF
before
((
extract everything here, ignore the rest
someother text
))
after
EOF
}
echo "sed - with two seds"
getdata | sed -n '/((/,/))/p' | sed '1d;$d'
echo "Another sed solution"
getdata | sed -n '1,/((/d; /))/,$d;p'
echo "With GNU sed"
getdata | gsed -n '/((/{:a;n;/))/b;p;ba}'
echo "With perl"
getdata | perl -0777 -pe "s/.*\(\(\s*\\n(.*)?\)\).*/\$1/s"
PS: yes, it looks like a dance of crazy toothpicks.

Assuming you want to extract the string inside (( and )):
VAR="abc((def))ghi"
echo "$VAR"
VAR=${VAR##*((}
VAR=${VAR%%))*}
echo "$VAR"
## cuts away the longest match from the beginning; # cuts away the shortest match from the beginning; %% cuts away the longest match from the end; % cuts away the shortest match from the end.
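
The difference between the four operators only shows up when the marker occurs more than once. A small sketch on a made-up string with two marker pairs (the markers are quoted inside the expansion so they are always taken literally):

```shell
#!/usr/bin/env bash
text='abc((def))ghi((jkl))mno'   # made-up sample with two marker pairs

shortest_prefix=${text#*'(('}    # def))ghi((jkl))mno
longest_prefix=${text##*'(('}    # jkl))mno
shortest_suffix=${text%'))'*}    # abc((def))ghi((jkl
longest_suffix=${text%%'))'*}    # abc((def

printf '%s\n' "$shortest_prefix" "$longest_prefix" "$shortest_suffix" "$longest_suffix"
```

With only one pair of markers, as in the example above, the short and long forms give the same result.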

The file :
$ cat /tmp/l
((
extract everything here, ignore the rest
someother text
))
The script
$ awk '$1=="((" {p=1;next} $1=="))" {p=0;next} p' /tmp/l
extract everything here, ignore the rest
someother text
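
The same flag idea can be sketched in plain bash, assuming (as in the example file) that each marker sits alone on its own line. between_markers is a hypothetical function name; it reads standard input and prints the lines between the markers:

```shell
#!/usr/bin/env bash
# between_markers is a hypothetical helper: it copies stdin to stdout,
# but only the lines found between a "((" line and a "))" line.
between_markers() {
  local printing=0 line
  # The "|| [ -n ... ]" part keeps a final line that lacks a newline.
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      '((') printing=1 ;;
      '))') printing=0 ;;
      *) if [ "$printing" -eq 1 ]; then printf '%s\n' "$line"; fi ;;
    esac
  done
  return 0
}

printf 'before\n((\nextract everything here\nsomeother text\n))\nafter\n' |
  between_markers
```

To write the result to another file, redirect the input and output, for example between_markers < input.txt > output.txt (both file names are placeholders).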

sed -n '/^((/,/^))/ { /^((/b; /^))/b; p }'
Brief explanation:
/^((/,/^))/: range addressing (inclusive)
{ /^((/b; /^))/b; p }: sequence of 3 commands
1. skip line with ^((
2. skip line with ^))
3. print
The line skipping is required to make the range selection exclusive.

parameter expansion with substitution in bash

I have a trivial problem with regular expression in bash.
#!/bin/bash
FNAME=$1
echo ${FNAME//.*\/tests\//}
I want to remove everything before /tests/, including /tests/ itself. For some reason, ".*" doesn't work.
$ ./eclipse/unittest.sh /foo/tests/bar
/foo/tests/bar
How do I match anything in a bash regular expression?

You can use # followed by a pattern to remove everything up to and including the pattern. It will use the shortest match:
function f {
echo ${1#*/tests/}
}
$ f /foo/tests/bar
bar
$ f /foo/tests/bar/tests/last
bar/tests/last
If you want to use the longest match, you can use ##:
function f {
echo ${1##*/tests/}
}
$ f /foo/tests/bar
bar
$ f /foo/tests/bar/tests/last
last
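
As for why the original attempt printed the path unchanged: the pattern in ${var//pattern/replacement} is a glob, not a regular expression, so a dot is just a literal dot. A minimal sketch comparing the two spellings on the path from the question:

```shell
#!/usr/bin/env bash
fname=/foo/tests/bar

# ".*" here means a literal dot followed by anything; the path has no dot,
# so nothing matches and the value is unchanged.
regex_style=${fname//.*\/tests\//}

# In a glob, "*" on its own already matches any string.
glob_style=${fname//*\/tests\//}

printf '%s\n' "$regex_style" "$glob_style"
```

The first line printed is /foo/tests/bar and the second is bar.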
