Parse filename string and extract parent at specific level using shell

Parse filename string and extract parent at specific level using shell - bash

I have a filename as a string, say filname="a/b/c/d.png".
Is there a general method to extract the parent directory at a given level using ONLY shell parameter expansion?
I.e. I would like to extract "level 1" and return c or "level 2" and return b.
Explicitly, I DO NOT want to get the entire parent path (i.e. a/b/c/, which is the result of ${filename%/*}).

Using just shell parameter expansion, assuming bash, you can first transform the path into an array (splitting on /) and then ask for specific array indexes:
filename=a/b/c/d.png
IFS=/
filename_array=( $filename )
unset IFS
echo "0 = ${filename_array[0]}"
echo "1 = ${filename_array[1]}"
echo "2 = ${filename_array[2]}"
echo "3 = ${filename_array[3]}"
Running the above produces:
0 = a
1 = b
2 = c
3 = d.png
These indexes are the reverse of what you want, but a little
arithmetic should fix that.

Using zsh, the :h modifier trims the final component off a path in variable expansion.
The (s:...:) parameter expansion flag can be used to split the contents of a variable. Combine those with normal array indexing where a negative index goes from the end of the array, and...
$ filename=a/b/c/d.png
$ print $filename:h
a/b/c
$ level=1
$ print ${${(s:/:)filename:h}[-level]}
c
$ level=2
$ print ${${(s:/:)filename:h}[-level]}
b
You could also use array subscript flags instead to avoid the nested expansion:
$ level=1
$ print ${filename[(ws:/:)-level-1]}
c
$ level=2
$ print ${filename[(ws:/:)-level-1]}
b
w makes the index of a scalar split on words instead of by character, and s:...: has the same meaning, to say what to split on. Have to subtract one from the level to skip over the trailing d.png, since it's not stripped off already like the first way.

The :h (head) and :t (tail) expansion modifiers in zsh accept digits to specify a level; they can be combined to get a subset of the path:
> filname="a/b/c/d.png"
> print ${filname:t2}
c/d.png
> print ${filname:t2:h1}
c
> print ${filname:t3:h1}
b
If the level is in a variable, then the F modifier can be used to repeat the h modifier a specific number of times:
> for i in 1 2 3; printf '%s: %s\n' $i ${filname:F(i)h:t}
1: c
2: b
3: a

If using printf (a shell builtin) is allowed then this will do the trick in bash:
filename='a/b/c/d.png'
level=2
printf -v spaces '%*s' $level
pattern=${spaces//?/'/*'}
component=${filename%$pattern}
component=${component##*/}
echo $component
prints out
b
You can assign different values to the variable level.

Related

How to iterate over multiple variables and echo them using Shell Script?

Consider the below variables which are dynamic and might change each time. Sometimes there might even be 5 variables, But the length of all the variables will be the same every time.
var1='a b c d e... upto z'
var2='1 2 3 4 5... upto 26'
var3='I II III IV V... upto XXVI'
I am looking for a generalized approach to iterate the variables in a for loop & My desired output should be like below.
a,1,I
b,2,II
c,3,III
d,4,IV
e,5,V
.
.
goes on upto
z,26,XXVI
If I use nested loops, then I get all possible combinations which is not the expected outcome.
Also, I know how to make this work for 2 variables using for loop and shift using below link
https://unix.stackexchange.com/questions/390283/how-to-iterate-two-variables-in-a-sh-script

With paste
paste -d , <(tr ' ' '\n' <<<"$var1") <(tr ' ' '\n' <<<"$var2") <(tr ' ' '\n' <<<"$var3")
a,1,I
b,2,II
c,3,III
d,4,IV
e...z,5...26,V...XXVI
But clearly having to add other parameter substitutions for more varN's is not scalable.

You need to "zip" two variables at a time.
var1='a b c d e...z'
var2='1 2 3 4 5...26'
var3='I II III IV V...XXVI'
zip_var1_var2 () {
set $var1
for v2 in $var2; do
echo "$1,$v2"
shift
done
}
zip_var12_var3 () {
set $(zip_var1_var2)
for v3 in $var3; do
echo "$1,$v3"
shift
done
}
for x in $(zip_var12_var3); do
echo "$x"
done
If you are willing to use eval and are sure it is safe to do so, you can write a single function like
zip () {
if [ $# -eq 1 ]; then
eval echo \$$1
return
fi
a1=$1
shift
x=$*
set $(eval echo \$$a1)
for v in $(zip $x); do
printf '=== %s\n' "$1,$v" >&2
echo "$1,$v"
shift
done
}
zip var1 var2 var3 # Note the arguments are the *names* of the variables to zip
If you can use arrays, then (for example, in bash)
var1=(a b c d e)
var2=(1 2 3 4 5)
var3=(I II III IV V)
for i in "${!var1[#]}"; do
printf '%s,%s,%s\n' "${var1[i]}" "${var2[i]}" "${var3[i]}"
done

Use this Perl one-liner:
perl -le '#in = map { [split] } #ARGV; for $i ( 0..$#{ $in[0] } ) { print join ",", map { $in[$_][$i] } 0..$#in; }' "$var1" "$var2" "$var3"
Prints:
a,1,I
b,2,II
c,3,III
d,4,IV
e,5,V
z,26,XXVI
The Perl one-liner uses these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-l : Strip the input line separator ("\n" on *NIX by default) before executing the code in-line, and append it when printing.
The input variables must be quoted with double quotes "like so", to keep the blank-separated words from being treated as separate arguments.
#ARGV is an array of the command line arguments, here $var1, $var2, $var3.
#in is an array of 3 elements, each element being a reference to an array obtained as a result of splitting the corresponding element of #ARGV on whitespace. Note that split splits the string on whitespace by default, but you can specify a different delimiter, it accepts regexes.
The subsequent for loop prints #in elements separated by comma.
SEE ALSO:
perldoc perlrun: how to execute the Perl interpreter: command line switches
perldoc perlvar: Perl predefined variables

The following is (almost) a copy of this answer with a few tweaks that make it fit this question.
The Original Question
First let’s assign a few variables to play with, 26 tokens in each of them:
var1="$(echo {a..z})"
var2="$(echo {1..26})"
var3="$(echo I II III IV \
V{,I,II,III} IX \
X{,I,II,III} XIV \
XV{,I,II,III} XIX \
XX{,I,II,III} XXIV \
XXV XXVI)"
var4="$(echo {A..Z})"
var5="$(echo {010101..262626..10101})"
Now we want a “magic” function that zips an arbitrary number of variables, ideally in pure Bash:
zip_vars var1 # a trivial test
zip_vars var{1..2} # a slightly less trivial test
zip_vars var{1..3} # the original question
zip_vars var{1..4} # more vars, becasuse we can
zip_vars var{1..5} # more vars, because why not
What could zip_vars look like? Here’s one in pure Bash, without any external commands:
zip_vars() {
local var
for var in "$#"; do
local -a "array_${var}"
local -n array_ref="array_${var}"
array_ref=(${!var})
local -ar "array_${var}"
done
local -n array_ref="array_${1}"
local -ir size="${#array_ref[#]}"
local -i i
local output
for ((i = 0; i < size; ++i)); do
output=
for var in "$#"; do
local -n array_ref="array_${var}"
output+=",${array_ref[i]}"
done
printf '%s\n' "${output:1}"
done
}
How it works:
It splits all variables (passed by reference (by variable name)) into arrays. For each variable varX it creates a local array array_varX.
It would be actually way easier if the input variables were already Bash arrays to start with (see below), but … we stick with the original question initially.
It determines the size of the first array and then blindly expects all arrays to be of that size.
For each index i from 0 to size - 1 it concatenates the ith elements of all arrays, separated by ,.
Arrays Make Things Easier
If you use Bash arrays from the very start, the script will be shorter and look simpler and there won’t be any string-to-array conversions.
zip_arrays() {
local -n array_ref="$1"
local -ir size="${#array_ref[#]}"
local -i i
local output
for ((i = 0; i < size; ++i)); do
output=
for arr in "$#"; do
local -n array_ref="$arr"
output+=",${array_ref[i]}"
done
printf '%s\n' "${output:1}"
done
}
arr1=({a..z})
arr2=({1..26})
arr3=( I II III IV
V{,I,II,III} IX
X{,I,II,III} XIV
XV{,I,II,III} XIX
XX{,I,II,III} XXIV
XXV
XXVI)
arr4=({A..Z})
arr5=({010101..262626..10101})
zip_arrays arr1 # a trivial test
zip_arrays arr{1..2} # a slightly less trivial test
zip_arrays arr{1..3} # (almost) the original question
zip_arrays arr{1..4} # more arrays, becasuse we can
zip_arrays arr{1..5} # more arrays, because why not

return array from perl to bash

I'm trying to get back an array from perl to bash.
My perl scrip has an array and then I use return(#arr)
from my bash script I use
VAR = `perl....
when I echo VAR
I get the aray as 1 long string with all the array vars connected with no spaces.
Thanks

In the shell (and in Perl), backticks (``) capture the output of a command. However, Perl's return is normally for returning variables from subroutines - it does not produce output, so you probably want print instead. Also, in bash, array variables are declared with parentheses. So this works for me:
$ ARRAY=(`perl -wMstrict -le 'my #array = qw/foo bar baz/; print "#array"'`); \
echo "<${ARRAY[*]}> 0=${ARRAY[0]} 1=${ARRAY[1]} 2=${ARRAY[2]}"
<foo bar baz> 0=foo 1=bar 2=baz
In Perl, interpolating an array into a string (like "#array") will join the array with the special variable $" in between elements; that variable defaults to a single space. If you simply print #array, then the array elements will be joined by the variable $,, which is undef by default, meaning no space between the elements. This probably explains the behavior you mentioned ("the array vars connected with no spaces").
Note that the above will not work the way you expect if the elements of the array contain whitespace, because bash will split them into separate array elements. If your array does contain whitespace, then please provide an MCVE with sample data so we can perhaps make an alternative suggestion of how to return that back to bash. For example:
( # subshell so IFS is only affected locally
IFS=$'\n'
ARRAY=(`perl -wMstrict -e 'my #array = ("foo","bar","quz baz"); print join "\n", #array'`)
echo "0=<${ARRAY[0]}> 1=<${ARRAY[1]}> 2=<${ARRAY[2]}>"
)
Outputs: 0=<foo> 1=<bar> 2=<quz baz>

Here is one way using Bash word splitting, it will split the string on white space into the new array array:
array_str=$(perl -E '#a = 1..5; say "#a"')
array=( $array_str )
for item in ${array[#]} ; do
echo ": $item"
done
Output:
: 1
: 2
: 3
: 4
: 5

The semantics of arrays in bash

Check out the following transcript. With all possible rigor and formality, what is going on at each step?
$> ls -1 #This command prints 3 items. no explanation required.
a
b
c
$> X=$(ls -1) #Capture the output (as what? a string?)
$> Y=($(ls -1)) #Capture it again (as an array now?)
$> echo ${#X[#]} #Why is the length 1?
1
$> echo ${#Y[#]} #This works because Y is an array of the 3 items?
3
$> echo $X #Why are the linefeeds now spaces?
a b c
$> echo $Y #Why does the array echo as its first element
a
$> for x in $X;do echo $x; done #iterate over $X
a
b
c
$> for y in $Y;do echo $y; done #iterating over y doesn't work
a
$> echo ${X[2]} #I can loop over $X but not index into it?
$> echo ${Y[2]} #Why does this work if I can't loop over $Y?
c
I assume bash has well established semantics about how arrays and text variables (if that's even what they're called) work, but the user manual is not organized in an optimal fashion for someone who wants to reason about scripts based on whatever small set of underlying principles the language designer intended.

Let me preface the following with the very strong suggestion that you never use ls to populate an array. The correct code would be
Z=( * )
to create an array with each (non-hidden) file in the current directory as a distinct array element.
$> ls -1 #This command prints 3 items. no explanation required.
a
b
c
Correct. Each file name is printed on a separate line (although, beware of file names containing newlines; the parts before and after each newline would appear as separate file names.)
$> X=$(ls -1) #Capture the output (as what? a string?)
Yes. The output of ls is concatenated by the command substitution into a single string using a single space to separate each line. (The command substitution would be subject to word-splitting if it weren't the right-hand side of an assignment; word-splitting will come up below.)
$> Y=($(ls -1)) #Capture it again (as an array now?)
Same as with X, but now each of the words in the result of the command substitution is treated as a separate array element. As long as none of the output lines contain any characters in the value of IFS, each file name is one word and will be treated as a separate array element.
$> echo ${#X[#]} #Why is the length 1?
1
X, not being a real array, is treated as an array with a single element, namely the value of $X.
$> echo ${#Y[#]} #This works because Y is an array of the 3 items?
3
Correct.
$> echo $X #Why are the linefeeds now spaces?
a b c
When $X is unquoted, the resulting expansion is subject to word-splitting. In this case, the newlines are simply treated the same as any other whitespace, separating the result into a sequence of words that are passed to echo as distinct arguments, which are then displayed separated by a single space each.
$> echo $Y #Why does the array echo as its first element
a
For a true array, $Y is equivalent to ${Y[0]}.
$> for x in $X;do echo $x; done #iterate over $X
a
b
c
This works, but has caveats.
$> for y in $Y;do echo $y; done #iterating over y doesn't work
a
See above; $Y only expands to the first element. You want for y in "${Y[#]}"; do to iterate over all the elements.
$> echo ${X[2]} #I can loop over $X but not index into it?
Correct. X is not an array, but $X expanded to a space-separated list which the for loop could iterate over.
$> echo ${Y[2]} #Why does this work if I can't loop over $Y?
c
Indexing and iteration are two completely different things in shell. You don't actually iterate over an array; you iterate over the resulting sequence of words of a properly expanded array.

How to loop through the first n letters of the alphabet in bash

I know that to loop through the alphabet, one can do
for c in {a..z}; do something; done
My question is, how can I loop through the first n letters (e.g. to build a string) where n is a variable/parameter given in the command line.
I searched SO, and only found answers doing this for numbers, e.g. using C-style for loop or seq (see e.g. How do I iterate over a range of numbers defined by variables in Bash?). And I don't have seq in my environment.
Thanks.

The straightforward way is sticking them in an array and looping over that by index:
#!/bin/bash
chars=( {a..z} )
n=3
for ((i=0; i<n; i++))
do
echo "${chars[i]}"
done
Alternatively, if you just want them dash-separated:
printf "%s-" "${chars[#]:0:n}"

that other guy's answer is probably the way to go, but here's an alternative that doesn't require an array variable:
n=3 # sample value
i=0 # var. for counting iterations
for c in {a..z}; do
echo $c # do something with "$c"
(( ++i == n )) && break # exit loop, once desired count has been reached
done
#rici points out in a comment that you could make do without aux. variable $i by using the conditional (( n-- )) || break to exit the loop, but note that this modifies $n.
Here's another array-free, but less efficient approach that uses substring extraction (parameter expansion):
n=3 # sample value
# Create a space-separated list of letters a-z.
# Note that chars={a..z} does NOT work.
chars=$(echo {a..z})
# Extract the substring containing the specified number
# of letters using parameter expansion with an arithmetic expression,
# and loop over them.
# Note:
# - The variable reference must be _unquoted_ for this to work.
# - Since the list is space-separated, each entry spans 2
# chars., hence `2*n` (you could subtract 1 after, but it'll work either way).
for c in ${chars:0:2*n}; do
echo $c # do something with "$c"
done
Finally, you can combine the array and list approaches for concision, although the pure array approach is more efficient:
n=3 # sample value
chars=( {a..z} ) # create array of letters
# `${chars[#]:0:n}` returns the first n array elements as a space-separated list
# Again, the variable reference must be _unquoted_.
for c in ${chars[#]:0:n}; do
echo $c # do something with "$c"
done

Are you only iterating over the alphabet to create a subset? If that's the case, just make it simple:
$ alpha=abcdefghijklmnopqrstuvqxyz
$ n=4
$ echo ${alpha:0:$n}
abcd
Edit. Based on your comment below, do you have sed?
% sed -e 's/./&-/g' <<< ${alpha:0:$n}
a-b-c-d-

You can loop through the character code of the letters of the alphabet and convert back and forth:
# suppose $INPUT is your input
INPUT='x'
# get the character code and increment it by one
INPUT_CHARCODE=`printf %x "'$INPUT"`
let INPUT_CHARCODE++
# start from character code 61 = 'a'
I=61
while [ $I -ne $INPUT_CHARCODE ]; do
# convert the index to a letter
CURRENT_CHAR=`printf "\x$I"`
echo "current character is: $CURRENT_CHAR"
let I++
done

This question and the answers helped me with my problem, partially.
I needed to loupe over a part of the alphabet based on a letter in bash.
Although the expansion is strictly textual
I found a solution: and made it even more simple:
START=A
STOP=D
for letter in $(eval echo {$START..$STOP}); do
echo $letter
done
Which results in:
A
B
C
D
Hope it's helpful for someone looking for the same problem i had to solve,
and ends up here as well
(also answered here)
And the complete answer to the original question is:
START=A
n=4
OFFSET=$( expr $(printf "%x" \'$START) + $n)
STOP=$(printf "\x$OFFSET")
for letter in $(eval echo {$START..$STOP}); do
echo $letter
done
Which results in the same:
A
B
C
D

Operations that can be performed on Bash shell variables

I've known several operations we can do to variables in shell, e.g:
1) "#" & "##" operation
with ${var#pattern}, we remove "pattern" in the head of ${var}. "*" could be used in the pattern to match everything. And the difference between "#" and "##" is that, "##" will remove the longest match substring while "#" removes the shortest. For example,
var=brbread
${var##*br} // ead
${var#*br} // bread
2) "%" & "%%" operation
with ${var%pattern}, we remove "pattern" at the end of ${var}. Of course, "%%" indicates longest match while "%" means the shortest. For example,
var=eadbreadbread
${var%%*br} // eadbreadbread
${var%%br*} // ead
${var%br*} // eadbread
3) "/" operation
with ${var/haha/heihei}, we replace "haha" in $var with "heihei". For example,
var=ihahai
${var/haha/heihei/} / iheiheii
I'm just curious wether or not we can make more operations to variables other than above ones?
Thanks.

Yes there is a lot a other operations on variables with bash, as case modification, array keys listing, name expanding, etc.
You should check the manual page at the Parameter Expansion chapter.

In one of your examples, you could do a global replacement with two slashes:
${var//ha/hei/} # the result would be the same
(Note that in Bash, the comment character is "#".)
Here are some examples of Parameter Expansion variable operations:
Provide a default:
$ unset foo
$ bar="hello"
$ echo ${foo:-$bar} # if $foo had a value, it would be output
hello
Alternate value:
$ echo ${bar:+"goodbye"}
goodbye
$ echo ${foo:+"goodbye"} # no substitution
Substrings:
$ echo ${bar:1:2}
el
$ echo ${bar: -4:2} # from the end (note the space before the minus)
el
List of array keys:
$ array=(123 456)
$ array[12]=7890
$ echo ${!array[#]}
0 1 12
Parameter Length:
$ echo ${#bar}
5
$ echo ${#array[#]} # number of elements in an array
3
$ echo ${#array[12]} # length of an array element
4
Modify Case (Bash 4):
$ greeting="hello jim"
$ echo ${greeting^}
Hello jim
$ echo ${greeting^^}
HELLO JIM
$ greeting=($greeting)
$ echo ${greeting[#]^}
Hello Jim

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

Parse filename string and extract parent at specific level using shell - bash

Related

How to iterate over multiple variables and echo them using Shell Script?

return array from perl to bash

The semantics of arrays in bash

How to loop through the first n letters of the alphabet in bash

Operations that can be performed on Bash shell variables

Categories

Resources