Combine expressions and parameter expansion in bash - bash

Is it possible to combine parameter expansion with arithmetic expressions in bash? For example, could I do a one-liner to evaluate lineNum or numChar here?
echo "Some lines here
Here is another
Oh look! Yet another" > $1
lineNum=$( grep -n -m1 'Oh look!' $1 | cut -d : -f 1 ) #Get line number of "Oh look!"
(( lineNum-- )) # Correct for array indexing
readarray -t lines < $1
substr=${lines[lineNum]%%Y*} # Get the substring "Oh look! "
numChar=${#substr} # Get the number of characters in the substring
(( numChar -= 2 )) # Get the position of "!" based on the position of "Y"
echo $lineNum
echo $numChar
> 2
8
In other words, can I get the position of one character in a string based on the position of another in a one-line expression?

To get the position of ! in a line that matches the Oh look! regex, just:
awk -F'!' '/Oh look!/{ print length($1) + 1; exit }' "$file"
You can also adjust the calculation to your liking, so with your original code I think that would be:
awk -F':' '/^[[:space:]][A-Z]/{ print length($1) - 2; exit }' "$file"
Is it possible to combine parameter expansion with arithmetic expressions in bash?
For computing ${#substr} you have to have the substring. So you could:
substr=${lines[lineNum-1]%%Y*}; numChar=$((${#substr} - 2))
You could also edit your grep and have the filtering from Y done by bash, but awk is going to be orders of magnitude faster:
IFS=Y read -r line _ < <(grep -m1 'Oh look!' "$file")
numChar=$((${#line} - 2))
Still you could merge the 3 lines into just:
numChar=$(( $(wc -c <<<"${lines[lineNum - 1]%%Y*}") - 3 )) # wc -c also counts the newline that the here-string appends
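To make the two-step idea concrete, here is a minimal sketch. The sample line is a made-up string without leading whitespace, and the variable names prefix and numChar are only illustrative. Bash cannot nest ${#...} directly around another expansion, so the prefix goes into a variable first and the arithmetic expansion then works on its length:

```shell
line='Oh look! Yet another'      # hypothetical sample line, no leading whitespace

# Step 1: parameter expansion, strip everything from the first "Y" onwards.
prefix=${line%%Y*}               # -> "Oh look! " (9 characters)

# Step 2: arithmetic expansion on the length of that prefix.
numChar=$(( ${#prefix} - 2 ))    # 9 - 2 = 7, the zero-based index of "!"
echo "$numChar"
```

Both steps can sit on one line separated by a semicolon, which is about as close to a one-liner as plain parameter expansion gets.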

Related

How to detect and remove indentation of a piped text

I'm looking for a way to remove the indentation of a piped text. Below is a solution using cut -c 9- which assumes the indentation is 8 characters wide.
I'm looking for a solution which can detect the number of spaces to remove. This implies going through the whole (piped) file to know the minimum number of spaces (tabs?) used to indent it, then remove them on each line.
run.sh
help() {
awk '
/esac/{b=0}
b
/case "\$arg" in/{b=1}' \
"$me" \
| cut -c 9-
}
while [[ $# -ge 1 ]]
do
arg="$1"
shift
case "$arg" in
help|h|?|--help|-h|'-?')
# Show this help
help;;
esac
done
$ ./run.sh --help
help|h|?|--help|-h|'-?')
# Show this help
help;;
Note: echo $' 4\n 2\n 3' | python3 -c 'import sys; import textwrap as tw; print(tw.dedent(sys.stdin.read()), end="")' works, but I expect there is a better way (I mean, one which depends only on software more common than Python). Maybe awk? I wouldn't mind seeing a Perl solution either.
Note2: echo $' 4\n 2\n 3' | python -c 'import sys; import textwrap as tw; print tw.dedent(sys.stdin.read()),' also works (Python 2.7.15rc1).
The following is pure bash, with no external tools or command substitutions:
#!/usr/bin/env bash
all_lines=( )
min_spaces=9999 # start with something arbitrarily high
while IFS= read -r line; do
all_lines+=( "$line" )
if [[ ${line:0:$min_spaces} =~ ^[[:space:]]*$ ]]; then
continue # this line has at least as much whitespace as those preceding it
fi
# this line has *less* whitespace than those preceding it; we need to know how much.
[[ $line =~ ^([[:space:]]*) ]]
line_whitespace=${BASH_REMATCH[1]}
min_spaces=${#line_whitespace}
done
for line in "${all_lines[@]}"; do
printf '%s\n' "${line:$min_spaces}"
done
Its output is:
4
2
3
Suppose you have:
$ echo $' 4\n 2\n 3\n\ttab'
4
2
3
tab
You can use the Unix expand utility to expand the tabs to spaces. Then run through an awk to count the minimum number of spaces on a line:
$ echo $' 4\n 2\n 3\n\ttab' |
expand |
awk 'BEGIN{min_indent=9999999}
{lines[++cnt]=$0
match($0, /^[ ]*/)
if(RLENGTH<min_indent) min_indent=RLENGTH
}
END{for (i=1;i<=cnt;i++)
print substr(lines[i], min_indent+1)}'
4
2
3
tab
Here's the (semi-) obvious temp file solution.
#!/bin/sh
t=$(mktemp -t dedent.XXXXXXXXXX) || exit
trap 'rm -f "$t"' EXIT # ERR is a bash extension and is rejected by some /bin/sh implementations
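# match() returns the 1-based column of the first non-space character, which is what cut -c expects
awk '{ n = match($0, /[^ ]/); if (NR == 1 || n<min) min = n }1
END { exit (min ? min : 1) }' >"$t"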
cut -c $?- "$t"
This obviously fails if all lines have more than 255 leading whitespace characters because then the result won't fit into the exit code from Awk.
This has the advantage that we are not restricting ourselves to the available memory. Instead, we are restricting ourselves to the available disk space. The drawback is that disk might be slower, but the advantage of not reading big files into memory will IMHO trump that.
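The exit code limit mentioned above is easy to check. This is only a small demonstration, with the value 300 picked arbitrarily: an exit status is a single byte, so on typical Unix systems anything above 255 wraps around modulo 256.

```shell
status=0
# Ask awk to exit with a value that does not fit into one byte.
awk 'BEGIN { exit 300 }' || status=$?
echo "$status"    # 300 modulo 256
```

So an indentation width smuggled out through $? is only trustworthy while it stays below 256.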
echo $' 4\n 2\n 3\n \n more spaces in the line\n ...' | \
(text="$(cat)"; echo "$text" \
| cut -c "$(echo "$text" | sed 's/[^ ].*$//' | awk 'NR == 1 {a = length} length < a {a = length} END {print a + 1}')-"\
)
With explanations:
echo $' 4\n 2\n 3\n \n more spaces in the line\n ...' | \
(
text="$(cat)" # Obtain the input in a variable
echo "$text" | cut -c "$(
# `cut` removes the n-1 first characters of each line of the input, where n is:
echo "$text" | \
sed 's/[^ ].*$//' | \
awk 'NR == 1 || length < a {a = length} END {print a + 1}'
# sed: keep only the initial spaces, remove the rest
# awk:
# At the first line `NR == 1`, get the length of the line `a = length`.
# For any shorter line `length < a`, update the length `a = length`.
# At the end of the piped input, print the shortest length + 1.
# ... we add 1 because in `cut`, characters of the line are indexed at 1.
)-"
)
Update:
It is possible to avoid spawning sed. As per tripleee's comment, sed's s/// can be replaced by awk's sub(). Here is an even shorter option, using n = match() as in tripleee's answer.
echo $' 4\n 2\n 3\n \n more spaces in the line\n ...' | \
(
text="$(cat)" # Obtain the input in a variable
echo "$text" | cut -c "$(
# `cut` removes the a-1 first characters of each line of the input, where a is:
echo "$text" | \
awk '
{n = match($0, /[^ ]/) ? RSTART - 1 : length($0)}
NR == 1 || n < a {a = n}
a == 0 {exit}
END {print a + 1}'
# awk:
# At every line, count the leading spaces: the position of the first non-space character minus 1 (or the whole length for an all-space line).
# At the first line `NR == 1`, copy that length to `a`.
# For any line with fewer spaces than `a` (`n < a`), update `a` (`a = n`).
# Once `a` is 0 nothing can be removed, so `exit` skips the rest of the input; awk still runs the `END` block after `exit`.
# At the end of the piped input, print a + 1.
# a is then the minimum number of common leading spaces found in all lines.
# ... we add 1 because in `cut`, characters of the line are indexed at 1.
#
# I'm not sure whether the `a == 0 {exit}` optimisation will let the "$text" be written to the script stdout yet (which is not desirable at all). Gotta test that when I get the time.
)-"
)
Apparently, it's also possible to do in Perl 6 with the function my &f = *.indent(*);.
Another solution with awk, based on dawg’s answer. Major differences include:
No need to set an arbitrary large number for indentation, which feels hacky.
Works on text with empty lines, by not considering them when gathering the lowest indented line.
awk '
{
lines[++count] = $0
if (NF == 0) next
match($0, /[^ ]/)
if (length(min) == 0 || RSTART < min) min = RSTART
}
END {
for (i = 1; i <= count; i++) print substr(lines[i], min)
}
' <<< $' 4\n 2\n 3'
Or all on the same line
awk '{ lines[++count] = $0; if (NF == 0) next; match($0, /[^ ]/); if (length(min) == 0 || RSTART < min) min = RSTART; } END { for (i = 1; i <= count; i++) print substr(lines[i], min) }' <<< $' 4\n 2\n 3'
Explanation:
Add current line to an array, and increment count variable
{
lines[++count] = $0
If line is empty, skip to next iteration
if (NF == 0) next
Set RSTART to the start index of the first non-space character.
match($0, /[^ ]/)
If min isn’t set or is higher than RSTART, set the former to the latter.
if (length(min) == 0 || RSTART < min) min = RSTART
}
Run after all input is read.
END {
Loop over the array, and for each line print only a substring going from the index set in min to the end of the line.
for (i = 1; i <= count; i++) print substr(lines[i], min)
}
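For use in a pipeline, such as the help function in the question, the same idea can be wrapped in a shell function. This is only a sketch: the name dedent is made up here, it uses NF as a pattern and !min instead of length(min) == 0, and like the answer it treats only spaces (not tabs) as indentation.

```shell
dedent() {
  awk '
    { lines[++count] = $0 }                      # remember every line
    NF {                                         # only non-blank lines set the minimum
      match($0, /[^ ]/)
      if (!min || RSTART < min) min = RSTART
    }
    END {
      if (!min) min = 1                          # nothing but blank lines: remove nothing
      for (i = 1; i <= count; i++) print substr(lines[i], min)
    }'
}

# The blank second line is kept but does not affect the detected indentation.
printf '    a\n\n      b\n  c\n' | dedent
```

Here the least indented line is "  c", so two spaces are removed from every line.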
solution using bash
#!/usr/bin/env bash
cb=$(xclip -selection clipboard -o)
firstchar=${cb::1}
if [ "$firstchar" == $'\t' ];then
tocut=$(echo "$cb" | awk -F$'\t' '{print NF-1;}' | sort -n | head -n1)
else
tocut=$(echo "$cb" | awk -F '[^ ].*' '{print length($1)}' | sort -n | head -n1)
fi
echo "$cb" | cut -c$((tocut+1))- | xclip -selection clipboard
Note: assumes first line has the left-most indent
Works for both spaces and tabs
Ctrl+C some text, run that bash script, and now the dedented text is saved to your clipboard
solution using python
detab.py
import sys
import textwrap
data = sys.stdin.readlines()
data = "".join(data)
print(textwrap.dedent(data))
use with pipes
xclip -selection clipboard -o | python detab.py | xclip -selection clipboard

How to add a hyphen after every fifth character of a word in bash

Given "ABCDEFGHIJKLMNOPQRSTUVWXY"
How does one achieve this outcome? "ABCDE-FGHIJ-KLMNO-PQRST-UVWXY"
With sed you can do this by first adding a - after every 5 characters, then removing the trailing - at the end of the line:
$ sed -E 's/.{5}/&-/g; s/-$//' <<<"ABCDEFGHIJKLMNOPQRSTUVWXY"
ABCDE-FGHIJ-KLMNO-PQRST-UVWXY
In extended (-E) mode:
.{5} matches any 5 characters
&- replaces with the whole match (the 5 characters) plus -
Then the second substitution command matches - at the end of the line ($) and replaces with nothing.
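The two substitutions described above can be tried on inputs of different lengths. A small sketch, with the function name hyphenate invented for illustration:

```shell
hyphenate() {
  # Add "-" after every 5 characters, then drop a hyphen left at the end of the line.
  printf '%s\n' "$1" | sed -E 's/.{5}/&-/g; s/-$//'
}

hyphenate ABCDEFGHIJKLMNOPQRSTUVWXY   # length is a multiple of 5: trailing "-" is removed
hyphenate ABCDEFG                     # shorter last group: ABCDE-FG
```

One thing to keep in mind is that the second substitution also strips a hyphen that was already at the end of the input.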
With GNU awk, one option would be to use FPAT to define the way the line is interpreted as a series of fields, then add - between each field:
$ awk -v FPAT='.{5}' -v OFS='-' '{ $1 = $1 } 1' <<<"ABCDEFGHIJKLMNOPQRSTUVWXY"
ABCDE-FGHIJ-KLMNO-PQRST-UVWXY
The field pattern FPAT is defined as any 5 characters and the Output Field Separator OFS is defined as -. $1 = $1 "touches" every line, causing it to be reformatted (without this part, nothing would happen). 1 is the shortest true condition causing each line to be printed.
It's not too difficult to do this in bash either:
#!/bin/bash
input="ABCDEFGHIJKLMNOPQRSTUVWXY"
parts=()
# build an array from slices of length 5
for (( i = 0; i < ${#input}; i += 5 )) do
parts+=( "${input:i:5}" )
done
# join the array on IFS (use a subshell to avoid modifying IFS for rest of script)
( IFS=-; echo "${parts[*]}" )
Could you please try the following.
echo "ABCDEFGHIJKLMNOPQRSTUVWXY" | sed 's/...../&-/g;s/-$//'
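A simple solution for only letters (it leaves a trailing hyphen when the length is an exact multiple of five) will be
sed -E 's/[A-Z]{4}./&-/g' file.txt
For the 24-character input ABCDEFGHIJKLMOPQRSTUVWXY (no N), the output will be: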
ABCDE-FGHIJ-KLMOP-QRSTU-VWXY
If you want to include more than capital letters, just do:
sed -E 's/[A-Za-z]{4}./&-/g' file.txt
Try this
#!/bin/bash
s="ABCDEFGHIJKLMNOPQRSTUVWXY"
a=($(echo ${s} | grep -o .))
o=""
i=0
while [[ ${i} -lt ${#a[@]} ]]; do
o="${o}${a[${i}]}"
(( i++ ))
[[ $(( i % 5 )) -eq 0 ]] && [[ ${i} -ne ${#a[@]} ]] && o="${o}-"
done
echo ${o}
exit 0
another solution with fold/paste
$ echo {A..Y} | tr -d ' ' | # this is to generate the string
fold -w5 | paste -sd-
ABCDE-FGHIJ-KLMNO-PQRST-UVWXY
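To see what each stage of the fold/paste pipeline contributes, the intermediate result can be captured separately. A short sketch with a made-up 13-character input:

```shell
s=ABCDEFGHIJKLM

# fold breaks the string into lines of at most 5 characters.
chunks=$(printf '%s\n' "$s" | fold -w5)
printf '%s\n' "$chunks"

# paste -s joins those lines back into one, with "-" between them.
joined=$(printf '%s\n' "$chunks" | paste -sd- -)
printf '%s\n' "$joined"
```

Because paste only puts the delimiter between lines, there is no trailing hyphen to clean up, whatever the input length.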
This might work for you (GNU sed):
sed 's/.\{5\}\B/&-/g' file
Insert a hyphen every five characters as long as the fifth character is inside a word.
Yet another choice
perl -pe 's/(.{5})(?=.)/$1-/g' file
Match 5 characters that are followed by another character (to avoid the trailing hyphen problem)

Column separation inside shell script

If I have file.txt with the data:
abcd!1023!92
efgh!9873!xk
and a basic tutorial.sh file which goes through each line
while read line
do
name = $line
done < $1
How do I separate the data between the "!" into columns, select the second column, and add its values? (I am aware of the "sed -k 2 | bc" function, but I can't/do not understand how to get it to work with a shell script.)
You can use awk:
awk -F '!' '{sum += $2} END{print sum}' file
10896
To adjust your while loop:
while IFS='!' read -r a b c
do
((sum += b))
done < "$1" # always quote "$vars"
echo "$sum"
IFS is the shell's "internal field separator" used for splitting strings into words. It's normally "whitespace" but you can use it for your specific needs.
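A small sketch of that splitting, using the two sample lines from the question as inline data. The variable names are only illustrative. Writing IFS='!' in front of read changes the separator for that one command, so the rest of the script keeps the default:

```shell
# Split a single line into three variables.
IFS='!' read -r first_name first_number first_code <<'EOF'
abcd!1023!92
EOF
echo "$first_number"

# The same split inside a loop, summing the second column.
sum=0
while IFS='!' read -r _name value _rest; do
  sum=$(( sum + value ))
done <<'EOF'
abcd!1023!92
efgh!9873!xk
EOF
echo "$sum"
```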

adding numbers without grep -c option

I have a txt file like
Peugeot:406:1999:Silver:1
Ford:Fiesta:1995:Red:2
Peugeot:206:2000:Black:1
Ford:Fiesta:1995:Red:2
I am looking for a command that counts the number of red Ford Fiesta cars.
The last number in each line is the amount of that particular car.
The command I am looking for CANNOT use the -c option of grep.
so this command should just output the number 4.
Any help would be welcome, thank you.
A simple bit of awk would do the trick:
awk -F: '$1=="Ford" && $4=="Red" { c+=$5 } END { print c }' file
Output:
4
Explanation:
The -F: switch means that the input field separator is a colon, so the car manufacturer is $1 (the 1st field), the model is $2, etc.
If the 1st field is "Ford" and the 4th field is "Red", then add the value of the 5th (last) field to the variable c. Once the whole file has been processed, print out the value of c.
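The one-liner above tests the make and the colour but not the model, which is enough for the sample data. If the file could also contain other red Fords, the model can be tested too. A sketch under that assumption: the function name count_cars and the extra Ford:Focus row are invented for illustration.

```shell
count_cars() {
  # $1 = make, $2 = model, $3 = colour; reads the car list from stdin.
  awk -F: -v make="$1" -v model="$2" -v color="$3" '
    $1 == make && $2 == model && $4 == color { c += $5 }
    END { print c + 0 }                 # c + 0 prints 0 when nothing matched
  '
}

total=$(count_cars Ford Fiesta Red <<'EOF'
Peugeot:406:1999:Silver:1
Ford:Fiesta:1995:Red:2
Peugeot:206:2000:Black:1
Ford:Fiesta:1995:Red:2
Ford:Focus:2001:Red:3
EOF
)
echo "$total"
```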
For a native bash solution:
c=0
while IFS=":" read -ra col; do
[[ ${col[0]} == Ford ]] && [[ ${col[3]} == Red ]] && (( c += col[4] ))
done < file && echo $c
Effectively applies the same logic as the awk one above, without any additional dependencies.
Methods:
1.) Use some scripting language for counting, like awk or perl. An awk solution is already posted; here is a Perl solution.
perl -F: -lane '$s+=$F[4] if m/Ford:.*:Red/}{print $s' < carfile
#or
perl -F: -lane '$s+=$F[4] if ($F[0]=~m/Ford/ && $F[3]=~/Red/)}{print $s' < carfile
Both examples print
4
2.) The second method is based on shell-pipelining
filter out the right rows
extract the column with the count
sum the numbers
Some examples:
grep 'Ford:.*:Red:' carfile | cut -d: -f5 | paste -sd+ | bc
the grep filters out the right rows
the cut gets the last column
the paste creates a line like 2+2
the bc evaluates that line
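The paste step is the least obvious one, so here it is in isolation. In this sketch the two counts are fed in with printf instead of grep and cut, and shell arithmetic stands in for bc so that the snippet needs nothing beyond paste:

```shell
# Join the numbers (one per line) into a single expression.
expr_text=$(printf '2\n2\n' | paste -sd+ -)
echo "$expr_text"

# Evaluate the expression; piping "$expr_text" into bc gives the same number.
echo $(( $expr_text ))
```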
Another example:
sed -n 's/\(Ford:.*:Red\):\(.*\)/\2/p' carfile | paste -sd+ | bc
the sed filters and extracts
Another example, with a different way of counting:
(echo 0 ; sed -n 's/\(Ford:.*:Red\):\(.*\)/\2+/p' carfile ;echo p )| dc
The numbers are summed by the RPN calculator dc, which works like 0 2 +: first come the values and last the operation.
the first echo pushes 0 onto the stack
the sed creates a stream of numbers like 2+ 2+
the last echo p prints the top of the stack
There are many other ways to sum a stream of numbers,
e.g. summing in bash:
while read -r num
do
sum=$(( $sum + $num ))
done < <(sed -n 's/\(Ford:.*:Red\):\(.*\)/\2/p' carfile)
and pure bash:
while IFS=: read -r maker model year color count
do
if [[ "$maker" == "Ford" && "$color" == "Red" ]]
then
(( sum += $count ))
fi
done < carfile
echo $sum

Replacing numbers with SED

I'm trying to replace numbers from -20 to 30 using sed, but it adds "v" character. What's wrong?
For example: SINR=-18, output must be "c", but output is "vc".
I tried to delete the 1st character, but it returns 1 instead of j.
SINR=`curl -s http://10.0.0.1/status | awk '/3GPP.SINR=/ {print $0}' | awk -F "3GPP.SINR=" '{print $2}'` # returns number
echo $SINR | sed "s/-20/a/;s/-19/b/;s/-18/c/;s/-17/d/;s/-16/e/;s/-15/f/;s/-14/g/;s/-13/h/;s/-12/i/;s/-11/j/;s/-10/k/;s/-9/l/;s/-8/m/;s/-7/n/;s/-6/o/;s/-5/p/;s/-4/q/;s/-3/r/;s/-2/s/;s/-1/t/;s/0/u/;s/1/v/;s/2/w/;s/3/x/;s/4/y/;s/5/z/;s/6/A/;s/7/B/;s/8/C/;s/9/D/;s/10/E/;s/11/F/;s/12/G/;s/13/H/;s/14/I/;s/15/J/;s/16/K/;s/17/L/;s/18/M/;s/19/N/;s/20/O/;s/21/P/;s/22/Q/;s/23/R/;s/24/S/;s/25/T/;s/26/U/;s/27/V/;s/28/W/;s/29/X/;s/30/Y/"
This way would be more elegant and less error-prone:
echo $SINR | awk 'BEGIN { chars="abcdefg" } { print substr(chars, $1 + 21, 1) }'
Of course, chars should contain all the letters you need for the mapping. That is, all the way until ...VWXY as in your example, I just wrote until g to keep it short and sweet.
With this solution your problem disappears.
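Spelled out with the complete character list from the question's sed command, the lookup could look like this. A sketch only: the function name sinr_to_char and the range check are additions, not part of the answer.

```shell
sinr_to_char() {
  awk -v n="$1" 'BEGIN {
    # 26 lowercase letters for -20..5, then A..Y for 6..30
    chars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXY"
    if (n < -20 || n > 30) exit 1      # outside the mapped range
    print substr(chars, n + 21, 1)     # -20 maps to position 1
  }'
}

sinr_to_char -18
sinr_to_char 10
sinr_to_char 30
```

Since each number is turned into an index instead of being pattern-matched, a value like 10 cannot be mistaken for a 1 followed by a 0.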
You don't really need sed or awk if you have bash like you say you do. You can use arrays, which is maybe even less error-prone ;-)
map=({a..z} {A..Z}) # Create map of your characters
SINR=-18 # Set your SINR number to something
SINR=$(($SINR+20)) # Add an offset to get to right place
result=${map[$SINR]} # Lookup your result
echo $result # Print it
c
If you have a mapping process, you're surely better off building a switch statement, a couple of if's, or even using bash associative arrays (bash >= 4.0). For example, you could tackle your problem with the following snippet:
function mapper() {
if [[ $1 -ge -20 && $1 -le 5 ]]; then
printf \\$(printf '%03o' $(( $1 + 117 )) )
elif [[ $1 -ge 6 && $1 -le 30 ]]; then
printf \\$(printf '%03o' $(( $1 + 59 )) )
else
echo ""; return 1
fi
return 0
}
And use like below:
$ mapper -20
a
$ mapper 5
z
$ mapper 6
A
$ mapper 30
Y
$ mapper $SINR
c
echo "${SINR}" | sed 's/-20/a/;t;s/-19/b/;t;s/-18/c/;t;s/-17/d/;t;s/-16/e/;t;s/-15/f/;t;s/-14/g/;t;s/-13/h/;t;s/-12/i/;t;s/-11/j/;t;s/-10/k/;t;s/-9/l/;t;s/-8/m/;t;s/-7/n/;t;s/-6/o/;t;s/-5/p/;t;s/-4/q/;t;s/-3/r/;t;s/-2/s/;t;s/-1/t/;t;s/0/u/;t;s/1/v/;t;s/2/w/;t;s/3/x/;t;s/4/y/;t;s/5/z/;t;s/6/A/;t;s/7/B/;t;s/8/C/;t;s/9/D/;t;s/10/E/;t;s/11/F/;t;s/12/G/;t;s/13/H/;t;s/14/I/;t;s/15/J/;t;s/16/K/;t;s/17/L/;t;s/18/M/;t;s/19/N/;t;s/20/O/;t;s/21/P/;t;s/22/Q/;t;s/23/R/;t;s/24/S/;t;s/25/T/;t;s/26/U/;t;s/27/V/;t;s/28/W/;t;s/29/X/;t;s/30/Y/'
Use the t after s// to accelerate a bit.
vc does not normally occur if SINR is just a number as specified
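Independently of t, the patterns in these sed commands are unanchored and the single digits come before the two-digit numbers, so a value such as 10 is rewritten digit by digit before s/10/E/ is ever tried. A shortened demonstration with only three of the substitutions, where anchoring with ^ and $ is shown as one possible way around it:

```shell
# Same order as in the long command: 0 and 1 are handled before 10.
unanchored=$(echo 10 | sed 's/0/u/;s/1/v/;s/10/E/')
echo "$unanchored"

# With anchors, each pattern only matches when it is the whole line.
anchored=$(echo 10 | sed 's/^0$/u/;s/^1$/v/;s/^10$/E/')
echo "$anchored"
```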

Resources