Remove all chars that are not a digit from a string - bash

I'm trying to write a small function that removes all the characters that are not digits.
123a45a ---> will become ---> 12345
I came up with:
temp=$word | grep -o [[:digit:]]
echo $temp
But instead of 12345 I get 1 2 3 4 5. How do I get rid of the spaces?

Pure bash:
word=123a45a
number=${word//[^0-9]}

Here's a pure bash solution
var='123a45a'
echo ${var//[^0-9]/}
12345
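Since the question asks for a small function, the pure bash expansion above could be wrapped as follows. This is only a minimal sketch; the name only_digits is made up for illustration and does not come from any of the answers.

```shell
# only_digits is a hypothetical helper name, not part of the original answers.
only_digits() {
  # ${1//[^0-9]/} deletes every character of $1 that is not in the range 0-9
  printf '%s\n' "${1//[^0-9]/}"
}

only_digits "123a45a"   # prints 12345
```

No external process is started, which may matter if the function is called in a tight loop.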

Is this what you are looking for?
kent$ echo "123a45a"|sed 's/[^0-9]//g'
12345
Or with grep and tr:
echo "123a45a"|grep -o '[0-9]'|tr -d '\n'
12345

I would recommend using sed or perl instead:
temp="$(sed -e 's/[^0-9]//g' <<< "$word")"
temp="$(perl -pe 's/\D//g' <<< "$word")"
Edited to add: If you really need to use grep, then this is the only way I can think of:
temp="$( grep -o '[0-9]' <<< "$word" \
| while IFS= read -r ; do echo -n "$REPLY" ; done
)"
. . . but there's probably a better way. (It uses grep -o, like your solution, then runs over the lines that it outputs and re-outputs them without line-breaks.)
Edited again to add: Now that you've mentioned that you can use tr instead, this is much easier:
temp="$(tr -cd 0-9 <<< "$word")"

What about using sed?
$ echo "123a45a" | sed -r 's/[^0-9]//g'
12345
As I read that you are only allowed to use grep and tr, this can do the trick:
$ echo "123a45a" | grep -o '[[:digit:]]' | tr -d '\n'
12345
In your case,
temp=$(echo "$word" | grep -o '[[:digit:]]' | tr -d '\n')

tr will also work:
echo "123a45a" | tr -cd '[:digit:]'
# output: 12345

Grep returns the result on different lines:
$ echo -e "$temp"
1
2
3
4
5
So you cannot remove the separators during the filtering itself, but you can afterwards. Because $temp is unquoted in the command below, the shell turns the newlines into spaces, and tr then deletes them:
temp=`echo $temp | tr -d ' '`
$ echo "$temp"
12345
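The difference between the quoted and the unquoted expansion can be seen in a short sketch. The printf call here is only a stand-in for the grep -o output, and the variable name joined is an assumption made for illustration.

```shell
# Stand-in for the output of grep -o: one digit per line.
temp=$(printf '%s\n' 1 2 3 4 5)

echo "$temp"   # quoted: the newlines are kept, one digit per line
echo $temp     # unquoted: word splitting joins the digits with spaces

# Deleting the newlines directly, without relying on word splitting:
joined=${temp//$'\n'/}
echo "$joined"
```

Removing the newlines explicitly avoids depending on the current value of IFS.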

Related

count all the lines in all folders in bash [duplicate]

wc -l file.txt
outputs number of lines and file name.
I need just the number itself (not the file name).
I can do this
wc -l file.txt | awk '{print $1}'
But maybe there is a better way?

Try this way:
wc -l < file.txt

cat file.txt | wc -l
According to the man page (for the BSD version; I don't have a GNU version to check):
If no files are specified, the standard input is used and no file name is displayed. The prompt will accept input until receiving EOF, or [^D] in most environments.

To do this without the leading space, why not:
wc -l < file.txt | bc

Comparison of Techniques
I had a similar issue attempting to get a character count without the leading whitespace provided by wc, which led me to this page. After trying out the answers here, the following are the results from my personal testing on Mac (BSD Bash). Again, this is for character count; for line count you'd do wc -l. echo -n omits the trailing line break.
FOO="bar"
echo -n "$FOO" | wc -c # " 3" (x)
echo -n "$FOO" | wc -c | bc # "3" (√)
echo -n "$FOO" | wc -c | tr -d ' ' # "3" (√)
echo -n "$FOO" | wc -c | awk '{print $1}' # "3" (√)
echo -n "$FOO" | wc -c | cut -d ' ' -f1 # "" for -f < 8 (x)
echo -n "$FOO" | wc -c | cut -d ' ' -f8 # "3" (√)
echo -n "$FOO" | wc -c | perl -pe 's/^\s+//' # "3" (√)
echo -n "$FOO" | wc -c | grep -ch '^' # "1" (x)
echo $( printf '%s' "$FOO" | wc -c ) # "3" (√)
I wouldn't rely on the cut -f* method in general since it requires that you know the exact number of leading spaces that any given output may have. And the grep one works for counting lines, but not characters.
bc is the most concise, and awk and perl seem a bit overkill, but they should all be relatively fast and portable enough.
Also note that some of these can be adapted to trim surrounding whitespace from general strings, as well (along with echo `echo $FOO`, another neat trick).

How about
wc -l file.txt | cut -d' ' -f1
i.e. pipe the output of wc into cut (where delimiters are spaces and pick just the first field)

How about
grep -ch "^" file.txt

Obviously, there are a lot of solutions to this.
Here is another one though:
wc -l somefile | tr -d "[:alpha:][:blank:][:punct:]"
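This outputs only the number of lines, but the trailing newline character (\n) is still present. If you don't want that either, replace [:blank:] with [:space:]. Note that any digits in the file name survive as well, so this only works for names without digits.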

Another way to strip the leading whitespace without invoking an additional external command is to use arithmetic expansion $((exp)):
echo $(($(wc -l < file.txt)))
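The arithmetic expansion trick could be wrapped in a function along these lines. This is a sketch under the assumption that the file name is passed as the first argument; count_lines is a hypothetical name.

```shell
# count_lines is a hypothetical helper; $1 is the file to count.
count_lines() {
  local n
  # Reading from stdin keeps wc from printing the file name.
  n=$(wc -l < "$1")
  # Arithmetic expansion evaluates n as a number, dropping any padding
  # that BSD wc puts in front of the count.
  printf '%d\n' "$((n))"
}

# Usage: count_lines file.txt
```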

If the goal is to count the files in a directory tree, first find all the files, then use the awk NR (number of records) variable. This counts files (one record per path), not the lines inside them.
The command is:
find <directory path> -type f | awk 'END{print NR}'
Example: find /tmp/ -type f | awk 'END{print NR}'

This works for me using plain wc -l and sed to strip the characters that are not digits (here: lowercase letters, '-', '_', '.' and whitespace).
wc -l big_file.log | sed -E "s/([a-z\-\_\.]|[[:space:]]*)//g"
# 9249133

Add symbol every 2 bytes

I have a string 20000024ff3dbf50 that I would like to convert to 20:00:00:24:ff:3d:bf:50. I've tried with sed:
echo 20000024ff3dbf50 | sed 's/\(..\)\(..\)\(..\)\(..\)\(..\)\(..\)\(..\)\(..\)/\1:\2:\3:\4:\5:\6:\7:\8/'
but it's a little ugly.

Two substitutions:
echo "20000024ff3dbf50" | sed 's/../&:/g;s/.$//'
Results:
20:00:00:24:ff:3d:bf:50

echo 20000024ff3dbf50 | grep -o .. | paste -d ':' -s -
grep with -o splits the input into 2 characters per line;
paste then joins the lines [-s]erially, using ':' as the delimiter.

You could also use GNU awk auto-splitting for this:
echo 20000024ff3dbf50 | awk '$1=$1' FPAT=.. OFS=:
Output:
20:00:00:24:ff:3d:bf:50
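For comparison, the same result can be produced in pure bash with substring expansion. This is a sketch rather than one of the original answers; the function name colonize is made up.

```shell
# colonize is a hypothetical helper: inserts ':' after every 2 characters.
colonize() {
  local hex=$1 out= i
  for ((i = 0; i < ${#hex}; i += 2)); do
    # ${out:+:} expands to ':' only when out is already non-empty,
    # so no separator is added before the first pair.
    out+="${out:+:}${hex:i:2}"
  done
  printf '%s\n' "$out"
}

colonize 20000024ff3dbf50   # prints 20:00:00:24:ff:3d:bf:50
```

With an odd-length input the last group simply contains a single character.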

Evaluate variable to integer with sed

This may be easy for people who know sed. I have almost finished this command:
echo VERSION=1.0 | sed 's/^VERSION=\([0-9]\).\([0-9]\)/VERSION=\1.\2+1/'
I only want it to print VERSION=1.1. How can I evaluate \2 as an integer and add 1?

Of course sed can do that; that's what the e flag is for (it is a GNU sed extension). You can pass the replaced string to a shell command using "e".
See the example based on your sed line:
kent$ echo VERSION=1.0 | sed 's/^VERSION=\([0-9]\).\([0-9]\)/echo "VERSION=\1.$((\2+1))"/e'
VERSION=1.1

You can use the bc command:
echo VERSION=`echo "1.0 + 0.1" | bc`
Results in:
VERSION=1.1
See man bc for details. To write the result to a file:
echo "VERSION="`echo "v=1.0; v+=0.1; v" | bc` > myFile.txt
cat myFile.txt
VERSION=1.1

Cryptic answer - how to use the whole toolkit:
x='VERSION=1.0'
echo -n $x | sed 's/\..*/./'; expr `echo $x | grep -o '\..*' | cut -c 2-` + 1
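The increment can also be done without sed, using parameter expansion and arithmetic expansion. This is a sketch that assumes the line has the form KEY=major.minor with a purely numeric minor part; bump_minor is a hypothetical name.

```shell
# bump_minor is a hypothetical helper: adds 1 to the part after the last '.'.
bump_minor() {
  local line=$1
  local prefix=${line%.*}    # everything before the last '.', e.g. VERSION=1
  local minor=${line##*.}    # everything after the last '.', e.g. 0
  # 10# forces base 10, so a minor part such as 08 is not read as octal.
  printf '%s.%d\n' "$prefix" "$((10#$minor + 1))"
}

bump_minor 'VERSION=1.0'   # prints VERSION=1.1
```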

How to split a string in shell and get the last field

Suppose I have the string 1:2:3:4:5 and I want to get its last field (5 in this case). How do I do that using Bash? I tried cut, but I don't know how to specify the last field with -f.

You can use string operators:
$ foo=1:2:3:4:5
$ echo ${foo##*:}
5
This trims everything from the front until a ':', greedily.
${foo <-- from variable foo
## <-- greedy front trim
* <-- matches anything
: <-- until the last ':'
}

Another way is to reverse before and after cut:
$ echo ab:cd:ef | rev | cut -d: -f1 | rev
ef
This makes it very easy to get the last but one field, or any range of fields numbered from the end.

It's difficult to get the last field using cut, but here are some solutions in awk and perl
echo 1:2:3:4:5 | awk -F: '{print $NF}'
echo 1:2:3:4:5 | perl -F: -wane 'print $F[-1]'

Assuming fairly simple usage (no escaping of the delimiter, for example), you can use grep:
$ echo "1:2:3:4:5" | grep -oE "[^:]+$"
5
Breakdown - find all the characters not the delimiter ([^:]) at the end of the line ($). -o only prints the matching part.

You could try something like this if you want to use cut:
echo "1:2:3:4:5" | cut -d ":" -f5
You can also use grep, like this:
echo "1:2:3:4:5" | grep -o '[^:]*$'

One way:
var1="1:2:3:4:5"
var2=${var1##*:}
Another, using an array:
var1="1:2:3:4:5"
saveIFS=$IFS
IFS=":"
var2=($var1)
IFS=$saveIFS
var2=${var2[@]: -1}
Yet another with an array:
var1="1:2:3:4:5"
saveIFS=$IFS
IFS=":"
var2=($var1)
IFS=$saveIFS
count=${#var2[@]}
var2=${var2[$count-1]}
Using Bash (version >= 3.2) regular expressions:
var1="1:2:3:4:5"
[[ $var1 =~ :([^:]*)$ ]]
var2=${BASH_REMATCH[1]}
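The first variant above (the ## expansion) generalizes to an arbitrary delimiter. The following is a minimal sketch; the function name last_field and the default delimiter ':' are assumptions chosen for illustration.

```shell
# last_field is a hypothetical helper: prints the text after the last delimiter.
# $1 is the string, $2 the delimiter (defaults to ':').
last_field() {
  local str=$1 delim=${2:-:}
  # ##*"$delim" removes the longest prefix ending in the delimiter;
  # quoting $delim makes it match literally instead of as a glob pattern.
  printf '%s\n' "${str##*"$delim"}"
}

last_field "1:2:3:4:5"      # prints 5
last_field "a/b/c" "/"      # prints c
```

If the delimiter does not occur at all, the whole string is returned, which matches the behavior of the sed 's/.*://' variant.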

$ echo "a b c d e" | tr ' ' '\n' | tail -1
e
Simply translate the delimiter into a newline and choose the last entry with tail -1.

Using sed:
$ echo '1:2:3:4:5' | sed 's/.*://' # => 5
$ echo '' | sed 's/.*://' # => (empty)
$ echo ':' | sed 's/.*://' # => (empty)
$ echo ':b' | sed 's/.*://' # => b
$ echo '::c' | sed 's/.*://' # => c
$ echo 'a' | sed 's/.*://' # => a
$ echo 'a:' | sed 's/.*://' # => (empty)
$ echo 'a:b' | sed 's/.*://' # => b
$ echo 'a::c' | sed 's/.*://' # => c

There are many good answers here, but still I want to share this one using basename :
basename $(echo "a:b:c:d:e" | tr ':' '/')
However it will fail if there are already some '/' in your string.
If slash / is your delimiter then you just have to (and should) use basename.
It's not the best answer but it just shows how you can be creative using bash commands.

If your last field is a single character, you could do this:
a="1:2:3:4:5"
echo ${a: -1}
echo ${a:(-1)}
Check string manipulation in bash.

Using Bash.
$ var1="1:2:3:4:0"
$ IFS=":"
$ set -- $var1
$ eval echo \$${#}
0

echo "a:b:c:d:e"|xargs -d : -n1|tail -1
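First, xargs splits the input on ":" (-d :, a GNU xargs option), and -n1 puts each part on its own line. Then tail prints the last part. Note that the newline appended by echo stays attached to the last item, so feeding the input with printf '%s' may be safer.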

Regex matching in sed is greedy (always goes to the last occurrence), which you can use to your advantage here:
$ foo=1:2:3:4:5
$ echo ${foo} | sed "s/.*://"
5

A solution using the read builtin:
IFS=':' read -a fields <<< "1:2:3:4:5"
echo "${fields[4]}"
Or, to make it more generic:
echo "${fields[-1]}" # prints the last item

for x in `echo $str | tr ";" "\n"`; do echo $x; done

Improving on the answers from @mateusz-piotrowski and @user3133260:
echo "a:b:c:d::e:: ::" | tr ':' ' ' | xargs | tr ' ' '\n' | tail -1
First, tr ':' ' ' replaces each ':' with a space.
Then xargs trims and collapses the whitespace.
After that, tr ' ' '\n' replaces each remaining space with a newline.
Lastly, tail -1 gets the last string.

For those who are comfortable with Python, https://github.com/Russell91/pythonpy is a nice choice to solve this problem.
$ echo "a:b:c:d:e" | py -x 'x.split(":")[-1]'
From the pythonpy help: -x treat each row of stdin as x.
With that tool, it is easy to write python code that gets applied to the input.
Edit (Dec 2020):
Pythonpy is no longer online.
Here is an alternative:
$ echo "a:b:c:d:e" | python -c 'import sys; sys.stdout.write(sys.stdin.read().split(":")[-1])'
It contains more boilerplate code (i.e. sys.stdin.read / sys.stdout.write) but requires only the Python standard library.
