How to measure the depth of a file system path? - macos

I'm looking for a way to do this on the command line, since this is not too hard a task in Java or Python.
Something like:
$ measure_depth /a/b/c/d/e/f
6
$ measure_depth /a
1
This question is functionally equivalent to "is there an easy way to count the number of slashes in a filename?"

Define a measure_depth function:
measure_depth() { echo "${*#/}" | awk -F/ '{print NF}'; }
Then, use it as follows:
$ measure_depth /a/b/c/d/e/f
6
$ measure_depth /a
1

You can turn each slash into a newline and count the lines:
tr -s "/" "\n" | wc -l
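For an absolute path this gives you one too many, because the leading slash produces an empty first line. A "hacky" way around it is to strip the leading slash first: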
sed "s/^\///" | tr -s "/" "\n" | wc -l
echo "/a/b/c/d/e/f" | sed "s/^\///" | tr -s "/" "\n" | wc -l
6

Run the path through realpath before counting the slashes to avoid overestimating the depth. For example, /home/user/../user/../user/../user/dir/ would be resolved to /home/user/dir.
realpath <dir> | grep -o '/' | wc -l
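If you'd rather not spawn awk, sed or tr, a pure-shell sketch is also possible. The function name path_depth is made up for illustration. It splits the argument on / and counts the non-empty components, so a doubled or trailing slash does not inflate the result. It does not resolve .. the way realpath does.

```shell
# path_depth is a hypothetical helper name, not taken from the answers above.
path_depth() {
    (
        IFS=/      # split on slashes only
        set -f     # keep components such as "*" from being glob-expanded
        n=0
        for part in $1; do
            # Empty fields come from the leading slash or from "//".
            [ -n "$part" ] && n=$((n + 1))
        done
        echo "$n"
    )
}

path_depth /a/b/c/d/e/f   # prints 6
path_depth /a             # prints 1
```

The subshell keeps the IFS and set -f changes from leaking into the calling shell.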

Related

How to filter all the paths from the urls using "sed" or "grep"

I was trying to filter all the files from the URLs and get only paths.
echo -e "http://sub.domain.tld/secured/database_connect.php\nhttp://sub.domain.tld/section/files/image.jpg\nhttp://sub.domain.tld/.git/audio-files/top-secret/audio.mp3" | grep -Ei "(http|https)://[^/\"]+" | sort -u
http://sub.domain.tld
But I want the result like this
http://sub.domain.tld/secured/
http://sub.domain.tld/section/files/
http://sub.domain.tld/.git/audio-files/top-secret/
Is there any way to do it with sed or grep?
Using grep
$ echo ... | grep -o '.*/'
http://sub.domain.tld/secured/
http://sub.domain.tld/section/files/
http://sub.domain.tld/.git/audio-files/top-secret/
with grep
If your grep has the -o option:
... | grep -Eio 'https?://.*/'
If there could be multiple URLs per line:
... | grep -Eio 'https?://[^[:space:]]+/'
with sed
If the input is always precisely one URL per line and nothing else, you can just delete the filename part:
... | sed 's/[^/]*$//'
You could use awk's match function, which works in any version of awk. The echo command's output is piped to awk; match finds everything up to the last occurrence of /, and substr prints that match. Because the length passed is RLENGTH-1, the trailing / itself is omitted; pass RLENGTH instead to keep it.
your_echo_command | awk 'match($0,/.*\//){print substr($0,RSTART,RLENGTH-1)}'
GNU Awk
$ echo ... | awk 'match($0,/.*\//,a){print a[0]}'
$ echo ... | awk '{print gensub(/(.*\/).*/,"\\1",1)}'
$ echo ... | awk 'sub(/[^/]*$/,"")'
http://sub.domain.tld/secured/
http://sub.domain.tld/section/files/
http://sub.domain.tld/.git/audio-files/top-secret/
xargs
$ echo ... | xargs -I{} sh -c 'echo "$(dirname "{}")/"'
http://sub.domain.tld/secured/
http://sub.domain.tld/section/files/
http://sub.domain.tld/.git/audio-files/top-secret/
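The same trimming can be sketched with plain shell parameter expansion, assuming one URL per line. The helper name url_dir is hypothetical. Note that a URL with no path at all (for example http://sub.domain.tld) would be cut back to http://, so this only suits input like the sample.

```shell
# url_dir is a hypothetical helper: it drops everything after the last "/".
url_dir() {
    # ${1%/*} removes the shortest suffix that starts with "/";
    # the format string puts the slash back.
    printf '%s/\n' "${1%/*}"
}

printf '%s\n' \
    "http://sub.domain.tld/secured/database_connect.php" \
    "http://sub.domain.tld/section/files/image.jpg" \
    "http://sub.domain.tld/.git/audio-files/top-secret/audio.mp3" |
while IFS= read -r url; do
    url_dir "$url"
done
```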

how to grep for "+" or "-" values?

I need to grep for values with + or - symbol
eg: -3 or +3
I tried the following, but none of these work:
$ echo $accessTime | grep "^((\+)|(\-))[0-9]+$"
$ echo $accessTime | grep "+[0-9]+"
$ echo $accessTime | grep "\+[0-9]+"
$ echo $accessTime | grep "'+'[0-9]+"
$ echo $accessTime | grep "^'\+'[0-9]+"
$ echo $accessTime | grep "^(\+)[0-9]+"
$ echo $accessTime | grep "^(\+)[0-9]+"
Can you please help me? I have only been learning bash for the past few days. Thanks.
It's easier than you think. You should try:
echo "$accessTime" | grep '[+-]'
You can use:
accessTime='+3'
echo "$accessTime" | grep "+[0-9]\+"
+3
The quantifier + needs to be escaped in normal grep (\+ is a GNU extension to basic regular expressions), and a literal + must not be escaped.
With grep -E it is exactly reverse:
echo "$accessTime" | grep -E "\+[0-9]+"
+3
The quantifier + must not be escaped in grep -E (extended regex), and a literal + needs to be escaped.
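To match only a whole value such as +3 or -3, the two ideas above can be combined into one anchored pattern. This is a sketch; the function name is_signed_int is invented for illustration. Inside a bracket expression the + is literal, and putting the - last keeps it from being read as a range.

```shell
# is_signed_int is a hypothetical helper: succeeds only for a sign followed by digits.
is_signed_int() {
    # -E: extended regex, so the trailing + is a quantifier without escaping.
    # -q: no output, only the exit status matters.
    printf '%s\n' "$1" | grep -Eq '^[+-][0-9]+$'
}

accessTime='+3'
if is_signed_int "$accessTime"; then
    echo "signed value: $accessTime"
fi
```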

How to get "wc -l" to print just the number of lines without file name?

wc -l file.txt
outputs number of lines and file name.
I need just the number itself (not the file name).
I can do this
wc -l file.txt | awk '{print $1}'
But maybe there is a better way?
Try this way:
wc -l < file.txt
cat file.txt | wc -l
According to the man page (for the BSD version, I don't have a GNU version to check):
If no files are specified, the standard input is used and no file name is displayed. The prompt will accept input until receiving EOF, or [^D] in most environments.
To do this without the leading space, why not:
wc -l < file.txt | bc
Comparison of Techniques
I had a similar issue attempting to get a character count without the leading whitespace provided by wc, which led me to this page. After trying out the answers here, the following are the results from my personal testing on Mac (BSD Bash). Again, this is for character count; for line count you'd do wc -l. echo -n omits the trailing line break.
FOO="bar"
echo -n "$FOO" | wc -c # " 3" (x)
echo -n "$FOO" | wc -c | bc # "3" (√)
echo -n "$FOO" | wc -c | tr -d ' ' # "3" (√)
echo -n "$FOO" | wc -c | awk '{print $1}' # "3" (√)
echo -n "$FOO" | wc -c | cut -d ' ' -f1 # "" for -f < 8 (x)
echo -n "$FOO" | wc -c | cut -d ' ' -f8 # "3" (√)
echo -n "$FOO" | wc -c | perl -pe 's/^\s+//' # "3" (√)
echo -n "$FOO" | wc -c | grep -ch '^' # "1" (x)
echo $( printf '%s' "$FOO" | wc -c ) # "3" (√)
I wouldn't rely on the cut -f* method in general since it requires that you know the exact number of leading spaces that any given output may have. And the grep one works for counting lines, but not characters.
bc is the most concise, and awk and perl seem a bit overkill, but they should all be relatively fast and portable enough.
Also note that some of these can be adapted to trim surrounding whitespace from general strings, as well (along with echo `echo $FOO`, another neat trick).
How about
wc -l file.txt | cut -d' ' -f1
i.e. pipe the output of wc into cut (where delimiters are spaces and pick just the first field)
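Since cut -d' ' -f1 comes back empty when wc pads its output with leading spaces (as in the comparison above), the shell's own read can do the splitting instead. A minimal sketch, with line_count as a made-up function name:

```shell
# line_count is a hypothetical helper name.
line_count() {
    # read splits on whitespace, so any leading padding is skipped and
    # the file name ends up in the throwaway variable _rest.
    wc -l "$1" | { read -r count _rest; printf '%s\n' "$count"; }
}

# Usage: line_count file.txt
```

This works the same whether or not wc pads the number, so it does not depend on knowing how many leading spaces there are.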
How about
grep -ch "^" file.txt
Obviously, there are a lot of solutions to this.
Here is another one though:
wc -l somefile | tr -d "[:alpha:][:blank:][:punct:]"
This outputs only the number of lines, but the trailing newline character (\n) is still present. If you don't want that either, replace [:blank:] with [:space:]. Note that digits in the file name would survive, since tr deletes only letters, blanks and punctuation.
Another way to strip the leading whitespace without invoking an additional external command is to use arithmetic expansion, $((exp)):
echo $(($(wc -l < file.txt)))
To count the files in a directory, first find all the files, then let awk print NR (its number-of-records variable) at the end. Note that this counts files, not lines:
find <directory path> -type f | awk 'END{print NR}'
Example: find /tmp/ -type f | awk 'END{print NR}'
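If the goal is the total number of lines across all files under a directory, one possible sketch is to let find hand every file to cat and count once at the end. The function name total_lines is hypothetical, and it assumes every regular file under the directory is readable text.

```shell
# total_lines is a hypothetical helper name.
total_lines() {
    # -exec cat {} + concatenates all regular files; wc reads stdin, so it
    # prints no file name, and tr removes any padding spaces.
    find "$1" -type f -exec cat {} + | wc -l | tr -d ' '
}

# Usage: total_lines /some/directory
```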
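This works for me, using plain wc -l and sed to strip everything that is not part of the number (the pattern covers lowercase letters, -, _, . and whitespace, so it assumes a file name made of only those).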
wc -l big_file.log | sed -E "s/([a-z\-\_\.]|[[:space:]]*)//g"
# 9249133

bash: Grab fields 5 and 7 from a Unix path?

Given paths like this:
/data/mirrors/third-party/centos/5/projectA/x86_64
/data/mirrors/third-party/centos/5/projectA/i386
/data/mirrors/third-party/centos/5/projectA/noarch
/data/mirrors/third-party/centos/4/projectB/x86_64
/data/mirrors/third-party/centos/4/projectB/i386
/data/mirrors/third-party/centos/4/projectB/noarch
/data/mirrors/third-party/centos/4/projectC/x86_64
/data/mirrors/third-party/centos/4/projectC/i386
/data/mirrors/third-party/centos/4/projectC/noarch
How can I grab the values from field 5 and 7 ('5' and 'x86_64') using Bash shell commands?
I have something like this so far, but I'm looking for something more elegant, and without the need to capture the 'junk*':
cd /data/mirrors/third-party/centos/5/project/x86_64
echo `pwd` | tr '/' ' ' | while read junk1 junk2 junk3 junk4 version junk5 arch; do
echo version=$version arch=$arch
done
version=5 arch=x86_64
This works for me:
pwd | awk -F'/' '{print "version=" $6 " arch=" $8}'
You can use IFS and an array to split the directory into its components:
#!/bin/bash
saveIFS=$IFS
IFS='/'
dirs=($(pwd))
IFS=$saveIFS
version=${dirs[5]}
arch=${dirs[7]}
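A variant of the same IFS idea that avoids saving and restoring IFS is to do the split inside a subshell and use the positional parameters. This is only a sketch; path_fields is an invented name, and the field numbers assume the layout shown in the question.

```shell
# path_fields is a hypothetical helper name.
path_fields() {
    (
        IFS=/      # split on slashes only
        set -f     # no glob expansion of the components
        # Field 1 is the empty string before the leading slash,
        # so the version is field 6 and the architecture is field 8.
        set -- $1
        printf 'version=%s arch=%s\n' "$6" "$8"
    )
}

path_fields /data/mirrors/third-party/centos/5/projectA/x86_64
# prints: version=5 arch=x86_64
```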
> p=$(pwd)
> echo $p
/data/mirrors/third-party/centos/5/projectA/x86_64
> basename ${p}
x86_64
> basename ${p%/*/*}
5
You can also use something like:
echo `expr match "$p" '<regular-expression>'`
...perhaps someone might help me with that regular expression ;)
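One possible pair of regular expressions, using the portable expr STRING : REGEX form rather than the GNU-only expr match. The patterns assume the layout from the question: the version is the fifth path component and the architecture is the last one.

```shell
p=/data/mirrors/third-party/centos/5/projectA/x86_64

# expr patterns are anchored at the start; \( \) marks the part to print.
# Skip four components, then capture the fifth.
version=$(expr "$p" : '/[^/]*/[^/]*/[^/]*/[^/]*/\([^/]*\)/')
# Greedy .* runs to the last slash; capture what follows it.
arch=$(expr "$p" : '.*/\([^/]*\)$')

echo "version=$version arch=$arch"
```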
Try this to display just the fields:
echo `pwd` | cut -d'/' -f6,8 | tr '/' ' '
Or, to display them with the strings version and arch:
echo `pwd` | cut -d'/' -f6,8 | sed -e 's/\(.*\)\/\(.*\)/version=\1 arch=\2/'
