Bash get text between 5th and 6th underscore in a variable - bash

I have a variable called $folder_name which contains the string
Release_2019_Config_V6_Standalone_PJ6678_Test
which is the name of a folder.
I'm trying to extract PJ6678 from the folder name.
I know the folder name will put the user id (the text I need) between the 5th and 6th underscore, I don't know what text/symbols will be present after the 6th underscore.
I'm using a Bash script. I'd really appreciate help with this, as I'm completely lost trying to use sed (after reading for hours, I'm assuming this is the correct tool for the job?).

Here is a Bash only solution:
#!/bin/bash
INPUT="Release_2019_Config_V6_Standalone_PJ6678_Test"
IFS='_' read -ra IN <<< "$INPUT"
echo "${IN[5]}"
Or use cut:
cut -d '_' -f 6 <<< "Release_2019_Config_V6_Standalone_PJ6678_Test"
Or use awk:
awk -F "_" '{ print $6 }' <<< "Release_2019_Config_V6_Standalone_PJ6678_Test"

If you want a pure-Bash solution, you can tokenize the folder name and pick the element at index 5 (the sixth field, since array indices start at 0):
IFS=_ read -ra token <<< "$folder_name"
id=${token[5]}
This avoids both the dependency on, and the performance hit from, launching an additional program for each folder name.
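As a rough sketch, the read-based approach can be wrapped in a small function. The name nth_field and its argument order are made up for illustration; they do not come from the answer above. It relies on Bash arrays and here-strings, so it needs bash rather than plain sh.

```shell
#!/bin/bash
# nth_field is a hypothetical helper name, not a bash builtin.
# Usage: nth_field STRING N [DELIM]   (N is 1-based, DELIM defaults to "_")
nth_field() {
    local str=$1 n=$2 delim=${3:-_}
    local -a parts
    # IFS is changed only for this single read command.
    IFS=$delim read -ra parts <<< "$str"
    # Arrays are 0-based, so field N lives at index N-1.
    printf '%s\n' "${parts[n-1]}"
}

folder_name="Release_2019_Config_V6_Standalone_PJ6678_Test"
nth_field "$folder_name" 6    # prints PJ6678
```

Keeping the 1-based field number in the interface matches what cut -f and awk's $6 use, which makes it harder to mix up index 5 and field 6.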

Try this command:
echo "$a" | awk -F'_' '{print $6}'
Here, _ is the delimiter and $a is a variable that holds the value.

For completeness, here's a pure-shell solution that doesn't rely on bash extensions like arrays.
$ folder_name=Release_2019_Config_V6_Standalone_PJ6678_Test
$ tmp=${folder_name#*_*_*_*_*_} # Because we know how many _ to strip
$ echo ${tmp%_*}
PJ6678
Because the # operator strips the shortest matching prefix, each * ends up matching only text that contains no underscore: if a * did match an underscore, a shorter prefix could be found by letting one of the literal _ in the pattern match that underscore instead.
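To see the shortest-versus-longest behaviour side by side, and to avoid hard-coding the number of underscores, something like the following could be used. It is only a sketch: nth_field is a hypothetical helper name, and because POSIX sh has no local variables, the names str and n are overwritten in the calling shell.

```shell
folder_name=Release_2019_Config_V6_Standalone_PJ6678_Test

echo "${folder_name#*_*_*_*_*_}"     # shortest prefix removed: PJ6678_Test
echo "${folder_name##*_*_*_*_*_}"    # longest prefix removed:  Test

# nth_field STRING N : print the N-th (1-based) underscore-separated field.
nth_field() {
    str=$1 n=$2
    while [ "$n" -gt 1 ]; do
        case $str in
            *_*) str=${str#*_} ;;    # drop one leading field
            *)   str=; break ;;      # fewer than N fields: print an empty line
        esac
        n=$((n - 1))
    done
    printf '%s\n' "${str%%_*}"       # keep only the text before the next _
}

nth_field "$folder_name" 6           # PJ6678
```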

Related

Read file by line then process as different variable

I have created a text file with a list of file names like below
022694-39.tar
022694-39.tar.2017-05-30_13:56:33.OLD
022694-39.tar.2017-07-04_09:22:04.OLD
022739-06.tar
022867-28.tar
022867-28.tar.2018-07-18_11:59:19.OLD
022932-33.tar
I am trying to read the file line by line then strip anything after .tar with awk and use this to create a folder unless it exists.
Then the plan is to copy the original file to the new folder with the original full name stored in $LINE.
$QNAP= "Path to storage"
$LOG_DIR/$NOVA_TAR_LIST= "Path to text file containing file names"
while read -r LINE; do
CURNT_JOB_STRIPED="$LINE | `awk -F ".tar" '{print $1}'`"
if [ ! -d "$QNAP/$CURNT_JOB_STRIPED" ]
then
echo "Folder $QNAP/$CURNT_JOB_STRIPED doesn't exist."
#mkdir "$QNAP/$CURNT_JOB_STRIPED"
fi
done <"$LOG_DIR/$NOVA_TAR_LIST"
Unfortunately this seems to be trying to join all the file names together when trying to create the directories rather than doing them one by one and I get a
File name too long
output:
......951267-21\n951267-21\n961075-07\n961148-13\n961520-20\n971333-21\n981325-22\n981325-22\n981743-40\n999111-99\n999999-04g\n999999-44': File name too long
Apologies if this is trivial, bit of a rookie...
Try modifying your script as follows:
CURNT_JOB_STRIPED=$(echo "${LINE}" | awk -F ".tar" '{print $1}')
You have to use $(...) for command substitution. You also need to echo the value of LINE so that it is passed as input to the next command in the pipe; inside double quotes, "$LINE | ..." is just text and no pipe is created. Finally, remove the backticks around the awk expression (this is the legacy syntax for command substitution), since what you want is the result of the whole pipeline. In the original line, the backticked awk has nothing piped into it, so it reads the loop's standard input, which is the whole list file; that is why all the names end up joined together.
For further information, take a look over http://tldp.org/LDP/abs/html/commandsub.html
Alternatively, and purely as a curiosity (it is far less readable and no faster), you can replace the whole while loop with:
xargs -I{} bash -c 'mkdir -p "${2}/${1%.tar*}"' - '{}' "${QNAP}" < "${LOG_DIR}/${NOVA_TAR_LIST}"
The problem is with the CURNT_JOB_STRIPED="$LINE | `awk -F ".tar" '{print $1}'`" line.
The `command` form is legacy syntax; $(command) should be used instead.
The value of $LINE should be printed so that awk can receive it through a pipe.
If you run the whole thing in a command substitution ( $(command) ), you can assign its output to a variable: var=$(date)
It is safer to write variables as ${var}, so that surrounding text cannot change which variable name is read.
This should work:
CURNT_JOB_STRIPED=$(echo "${LINE}" | awk -F '.tar' '{print $1}')
With variable substitution this can be done more efficiently, and I believe it is also cleaner to read.
Variable substitution does not change the ${LINE} variable, so it can still be used later as the full, unchanged file name, while ${LINE%.tar*} removes the last .tar from the value and, because of the *, anything after it.
while read -r LINE; do
if [ ! -d "${QNAP}/${LINE%.tar*}" ]
then
echo "Folder ${QNAP}/${LINE%.tar*} doesn't exist."
#mkdir "${QNAP}/${LINE%.tar*}"
fi
done <"${LOG_DIR}/${NOVA_TAR_LIST}"
This way the directory name is not stored in a variable, and ${LINE} still holds the full file name. If you need the directory name in a variable, that is easy to do: var="${LINE%.tar*}"
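For reference, the loop above could be packaged as a function along these lines. This is a sketch under assumptions: make_job_dirs is a hypothetical name, its two arguments stand in for "$LOG_DIR/$NOVA_TAR_LIST" and "$QNAP" from the question, and the commented-out cp line uses a placeholder source path, since the question does not say where the original files live.

```shell
# make_job_dirs LIST_FILE DEST_ROOT
# Creates one directory per job under DEST_ROOT, named after the part of
# each listed file name that precedes ".tar".
make_job_dirs() {
    list_file=$1
    dest_root=$2
    while IFS= read -r line; do
        [ -n "$line" ] || continue           # skip blank lines
        job_dir=$dest_root/${line%.tar*}     # drop ".tar" and anything after it
        if [ ! -d "$job_dir" ]; then
            mkdir -p "$job_dir"
        fi
        # The copy step from the question would go here, for example
        # (placeholder source path):
        # cp "/path/to/source/$line" "$job_dir/"
    done < "$list_file"
}
```

It would be called as make_job_dirs "$LOG_DIR/$NOVA_TAR_LIST" "$QNAP". Several list entries that share a prefix, such as 022694-39.tar and its .OLD copies, all map to the same directory, which is why the existence check matters.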
Variable Substitution:
There are more forms; I only picked these four because they are similar and relevant here.
${var#pattern} - Use the value of var after removing the shortest text that matches pattern from the left
${var##pattern} - Same as above, but remove the longest matching piece instead of the shortest
${var%pattern} - Use the value of var after removing the shortest text that matches pattern from the right
${var%%pattern} - Same as above, but remove the longest matching piece instead of the shortest
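A quick way to compare the four forms is to apply them to one of the file names from the question. The variable name f is only for illustration.

```shell
f="022694-39.tar.2017-05-30_13:56:33.OLD"

echo "${f#*.}"     # shortest "*." from the left  -> tar.2017-05-30_13:56:33.OLD
echo "${f##*.}"    # longest  "*." from the left  -> OLD
echo "${f%.*}"     # shortest ".*" from the right -> 022694-39.tar.2017-05-30_13:56:33
echo "${f%%.*}"    # longest  ".*" from the right -> 022694-39
```

None of these modify f itself; each expansion only produces a new value.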

Bash read filename and return version number with awk

I am trying to use one or two lines of Bash (that can be run in a command line) to read a folder-name and return the version inside of the name.
So if I have myfolder_v1.0.13 I know that I can use echo "myfolder_v1.0.13" | awk -F"v" '{ print $2 }' and it will return with 1.0.13.
But how do I get the shell to read the folder name and pipe with the awk command to give me the same result without using echo? I suppose I could always navigate to the directory and translate the output of pwd into a variable somehow?
Thanks in advance.
Edit: As soon as I asked I figured it out. I can use
result=${PWD##*/}; echo $result | awk -F"v" '{ print $2 }'
and it gives me what I want. I will leave this question up for others to reference unless someone wants me to take it down.
But you don't need an Awk at all, here just use bash parameter expansion.
string="myfolder_v1.0.13"
printf "%s\n" "${string##*v}"
1.0.13
You can use
basename "$(cd "foldername" ; pwd )" | awk -Fv '{print $2}'
to get the shell to give you the directory name, but if you really want to use the shell, you could also avoid the use of awk completely:
Assuming you have the path to the folder with the version number in the parameter "FOLDERNAME":
echo "${FOLDERNAME##*v}"
This removes the longest prefix matching the glob expression "*v" in the value of the parameter FOLDERNAME.
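The choice of ## over # matters as soon as the path contains another "v". The paths below are hypothetical, made up to show the difference; anchoring the pattern on "_v" is a variation that assumes the version is always introduced by an underscore followed by "v".

```shell
# Hypothetical paths, for illustration only.
FOLDERNAME="/srv/dev/myfolder_v1.0.13"
echo "${FOLDERNAME#*v}"     # shortest prefix removed: /dev/myfolder_v1.0.13
echo "${FOLDERNAME##*v}"    # longest prefix removed:  1.0.13

# If a "v" can also appear after the version, "*v" removes too much.
OTHER="/srv/dev/myfolder_v1.0.13-dev"
echo "${OTHER##*v}"         # empty line: the name ends in "v"
echo "${OTHER##*_v}"        # 1.0.13-dev
```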

How do I separate a link to get the end of a URL in shell?

I have some data that looks like this
"thumbnailUrl": "http://placehold.it/150/adf4e1"
I want to know how I can get the trailing part of the URL, I want the output to be
adf4e1
I was trying to grep when starting with / and ending with " but I'm only a beginner in shell scripting and need some help.
I came up with a quick and dirty solution, using grep (with perl regex) and cut:
$ cat file
"thumbnailUrl": "http://placehold.it/150/adf4e1"
"anotherUrl": "http://stackoverflow.com/questions/3979680"
"thumbnailUrl": "http://facebook.com/12f"
"randortag": "http://google.com/this/is/how/we/roll/3fk19as1"
$ cat file | grep -o '/\w*"$' | cut -d'/' -f2- | cut -d'"' -f1
adf4e1
3979680
12f
3fk19as1
We could kill this with a thousand little cuts, or just one blow from Awk:
awk -F'[/"]' '{ print $(NF-1); }'
Test:
$ echo '"thumbnailUrl": "http://placehold.it/150/adf4e1"' \
| awk -F'[/"]' '{ print $(NF-1); }'
adf4e1
Filter through Awk using double quotes and slashes as field separators. This means that the trailing part ../adf4e1" is separated as {..}</>{adf4e1}<">{} where curly braces denote fields and angle brackets separators. The Awk variable NF gives the 1-based number of fields and so $NF is the last field. That's not the one we want, because it is blank; we want $(NF-1): the second last field.
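To make the field layout visible, the fields can be printed together with their indices. This is just a sketch; show_fields is a hypothetical helper name.

```shell
line='"thumbnailUrl": "http://placehold.it/150/adf4e1"'

# Print every field as index=[value], so empty fields are visible too.
show_fields() {
    awk -F'[/"]' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }'
}

printf '%s\n' "$line" | show_fields
```

For the sample line this prints nine fields: field 8 is adf4e1 and field 9 is the empty string after the closing double quote, which is why $(NF-1) is used rather than $NF.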
"Golfed" version:
awk -F[/\"] '$0=$(NF-1)'
If the original string is coming from a larger JSON object, use something like jq to extract the value you want.
For example:
$ jq -n '{thumbnail: "http://placehold.it/150/adf4e1"}' |
> jq -r '.thumbnail|split("/")[-1]'
adf4e1
(The first command just generates a valid JSON object representing the original source of your data; the second command parses it and extracts the desired value. The split function splits the URL into an array, from which you only care about the last element.)
You can also do this purely in bash using string replacement and substring removal if you wrap your string in single quotes and assign it to a variable.
#!/bin/bash
string='"thumbnailUrl": "http://placehold.it/150/adf4e1"'
string="${string//\"}"
echo "${string##*/}"
adf4e1 #output
You can do that using the 'cut' command in Linux. Cut it using '/' and keep the last cut. Try it, it's fun!
Refer http://www.thegeekstuff.com/2013/06/cut-command-examples

Shell cut delimiter before last

I'm trying to cut a string (the name of a file) to get a variable that is part of the name.
I also have to put it in a shell variable, which is fine so far.
Here is an example of what I have to do.
NAME_OF_THE_FILE_VARIABLEiWANTtoGET_DATE
NAMEfile_VARIABLEiWANT_DATE
NAME_FILE_VARIABLEiWANT_DATE
The position of the variable I want can change, but it will always be the one before last. The delimiter is "_".
Is there a way to count the size of the array to get size-1 or something like that?
Note: when I cut strings I always use something like this:
VARIABLEiWANT=`echo "$FILENAME" | cut -f 1 -d "_"`
awk -F'_' '{print $(NF-1)}' file
or, if you have a string:
awk -F'_' '{print $(NF-1)}' <<< "$FILENAME"
Save the output of the above one-liner into your variable.
IFS=_ read -ra array <<< "$FILENAME"
variable_i_want=${array[${#array[@]}-2]}
It's a bit of a mess visually, but it's more efficient than starting a new process. ${#array[@]} is the number of elements read from FILENAME, so the indices for the array range from 0 to ${#array[@]}-1.
As of bash 4.3, though, you can use a negative index instead of computing it.
variable_i_want=${array[-2]}
If you need POSIX compatibility (no arrays), then
tmp=${FILENAME%_${FILENAME##*_}} # FILENAME with last field removed
variable_i_want=${tmp##*_} # last field of tmp
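As a sketch, the POSIX variant can be wrapped in a function and run against the three sample names from the question. second_last_field is a hypothetical name, and the function uses ${name%_*} to drop the last field, a slightly simpler spelling than the one above that gives the same result for these inputs.

```shell
# second_last_field NAME : print the underscore-separated field before the last.
second_last_field() {
    name=$1
    tmp=${name%_*}                # drop the last field and its underscore
    printf '%s\n' "${tmp##*_}"    # keep what follows the last remaining underscore
}

for f in NAME_OF_THE_FILE_VARIABLEiWANTtoGET_DATE \
         NAMEfile_VARIABLEiWANT_DATE \
         NAME_FILE_VARIABLEiWANT_DATE; do
    second_last_field "$f"
done
```

This prints VARIABLEiWANTtoGET, VARIABLEiWANT and VARIABLEiWANT. To store the result, use command substitution, for example variable_i_want=$(second_last_field "$FILENAME").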
Just got it... I found someone using this approach with cat, and I got it working with echo and rev. I didn't understand the rev part at first, but I think it reverses each line, so the fields come out in the opposite order.
CODIGO=`echo "$ARQ_NAME" | rev | cut -d "_" -f 2 | rev `
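Running the pipeline one stage at a time shows what rev contributes. The sample value is one of the names from the question; this assumes the rev utility is installed.

```shell
ARQ_NAME="NAME_FILE_VARIABLEiWANT_DATE"

echo "$ARQ_NAME" | rev                            # ETAD_TNAWiELBAIRAV_ELIF_EMAN
echo "$ARQ_NAME" | rev | cut -d "_" -f 2          # TNAWiELBAIRAV
echo "$ARQ_NAME" | rev | cut -d "_" -f 2 | rev    # VARIABLEiWANT
```

The first rev reverses the characters of the line, so the field before last becomes field 2 for cut; the second rev puts the characters of that field back in their original order.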

Delete everything before last / in a file path

I have many file paths in a file that look like so:
/home/rtz11/files/testfiles/547/prob547455_01
I want to use a bash script that will print all the filenames to the screen, basically whatever comes after the last /. I don't want to assume that it would always be the same length because it might not be.
Would there be a way to delete everything before the last /? Maybe a sed command?
Using sed for this is vast overkill -- bash has extensive string manipulation built in, and using this built-in support is far more efficient when operating on only a single line.
s=/home/rtz11/files/testfiles/547/prob547455_01
basename="${s##*/}"
echo "$basename"
This will remove everything from the beginning of the string greedily matching */. See the bash-hackers wiki entry for parameter expansion.
If you only want to remove everything prior to the last /, but not including it (a literal reading of your question, but also a generally less useful operation), you might instead want if [[ $s = */* ]]; then echo "/${s##*/}"; else echo "$s"; fi.
awk '{print $NF}' FS=/ input-file
The 'print $NF' directs awk to print the last field of each line, and assigning FS=/ makes forward slash the field delimiter. In sed, you could do:
sed 's#.*/##' input-file
which simply deletes everything up to and including the last /.
Meandering, but I use it simply because I can remember the syntax:
cat file | rev | cut -d/ -f1 | rev
Many ways to skin a 'cat'. Ouch.
One more way:
Use the basename executable (command):
basename /path/with/many/slashes/and/file.extension
>file.extension
basename /path/with/many/slashes/and/file.extension .extension
OR
basename -s .extension /path/with/many/slashes/and/file.extension
> file
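One difference between basename and the ${s##*/} expansion shows up when a path ends in a slash. The comparison below is a sketch using the path from the question; the variable names s and d are only for illustration.

```shell
s=/home/rtz11/files/testfiles/547/prob547455_01
echo "${s##*/}"    # prob547455_01
basename "$s"      # prob547455_01

d=/home/rtz11/files/testfiles/547/
echo "${d##*/}"    # empty line: nothing follows the last /
basename "$d"      # 547, because basename ignores trailing slashes

# Removing one trailing slash first makes the expansion agree with basename.
d=${d%/}
echo "${d##*/}"    # 547
```

For a file that lists one plain file path per line, as in the question, the two behave the same; the difference only matters for directory paths written with a trailing slash.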

Resources