Need to split 1st string before delimiter which is comma(,) - shell

I need to extract the first comma-separated field from a string.
For example:
A="ABC:20.10.0-5,DEF:21.10.0-9,XYZ:20.10.0-9"
I need everything before the first comma (,), so the result should be:
B="ABC:20.10.0-5"
After that, I need to extract the version number after the colon (:) and before the dash (-). The final value should be:
C="20.10.0"

This can be done with plain shell parameter expansion:
A="ABC:20.10.0-5,DEF:21.10.0-9,XYZ:20.10.0-9"
B="${A%%,*}" # Remove everything after the first comma and the comma itself
nodash="${B%%-*}" # Remove everything after the dash and the dash itself
C="${nodash##*:}" # Remove everything before the colon and the colon itself
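If the version of every entry is needed rather than only the first, the same expansions can be applied in a loop. The following is a minimal sketch, assuming bash (it relies on arrays and `read -a`); the names `entries` and `versions` are illustrative and do not come from the question.

```shell
#!/bin/bash
A="ABC:20.10.0-5,DEF:21.10.0-9,XYZ:20.10.0-9"

# Split A on commas into a bash array (read -a is bash-specific).
IFS=, read -r -a entries <<< "$A"

versions=()
for entry in "${entries[@]}"; do
  nodash="${entry%%-*}"          # drop the first dash and everything after it
  versions+=("${nodash##*:}")    # drop everything up to and including the last colon
done

# Print one version per line.
printf '%s\n' "${versions[@]}"
```

The first element, `${versions[0]}`, holds the same value as C above.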

You can refer to the code below. I wasn't sure exactly where you are running it, as there is only one tag, or whether you actually want the double quotes printed in the output, so I added them as well. I am assuming you are familiar with the cut command.
-bash-4.2$ cat test2.sh
#!/bin/bash
A="ABC:20.10.0-5,DEF:21.10.0-9,XYZ:20.10.0-9"
echo "A=\"$A\""
B=$(echo "$A" | cut -d"," -f1)
echo "B=\"$B\""
C=$(echo "$B" | cut -d":" -f2 | cut -d"-" -f1)
echo "C=\"$C\""
-bash-4.2$ ./test2.sh
A="ABC:20.10.0-5,DEF:21.10.0-9,XYZ:20.10.0-9"
B="ABC:20.10.0-5"
C="20.10.0"

Related

How to loop comma separated values in shell script

I tried to loop over comma-separated values, but I cannot get the exact values because one of them contains a space.
I tried several approaches, but I am not able to get the desired result.
Can anyone help me with this?
#!/bin/ksh
values="('A','sample text','Mark')"
for i in `echo $values | sed 's/[)(]//g' | sed 's/,/ /g'`
do
echo $i
done
My expected output is:
A
sample text
Mark
First, change values to an array. Then iterating over it is a simple matter.
values=(A "sample text" Mark)
for i in "${values[@]}"; do
echo "$i"
done
This is the same as Chepner's answer, only kludgier (it uses variable substitution) and more dangerous (it uses eval), in order to keep the OP's exact $values assignment:
values="('A','sample text','Mark')"
eval values=${values//,/ }
for i in "${values[@]}"; do
echo "$i"
done
It works in ksh, but if at all possible, use Chepner's simpler and safer $values assignment.
Simply trim the quotes
#!/bin/ksh
values="('A','sample text','Mark')"
echo "$values" | tr -d "()'\"" | tr ',' '\n'
output:
A
sample text
Mark
You should use the single quotes to split the string (and quote "$values").
If your sed supports \n in the replacement text (GNU sed does), you can do without a loop:
echo "${values}" | sed "s/[)(]//g;s/','/\n/g;s/'//g"
# or
sed "s/[)(]//g;s/','/\n/g;s/'//g" <<< "${values}"
If the values in your string contain no commas or parentheses, you can use
grep -Eo "[^',()]*" <<< "${values}"
A better approach is to look for the fields between two single quotes and then remove those quotes.
grep -Eo "'[^']*'" <<< "${values}" | tr -d "'"
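For completeness, the original string can also be turned into an array without eval and without external tools. This is only a sketch: it assumes bash rather than the ksh shebang from the question, it assumes every value is wrapped in single quotes and never contains the sequence `','` itself, and the names `inner`, `sep`, `nl`, `lines` and `items` are made up for the example.

```shell
#!/bin/bash
values="('A','sample text','Mark')"

inner=${values#"('"}     # strip the leading ('
inner=${inner%"')"}      # strip the trailing ')

sep="','"                # separator between two quoted values
nl=$'\n'
lines=${inner//"$sep"/$nl}   # one value per line

# Collect the lines into an array so values with spaces stay intact.
items=()
while IFS= read -r item; do
  items+=("$item")
done <<< "$lines"

for i in "${items[@]}"; do
  printf '%s\n' "$i"
done
```

Because the loop iterates over array elements and every expansion is quoted, `sample text` is printed as a single line.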

Bash split word with same characters

How can I split a string that contains the same delimiter character more than once?
For example, name=John:adress=London. I need name as the variable and John:adress=London as the value.
I have no idea how to do this. Thanks.
You can use cut.
# print first field
echo "name=John:#(ADDRESS=(LONDON=(STREET=XY)))" | cut -d = -f 1
# print remaining fields
echo "name=John:#(ADDRESS=(LONDON=(STREET=XY)))" | cut -d = -f 2-
You can use cut with command substitution:
INPUT='name=#(ADDRESS=(LONDON=(STREET=XY)))'
NAME=$(echo "$INPUT" | cut -d '=' -f 1)
INFO=$(echo "$INPUT" | cut -d '=' -f 2-)
The single quotes in the first line make the shell treat every special character in the string literally. The variable NAME is assigned the output of a command substitution, written $(). $INPUT is echoed into cut, where the -d flag sets the delimiter to = and the -f flag selects the first field.
Next, INFO is assigned the output of a second command substitution that selects the second field through the end of the line. The dash after the 2 in -f 2- tells cut to select everything after the first = sign, up to the end of the line.
The first equals sign is not included in INFO.
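The same split can also be done with parameter expansion alone, with no call to cut. A minimal sketch, using the example string from the question; the variable names `input`, `name` and `value` are chosen here for illustration.

```shell
#!/bin/bash
input='name=John:adress=London'

name="${input%%=*}"    # remove the first = and everything after it
value="${input#*=}"    # remove everything up to and including the first =

printf 'name:  %s\nvalue: %s\n' "$name" "$value"
```

Only the first = acts as the split point, so the later = inside the value is left untouched.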

bash assign variable to another after operation

I'm trying to print domain and topLeveldomain variables (example.com)
$line = example.com
domain =$line | cut -d. -f 1
topLeveldomain = $line | cut -d. -f 2
However, when I try to echo $domain, it doesn't display the desired value. Instead I get:
test.sh: line 4: domain: command not found
test.sh: line 5: topLeveldomain: command not found
I suggest:
line="example.com"
domain=$(echo "$line" | cut -d. -f 1)
topLeveldomain=$(echo "$line" | cut -d. -f 2)
Note the correct syntax for an assignment in bash:
variable=value
(no blanks are allowed around the =)
To use the content of a variable, prefix its name with $, e.g.
echo $variable
You don't need external tools for this; you can do it in bash itself:
$ string="example.com"
# print everything up to the first delimiter '.'
$ printf '%s\n' "${string%%.*}"
example
# print everything after the first delimiter '.'
$ printf '%s\n' "${string#*.}"
com
Remove spaces around =:
line=example.com # YES
line = example.com # NO
When you create a variable, do not prepend $ to the variable name:
line=example.com # YES
$line=example.com # NO
When using pipes, you need to pass standard output to the next command. That means you usually need to echo variables or cat files:
echo $line | cut -d. -f1 # YES
$line | cut -d. -f1 # NO
Use the $() syntax to get the output of a command into a variable:
new_variable=$(echo $line | cut -d. -f1) # YES
new_variable=echo $line | cut -d. -f1 # NO
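Putting these rules together, both variables can also be filled in one step by letting read split the line on the dot. This is a sketch that assumes bash (for the here-string) and reuses the variable names from the question.

```shell
#!/bin/bash
line="example.com"

# IFS=. applies only to this read; the line is split on dots.
IFS=. read -r domain topLeveldomain <<< "$line"

echo "$domain"
echo "$topLeveldomain"
```

If the line has more than one dot, the last variable receives the whole remainder, so this form suits simple two-part names best.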
I would rather use AWK:
domain="abc.def.hij.example.com"
awk -F. '{printf "TLD:%s\n2:%s\n3:%s\n", $NF, $(NF-1), $(NF-2)}' <<< "$domain"
Output
TLD:com
2:example
3:hij
In the command above, the -F option specifies the field separator; NF is a built-in variable that holds the number of fields in the current record.
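For comparison, the last few labels can also be peeled off from the right with parameter expansion. This is a rough shell counterpart of the awk command, not a replacement for it; the names `tld`, `second`, `third` and `rest` are invented for the sketch.

```shell
#!/bin/bash
domain="abc.def.hij.example.com"

tld="${domain##*.}"      # text after the last dot
rest="${domain%.*}"      # drop the last dot and the label after it
second="${rest##*.}"     # last label of what remains
rest="${rest%.*}"
third="${rest##*.}"

printf 'TLD:%s\n2:%s\n3:%s\n' "$tld" "$second" "$third"
```

Each `%.*` step removes one label from the right, which plays the role of stepping from $NF to $(NF-1) in awk.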
Issues with Your Code
The issues with your code are due to invalid syntax.
To set a variable in the shell, use
VARNAME="value"
Putting spaces around the equal sign will cause errors. It is a good habit to quote content strings when assigning values to variables: this will reduce the chance that you make errors.
Refer to the Bash Guide for Beginners.
this also works:
line="example.com"
domain=$(echo $line | cut -d. -f1)
toplevel=$(cut -d. -f2 <<<$line)
echo "domain name=" $domain
echo "Top Level=" $toplevel
You need to remove the $ from line in the first assignment, remove the spaces around =, and echo $line in order to pipe its value to cut. Alternatively, feed $line to cut with a here-string.

cut a string after a specified pattern (comma)

I want to take the part of a string after the first comma and assign it to a variable.
my_string="a,b,c,d,e,f"
Output expected:
output="b,c,d,e,f"
When I use the command
output=`echo $my_string | cut -d ',' -f2`
I get only b as output.
Adding a dash '-' to the end of your -f2 will output the remainder of the string.
$ echo "a,b,c,d,e,f,g"|cut -d, -f2-
b,c,d,e,f,g
With parameter expansion instead of cut:
$ my_string="a,b,c,d,e,f"
$ output="${my_string#*,}"
$ echo "$output"
b,c,d,e,f
${my_string#*,} stands for "remove everything up to and including the first comma from my_string" (see the Bash manual).
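The related operators differ only in which end they trim and whether they match the shortest or the longest pattern. A short sketch with the same string; the variable names on the left are illustrative.

```shell
#!/bin/bash
my_string="a,b,c,d,e,f"

after_first="${my_string#*,}"     # shortest match from the front
after_last="${my_string##*,}"     # longest match from the front
before_last="${my_string%,*}"     # shortest match from the back
before_first="${my_string%%,*}"   # longest match from the back

printf '%s\n' "$after_first" "$after_last" "$before_last" "$before_first"
```

Here `after_first` is the value the question asks for; the other three are shown only for contrast.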
You must add the minus sign (-) after the position you are looking for.
a=`echo $my_string|cut -d "," -f 2-`
echo $a
b,c,d,e,f

How to process large csv files efficiently using shell script, to get better performance than that for following script?

I have a large csv file, input_file.dat, with 5 columns. I want to do two things to the second column:
(1) Remove the last character
(2) Wrap the value in single quotes
The following are sample rows from input_file.dat:
420374,2014-04-06T18:44:58.314Z,214537888,12462,1
420374,2014-04-06T18:44:58.325Z,214537850,10471,1
281626,2014-04-06T09:40:13.032Z,214535653,1883,1
Sample output would look like :
420374,'2014-04-06T18:44:58.314',214537888,12462,1
420374,'2014-04-06T18:44:58.325',214537850,10471,1
281626,'2014-04-06T09:40:13.032',214535653,1883,1
I wrote the following code to do this.
#!/bin/sh
inputfilename=input_file.dat
outputfilename=output_file.dat
count=1
while read line
do
echo $count
count=$((count + 1))
v1=$(echo $line | cut -d ',' -f1)
v2=$(echo $line | cut -d ',' -f2)
v3=$(echo $line | cut -d ',' -f3)
v4=$(echo $line | cut -d ',' -f4)
v5=$(echo $line | cut -d ',' -f5)
v2len=${#v2}
v2len=$((v2len -1))
newv2=${v2:0:$v2len}
newv2="'$newv2'"
row=$v1,$newv2,$v3,$v4,$v5
echo $row >> $outputfilename
done < $inputfilename
But it takes a lot of time.
Is there a more efficient way to achieve this?
You can do this with awk
awk -v q="'" 'BEGIN{FS=OFS=","} {$2=q substr($2,1,length($2)-1) q}1' input_file.dat
How it works:
BEGIN{FS=OFS=","} : set input and output field separator (FS, OFS) to ,.
-v q="'" : assign a literal single quote to the variable q (to avoid complex escaping in the awk expression)
{$2=q substr($2,1,length($2)-1) q} : Replace the second field ($2) with a single quote (q) followed by the value of the 2nd field without the last character (substr(string, start, length)) and appending a literal single quote (q) at the end.
1 : Just invoke the default action, which is print the current (edited) line.
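If the job has to stay in the shell, most of the cost of the original loop can be avoided by letting read split the fields instead of starting five cut processes per line. This is a sketch under the assumption that every row has exactly five comma-separated fields; `process_rows` is a hypothetical helper name, and awk will typically still be faster on large files.

```shell
#!/bin/bash
# Read five comma-separated fields per line without spawning cut.
process_rows() {
  while IFS=, read -r v1 v2 v3 v4 v5; do
    # ${v2%?} drops the last character of the second field.
    printf "%s,'%s',%s,%s,%s\n" "$v1" "${v2%?}" "$v3" "$v4" "$v5"
  done
}

# Try it on one sample row from the question.
result=$(printf '%s\n' '420374,2014-04-06T18:44:58.314Z,214537888,12462,1' | process_rows)
echo "$result"
```

For a real file, the function would be called as `process_rows < input_file.dat > output_file.dat`, which also opens the output file once rather than on every row.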

Resources