Bash - how to convert lines into a single line of quoted values - bash

Given a list of values returned by a previous command:
ABC-55080
ABC-55060
ABC-55040
ABC-55035
ABC-55030
ABC-55025
ABC-55020
I want to get a single-lined list of quoted values:
("ABC-55060", "ABC-55040", "ABC-55035", "ABC-55030", "ABC-55025", "ABC-55020")
I've tried to do this using awk:
cat input_list.csv | awk '{print}' ORS='", "'
But what I get is the list without the opening and closing quotes:
ABC-55060", "ABC-55040", "ABC-55035", "ABC-55030", "ABC-55025", "ABC-55020
How can I achieve this?

$ awk '{o=o s "\""$0"\""; s=", "} END{print "(" o ")"}' file
("ABC-55080", "ABC-55060", "ABC-55040", "ABC-55035", "ABC-55030", "ABC-55025", "ABC-55020")

Please try the following, written and tested with the shown samples in GNU awk. A tested copy is available at this link:
https://ideone.com/zO1eYf
awk '
BEGIN{
s1="\""
OFS=", "
}
FNR>1{
val=(val?val OFS:"")s1 $0 s1
}
END{
print "(" val ")"
}
' Input_file
Explanation: The BEGIN section sets the variable s1 to a double quote and the output field separator OFS to a comma followed by a space. The main block runs for every line after the first (FNR>1, so the first line, ABC-55080, is skipped, as in the question's expected output) and appends that line, wrapped in s1, to the variable val, inserting OFS between values. The END block prints val enclosed in ( and ).
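The accumulate-with-a-separator idea from the explanation can also be sketched in plain shell, without awk. The function name quote_join is made up for illustration, and unlike the FNR>1 variant this sketch keeps every input line, including the first:

```shell
# quote_join: hypothetical helper, not taken from the answers above.
# Reads lines from stdin and prints them as ("a", "b", ...).
quote_join() {
  local line out="" sep=""
  while IFS= read -r line; do
    out="${out}${sep}\"${line}\""   # wrap each value in double quotes
    sep=", "                        # separator is empty only before the first value
  done
  printf '(%s)\n' "$out"
}

printf '%s\n' ABC-55080 ABC-55060 ABC-55040 | quote_join
```

For the three sample values this prints ("ABC-55080", "ABC-55060", "ABC-55040").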

If you want to use sed:
sed 's/.*/"&",/;1s/^/(/;$s/,$/)/' input_list.csv | tr -d "\n"
If the input file is only an example and you actually want to process the output of a stream, you can use:
cat input_list.csv | xargs -I"{}" printf '"%s", ' "{}" | sed 's/^/(/; s/, $/)\n/'

If ed is available/acceptable.
#!/bin/sh
ed -s input_list.csv <<-'EOF'
g/./s/^/"/\
s/$/", /
1,$j
s/^/(/
s/,[[:space:]]$/)/
,p
Q
EOF
Change Q to w if you want to edit the file in place.

Related

How to find content in a file and replace the adjacent value

Using bash, how do I find a string and update the string next to it? For example, given the value
my.site.com|test2.spin:80
proxy_pass.map
my.site2.com test2.spin:80
my.site.com test.spin:8080;
Expected output is to update proxy_pass.map with
my.site2.com test2.spin:80
my.site.com test2.spin:80;
I tried using awk
awk '{gsub(/^my\.site\.com\s+[A-Za-z0-9]+\.spin:8080;$/,"my.site2.comtest2.spin:80"); print}' proxy_pass.map
but it does not seem to work. Is there a better way to approach the problem?
One awk idea, assuming spacing needs to be maintained:
awk -v rep='my.site.com|test2.spin:80' '
BEGIN { split(rep,a,"|") # split "rep" variable and store in
site[a[1]]=a[2] # associative array
}
$1 in site { line=$0 # if 1st field is in site[] array then make copy of current line
match(line,$1) # find where 1st field starts (in case 1st field does not start in column #1)
newline=substr(line,1,RSTART+RLENGTH-1) # save current line up through matching 1st field
line=substr(line,RSTART+RLENGTH) # strip off 1st field
match(line,/[^[:space:];]+/) # look for string that does not contain spaces or ";" and perform replacement, making sure to save everything after the match (";" in this case)
newline=newline substr(line,1,RSTART-1) site[$1] substr(line,RSTART+RLENGTH)
$0=newline # replace current line with newline
}
1 # print current line
' proxy_pass.map
If the input looks like:
$ cat proxy_pass.map
my.site2.com test2.spin:80
my.site.com test.spin:8080;
This awk script generates:
my.site2.com test2.spin:80
my.site.com test2.spin:80;
NOTES:
if multiple replacements need to be performed I'd suggest placing them in a file and having awk process said file first
the 2nd match() is hardcoded based on OP's example; depending on actual file contents it may be necessary to expand on the regex used in the 2nd match()
once satisfied with the result, the original input file can be updated in a couple of ways: a) if using GNU awk, then awk -i inplace -v rep.... or b) save the result to a temp file and then mv the temp file to proxy_pass.map
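For the first note, here is a minimal sketch of reading several replacements from a file before processing the map. The function name, the replacements file and its site|new_target line format are assumptions for illustration. This version rebuilds the changed line, so the original spacing is not preserved, and it keeps a trailing ; only if the old target had one:

```shell
# replace_targets: hypothetical helper.
#   $1 = replacements file, one "site|new_target" pair per line (assumed format)
#   $2 = map file with "site target" lines, e.g. proxy_pass.map
replace_targets() {
  awk '
    NR == FNR { site[$1] = $2; next }   # 1st file: load site -> new target
    $1 in site {
      tail = ($2 ~ /;$/) ? ";" : ""     # remember a trailing ";" if present
      $2 = site[$1] tail                # rebuilds $0 with single spaces
    }
    { print }
  ' FS='|' "$1" FS=' ' "$2"
}
```

It would be called as replace_targets replacements.txt proxy_pass.map. The NR == FNR test only works as intended when the replacements file is not empty.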
If the number of spaces between the columns is not significant, a simple
proxyf=proxy_pass.map
tmpf=$$.txt
awk '$1 == "my.site.com" { $2 = "test2.spin:80;" } {print}' <$proxyf >$tmpf && mv $tmpf $proxyf
should do. If you need the columns to be lined up nicely, you can replace the print by a suitable printf .... statement.
With your shown samples and attempts, please try the following awk code. A shell variable named var holds the value my.site.com|test2.spin:80 and is passed to the awk program as the awk variable var1.
In the BEGIN section, split() breaks var1 into the array arr using | as the separator; num is the number of elements it returns. A for loop then steps through arr two elements at a time and builds the array arr2, using element i as the key and element i+1 as its value.
In the main block, if $1 is a key in arr2 the program prints $1 followed by the value stored in arr2; otherwise it prints $1 followed by the original $2.
##Shell variable named var is being created here...
var="my.site.com|test2.spin:80"
awk -v var1="$var" '
BEGIN{
num=split(var1,arr,"|")
for(i=1;i<=num;i+=2){
arr2[arr[i]]=arr[i+1]
}
}
{
print $1,(($1 in arr2)?arr2[$1]:$2)
}
' Input_file
Or, if you want to preserve the spacing between the 1st and 2nd fields, try the following slight tweak of the above code. Written and tested with your shown samples only.
awk -v var1="$var" '
BEGIN{
num=split(var1,arr,"|")
for(i=1;i<=num;i+=2){
arr2[arr[i]]=arr[i+1]
}
}
{
match($0,/[[:space:]]+/)
print $1 substr($0,RSTART,RLENGTH) (($1 in arr2)?arr2[$1]:$2)
}
' Input_file
NOTE: This program can take multiple values separated by | in the shell variable passed to awk, but it assumes they are in the format key|value|key|value... only.
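To see how a key|value|key|value string is paired up, the BEGIN block can be run on its own. This is only a sketch: the function name show_pairs and the second pair (my.site2.com|test3.spin:81) are made up for illustration.

```shell
# show_pairs: hypothetical helper that prints how a key|value|key|value
# string is split into pairs, mirroring the BEGIN block above.
show_pairs() {
  awk -v var1="$1" '
    BEGIN {
      num = split(var1, arr, "|")
      for (i = 1; i < num; i += 2)        # stop before an unpaired last key
        printf "%s -> %s\n", arr[i], arr[i + 1]
    }
  '
}

show_pairs 'my.site.com|test2.spin:80|my.site2.com|test3.spin:81'
```

This prints one "key -> value" line per pair, in the order given.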
#!/bin/sh -x
f1=$(echo "my.site.com|test2.spin:80" | cut -d'|' -f1)
f2=$(echo "my.site.com|test2.spin:80" | cut -d'|' -f2)
echo "${f1}%${f2};" >> proxy_pass.map
tr '%' '\t' < proxy_pass.map >> p1
cat > ed1 <<EOF
$
-1
d
wq
EOF
ed -s p1 < ed1
mv -v p1 proxy_pass.map
rm -v ed1
This might work for you (GNU sed):
<<<'my.site.com|test2.spin:80' sed -E 's#\.#\\.#g;s#^(\S+)\|(\S+)#/^\1\\b/s/\\S+/\2/2#' |
sed -Ef - file
Build a sed script from the input arguments and apply it to the input file.
The input arguments are first prepared so that their metacharacters (in this case the .'s) are escaped.
Then the first argument is used to prepare a match command and the second is used as the value to be replaced in a substitution command.
The result is piped into a second sed invocation that takes the sed script and applies it to the input file.

How do I join lines using space and comma

I have the file that contains content like:
IP
111
22
25
I want to print the output in the format IP 111,22,25.
I have tried tr ' ' , but it's not working.
Welcome to paste
$ paste -sd, file | sed 's/,/ /'
IP 111,22,25
Normally, paste writes to standard output lines consisting of the sequentially corresponding lines of each given file, separated by a <tab> character. The -s option changes this: it joins all lines of each file in sequence, again with a <tab> as the delimiter. The -d flag supplies a list of delimiters to use instead of the <tab>. That list is reused cyclically, so -d " ," would alternate between space and comma and give IP 111,22 25 for this input. The command above therefore joins everything with commas only and lets sed turn the first comma into a space.
In pure Bash:
# Read file into array
mapfile -t lines < infile
# Print to string, comma-separated from second element on
printf -v str '%s %s' "${lines[0]}" "$(IFS=,; echo "${lines[*]:1}")"
# Print
echo "$str"
Output:
IP 111,22,25
I'd go with:
{ read a; read b; read c; read d; } < file
echo "$a $b,$c,$d"
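The read-based approach above assumes exactly four lines. A sketch that reads only the first line with read and leaves the rest to paste handles any number of lines; the function name join_header is an assumption for illustration:

```shell
# join_header: hypothetical helper.
# Reads lines from stdin; prints the first line, a space,
# then all remaining lines joined with commas.
join_header() {
  local first
  IFS= read -r first || return 1   # nothing to do on empty input
  printf '%s %s\n' "$first" "$(paste -sd, -)"
}

printf '%s\n' IP 111 22 25 | join_header
```

For the sample input this prints IP 111,22,25.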
This will also work:
xargs printf "%s %s,%s,%s" < file
Try cat file.txt | tr '\n' ',' | sed "s/IP,/IP /g"
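tr replaces newlines with commas, and sed changes IP,111,22,25 into IP 111,22,25. If the file ends with a newline, a trailing comma remains at the end of the output.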
The following awk script will do the requested:
awk 'BEGIN{OFS=","} FNR==1{first=$0;next} {val=val?val OFS $0:$0} END{print first FS val}' Input_file
Explanation: Adding explanation for above code now.
awk ' ##Starting awk program here.
BEGIN{ ##Starting BEGIN section here of awk program.
OFS="," ##Setting OFS as comma, output field separator.
} ##Closing BEGIN section of awk here.
FNR==1{ ##Checking if line is first line then do following.
first=$0 ##Creating variable first whose value is current first line.
next ##next keyword is awk out of the box keyword which skips all further statements from here.
} ##Closing FNR==1 BLOCK here.
{ ##This BLOCK will be executed for all lines apart from 1st line.
val=val?val OFS $0:$0 ##Creating variable val whose values will be keep concatenating its own value.
}
END{ ##Mentioning awk END block here.
print first FS val ##Printing variable first FS(field separator) and variable val value here.
}' Input_file ##Mentioning Input_file name here which is getting processed by awk.
Using Perl
$ cat captain.txt
IP
111
22
25
$ perl -0777 -ne ' @k=split(/\s+/); print $k[0]," ",join(",",@k[1..$#k]) ' captain.txt
IP 111,22,25
$

print first 3 characters and / rest of the string with stars

I have input like this:
John:boofoo
I want to keep only the first 3 characters of the string after the colon and print the rest of it as stars.
The output should look like this:
John:boo***
This is my command:
awk -F ":" '{print $1,$2 ":***"}'
I want to use only print command if possible. Thanks
With GNU sed:
echo 'John:boofoo' | sed -E 's/(:...).*/\1***/'
Output:
John:boo***
With GNU awk for gensub():
$ awk 'BEGIN{FS=OFS=":"} {print $1, substr($2,1,3) gensub(/./,"*","g",substr($2,4))}' file
John:boo***
With any awk:
awk 'BEGIN{FS=OFS=":"} {tl=substr($2,4); gsub(/./,"*",tl); print $1, substr($2,1,3) tl}' file
John:boo***
Please try the following. It keeps the first 3 characters of the 2nd field as they are and prints one star for each remaining character in that field.
awk '
BEGIN{
FS=OFS=":"
}
{
stars=""
val=substr($2,1,3)
for(i=4;i<=length($2);i++){
stars=stars"*"
}
$2=val stars
}
1
' Input_file
Output will be as follows.
John:boo***
Explanation: Adding explanation for above code too here.
awk '
BEGIN{ ##Starting BEGIN section from here.
FS=OFS=":" ##Setting FS and OFS value as : here.
} ##Closing block of BEGIN section here.
{ ##Here starts main block of awk program.
stars="" ##Nullifying variable stars here.
val=substr($2,1,3) ##Creating variable val whose value is 1st 3 letters of 2nd field.
for(i=4;i<=length($2);i++){ ##Starting a for loop from 4 (because we need the 4th character through the last one in the 2nd field) till length of 2nd field.
stars=stars"*" ##Keep concatenating stars variable to its own value with *.
}
$2=val stars ##Assigning value of variable val and stars to 2nd field here.
}
1 ##Mentioning 1 here to print edited/non-edited lines for Input_file here.
' Input_file ##Mentioning Input_file name here.
Or even with good old sed
$ echo "John:boofoo" | sed 's/...$/***/'
Output:
John:boo***
(note: this just replaces the last 3 characters of any string with "***", so if you need to key off the ':', see the GNU sed answer from Cyrus.)
Another awk variant:
awk -F ":" '{print $1 FS substr($2, 1, 3) "***"}' <<< 'John:boofoo'
John:boo***
Since we have the tags awk, bash and sed: for completeness sake here is a bash only solution:
INPUT="John:boofoo"
printf "%s:%s\n" ${INPUT%%:*} $(TMP1=${INPUT#*:};TMP2=${TMP1:3}; echo "${TMP1:0:3}${TMP2//?/*}")
It uses two arguments to printf after the format string. The first one is INPUT stripped of everything from the first : onward. Let's break down the second argument $(TMP1=${INPUT#*:};TMP2=${TMP1:3}; echo "${TMP1:0:3}${TMP2//?/*}"):
$(...) the string is interpreted as a bash command; its output is substituted as the last argument to printf
TMP1=${INPUT#*:}; remove everything up to and including the :, store the string in TMP1.
TMP2=${TMP1:3}; get all characters of TMP1 from offset 3 to the end and store them in TMP2.
echo "${TMP1:0:3}${TMP2//?/*}" output the temporary strings: the first three chars from TMP1 unmodified and all chars from TMP2 as *
the output of the last echo is the last argument to printf
Here is the bash -x output:
+ INPUT=John:boofoo
++ TMP1=boofoo
++ TMP2=foo
++ echo 'boo***'
+ printf '%s:%s\n' John 'boo***'
John:boo***
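The same parameter expansions can be wrapped in a small function, which avoids the command substitution and the unquoted arguments. This is a sketch only: mask_tail and its optional second parameter are assumptions for illustration, and the expansions used require bash.

```shell
# mask_tail: hypothetical bash helper (uses bash-only expansions).
#   $1 = name:value string
#   $2 = number of leading characters of the value to keep (default 3)
mask_tail() {
  local input=$1 keep=${2:-3}
  local name=${input%%:*}        # everything before the first ":"
  local value=${input#*:}        # everything after the first ":"
  local head=${value:0:keep}     # characters shown unchanged
  local rest=${value:keep}       # characters to hide
  printf '%s:%s%s\n' "$name" "$head" "${rest//?/*}"
}

mask_tail 'John:boofoo'
```

With the sample input this prints John:boo***; a value of three characters or fewer is printed unchanged.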
Another sed : replace all chars after the third by *
sed -E ':A;s/([^:]*:...)(.*)[^*]([*]*)/\1\2\3*/;tA'
Some more awk
awk 'BEGIN{FS=OFS=":"}{s=sprintf("%0*d",length(substr($2,4)),0); gsub(/0/,"*",s);print $1,substr($2,1,3) s}' infile
You can use the %* form of printf, which accepts a variable width. If you print 0 as the value, right-aligned and zero-padded on the left, you get a string of zeros of the requested length, which gsub() then turns into stars.
More readable:
awk 'BEGIN{
FS=OFS=":"
}
{
s=sprintf("%0*d",length(substr($2,4)),0);
gsub(/0/,"*",s);
print $1,substr($2,1,3) s
}
' infile
Test Results:
$ awk --version
GNU Awk 3.1.7
Copyright (C) 1989, 1991-2009 Free Software Foundation.
$ cat f
John:boofoo
$ awk 'BEGIN{FS=OFS=":"}{s=sprintf("%0*d",length(substr($2,4)),0); gsub(/0/,"*",s);print $1,substr($2,1,3) s}' f
John:boo***
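The variable-width %* form described above is also available in the shell's own printf. A minimal sketch, with stars as a made-up function name; it pads an empty string with spaces instead of zeros and converts them with tr:

```shell
# stars: hypothetical helper that prints N asterisks using printf's "*" width.
stars() {
  # %*s takes the field width from the next argument; the empty string is
  # padded to N spaces, which tr then turns into asterisks.
  printf '%*s' "$1" '' | tr ' ' '*'
  echo
}

stars 3
```

This prints ***. With a width of 0 it prints an empty line, whereas the zero-padded %0*d form still produces a single 0 (and therefore one star) at width 0.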
Another pure Bash, using the builtin regular expression predicate.
input="John:boofoo"
if [[ $input =~ ^([^:]*:...)(.*)$ ]]; then
printf '%s%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]//?/*}"
else
echo >&2 "String doesn't match pattern"
fi
We split the string in two parts: the first part being everything up to (and including) the three chars found after the first colon (stored in ${BASH_REMATCH[1]}), the second part being the remaining part of string (stored in ${BASH_REMATCH[2]}). If the string doesn't match this pattern, we just insult the user.
We then print the first part unchanged, and the second part with every character replaced with *.

Shell command for inserting a newline every nth element of a huge line of comma separated strings

I have a one line csv containing a lot of elements. Now I want to insert a newline after every n-th element in a bash/shell script.
Bonus: I'd like to prepend a line with descriptors and using the count of descriptors as 'n'.
Example:
"4908041eee3d4bf98e606140b21ebc89.16","7.38974601030349731","45.31298584267982221","94ff11ce7eb54642b0768dde313e8b25.16","7.38845318555831909","45.31425320325949713", (...)
into
"id","lon","lat"
"4908041eee3d4bf98e606140b21ebc89.16","7.38974601030349731","45.31298584267982221"
"94ff11ce7eb54642b0768dde313e8b25.16","7.38845318555831909","45.31425320325949713"
(...)
Edit: I made a first attempt, but the comma delimiters are missing then:
(...) | xargs --delimiter=',' -n3
"4908041eee3d4bf98e606140b21ebc89.16" "7.38974601030349731" "45.31298584267982221"
"94ff11ce7eb54642b0768dde313e8b25.16" "7.38845318555831909" "45.31425320325949713"
trying to replace the " " with ","
(...) | xargs --delimiter=',' -n3 -i echo ${{}//" "/","}
-bash: ${{}//\": bad substitution
I would go with Perl for that!
Let's assume this outputs something like your file:
printf "1,2,3,4,5,6,7,8,9,10"
1,2,3,4,5,6,7,8,9,10
Then you could use this if you wanted every 4th comma replaced:
printf "1,2,3,4,5,6,7,8,9,10" | perl -pe 's{,}{++$n % 4 ? $& : "\n"}ge'
1,2,3,4
5,6,7,8
9,10
cat data.txt | xargs -n 3 -d, | sed 's/ /,/g'
With n=3 here and input filename is called data.txt
Note: What distinguishes this solution is that it derives the number of output columns from the number of columns in the header line.
Assuming that the fields in your CSV input have no embedded , instances (in which case you'd need a proper CSV parser), try awk:
awk -v RS=, -v header='"id","lon","lat"' '
BEGIN {
print header
colCount = 1 + gsub(",", ",", header)
}
{
ORS = NR % colCount == 0 ? "\n" : ","
print
}
' file.csv
Note that if the input file ends with a newline (as is typical), you'll get an extra newline trailing the output.
With GNU Awk or Mawk (but not BSD/OSX Awk, which only supports literal, single-character RS values), you can fix this as follows:
awk -v RS='[,\n]' -v header='"id","lon","lat"' '
BEGIN {
print header
colCount = 1 + gsub(",", ",", header)
}
{
ORS = NR % colCount == 0 ? "\n" : ","
print
}
' file.csv
BSD/OSX Awk workaround: stick with -v RS=, and replace file.csv with <(tr -d '\n' < file.csv) in order to remove all newlines from the input first.
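The pieces above (header-derived column count, single-character RS, newline removal with tr) can be combined into one function. This is a sketch under the same no-embedded-commas assumption; the name wrap_csv is made up, and the separator is printed before each record so that an incomplete last row does not end in a comma.

```shell
# wrap_csv: hypothetical helper.
#   $1    = header line; the number of its comma-separated names sets the column count
#   stdin = single-line CSV without embedded commas (same assumption as above)
wrap_csv() {
  printf '%s\n' "$1"
  tr -d '\n' | awk -v RS=, -v header="$1" '
    BEGIN  { n = split(header, cols, ",") }              # number of columns
    NR > 1 { printf "%s", ((NR - 1) % n ? "," : "\n") }  # separator before each record
           { printf "%s", $0 }
    END    { if (NR) print "" }                          # final newline
  '
}

printf '1,2,3,4,5,6\n' | wrap_csv 'a,b,c'
```

For this input the output is the header a,b,c followed by the rows 1,2,3 and 4,5,6.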
Assuming your input file is named input:
echo id,lon,lat; awk '{ORS=NR%3?",":"\n"}1' RS=, input

Modify content inside quotation marks, BASH

Good day to all,
I was wondering how to modify the content inside quotation marks while leaving the content outside unmodified.
Input line:
,,,"Investigacion,,, desarrollo",,,
Output line:
,,,"Investigacion, desarrollo",,,
Initial try:
sed 's/\"",,,""*/,/g'
But nothing happens. Thanks in advance for any clue.
The idiomatic awk way to do this is simply:
$ awk 'BEGIN{FS=OFS="\""} {sub(/,+/,",",$2)} 1' file
,,,"Investigacion, desarrollo",,,
or if you can have more than one set of quoted strings on each line:
$ cat file
,,,"Investigacion,,, desarrollo",,,"foo,,,,bar",,,
$ awk 'BEGIN{FS=OFS="\""} {for (i=2;i<=NF;i+=2) sub(/,+/,",",$i)} 1' file
,,,"Investigacion, desarrollo",,,"foo,bar",,,
This approach works because everything up to the first " is field 1, and everything from there to the second " is field 2 and so on so everything between "s is the even-numbered fields. It can only fail if you have newlines or escaped double quotes inside your fields but that'd affect every other possible solution too so you'd need to add cases like that to your sample input if you want a solution that handles it.
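To make the field numbering concrete, this sketch lists the fields awk sees when the double quote is the field separator. The function name show_quote_fields is an assumption for illustration.

```shell
# show_quote_fields: hypothetical helper that lists the fields awk sees
# when the double quote is the field separator.
show_quote_fields() {
  awk -F'"' '{ for (i = 1; i <= NF; i++) printf "%d:[%s]\n", i, $i }'
}

printf '%s\n' ',,,"Investigacion,,, desarrollo",,,' | show_quote_fields
```

For the sample line, field 1 is ,,, and field 3 is ,,, while field 2 holds the quoted text Investigacion,,, desarrollo, which is why only even-numbered fields are edited.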
Using a language that has built-in CSV parsing capabilities like perl will help.
perl -MText::ParseWords -ne '
print join ",", map { $_ =~ s/,,,/,/; $_ } parse_line(",", 1, $_)
' file
,,,"Investigacion, desarrollo",,,
Text::ParseWords is a core module so you don't need to download it from CPAN. Using the parse_line method we set the delimiter and a flag to keep the quotes. Then just do simple substitution and join the line to make your CSV again.
Using egrep, sed and tr:
s=',,,"Investigacion,,, desarrollo",,,'
r=$(egrep -o '"[^"]*"|,' <<< "$s"|sed '/^"/s/,\{2,\}/,/g'|tr -d "\n")
echo "$r"
,,,"Investigacion, desarrollo",,,
Using awk:
awk '{ p = ""; while (match($0, /"[^"]*,{2,}[^"]*"/)) { t = substr($0, RSTART, RLENGTH); gsub(/,+/, ",", t); p = p substr($0, 1, RSTART - 1) t; $0 = substr($0, RSTART + RLENGTH); }; $0 = p $0 } 1'
Test:
$ echo ',,,"Investigacion,,, desarrollo",,,' | awk ...
,,,"Investigacion, desarrollo",,,
$ echo ',,,"Investigacion,,, desarrollo",,,",,, "' | awk ...
,,,"Investigacion, desarrollo",,,", "

Resources