Substitute a letter at a specific position in the file itself using bash

I am trying to do this:
I have a file with content like below;
file:
abcdefgh
I am looking for a way to do this;
file:
aBCdefgh
So, I want to make the 2nd and 3rd letters uppercase in the file itself, because I have to do multiple conversions at different positions in a string in the file. Can someone please help me with how to do this?
I found the following, but it only uppercases the first character of the string in the file:
sed -i 's/^./\U&/' file
output:
Abcdefgh
Thanks much!

Change your sed approach to the following (\U is a GNU sed extension):
sed -i 's/\(.\)\(..\)/\1\U\2/' file
$ cat file
aBCdefgh
matching section:
\(.\) - match the 1st char of the string into the 1st captured group
\(..\) - match the next 2 chars placing into the 2nd captured group
replacement section:
\1 - refers to the 1st captured group, i.e. the 1st char
\U\2 - uppercases the characters from the 2nd captured group
Bonus approach, to capitalize the 105th and 106th characters:
sed -Ei 's/(.{104})(..)/\1\U\2/' file

awk on duty.
echo "abcdefgh" | awk '{print substr($0,1,1) toupper(substr($0,2,2)) substr($0,4)}'
Output will be as follows.
aBCdefgh
If you have an Input_file and want to save the edits back into the same Input_file:
awk '{print substr($0,1,1) toupper(substr($0,2,2)) substr($0,4)}' Input_file > temp_file && mv temp_file Input_file
Explanation: run the code above; the annotated version below is for explanation purposes only.
echo "abcdefgh" ##Use echo to print a string on standard output.
| ##The pipe (|) passes one command's standard output to another command as its standard input (here, echo's output goes to awk).
awk '{ ##Start the awk program here.
##print is awk's statement for printing variables, strings and so on.
##substr is awk's built-in function for extracting part of a line or variable. Its syntax is substr(string, start position, number of characters); if the number of characters is omitted, it returns everything from the start position to the end of the string.
##toupper is another awk built-in function; it converts the text passed to it to UPPER CASE. Here the 2nd and 3rd characters are passed to it, as per the OP's request.
print substr($0,1,1) toupper(substr($0,2,2)) substr($0,4)}'
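Since the question mentions several conversions at different positions, the awk one-liner above could be wrapped in a small shell function. This is only a sketch: the function name upper_range and its argument order are made up for illustration, and it rewrites the file through a temporary file, as in the mv approach above.

```shell
# upper_range START LENGTH FILE  (hypothetical helper, not a standard tool)
# Uppercases LENGTH characters starting at 1-based position START on every line.
upper_range() {
    tmp=$(mktemp) || return 1
    awk -v s="$1" -v l="$2" '{
        print substr($0, 1, s - 1) toupper(substr($0, s, l)) substr($0, s + l)
    }' "$3" > "$tmp" && mv "$tmp" "$3"
}

demo=$(mktemp)                 # throwaway file for the demonstration
printf 'abcdefgh\n' > "$demo"
upper_range 2 2 "$demo"        # characters 2 and 3
upper_range 6 1 "$demo"        # character 6
result=$(cat "$demo")
echo "$result"                 # aBCdeFgh
rm -f "$demo"
```

Each call rewrites the whole file, so for many positions on a large file it may be cheaper to do all the conversions in a single awk program.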

Related

How to remove string between two characters and before the first occurrence using sed

I would like to remove the string between ":" and the first "|" using sed.
input:
|abc:1.2.3|def|
output from sed:
|abc|def|
I managed to come up with sed 's|\(:\)[^|]*|\1|', but this sed command does not remove the first character (":"). How can I modify this command to also remove the colon?
You don't need to group : in your pattern and use it in substitution.
You should keep it simple:
s='|abc:1.2.3|def|'
sed 's/:[^|]*//' <<< "$s"
|abc|def|
: matches a colon and [^|]* matches 0 or more non-pipe characters
1st solution: With awk, you could try the following program.
awk 'match($0,/:[^|]*/){print substr($0,1,RSTART-1) substr($0,RSTART+RLENGTH)}' Input_file
Explanation: This uses awk's match function to match from : up to (but not including) the first |. Whenever the regex matches, match sets awk's built-in variables RSTART (the start position of the match) and RLENGTH (its length). Based on those, we print the substrings before and after the match, which drops the matched part and gives the output required in the question.
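To make RSTART and RLENGTH concrete, here is a small sketch that prints both values for the sample line from the question and then applies the same substr logic. The variable names are chosen just for this illustration.

```shell
line='|abc:1.2.3|def|'

# Show where the match starts and how long it is.
positions=$(printf '%s\n' "$line" | awk 'match($0, /:[^|]*/) { print RSTART, RLENGTH }')
echo "$positions"    # 5 6

# Print everything before and after the matched part.
cleaned=$(printf '%s\n' "$line" |
    awk 'match($0, /:[^|]*/) { print substr($0, 1, RSTART - 1) substr($0, RSTART + RLENGTH) }')
echo "$cleaned"      # |abc|def|
```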
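2nd solution: Using a regex as the field separator, try the following with your shown samples. The matched text acts as the separator, so printing the fields with an empty OFS drops it (this assumes at most one match per line).
awk -v FS=':[^|]*' -v OFS='' '{print $1,$2}' Input_file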

sed extract part of string from a file

I've been trying to extract only part of a string from a file that looks like this:
str1=USER_NAME
str2=justAstring
str3=https://product.org/v-4.5-bin.zip
str4=USER_HOME
I need to extract ONLY the version - in this case: 4.5
I did it by grep and then sed but now the output is 4.5-bin.zip
-> grep str3 file.txt
str3=https://product.org/v-4.5-bin.zip
-> echo str3=https://product.org/v-4.5-bin.zip | sed -n "s/^.*v-\(\S*\)/\1/p"
4.5-bin.zip
What should I do in order to remove also the -bin.zip at the end?
Thanks.
1st solution: With your shown samples, please try following sed code.
sed -n '/^str3=/s/.*-\([^-]*\)-.*/\1/p' Input_file
Explanation: sed's -n option suppresses the default printing of every line, so only explicitly printed lines appear. The address /^str3=/ restricts the substitution to lines that start with str3=. Because the leading .* is greedy, the capturing group picks up the text between the last two - characters on the line (4.5 here); the whole line is replaced with \1, and the p flag prints the result.
2nd solution: Using GNU grep you could try following grep program.
grep -oP '^str3=.*?-\K([^-]*)' Input_file
3rd solution: Using an awk program to get the expected output as per the shown samples.
awk -F'-' '/^str3=/{print $2}' Input_file
4th solution: Using awk's match function to get expected results with help of using RSTART and RLENGTH variables which get set once a TRUE match is found by match function.
awk 'match($0,/^str3=.*-/){split(substr($0,RSTART,RLENGTH),arr,"-");print arr[2]}' Input_file
If you know the version contains just digits and dots, replace \S by [0-9.]. Also, match the remaining characters outside of the capture group so that they are removed.
sed -n 's/^.*v-\([0-9.]*\).*/\1/p'
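To see this last suggestion working on the sample file from the question, here is a small sketch. The /^str3=/ address is borrowed from the first solution so that no separate grep is needed; the temporary file exists only for the demonstration.

```shell
sample=$(mktemp)               # throwaway copy of the sample file
cat > "$sample" <<'EOF'
str1=USER_NAME
str2=justAstring
str3=https://product.org/v-4.5-bin.zip
str4=USER_HOME
EOF

# Only on the str3= line: keep the digits and dots that follow "v-".
version=$(sed -n '/^str3=/s/^.*v-\([0-9.]*\).*/\1/p' "$sample")
echo "$version"                # 4.5
rm -f "$sample"
```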

Making bash output a certain word from a .txt file

I have a question on Bash:
Like the title says, I require bash to output a certain word, depending on where it is in the file. In my explicit example I have a simple .txt file.
I already found out that you can count the number of words within a file with the command:
wc -w < myFile.txt
An output example would be:
78501
There certainly is also a way to make "cat" to only show word number x. Something like:
cat myFile.txt | wordno. 3125
desired-word
Notice, that I will welcome any command, that gets this done, not only cat.
Alternatively or in addition, I would be happy to know how you can make certain characters in a file show, based on their place in it. Something like:
cat myFile.txt | characterno. 2342
desired-character
I already know how you can achieve this with a variable:
a="hello, how are you"
echo ${a:9:1}
w
The only problem is that a variable can only be so long; if it has to hold a whole .txt file, it won't work.
I look forward to your answers!
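You could use awk for this job. It splits the input at whitespace and prints field number $wordnumber; tr is used to turn the newlines into spaces so that the whole file becomes a single record (deleting the newlines outright would glue the last word of each line to the first word of the next):
cat myFile.txt | tr '\n' ' ' | awk -v wordnumber=5 '{ print $wordnumber }'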
And if you want, for example, the 5th character, you could do it like this:
head -c 5 myFile.txt | tail -c 1
Since you have not shown samples of the Input_file or the expected output, I couldn't test this. You could do it with awk, for example:
awk 'FNR==1{print substr($0,2342,1);next}' Input_file
Here FNR==1 tells awk to look only at the 1st line, and substr takes 1 character starting at position 2342; you can increase the length or keep it as per your need.
With gawk:
awk 'BEGIN{RS="[[:space:]]+"} NR==12345' file
or
gawk 'NR==12345' RS="[[:space:]]+" file
I'm setting the record separator to a sequence of whitespace characters (which includes newlines) and then printing the 12345th record.
To improve the average performance you can exit the script once the match is found:
gawk 'BEGIN{RS="[[:space:]]+"}NR==12345{print;exit}' file
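If gawk is not available, the same word lookup can be sketched with plain awk by counting fields line by line. The helper name nth_word is made up for this illustration, and it assumes N is at least 1.

```shell
# nth_word N FILE  (hypothetical helper name)
# Prints the Nth whitespace-separated word of FILE and stops reading.
nth_word() {
    awk -v n="$1" '{
        if (n <= NF) { print $n; exit }   # the word is on this line
        n -= NF                           # otherwise skip this line'"'"'s words
    }' "$2"
}

words=$(mktemp)                # throwaway file for the demonstration
printf 'one two three\nfour five\n\nsix seven eight\n' > "$words"
w=$(nth_word 7 "$words")
echo "$w"                      # seven
rm -f "$words"
```

Like the gawk version with exit, this stops as soon as the word is found, so it does not read the rest of a large file.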

Sed doubts when n occurrences are used

I'm trying to replace the nth occurrence of a substring in a file. I tried to achieve this using sed but all attempts failed to give me the desired output. Some of the attempts are:
sed 's/old/new/g'
sed 's/old/new/3'
sed 's/old/new/3g'
The most common usage of sed is to perform a replacement such as
sed 's/foo/bar/' file
This will replace the first occurrence of the string foo by the string bar and it will do this for every line in file.
If you want to replace only the 3rd occurrence of the string foo, but do this for every line, then you can write:
sed 's/foo/bar/3' file
Finally, if you want to replace all occurrences, then you use:
sed 's/foo/bar/g' file
Any combination such as
sed 's/foo/bar/3g' file
results in behaviour that POSIX leaves unspecified; GNU sed replaces the 3rd match and every match after it.
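The difference between the three well-defined forms is easy to see on a single line. This is just a sketch on a made-up input string:

```shell
line='foo foo foo foo'

first=$(printf '%s\n' "$line" | sed 's/foo/bar/')     # first occurrence only
third=$(printf '%s\n' "$line" | sed 's/foo/bar/3')    # third occurrence only
every=$(printf '%s\n' "$line" | sed 's/foo/bar/g')    # every occurrence

printf '%s\n' "$first" "$third" "$every"
# bar foo foo foo
# foo foo bar foo
# bar bar bar bar
```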
If you want to replace the nth occurrence in a file, then sed is not the right tool; perl or awk might be better.
If you know you have at most one occurrence of "foo" per line, you can do the following, where n is the occurrence to replace:
awk -v n=3 '/foo/{c++}(c==n){sub("foo","bar")}1' file
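If more than a single occurrence per line might appear, it becomes a bit more tricky. Various solutions are possible; this one splits each line on "foo" and counts the separators (the printf uses an explicit format so that % characters in the data are safe, and empty lines are not counted):
awk 'BEGIN{FS="foo";OFS="bar";n=5}
(c<n) && (c+NF-1>=n) {
for(i=1;i<NF;++i) printf "%s%s", $i, ((++c==n) ? OFS : FS); print $NF; next
}
{if(NF) c+=NF-1; print}' file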
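A variation on the same idea, sketched here with index() so that the search string is treated literally rather than as a regex. The helper name replace_nth and its argument order are assumptions made for this example; it counts non-overlapping occurrences from left to right across the whole file and expects OLD to be non-empty.

```shell
# replace_nth OLD NEW N FILE  (hypothetical helper; OLD must be non-empty)
# Replaces only the Nth literal occurrence of OLD in FILE and prints the result.
replace_nth() {
    awk -v old="$1" -v new="$2" -v n="$3" '{
        out = ""; rest = $0
        # Walk through every literal occurrence of old on this line.
        while ((pos = index(rest, old)) > 0) {
            count++
            out = out substr(rest, 1, pos - 1) (count == n ? new : old)
            rest = substr(rest, pos + length(old))
        }
        print out rest
    }' "$4"
}

f=$(mktemp)                    # throwaway file for the demonstration
printf 'foo a foo\nb foo foo\nfoo\n' > "$f"
result=$(replace_nth foo bar 3 "$f")
echo "$result"
# foo a foo
# b bar foo
# foo
rm -f "$f"
```

Note that awk's -v option interprets backslash escapes in the assigned values, so strings containing backslashes would need extra care.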

Shell Script Replace a Specified Column with sed

I have an example dataset separated by semicolons, as below:
123;IZMIR;ZMIR;123
abc;ANKAR;aaa;999
AAA;ZMIR;ZMIR;bob
BBB;ANKR;RRRR;ABC
I would like to replace values in a specified column. Let's say I want to change "ZMIR" to "IZMIR", but only in the third column; the values in the second column must stay the same.
Desired output is;
123;IZMIR;IZMIR;123
abc;ANKAR;aaa;999
AAA;ZMIR;IZMIR;bob
BBB;ANKR;RRRR;ABC
I tried;
sed 's/;ZMIR;/;IZMIR;/' file.txt
The problem is that it changes the values throughout the file, not just those in the 3rd column.
I also tried;
awk -F";" '{gsub("ZMIR",";IZMIR;",$2)}1'
and here it specifies the column but, it somehow adds spaces;
123 I;IZMIR; ZMIR 123
abc;ANKAR;aaa;999
AAA ;IZMIR; ZMIR bob
BBB;ANKR;RRRR;ABC
sed doesn't know about columns, awk does (but in awk they're called "fields"):
awk 'BEGIN{FS=OFS=";"} $3=="ZMIR"{$3="IZMIR"} 1' file
Note that since the above is doing a literal string search and replace, you don't have to worry about regexp or backreference metacharacters in the search or replacement strings, unlike in a sed solution (see https://stackoverflow.com/a/29626460/1745001).
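If the column number and the strings need to vary, the same awk program could be wrapped in a function. This is a sketch only: replace_field and its argument order are invented for the example, and the comparison is forced to be a string comparison so that numeric-looking values are not compared as numbers.

```shell
# replace_field COL OLD NEW FILE  (hypothetical helper name)
# Replaces field COL with NEW on lines where that field is exactly OLD.
replace_field() {
    awk -v col="$1" -v old="$2" -v new="$3" 'BEGIN { FS = OFS = ";" }
        ($col "") == old { $col = new }   # "" forces a string comparison
        1' "$4"
}

data=$(mktemp)                 # throwaway copy of the sample data
cat > "$data" <<'EOF'
123;IZMIR;ZMIR;123
abc;ANKAR;aaa;999
AAA;ZMIR;ZMIR;bob
BBB;ANKR;RRRR;ABC
EOF
out=$(replace_field 3 ZMIR IZMIR "$data")
echo "$out"
rm -f "$data"
```

With the sample data this should print the desired output from the question: only the third column changes, and the ZMIR in the second column of the third row is left alone.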
With regard to what you tried previously with awk:
awk -F";" '{gsub("ZMIR",";IZMIR;",$2)}1'
That says: find "ZMIR" in the 2nd semi-colon-separated field and replace it with ";IZMIR;" and also change every existing ";" on the line to a blank character.
To learn awk, read the book Effective Awk Programming, 4th Edition, by Arnold Robbins.
If you know exactly where the word to replace is located and how many times it occurs on that line, you could use sed with something like:
sed '3 s/ZMIR/IZMIR/2'
The 3 at the beginning selects the third line, and the 2 at the end selects the second occurrence on that line. The awk solution is the better one, but this shows how it works in sed ;)
This might work for you (GNU sed):
sed -r 's/[^;]+/\n&\n/3;s/\nZMIR\n/IZMIR/;s/\n//g' file
Surround the required field by unique markers then replace the required string (plus markers) by the replacement string. Finally remove the unique markers.
Perl on Command Line
Input
123;IZMIR;ZMIR;123
abc;ANKAR;aaa;999
AAA;ZMIR;ZMIR;bob
BBB;ANKR;RRRR;ABC
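$. == 1 restricts the substitution to the first row; for the second row use $. == 2
$F[0] is the first column; the fourth column would be $F[3]. Note that s/$F[0]/***/ replaces the first occurrence of that column's text anywhere on the line, so it hits the intended column only if the same text does not appear in an earlier column
-a -F\; autosplits each line into @F using ; as the delimiter
For row 1 and column 1: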
perl -a -F\; -pe 's/$F[0]/***/ if $. == 1' your-file
output
***;IZMIR;ZMIR;123
abc;ANKAR;aaa;999
AAA;ZMIR;ZMIR;bob
BBB;ANKR;RRRR;ABC
for row == 2 and column == 2
perl -a -F\; -pe 's/$F[1]/***/ if $. == 2' your-file
123;IZMIR;ZMIR;123
abc;***;aaa;999
AAA;ZMIR;ZMIR;bob
BBB;ANKR;RRRR;ABC
Also without -a -F
perl -pe 's/123/***/ if $. == 1' your-file
output
***;IZMIR;ZMIR;123
abc;ANKAR;aaa;999
AAA;ZMIR;ZMIR;bob
BBB;ANKR;RRRR;ABC
If you want to edit the file in place, add the -i option; it finds, replaces and saves in the same file:
perl -i -a -F\; and so on
You need to include some absolute references in the pattern:
^ for the beginning of the line
an unequivocal separation pattern
^.*ZMIR and [^;]*;ZMIR give different results: the first takes everything before the last ZMIR, because sed always takes the longest possible match
Specific
sed 's/^\([^;]*;[^;]*;\)ZMIR;/\1IZMIR;/' YourFile
Generic, where Old and New are shell variables (remember that these are used as regex values, so regex rules apply, such as escaping some characters):
Old='ZMIR'
New='IZMIR'
sed 's/^\(\([^;]*;\)\{2\}\)'"${Old}"';/\1'"${New}"';/' YourFile
In this simple case sed is an alternative, but awk is better for a complex or long line.
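The difference between ^.*ZMIR and [^;]*;ZMIR mentioned above can be checked on one of the sample lines. This sketch replaces each match with a placeholder X (chosen only for the demonstration) to show how much each pattern consumes:

```shell
line='AAA;ZMIR;ZMIR;bob'

# .* is greedy: it runs up to the LAST ZMIR on the line.
greedy=$(printf '%s\n' "$line" | sed 's/^.*ZMIR/X/')
# [^;]* cannot cross a semicolon, so the match stops at the first column boundary.
anchored=$(printf '%s\n' "$line" | sed 's/^[^;]*;ZMIR/X/')

printf '%s\n' "$greedy" "$anchored"
# X;bob
# X;ZMIR;bob
```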

Resources