Copy text from one line and create a new line with that next under it - bash

I have a text file in which I want to find every occurrence of ID:= "abc123". When one is found, I want to take the value abc123 and create a new line below it containing a set string, newId:= "abc123". How can I do this in the terminal?
I'd like to use bash. For example: find the string '"ID": "', copy the value (abc123), and make a new line with this data.
"ID": "abc123"
"newID": "abc123"

You can do this:
sed -e 's/^"ID": "\(.*\)"/&\
"newID": "\1"/' myfile.txt
First, I'll try to explain the regular expression that searches for matches:
^ Matches the start of the line
"ID": " Matches that exact string
\(.*\) Matches a sequence of zero or more (*) of any character (.). Placing this expression between backslashed parentheses creates a "capture", which allows us to store the resulting part of the match in an auxiliary variable \1.
" Matches the double-quote character
When it finds a match, it replaces it with:
& the match itself. This operator is an auxiliary variable that represents what was matched.
\<new-line> the backslash followed by an actual new line character escapes a new line, i.e. it allows us to print a new line character into the replacement
"newID": " prints that exact string
\1 prints the contents of our capture, so it prints the ID we found
" prints a double quote character.
Hope this helps =)
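To see the command in action, here is a minimal sketch that runs it on a throwaway file. The sample contents and the temporary file are assumptions made for illustration, they are not part of the question.

```shell
#!/usr/bin/env bash
# Hypothetical sample input; the contents are made up for illustration.
tmp=$(mktemp)
printf '%s\n' '"ID": "abc123"' 'other text' '"ID": "xyz789"' > "$tmp"

# Same substitution as above: keep the matched line (&), add a newline,
# then print the new line built from the captured value (\1).
result=$(sed -e 's/^"ID": "\(.*\)"/&\
"newID": "\1"/' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

Each line that starts with "ID": should be followed by a matching "newID": line, while other lines pass through unchanged.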

Try doing this :
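sed -r 's#^"ID": "([a-zA-Z0-9]+)"#&\n"newID": "\1"#' file.txt
sed : the executable
-r : extended mode (no need to backslash the parentheses)
s : we perform a substitution, skeleton is s#origin#replacement# (the separator can be almost any character)
^ : means start of line in regex
( ) : parentheses create a capture
[a-zA-Z0-9]+ : one or more letters or digits (a range such as a-Z is not valid in every locale, so both cases are spelled out)
& : the whole match, so the original ID line is kept
\n : a newline in the replacement (supported by GNU sed)
"newID": is the start of the new line
\1 : is the end of the new line (the captured string)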

Considering your question is very vague I made some assumptions which will become apparent in my implementation.
INPUT FILE -- call it t
ID="one"
dkkd
ID="two"
ffkjf
ID="three"
ldl
Command run on the input file
while IFS= read -r line; do newID=$(echo "$line" | grep ID | cut -d= -f2); if [[ -n "$newID" ]]; then echo "$line" >> t2; echo "newID=$newID" >> t2; else echo "$line" >> t2; fi; done < t
OUTPUT FILE -- Name is t2 (apparent from the command)
ID="one"
newID="one"
dkkd
ID="two"
newID="two"
ffkjf
ID="three"
newID="three"
ldl
Basically this command goes line by line through the file (in this case called t) and looks for an ID line. If it finds one, it gets its value, prints the original line with the ID, and then prints another line with newID right after it. If the line in question does not have an ID, it just prints the line itself.
Things to note:
If any other line in the file contains "ID" but is not the ID line you asked about, it will be treated as an ID line too, so this will not work.
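One way around that caveat is to match only lines that begin with ID= instead of any line containing ID. The following is a sketch of that variation, not the command above; the sample input (including the VALID=yes line) is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical input: the sample file from above plus a line that merely contains "ID".
input=$(mktemp)
printf '%s\n' 'ID="one"' 'dkkd' 'VALID=yes' 'ID="two"' > "$input"

output=$(
    while IFS= read -r line; do
        printf '%s\n' "$line"
        case $line in
            # Only lines that start with ID= get a newID line after them.
            (ID=*) printf 'newID=%s\n' "${line#ID=}" ;;
        esac
    done < "$input"
)

printf '%s\n' "$output"
rm -f "$input"
```

Here VALID=yes is copied through unchanged, whereas a plain grep ID would have produced a newID=yes line for it.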

Related

How to find string from txt file and store in variable in BASH

The requirement is to find a string in a text file and store it in a variable.
The file looks like this (rsa.txt):
Encrypting String
... Input string : Test_123
... Encrypted string : $ENC(JEVOQyhZbVpkQmM0L3ArT2c4M05TZks5TmxRPT1+KQ==)
Required output (variable name : encstring):
encstring = $ENC(JEVOQyhZbVpkQmM0L3ArT2c4M05TZks5TmxRPT1+KQ==)
I tried the code below, but it shows no result:
encstring=$(grep -oE '$ENC[^()]*==)' <<< rsa.txt)
With awk, you could try the following. It searches for lines containing Encrypted string, checks that the last field of that line contains $ENC, and then prints that last field using $NF.
encstring=$(awk '/Encrypted string/ && $NF~/\$ENC/{print $NF}' rsa.txt)
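A quick way to try the awk approach is to recreate the sample file first. This is only a sketch: the temporary directory is an assumption, and the file contents are copied from the question.

```shell
#!/usr/bin/env bash
# Recreate rsa.txt from the question in a scratch directory.
dir=$(mktemp -d)
cat > "$dir/rsa.txt" <<'EOF'
Encrypting String
... Input string : Test_123
... Encrypted string : $ENC(JEVOQyhZbVpkQmM0L3ArT2c4M05TZks5TmxRPT1+KQ==)
EOF

# $NF is the last whitespace-separated field of the matching line.
encstring=$(awk '/Encrypted string/ && $NF ~ /\$ENC/ { print $NF }' "$dir/rsa.txt")

echo "$encstring"
rm -rf "$dir"
```

This relies on the encrypted value containing no spaces, since awk splits fields on whitespace by default.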
You can use
encstring=$(sed -n 's/.*\(\$ENC(.*)\).*/\1/p' rsa.txt)
# OR
encstring=$(grep -oP '\$ENC\(.*?\)' rsa.txt)
See an online demo:
s='Encrypting String
... Input string : Test_123
... Encrypted string : $ENC(JEVOQyhZbVpkQmM0L3ArT2c4M05TZks5TmxRPT1+KQ==)'
encstring=$(sed -n 's/.*\(\$ENC(.*)\).*/\1/p' <<< "$s")
echo "$encstring"
# => $ENC(JEVOQyhZbVpkQmM0L3ArT2c4M05TZks5TmxRPT1+KQ==)
The sed -n 's/.*\(\$ENC(.*)\).*/\1/p' command does the following:
-n suppresses the default line output
s/.*\(\$ENC(.*)\).*/\1/ - finds any text, then captures $ENC(...) into Group 1 and then matches the rest of the string, and replaces the match with the Group 1 value
p - prints the result of the substitution.
The grep -oP '\$ENC\(.*?\)' command extracts all $ENC(...) matches, with any chars, as few as possible, between ( and ).
You are searching for ENC followed by zero or more occurrences of any character that is not an opening or closing parenthesis. However, in your input file there is an opening parenthesis right after ENC, so [^()]* matches the empty string. After that you expect the string ==). This would match only input such as ENC==).
You also need to escape $ as \$, because an unescaped $ means "end of line" with -E.
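Putting those points together, a corrected version of the original attempt might look like the sketch below. Two things in it go beyond the explanation above and are my own reading: the parentheses are matched literally with \( and \), and the file is passed as an argument, because <<< rsa.txt feeds grep the literal text "rsa.txt" and not the file's contents.

```shell
#!/usr/bin/env bash
# Recreate rsa.txt from the question in a scratch directory.
dir=$(mktemp -d)
cat > "$dir/rsa.txt" <<'EOF'
Encrypting String
... Input string : Test_123
... Encrypted string : $ENC(JEVOQyhZbVpkQmM0L3ArT2c4M05TZks5TmxRPT1+KQ==)
EOF

# \$ is a literal dollar sign; \( and \) are literal parentheses in ERE;
# [^()]* then matches everything between them.
encstring=$(grep -oE '\$ENC\([^()]*\)' "$dir/rsa.txt")

echo "$encstring"
rm -rf "$dir"
```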

How to replace lower case with sed

SET_VALUE(ab.ms.r.gms_dil_cfg.f().gms_dil_mode, dsad_sd );
How can I use sed so that, only in the part from SET_VALUE up to the comma, each _ followed by a lower-case letter is replaced with that letter in upper case?
result:
SET_VALUE(ab.ms.r.gmsDilCfg.f().gmsDilMode, dsad_sd );
For your input string you may apply the following sed expression + bash variable substitution:
s="SET_VALUE(ab.ms.r.gms_dil_cfg.f().gms_dil_mode, dsad_sd );"
res=$(sed '1s/_\([a-z]\)/\U\1/g;' <<< "${s%,*}"),${s#*,}
echo "$res"
The output:
SET_VALUE(ab.ms.r.gmsDilCfg.f().gmsDilMode, dsad_sd );
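The two parameter expansions are what restrict the edit to the text before the comma. A small sketch of what each one yields for this input (the variable names before and after are mine, chosen for illustration):

```shell
#!/usr/bin/env bash
s="SET_VALUE(ab.ms.r.gms_dil_cfg.f().gms_dil_mode, dsad_sd );"

before=${s%,*}   # drop the shortest suffix matching ",*": text before the last comma
after=${s#*,}    # drop the shortest prefix matching "*,": text after the first comma

printf '[%s]\n' "$before"
printf '[%s]\n' "$after"
```

Only $before is piped through sed, and the comma plus $after are glued back on afterwards. With more than one comma in the input, the two expansions would split at different commas, so this sketch assumes exactly one.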
Got distracted while writing this one up so Roman beat me to the punch, but this has a slight variation so figured I'd post it as another option ...
$ s="SET_VALUE(ab.ms.r.gms_dil_cfg.f().gms_dil_mode, dsad_sd );"
$ sed 's/,/,\n/g' <<< "$s" | sed -n '1{s/_\([a-z]\)/\U\1/g;N;s/\n//;p}'
SET_VALUE(ab.ms.r.gmsDilCfg.f().gmsDilMode, dsad_sd );
s/,/,\n/g : break input into separate lines at the comma (leave comma on first line, push rest of input to a second line)
at this point we've broken our input into 2 lines; the second sed invocation will now be working with a 2-line input
sed -n : refrain from printing input lines as they're processed; we'll explicitly print lines when required
1{...} : for the first line, apply the commands inside the braces ...
s/_\([a-z]\)/\U\1/g : for each pattern we find like '_[a-z]', save the [a-z] in buffer #1, and replace the pattern with the upper case of the contents of buffer #1
at this point we've made the desired edits to line #1 (ie, everything before the comma in the original input), now ...
N : read and append the next line into the pattern space
s/\n// : remove the embedded newline (replace it with nothing)
at this point we've pasted lines #1 and #2 together into a single line
p : print the pattern space
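To make the intermediate state visible, the two sed invocations can be run one at a time. This sketch assumes GNU sed, since \n in the replacement and \U are GNU extensions; the variable names stage1 and stage2 are made up.

```shell
#!/usr/bin/env bash
# Assumes GNU sed (\n in the replacement and \U are not portable).
s="SET_VALUE(ab.ms.r.gms_dil_cfg.f().gms_dil_mode, dsad_sd );"

# Stage 1: break the input after the comma, giving two lines.
stage1=$(printf '%s\n' "$s" | sed 's/,/,\n/g')
printf '%s\n' "$stage1"

# Stage 2: upper-case after "_" on line 1 only, then join the two lines again.
stage2=$(printf '%s\n' "$stage1" | sed -n '1{s/_\([a-z]\)/\U\1/g;N;s/\n//;p}')
printf '%s\n' "$stage2"
```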

Print, modify, print again Bash variable

I am looping over a CSV file. Each line of the file is formatted something like this (it's Open Street Maps data):
planet_85.287_27.665_51a5fb91,AcDbEntity:AcDbPolyline,{ [name] Purano
Bus Park-Thimi [type] route [route] microbus [ref] 10 } { [Id] 13.0
[Srid] 3857 [FieldsTableId]
This follows the format:
Layer,SubClasses,ExtendedEntity,Linetype,EntityHandle,Text
I want to add a new column for Name. I can find the name in a line by cutting off everything before [name] and after the next [. This code successfully creates a newline-delimited file of all of the names (which I open as a CSV and then copy-paste into the original file as a new column).
cat /path/to/myfile.csv | while read line
do
if [[ ${line} == *"name"* ]]
then
printf "$(echo $line | LC_ALL=C sed 's/^.*name\]//g'| LC_ALL=C cut -f1 -d'[') \n"
else
printf "\n"
fi
done >/path/to/newrow.csv
This system is clearly suboptimal - I would far prefer to print the entire final row. But when I replace that printf line with this:
printf "$line,$(echo $line | LC_ALL=C sed 's/^.*name\]//g'| LC_ALL=C cut -f1 -d'[') \n"
It prints the line but not the name. I've tried printing them in separate print statements, printing the line and then echoing the name, saving the name in a variable and then printing, and a number of other techniques, and each time I either a) only print the line, or b) print the name on a new line, which breaks the CSV format.
What am I doing wrong? How can I print the full original line with the name appended as a new column at the end?
NOTE: I am running this in Terminal on macOS Sierra on a MacBook Pro 15" Retina.
If I understand correctly, you want to extract the name between [name] and [type], and append as the new last CSV column. You can do that using capture groups:
sed -e 's/.*\[name\] \(.*\) \[type\].*/&,\1/' < input
Notice the \(.*\) in the middle. That captures the text between [name] and [type].
In the replacement string, & stands for the matched string, which is the entire line, as the pattern starts and ends with .*.
Next the , is a literal comma, and \1 stands for the content of the first capture group, the part matched within \(...\).
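Here is a small sketch of that substitution applied to the sample row. It assumes the row sits on a single line in the real file (it is wrapped in the question), and the variable names are made up.

```shell
#!/usr/bin/env bash
# The sample row from the question, joined onto one line (an assumption about the real file).
line='planet_85.287_27.665_51a5fb91,AcDbEntity:AcDbPolyline,{ [name] Purano Bus Park-Thimi [type] route [route] microbus [ref] 10 } { [Id] 13.0 [Srid] 3857 [FieldsTableId]'

# & re-emits the whole matched line; ,\1 appends the captured name as a new last column.
result=$(printf '%s\n' "$line" | sed -e 's/.*\[name\] \(.*\) \[type\].*/&,\1/')

printf '%s\n' "$result"
# Show just the appended column:
printf '%s\n' "${result##*,}"
```

Rows without a [name] ... [type] pair are left as they are, because the s command has nothing to match.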

shell: how to read a certain column in a certain line into a variable

I want to extract the first column of the last line of a text file. Instead of writing the content of interest to another file and reading it back in, can I use a command to read it into a variable directly?
For example, if my file is like this:
...
123 456 789(this is the last line)
What I want is to read 123 into a variable in my shell script. How can I do that?
One approach is to extract the line you want, read its columns into an array, and emit the array element you want.
For the last line:
#!/bin/bash
# ^^^^- not /bin/sh, to enable arrays and process substitution
read -r -a columns < <(tail -n 1 "$filename") # put last line's columns into an array
echo "${columns[0]}" # emit the first column
Alternately, awk is an appropriate tool for the job:
line=2
column=1
var=$(awk -v line="$line" -v col="$column" 'NR == line { print $col }' <"$filename")
echo "Extracted the value: $var"
That said, if you're looking for a line close to the start of a file, it's often faster (in a runtime-performance sense) and easier to stick to shell builtins. For instance, to take the third column of the second line of a file:
{
read -r _ # throw away first line
read -r _ _ value _ # extract third value of second line
} <"$filename"
This works by using _s as placeholders for values you don't want to read.
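As a rough check that the awk and builtin approaches agree, both can be run against the same file. The three-line sample file and the variable names from_awk and from_read are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical three-line sample file.
filename=$(mktemp)
printf '%s\n' 'a1 a2 a3' 'b1 b2 b3' 'c1 c2 c3' > "$filename"

# awk: pick line 2, column 3.
from_awk=$(awk -v line=2 -v col=3 'NR == line { print $col }' < "$filename")

# Builtins only: skip line 1, then keep the third field of line 2.
{
    read -r _
    read -r _ _ from_read _
} < "$filename"

echo "awk: $from_awk, read: $from_read"
rm -f "$filename"
```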
I guess by "first column" you mean "first word", don't you?
If it is guaranteed that the last line doesn't start with a space, you can do
tail -n 1 YOUR_FILE | cut -d ' ' -f 1
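If that guarantee does not hold, one option is to let awk do the splitting, since its default field separator skips leading whitespace. The sketch below compares the two on a hypothetical file whose last line is indented.

```shell
#!/usr/bin/env bash
# Hypothetical file whose last line starts with spaces.
filename=$(mktemp)
printf '%s\n' 'first line' '   123 456 789' > "$filename"

with_cut=$(tail -n 1 "$filename" | cut -d ' ' -f 1)     # field 1 is the empty text before the first space
with_awk=$(tail -n 1 "$filename" | awk '{ print $1 }')  # leading whitespace is ignored

printf 'cut: [%s]\nawk: [%s]\n' "$with_cut" "$with_awk"
rm -f "$filename"
```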
You could also use sed:
$> var=$(sed -nr '$s/(^[^ ]*).*/\1/p' "file.txt")
The -nr tells sed not to output data by default (-n) and to use extended regular expressions (-r, which avoids escaping the parentheses; otherwise you would have to write \( \)). The $ is an address that specifies the last line. The regular expression anchors the beginning of the line with ^, then matches everything that is not a space with [^ ]* and puts the result into a capture group ( ), and matches the rest of the line with .*. The substitution replaces the whole line with the capture group \1, and the p flag prints the result.

how to get substring, starting from the first occurrence of a pattern in bash

I'm trying to get a substring from the start of a pattern.
I would simply use cut, but it doesn't work if the pattern is more than one character long.
If I needed a single-character delimiter, then this would do the trick:
result=`echo "test String with ( element in parenthesis ) end" | cut -d "(" -f 2-`
edit: sample tests:
INPUT: ("This test String is an input", "in")
OUTPUT: "ing is an input"
INPUT: ("This test string is an input", "in ")
OUTPUT: ""
INPUT: ("This test string is an input", "n")
OUTPUT: "ng is an input"
note: the parentheses mean that each input consists of a string and a delimiter string.
EDITED:
In conclusion, what was requested was a way to parse out the text from a string beginning at a particular substring and ending at the end of the line. As mentioned, there are numerous ways to do this. Here's one...
grep -Eo "DELIM.*" input
... where 'DELIM' is the desired substring.
Also
awk -v delim="in" '{ i = index($0, delim); print (i ? substr($0, i) : "") }'
This can be done without external programs. Assuming the string to be processed is in $string and the delimiter is DELIM:
result=${string#"${string%%DELIM*}"}
The inner part substitutes $string with everything starting from the first occurrence of DELIM (if any) removed. The outer part then removes that value from the start of $string, leaving everything starting from the first occurrence of DELIM or the empty string if DELIM does not occur. (The variable string remains unchanged.)
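The same expansion can be split into two steps and checked against the sample tests from the question. This is a sketch; substring_from is a hypothetical helper name, and the delimiter is quoted so that it is taken literally and not as a glob pattern.

```shell
#!/usr/bin/env bash
# substring_from is a hypothetical helper name, not part of the answer above.
# Usage: substring_from STRING DELIM
substring_from() {
    string=$1
    delim=$2
    prefix=${string%%"$delim"*}           # text before the first DELIM (whole string if DELIM is absent)
    printf '%s\n' "${string#"$prefix"}"   # strip that prefix, keeping DELIM and everything after it
}

substring_from "This test String is an input" "in"
substring_from "This test string is an input" "in "
substring_from "This test string is an input" "n"
```

The second call prints an empty line, because "in " does not occur in the string and the whole string is stripped as the prefix.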

Resources