I'm trying to achieve the following results:
I have a command that spits out some env vars to the terminal:
./script will output the following:
AWS_OKTA_PROFILE=xxx
AWS_ACCESS_KEY_ID=xxx
AWS_SECRET_ACCESS_KEY=xx
AWS_SECURITY_TOKEN=xx
AWS_SESSION_TOKEN=xx
I want to write this to an output file in another location, almost unchanged, except that everything before the equal sign should be lowercase, like this:
[default]
aws_okta_profile=xxx
aws_access_key_id=xxx
aws_secret_access_key=xx
aws_security_token=xx
aws_session_token=xx
Notice, I'm also prepending [default] to the file.
Thanks!
In addition to sed, awk also provides a simple solution. You can use '=' as the field separator and convert the first field to lowercase with tolower() if the record contains an '=' sign (alternatively, check NF>1 to confirm the record has more than one field). The 1 at the end of the rule is simply shorthand for print. Putting it all together, you can use:
awk -F= -v OFS='=' '/=/{$1=tolower($1)}1' file
Example Use/Output
With your input, including the [default] line, saved in a file named file, you would get:
$ awk -F= -v OFS='=' '/=/{$1=tolower($1)}1' file
[default]
aws_okta_profile=xxx
aws_access_key_id=xxx
aws_secret_access_key=xx
aws_security_token=xx
aws_session_token=xx
awk does not have an edit in-place mechanism (except by non-standard extension), so simply redirect the output to a new file, e.g.
$ awk -F= -v OFS='=' '/=/{$1=tolower($1)}1' file > newfile
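The question also asks for the [default] header and for the result to land in another file. A minimal sketch of wiring the awk command above into that, assuming a stand-in function emit_env in place of ./script and a temporary file as the destination (both are hypothetical names, not from the question):

```shell
# emit_env is a hypothetical stand-in for ./script.
emit_env() {
    printf '%s\n' 'AWS_OKTA_PROFILE=xxx' 'AWS_ACCESS_KEY_ID=xxx' 'AWS_SESSION_TOKEN=xx'
}

# Hypothetical destination; replace with the real target path.
out_file=$(mktemp)

# Group the header and the converted lines so one redirection covers both.
{
    printf '[default]\n'
    emit_env | awk -F= -v OFS='=' '/=/{$1=tolower($1)}1'
} > "$out_file"

written=$(cat "$out_file")
printf '%s\n' "$written"
rm -f "$out_file"
```

Because the header is printed by the shell rather than by awk, the input does not need to contain a [default] line of its own.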
The search and replace with lowercase can be done with sed like this:
#!/usr/bin/env -S sed -f
s/\([^=]\+\)=\([^=]\+\)/\L\1=\E\2/
s/\([^=]\+\)=\([^=]\+\)/: search regex pattern:
\([^=]\+\): capture group of 1 or more characters not an = sign,
=: followed by an = sign,
\([^=]\+\): followed by another captured group of 1 or more characters not an = sign.
/\L\1=\E\2/: Replace matches with this:
\L\1: Lowercase the captured group 1,
=: followed by an = sign,
\E\2: followed by captured group 2 with case unchanged.
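As a quick check, the same substitution can be run as a one-liner. This is a sketch that assumes GNU sed, since \L, \E and \+ are GNU extensions; the mixed-case sample values are made up to show that only the key is lowercased:

```shell
# Assumes GNU sed (\L, \E and \+ are GNU extensions).
converted=$(
    printf '%s\n' '[default]' 'AWS_ACCESS_KEY_ID=AbC' 'AWS_SESSION_TOKEN=xYz' |
        sed 's/\([^=]\+\)=\([^=]\+\)/\L\1=\E\2/'
)
printf '%s\n' "$converted"
```

Lines without an = sign, such as [default], do not match the pattern and pass through unchanged.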
The text file is like this,
#एक
1के
अंकगणित8IU
अधोरेखाunderscore
$thatऔर
%redएकyellow
$चिह्न
अंडरस्कोर#_
The desired text file should be like,
#
1
8IU
underscore
$that
%redyellow
$
#_
This is what I have tried so far, using awk
awk -F"[अ-ह]*" '{print $1}' filename.txt
And the output that I am getting is,
#
1
$that
%red
$
And using awk -F"[अ-ह]*" '{print $1,$2}' filename.txt, I am getting an output like this:
#
1 े
ं
ो
$that
%red yellow
$ ि
ं
Is there any way to solve this in a bash script?
Using perl:
$ perl -CSD -lpe 's/\p{Devanagari}+//g' input.txt
#
1
8IU
underscore
$that
%redyellow
$
#_
-CSD tells perl that standard streams and any opened files are encoded in UTF-8. -p loops over input files printing each line to standard output after executing the script given by -e. If you want to modify the file in place, add the -i option.
The regular expression matches any codepoints assigned to the Devanagari script in the Unicode standard and removes them. Use \P{Devanagari} to do the opposite and remove the non-Devanagari characters.
Using awk you can do:
awk '{gsub(/[^\x00-\x7F]+/, "")} 1' file
#
1
8IU
underscore
$that
%redyellow
See the documentation on bracket expressions: https://www.gnu.org/software/gawk/manual/html_node/Bracket-Expressions.html
The range [\x00-\x7F] matches all values numerically between zero and 127, which is the defined range of the ASCII character set. The complemented list [^\x00-\x7F] matches any character outside the ASCII range.
tr is a very good fit for this task:
LC_ALL=C tr -c -d '[:cntrl:][:graph:]' < input.txt
This sets the POSIX C locale, in which only the ASCII character set is valid.
It then instructs tr to delete (-d) the complement (-c) of [:cntrl:][:graph:], that is, every character that is neither a control character nor a visible (drawn) character. Because LC_ALL=C sets every locale category to C, all non-ASCII bytes are discarded.
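One caveat worth checking on your own data: [:graph:] does not include the space character, so spaces are deleted too. A sketch of the difference, assuming you want to keep spaces by using [:print:] instead (the sample bytes are an arbitrary two-byte UTF-8 character between ASCII letters):

```shell
# \303\251 is the two-byte UTF-8 encoding of a non-ASCII character.
with_graph=$(printf 'a b\303\251c\n' | LC_ALL=C tr -c -d '[:cntrl:][:graph:]')
with_print=$(printf 'a b\303\251c\n' | LC_ALL=C tr -c -d '[:cntrl:][:print:]')

printf '%s\n' "$with_graph"   # the space is removed along with the non-ASCII bytes
printf '%s\n' "$with_print"   # the space is kept
```

The sample input in the question has no spaces, so both forms give the same result there.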
I am trying to find all numbers in a json file and replace them with a half value of the original number using sed on mac. For example, here I search for 2010 and replace it with 1005:
file="data.json"
sed -i '' -E 's,([^0-9]|^)2010([^0-9]|$),\1 1005\2,g' "$file"
I would like to find all number instances, and replace them with half values of themselves. It would need to work on decimals, eg: 2009 would become 1004.5, 10.5 would become 5.25.
I'm aware this could take each individual number character, so perhaps it would need to find numbers with non-numerical characters either side of it.
edit: I would like it to be flexible and work on all forms of text files, not just JSON files. (.txt, .html, .rtf etc...)
You may use Perl with a regex with e modifier:
perl -pe 's{(?<!\d)(\d+(?:\.\d+)?)(?!\d)}{$1/2}ge' file
To modify the file inline, add -i option:
perl -i -pe 's{(?<!\d)(\d+(?:\.\d+)?)(?!\d)}{$1/2}ge' file
perl -pi.bak -e 's{(?<!\d)(\d+(?:\.\d+)?)(?!\d)}{$1/2}ge' file # To save a backup of the original file
For example:
s="abc_2010_and+2009+or-10.5"
perl -pe 's{(?<!\d)(\d+(?:\.\d+)?)(?!\d)}{$1/2}ge' <<< "$s"
# => abc_1005_and+1004.5+or-5.25
The (?<!\d)(\d+(?:\.\d+)?)(?!\d) regex matches
(?<!\d) - no digit immediately to the left is allowed
(\d+(?:\.\d+)?) - Group 1 ($1): 1+ digits followed with an optional sequence of . and 1+ digits
(?!\d) - no digit immediately to the right is allowed.
The RHS - $1/2 - is an expression that divides the Group 1 value by 2. It is evaluated as code because of the e modifier at the end of the substitution.
With GNU awk for multi-char RS and RT it'd just be:
awk -v RS='[0-9]+([.][0-9]+)?' -v ORS= 'RT{$0=$0 RT/2} 1'
e.g., borrowing @Wiktor's example:
$ s="abc_2010_and+2009+or-10.5"
$ awk -v RS='[0-9]+([.][0-9]+)?' -v ORS= 'RT{$0=$0 RT/2} 1' <<< "$s"
abc_1005_and+1004.5+or-5.25
If you want to overwrite an input file then add -i inplace:
awk -i inplace -v RS...1' file
System: Linux. Bash 4.
I have the following file, which will be read into a script as a variable:
/path/sample_A.bam A 1
/path/sample_B.bam B 1
/path/sample_C1.bam C 1
/path/sample_C2.bam C 2
I want to append "_string" at the end of the filename in the first column, just before the extension (.bam). It's a bit trickier because the name begins with the path.
Desired output:
/path/sample_A_string.bam A 1
/path/sample_B_string.bam B 1
/path/sample_C1_string.bam C 1
/path/sample_C2_string.bam C 2
My attempt:
I wrote the following script (run with: bash script.sh):
List=${1};
awk -F'\t' -vOFS='\t' '{ $1 = "${1%.bam}" "_string.bam" }1' < ${List} ;
And its output was:
${1%.bam}_string.bam
${1%.bam}_string.bam
${1%.bam}_string.bam
${1%.bam}_string.bam
Problem:
I followed the idea of using awk for this substitution as in this thread https://unix.stackexchange.com/questions/148114/how-to-add-words-to-an-existing-column , but the parameter expansion ${1%.bam} is clearly not being recognised by AWK as I intended. Does someone know the correct syntax for that part of the code? It was meant to mean "the whole first entry of the first column, except the trailing .bam". I used ${1%.bam} because it works in Bash, but AWK is another language and probably differs here. Thank you!
Note that the parameter expansion you applied to $1 won't work inside awk, because the entire body of the awk command is passed in '..', which sends the content literally without any shell parsing. Hence the string "${1%.bam}" is assigned as-is to the first column.
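If you would rather keep the %.bam parameter expansion, it has to run in the shell itself rather than inside the awk program. A minimal sketch with a read loop, assuming tab-separated columns and using a temporary file as a stand-in for the list:

```shell
tab=$(printf '\t')
list=$(mktemp)   # hypothetical stand-in for the real list file
printf '/path/sample_A.bam\tA\t1\n/path/sample_C2.bam\tC\t2\n' > "$list"

renamed=$(
    # The first column goes into "file"; the remaining columns stay in "rest".
    while IFS=$tab read -r file rest; do
        printf '%s_string.bam\t%s\n' "${file%.bam}" "$rest"
    done < "$list"
)
printf '%s\n' "$renamed"
rm -f "$list"
```

A shell loop like this is usually slower than awk on large files, so it is mainly useful for short lists.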
You can do this completely in Awk
awk -F'\t' 'BEGIN { OFS = FS }{ n=split($1, arr, "."); $1 = arr[1]"_string."arr[2] }1' file
The code splits the content of $1 on the delimiter . into an array arr within Awk. The part of the string up to the first . is stored in arr[1], and the subsequent pieces are stored in the following array indices. We then reconstruct the filename by concatenating arr[1], the string _string. and arr[2].
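Splitting on every . assumes the path contains exactly one dot. A sketch of a variant that only touches a trailing .bam in the first column, which may be safer if directory or sample names can contain dots (the sample path containing v1.2 is invented for illustration):

```shell
fixed=$(
    printf '/path/v1.2/sample_A.bam\tA\t1\n' |
        # sub() on $1 rewrites only the first column; OFS = FS keeps the tabs.
        awk -F'\t' 'BEGIN { OFS = FS } { sub(/\.bam$/, "_string.bam", $1) } 1'
)
printf '%s\n' "$fixed"
```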
If I understood your requirement correctly, could you please try following.
val="_string"
awk -v value="$val" '{sub(/\.bam/,value"&")} 1' Input_file
Brief explanation: -v value="$val" passes the value of the shell variable val to the awk variable value. The sub function then substitutes the string .bam with the contents of value followed by the matched text itself, which is denoted by &. Finally, 1 prints the line, edited or not.
Why the OP's attempt didn't work: in awk we can't use shell variables directly; they have to be passed in explicitly. What you tried is not treated as a variable by awk but as a plain string, which is printed as it is. The explanation above shows how to pass shell variables to awk.
NOTE: In case you have multiple occurrences of .bam on a line, please change sub to gsub in the above code. Also, in case your Input_file is TAB delimited, use awk -F'\t' in the above code.
sed -i 's/\.bam/_string\.bam/g' myfile.txt
It's a single line with sed. Just replace the .bam with _string.bam
You can try this way with awk :
awk -v a='_string' 'BEGIN{FS=OFS="."}{$1=$1 a}1' infile
I'm fairly new to the world of writing Bash scripts and need some guidance. I've begun writing a script for work, and so far so good. However, I'm now at a part that needs to collect database names. The names are actually stored in a file, and I can grep them.
The command I was given is cat /etc/oratab which produces something like this:
# This file is used by ORACLE utilities. It is created by root.sh
# and updated by the Database Configuration Assistant when creating
# a database.
# A colon, ':', is used as the field terminator. A new line terminates
# the entry. Lines beginning with a pound sign, '#', are comments.
#
# The first and second fields are the system identifier and home
# directory of the database respectively. The third filed indicates
# to the dbstart utility that the database should , "Y", or should not,
# "N", be brought up at system boot time.
#
OEM:/software/oracle/agent/agent12c/core/12.1.0.3.0:N
*:/software/oracle/agent/agent11g:N
dev068:/software/oracle/ora-10.02.00.04.11:Y
dev299:/software/oracle/ora-10.02.00.04.11:Y
xtst036:/software/oracle/ora-10.02.00.04.11:Y
xtst161:/software/oracle/ora-10.02.00.04.11:Y
dev360:/software/oracle/ora-11.02.00.04.02:Y
dev361:/software/oracle/ora-11.02.00.04.02:Y
xtst215:/software/oracle/ora-11.02.00.04.02:Y
xtst216:/software/oracle/ora-11.02.00.04.02:Y
dev298:/software/oracle/ora-11.02.00.04.03:Y
xtst160:/software/oracle/ora-11.02.00.04.03:Y
I then turned around and wrote grep ":/software/oracle/ora" /etc/oratab so it can grab everything I need, which is 10 databases. Not the most elegant way, but it gets what I need:
dev068:/software/oracle/ora-10.02.00.04.11:Y
dev299:/software/oracle/ora-10.02.00.04.11:Y
xtst036:/software/oracle/ora-10.02.00.04.11:Y
xtst161:/software/oracle/ora-10.02.00.04.11:Y
dev360:/software/oracle/ora-11.02.00.04.02:Y
dev361:/software/oracle/ora-11.02.00.04.02:Y
xtst215:/software/oracle/ora-11.02.00.04.02:Y
xtst216:/software/oracle/ora-11.02.00.04.02:Y
dev298:/software/oracle/ora-11.02.00.04.03:Y
xtst160:/software/oracle/ora-11.02.00.04.03:Y
So, how do I grab just the names, such as dev068 or xtst161? For what I need to do with this project moving forward, I think I should store them in an array. As mentioned in the documentation, a colon is the field terminator. How could I whip this together so I have an array, something like:
dev068
dev299
xtst036
xtst161
dev360
dev361
xtst215
xtst216
dev298
xtst160
I feel like I may be asking for too much assistance here but I'm truly at a loss. I would be happy to clarify if need be.
It is much simpler using awk:
awk -F: -v key='/software/oracle/ora' '$2 ~ key{print $1}' /etc/oratab
dev068
dev299
xtst036
xtst161
dev360
dev361
xtst215
xtst216
dev298
xtst160
To populate a BASH array with above output use:
mapfile -t arr < <(awk -F: -v key='/software/oracle/ora' '$2 ~ key{print $1}' /etc/oratab)
To check output:
declare -p arr
declare -a arr='([0]="dev068" [1]="dev299" [2]="xtst036" [3]="xtst161" [4]="dev360" [5]="dev361" [6]="xtst215" [7]="xtst216" [8]="dev298" [9]="xtst160")'
We can pipe the output of grep to the cut utility to extract the first field, taking colon as the field separator.
Then, assuming there are no whitespace or glob characters in any of the names (which would be subject to word splitting and filename expansion), we can use a command substitution to run the pipeline, and capture the output in an array by assigning it within the parentheses.
names=($(grep ':/software/oracle/ora' /etc/oratab| cut -d: -f1;));
Note that the above command actually makes use of word splitting on the command substitution output to split the names into separate elements of the resulting array. That is why we must be sure that no whitespace occurs within any single database name, otherwise that name would be internally split into separate elements of the array. The only characters within the command substitution output that we want to be taken as word splitting delimiters are the line feeds that delimit each line of output coming off the cut utility.
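To make the word-splitting caveat concrete, here is a sketch comparing the command-substitution form with mapfile (Bash 4+), which splits on newlines only. The entry my db is an invented name containing a space, and the temporary file stands in for /etc/oratab:

```shell
oratab=$(mktemp)   # hypothetical stand-in for /etc/oratab
printf '%s\n' \
    'dev068:/software/oracle/ora-10.02.00.04.11:Y' \
    'my db:/software/oracle/ora-11.02.00.04.02:Y' > "$oratab"

# Word splitting breaks "my db" into two elements.
split_names=($(grep ':/software/oracle/ora' "$oratab" | cut -d: -f1))

# mapfile keeps one element per line of output.
mapfile -t line_names < <(grep ':/software/oracle/ora' "$oratab" | cut -d: -f1)

echo "${#split_names[@]}"
echo "${#line_names[@]}"
rm -f "$oratab"
```

With real Oracle SIDs, which contain no whitespace, both forms produce the same array.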
You could also use awk for this:
awk -F: '!/^#/ && $2 ~ /^\/software\/oracle\/ora-/ {print $1}' /etc/oratab
The first pattern excludes any commented-out lines (starting with a #). The second pattern looks for your expected directory pattern in the second field. If both conditions are met it prints the first field, which is the Oracle SID. The -F: flag sets the field delimiter to a colon.
With your file that gets:
dev068
dev299
xtst036
xtst161
dev360
dev361
xtst215
xtst216
dev298
xtst160
Depending on what you're doing you could finesse it further and check that the last flag is set to Y; although that is really meant to indicate automatic start-up, it can sometimes be used to indicate that a database isn't active at all.
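A sketch of that extra check, assuming the flag is the third colon-separated field as described in the file's comments. The xtst999 entry with N is invented to show the filter, and the temporary file stands in for /etc/oratab:

```shell
oratab=$(mktemp)   # hypothetical stand-in for /etc/oratab
printf '%s\n' \
    '# comment line' \
    '*:/software/oracle/agent/agent11g:N' \
    'dev068:/software/oracle/ora-10.02.00.04.11:Y' \
    'xtst999:/software/oracle/ora-11.02.00.04.02:N' > "$oratab"

# Same filter as above, plus a test that the third field is exactly Y.
active=$(awk -F: '!/^#/ && $2 ~ /^\/software\/oracle\/ora-/ && $3 == "Y" {print $1}' "$oratab")
printf '%s\n' "$active"
rm -f "$oratab"
```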
And you can put the results into an array with:
declare -a DBS=(`awk -F: -v key='/software/oracle/ora' '$2 ~ key{print $1}' /etc/oratab`)
and then refer to ${DBS[1]} (which evaluates to dev299) etc.
If you'd like them into a Bash array:
$ cat > toarr.bash
#!/bin/bash
while read -r line
do
if [[ $line =~ .*Y$ ]] # they seem to end in a "Y"
then
arr[$((i++))]=${line%%:*}
fi
done < file
echo ${arr[*]} # here we print the array arr
$ bash toarr.bash
dev068 dev299 xtst036 xtst161 dev360 dev361 xtst215 xtst216 dev298 xtst160
I have a file with an argument
testArgument=
It may or may not have a value assigned, but I want to comment it out and add a new line with the supplied info.
Before:
testArgument=Something
Results:
#testVariable=Something
#Comments to let the user know of why the change
testVariable=NewSomething
Should I loop over it or should I use something like sed? It needs to be compatible with Ubuntu, Debian and bash.
You could use sed like this:
sed 's/^\(testArgument\)=.*/#&\n\n#Comment here\n\1=NewSomething/' file
& prints the full match in the replacement and \1 refers to the first capture group "testArgument".
To perform the substitution on the file in-place (i.e. replace the contents of the original file), add the -i switch. Otherwise, if you want to write the output to a new file, do sed '...' file > newfile.
If you are using a different version of sed that doesn't support \n newlines in the replacement, see this answer for some ways to deal with it.
Alternatively, using GNU awk:
gawk '/^testArgument/ {$0 = gensub(/^(testArgument)=.*/, "#\\0\n\n#Comment here\n\\1=NewSomething", 1)}1' file
You can use awk
awk '/^testArgument/ {$0="#"$0"\n\n#Comments to let the user know of why the change\ntestVariable=NewSomething"}1' file
cat file
some data
testArgument=Something
more data
awk '/^testArgument/ {$0="#"$0"\n\n#Comments to let the user know of why the change\ntestVariable=NewSomething"}1' file
some data
#testArgument=Something

#Comments to let the user know of why the change
testVariable=NewSomething
more data
To change the original file
awk 'code....' file > tmp && mv tmp file
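A slightly more defensive sketch of the same temp-file pattern, assuming mktemp is available. The file contents are a made-up sample, and the program uses print statements rather than \n inside a string, which is just one way to write it:

```shell
conf=$(mktemp)   # hypothetical stand-in for the real config file
tmp=$(mktemp)
printf '%s\n' 'some data' 'testArgument=Something' 'more data' > "$conf"

# Comment out the old line, then emit a blank line, a note and the new value.
awk '/^testArgument=/ {
        print "#" $0
        print ""
        print "#Comment here"
        print "testArgument=NewSomething"
        next
     } 1' "$conf" > "$tmp" && mv "$tmp" "$conf"

updated=$(cat "$conf")
printf '%s\n' "$updated"
rm -f "$conf"
```

Because mv replaces the file, the result takes on the temporary file's permissions; if that matters, cat "$tmp" > "$conf" keeps the original file in place.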