sed Pattern to Insert Comma after a character - bash

Add comma after closing bracket } using bash.
Using
sed 's/variable/&,/g;s/,$//'
adds comma after variable, however,
sed 's/}/&,/g;s/,$//'
doesn't work.
Input:
variable "policy_name1" {
description = "abc xyz"
type = string
default = [
"test1"
"test2"
]
}
variable "policy_name2" {
description = "abc xyz"
type = bool
default = false
}
Output:
variable "policy_name1" {
description = "abc xyz"
type = string
default = [
"test1"
"test2"
]
},
variable "policy_name2" {
description = "abc xyz"
type = bool
default = false
}

With sed, add a comma after the closing curly bracket, but not on the last line:
sed -e '$!s/^ \{4\}}$/&,/'
$!s/ On every line except the last, replace
^ \{4\}}$ a line that consists of exactly 4 spaces followed by }
/&,/ with the matched content followed by a comma.
Will render:
variable "policy_name1" {
description = "abc xyz"
type = array
default = [
"test1"
"test2"
]
},
variable "policy_name2" {
description = "abc xyz"
type = bool
default = false
},
variable "policy_name3" {
description = "simple test string..."
type = int
default = 42
}
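As a side note on why the attempt in the question fails: in s/}/&,/g;s/,$// every closing bracket is the last character on its line, so the second expression immediately strips the comma that the first one added. Below is a minimal sketch of the $! approach that tolerates any indentation instead of exactly 4 spaces. The function name add_brace_commas is made up for illustration, and the sketch assumes the closing bracket sits alone on its line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a comma to every line that holds only "}"
# (with optional leading whitespace), except when it is the last input line.
add_brace_commas() {
  sed -e '$!s/^[[:space:]]*}$/&,/' "$@"
}

# Demo on a small inline sample (not the real input file)
printf '%s\n' 'variable "a" {' '  type = string' '}' 'variable "b" {' '  type = bool' '}' |
  add_brace_commas
```

The function reads standard input when no file name is given, so it can also be called as add_brace_commas test.tf.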
More complex: add a comma at the end of each line, except the last line of each block:
sed -e ':a;N;/[^{\[(,] *\n/{/\n[ \o11]*[]})]/!s/\n/,\n/};P;D;$!ba'
Could render:
variable "policy_name1" {
description = "abc xyz",
type = array,
default = [
"test1",
"test2"
]
},
variable "policy_name2" {
description = "abc xyz",
type = bool,
default = false
},
variable "policy_name3" {
description = "simple test string...",
type = int,
default = 42
}
:a; Label "a" for later branching
N; Append the next line to the buffer
/[^{\[(,] *\n/ If the first line in the buffer ends with something other than a comma or an opening bracket (optionally followed by spaces)
{ then run this block:
/\n[ \o11]*[]})]/! if the next line does NOT start (after optional spaces or tabs) with a closing bracket
s/\n/,\n/ then place a comma before the newline
}; end of block
P; print the buffer up to the first newline
D; delete the buffer up to the first newline
$!ba If not on the last line, branch to "a"

Here is what you can do:
#!/usr/bin/env bash
FILENAME="test.tf"
# Number of lines in the file, minus one, so the last line is left alone
COUNT=$(wc -l < "$FILENAME")
COUNT=$((COUNT - 1))
sed "1,$COUNT s/}/},/" "$FILENAME"
This bash script excludes the last line by restricting sed to lines 1 through COUNT. It appends a comma to the first closing curly bracket ( } ) on each of those lines, so the closing bracket on the last line stays untouched.
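The line-count approach relies on the final closing bracket being the very last line of the file. A sketch that instead skips the last closing-bracket line, wherever it sits, could buffer the input in awk. The function name and the in-memory buffering are assumptions for illustration, so this suits files small enough to hold in memory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: add a comma after every line containing only "}",
# except the last such line, even if blank lines or comments follow it.
comma_except_last_brace() {
  awk '
    { line[NR] = $0 }                               # buffer every line
    /^[[:space:]]*[}][[:space:]]*$/ { last = NR }   # remember the last "}" line
    END {
      for (i = 1; i <= NR; i++) {
        is_brace = (line[i] ~ /^[[:space:]]*[}][[:space:]]*$/)
        suffix = (is_brace && i != last) ? "," : ""
        print line[i] suffix
      }
    }
  ' "$@"
}
```

It can be called as comma_except_last_brace test.tf, or fed through a pipe.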

Related

AWK print block that does NOT contain specific text

I have the following data file:
variable "ARM_CLIENT_ID" {
description = "Client ID for Service Principal"
}
variable "ARM_CLIENT_SECRET" {
description = "Client Secret for Service Principal"
}
# [.....loads of code]
variable "logging_settings" {
description = "Logging settings from TFVARs"
}
variable "azure_firewall_nat_rule_collections" {
default = {}
}
variable "azure_firewall_network_rule_collections" {
default = {}
}
variable "azure_firewall_application_rule_collections" {
default = {}
}
variable "build_route_tables" {
description = "List of Route Table keys that need direct internet prior to Egress FW build"
default = [
"shared_services",
"sub_to_afw"
]
}
There are two things I wish to do:
print the variable names without the double quotes
ONLY print the variable names if the code block does NOT contain default
I know I can print the variable names like so: awk '{ gsub("\"", "") }; (/variable/ && $2 !~ /^ARM_/) { print $2}'
I know I can print the code blocks with: awk '/variable/,/^}/', which results in:
# [.....loads of code output before this]
variable "logging_settings" {
description = "Logging settings from TFVARs"
}
variable "azure_firewall_nat_rule_collections" {
default = {}
}
variable "azure_firewall_network_rule_collections" {
default = {}
}
variable "azure_firewall_application_rule_collections" {
default = {}
}
variable "build_route_tables" {
description = "List of Route Table keys that need direct internet prior to Egress FW build"
default = [
"shared_services",
"sub_to_afw"
]
}
However, I cannot work out how to print the code blocks only "if" they don't contain default. I know I will need an if statement, and perhaps some variables, but I am unsure how.
This code block should NOT appear in the output for which I grab the variable name:
variable "build_route_tables" {
description = "List of Route Table keys that need direct internet prior to Egress FW build"
default = [
"shared_services",
"sub_to_afw"
]
}
End output should NOT contain those that had default:
# [.....loads of code output before this]
expressroute_settings
firewall_settings
global_settings
peering_settings
vnet_transit_object
vnet_shared_services_object
route_tables
logging_settings
Preferably, I would like to keep this as a single AWK command or file, with no piping. I have uses for this that prefer no piping.
EDIT: updated the ideal outputs (I had missed some examples of those with default)
Assumptions and collection of notes from OP's question and comments:
all variable definition blocks end with a right brace (}) in the first column of a new line
we only display variable names (sans the double quotes)
we do not display the variable names if the body of the variable definition contains the string default
we do not display the variable name if it starts with the string ARM_
One (somewhat verbose) awk solution:
NOTE: I've copied the sample input data into my local file variables.dat
awk -F'"' ' # use double quotes as the input field separator
/^variable / && $2 !~ "^ARM_" { varname = $2 # if line starts with "^variable ", and field #2 is not like "^ARM_", save field #2 for later display
printme = 1 # enable our print flag
}
/variable/,/^}/ { if ( $0 ~ "default" ) # within the range of a variable definition, if we find the string "default" ...
printme = 0 # disable the print flag
next # skip to next line
}
printme { print varname # if the print flag is enabled then print the variable name and then ...
printme = 0 # disable the print flag
}
' variables.dat
This generates:
logging_settings
$ awk -v RS= '!/default =/{gsub(/"/,"",$2); print $2}' file
ARM_CLIENT_ID
ARM_CLIENT_SECRET
[.....loads
logging_settings
Of course, the output doesn't match yours, since your expected output is inconsistent with the input data.
Using GNU awk:
awk -v RS="}" '/variable/ && !/default/ && !/ARM/ { var=gensub(/(^.*variable ")(.*)(".*{.*)/,"\\2",$0);print var }' file
Set the record separator to "}", then select records that contain "variable" but contain neither "default" nor "ARM". Use gensub to split the record into three sections with a regular expression and assign the second section to the variable var. Print var.
Output:
logging_settings
Another variation on awk using skip variable to control the array index holding the variable names:
awk '
/^[[:blank:]]*#/ { next }
$1=="variable" { gsub(/["]/,"",$2); vars[skip?n:++n]=$2; skip=0 }
$1=="default" { skip=1 }
END { if (skip) n--; for(i=1; i<=n; i++) print vars[i] }
' code
The first rule just skips comment lines. If you want to skip "ARM_" variables, then you can add a test on $2.
Example Use/Output
With your example code in code, all variables without default are:
$ awk '
> /^[[:blank:]]*#/ { next }
> $1=="variable" { gsub(/["]/,"",$2); vars[skip?n:++n]=$2; skip=0 }
> $1=="default" { skip=1 }
> END { if (skip) n--; for(i=1; i<=n; i++) print vars[i] }
> ' code
ARM_CLIENT_ID
ARM_CLIENT_SECRET
logging_settings
Here's another maybe shorter solution.
$ awk -F'"' '/^variable/&&$2!~/^ARM_/{v=$2} /default =/{v=0} /}/&&v{print v; v=0}' file
logging_settings
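The assumptions collected earlier in this thread (a block ends with } in the first column, names starting with ARM_ are skipped, blocks containing default are skipped) can also be sketched as a small state machine that does not depend on blank lines between blocks. The function name and the sample data in the usage note are made up for illustration. It treats a line as a default only when it begins with default followed by =, which is an assumption about the file layout.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print variable names whose block has no "default =" line.
vars_without_default() {
  awk -F'"' '
    /^variable / { name = $2; has_default = 0; in_block = 1; next }   # new block
    in_block && /^[[:space:]]*default[[:space:]]*=/ { has_default = 1 }
    in_block && /^[}]/ {                                              # block ends
      if (!has_default && name !~ /^ARM_/) print name
      in_block = 0
    }
  ' "$@"
}
```

For example, vars_without_default variables.dat reads the file directly, so no pipe is needed.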

Extract json value on regex on bash script

How can I get the values inside depends in a bash script?
manifest.py
# Commented lines
{
'category': 'Sales/Subscription',
'depends': [
'sale_subscription',
'sale_timesheet',
],
'auto_install': True,
}
Expected response:
sale_subscription sale_timesheet
The main problem is the line breaks. I have already tried | grep depends, but I cannot get the sale_timesheet value.
I'm trying to add these values, coming from several files, into a variable, like:
DOWNLOADED_DEPS=($(ls -A $DOWNLOADED_APPS | while read -r file; do cat $DOWNLOADED_APPS/$file/__manifest__.py | [get depends value]; done))
Example updated.
If this is your JSON file:
{
"category": "Sales/Subscription",
"depends": [
"sale_subscription",
"sale_timesheet"
],
"auto_install": true
}
You can get the desired result using jq like this:
jq -r '.depends | join(" ")' YOURFILE.json
This uses .depends to extract the value from the depends field, pipes it to join(" ") to join the array with a single space in between, and uses -r for raw (unquoted) output.
If it is not a JSON file but just a string, you can use the regex below to find the values. If it is a JSON file, you can use other methods, such as the one Thomas suggested.
^'depends':\s*(?:\[\s*)(.*?)(?:\])$
demo
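The pattern uses PCRE syntax ((?:...) and .*?) and has to match across lines, which egrep does not support, so you can use pcregrep in multiline mode as follows:
% pcregrep -M "(?s)^\s*'depends':\s*\[\s*(.*?)\]" pathTo/jsonFile.txt
You can read more about it in the pcregrep documentation.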
As @Thomas pointed out in a comment, the OP's input data is not in JSON format:
$ cat manifest.py
# Commented lines // comments not allowed in JSON
{
'category': 'Sales/Subscription', // single quotes should be replaced by double quotes
'depends': [
'sale_subscription',
'sale_timesheet', // trailing comma at end of section not allowed
],
'auto_install': True, // trailing comma issue; should be lower case "true"
}
And while the title of the question mentions regex, there is no sign of a regex in the question. I'll leave a regex based solution for someone else to come up with and instead ...
One (quite verbose) awk solution based on the input looking exactly like what's in the question:
$ awk -F"'" ' # use single quote as field separator
/depends/ { printme=1 ; next } # if we see the string "depends" then set printme=1
printme && /]/ { printme=0 ; next} # if printme=1 and line contains a right bracket then set printme=0
printme { printf "%s%s", pfx, $2; pfx=" " } # if printme=1 then print a prefix + field #2;
# first time around pfx is undefined;
# subsequent passes will find pfx set to a space;
# since using "printf" with no "\n" in sight, all output will stay on a single line
END { print "" } # add a linefeed on the end of our output
' json.dat
This generates:
sale_subscription sale_timesheet
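To connect this back to the loop in the question, here is a rough sketch that runs the same awk idea over every __manifest__.py under a directory and prints one dependency per line. The function names are hypothetical, and it assumes each dependency sits on its own line between the line containing depends and the line containing the closing bracket, as in the sample.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; assumes one dependency per line between the
# line containing depends and the line containing "]".
manifest_depends() {
  awk -F"'" '
    /depends/          { inside = 1; next }   # start collecting after this line
    inside && /\]/     { inside = 0; next }   # closing bracket ends the list
    inside && NF >= 3  { print $2 }           # text between the first pair of quotes
  ' "$@"
}

collect_depends() {
  for manifest in "$1"/*/__manifest__.py; do
    if [ -f "$manifest" ]; then manifest_depends "$manifest"; fi
  done
}

# Possible usage (bash 4+), one array element per dependency:
#   mapfile -t DOWNLOADED_DEPS < <(collect_depends "$DOWNLOADED_APPS")
```

Printing one name per line, rather than a space-separated string, keeps the output easy to load into an array without relying on word splitting.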

Remove duplicate lines in braces

I have a file that contains:
l1_lololo {
abcdef
vgjhklfgkchbnvu
gfuhjfythkjbgftyhkjgyftuihgt6
deefgik
abcdef
}
l2_blabla {
123456
vgghyfthjfgtrdygfhhbnvu
gfuhjgvftdyfgvjgyftuihgt6
deiulouk
123456
}
I need to check the text inside each pair of braces with sed/awk/bash/etc. and remove duplicate lines, keeping only the first occurrence of each recurring line within each pair of braces. I need to get this:
l1_lololo {
abcdef
vgjhklfgkchbnvu
gfuhjfythkjbgftyhkjgyftuihgt6
deefgik
}
l2_blabla {
123456
vgghyfthjfgtrdygfhhbnvu
gfuhjgvftdyfgvjgyftuihgt6
deiulouk
}
How can I do this?
If you can guarantee that the blocks end with a line containing only }, it could be done as simply as:
awk '/^}$/ {delete a} !a[$0]++' input
If you need a more robust solution, perhaps just add some whitespace to the pattern to match the end of a block. But if you want a full parser and want to match braces carefully, awk is probably not suited for the task.
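The whitespace-tolerant variant hinted at above might look like the following sketch. The function name is made up, and split("", seen) is used to clear the array because it works in awk implementations that lack the delete-whole-array form. Lines are compared verbatim, so two copies of a line with different indentation count as different lines.

```shell
#!/usr/bin/env bash
# Hypothetical helper: within each brace block, keep only the first
# occurrence of every line; a line holding only "}" ends the block.
dedup_blocks() {
  awk '
    /^[[:space:]]*[}][[:space:]]*$/ { split("", seen); print; next }  # block ends: forget its lines
    !seen[$0]++                                                      # print first occurrence only
  ' "$@"
}
```

Called as dedup_blocks input, it behaves like the one-liner above but also accepts an indented closing brace.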
If you're open to other languages, this is really easy to do in tcl thanks to the input being in tcl list format, allowing you to use it to do all the parsing without any potentially fragile regular expressions:
#!/usr/bin/env tclsh
package require Tcl 8.5
foreach {key lst} [read stdin] {
foreach item $lst { dict set seen $item 1 }
puts "$key {\n\t[join [dict keys $seen] \n\t]\n}\n"
unset seen
}
Example:
$ ./dedup < input.txt
l1_lololo {
abcdef
vgjhklfgkchbnvu
gfuhjfythkjbgftyhkjgyftuihgt6
deefgik
}
l2_blabla {
123456
vgghyfthjfgtrdygfhhbnvu
gfuhjgvftdyfgvjgyftuihgt6
deiulouk
}
The desired result can be achieved with the following Perl code (data stored in a hash):
use strict;
use warnings;
use feature 'say';
my $data = do{ local $/; <DATA> }; # read whole data
my %seen;
my %records = $data =~ /(\w+)\s+\{\s*(.*?)\s*\}/sg; # split into records
while( my($k,$v) = each %records ) { # for each record split into array
my @array = map { if( not $seen{$_} ) { $seen{$_} = 1; $_ } } split '\s+', $records{$k}; # store uniq elements
pop @array; # pop out last empty element
$records{$k} = \@array; # store array in hash
}
while( my($k,$v) = each %records ) { # each record
say "$k = {"; # output hash key
say "\t$_" for @{$v}; # output each element of array
say "}\n"; # done
}
__DATA__
l1_lololo {
abcdef
vgjhklfgkchbnvu
gfuhjfythkjbgftyhkjgyftuihgt6
deefgik
abcdef
}
l2_blabla {
123456
vgghyfthjfgtrdygfhhbnvu
gfuhjgvftdyfgvjgyftuihgt6
deiulouk
123456
}
Output
l1_lololo = {
abcdef
vgjhklfgkchbnvu
gfuhjfythkjbgftyhkjgyftuihgt6
deefgik
}
l2_blabla = {
123456
vgghyfthjfgtrdygfhhbnvu
gfuhjgvftdyfgvjgyftuihgt6
deiulouk
}
This might work for you (GNU sed):
sed -E '/^\S+ \{/{:a;N;s/((\n[^\n]*)(\n.*)*)\2$/\1/;/\n\}$/!ba}' file
If a line begins with some text followed by a {, append the next line and remove the last line if it matches a preceding line. Repeat the latter until a line containing only a } and print the result.

How to match string and print lines within curly braces { } from config file

I want to print the lines within { and } with assign where "mango" in Hostgroups
object Host "os.google.com" {
import "windows"
address = "linux.google.com"
groups = ["linux"]
}
object Host "mango.google.com" {
import "windows"
address = "mango.google.com"
groups = ["linux"]
assign where "mango" in Hostgroups
}
Desired output:
object Host "mango.google.com" {
import "windows"
address = "mango.google.com"
groups = ["linux"]
assign where "mango" in Hostgroups
}
Try this awk script
script.awk
/{/,/}/ { #define record range from { to }
if ($0 ~ "{") rec = $0; # if record opening reset rec variable with current line
else rec = rec "\n" $0; # else accumulate the current line in rec
if ($0 ~ /assign where "mango" in Hostgroups/) { # if found exit pattern in current line
print rec; # print the rec
exit; # terminate
}
}
executions:
awk -f script.awk input.txt
output:
object Host "mango.google.com" {
import "windows"
address = "mango.google.com"
groups = ["linux"]
assign where "mango" in Hostgroups
This might work for you (GNU sed):
sed -n '/{/h;//!H;/}/{g;/assign where "mango" in Hostgroups/p}' file
Turn off sed's automatic printing with the -n option and gather the lines between curly braces in the hold space. At the closing curly brace, replace the pattern space with the contents of the hold space and, if it matches assign where "mango" in Hostgroups, print it.
Assuming } doesn't appear in any other context in your input:
$ awk -v RS='}' '
/assign where "mango" in Hostgroups/ {
sub(/^[[:space:]]+\n/,"")
print $0 RS
}
' file
object Host "mango.google.com" {
import "windows"
address = "mango.google.com"
groups = ["linux"]
assign where "mango" in Hostgroups
}
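A further sketch along the same lines buffers each block and prints it, closing brace included, only if one of its lines contains the search text. The function name is hypothetical. It assumes blocks are not nested and that the search text contains no backslashes, since awk -v interprets escape sequences in the assigned value.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every { ... } block that contains a given string.
# Usage: blocks_matching 'search text' [file...]
blocks_matching() {
  local needle=$1
  shift
  awk -v needle="$needle" '
    /[{]/ { block = ""; hit = 0; inside = 1 }   # opening brace starts a new block
    inside {
      block = block $0 "\n"                     # accumulate the block text
      if (index($0, needle)) hit = 1            # plain substring test, no regex
    }
    inside && /[}]/ {                           # closing brace ends the block
      if (hit) printf "%s", block
      inside = 0
    }
  ' "$@"
}
```

Because it does not stop at the first hit, it prints every matching block in the file, for example with blocks_matching 'assign where "mango" in Hostgroups' input.txt.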

bash: content of file to string var, preserve formatting, include char escaping if required

I have a few lines of code in a file
(the code contains a few newlines, tabs, strings and pattern strings).
I want to get the content of this file as a string value,
so that it can be sent as the string value of a parameter in JSON:
{param1: "value1", code: "code-content-from-file-should-go-here"}
Let's say the file content is
function string.urlDecode(str)
if string.isEmpty(str) then return str end
str = string.gsub(str, "+", " ")
str = string.gsub(str, "%%(%x%x)", function(h) return string.char(tonumber(h, 16)) end)
str = string.gsub(str, "\r\n", "\n")
return str
end
which should get converted to (what I see here is newline, tabs, in general code formatting is preserved, " \ etc are escaped)
function string.urlDecode(str)\n if string.isEmpty(str) then return str end\n str = string.gsub(str, \"+\", \" \")\n str = string.gsub(str, \"%%(%x%x)\", function(h) return string.char(tonumber(h, 16)) end)\n str = string.gsub(str, \"\\r\\n\", \"\\n\")\n return str\nend
So that json becomes
{param1: "value1", code: "function string.urlDecode(str)\n if string.isEmpty(str) then return str end\n str = string.gsub(str, \"+\", \" \")\n str = string.gsub(str, \"%%(%x%x)\", function(h) return string.char(tonumber(h, 16)) end)\n str = string.gsub(str, \"\\r\\n\", \"\\n\")\n return str\nend"}
While the conversion of the file content to a string in the manner described above can be done
using sed (taken from a few related Stack Overflow threads such as How can I replace a newline (\n) using sed?),
I would have to handle each case myself: newlines, tabs, ", \, and any other special characters that need to be escaped (which I don't know).
Is there any bash command (or maybe a Python module) that can handle all such cases when converting code content from a file to a string?
This seems like a common use case for anyone who wants to send code content in JSON.
If the content is in file.txt:
function encode {
    local input=$1
    local output i ic oc
    for ((i=0;i<${#input};i+=1)); do
        ic=${input:$i:1}
        if [[ $ic = $'\n' ]]; then
            oc='\n'
        elif [[ $ic = '\' || $ic = '"' ]]; then
            oc='\'$ic
        # [[ $ic < $'\040' ]] # works only if LC_COLLATE=C or LC_ALL=C
        elif (( $(printf "%d" "'$ic") < 32 )); then
            # JSON has no octal escapes; other control characters are written as \uXXXX
            oc=$(printf '\\u%04x' "'$ic")
        else
            oc=$ic
        fi
        output=$output$oc
    done
    printf '%s\n' "$output"
}
printf '{param1: "%s", code: "%s"}' "value1" "$(encode "$(<file.txt)")"
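If jq is available, a shorter sketch is to let it do all the escaping. The function name below is made up for illustration. Note that jq emits strict JSON, so the keys come out quoted ("param1", "code"), unlike the unquoted keys shown in the question.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the JSON payload with jq doing all the escaping.
# -R reads the input as raw text, -s slurps the whole file into one string,
# -c prints compact output, --arg passes a shell value in as a jq string.
json_payload() {
  jq -R -s -c --arg p "$1" '{param1: $p, code: .}' "$2"
}

# Possible usage: json_payload "value1" file.txt
```

Unlike the $(<file.txt) approach, this keeps any trailing newline of the file in the encoded string.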
