Sed: print the items in square brackets that come after the occurrence of a text - bash

I have the following Scenarios:
Scenario 1
foo_bar = ["123", "456", "789"]
Scenario 2
foo_bar = [
"123",
"456",
"789"
]
Scenario 3
variable "foo_bar" {
type = list(string)
default = ["123", "456", "789"]
}
So I'm trying to figure out how to print, with sed, the items inside the brackets that belong to foo_bar, accounting for scenario 2, which is multiline.
The resulting matches here would be
Scenario 1
"123", "456", "789"
Scenario 2
"123",
"456",
"789"
Scenario 3
"123", "456", "789"
In the case of
not_foo_bar = [
"123",
"456",
"789"
]
This should not match, only match foo_bar
This is what I've tried so far
sed -e '1,/foo_bar/d' -e '/]/,$d' test.tf
And this
sed -n 's/.*\foo_bar\(.*\)\].*/\1/p' test.tf

This is a mouthful, but it’s POSIX sed and works.
sed -Ene \
'# scenario 1
s/(([^[:alnum:]_]|^)foo_bar[^[:alnum:]_][[:space:]]*=[[:space:]]*\[)([^]]+)(\]$)/\3/p
# scenario 2 and 3
/([^[:alnum:]_]|^)foo_bar[^[:alnum:]_][[:space:]]*=?[[:space:]]*[[{][[:space:]]*$/,/^[]}]$/ {
//!p
s/(([^[:alnum:]_]|^)default[^[:alnum:]_][[:space:]]*=[[:space:]]*\[)([^]]+)(\]$)/\3/p
}' |
# filter out unwanted lines from scenario 3 ("type =")
sed -n '/^[[:space:]]*"/p'
I couldn’t quite get it all in a single sed.
The first and last substitutions in the first sed are the same command (the last one uses default instead of foo_bar).
edit: in case it confuses someone, I left in that last [[:space:]]*, in the second really long regex, by mistake. I won’t edit it, but it’s not vital, nor consistent - I didn’t allow for any trailing whitespace in line ends in other patterns.

This might work for you (GNU sed):
sed -En '/foo_bar/{:a;/.*\[([^]]*)\].*/!{N;ba};s//\1/p}' file
Turn off implicit printing and turn on extended regexps (-nE).
Pattern match on foo_bar, then gather up line(s) between the next [ and ] and print the result.
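Note that a bare /foo_bar/ address also matches a line such as not_foo_bar = [. A minimal sketch of the same gather-and-print idea, restricted to foo_bar as a whole word, might look like the following. The function name print_foo_bar_items is hypothetical, and the script assumes a sed that accepts -E (GNU and BSD sed both do).

```shell
# Hypothetical helper: print what sits between [ and ] after a whole-word foo_bar.
print_foo_bar_items() {
    sed -En '
/(^|[^[:alnum:]_])foo_bar([^[:alnum:]_]|$)/{
:a
# Keep appending lines until the pattern space holds a closed [...] pair.
/\[[^]]*\]/!{
# Give up quietly if the input ends before the closing bracket.
$d
N
ba
}
# Keep only the text between the brackets.
s/.*\[([^]]*)\].*/\1/
# Drop the newline right after [ and right before ] in the multiline case.
s/^\n//
s/\n$//
p
}' "$@"
}

printf '%s\n' 'not_foo_bar = [' '"000"' ']' 'foo_bar = [' '"123",' '"456"' ']' |
    print_foo_bar_items
```

The explicit character classes stand in for word boundaries, so the not_foo_bar block is skipped while the quoted "foo_bar" of scenario 3 still matches.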

Related

How to convert [abc\tdef\tghi] to ["abc" "def" "ghi"] in Mac Terminal?

I have the following variable in Terminal
echo $VAR1
abc def ghi
<- Separated by tab
How can I convert this to
"abc" "def" "ghi"
with single space in-between?
In zsh, you can break your string up into an array of quoted words with
var1=$'abc\tdef\tghi'
words=( "${(qqq)=var1}" )
and then turn it back into a single string if wanted with
var2="${words[*]}"
printf "%s\n" "$var2" # prints "abc" "def" "ghi"
Or to skip the intermediate array if you don't need it:
var2=${(A)${(qqq)=var1}}
Assuming the variable contains actual tab characters (not backslashes followed by "t"), you can replace tabs with " " while expanding the variable (see here, a ways down in the list of modifiers), and also add quotes at the beginning and end, like this:
"\"${VAR1//$'\t'/\" \"}\""
There's a rather complex mix of quoting and escaping modes here. The double-quotes at the very beginning and end make the whole thing a double-quoted string, so the shell doesn't do anything weird to whitespace in it. The various escaped double-quotes in it will all be treated as literal characters (because they're escaped), and just be part of the output. And the pattern string, $'\t', is in ANSI-C quoting mode, so that the \t gets converted to an actual tab character.
Here's a couple of examples of using it:
% VAR1=$'abc\tdef\tghi' # Define a variable with actual tab characters
% echo "\"${VAR1//$'\t'/\" \"}\"" # Pass the converted version to a command
"abc" "def" "ghi"
% VAR2="\"${VAR1//$'\t'/\" \"}\"" # Store converted version in another variable
% echo "$VAR2"
"abc" "def" "ghi"
This could do what you want (GNU sed):
echo -e "abc\tdef\tghi\tjhg\tmnb" | sed -ne 's/\t/" "/g; s/.*/"\0"/p'
Result:
"abc" "def" "ghi" "jhg" "mnb"
You may leverage awk. For example:
user$ var='abc\tdef\tghi'
user$ echo -e ${var}
(output)>>> abc def ghi
user$ echo -e ${var} | awk -F '\t' '{ for (i=1; i <NF; i++) {printf "\"%s\" ", $i}; printf "\"%s\"\n", $NF}'
(output)>>> "abc" "def" "ghi"
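The sed answer above relies on \t in the pattern and \0 in the replacement, which are not part of POSIX sed, so they may not behave the same way in a non-GNU sed. A minimal sketch that sticks to portable constructs, assuming the input really contains tab characters; the function name quote_fields and the variable tab are made up for illustration:

```shell
# Hypothetical helper: wrap each tab-separated field in double quotes.
quote_fields() {
    # Build a literal tab so the sed pattern does not depend on \t support.
    tab=$(printf '\t')
    # First turn every tab into `" "`, then wrap the whole line in quotes.
    # `&` is the portable way to refer to the whole match.
    sed -e "s/${tab}/\" \"/g" -e 's/.*/"&"/'
}

printf 'abc\tdef\tghi\n' | quote_fields
```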

sed replace every word with single quotes with double quotes

I'm trying to parse a file with single quotes, and want to change it to double quotes.
Sample data :
{'data': ['my',
'my_other',
'my_other',
'my_other',
'This could 'happen' <- and this ones i want to keep',
],
'name': 'MyName'},
'data2': {'type': 'x86_64',
'name': 'This',
'type': 'That',
'others': 'bla bla 'bla' <- keep this ones too',
'version': '21237821972'}}}
Desired output :
{"data": ["my",
"my_other",
"my_other",
"my_other",
"This could 'happen' <- and this ones i want to keep"
],
"name": "MyName"},
"data2": {"type": "x86_64",
"name": "This",
"type": "That",
"others": "bla bla 'bla' <- keep this ones too",
"version": "21237821972"}}}
I've already tried some regexes with sed, but with no luck.
I understand why this is not working for me; I just don't know how to go further to get the data the way I want.
sed -E -e "s/( |\{|\[)'(.*)'(\:|,|\]|\})/\1\"\2\"\3/g"
Cheers,
I am no expert in jq, so, as per the OP's question, here is an attempt in awk to substitute ' with ".
awk -v s1="\"" '!/This could/ && !/others/{gsub(/\047/,s1) } /This could/ || /others/{gsub(/\047/,s1,$1)} 1' Input_file
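The output will be as follows. Note that on the two lines with embedded quotes only the first field is converted, so the remaining delimiting quotes on those lines are left as they are.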
{"data": ["my",
"my_other",
"my_other",
"my_other",
"This could 'happen' <- and this ones i want to keep',
],
"name": "MyName"},
"data2": {"type": "x86_64",
"name": "This",
"type": "That",
"others": 'bla bla 'bla' <- keep this ones too',
"version": "21237821972"}}}
We know that the sed command can search for a pattern and replace it with a new one provided by the user.
For example: sed "s/pattern1/pattern2/g" filename.txt
Here sed searches for pattern1 and, wherever it is found, replaces it with pattern2.
For your requirement you just need to apply this rule. See below.
First:
sed "s/^'/\"/" yourfile
This finds every line that begins with ' and replaces that ' with ".
The next requirement is to search for the pattern ': and replace it with ":
So add one more substitution, separated by ;
sed "s/^'/\"/; s/':/\":/g" yourfile
Note that the ' needs no backslash inside double quotes; GNU sed would read \' in a pattern as "end of pattern space" rather than as a literal quote.
Just follow this approach until you meet your requirement.
The final command should look like this:
sed "s/^'/\"/; s/':/\":/g; s/{'/{\"/g; s/\['/[\"/g; s/',/\",/g; s/'}/\"}/g; s/: '/: \"/g" yourfile > newfile
(If the above command gives you an error, just use the command at the very beginning.)
Finally:
mv newfile yourfile
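As a rough sketch of that step-by-step approach, the substitutions can be collected in one function and tried on a sample line. The function name requote is hypothetical, and the rules are heuristics tuned to the sample data in the question (a value that itself contains ': ' or ends in ', would be altered), so this is not a general parser.

```shell
# Hypothetical helper: convert the delimiting single quotes to double quotes,
# leaving quotes inside a value (such as 'happen') untouched.
requote() {
    sed -e "s/^'/\"/" \
        -e "s/\([{[]\)'/\1\"/g" \
        -e "s/': '/\": \"/g" \
        -e "s/':/\":/g" \
        -e "s/',\$/\",/" \
        -e "s/'\([]}]\)/\"\1/g"
}

# Rules, in order: quote at line start, quote after { or [, the ': ' between
# key and value, quote before a colon, quote before a trailing comma,
# quote before ] or }.
printf '%s\n' "'others': 'bla bla 'bla' <- keep this ones too'," | requote
```

Anchoring the comma rule to the end of the line is what keeps an inner quote followed by a space, as in 'bla' <-, out of reach of the substitutions.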

Replace a multiline pattern using Perl, sed, awk

I need to concatenate multiple JSON files, so
...
"tag" : "description"
}
]
[
{
"tag" : "description"
...
into this :
...
"tag" : "description"
},
{
"tag" : "description"
...
So I need to replace the pattern ] [ with a comma, but the newline character is driving me crazy...
I used several methods, I list some of them:
sed
sed -i '/]/,/[/{s/./,/g}' file.json
but I get this error:
sed: -e expression #1, char 16: unterminated address regex
I tried to delete all the newlines
following this example
sed -i ':a;N;$!ba;s/\n/ /g' file.json
and the output file has "^M". Although I modified this file in unix, I used the dos2unix command on this file but nothing happens. I tried then to include the special character "^M" on the search but with worse results
Perl
(as proposed here)
perl -i -0pe 's/]\n[/\n,/' file.json
but I get this error:
Unmatched [ in regex; marked by <-- HERE in m/]\n[ <-- HERE / at -e line 1.
I would like to concatenate several JSON files.
If I understand correctly, you have something like the following (where letters represent valid JSON values):
to_combine/file1.json: [a,b,c]
to_combine/file2.json: [d,e,f]
And from that, you want the following:
combined.json: [a,b,c,d,e,f]
You can use the following to achieve this:
perl -MJSON::XS -0777ne'
push @data, @{ decode_json($_) };
END { print encode_json(\@data); }
' to_combine/*.json >combined.json
As for the problem with your Perl solution:
[ has a special meaning in regex patterns. You need to escape it.
You only perform one replacement.
-0 doesn't actually turn on slurp mode. Use -0777.
You place the comma after the newline, when it would be nicer before the newline.
Fix:
cat to_combine/*.json | perl -0777pe's/\]\n\[/,\n/g' >combined.json
Note that a better way to combine multiple JSON files is to parse them all, combine the parsed data structure, and reencode the result. Simply changing all occurrences of ][ to a comma , may alter data instead of markup
sed is a minimal program that, by default, operates on a single line of a file at a time. Perl encompasses everything that sed or awk will do and a huge amount more besides, so I suggest you stick with it
To change all ]...[ pairs in file.json (possibly separated by whitespace) to a single comma, use this
perl -0777 -pe "s/\]\s*\[/,/g" file.json > file2.json
The -0 option specifies an octal line separator, and giving it the value 777 makes perl read the entire file at once
One-liners are famously unintelligible, and I always prefer a proper program file, which would look like this
join_brackets.pl
use strict;
use warnings 'all';
my $data = do {
local $/;
<>;
};
$data =~ s/ \] \s* \[ /,/gx;
print $data;
and you would run it as
perl join_brackets.pl file.json > joined.json
I tried this (GNU sed) with the example in your question.
$ sed -rn '
1{$!N;$!N}
$!N
/\s*}\s*\n\s*]\s*\n\s*\[\s*\n\s*\{\s*/M {
s//\},\n\{/
$!N;$!N
}
P;D
' file
...
"tag" : "description"
},
{
"tag" : "description"
...
...
"tag" : "description"
},
{
"tag" : "description"
...

Is it possible in Ruby to print a part of a regex (group) and instead of the whole matched substring?

Is it possible in sed, or maybe even in Ruby, to remember the matched part of a pattern and print it instead of the full string that was matched:
"aaaaaa bbb ccc".strip.gsub(/([a-z])+/, \1) # \1 as a part of the regex which I would like to remember and print then instead of the matched string.
# => "a b c"
I think in sed it should be possible with its h (hold space) command or similar, but what about Ruby?
Not sure what you mean. Here are a few examples:
print "aaaaaa bbb ccc".strip.gsub(/([a-z])+/, '\1')
# => "a b c"
And,
print "aaaaaa bbb ccc".strip.scan(/([a-z])+/).flatten
# => ["a", "b", "c"]
The shortest answer is grep:
echo "aaaaaa bbb ccc" | grep -o '\<.'
You can do:
a = "aaaaaa bbb ccc".split
and then join that array back together with the first character of each element
[a[0][0,1], a[1][0,1], a[2][0,1], a[3][0,1], ... ].join(" ")
@glennjackman's suggestion: ruby -ne 'puts $_.split.map {|w| w[0]}.join(" ")'
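Since the question also asks about sed: no hold space is needed there, because a capture group in the substitution is enough. A minimal sketch, assuming words made of lowercase ASCII letters as in the example (the function name first_letters is made up):

```shell
# Hypothetical helper: keep only the first letter of every lowercase word.
first_letters() {
    # ([a-z]) remembers the first letter, [a-z]* swallows the rest of the word,
    # and \1 puts back just the remembered letter.
    sed -E 's/([a-z])[a-z]*/\1/g'
}

printf '%s\n' "aaaaaa bbb ccc" | first_letters   # prints: a b c
```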

Bash to match pattern in filename then add/edit

I'm sure this has been answered before, but I can't seem to use the right search terms to find it.
I'm trying to write a bash script that can recognize, sort, and rename files based on patterns in their names.
Take this filename, for example: BBC Something Something 3 of 5 Blah 2007.avi
I would like the script to recognize that since the filename starts with BBC and contains something that matches the pattern "DIGIT of DIGIT," the script should rename it by removing the BBC at the front, inserting the string "s01e0" in front of the 3, and removing the "of 5," turning it into Something Something s01e03 Blah 2007.avi
In addition, I'd like for the script to recognize and deal differently with a file named, for example, BBC Something Else 2009.mkv . In this case, I need the script to recognize that since the filename starts with BBC and ends with a year, but does not contain that "DIGIT of DIGIT" pattern, it should rename it by inserting the word "documentaries" after BBC and then copying and pasting the year after that, so that the filename would become BBC documentaries 2009 Something Else.mkv
I hope this isn't asking for too much help... I've been working on this myself all day, but this is literally all I've got:
topic1 () {
if [ "$2" = "bbc*[:digit:] of [:digit:]" ]; then
And then nothing. I'd love some help! Thanks!
Use grep to match filenames that need to be changed and then sed to actually change them:
#!/bin/bash
get_name()
{
local FILENAME="${1}"
local NEWNAME=""
# check if input matches our criteria
MATCH_EPISODE=$(echo "${FILENAME}" | grep -c "BBC.*[0-9] of [0-9]")
MATCH_DOCUMENTARY=$(echo "${FILENAME}" | grep -c "BBC.*[0-9]\{4\}")
# if it matches then modify
if [ "${MATCH_EPISODE}" = "1" ]; then
NEWNAME=$(echo "${FILENAME}" | sed -e 's/BBC\(.*\)\([0-9]\) of [0-9]\(.*\)/\1 s01e0\2 \3/')
elif [ "${MATCH_DOCUMENTARY}" = "1" ]; then
NEWNAME=$(echo "${FILENAME}" | sed -e 's/BBC\(.*\)\([0-9]\{4\}\)\(.*\)/BBC documentaries \2 \1 \3/')
fi
# clean up: remove leading spaces, collapse runs of spaces, remove spaces before dot
echo "${NEWNAME}" | sed -e 's/^ *//' -e 's/  */ /g' -e 's/ \./\./g'
}
FN1="BBC Something Something 3 of 5 Blah 2007.avi"
FN2="BBC Something Else 2009.mkv"
FN3="Something Not From BBC.mkv"
NN1=$(get_name "${FN1}")
NN2=$(get_name "${FN2}")
NN3=$(get_name "${FN3}")
echo "${FN1} -> ${NN1}"
echo "${FN2} -> ${NN2}"
echo "${FN3} -> ${NN3}"
The output is:
BBC Something Something 3 of 5 Blah 2007.avi -> Something Something s01e03 Blah 2007.avi
BBC Something Else 2009.mkv -> BBC documentaries 2009 Something Else.mkv
Something Not From BBC.mkv ->
Let's look at one of the sed invocations:
sed -e 's/BBC\(.*\)\([0-9]\) of [0-9]\(.*\)/\1 s01e0\2 \3/'
We use capture groups to match interesting portions of the filename:
BBC - match literal BBC,
\(.*\) - match everything and remember it in capture group 1, until
\([0-9]\) - a digit, remember it in capture group 2, then
of [0-9] - match literal " of " and digit,
\(.*\) - match rest and remember it in capture group 3
and then put them in positions we want:
\1 - content of capture group 1, i.e. everything between "BBC" and first digit
s01e0 - literal " s01e0"
\2 - content of capture group 2, i.e. episode number
\3 - content of capture group 3, i.e. everything else
This may result in many superfluous spaces so at the end there is another sed invocation to clean that up.
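The same capture-group idea can be sketched without the grep step and the clean-up pass, by letting a case statement do the classification and by matching the surrounding spaces inside the sed pattern. The function name new_name is hypothetical, and this variant prints names that match neither rule unchanged instead of printing an empty string.

```shell
# Hypothetical helper: classify a filename and print the new name.
new_name() {
    case $1 in
        # Starts with BBC and contains "DIGIT of DIGIT": episode naming.
        BBC*[0-9]\ of\ [0-9]*)
            printf '%s\n' "$1" |
                sed -E 's/^BBC +(.*) ([0-9]) of [0-9] (.*)$/\1 s01e0\2 \3/' ;;
        # Starts with BBC and ends with a four-digit year before the extension.
        BBC*[0-9][0-9][0-9][0-9].*)
            printf '%s\n' "$1" |
                sed -E 's/^BBC +(.*) ([0-9]{4})(\.[^.]*)$/BBC documentaries \2 \1\3/' ;;
        # Anything else is printed as is.
        *)
            printf '%s\n' "$1" ;;
    esac
}

new_name "BBC Something Something 3 of 5 Blah 2007.avi"
new_name "BBC Something Else 2009.mkv"
```

Because the spaces around each group are part of the pattern, they are not copied into the capture groups, so no doubled spaces are produced in the first place.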

Resources