Format of the git diff is not correct when stored in Bash variable - bash

I want to produce this in a bash script:
files : {
{
file {
name: "Bla.java"
line_changes : [45,146,14]
}
}
{
file {
name: "Foo.java"
line_changes : [7,8,9,10]
}
}
}
so I have this
gitOutput=$(git diff origin/master..origin/mybranch)
echo $gitOutput
My problem is:
The output is completely unformatted. Everything ends up on one line, so I cannot parse it logically, e.g. split by \n or split by "diff --git". There are no newlines at all, so what is there doesn't make sense.
So I want to know: is there any pretty-format option for git diff?
[UPDATE]
I have tried this weird approach
git diff origin/master..origin/mybranch > data.txt
data=$(cat data.txt)
The result: data.txt is absolutely fine, but the data variable is all messed up.
Is it something related to IFS?

The short answer is: you should add quotes:
gitOutput="$(git diff origin/master..origin/mybranch)"
echo "$gitOutput"
so that newlines are kept as-is. This is generally what you want to do when expanding variables in shell.
For a detailed explanation on the use of quotes, see
https://unix.stackexchange.com/questions/68694/when-is-double-quoting-necessary
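As a quick sanity check (not from the original answer), here is a minimal sketch of what unquoted expansion does to newlines:

```shell
# Capture multi-line output in a variable.
out=$(printf 'line one\nline two')

echo $out      # unquoted: word splitting turns the newline into a space
echo "$out"    # quoted: the newline is preserved
```

The same applies to git diff output: without the quotes, every run of whitespace, including newlines, collapses to a single space.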


Shell Script msg() echo "${RED}$@${NOCOLOR}", What does it mean [duplicate]

Sometimes I have a one-liner that I am repeating many times for a particular task but will likely never use again in the exact same form. It includes a file name that I am pasting in from a directory listing. Somewhere between a one-liner and creating a bash script, I thought maybe I could just create a one-liner function at the command line, like:
numresults(){ ls "$1"/RealignerTargetCreator | wc -l }
I've tried a few things like using eval, using numresults=function..., but haven't stumbled on the right syntax, and haven't found anything on the web so far. (Everything coming up is just tutorials on bash functions).
Quoting my answer for a similar question on Ask Ubuntu:
Functions in bash are essentially named compound commands (or code
blocks). From man bash:
Compound Commands
A compound command is one of the following:
...
{ list; }
list is simply executed in the current shell environment. list
must be terminated with a newline or semicolon. This is known
as a group command.
...
Shell Function Definitions
A shell function is an object that is called like a simple command and
executes a compound command with a new set of positional parameters.
... [C]ommand is usually a list of commands between { and }, but
may be any command listed under Compound Commands above.
There's no reason given, it's just the syntax.
Try with a semicolon after wc -l:
numresults(){ ls "$1"/RealignerTargetCreator | wc -l; }
Don't use ls | wc -l, as it may give you wrong results if file names have newlines in them. You can use this function instead:
numresults() { find "$1" -mindepth 1 -printf '.' | wc -c; }
You can also count files without find. Using arrays,
numresults () { local files=( "$1"/* ); echo "${#files[@]}"; }
or using positional parameters
numresults () { set -- "$1"/*; echo "$#"; }
To match hidden files as well,
numresults () { local files=( "$1"/* "$1"/.* ); echo $(("${#files[@]}" - 2)); }
numresults () { set -- "$1"/* "$1"/.*; echo $(("$#" - 2)); }
(Subtracting 2 from the result compensates for . and ...)
You can get a
bash: syntax error near unexpected token `('
error if you already have an alias with the same name as the function you're trying to define.
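A small illustrative sketch (the alias body here is made up): aliases are expanded before the parser sees the `()`, so an existing alias shadows the definition until you remove it:

```shell
shopt -s expand_aliases              # aliases are inactive in bash scripts by default
alias numresults='ls -l'
# numresults() { wc -l; }            # would fail: syntax error near unexpected token `('
unalias numresults                   # drop the alias first, then the definition works
numresults() { echo "function wins"; }
numresults
```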
Maybe the easiest way is to echo what you want to get back.
function myfunc()
{
local myresult='some value'
echo "$myresult"
}
result=$(myfunc) # or result=`myfunc`
echo $result
Anyway here you can find a good how-to for more advanced purposes

save stream output as multiple files

I have a program (pull) which downloads files and emits their contents (JSON) to stdout, the input of the program is the id of every document I want to download, like so:
pull one two three
>
> { ...one }
> {
...two
}
> { ...three }
However, I now would like to pipe that output to a different file for each file it has emitted, ideally being able to reference the filename by the order of args initially used: one two three.
So, the outcome I am looking for would be something like the below.
pull one two three | > $1.json
>
> saved one.json
> saved two.json
> saved three.json
Is there any way to achieve this or something similar at all?
Update
I just would like to clarify how the program works and why it may not be ideal to loop through the arguments and execute the program multiple times, once for each argument.
Whenever pull gets executed, it performs two operations:
A: Expensive operation (takes a long time to resolve): this retrieves all documents available in a database, where we can look up items by the argument names provided when invoking pull.
B: Operation specific to the provided argument: after A resolves, we will use its response in order to get the data needed for specifically retrieving the individual document.
This means that, having A+B called multiple times for every argument, wouldn't be ideal as A is an expensive operation.
So instead of having, AB AB AB AB I would like to have ABBBB.
You're doing it the hard way.
for f in one two three; do pull "$f" > "$f.json" & done
Unless something in the script is not compatible with multiple simultaneous copies, this will make the process faster as well. If it is, just change the & to ;.
Update
Try just always writing the individual files. If you also need to be able to send them to stdout, just cat the file afterwards, or use tee when writing it.
If that's not ok, then you will need to clearly identify and parse the data blocks. For example, if the start of a section is THE ONLY place { appears as the first character on a line, that's a decent sentinel value. Split your output to files using that.
For example, throw this into another script:
awk 'NR==FNR { ndx=1; split($0,fn); name=""; next; } /^{/ { name=fn[ndx++]; } { if (length(name)) print $0 > name".json"; }' <( echo "$@" ) <( pull "$@" )
call that script with one two three and it should do what you want.
Explanation
awk '...' <( echo "$@" ) <( pull "$@" )
This executes two commands and returns their outputs as "files", streams of input for awk to process. The first just puts the list of arguments provided on one line for awk to load into an array. The second executes your pull script with those args, which provides the streaming output you already get.
NR==FNR { ndx=1; split($0,fn); name=""; next; }
This tells awk to initialize a file-controlling index, read the single line from the echo command (the args) and split them into an array of filename bases desired, then skip the rest of processing for that record (it isn't "data", it's metadata, and we're done with it.) We initialize name to an empty string so that we can check for length - otherwise those leading blank lines end up in .json, which probably isn't what you want.
/^{/ { name=fn[ndx++]; }
This tells awk each time it sees { as the very first character on a line, set the output filename base to the current index (which we initialized at 1 above) and increment the index for the next time.
{ if (length(name)) print $0 > name".json"; }
This tells awk to print each line to a file named whatever the current index is pointing at, with ".json" appended. if (length(name)) throws away the leading blank line(s) before the first block of JSON.
The result is that each new set will trigger a new filename from your given arguments.
That work for you?
In Use
$: ls *.json
ls: cannot access '*.json': No such file or directory
$: pull one two three # my script to simulate output
{ ...one... }
{
...two...
}
{ ...three... }
$: splitstream one two three # the above command in a file to receive args
$: grep . one* two* three* # now they exist
one.json:{ ...one... }
two.json:{
two.json: ...two...
two.json:}
three.json:{ ...three... }

sed or awk remove multiple lines starting with `if` and ending with that `if`'s closing `fi`

An extension to this question: Delete multiple lines - from "patternA" match, through second occurrence of "patternB"... I'm thinking I should make it more robust.
So, rather than counting how many fi's there are, since there may be an unknown number, I'd like to be able to simply execute something where...
If I have the following file /somepath/somefile containing:
...
# Test something?
if something
then
do something
if somethingelse
then
do somethingelse
fi
fi
...
...with an unknown number of possible if/fi statements, how can I remove everything from the line containing the string "something?" through the line containing the "fi" that closes the first if? Any/all help is appreciated.
Generally speaking, you need a parser for this. However, if all of the commands in your script follow the pattern from the sample in the question (if/fi always at the beginning of the line), this somewhat crude if-counting solution should work.
awk 'BEGIN { del=0; level=0 } /Test something?/ { del=1 } del==0 { print } /^if / { level++ } /^fi( |$)/ { level--; if (level == 0) del=0 }' somefile
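To see the filter in action, here is a self-contained run against a copy of the sample (the file path is made up):

```shell
cat > /tmp/somefile <<'EOF'
keep this line
# Test something?
if something
then
do something
if somethingelse
then
do somethingelse
fi
fi
keep this too
EOF

# Same program as above: start deleting at the marker comment, count
# if/fi nesting, and resume printing once the matching fi closes.
awk 'BEGIN { del=0; level=0 }
     /Test something?/ { del=1 }
     del==0 { print }
     /^if / { level++ }
     /^fi( |$)/ { level--; if (level == 0) del=0 }' /tmp/somefile
```

This prints only the two "keep" lines; the whole nested if/fi block, including the inner fi, is removed.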

How to find complete file names in UNIX if I know only the extension of the file

Suppose I have a file which contains other file names with some extension [.dat, .sum etc.].
The text file contains:
gdsds sd8ef g/f/temp_temp.sum
ghfp hrwer h/y/test.text.dat
if[-r h/y/somefile.dat] then....
I want to get the complete file names, like for above file I should get output as
temp_temp.sum
test.text.dat
somefile.dat
I am using AIX UNIX, on which grep -ow [a-zA-Z_] filename does not work, because AIX grep has no -o switch.
sed is good, but as you have a range of types of 'records', maybe awk can help.
My target is any 'word' found by awk that has a '/' in it; take that word and remove everything up to the last '/', leaving just the filename.
{
cat - <<EOS
gdsds sd8ef g/f/temp_temp.sum
ghfp hrwer h/y/test.text.dat
if[-r h/y/somefile.dat] then....
EOS
} \
| awk '{
    for (i=1; i<=NF; i++) {
        if ($i ~ /.*\//) {
            fName=$i
            sub(/.*\//, "", fName)
            # put any other chars you want to delete inside the '[ ... ]' char list
            gsub(/[][]/, "", fName)
            if (fName) {
                print fName
            }
        }
    }
}'
output
temp_temp.sum
test.text.dat
somefile.dat
(Also, your question headline doesn't seem to match up with your description, if I'm missing something, please feel free to update your question and post a comment indicating the edits (or you can edit your headline). )

Reading java .properties file from bash

I am thinking of using sed for reading .properties file, but was wondering if there is a smarter way to do that from bash script?
This would probably be the easiest way: grep + cut
# Usage: get_property FILE KEY
function get_property
{
grep "^$2=" "$1" | cut -d'=' -f2
}
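For example, with a throwaway file (file name and keys here are made up):

```shell
# Usage: get_property FILE KEY
get_property()
{
    grep "^$2=" "$1" | cut -d'=' -f2
}

cat > /tmp/app.properties <<'EOF'
db.host=localhost
db.port=5432
EOF

get_property /tmp/app.properties db.port    # prints 5432
```

Two caveats: the key is used as a regex (a dot matches any character), and cut -f2 drops anything after a second '='; use cut -d'=' -f2- if values may themselves contain equals signs.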
The solutions mentioned above will work for the basics. I don't think they cover multi-line values though. Here is an awk program that will parse Java properties from stdin and produce shell environment variables to stdout:
BEGIN {
FS="=";
print "# BEGIN";
n="";
v="";
c=0; # Not a line continuation.
}
/^\#/ { # The line is a comment. Breaks line continuation.
c=0;
next;
}
/\\$/ && (c==0) && (NF>=2) { # Name value pair with a line continuation...
e=index($0,"=");
n=substr($0,1,e-1);
v=substr($0,e+1,length($0) - e - 1); # Trim off the backslash.
c=1; # Line continuation mode.
next;
}
/^[^\\]+\\$/ && (c==1) { # Line continuation. Accumulate the value.
v= "" v substr($0,1,length($0)-1);
next;
}
((c==1) || (NF>=2)) && !/^[^\\]+\\$/ { # End of line continuation, or a single line name/value pair
if (c==0) { # Single line name/value pair
e=index($0,"=");
n=substr($0,1,e-1);
v=substr($0,e+1,length($0) - e);
} else { # Line continuation mode - last line of the value.
c=0; # Turn off line continuation mode.
v= "" v $0;
}
# Make sure the name is a legal shell variable name
gsub(/[^A-Za-z0-9_]/,"_",n);
# Remove newlines from the value.
gsub(/[\n\r]/,"",v);
print n "=\"" v "\"";
n = "";
v = "";
}
END {
print "# END";
}
As you can see, multi-line values make things more complex. To see the values of the properties in shell, just source in the output:
cat myproperties.properties | awk -f readproperties.awk > temp.sh
source temp.sh
The variables will have '_' in the place of '.', so the property some.property will be some_property in shell.
If you have ANT properties files that have property interpolation (e.g. '${foo.bar}') then I recommend using Groovy with AntBuilder.
Here is my wiki page on this very topic.
I wrote a script to solve the problem and put it on my github.
See properties-parser
One option is to write a simple Java program to do it for you - then run the Java program in your script. That might seem silly if you're just reading properties from a single properties file. However, it becomes very useful when you're trying to get a configuration value from something like a Commons Configuration CompositeConfiguration backed by properties files. For a time, we went the route of implementing what we needed in our shell scripts to get the same behavior we were getting from CompositeConfiguration. Then we wisened up and realized we should just let CompositeConfiguration do the work for us! I don't expect this to be a popular answer, but hopefully you find it useful.
If you want to use sed to parse -any- .properties file, you may end up with a quite complex solution, since the format allows line breaks, unquoted strings, unicode, etc: http://en.wikipedia.org/wiki/.properties
One possible workaround would using java itself to preprocess the .properties file into something bash-friendly, then source it. E.g.:
.properties file:
line_a : "ABC"
line_b = Line\
With\
Breaks!
line_c = I'm unquoted :(
would be turned into:
line_a="ABC"
line_b=`echo -e "Line\nWith\nBreaks!"`
line_c="I'm unquoted :("
Of course, that would yield worse performance, but the implementation would be simpler/clearer.
In Perl:
while(<STDIN>) {
($prop,$val)=split(/[=: ]/, $_, 2);
# and do stuff for each prop/val
}
Not tested, and should be more tolerant of leading/trailing spaces, comments etc., but you get the idea. Whether you use Perl (or another language) over sed is really dependent upon what you want to do with the properties once you've parsed them out of the file.
Note that (as highlighted in the comments) Java properties files can have multiple forms of delimiters (although I've not seen anything used in practice other than colons). Hence the split uses a choice of characters to split upon.
Ultimately, you may be better off using the Config::Properties module in Perl, which is built to solve this specific problem.
I have some shell scripts that need to look up some .properties and use them as arguments to programs I didn't write. The heart of the script is a line like this:
dbUrlFile=$(grep database.url.file etc/zocalo.conf | sed -e "s/.*: //" -e "s/#.*//")
Effectively, that's grep for the key and filter out the stuff before the colon and after any hash.
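A self-contained sketch of the same pattern, with a made-up file and value:

```shell
cat > /tmp/zocalo.conf <<'EOF'
database.url.file: /etc/zocalo/db.url
EOF

# Grep for the key, strip everything up to the colon, drop any trailing comment.
dbUrlFile=$(grep database.url.file /tmp/zocalo.conf | sed -e "s/.*: //" -e "s/#.*//")
echo "$dbUrlFile"    # prints /etc/zocalo/db.url
```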
If you want to use "shell", the best tool to parse files with proper programming control is (g)awk. Use sed only for simple substitution.
I have sometimes just sourced the properties file into the bash script. This will lead to environment variables being set in the script with the names and contents from the file. Maybe that is enough for you, too. If you have to do some "real" parsing, this is not the way to go, of course.
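A minimal sketch of that approach, assuming the keys happen to be valid shell identifiers (no dots; file name is made up):

```shell
cat > /tmp/demo.properties <<'EOF'
DB_HOST=localhost
DB_PORT=5432
EOF

# Sourcing turns each key=value line into a shell variable assignment.
. /tmp/demo.properties
echo "$DB_HOST:$DB_PORT"    # prints localhost:5432
```

This breaks as soon as a key contains a dot or a value contains unquoted shell metacharacters, which is why the parsing answers above exist.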
Hmm, I just ran into the same problem today. This is a poor man's solution, admittedly more straightforward than clever ;)
decl=`ruby -ne 'puts chomp.sub(/=(.*)/,%q{="\1";}).gsub(".","_")' my.properties`
eval $decl
then, a property 'my.java.prop' can be accessed as $my_java_prop.
This can be done with sed or whatever, but I finally went with ruby for its 'irb' which was handy for experimenting.
It's quite limited (dots should be replaced only before '=',no comment handling), but could be a starting point.
@Daniel, I tried to source it, but Bash didn't like dots in variable names.
I have had some success with
PROPERTIES_FILE=project.properties
function source_property {
local name=$1
eval "$name=\"$(sed -n '/^'"$name"'=/,/^[A-Z]\+_*[A-Z]*=/p' $PROPERTIES_FILE|sed -e 's/^'"$name"'=//g' -e 's/"/\\"/g'|head -n -1)\""
}
source_property 'SOME_PROPERTY'
This is a solution that properly parses quotes and terminates at a space when not given quotes. It is safe: no eval is used.
I use this code in my .bashrc and .zshrc for importing variables from shell scripts:
# Usage: _getvar VARIABLE_NAME [sourcefile...]
# Echos the value that would be assigned to VARIABLE_NAME
_getvar() {
local VAR="$1"
shift
awk -v Q="'" -v QQ='"' -v VAR="$VAR" '
function loc(text) { return index($0, text) }
function unquote(d) { $0 = substr($0, eq+2) d; print substr($0, 1, loc(d)-1) }
{ sub(/^[ \t]+/, ""); eq = loc("=") }
substr($0, 1, eq-1) != VAR { next } # assignment is not for VAR: skip
loc("=" QQ) == eq { unquote(QQ); exit }
loc("=" Q) == eq { unquote( Q); exit }
{ print substr($1, eq + 1); exit }
' "$@"
}
This saves the desired variable name and then shifts the argument array so the rest can be passed as files to awk.
Because it's so hard to call shell variables and refer to quote characters inside awk, I'm defining them as awk variables on the command line. Q is a single quote (apostrophe) character, QQ is a double quote, and VAR is that first argument we saved earlier.
For further convenience, there are two helper functions. The first returns the location of the given text in the current line, and the second prints the content between the first two quotes in the line using quote character d (for "delimiter"). There's a stray d concatenated to the first substr as a safety against multi-line strings (see "Caveats" below).
While I wrote the code for POSIX shell syntax parsing, that appears to differ from your format only by whether there is white space around the assignment. You can add that functionality to the above code by adding sub(/[ \t]*=[ \t]*/, "="); before the sub(…) on awk's line 4 (note: line 1 is blank).
The fourth line strips off leading white space and saves the location of the first equals sign. Please verify that your awk supports \t as tab; this is not guaranteed on ancient UNIX systems.
The substr line compares the text before the equals sign to VAR. If that doesn't match, the line is assigning a different variable, so we skip it and move to the next line.
Now we know we've got the requested variable assignment, so it's just a matter of unraveling the quotes. We do this by searching for the first location of =" (line 6) or =' (line 7) or no quotes (line 8). Each of those lines prints the assigned value.
Caveats: If there is an escaped quote character, we'll return a value truncated to it. Detecting this is a bit nontrivial and I decided not to implement it. There's also a problem of multi-line quotes, which get truncated at the first line break (this is the purpose of the "stray d" mentioned above). Most solutions on this page suffer from these issues.
In order to let Java do the tricky parsing, here's a solution using jrunscript to print the keys and values in a bash read-friendly (key, tab character, value, null character) way:
#!/usr/bin/env bash
jrunscript -e '
p = new java.util.Properties();
p.load(java.lang.System.in);
p.forEach(function(k,v) { out.format("%s\t%s\000", k, v); });
' < /tmp/test.properties \
| while IFS=$'\t' read -d $'\0' -r key value; do
key=${key//./_}
printf -v "$key" %s "$value"
printf '=> %s = "%s"\n' "$key" "$value"
done
I found printf -v in this answer by @david-foerster.
To quote jrunscript: Warning: Nashorn engine is planned to be removed from a future JDK release
