I've got a bash script for counting rows in reports. All report names are stored in one array, and I count the rows in a loop. However, for some files my script fails with a "binary operator expected" error. Does anyone have a solution?
for i in ${ARRAY[@]}; do
if [ ! -f "$BASE_DIR/$i"* ];
then
echo "File not generated yet"
else
ARRAY2=$(wc -l < "$BASE_DIR/$i"*.tab | awk '{print $1-2}')
echo ${ARRAY2[$i]} $i
fi
done
Use double square brackets instead of single ones, as follows, since you are using extended expressions.
if [[ ! -f "$BASE_DIR/$i"* ]];
You also need to check the array contents. Special characters such as spaces in file names must be escaped or quoted.
-f takes just one argument, so the error occurs when the pattern matches more than one file.
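It seems to work with [[ because bash performs no word splitting or pathname expansion inside [[ ]]: the * is taken literally, so the error disappears, but the test no longer checks whether any matching file exists.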
The bigger problem is you can also only use one file with the < operator; if the pattern matches multiple files, you'll get an ambiguous redirect error. To fix that, you'll need to use cat:
cat "$BASE_DIR/$i"*.tab | wc -l
However, it's not clear what you are expecting from the output; ARRAY2 will not actually be an array.
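One way to sidestep both the -f problem and the ambiguous redirect is to loop over the glob matches and handle one file at a time. This is only a sketch: the count_rows function, its parameters, and the demo file names are made up for illustration, and it assumes the "minus 2" from the question applies to each file separately.

```shell
# count_rows DIR PREFIX: print the data-row total for DIR/PREFIX*.tab.
# The function name and both parameters are hypothetical.
count_rows() {
  total=0
  found=0
  for f in "$1/$2"*.tab; do
    [ -f "$f" ] || continue        # an unmatched glob stays literal; skip it
    found=1
    lines=$(wc -l < "$f")          # one file per redirection, so never ambiguous
    total=$((total + lines - 2))   # assumption: 2 non-data lines per file
  done
  if [ "$found" -eq 0 ]; then
    echo "File not generated yet"
    return 1
  fi
  echo "$total"
}

# Demo with throwaway files (names are made up for the example).
demo_dir=$(mktemp -d)
printf '%s\n' header row1 row2 row3 footer > "$demo_dir/sales_2024.tab"
present=$(count_rows "$demo_dir" sales)
missing=$(count_rows "$demo_dir" inventory || true)
echo "sales: $present"
echo "inventory: $missing"
rm -rf "$demo_dir"
```

In the original loop this would be called once per report name, e.g. count_rows "$BASE_DIR" "$i".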
I'm trying to look for phone numbers in any of the following formats: +1.570.555.1212, 570.555.1212, (570)555-1212, and 570-555-1212. We also need to look in compressed files using zgrep; however, when I do that my code comes back with "No matches found". The code as it is below works for finding phone numbers in txt files. It is very bad, but here it is:
Code:
#!/bin/bash
egrep '[0-9]{3}-[0-9]{3}-[0-9]{4}|[0-9]{3}.[0-9]{3}.[0-9]{4}|([0-9]{3})[0-9]{3}-[0-9]{4}|+(1).[0-9]{3}.[0-9]{3}.[0-9]{4}' *
if [ $? -eq 0 ] ; then echo $1 ; else echo "No matches found" ; fi 2>/dev/null
zgrep without any options is equivalent in its regex capabilities to grep; you need to say zgrep -E if you want to use grep -E (aka egrep) regex syntax when searching compressed files.
#!/bin/bash
if zgrep -E -q '[0-9]{3}-[0-9]{3}-[0-9]{4}|[0-9]{3}.[0-9]{3}.[0-9]{4}|([0-9]{3})[0-9]{3}-[0-9]{4}|+(1).[0-9]{3}.[0-9]{3}.[0-9]{4}' *
then
echo "$1"
else
echo "No matches found" >&2
fi
Notice also "Why is testing “$?” to see if a command succeeded or not, an anti-pattern?" and "When to wrap quotes around a shell variable", as well as the preference for -q over redirecting to /dev/null, and the displaying of error messages on standard error (>&2 redirection).
Your regex could also use some refactoring; maybe try
(\+\(1\).)?[0-9]{3}.[0-9]{3}.[0-9]{4}
Notice how round brackets and the plus sign need to be backslash-escaped to match literally, and how after refactoring out the +(1) prefix as optional the rest of the regex subsumes all the other variants you had enumerated, because . matches - and ( and . and many other characters. (The optional prefix could also be dropped completely and this would still match the same strings, but I had to guess some things so I am leaving it in with this remark.)
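A quick way to sanity-check the refactored regex is to feed it the four formats from the question plus one line that should not match. This is just a sketch using plain grep -E on made-up input; the same pattern can be passed to zgrep -E for compressed files.

```shell
# The refactored pattern from the answer above.
regex='(\+\(1\).)?[0-9]{3}.[0-9]{3}.[0-9]{4}'

# -c counts matching lines; -e keeps the pattern from being read as an option.
matches=$(printf '%s\n' \
  '+1.570.555.1212' \
  '570.555.1212' \
  '(570)555-1212' \
  '570-555-1212' \
  'no phone number here' | grep -E -c -e "$regex")

echo "$matches"   # the four phone-number lines match, the last line does not
```

Note that '+1.570.555.1212' matches through the unprefixed part of the pattern, which is consistent with the remark that the optional prefix could be dropped.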
I have something like this
VIEWERS=[]
EDITORS=[]
ADMINS='["abc@email.com","xyz@email.com"]'
Which are then later passed to a .yaml file as a list
I want to add a check so that if none of the three is set, the script terminates and does not move on.
However, I cannot figure out how to check in bash whether such a list of strings is empty.
What's the proper syntax? I cannot use round brackets for these due to .yaml restrictions.
There are two issues here. First, it looks like you want to define arrays.
That's done like this:
VIEWERS=() or VIEWERS=(abc@email.com)
Once they have been defined, you can check if ALL of those arrays are empty like so:
VIEWERS=()
EDITORS=()
#ADMINS=()
ADMINS=(abc@email.com bcd@email.com)
if [ ${#VIEWERS[@]} -eq 0 ] && [ ${#EDITORS[@]} -eq 0 ] && [ ${#ADMINS[@]} -eq 0 ] ; then
echo "all empty arrays"
else
echo "non-empty array"
fi
If you run the above, you will get: non-empty array, since ADMINS is not empty.
If, however, you comment out the 2nd ADMINS line and uncomment the first ADMINS line, you will get: all empty arrays.
Hope this helps.
This should be a simple matter of comparing them to the string "[]". Note that since anything containing [ and ] looks like a filename wildcard expression to the shell, you need to use appropriate quotes to keep it from "helpfully" turning them into lists of matching filenames. That means double-quotes around variable references, and probably single-quotes around literals. Also, I recommend using lower- or mixed-case variable names, to avoid conflicts with the many all-caps names that have special meanings/functions. So something like this:
viewers='[]'
editors='[]'
admins='["abc@email.com","xyz@email.com"]'
if [ "$viewers" = '[]' ] && [ "$editors" = '[]' ] && [ "$admins" = '[]' ]; then
echo "They're all empty"
else
echo "At least one array has contents"
fi
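If the values might arrive with stray whitespace (for example '[ ]'), the comparison can be wrapped in a small helper that strips whitespace first. This is a sketch only: is_empty_list is a hypothetical name, and treating an unset or blank value as empty is an assumption about what you want.

```shell
# is_empty_list VALUE: succeed if VALUE is blank or an empty bracket list.
# Hypothetical helper; whitespace inside the brackets is ignored.
is_empty_list() {
  compact=$(printf '%s' "$1" | tr -d '[:space:]')
  [ -z "$compact" ] || [ "$compact" = '[]' ]
}

VIEWERS='[]'
EDITORS='[ ]'
ADMINS='["abc@email.com","xyz@email.com"]'

if is_empty_list "$VIEWERS" && is_empty_list "$EDITORS" && is_empty_list "$ADMINS"; then
  status="all empty"
else
  status="at least one set"
fi
echo "$status"
```

To terminate when nothing is set, the first branch would print an error to standard error and exit 1 instead of setting status.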
No, I don't want to use round brackets, since I have to pass the value inside square brackets to my Kubernetes deployment further on.
Is there a way I can do so with this format?
ADMINS='[]'
I would try it like so:
cat infile | cut -d"]" -f1 | cut -d"[" -f2 | sed -e /^$/d
(*NOTE: I ain't care, I like to start with cat sometimes for readability.)
First we feed the input with cat, then cut the field before the delimiter ']', then out of the remaining bit we cut after the '['. Finally we throw in a sed line to remove empty lines returned by the null results from the cut operations.
*Edit:
After scanning your orphaned comment in the thread here... seems like you maybe want to keep the brackets. It would be helpful in the future to show your desired output imho. maybe this would work ok?
cat infile| cut -d"=" -f2 | sed 's/[[]]//g' | sed -e /^$/d
First we read in the file, then cut everything after the "=", then use sed to remove the empty brackets, and finally use another sed to remove empty lines from the results.
I have this script that reads a file. The file looks like this:
711324865,438918283,2
-333308476,886548365,2
1378685449,-911401007,2
-435117907,560922996,2
259073357,714183955,2
...
the script:
#!/bin/bash
while IFS=, read childId parentId parentLevel
do
grep "\$parentId" parent_child_output_level2.csv
resul=$?
echo "child is $childId, parent is $parentId parentLevel is $parentLevel resul is $resul"
done < parent_child_output_level1.csv
But it is not working: resul always comes back as 1 (no match), which is wrong.
I know that because I can run the following command, which I think is equivalent:
[core@dub-vcd-vms165 generated-and-saved-to-hdfs]$ grep "\-911401007" parent_child_output_level2.csv
-911401007,-157143722,3
Please help.
Here is a grep command to print only the negative numbers:
$ grep -oP '(^|,)\K-\d+' file.csv
-333308476
-911401007
-435117907
(^|,) matches the start of a line or comma.
\K discards the previously matched characters.
-\d+ Matches - plus the following one or more numbers.
Your title is inconsistent with your question. Your title asks how to grep negative numbers, which Avinash Raj answered well, although I'd suggest you don't even need the (Perl-style) (^|,)\K prefix to match start-of-field, because if the file is well-formed, then -\d+ would match all negative numbers just as well. So you could just run (edit: realized that with a leading - you need -- to prevent grep from taking the pattern as an option):
grep -oP -- '-\d+' file.csv;
Your question includes a script whose intention seems to be to grep for any number (positive or negative) in the first field (childId) of one file (parent_child_output_level2.csv) that occurs in the second field (parentId) of another file (parent_child_output_level1.csv). To accomplish this, I wouldn't use grep, because you're trying to do an exact numerical equality test, which can even be done as an exact string equality test assuming your numbers are always consistently represented (e.g. no redundant leading zeroes). Repeatedly grepping through the entire file just to search for a number in one column is also wasteful of CPU.
Here's what I would do:
parentIdList=($(cut -d, -f2 parent_child_output_level1.csv));
childIdList=($(cut -d, -f1 parent_child_output_level2.csv));
for parentId in "${parentIdList[@]}"; do
for childId in "${childIdList[@]}"; do
if [[ "$childId" == "$parentId" ]]; then
echo "$parentId";
fi;
done;
done;
With this approach, you precompute both the parent id list and the child id list just once, using cut to extract the appropriate field from each file. Then you can use the shell-builtin for loop, shell-builtin if conditional, and shell-builtin [[ test command to accomplish the check, and finally finish with a shell-builtin echo to print the matches. Everything is shell-builtin, after the initial command substitutions that run the cut external executable.
If you also want to filter these results on negative numbers, you could grep for ^- in the results of the above script, or grep for it in the results of each (or just the first) cut command, or add the following line just inside the outer for loop:
if [[ "${parentId:0:1}" != '-' ]]; then continue; fi;
Alternative approach:
if [[ "$parentId" != -* ]]; then continue; fi;
Either approach will skip non-negatives.
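For completeness, the original loop always reports 1 because "\$parentId" is escaped, so grep searches for the literal text $parentId rather than the variable's value. A minimal sketch of a corrected loop follows. The sample data and file names are stand-ins for the two CSV files, and anchoring the match to the first field of the second file is an assumption based on the example output in the question.

```shell
# Hypothetical sample data standing in for the two CSV files from the question.
work_dir=$(mktemp -d)
printf '%s\n' '711324865,438918283,2' '1378685449,-911401007,2' > "$work_dir/level1.csv"
printf '%s\n' '-911401007,-157143722,3' > "$work_dir/level2.csv"

report=""
while IFS=, read -r childId parentId parentLevel; do
  # Unescaped "$parentId" lets the shell expand the variable; -e keeps a
  # leading "-" from being parsed as a grep option. The ^ and trailing comma
  # restrict the match to the whole first field of level2.csv.
  if grep -q -e "^$parentId," "$work_dir/level2.csv"; then
    resul=0
  else
    resul=1
  fi
  report="${report}child is $childId, parent is $parentId, resul is $resul
"
done < "$work_dir/level1.csv"

printf '%s' "$report"
rm -rf "$work_dir"
```

This still runs grep once per input line, so the performance remark above applies; it only shows why the exit status was always 1.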
I'm trying to create a very simple bash script that will open a new link based on the input command.
Use case #1
$ ./myscript longname55445
It should take the number 55445 and assign it to a variable, which will later be used to open a new link based on the given number.
Use case #2
$ ./myscript l55445
It should do the exact same thing as above by taking the number and then open the same link.
Use case #3
$ ./myscript 55445
If no prefix is given, then we simply open that same link as a fallback.
So far this is what I have:
#!/bin/sh
BASE_URL=http://api.domain.com
input=$1
command=${input:0:1}
if [ "$command" == "longname" ]; then
number=${input:1:${#input}}
url="$BASE_URL?id="$number
open $url
elseif [ "$command" == "l" ]; then
number=${input:1:${#input}}
url="$BASE_URL?id="$number
open $url
else
number=${input:1:${#input}}
url="$BASE_URL?id="$number
open $url
fi
But this always falls through to the elseif there.
I'm using zsh at the moment.
input=$1
command=${input:0:1}
sets command to the first character of the first argument. It's not possible for a one character string to be equal to an eight-character string ("longname"), so the if condition must always fail.
Furthermore, both your elseif and your else clauses set
number=${input:1:${#input}}
Which you could have written more simply as
number=${input:1}
But in both cases, you're dropping the first character of input. Presumably in the else case, you wanted the entire first argument.
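Putting those points together, the prefix handling could be sketched with plain parameter expansion, which works the same in sh, bash and zsh. The extract_number function is a hypothetical helper, and rejecting any input with other characters is an assumption about the desired behaviour; the demo only prints the URL instead of calling open.

```shell
# extract_number ARG: strip an optional "longname" or "l" prefix and print
# the remaining digits; fail if anything other than digits is left.
extract_number() {
  arg=$1
  case $arg in
    longname*) arg=${arg#longname} ;;   # must be tested before the bare "l"
    l*)        arg=${arg#l} ;;
  esac
  case $arg in
    ''|*[!0-9]*) return 1 ;;            # empty or contains a non-digit
  esac
  printf '%s\n' "$arg"
}

BASE_URL='http://api.domain.com'
for input in longname55445 l55445 55445; do
  number=$(extract_number "$input") && echo "$BASE_URL?id=$number"
done
```

All three inputs yield the same URL with id=55445, while something like foobar234 makes extract_number return a non-zero status.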
See whether this construct is helpful for your purpose:
#!/bin/bash
name="longname55445"
echo "${name##*[A-Za-z]}"
This assumes a letter is adjacent to the number.
The following is NOT another way to write the same, because it is wrong.
Please see comments below by mklement0, who noticed this. Mea culpa.
echo "${name##*[:letter:]}"
You have command=${input:0:1}
It takes only the first character, and you compare it to "longname"; of course that fails and goes to the elseif.
The key problem is to check whether the input begins with l, longname, or nothing. In any of those three cases, take the trailing numbers.
One grep line can do it: just grep the input and take the returned text:
kent$ grep -Po '(?<=longname|l|^)\d+' <<<"l234"
234
kent$ grep -Po '(?<=longname|l|^)\d+' <<<"longname234"
234
kent$ grep -Po '(?<=longname|l|^)\d+' <<<"234"
234
kent$ grep -Po '(?<=longname|l|^)\d+' <<<"foobar234"
<we got nothing>
You can use regex matching in bash.
[[ $1 =~ [0-9]+ ]] && number=$BASH_REMATCH
You can also use regex matching in zsh.
[[ $1 =~ [0-9]+ ]] && number=$MATCH
Based on the OP's following clarification in a comment,
I'm only looking for the numbers [...] given in the input.
the solution can be simplified as follows:
#!/bin/bash
BASE_URL='http://api.domain.com'
# Strip all non-digits from the 1st argument to get the desired number.
number=$(tr -dC '[:digit:]' <<<"$1")
open "$BASE_URL?id=$number"
Note the use of a bash shebang, given the use of 'bashism' <<< (which could easily be restated in a POSIX-compliant manner).
Similarly, the OP's original code should use a bash shebang, too, due to use of non-POSIX substring extraction syntax.
However, judging by the use of open to open a URL, the OP appears to be on OSX, where sh is essentially bash (though invocation as sh does change behavior), so it'll still work there. Generally, though, it's safer to be explicit about the required shell.
I am currently trying to extract ALL matching expressions from a text which e.g. looks like this and put them into an array.
aaaaaaaaa${bbbbbbb}ccccccc${dddd}eeeee
ssssssssssssssssss${TTTTTT}efhsekfh ej
348653jlk3jß1094utß43t59ßgöelfl,-s-fko
The matching expressions are similar to this: ${}. Beware that I need the full expression, not only the word in between this expression! So in this case the result should be an array which contains:
${bbbbbbb}
${dddd}
${TTTTTT}
Problems I have stumbled upon and couldn't solve:
It should NOT recognize this as a whole
${bbbbbbb}ccccccc${dddd} but each for its own
grep -o is not installed on the old machine, Perl is not allowed either!
Many commands e.g. BASH_REMATCH only deliver the whole line or the first occurrence of the expression, instead of all matching expressions in the line!
The mentioned pattern \${[^}]*} seems to work partly, as it can extract the first occurrence of the expression; however, it always omits the ones that follow it in the same line of text. What I need is ALL matching expressions found in the line, not only the first one.
You could split the string on any of the characters $,{,}:
$ s='...blaaaaa${blabla}bloooo${bla}bluuuuu...'
$ echo "$s"
...blaaaaa${blabla}bloooo${bla}bluuuuu...
$ IFS='${}' read -ra words <<< "$s"
$ for ((i=0; i<${#words[@]}; i++)); do printf "%d %s\n" $i "${words[i]}"; done
0 ...blaaaaa
1
2 blabla
3 bloooo
4
5 bla
6 bluuuuu...
So if you're trying to extract the words inside the braces:
$ for ((i=2; i<${#words[@]}; i+=3)); do printf "%d %s\n" $i "${words[i]}"; done
2 blabla
5 bla
If the above doesn't suit you, grep will work:
$ echo '...blaaaaa${blabla}bloooo${bla}bluuuuu...' | grep -o '\${[^}]\+}'
${blabla}
${bla}
You still haven't told us exactly what output you want.
Since it bugged me a lot, I asked directly on www.unix.com and was kindly provided with a solution that works in my ancient shell. So if anyone has the same problem, here is the solution:
line='aaaa$aa{yyy}aaa${important}xxxxxxxx${important2}oo{o$}oo$oo${importantstring3}'
IFS=\$ read -a words <<< "$line"
regex='^(\{[^}]+})'
for e in "${words[@]}"; do
if [[ $e =~ $regex ]]; then
echo "\$${BASH_REMATCH[0]}";
fi;
done
which prints then the following - without even getting disturbed by random occurrences of $ and { or } between the syntactically correct expressions:
${important}
${important2}
${importantstring3}
I have updated the full solution after I got another update from the forums: now it also ignores this: aaa$aa{yyy}aaaa - which it previously printed as ${yyy} - but which it should completely ignore as there are characters between $ and {. Now with the additional anchoring on the beginning of the regexp it works as expected.
I just found another issue: theoretically using the above approach I would still get a wrong output if the read line looks like this line='{ccc}aaaa${important}aaa'. The IFS would split it and the REGEX would match {ccc} although this hadn't the $ sign in front. This is suboptimal.
However following approach could solve it: after getting the BASH_REMATCH I would need to do a search in the original line - the one I gave to the IFS - for this exact expression ${ccc} - with the difference, that the $ is included! And only if it finds this exact match, only then, it counts as a valid match; otherwise it should be ignored. Kind of a reverse search method...
Updated - add this reverse search to ignore the trap on the beginning of the line:
pattern="\$${BASH_REMATCH[0]}";
searchresult="";
searchresult=$(printf '%s\n' "$line" | grep -F -- "$pattern");
if [ "$searchresult" != "" ]; then echo "It was found!"; fi;
Negligible issue: if the line looks like this, line='{ccc}aaaaaa${ccc}bbbbb', it would recognize the first {ccc} as a valid match (although it isn't) and print it, because the reverse search found the second ${ccc}. Although this is not intended, it's irrelevant for my specific purpose, as it implies that this pattern does in fact exist at least once in the same line.
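An alternative that avoids both edge cases is to walk the line with parameter expansion and only ever start a match at a literal "${". It needs neither grep -o, Perl, nor BASH_REMATCH. This is a sketch under assumptions: extract_placeholders is a hypothetical name, and it mirrors the \${[^}]+} pattern, so the text between the braces may contain any character except }.

```shell
# extract_placeholders LINE: print every ${...} occurrence, one per line.
extract_placeholders() {
  rest=$1
  open='${'
  close='}'
  while :; do
    case $rest in
      *"$open"*"$close"*) ;;   # a "${" with a "}" somewhere after it remains
      *) break ;;
    esac
    rest=${rest#*"$open"}      # drop everything up to and including the first "${"
    name=${rest%%"$close"*}    # text up to the next "}"
    rest=${rest#*"$close"}     # continue scanning after that "}"
    if [ -n "$name" ]; then
      printf '%s%s%s\n' "$open" "$name" "$close"
    fi
  done
}

line='{ccc}aaaa${important}aaa$aa{yyy}${second}x'
found=$(extract_placeholders "$line")
printf '%s\n' "$found"
```

Here {ccc} and $aa{yyy} are skipped because neither contains the two-character sequence "${". In bash the output can be read into an array with a while read loop if needed.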