Find a specific cell in a space-separated CSV file in bash

I have a question related to bash operating on comma-separated value files (.csv) saved with spaces as the selected separator.
As an example I'll put small .csv file here:
A B C D
1 a b c d
2 e f g h
3 i j k l
4 m n o p
And here is my question: is it possible in bash to read a specific value, for example from cell C4?
I tried to find a topic with a similar problem but could not find one.
Thanks in advance!

This can be done very easily in awk:
example.sh
#!/bin/bash
awk '
{
if(NR==5){ print $4; }
}
' < "$1"
output
$ ./example.sh input.txt
o
details
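NR == 5 selects the fifth line of the file (row 4, because line 1 is the header)
$4 refers to the fourth field, which holds column C because the first field of each data line is the row label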
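The hard-coded 5 and 4 can be replaced by a lookup. The following is only a sketch: the helper name cell_at is made up for illustration, and it assumes that line 1 is the header and that the first field of every data line is the row label, as in the sample file.

```shell
#!/bin/bash
# cell_at FILE ROW COLUMN_LETTER: hypothetical helper, not part of the answer above.
# Assumes line 1 is the header and field 1 of each data line is the row label.
cell_at() {
    awk -v row="$2" -v col="$3" '
        NR == 1 {
            # Header line: find the field that holds the column letter.
            # Data lines carry one extra leading field (the row label), hence i + 1.
            for (i = 1; i <= NF; i++) if ($i == col) c = i + 1
            next
        }
        c && $1 == row { print $c; exit }   # data line whose label matches the row
    ' "$1"
}

# Demo on a copy of the sample file from the question.
sample=$(mktemp)
printf '%s\n' 'A B C D' '1 a b c d' '2 e f g h' '3 i j k l' '4 m n o p' > "$sample"
cell_at "$sample" 4 C    # prints: o
rm -f "$sample"
```

If the column letter does not occur in the header, the sketch prints nothing.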

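The tricky part is converting "C4" into column 3, row 4. Here's one way with bash. Note that it treats the file as a bare grid; for the sample above, which has a header line and a row-label column, compare against row + 1 and print "${fields[col]}" instead: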
#!/bin/bash
cell=$1
file=$2
colnum() {
local -u col=$1
local val=0 i
for ((i=0; i<${#col}; i++)); do
# ascii value of a char, ref: http://stackoverflow.com/q/890262/7552
printf -v ascii "%d" "'${col:i:1}"
val=$(( val*26 + ascii - 64 ))
done
echo "$val"
}
if ! [[ $cell =~ ^([A-Za-z]+)([0-9]+)$ ]]; then
echo "error: invalid cell '$cell'"
exit 1
fi
col=$(colnum "${BASH_REMATCH[1]}")
row=${BASH_REMATCH[2]}
lineno=0
while read -ra fields; do
if (( ++lineno == row )); then
echo "${fields[col-1]}"
fi
done < "$file"
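The column conversion is a base-26 calculation in which A counts as 1. As a rough standalone illustration (col_to_num is a hypothetical name that mirrors colnum above and needs bash 4 or later for the ${1^^} upper-casing):

```shell
#!/bin/bash
# col_to_num LETTERS: hypothetical standalone copy of the colnum idea above.
col_to_num() {
    local col=${1^^} val=0 i ascii
    for ((i = 0; i < ${#col}; i++)); do
        printf -v ascii '%d' "'${col:i:1}"   # character code, e.g. A -> 65
        val=$(( val * 26 + ascii - 64 ))     # shift previous digits, add A=1 .. Z=26
    done
    echo "$val"
}

for c in A C Z AA AB; do
    printf '%s -> %s\n' "$c" "$(col_to_num "$c")"
done
```

This prints 1, 3, 26, 27 and 28 for the five inputs, so multi-letter columns continue after Z the way a spreadsheet does.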

Related

Using Bash shell, I am attempting to print only the lines not divisible by 3 from an input file

I am attempting to print only the lines whose line number is not divisible by 3 from an input file. The issue is that when I feed an input file into my shell script with the command bash script.sh < input.txt, I get "line 7: [: too many arguments" many times. How can I print only the lines whose line number is not divisible by 3?
line_number=1;
while read line
do
if [ $line_number % 3 -ne 0 ];
then
echo "$line"
fi
let "line_number += 1"
done
The input file, named input.txt, contains the following:
a
b
c
d
e
f
g
h
i
j
k
l
m
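You are close, but you cannot do arithmetic in [ ... ]: in [ $line_number % 3 -ne 0 ] the words %, 3, -ne and 0 are all passed to [ as separate arguments, hence "too many arguments". Use bash's arithmetic evaluation (( ... )) instead (a bash extension; the POSIX spelling is [ $((line_number % 3)) -ne 0 ]). Within (( ... )) you do not need to prefix the variable name with '$' to dereference it.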
In your case you can do:
#!/bin/bash
[ -z "$1" ] && { ## validate one argument given for filename
printf "error: filename required.\n" >&2
exit 1
}
[ -s "$1" ] || { ## validate file exists and is non-empty
printf "error: file '%s' missing or empty.\n" "$1" >&2
exit 1
}
line_number=1 ## initialize line number
## loop reading each line (protect against a missing final newline)
while read -r line || [ -n "$line" ]
do
if (( line_number % 3 != 0 )) ## check modulo of line number
then
echo "$line"
fi
((line_number++)) ## increment line number
done < "$1"
Example Use/Output
With your example input in file
$ example.sh file
a
b
d
e
g
h
j
k
m
Look things over and let me know if you have questions.
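As a small side-by-side illustration of the two kinds of test, here is a sketch (the function name keep_line is made up for this example):

```shell
#!/bin/bash
# keep_line N: hypothetical helper; succeeds when N is not a multiple of 3.
keep_line() {
    (( $1 % 3 != 0 ))
}

n=4
# [ $n % 3 -ne 0 ] would hand the five words 4 % 3 -ne 0 to [ and fail.
# Expanding the arithmetic first leaves an ordinary integer comparison:
if [ $(( n % 3 )) -ne 0 ]; then echo "[ ]:   line $n is printed"; fi
# The bash-only form evaluates the expression directly:
if keep_line "$n"; then echo "(( )): line $n is printed"; fi
```

Both tests succeed for n=4 and both would fail for n=3 or n=6.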

How can I highlight given values in a generated numeric sequence?

I often receive unordered lists of document IDs. I can sort and print them easily enough, but I'd like to print a line for each available document and show an asterisk (or anything really, just to highlight) next to all values in the given list.
Such as ...
$ ./t.sh "1,4,3" 5
1*
2
3*
4*
5
$
The first parameter is the unordered list, and the second is the total number of documents.
If by "available document" you mean an existing file on disk, then, assuming there are 5 documents in total and you want to check for 1, 4 and 3, the following script produces sorted output and marks each listed ID that exists as a file.
#!/bin/bash
#Store the original IFS
ORGIFS=$IFS
#Now set the Internal Field Separator to a comma
IFS=","
###Identify which elements of the array we do have and store the results
### in a separate array
#Begin a loop to process each array element
for X in ${1} ; do
if [[ -f ${X} ]] ; then
vHAVE[$X]=YES
fi
done
#Now restore IFS
IFS=$ORGIFS
#Process the sequence of documents, starting at 1 and ending at $2.
for Y in $(seq 1 1 $2) ; do
#Check if the sequence exists in our inventoried array and mark accordingly.
if [[ ${vHAVE[$Y]} == YES ]] ; then
echo "$Y*"
else
echo "$Y"
fi
done
Returns the result:
rtcg@testserver:/temp/test# ls
rtcg@testserver:/temp/test# touch 1 3 4
rtcg@testserver:/temp/test# /usr/local/bin/t "1,4,3" 5
1*
2
3*
4*
5
The following code works for me on your example. It proceeds in three steps:
1. Generate a sequence of the length given by the user.
2. Split the first argument of the script into an array.
3. Use the function contains to check whether each number of the sequence is in that array.
I don't check the number of arguments; you should do that for a more robust script.
#!/bin/bash
function contains() {
local n=$#
local value=${!n}
for ((i=1;i < $#;i++)) {
if [ "${!i}" == "${value}" ]; then
echo "y"
return 0
fi
}
echo "n"
return 1
}
IFS=', ' read -r -a array <<< "$1"
for i in $(seq "$2"); do
if [ "$(contains "${array[@]}" "${i}")" == "y" ]; then
echo "${i}*"
else
echo "${i}"
fi
done
You can use parameter substitution to build an extended pattern that can be used to match document numbers to the list of documents to mark.
#!/bin/bash
# 1,4,3 -> 1|4|3
to_mark=${1//,/|}
for(( doc=1; doc <= $2; doc++)); do
# @(1|4|3) matches 1, 4 or 3
printf "%s%s\n" "$doc" "$( [[ $doc = @($to_mark) ]] && printf "*" )"
done
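Another way to express the same membership test is an associative array used as a set. This is a different technique from the pattern match above, shown only as a sketch: the function name mark_docs is hypothetical, and declare -A style arrays need bash 4 or later.

```shell
#!/bin/bash
# mark_docs LIST TOTAL: hypothetical helper; marks every ID from LIST with "*".
mark_docs() {
    local -A wanted=()                   # associative array used as a set
    local ids id doc
    IFS=, read -ra ids <<< "$1"          # "1,4,3" -> (1 4 3)
    for id in "${ids[@]}"; do
        [[ -n $id ]] && wanted[$id]=1    # skip empty entries such as "1,,3"
    done
    for (( doc = 1; doc <= $2; doc++ )); do
        if [[ -n ${wanted[$doc]} ]]; then
            echo "$doc*"
        else
            echo "$doc"
        fi
    done
}

mark_docs "1,4,3" 5
```

The lookup cost per document does not depend on the length of the list, which may matter for long ID lists.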

Bash/Awk - Store in a variable part of one argument

For a given script I can supply one argument that has the following form:
-u[number][letter][...]
Examples: -u2T34T120F -u1T2T10F
Letters are either T or F, and the number is an integer number, which can be up to 999.
I would like to write a loop where, in each iteration, the number is stored in variable "a" and the corresponding letter in variable "b". The loop goes through all the number-letter pairs in the argument.
For the first example, the argument is -u2T34T120F the iterations would be:
First: a=2 b=T
Second: a=34 b=T
Third: a=120 b=F
End of loop
Any suggestion is most welcome.
Here's one way to do it with GNU awk:
<<<"2T34T120F" \
awk -v RS='[TF]' 'NF { printf "a: %3d b: %s\n", $0, RT }'
Output:
a: 2 b: T
a: 34 b: T
a: 120 b: F
To use this in a bash while-loop do something like this:
<<<"2T34T120F" \
awk 'NF { print $0, RT }' RS='[TF]' |
while read a b; do
echo Do something with $a and $b
done
Output:
Do something with 2 and T
Do something with 34 and T
Do something with 120 and F
$ var='-u2T34T120F'
$ a=($(grep -o '[0-9]*' <<< "$var"))
$ b=($(grep -o '[TF]' <<< "$var"))
$ echo ${a[0]} ${a[1]} ${a[2]}
2 34 120
$ echo ${b[0]} ${b[1]} ${b[2]}
T T F
How about this (GNU sed is needed for the \n in the replacement):
kent$ while IFS=' ' read -r a b; do printf 'we have a:%s,b:%s\n---\n' "$a" "$b"; done <<< "$(echo '-u2T34T120F' | sed 's/^-u//;s/[TF]/ &\n/g')"
we have a:2,b:T
---
we have a:34,b:T
---
we have a:120,b:F
---
A more readable version:
while IFS=' ' read -r a b
do
printf 'we have a:%s,b:%s\n---\n' "$a" "$b"
done <<< "$(echo '-u2T34T120F' | sed 's/^-u//;s/[TF]/ &\n/g')"
You can use parameter expansion in bash:
#! /bin/bash
set -- -u2T34T120F # Set the $1.
string=${1#-u} # Remove "-u".
while [[ $string ]] ; do
a=${string%%[FT]*} # Everything before the first F or T.
string=${string#$a} # Remove the $a from the beginning of the string.
b=${string%%[0-9]*} # Everything before the first number.
string=${string#$b} # Remove the $b from the beginning of the string.
echo $a $b
done
Or, using the same technique, but with arrays:
a=(${string//[TF]/ }) # Remove letters.
b=(${string//[0-9]/ }) # Remove numbers.
for (( i=0; i<${#a[@]}; i++ )) ; do
echo ${a[i]} ${b[i]}
done
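The same peeling idea can also be written with bash's regex operator, which additionally checks the format described in the question (one to three digits followed by T or F). This is a sketch only; the function name parse_pairs and the output format are assumptions made for the example.

```shell
#!/bin/bash
# parse_pairs ARG: hypothetical helper; prints one "a=<number> b=<letter>" line per pair.
parse_pairs() {
    local rest=${1#-u} a b
    local re='^([0-9]{1,3})([TF])(.*)$'
    # Each pass peels one number-letter pair off the front of $rest.
    while [[ $rest =~ $re ]]; do
        a=${BASH_REMATCH[1]}
        b=${BASH_REMATCH[2]}
        rest=${BASH_REMATCH[3]}
        echo "a=$a b=$b"
    done
    [[ -z $rest ]]   # non-zero exit status if something unparsable is left over
}

parse_pairs -u2T34T120F
```

For the first example from the question this prints a=2 b=T, a=34 b=T and a=120 b=F on separate lines, and the function returns a non-zero status for input such as -u2X.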

Bash Select position of array element by args

I want to write a bash script that returns the position of an element in an array, given the element as an argument. This is the code I have so far:
#!/bin/bash
args=("$@")
echo ${args[0]}
test_array=('AA' 'BB' 'CC' 'DD' 'EE')
echo $test_array
elem_array=${#test_array[@]}
for args in $test_array
do
echo
done
Finally I should have output like:
$ script.sh DD
4
#!/bin/bash
A=(AA BB CC DD EE)
for i in "${!A[@]}"; do
if [[ "${A[i]}" = "$1" ]]; then
echo "$i"
fi
done
Note the "${!A[@]}" notation that gives the list of valid indexes in the array. In general you cannot just go from 0 to "${#A[@]}" - 1, because the indexes are not necessarily contiguous. There can be gaps in the index range if there were gaps in the array element assignments or if some elements have been unset.
The script above will output all indexes of the array for which its content is equal to the first command line argument of the script.
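A short demonstration of such a gap, as a sketch using the same array with one element unset:

```shell
#!/bin/bash
A=(AA BB CC DD EE)
unset 'A[1]'                 # leaves a gap: the indexes are now 0 2 3 4
echo "count:   ${#A[@]}"     # number of elements, now 4
echo "indexes: ${!A[*]}"     # the indexes that actually exist
# Counting from 0 to count-1 would visit the empty slot 1 and never reach 4,
# whereas iterating over the index list visits exactly the existing elements.
for i in "${!A[@]}"; do
    echo "$i=${A[i]}"
done
```

The loop prints 0=AA, 2=CC, 3=DD and 4=EE.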
EDIT:
In your question, you seem to want the result as a one-based array index. In that case you can just increment the result by one:
#!/bin/bash
A=(AA BB CC DD EE)
for i in "${!A[@]}"; do
if [[ "${A[i]}" = "$1" ]]; then
let i++;
echo "$i"
fi
done
Keep in mind, though, that this index will have to be decremented before being used with a zero-based array.
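If the lookup is needed in more than one place, it can be wrapped in a function that also reports "not found" through its exit status. This is a sketch; index_of is a hypothetical name, and the position it prints is the one-based position among the elements passed in, which differs from the array index when the array is sparse.

```shell
#!/bin/bash
# index_of NEEDLE ELEMENT...: hypothetical helper; prints the one-based position
# of the first matching element, or returns 1 if there is no match.
index_of() {
    local needle=$1 pos=0 item
    shift
    for item in "$@"; do
        (( ++pos ))
        if [[ $item == "$needle" ]]; then
            echo "$pos"
            return 0
        fi
    done
    return 1
}

test_array=(AA BB CC DD EE)
index_of DD "${test_array[@]}"    # prints: 4
index_of ZZ "${test_array[@]}" || echo "ZZ not found"
```

Quoting "$needle" on the right-hand side of == forces a literal comparison, so elements containing spaces or glob characters are matched as-is.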
Trying to avoid complex tools:
test_array=('AA' 'BB' 'CC' 'D D' 'EE')
OLD_IFS="$IFS"
IFS="
"
element=$(grep -n '^D D$' <<< "${test_array[*]}" | cut -d ":" -f 1)
IFS="$OLD_IFS"
echo $element
However, it consumes 2 processes. If we allow ourselves sed, we could do it with a single process:
test_array=('AA' 'BB' 'CC' 'D D' 'EE')
OLD_IFS="$IFS"
IFS="
"
element=$(sed -n -e '/^D D$/=' <<< "${test_array[*]}")
IFS="$OLD_IFS"
echo $element
Update:
As pointed out by thkala in the comments, this solution is broken in three cases. Do not use it if:
You want a zero-based index.
You have newlines in your array elements.
You have a sparse array, or keys other than integers.
Loop over the array and keep track of the position.
When you find the element matching the input argument, print out the position of the element. You need to add one to the position, because arrays have zero-based indexing.
#!/bin/bash
# bash is required: arrays and (( )) are not POSIX sh features
arg=$1
echo "$arg"
test_array=('AA' 'BB' 'CC' 'DD' 'EE')
element_count=${#test_array[@]}
index=0
while [ $index -lt $element_count ]
do
if [ "${test_array[index]}" = "$arg" ]
then
echo $((index+1))
break
fi
((index++))
done
Without loop:
#!/bin/bash
index() {
local IFS=$'\n';
echo "${*:2}" | awk '$0 == "'"${1//\"/\\\"}"'" { print NR-1; exit; }'
}
array=("D A D" "A D" bBb "D WW" D "\" D \"" e1e " D " E1E D AA "" BB)
element=${array[5]}
index "$element" "${array[@]}"
Output:
5
