how to parse a row after reading it using unix shell - shell

The first row of my CSV file is a header row. I need to read this first row and then parse it to see whether it has the elements I'm looking for.
The first row has 4 elements: 1. HDR 2. today's date 3. From date 4. To date.
Here is the code I used to get the first row.
read -r header < "1" -- this gave me the first row in the header variable.
I tried to read this 'header' variable to split the row further.
read f1 f2 f3 < “$header”
echo "OS is : $f1"
echo "Company is: $f2"
echo "Value is : $f3"
I'm getting no values displayed. I think the reason could be that 'header' is not coming in as a string.
I'm new to unix shell scripting. Please help.

read -r header < "1"
No. You are reading from a file named 1 (which probably does not exist, so read is failing) because < is used for redirection. Try typing that very command in a terminal, you'll get an error ("no such file or directory").
read f1 f2 f3 < “$header”
This is wrong. If the header shell variable contains 1 2 3, you are reading from the file named 1 2 3 (a five-character file path with two spaces inside), because < is used for redirection.
Consider using awk to process files made of lines containing several columns.
Consider starting your (executable) shell script with #!/bin/bash -vx (to get traces) during the debugging phase. Spend a few hours reading some shell scripting tutorial and then the bash reference manual.
Be sure to understand how shells work, notably their globbing. Be aware that every program (e.g. started by some shell) is started by execve(2) (done in some process, e.g. your shell process).
BTW, before the read f1 f2 f3 < “$header” you might (temporarily) add debugging outputs, e.g.
echo "header=" "$header" > /dev/stderr
in your shell script, to understand what is going on.
(You really should spend days in reading more about shells)
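As a rough sketch of one way to do what the question asks in bash: redirect the real file into the first read, then feed the header string (not a file name) to a second read with a here-string. The sample file contents, the comma separator, and the variable names tag, today, from_date and to_date are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sample file; the real CSV name and layout are assumptions.
csv_file=$(mktemp)
printf '%s\n' 'HDR,2024-01-15,2024-01-01,2024-01-31' 'row1,a,b,c' > "$csv_file"

# Read only the first line of the file into "header".
IFS= read -r header < "$csv_file"

# Split the header on commas; <<< passes the string itself to read,
# whereas < would treat it as a file name.
IFS=, read -r tag today from_date to_date <<< "$header"

echo "Tag is       : $tag"
echo "Today is     : $today"
echo "From date is : $from_date"
echo "To date is   : $to_date"

rm -f "$csv_file"
```

Setting IFS only on the read command line keeps the change local to that one command.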

Related

How to search a string of text and insert it as a variable using sed?

I need to update potentially hundreds of configuration files for a program by adding new lines to text files. I will be adding additional properties, such as background color, to these files and want to automate the process with bash. All of these properties are contained in ".m" files. It's essentially updating the properties of widgets on GUIs. Each object's properties in a GUI are labeled with an object type followed by a name. One example of an object is called a Form.
The problem is that the name that follows the object type is different for each object, so I need to build the new line from the name used by the other properties in each section of the .m file. For example, one file has two Form objects. The section for one is called "*FO_mm" while the second object's section is named "*FO_test_area". After the name comes an extension for the property being specified, such as "*FO_mm.class:". While the properties each object has tend to vary, I found that all objects share a property called ".parent", so I am using that as a search reference. I want to use the sed command to add a line after the .parent line with the new property, in this case background color. So the idea is to search for a string that starts with "*FO_" and ends with ".parent", with everything in between being different for each section. I want to use a loop to capture the string preceding ".parent" as a variable and attach it to the beginning of the new property line so it matches the current section. Here is my current script:
# The top level directory
script_dir="/project/guis/"
# The extension to look for
file_ext="*.m"
fileList=$(find $script_dir -type f -name "$file_ext")
declare -a file_list
readarray -t file_list < <(printf '%s\n' "$fileList")
cd $script_dir
# Loop through each m file
for m_file in ${file_list[@]}; do
    var1=($(grep '*FO_.*.parent:' $m_file))
    declare -a var_list
    readarray -t var_list < <(printf '%s\n' "$var1")
    for i in ${var_list[@]}; do
        echo $i
        sed -i "/^*FO_.*.parent:.*/a\$i.background: #2b3856 " $m_file
    done
done
When I run it, the script adds the line "$i.background: #2b3856" below the .parent line. And the "echo $i" line returns "*FO_mm.parent: FO_mm". So there are several problems.
The value of the variable is not being substituted into the sed statement.
As the echo states, only the first section "*FO_mm" is being saved as a variable, which means the second section "*FO_test_area" is not being implemented.
I only want the object name to be stored and placed into the new line. So the result should give me for the first section "*FO_mm.background: #2b3856" with everything from .background on being tacked on by the last part of the sed statement. Since I am still fairly new to bash and especially sed, I have no idea how to strip the variable down to just the object name.
Here is an example of what a single object section looks like prior to running the script:
*FO_test_area.class: Form
*FO_test_area.static: true
*FO_test_area.parent: FO_mm
*FO_test_area.resizePolicy: "resize_none"
And here is what this section looks like after running the WIP script:
*FO_test_area.class: Form
*FO_test_area.static: true
*FO_test_area.parent: FO_mm
$i.background: #33b342
*FO_test_area.resizePolicy: "resize_none"
It's a lot to describe, but I've hit a wall and I would really appreciate any help you can provide.
If ed is available/acceptable.
The script named script.ed (name it whatever you like).
g/^\*FO_.*\.parent:.*/t.\
s/^\(\*\).*parent: */\1/
s/$/.background: #33b342/
%p
Q
The g/^\*FO_.*\.parent:.*/ will match every line that starts with *FO_ and has .parent: somewhere after it. It will match both *FO_test_area.parent: and *FO_mm.class.parent:. You will have to make the regex more specific to match one particular *FO_.*\.parent: pattern and do a targeted search and replace/insert.
Here is a specific script for the *FO_test_area.parent:
g/^\*FO_test_area\.parent:.*/t.\
s/^\(\*\).*parent: */\1/
s/$/.background: #33b342/
%p
Q
To handle another pattern, modify the script above: add the new pattern before the %p line and put the rest of the substitutions after it.
Your sample file/data.
*FO_test_area.class: Form
*FO_test_area.static: true
*FO_test_area.parent: FO_mm
*FO_test_area.resizePolicy: "resize_none"
Running the script against your data/file. (file ending with a .m)
ed -s file.m < script.ed
Output
*FO_test_area.class: Form
*FO_test_area.static: true
*FO_test_area.parent: FO_mm
*FO_mm.background: #33b342
*FO_test_area.resizePolicy: "resize_none"
If you're satisfied with the output, next thing is to do the loop.
Doing some adjustment to your script. Instead of a nested for loop, the script is using a while + read loop and Process Substitution. See How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?
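#!/usr/bin/env bash
script_dir="/project/guis/"
file_ext="*.m"
# Feed each path printed by find to the loop, one per line
while IFS= read -r m_file; do
    # Only run ed on files that contain a matching .parent line
    if grep -q '^\*FO_.*\.parent:' "$m_file"; then
        ed -s "$m_file" < script.ed
    fi
done < <(find "$script_dir" -type f -name "$file_ext")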
Here script.ed is expected in the directory the script is run from. It can live anywhere as long as you give its correct absolute path, e.g.
/path/to/script.ed
If in-place editing is needed, change the Q to w inside the ed script.
Remove the %p line if the output does not need to go to stdout.
See:
GNU ed
POSIX ed
ed
ed in pdf
MirBSD ed
bash hackers wiki ed
Also your local man pages.
man 1p ed
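For comparison, here is a minimal sed sketch of the capture the question describes: keep the object name with a backreference and reuse it on the appended line. It is only an illustration, not the answer's method. The sample file contents (including the value "top") are assumptions, and it prints the result instead of editing in place.

```shell
#!/usr/bin/env bash
# Hypothetical sample standing in for one of the .m files.
m_file=$(mktemp)
cat > "$m_file" <<'EOF'
*FO_mm.class: Form
*FO_mm.parent: top
*FO_test_area.class: Form
*FO_test_area.static: true
*FO_test_area.parent: FO_mm
*FO_test_area.resizePolicy: "resize_none"
EOF

# \(\*FO_[^.]*\) captures the object name (everything before the first dot).
# In the replacement, & is the whole matched line; the backslash-newline
# starts a new line, and \1 puts the captured name in front of the new property.
result=$(sed 's/^\(\*FO_[^.]*\)\.parent:.*/&\
\1.background: #2b3856/' "$m_file")

printf '%s\n' "$result"
rm -f "$m_file"
```

Because the name comes from the matched line itself, no shell variable has to be substituted into the sed expression, and every *FO_ section in the file is handled in one pass.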

kickstart five concurrent processes in bash

I have a folder named datafolder which contains five csv files aa.csv ab.csv ac.csv ad.csv ae.csv Each csv file contains data from an excel sheet in the format: date, product type, name, address etc. and I am only interested in the second column which is named product. Basically what I want to happen is for the jobmaster script to count the number of files in datafolder and then to start a map process for each individual file. I have the following scripts:
The jobmaster script runs without problems, however once the map script starts, only the first echo mapping $1 is displaying and the process is stuck in an infinite loop (my guess). When I run the ps command I expect to see 5 map.sh running, however there are none.
I suspect you missed an input redirection in map.sh:
file=$1
echo "mapping $file"
while IFS="," read -r value1 product remainder; do
    # ...
done < "$file"
#    ^^^^^^^^^ redirect this file to the loop's standard input so `read` can consume it
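Since the original scripts are not shown, the following is only a hedged sketch of the overall pattern: one background job per CSV file, each reading its own file through a redirection, with wait holding the parent until all jobs finish. The function name map, the sample rows, and the .out files are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for datafolder with two small CSV files.
datafolder=$(mktemp -d)
printf '%s\n' '2020-01-01,apple,Ann,1 Main St' '2020-01-02,pear,Bob,2 Oak Ave' > "$datafolder/aa.csv"
printf '%s\n' '2020-01-03,plum,Cy,3 Elm Rd' > "$datafolder/ab.csv"

# Stand-in for map.sh: print the second column (product) of one file.
map() {
    local file=$1
    while IFS=, read -r value1 product remainder; do
        printf '%s\n' "$product"
    done < "$file"   # without this redirection, read would wait on the terminal
}

# Start one background job per file, then wait for all of them.
for f in "$datafolder"/*.csv; do
    map "$f" > "$f.out" &
done
wait

products=$(cat "$datafolder"/*.out | sort)
printf '%s\n' "$products"
rm -rf "$datafolder"
```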

Comparing two sets of variables line by line in unix, code only prints out the very last line

this is my first stackoverflow question, regarding bash scripting. I am a beginner in this language, so be kind with me.
I am trying to write a comparison script. I tried to store all the outputs into variables, but only the last one is stored.
Example code:
me:1234567
you:2345678
us:3456789
My code:
#!/bin/bash
while read -r forName forNumber
do
aName="$forName"
echo "$aName"
aNumber="$forNumber"
echo "$aNumber"
done < "exampleCodeFile.txt"
echo "$aNumber"
The first time through, everything is printed fine. However, the second echo (after the loop) prints only "3456789", not all the numbers again. The same happens with $aName. This is a problem because I have another file, in which I stored a bunch of numbers to compare $aNumber with; I read it with the same method into a variable called $aMatcher, containing:
aMatcher:
1234567
2345678
3456789
So if i tried to run a comparison:
if [ "$aNumber" == "$aMatcher" ]; then
echo "match found!"
fi
Expected output (with bash -x "scriptname"):
'['1234567 == 1234567']'
echo "match found!"
Actual output (with bash -x "scriptname"):
'['3456789 == 3456789']'
echo "match found!"
Of course my end product should list all the matches, but I wish to solve my current issue before attempting anything else. Thanks!
When you run your following code
aNumber="$forNumber"
You are over-writing the variable $aNumber for every line of the file exampleCodeFile.txt rather than appending.
If you really want the values to be appended, change the above line to
aNumber="$aNumber $forNumber"
And while matching with $aMatcher, you again have to use a for/while loop to iterate through every value in $aNumber and $aMatcher.
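One way to sketch that in bash is to collect the numbers in an array instead of a space-separated string, then loop over both sets. This assumes the name and number are separated by a colon, as in the sample data; the file contents and the matches array are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical copies of the two input files from the question.
names_file=$(mktemp)
matcher_file=$(mktemp)
printf '%s\n' 'me:1234567' 'you:2345678' 'us:3456789' > "$names_file"
printf '%s\n' '1234567' '9999999' '3456789' > "$matcher_file"

# Collect every number; += appends instead of overwriting.
numbers=()
while IFS=: read -r forName forNumber; do
    numbers+=("$forNumber")
done < "$names_file"

# Compare each matcher value against every collected number.
matches=()
while IFS= read -r aMatcher; do
    for aNumber in "${numbers[@]}"; do
        if [ "$aNumber" = "$aMatcher" ]; then
            matches+=("$aNumber")
            echo "match found: $aNumber"
        fi
    done
done < "$matcher_file"

rm -f "$names_file" "$matcher_file"
```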

Extracting lines with specific character count

I have a Python script that pulls URLs from pastebin.com/archive, which has links to pastes (these have 8 random characters after pastebin.com in the URL). My current output is a .txt file with the data below in it. I only want the links to pastes (example: http://pastebin.com///Y5JhyKQT) and not links to other pages such as pastebin.com/tools. This is so I can set wget to pull each individual paste.
The only way I can think of doing this is writing a bash script to count the number of characters in each line and only keep lines with 30 characters exactly (this is the length of the URLs linking to pastes).
I have no idea how I'd go about implementing something like this using grep or awk, perhaps using a while do loop? Any help would be appreciated!
http://pastebin.com///tools
http://pastebin.com//top.location.href
http://pastebin.com///trends
http://pastebin.com///Y5JhyKQT <<< I want to keep this
http://pastebin.com//=
http://pastebin.com///>
From the sample you posted it looks like all you need is:
grep -E '/[[:alnum:]]{8}$' file
or maybe:
grep -E '^.{30}$' file
If that doesn't work for you, explain why and provide a better sample.
This is the algorithm:
Read one line at a time (or split the input on newline characters).
Count the characters in the line, or store the line in a variable and take its length.
Process only those lines whose length is exactly the count you want.
Python has functions both for reading a line and for getting the character count of a string.
#!/usr/bin/env zsh
while read aline
do
    if [[ ${#aline} == 30 ]]; then
        #do something
    fi
done
The ${#aline} length expansion is documented in the bash man pages under the "Parameter Expansion" section.
EDIT: this solution is zsh-only
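A bash variant of the same length check might look like the sketch below. The sample lines are taken from the question, the 30-character target is the asker's own figure, and the kept array is a name chosen for illustration.

```shell
#!/usr/bin/env bash
# Sample lines copied from the question.
urls=$(mktemp)
printf '%s\n' \
    'http://pastebin.com///tools' \
    'http://pastebin.com///trends' \
    'http://pastebin.com///Y5JhyKQT' > "$urls"

kept=()
while IFS= read -r aline; do
    # ${#aline} expands to the number of characters in the line.
    if [[ ${#aline} -eq 30 ]]; then
        kept+=("$aline")
    fi
done < "$urls"

printf '%s\n' "${kept[@]}"
rm -f "$urls"
```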

batch job submission upon completion of job

I would like to write a script to execute the steps outlined below. If someone can provide simple examples on how to modify files and search through folders using a script (not necessarily solving my problem below), I will greatly appreciate it.
submit job MyJob in currentDirectory using myJobShellFile.sh to a queue
upon completion of MyJob, goto to currentDirectory/myJobDataFolder.
In myJobDataFolder, there are folders
myJobData.0000 myJobData.0001 myJobData.0002 myJobData.0003
I want to find the maximum number maxIteration of all the listed folders. Here it would be maxIteration=0003.
In file myJobShellFile.sh, at the last line says
mpiexec ./main input myJobDataFolder
I want to append this line to
'mpiexec ./main input myJobDataFolder 0003'
I want to submit MyJob to the queue while maxIteration < 10
Upon completion of MyJob, find the new maxIteration, change this number in myJobShellFile.sh, and go to step 4.
I think people typically write Python scripts to do this stuff, but I am having a hard time finding out how. I probably don't know the correct terminology for this procedure. I am also aware that the script will vary slightly depending on the queuing system, but any help will be greatly appreciated.
Quite a few aspects of your question are unclear, such as the meaning of “submit job MyJob in currentDirectory using myJobShellFile.sh to a queue” and “append this line to 'mpiexec ./main input myJobDataFolder 0003'”, how you detect when a job is done, the relevant parts of myJobShellFile.sh, and some other details. If you can list the specific shell commands you use in each iteration of job submission, then you can post a better question, with a bash tag instead of python.
In the following script, I put a ### at the end of any line where I am guessing what you are talking about. Lines ending with ### may be irrelevant to whatever you actually do, or may be pseudocode. Anyway, the general idea is that the script is supposed to do the things you listed in your items 1 to 5. This script assumes that you have modified myJobShellFile.sh to say
mpiexec ./main input $1 $2
instead of
mpiexec ./main input
because it is simpler to use parameters to modify what you tell mpiexec than it is to keep modifying a shell script. Also, it seems to me you would want to increment maxIter before submitting next job, instead of after. If so, remove the # from the t=$((1$maxIter+1)); maxIter=${t#1} line. Note, see the “Parameter Expansion” section of man bash re expansion of the ${var#txt} form, and the “Arithmetic Expansion” section re $((expression)) form. The 1$maxIter and similar forms are used to change text like 0018 (which is not a valid bash number because 8 is not an octal digit) to 10018.
#!/bin/bash
./myJobShellFile.sh MyJob ###
maxIter=0
while true; do
    waitforjobcompletion ###
    cd ./myJobDataFolder
    maxFile=$(ls -d myJobData.* | tail -1)   # Lexically last folder name
    maxIter=${maxFile#myJobData.}            # Get max extension
    # If you want to increment maxIter, uncomment next line
    # t=$((1$maxIter+1)); maxIter=${t#1}
    cd ..
    if [[ 1$maxIter -lt 11000 ]] ; then
        ./myJobShellFile.sh myJobDataFolder $maxIter
    else
        break
    fi
done
Notes: (1) To test with smaller runs than 1000 submissions, replace 11000 by 10000+n; for example, to do 123 runs, replace it with 10123. (2) In writing the above script, I assumed that not-previously-known numbers of output files appear in the output directory from time to time. If instead exactly one output file appears per run, and you just want to do one run per value for the values 0000, 0001, 0002, ..., 0999, 1000, then use a script like the following. (For testing with a smaller number than 1000, replace 1000 with (e.g.) 0020. The leading zeroes in these numbers tell bash to pad the generated numbers with leading zeroes.)
#!/bin/bash
for iter in {0000..1000}; do
    ./myJobShellFile.sh myJobDataFolder $iter
    waitforjobcompletion ###
done
(3) If the system has a command that sleeps while it waits for a job to complete on the supercomputing resource, it is reasonable to use that command in place of waitforjobcompletion in the above scripts. Otherwise, if the system has a command jobisrunning that returns true if a job is still running, replace waitforjobcompletion with something like the following:
while jobisrunning ; do sleep 15; done
This will run the jobisrunning command; if it returns true, the shell will sleep for 15 seconds and then retest. Here is an example that illustrates waiting for a file to appear and then for it to go away:
while [ ! -f abc ]; do sleep 3; echo no abc; done
while ls abc >/dev/null 2>&1; do sleep 3; echo an abc; done
The second line's test could be [ -f abc ] instead; I showed a longer example to illustrate how to suppress output and error messages by routing them to /dev/null. (4) To reverse the sense of a while statement's test, replace the word while with until. For example, while [ ! -f abc ]; ... is equivalent to until [ -f abc ]; ....
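The folder lookup and the 1$maxIter arithmetic described earlier can be tried in isolation with a small sketch like this one. The temporary directory and the nextIter variable are assumptions for illustration; the four folder names come from the question.

```shell
#!/usr/bin/env bash
# Hypothetical data folder containing the four sub-folders from the question.
work=$(mktemp -d)
for n in 0000 0001 0002 0003; do
    mkdir "$work/myJobData.$n"
done

# Pick the lexically last folder name; zero padding makes that the largest.
maxFile=$(cd "$work" && ls -d myJobData.* | tail -1)
maxIter=${maxFile#myJobData.}   # strip the prefix, leaving 0003

# Prefix a 1 so bash does not parse the padded number as octal,
# add one, then strip the leading 1 again.
t=$((1$maxIter + 1))
nextIter=${t#1}

echo "max=$maxIter next=$nextIter"
rm -rf "$work"
```

The same prefix trick keeps values such as 0008 or 0018 from triggering an octal parsing error in arithmetic expansion.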

Resources