Bash Script Help Needed - bash

So I'm working on an assignment and I'm very close to getting it. Just having issues with the last part. Here is the whole problem I guess, just so you know what I'm trying to do -
Write a shell script called make_uid which creates user login names given a file containing users' full names. Your script needs to read the newusers file, and for each name in the file create a login name which consists of the first character of the user's first name and up to 7 characters of their last name. If the last name is less than seven characters, use the entire last name. If the user only has one name, use whatever is provided as a name (in the newusers file) to generate an 8-character-long login name. Note: login names need to be all lower case!
Once you have created a login name, you need to check the passwd file to make sure that the login name which you just created does not exist. If the name exists, chop off the last character of the name that you created, and add a digit (starting at 1) and check the passwd file again. Repeat this process until you create a unique user login name. Once you have a unique user name, append it to the passwd file, and continue processing the newusers file.
This is my code so far. At this point, it makes a full passwd file with all of the login names. I'm just having trouble with the final step of sorting through the list and editing duplicates accordingly.
#!/bin/bash
#Let's make some login names!
declare -a first
declare -a last
declare -a password
file=newusers
first=( $(cat $file | cut -b1 | tr "[:upper:]" "[:lower:]" | tr '\n' ' ') )
for (( i=0; i<${#first[@]}; i++)); do
echo ${first[i]} >> temp1
done
last=( $(awk '{print $NF}' $file | cut -b1-7 | tr "[:upper:]" "[:lower:]") )
for (( i=0; i<${#last[@]}; i++)); do
echo ${last[i]} >> temp2
done
paste -d "" temp1 temp2 >> passwd
sort -o passwd passwd
more passwd
rm temp1 temp2

Well, I probably shouldn't be answering a homework assignment but maybe it will help you learn.
#!/bin/bash
infile=./newusers
outfile=./passwd
echo -n "" > $outfile
cat $infile | while read line; do
read firstName lastName < <(echo $line)
if [ -z "$lastName" ]; then
login=${firstName:0:8}
else
login=${firstName:0:1}${lastName:0:7}
fi
digit=1
while fgrep -qx "$login" $outfile; do
login=${login%?}$digit
let digit++
done
echo $login >> $outfile
done
There may be some way to do the fgrep check in a single command instead of a loop, but this is the most readable. (The -x flag makes fgrep match whole lines, so jsmith doesn't falsely collide with jsmith1.) Also, your problem statement didn't say what to do if the name is less than 8 characters, so this solution doesn't address that and will produce login names that are short if the names are short.
Edit: The fgrep loop assumes that there will be fewer than 10 duplicates. If not, you need to make it a bit more robust:
lastDigit="?"
nextDigit=1
while fgrep -qx "$login" $outfile; do
login=${login%$lastDigit}$nextDigit
let lastDigit=nextDigit
let nextDigit++
done
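If you would rather not rescan the file with fgrep on every iteration, here is a rough in-memory sketch, assuming bash 4+ for the associative array and the ${login,,} lowercasing (untested against your grader):
#!/bin/bash
# Sketch: track issued names in an associative array instead of grepping the file.
declare -A seen
infile=./newusers
outfile=./passwd
> $outfile
while read -r firstName lastName; do
if [ -z "$lastName" ]; then
login=${firstName:0:8}
else
login=${firstName:0:1}${lastName:0:7}
fi
login=${login,,}   # lower-case the name (bash 4+)
candidate=$login
digit=1
# like the fgrep loop above, this assumes fewer than 10 duplicates
while [[ -n ${seen[$candidate]} ]]; do
candidate=${login%?}$digit   # drop the last character, append a counter
((digit++))
done
seen[$candidate]=1
echo $candidate >> $outfile
done < $infile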

Add all user names to another file before adding the digit. Use fgrep -xc theusername thisotherfile; this prints a count. Append the count to the login name if it's not 0.
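A sketch of what that suggestion might look like (untested; check the count before recording the base name, so the first occurrence stays unchanged):
touch thisotherfile   # make sure the file exists before the first check
count=$(fgrep -xc "$login" thisotherfile)   # -x matches whole lines, -c prints the match count
echo "$login" >> thisotherfile   # record the base name for future checks
if [ "$count" -ne 0 ]; then
login=${login%?}$count   # replace the last character with the count
fi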

Related

How to filter text data in bash more efficiently

I have a data file which I need to filter with a bash script, see the data example:
name=pencils
name=apples
value=10
name=rocks
value=3
name=tables
value=6
name=beds
name=cups
value=89
I need to group name/value pairs like so: apples=10. If the current line starts with name and the next line also starts with name, the first line should be omitted entirely. So the result file should look like this:
apples=10
rocks=3
tables=6
cups=89
I came up with this simple solution which works but is very slow; it takes 5 min to complete for a file with 2000 lines.
VALUES=$(cat input.txt)
for x in $VALUES; do
if [[ -n $(echo $x | grep 'name=') ]]; then
name=$(echo $x | sed "s/name=//")
elif [[ -n $(echo $x | grep 'value=') ]]; then
value=$(echo $x | sed "s/value=//")
echo "${name}=${value}" >> output.txt
fi
done
I'm aware that this kind of task is not very suitable for bash, but the script is already written and this is just a small part of it.
How can I optimize this task in bash?
Do not run any commands in subshells; it slows your script down a lot. You can do everything in the current shell.
#! /bin/bash
while IFS== read k v ; do
if [[ $k == name ]] ; then
name=$v
elif [[ $k == value ]] ; then
printf '%s=%s\n' "$name" "$v"
fi
done < input.txt > output.txt
There are three easy optimizations you can make that will greatly speed up the script without requiring a major rethink.
1. Replace for with while read
Loading input.txt into a string, and then looping over that string with for x in $VALUES is slow. It requires the whole file to be read into memory even though this task could be done in a streaming fashion, reading a line at a time.
A common replacement for for line in $(cat file) is while read line; do ... done < file. It turns out that loops are compound commands, and like the normal one-line commands we're used to, compound commands can have < and > redirections. Redirecting a file into a loop means that for the duration of the loop, stdin comes from the file. So if you call read line inside the loop then it will read one line each iteration.
while IFS= read -r x; do
if [[ -n $(echo $x | grep 'name=') ]]; then
name=$(echo $x | sed "s/name=//")
elif [[ -n $(echo $x | grep 'value=') ]]; then
value=$(echo $x | sed "s/value=//")
echo "${name}=${value}" >> output.txt
fi
done < input.txt
2. Redirect output outside loop
It's not just input that can be redirected. We can do the same thing for the >> output.txt redirection. Here's where you'll see the biggest speedup. When >> output.txt is inside the loop, output.txt must be opened and closed every iteration, which is crazy slow. Moving it outside means it only needs to be opened once. Much, much faster.
while IFS= read -r x; do
if [[ -n $(echo $x | grep 'name=') ]]; then
name=$(echo $x | sed "s/name=//")
elif [[ -n $(echo $x | grep 'value=') ]]; then
value=$(echo $x | sed "s/value=//")
echo "${name}=${value}"
fi
done < input.txt > output.txt
3. Shell string processing
One final improvement is to use faster string processing. Calling grep requires forking a subprocess every time just to do a simple string split. It'd be a lot faster if we could do the string splitting using just shell constructs. Well, as it happens that's easy now that we've switched to read. read can do more than read whole lines; it can also split on a delimiter from the variable $IFS (internal field separator).
while IFS='=' read -r key value; do
case "$key" in
name) name="$value";;
value) echo "$name=$value";;
esac
done < input.txt > output.txt
Further reading
BashFAQ/001 - How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?
This explains why I have IFS= read -r in the first two versions.
BashFAQ/024 - I set variables in a loop that's in a pipeline. Why do they disappear after the loop terminates? Or, why can't I pipe data to read?
cmd | while read; do ... done is another popular use of while read, but it has unique pitfalls; see the snippet after this list.
BashFAQ/100 - How do I do string manipulations in bash?
More in-shell string processing options.
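A quick demonstration of that BashFAQ/024 pitfall (the counter here assumes a multi-line input.txt):
count=0
cat input.txt | while IFS= read -r line; do
((count++))   # increments a copy inside the pipeline's subshell
done
echo $count   # still prints 0

count=0
while IFS= read -r line; do
((count++))   # runs in the current shell
done < input.txt
echo $count   # prints the real line count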
If you have performance issues, do not use bash at all. Use a text processing tool like, for instance, awk:
$ awk -F= '$1 == "value" {print name "=" $2} {name = $2}' data.txt
apples=10
rocks=3
tables=6
cups=89
Explanation: -F= defines the field separator as the character =. The first block is executed only if the first field of a line ($1) is equal to the string value. It prints variable name followed by the character = and the second field ($2). The second block is executed on each line and it stores the second field ($2) in variable name.
Normally, if your input resembles what you show, this should automatically skip the first line. Else, we can exclude it explicitly using a test on the NR variable, whose value is the line number, starting at 1:
awk -F= 'NR != 1 && $1 == "value" {print name "=" $2}
NR != 1 {name = $2}' data.txt
All this works on inputs like the one you show but not on inputs where you would have other types of lines or several value=... consecutive lines. If you really want to test that the name/value pair is on two consecutive lines we need something more. For instance, test if the first field is name and use another variable n to store the line number of the last encountered name=... line. With all these tests we can now put the 2 blocks in a slightly more intuitive order (but the opposite would work the same):
awk -F= 'NR != 1 && $1 == "name" {name = $2; n = NR}
NR != 1 && NR == n+1 && $1 == "value" {print name "=" $2}' data.txt
With awk there might be a more elegant solution but you can have:
awk 'BEGIN{RS="\n?name=";FS="\nvalue="} {if($2) printf "%s=%s\n",$1,$2}' inputs.txt
RS="\n?name=" says that the record separator is name=, optionally preceded by a newline
FS="\nvalue=" says that the field separator for each record is a newline followed by value=
if($2) says to only run the printf if the second field exists
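Run against the sample input, it produces the same grouped pairs:
$ awk 'BEGIN{RS="\n?name=";FS="\nvalue="} {if($2) printf "%s=%s\n",$1,$2}' inputs.txt
apples=10
rocks=3
tables=6
cups=89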

Download URLs from CSV into subdirectory given in first field

So I want to export my products into my new website. I have a CSV file with this data:
product id,image1,image2,image3,image4,image5
1,https://img.url/img1-1.png,https://img.url/img1-2.png,https://img.url/img1-3.png,https://img.url/img1-4.png,https://img.url/img1-5.png
2,https://img.url/img2-1.png,https://img.url/img2-2.png,https://img.url/img2-3.png,https://img.url/img2-4.png,https://img.url/img2-5.png
What I want to do is make a script that reads from that file, makes a directory named after the product id, downloads the images of the product and puts them inside their own folder (folder 1 => image1-image5 of product id 1, folder 2 => image1-image5 of product id 2, and so on).
I can make a normal text file instead of using the Excel format if it's easier to do. Thanks in advance.
Sorry I'm really new here. I haven't done the code yet because I'm clueless, but what I want to do is something like this:
for id in $product_id; do
mkdir $id && cd $id && curl -o $img1 $img2 $img3 $img4 $img5 && cd ..
done
Here is a quick and dirty attempt which should hopefully at least give you an idea of how to handle this.
#!/bin/bash
# skip the CSV header line, then turn the commas into spaces
tail -n +2 products.csv | tr ',' ' ' |
while read -r prod urls; do
mkdir -p "$prod"
# Potential bug: urls mustn't contain shell metacharacters
for url in $urls; do
wget -P "$prod" "$url"
done
done
You could equivalently do ( cd "$prod" && curl -O "$url" ) if you prefer curl; I generally do, though the availability of an option to set the output directory with wget is convenient.
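For the sake of illustration, the same loop written around curl might look like this (equally quick and dirty, untested):
#!/bin/bash
tail -n +2 products.csv | tr ',' ' ' |
while read -r prod urls; do
mkdir -p "$prod"
for url in $urls; do
# subshell so the cd does not leak into the next iteration
( cd "$prod" && curl -O "$url" )
done
done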
If your CSV contains quotes around the fields or you need to handle URLs which contain shell metacharacters (irregular spaces, wildcards which happen to match files in the current directory, etc; but most prominently & which means to run a shell command in the background) perhaps try something like
while IFS=, read -r prod url1 url2 url3 url4 url5; do
mkdir -p "$prod"
wget -P "$prod" "$url1"
wget -P "$prod" "$url2"
: etc
done < <(tail -n +2 products.csv)   # process substitution skips the header line
which (modulo the fixed quoting) is pretty close to your attempt.
Or perhaps switch to a less wacky input format, maybe generate it on the fly from the CSV with
awk -F , 'function trim (value) {
# Trim leading and trailing double quotes
sub(/^"/, "", value); sub(/"$/, "", value);
return value; }
NR > 1 { prod=trim($1);   # NR > 1 skips the header line
for(i=2; i<=NF; ++i) {
# print space-separated prod, url
print prod, trim($i) } }' products.csv |
while read -r prod url; do
mkdir -p "$prod"
wget -P "$prod" "$url"
done
which skips the header, splits the CSV into repeated lines with the same product ID and one URL each, with any CSV quoting removed, then just loops over that instead. mkdir with the -p option helpfully doesn't mind if the directory already exists.
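With the sample CSV from the question, that awk stage feeds the loop lines like:
1 https://img.url/img1-1.png
1 https://img.url/img1-2.png
1 https://img.url/img1-3.png
1 https://img.url/img1-4.png
1 https://img.url/img1-5.png
2 https://img.url/img2-1.png
...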
If you followed the good advice that @Aaron gave you, this code can help you. As you seem to be new to bash, I commented the code for better comprehension.
#!/bin/bash
# your csv file
myFile=products.csv
# number of data lines in the file (the first line is the header)
nLines=$(($(wc -l < $myFile) - 1))
echo "Total Lines=$nLines"
# loop over the data lines of the file
for i in `seq 1 $nLines`;
do
# the whole line (line i+1, because line 1 is the header)
line=$(sed -n $(($i+1))p $myFile)
# first column value
id=$(echo $line | awk -F "," '{print $1}')
# create the folder if it does not exist
mkdir $id 2>/dev/null
# number of fields in the line (field 1 is the id, the rest are image urls)
nImgs=$(echo $line | awk -F "," '{print NF}')
# go to the id folder
cd $id
# loop over the image url fields
for j in `seq 2 $nImgs`;
do
# getting the image url to download it
img=$(echo $line | cut -d "," -f $j)
echo "Downloading image $img";echo
# downloading the image
wget $img
done
# go back up
cd ..
done

How to process values from for loop in shell script

I have the below for loop in a shell script
#!/bin/bash
#Get the year
curr_year=$(date +"%Y")
FILE_NAME=/test/codebase/wt.properties
key=wt.cache.master.slaveHosts=
prop_value=""
getproperty(){
prop_key=$1
prop_value=`cat ${FILE_NAME} | grep ${prop_key} | cut -d'=' -f2`
}
#echo ${prop_value}
getproperty ${key}
#echo "Key = ${key}; Value="${prop_value}
arr=( $prop_value )
for i in "${arr[@]}"; do
echo $i | head -n1 | cut -d "." -f1
done
The output I am getting is as below.
test1
test2
test3
I want to plug the test2 value from the above results into the below script in place of 'ABCD'
grep test12345 /home/ptc/storage/'ABCD'/apache/$curr_year/logs/access.log* | grep GET > /tmp/test.access.txt
I tried all the options but could not succeed, as I am new to shell scripting.
Ignoring the many bugs elsewhere and focusing on the one piece of code you say you want to change:
for i in "${arr[@]}"; do
val=$(echo "$i" | head -n1 | cut -d "." -f1)
grep test12345 /dev/null "/home/ptc/storage/$val/apache/$curr_year/logs/access.log"* \
| grep GET
done > /tmp/test.access.txt
Notes:
Always quote your expansions. "$i", "/path/with/$val/"*, etc. (The * should not be quoted on the assumption that you want it to be expanded).
for i in $prop_value would have the exact same (buggy) behavior; using arr buys you nothing. If you want using arr to increase correctness, populate it correctly: read -r -a arr <<<"$prop_value" (see the sketch after these notes).
The redirection is moved outside the loop -- that way the second iteration through the loop doesn't overwrite the file written by the first one.
The extra /dev/null passed to grep ensures that its behavior is consistent regardless of the number of matches; otherwise, it would display filenames only if more than one matching log file existed, and not otherwise.
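For example, populating the array with read instead of an unquoted expansion (the sample value here is made up):
prop_value="test1.example.com test2.example.com test3.example.com"   # hypothetical value
read -r -a arr <<<"$prop_value"
printf '%s\n' "${arr[@]}"   # one element per line: test1.example.com, test2.example.com, test3.example.com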

Need to read the values of a config file from a shell script

I have a shell script and a common configuration file where all the generic paths, usernames and other values are stored. I want to get the values from this configuration file while I am running the sh script.
example:
sample.conf
pt_user_name=>xxxx
pt_passwd=>Junly#2014
jrnl_source_folder=>x/y/v
pt_source_folder=>/x/y/r/g
css_source_folder=>/home/d/g/h
Now I want to get something like this in my sh script.
cd $css_source_folder
this command inside the shell script should take me to the location /home/d/g/h while the script is running.
Is there any way to achieve this other than with grep and awk??
Thanks
Rinu
If you want to read from the conf file every time, then grep and cut might help you;
suppose you need the value of the css_source_folder property
prop1="css_source_folder" (I am assuming you know the property name whose value you want)
value_of_prop1=`grep $prop1 sample.conf | cut -f2 -d "=" | cut -f2 -d ">"`
like,
[db2inst2@pegdb2 ~]$ vi con.conf
[db2inst2@pegdb2 ~]$ grep css_source_folder con.conf
css_source_folder=>/home/d/g/h
[db2inst2@pegdb2 ~]$ value=`grep css_source_folder con.conf | cut -f2 -d "="`
[db2inst2@pegdb2 ~]$ echo $value
>/home/d/g/h
[db2inst2@pegdb2 ~]$ value=`grep css_source_folder con.conf | cut -f2 -d "=" | cut -f2 -d ">"`
[db2inst2@pegdb2 ~]$ echo $value
/home/d/g/h
If you want to read all properties at once, apply a loop and this will solve the purpose; see the sketch below.
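For instance, a sketch that loads every property into a same-named shell variable in one pass (assuming each line of sample.conf has the key=>value form):
while IFS= read -r line; do
key=${line%%=>*}   # text before the => separator
val=${line#*=>}    # text after it
declare "$key=$val"
done < sample.conf
echo "$css_source_folder"   # /home/d/g/h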
Yes, you can get the configuration names and values relatively simply and associate them through array indexes. Reading your config can be done like this:
#!/bin/bash
test -r "$1" || { echo "error: unable to read conf file [$1]"; exit 1; }
declare -a tag
declare -a data
let index=0
while read line || test -n "$line"; do
tag[index]="${line%%\=*}"
data[index]="${line##*\>}"
((index++))
done < "$1"
for ((i=0; i<${#tag[@]}; i++)); do
printf " %18s %s\n" "${tag[$i]}" "${data[$i]}"
done
After reading the config file, you then have the config name tags and config values stored in the arrays tag and data, respectively:
pt_user_name xxxx
pt_passwd Junly#2014
jrnl_source_folder x/y/v
pt_source_folder /x/y/r/g
css_source_folder /home/d/g/h
At that point, it is a matter of determining how you will use them, whether as a password or as a directory. You may have to write a couple of functions, but the basic operation of "given a tag, get the correct data" can be done like this:
function getvalue {
test -n "$1" || { echo "error in getvalue, no data supplied"; return 1; }
for ((i=0; i<${#tag[@]}; i++)); do
if test "$1" = "${tag[$i]}"; then
echo " eval cmd ${data[$i]}"
return $i
fi
done
return 255
}
echo -e "\nget value for 'jrnl_source_folder'\n"
getvalue "jrnl_source_folder"
The function will return the index of the data value and can execute any command needed. You seem to have directory paths and passwords, so you may need a function for each. To illustrate, the output of the example is:
get value for jrnl_source_folder
eval cmd x/y/v
You can also use an associative array in later versions of BASH to store the tag and data in a single associative array. You may also be able to use indirect references on the tag and data values to process them. I simply took the straightforward approach in the example.
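For example, a minimal associative-array version (bash 4+), using the same expansions as above and reading sample.conf directly:
declare -A conf
while read line || test -n "$line"; do
conf["${line%%\=*}"]="${line##*\>}"   # key from before =, value from after >
done < sample.conf
echo "${conf[css_source_folder]}"   # /home/d/g/h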
Try this eval $(awk -F'=>' '{print $1"=\""$2"\";"}' sample.conf):
EX:
eval $(awk -F'=>' '{print $1"=\""$2"\";"}' sample.conf); echo $pt_user_name
xxxx
Using sed :
eval $(sed -re 's/=>/="/g' -e 's/$/";/g' sample.conf); echo $pt_passwd
Junly#2014
Using perl :
eval $(perl -F'=>' -alne 'print "$F[0]=\"$F[1]\";"' sample.conf); echo $pt_source_folder
/x/y/r/g
Using tr :
eval $(tr -d '>' <sample.conf); echo "$css_source_folder"
/home/d/g/h
PS. Using tr blindly to remove > may cause undesirable results depending on the content of sample.conf, but for the one provided it works fine.

Create users based on a text file using Bash script?

So, I have a text file that is organized like this
<username>:<fullname>:<usergroups>
I need to create a new user for each line and put them into their groups. I am stuck trying to put the username into a variable to use with useradd. I have tried using cut, but it needs a file name; I can't just pass it a line.
Here is what I currently have:
#! /bin/bash
linesNum=1
while read line
do
echo
name=$( cut -d ":" -f1 $( line ) )
((lineNum+=1))
done < "users.txt"
Thanks for your help!
#!/bin/bash
while IFS=: read username fullname usergroups
do
useradd -G $usergroups -c "$fullname" $username
done < users.txt
fullname is the only string that should contain whitespace (hence the quotes). A list of usergroups should be separated from the next by a comma, with no intervening whitespace (so no quotes on that argument), and your username should not contain whitespace either.
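For example, with a made-up users.txt line like jdoe:John Doe:wheel,developers you can preview the generated commands with a dry run:
while IFS=: read username fullname usergroups
do
echo useradd -G $usergroups -c "$fullname" $username
done < users.txt
# prints: useradd -G wheel,developers -c John Doe jdoe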
Update:
To get the list of usergroups to create first, you can do this...
#!/bin/bash
while IFS=: read username fullname usergroups
do
echo $usergroups >> allgrouplist.txt
done < users.txt
while IFS=, read -r -a grouparr
do
printf '%s\n' "${grouparr[@]}" >> groups.txt   # read -a splits the comma list, one group per line
done < allgrouplist.txt
sort -u groups.txt | while read group
do
groupadd $group
done
This is a bit long winded, and could be compacted to avoid the use of the additional files allgrouplist.txt and groups.txt, but I wanted to make it easy to read. For reference, here's a more compact version.
while IFS=: read a b groups; do echo $groups; done < users.txt |
tr ',' '\n' |
sort -u |
while read group
do
groupadd $group
done
(I screwed the compact version up a bit at first, it should be fine now, but please note I haven't tested this!)
while IFS=: read username fullname usergroups
do
useradd -G "$usergroups" -c "$fullname" "$username"
done < users.txt
