Using the first column of a file as input in a script - bash

I am having some problems with using the first column, ${1}, as input to a script.
Currently, the relevant portion of the script looks like this:
#!/bin/bash
INPUT="${1}"
for NAME in `cat ${INPUT}`
do
SIZE="`du -sm /FAServer/na3250-a/homes/${NAME} | sed 's|/FAServer/na3250-a/homes/||'`"
DATESTAMP=`ls -ld /FAServer/na3250-a/homes/${NAME} | awk '{print $6}'`
echo "${SIZE} ${DATESTAMP}"
done
However, I want to modify INPUT="${1}" so that, instead of taking a command-line argument, it uses a specific file that was generated earlier. That way I can run the lines above inside another script, with the previously generated file as the input, and also have the output go to a new file.
So something like:
INPUT="$location/DisabledActiveHome ${1}" ???
Here's my full script below.
#!/bin/bash
# This script will search through Disabled Users OU and compare that list of
# names against the current active Home directories. This is to find out
# how much space those Home directories take up and which need to be removed.
# MUST BE RUN AS SUDO!
# Setting variables for _adm and storage path.
echo "Please provide your _adm account name:"
read _adm
echo "Please state where you want the files to be generated: (absolute path)"
read location
# String of commands to lookup information using ldapsearch
ldapsearch -x -LLL -h "REDACTED" -D $_adm#"REDACTED" -W -b "OU=Accounts,OU=Disabled_Objects,DC="XX",DC="XX",DC="XX"" "cn=*" | grep 'sAMAccountName'| egrep -v '_adm$' | cut -d' ' -f2 > $location/DisabledHome
# Get a list of all the active Home directories
ls /FAServer/na3250-a/homes > $location/ActiveHome
# Compare the Disabled accounts against Active Home directories
grep -o -f $location/DisabledHome $location/ActiveHome > $location/DisabledActiveHome
# Now get the size and datestamp for the disabled folders
INPUT="${1}"
for NAME in `cat ${INPUT}`
do
SIZE="`du -sm /FAServer/na3250-a/homes/${NAME} | sed 's|/FAServer/na3250-a/homes/||'`"
DATESTAMP=`ls -ld /FAServer/na3250-a/homes/${NAME} | awk '{print $6}'`
echo "${SIZE} ${DATESTAMP}"
done
I'm new to all of this so any help is welcome. I will be happy to clarify any and all questions you might have.
EDIT: A little more explanation, because I'm terrible at these things.
The lines of code below came from a previous script and are a FOR loop:
INPUT="${1}"
for NAME in `cat ${INPUT}`
do
SIZE="`du -sm /FAServer/na3250-a/homes/${NAME} | sed 's|/FAServer/na3250-a/homes/||'`"
DATESTAMP=`ls -ld /FAServer/na3250-a/homes/${NAME} | awk '{print $6}'`
echo "${SIZE} ${DATESTAMP}"
done
It is executed by typing:
./Script ./file
The FILE that is being referenced has one column of user names and no other data:
User1
User2
User3
etc.
The script takes the file and looks at the first user's name; the file is referenced by
INPUT=${1}
It then runs a du command on that user to find out the size of their HOME drive, which is reported by the SIZE variable. It does the same thing with the DATESTAMP, in regards to when the HOME drive was created for the user. When it is done with that user, it moves on to the next one in the column until it is finished.
So following that logic, I want to automate the entire process. Instead of doing this in two steps, I would like to make this all a one step process.
The first process would be to generate the $location/DisabledActiveHome file, which would have all of the disabled users names. Then to run the last portion to get the Size and creation date of each HOME drive for all the users in the DisabledActiveHome file.
So to do that, I need to modify the
INPUT=${1}
line to reflect the previously generated file.
$location/DisabledActiveHome

I don't really understand your question, but I think you want this. Say your file is called file.txt and looks like this:
1 99
2 98
3 97
4 96
You can get the first column like this:
awk '{print $1}' file.txt
1
2
3
4
If you want to use that in your script, do this
while read NAME; do
echo $NAME
done < <(awk '{print $1}' file.txt)
1
2
3
4
Or you may prefer cut like this:
while read NAME; do
echo $NAME
done < <(cut -d" " -f1 file.txt)
1
2
3
4
Or this may suit even better
while read NAME OtherUnwantedJunk; do
echo $NAME
done < file.txt
1
2
3
4
This last, and probably best, solution above uses IFS, which is bash's Input Field Separator, so if your file looked like this
1:99
2:98
3:97
4:96
you would do this
while IFS=":" read NAME OtherUnwantedJunk; do
echo $NAME
done < file.txt
1
2
3
4

INPUT="$location/DisabledActiveHome" worked like a charm. I was confused about the syntax and the proper usage and output.
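For reference, here is a minimal sketch of how that change slots into the original loop, with the output also redirected to a new file as the question mentioned (the DisabledHomeSizes name is just an example):
#!/bin/bash
# Read names from the previously generated file instead of from $1
INPUT="$location/DisabledActiveHome"
OUTPUT="$location/DisabledHomeSizes" # hypothetical output file name
while read -r NAME
do
SIZE=$(du -sm "/FAServer/na3250-a/homes/${NAME}" | sed 's|/FAServer/na3250-a/homes/||')
DATESTAMP=$(ls -ld "/FAServer/na3250-a/homes/${NAME}" | awk '{print $6}')
echo "${SIZE} ${DATESTAMP}"
done < "${INPUT}" > "${OUTPUT}"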

Related

How do I use for command to give me an output by file?

I have a folder named Chapters which contains 20 files with 1 chapter of a book per file.
I have a file named book_chap.list which contains the list of chapters. It contains something like this:
chap_00
chap_01
I have a third file called book.names10 which contains a list of names. It contains something like this:
Name1
Name2
What I need from the output is a file that indicates, for each chapter, the times each name has been said. Something like this:
chapters/chap_01:Name1
chapters/chap_01:Name1
chapters/chap_01:Name2
I am using this:
for a in chapters/chap_* ;
do
echo -n $a;
ggrep -F -f book.names10 -w -o $a | wc -l ;
done
but the only thing I get is a list of the number of times the names were used in each chapter overall. I don't know where to integrate the file book_chap.list into this command.
Quick and dirty (you might want to prettify the output):
#!/bin/bash
for chapter in chapters/chap_*; do
echo "Chapter ${chapter}:" >> nameOccurences.txt
for name in $(cat book.names10); do
echo -n "${name}: " >> nameOccurences.txt
grep -o "${name}" "${chapter}" | wc -l >> nameOccurences.txt
done
echo "" >> nameOccurences.txt
done
If you want to group the result by name rather than chapter, you have to exchange the loops and output accordingly.
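For example, a sketch of the name-first variant, using the same files as above:
#!/bin/bash
# Outer loop over names, inner loop over chapters
for name in $(cat book.names10); do
echo "Name ${name}:" >> nameOccurences.txt
for chapter in chapters/chap_*; do
echo -n "${chapter}: " >> nameOccurences.txt
grep -o "${name}" "${chapter}" | wc -l >> nameOccurences.txt
done
echo "" >> nameOccurences.txt
done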

Input from command line with incorrect result

I am trying to read a file that was output in table format. I am able to read the file, which has the Volume description, snapshotId and TimeStarted for an AWS region. I ask the user to input a volume name, and I output the snapshotId for the volume entered. The list contains Volume0 through Volume30.
The issue is, when the user enters Volume0, it outputs the snapshotId correctly, but if the user enters Volume20, it only outputs |. My guess is that this happens because the original file being read is in table format. Initially I thought I could put in a condition: if the user enters Volume0, print the snapshotId, else if the user enters Volume20, print the snapshotId some other way.
I am looking for a better way to do this. How can I ignore the table format when reading the file? Should I convert it to text format, and how? Or how can I ignore any formatting when reading? Here is my bash script:
readoutput() {
echo "Hello, please tell me what Volume you are searching for..(Volume?):"
read volSearch
echo "Searching for newest SnapshotIds from /Users/User/Downloads/GetSnapId for:" $volSearch
sleep 5
input="/Users/User/Downloads/GetSnapId"
if x=$(grep -m 1 "$volSearch" "$input")
then
echo "$x"
else
echo "$volSearch not found..ending search"
fi
extractSnap=$(echo "$x" | grep "snap-" | awk '{print $7}')
echo $extractSnap
}
readoutput
The issue is that awk is not too smart about tables: by default it splits each line on runs of whitespace, and the vertical bars are just ordinary characters to it. In the row for Volume0 there is a space before the vertical bar, so the bar lands in its own field, but in the row for Volume20 there is no space, so the bar sticks to the neighbouring value and $7 points at the wrong column.
Try this instead:
extractSnap=$(echo "$x" | cut -d'|' -f3 | cut -d'-' -f 2)
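To see why the field number drifts, compare how awk and cut split two hypothetical table rows (the spacing here is made up; your file's layout may differ):
row1='| Volume0  | snap-1a2b3c4d |' # space before the bars
row2='|Volume20|snap-9f8e7d6c|' # no spaces around the bars
# awk splits on runs of whitespace, so the bar placement changes the fields:
echo "$row1" | awk '{print $4}' # -> snap-1a2b3c4d
echo "$row2" | awk '{print $4}' # -> empty; the whole row is one field
# cut splits on the literal delimiter, so the column number stays stable:
echo "$row1" | cut -d'|' -f3 # -> snap-1a2b3c4d
echo "$row2" | cut -d'|' -f3 # -> snap-9f8e7d6c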

How do change all filenames with a similar but not identical structure?

Due to a variety of complex photo library migrations that had to be done using a combination of manual copying and importing tools that renamed the files, it seems I wound up with a ton of files with a similar structure. Here's an example:
2009-05-05 - 2009-05-05 - IMG_0486 - 2009-05-05 at 10-13-43 - 4209 - 2009-05-05.JPG
What it should be:
2009-05-05 - IMG_0486.jpg
The other files have the same structure, but obviously the individual dates and IMG numbers are different.
Is there any way I can do some command line magic in Terminal to automatically rename these files to the shortened/correct version?
I assume you may have sub-directories and want to find all files inside this directory tree.
This first code block (which you could put in a script) is "safe" (does nothing), but will help you see what would be done.
datep="[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]"
dir="PUT_THE_FULL_PATH_OF_YOUR_MAIN_DIRECTORY"
while IFS= read -r file
do
name="$(basename "$file")"
[[ "$name" =~ ^($datep)\ -\ $datep\ -\ ([^[:space:]]+)\ -\ $datep.*[.](.+)$ ]] || continue
date="${BASH_REMATCH[1]}"
imgname="${BASH_REMATCH[2]}"
ext="${BASH_REMATCH[3],,}"
dir_of_file="$(dirname "$file")"
target="$dir_of_file/$date - $imgname.$ext"
echo "$file"
echo " would be moved to..."
echo " $target"
done < <(find "$dir" -type f)
Make sure the output is what you want and are expecting. I cannot test on your actual files, and if this script does not produce results that are entirely satisfactory, I do not take any responsibility for hair being pulled out. Do not blindly let anyone (including me) mess with your precious data by copying and pasting code from the internet if you have no reliable, checked backup.
Once you are sure, decide if you want to take a chance on some guy's code written without any opportunity for testing, and replace the three consecutive lines beginning with echo with this:
mv "$file" "$target"
Note that file names have to match to a pretty strict pattern to be considered for processing, so if you notice that some files are not being processed, then the pattern may need to be modified.
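If you want to check whether a given name matches the pattern before running the whole loop, you can test it interactively with the example filename from the question:
datep="[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]"
name="2009-05-05 - 2009-05-05 - IMG_0486 - 2009-05-05 at 10-13-43 - 4209 - 2009-05-05.JPG"
if [[ "$name" =~ ^($datep)\ -\ $datep\ -\ ([^[:space:]]+)\ -\ $datep.*[.](.+)$ ]]; then
echo "matches: ${BASH_REMATCH[1]} - ${BASH_REMATCH[2]}.${BASH_REMATCH[3],,}"
else
echo "no match"
fi
This should print matches: 2009-05-05 - IMG_0486.jpg.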
Assuming they are all the exact same structure, spaces and everything, you can use awk to split the names up using the spaces as break points. Here's a quick and dirty example:
#!/bin/bash
output=""
for path in /path/to/files/*; do
unset output #clear variable from previous loop
file="$(basename "$path")" #work with the bare filename, not the full path
output="$(echo "$file" | awk '{print $1}')" #Assign the first field to the output variable
output="$output"" - " #Append with [space][dash][space]
output="$output""$(echo "$file" | awk '{print $5}')" #Append with IMG_* field
output="$output""." #Append with period
#Use -F '.' to split by period, and $NF to grab the last field (to get the extension)
output="$output""$(echo "$file" | awk -F '.' '{print $NF}')"
done
From there, something like mv /path/to/files/$file /path/to/files/$output as a final line in the file loop will rename the file. I'd copy a few files into another folder to test with first, since we're dealing with file manipulation.
All the output assigning lines can be consolidated into a single line, as well, but it's less easy to read.
output="$(echo "$file" | awk '{print $1 " - " $5 "."}')""$(echo "$file" | awk -F '.' '{print $NF}')"
You'll still want a file loop, though.
Assuming that you want to build the new filename from the first date and the IMG* name, you can run the following in the folder:
IFS=$'\n'
for file in *
do
printf "mv '$file' '"
printf '%s' $(cut -d" " -f1,4,5 <<< "$file")
printf "'.jpg\n" # close the quoted target name; the newline lets sh run one mv per line
done | sh

Grep outputs multiple lines, need while loop

I have a script which uses grep to find lines in a text file (an ics calendar, to be specific).
My script finds a date match, then goes up and down a few lines to copy the summary and start time of the appointment into a separate variable. The problem I have is that I'm going to have multiple appointments at the same time, and I need to run through the whole process for each result in grep.
Example:
LINE=`grep -F -n 20130304T232200 /path/to/calendar.ics | cut -f1 -d:`
And it outputs only the matching line numbers, such as
86 89
Then it goes on to capture my other variables, as such:
SUMMARYLINE=$(( $LINE + 5 ))
SUMMARY=`sed -n "$SUMMARYLINE"p /path/to/calendar.ics`
My script runs fine with one result, but it obviously won't work with more than one, and I need it to. Should I send the grep results into an array? A separate text file to read from? I'm sure I'll need a while loop in here somehow. I need some help, please.
You can call grep from a loop quite easily:
while IFS=':' read -r LINE notused # avoids the use of cut
do
# First field is now in $LINE
# Further processing
done < <(grep -F -n 20130304T232200 /path/to/calendar.ics)
However, if the file is not too large then it might be easier to read the whole file into an array and move around in that.
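For instance, a minimal sketch of the array approach using bash's mapfile (bash 4+), built on the grep from your question:
# Collect all matching line numbers into an array in one pass
mapfile -t LINES < <(grep -F -n 20130304T232200 /path/to/calendar.ics | cut -f1 -d:)
for LINE in "${LINES[@]}"
do
SUMMARYLINE=$(( LINE + 5 ))
SUMMARY=$(sed -n "${SUMMARYLINE}p" /path/to/calendar.ics)
echo "$SUMMARY"
done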
With your proposed solution, you are reading through the file several times. Using awk, you can do it in one pass:
awk -F: -v time=20130304T232200 '
$1 == "SUMMARY" {summary = substr($0,9)}
/^DTSTART/ {start = $2}
/^END:VEVENT/ && start == time {print summary}
' calendar.ics
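For reference, given a minimal event like the following (simplified; real ics files carry more properties), the script above prints Dentist appointment:
BEGIN:VEVENT
DTSTART:20130304T232200
SUMMARY:Dentist appointment
END:VEVENT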

bash: shortest way to get n-th column of output

Let's say that during your workday you repeatedly encounter the following form of columnized output from some command in bash (in my case from executing svn st in my Rails working directory):
? changes.patch
M app/models/superman.rb
A app/models/superwoman.rb
In order to work with the output of your command - in this case the filenames - some sort of parsing is required so that the second column can be used as input for the next command.
What I've been doing is to use awk to get at the second column, e.g. when I want to remove all files (not that that's a typical use case :), I would do:
svn st | awk '{print $2}' | xargs rm
Since I type this a lot, a natural question is: is there a shorter (thus cooler) way of accomplishing this in bash?
NOTE:
What I am asking is essentially a shell command question, even though my concrete example is about my svn workflow. If you feel that workflow is silly and suggest an alternative approach, I probably won't vote you down, but others might, since the question here is really how to get the n-th column of command output in bash, in the shortest manner possible. Thanks :)
You can use cut to access the second field:
cut -f2
Edit:
Sorry, didn't realise that SVN doesn't use tabs in its output, so that's a bit useless. You can tailor cut to the output but it's a bit fragile - something like cut -c 10- would work, but the exact value will depend on your setup.
Another option is something like: sed 's/.\s\+//'
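For example, applied with GNU sed (which understands \s) to the sample output from the question:
printf '%s\n' '?       changes.patch' 'M       app/models/superman.rb' | sed 's/.\s\+//'
changes.patch
app/models/superman.rb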
To accomplish the same thing as:
svn st | awk '{print $2}' | xargs rm
using only bash you can use:
svn st | while read a b; do rm "$b"; done
Granted, it's not shorter, but it's a bit more efficient and it handles whitespace in your filenames correctly.
I found myself in the same situation and ended up adding these aliases to my .profile file:
alias c1="awk '{print \$1}'"
alias c2="awk '{print \$2}'"
alias c3="awk '{print \$3}'"
alias c4="awk '{print \$4}'"
alias c5="awk '{print \$5}'"
alias c6="awk '{print \$6}'"
alias c7="awk '{print \$7}'"
alias c8="awk '{print \$8}'"
alias c9="awk '{print \$9}'"
Which allows me to write things like this:
svn st | c2 | xargs rm
Try the zsh. It supports global aliases, so you can define X in your .zshrc to be
alias -g X="| cut -d' ' -f2"
then you can do:
cat file X
You can take it one step further and define it for the nth column:
alias -g X2="| cut -d' ' -f2"
alias -g X1="| cut -d' ' -f1"
alias -g X3="| cut -d' ' -f3"
so that, for example, cat file X2 will output the second column of the file "file". You can do this with grep output or less output, too. This is very handy and a killer feature of the zsh.
You can go one step further and define D to be:
alias -g D="|xargs rm"
Now you can type:
cat file X1 D
to delete all files mentioned in the first column of file "file".
If you know bash, the zsh is not much of a change except for some new features.
HTH Chris
Because you seem to be unfamiliar with scripts, here is an example.
#!/bin/sh
# usage: svn st | x 2 | xargs rm
col=$1
shift
awk -v col="$col" '{print $col}' "${@--}" # read the named files, or stdin ("-") if none are given
If you save this in ~/bin/x and make sure ~/bin is in your PATH (now that is something you can and should put in your .bashrc), you have the shortest possible command for generally extracting column n: x n.
The script should do proper error checking and bail if invoked with a non-numeric argument or the incorrect number of arguments, etc; but expanding on this bare-bones essential version will be in unit 102.
Maybe you will want to extend the script to allow a different column delimiter. Awk by default parses input into fields on whitespace; to use a different delimiter, use -F ':' where : is the new delimiter. Implementing this as an option to the script makes it slightly longer, so I'm leaving that as an exercise for the reader.
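For the curious, one possible way to add that option, sketched with getopts (the -F flag simply mirrors awk's own; adjust to taste):
#!/bin/sh
# usage: x [-F delim] col [file...]
# e.g.: x -F: 1 /etc/passwd
delim=" " # a single blank keeps awk's default whitespace splitting
while getopts F: opt; do
case $opt in
F) delim=$OPTARG;;
*) echo "usage: x [-F delim] col [file...]" >&2; exit 1;;
esac
done
shift $((OPTIND - 1))
col=$1
shift
awk -F "$delim" -v col="$col" '{print $col}' "${@--}"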
Usage
Given a file file:
1 2 3
4 5 6
You can either pass it via stdin (using a useless cat merely as a placeholder for something more useful);
$ cat file | sh script.sh 2
2
5
Or provide it as an argument to the script:
$ sh script.sh 2 file
2
5
Here, sh script.sh is assuming that the script is saved as script.sh in the current directory; if you save it with a more useful name somewhere in your PATH and mark it executable, as in the instructions above, obviously use the useful name instead (and no sh).
It looks like you already have a solution. To make things easier, why not just put your command in a bash script (with a short name) and just run that instead of typing out that 'long' command every time?
If you are ok with manually selecting the column, you could be very fast using pick:
svn st | pick | xargs rm
Just go to any cell of the 2nd column, press c and then hit enter.
Note that the file path does not have to be in the second column of svn st output. For example, if you modify a file and also modify its property, the path will be in the third column.
See possible output examples in:
svn help st
Example output:
M wc/bar.c
A + wc/qax.c
I suggest to cut first 8 characters by:
svn st | cut -c8- | while read FILE; do echo whatever with "$FILE"; done
If you want to be 100% sure, and deal with fancy filenames with white space at the end for example, you need to parse xml output:
svn st --xml | grep -o 'path=".*"' | sed 's/^path="//; s/"$//'
Of course you may want to use some real XML parser instead of grep/sed.
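For instance, if xmlstarlet happens to be installed, the path attributes can be extracted with a real XPath query instead:
svn st --xml | xmlstarlet sel -t -v '//entry/@path' -n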
