Unix Shell Script to take multiple files from standard input (csh) - shell

Using either the for loop or the pipe (both work with one filename), I need to figure out how to accept unlimited specified files from standard input. I have tried regular expressions, and various wildcard forms. The two main issues I'm running into: only the first file is put through the script or every single file in the directory is put through. This is an assignment for a basic Unix Course and my problem thus far is over-complication. Based on the rest of the semester, there's a simple fix for what I'm wanting to do and here I've spent two hours perusing hundreds of websites and posts making my head spin.
EDIT: The command line prompt would be something like this ~/dir/script currentWord newWord fileName1 fileName2 fileName3
#!/bin/csh
set currentWord=$1
set newWord=$2
set fileName=$3
if { grep -q $1 *$3 } then
sed -i.bak -e "s/$1/$2/g" $3
else
echo "The string is not found."
endif
#grep -q $1 $3 | sed -i.bak -e "s/$1/$2/g" $3

You can access the command line arguments using $argv[]. To loop over them but skip the first two, you can use this construct:
foreach file ($argv[3-])
# do stuff here, eg
echo $file
end
You shouldn't use csh, though. If your professor has instructed you to use it, I would question that.
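For comparison, here is a minimal sketch of the same idea in a POSIX-style shell, assuming the goal is to run the question's grep/sed pair on every file named after the first two arguments. The function name replace_in_files is hypothetical, and the search word is assumed to contain no characters that are special to sed (such as /).

```bash
#!/bin/sh
# replace_in_files CURRENT NEW FILE...  (hypothetical helper, not from the original post)
replace_in_files() {
    current_word=$1
    new_word=$2
    shift 2                                  # drop the two words; "$@" is now only the files
    for file in "$@"; do                     # "$@" keeps names with spaces intact
        if grep -q -- "$current_word" "$file"; then
            # -i.bak edits in place and keeps a backup, as in the question
            sed -i.bak -e "s/$current_word/$new_word/g" "$file"
        else
            echo "The string is not found in $file."
        fi
    done
}

# Example call (file names are placeholders):
# replace_in_files currentWord newWord fileName1 fileName2 fileName3
```

The key step is shift 2, which plays the same role as $argv[3-] in the csh loop above.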

Related

Bash File names will not append to file from script

Hello, I am trying to get all files with Jane's name into a separate file called oldFiles.txt. In a directory called "data" I am reading a list of file names from a file called list.txt, from which I put all the file names containing the name Jane into the files variable. Then I'm trying to test the files variable against the files in list.txt to ensure they are in the file system, and then append all the files containing jane to the oldFiles.txt file (which will be in the scripts directory) once each item within the files variable passes the test.
#!/bin/bash
> oldFiles.txt
files= grep " jane " ../data/list.txt | cut -d' ' -f 3
if test -e ~data/$files; then
for file in $files; do
if test -e ~/scripts/$file; then
echo $file>> oldFiles.txt
else
echo "no files"
fi
done
fi
The above code gets the desired files and displays them correctly, as well as creates the oldFiles.txt file, but when I open the file after running the script I find that nothing was appended to the file. I tried changing the file assignment to a pointer instead files= grep " jane " ../data/list.txt | cut -d' ' -f 3 ---> files=$(grep " jane " ../data/list.txt) to see if that would help by just capturing raw data to write to file, but then the error comes up "too many arguments on line 5" which is the 1st if test statement. The only way I get the script to work semi-properly is when I do ./findJane.sh > oldFiles.txt on the shell command line, which is me essentially manually creating the file. How would I go about this so that I create oldFiles.txt and append to the oldFiles.txt all within the script?
The biggest problem you have is matching names like "jane" or "Jane's", etc. while not matching "Janes". grep provides the options -i (case-insensitive match) and -w (whole-word match), which can tailor your search to what you appear to want without having to use the kludge (" jane ") of adding spaces before and after your search term. (To do that properly you would use [[:space:]]jane[[:space:]].)
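A small sketch of what -i and -w do together. The sample lines are invented for illustration (the real list.txt was never posted), and match_name is a hypothetical helper.

```bash
# match_name TERM: filter stdin, keeping lines that contain TERM as a whole word, any case
match_name() {
    grep -iw -- "$1"
}

# Made-up sample data: the first two lines match, "Janes" does not.
printf '%s\n' \
    '001 jane file1.txt' \
    "002 Jane's file2.txt" \
    '003 Janes file3.txt' | match_name jane
```

The apostrophe in "Jane's" is not a word character, so "Jane" still counts as a whole word there, while the trailing "s" in "Janes" prevents a match.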
You also have the problem of what your "script dir" is if you call your script from a directory other than the one containing it, such as calling it from your $HOME directory with bash script/findJane.sh. In that case your script will attempt to append to $HOME/oldFiles.txt. The parameter $0 holds the path that was used to invoke the current script (relative or absolute), so you can capture the script directory no matter where you call the script from with:
dirname "$0"
You are using bash, so store all the filenames resulting from your grep command in an array, not some general variable (especially since your use of " jane " suggests that your filenames contain whitespace)
You can make your script much more flexible if you take the input file (e.g. list.txt), the term to search for (e.g. "jane"), the location where to check for existence of the files (e.g. $HOME/data) and the output filename to append the names to (e.g. "oldFiles.txt") as command-line [positional] parameters. You can give each a default value so the script behaves as you currently desire without any arguments.
Even with the additional scripting flexibility of taking the command line arguments, the script actually has fewer lines simply filling an array using mapfile (synonymous with readarray) and then looping over the contents of the array. You also avoid the additional subshell for dirname with a simple parameter expansion and test whether the path component is empty -- to replace with '.', up to you.
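The parameter-expansion alternative to dirname mentioned above could look like the following sketch. The helper name dir_of and the variable dir are hypothetical; the full script below keeps dirname.

```bash
# dir_of PATH: set the variable "dir" to the directory part of PATH without a subshell
dir_of() {
    dir="${1%/*}"                    # remove the shortest trailing "/name"
    [ "$dir" = "$1" ] && dir="."     # PATH had no slash: use the current directory
    [ -n "$dir" ] || dir="/"         # PATH was directly under the root directory
    return 0
}

dir_of "$0"
script="$dir"                        # directory the script was invoked from
```

Both approaches give the same result for ordinary paths; the expansion simply avoids starting an extra process.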
If I've understood your goal correctly, you can put all the pieces together with:
#!/bin/bash
# positional parameters
src="${1:-../data/list.txt}" # 1st param - input (default: ../data/list.txt)
term="${2:-jane}" # 2nd param - search term (default: jane)
data="${3:-$HOME/data}" # 3rd param - file location (default: $HOME/data)
outfn="${4:-oldFiles.txt}" # 4th param - output (default: oldFiles.txt)
# save the path to the current script in script
script="$(dirname "$0")"
# if outfn not given, prepend path to script to outfn to output
# in script directory (if script called from elsewhere)
[ -z "$4" ] && outfn="$script/$outfn"
# split names w/term into array
# using the -iw option for case-insensitive whole-word match
mapfile -t files < <(grep -iw "$term" "$src" | cut -d' ' -f 3)
# loop over files array
for ((i=0; i<${#files[@]}; i++)); do
# test existence of file in data directory, redirect name to outfn
[ -e "$data/${files[i]}" ] && printf "%s\n" "${files[i]}" >> "$outfn"
done
(note: test expression and [ expression ] are synonymous, use what you like, though you may find [ expression ] a bit more readable)
(further note: "Janes" being plural is not considered the same as the singular -- adjust the grep expression as desired)
Example Use/Output
As was pointed out in the comment, without a sample of your input file, we cannot provide an exact test to confirm your desired behavior.
Let me know if you have questions.
As far as I can tell, this is what you're going for. This is totally a community effort based on the comments, catching your bugs. Obviously credit to Mark and Jetchisel for finding most of the issues. Notable changes:
Fixed $files to use command substitution
Fixed path to data/$file, assuming you have a directory at ~/data full of files
Fixed the test to not test for a string of files, but just the single file (also using -f to make sure it's a regular file)
Using double brackets — you could also use double quotes instead, but you explicitly have a Bash shebang so there's no harm in using Bash syntax
Adding a second message about not matching files, because there are two possible cases there; you may need to adapt depending on the output you're looking for
Removed the initial empty redirection — if you need to ensure that the file is clear before the rest of the script, then it should be added back, but if not, it's not doing any useful work
Changed the shebang to make sure you're using the user's preferred Bash, and added set -e because you should always add set -e
#!/usr/bin/env bash
set -e
files=$(grep " jane " ../data/list.txt | cut -d' ' -f 3)
for file in $files; do
if [[ -f $HOME/data/$file ]]; then
if [[ -f $HOME/scripts/$file ]]; then
echo "$file" >> oldFiles.txt
else
echo "no matching file"
fi
else
echo "no files"
fi
done

How do I use `sed` to alter a variable in a bash script?

I'm trying to use enscript to print PDFs from Mutt, and hitting character encoding issues. One way around them seems to be to just use sed to replace the problem characters: sed -ir 's/[“”]/"/g' {input}
My test input file is this:
“very dirty”
we’re
I'm hoping to get "very dirty" and we're but instead I'm still getting
â\200\234very dirtyâ\200\235
weâ\200\231re
I found a nice little post on printing to PDFs from Mutt that I used as a starting point. I have a bash script that I point to from my .muttrc with set print_command="$HOME/.mutt/print.sh" -- the script currently reads about like this:
#!/bin/bash
input="$1" pdir="$HOME/Desktop" open_pdf=evince
# Straighten out curly quotes
sed -ir 's/[“”]/"/g' $input
sed -ir "s/[’]/'/g" $input
tmpfile="`mktemp $pdir/mutt_XXXXXXXX.pdf`"
enscript --font=Courier8 $input -2r --word-wrap --fancy-header=mutt -p - 2>/dev/null | ps2pdf - $tmpfile
$open_pdf $tmpfile >/dev/null 2>&1 &
sleep 1
rm $tmpfile
It does a fine job of creating a PDF (and works fine if you give it a file as an argument) but I can't figure out how to fix the curly quotes.
I've tried a bunch of variations on the sed line:
input=sed -r 's/[“”]/"/g' $input
$input=sed -ir "s/[’]/'/g" $input
Per the suggestion at Can I use sed to manipulate a variable in bash? I also tried input=$(sed -r 's/[“”]/"/g' <<< $input) and I get an error: "Syntax error: redirection unexpected"
But none manages to actually change $input -- what is the correct syntax to change $input with sed?
Note: I accepted an answer that resolved the question I asked, but as you can see from the comments there are a couple of other issues here. enscript is taking in a whole file as a variable, not just the text of the file. So trying to tweak the text inside the file is going to take a few extra steps. I'm still learning.
On Editing Variables In General
BashFAQ #21 is a comprehensive reference on performing search-and-replace operations in bash, including within variables, and is thus recommended reading. On this particular case:
Use the shell's native string manipulation instead; this is far higher performance than forking off a subshell, launching an external process inside it, and reading that external process's output. BashFAQ #100 covers this topic in detail, and is well worth reading.
Depending on your version of bash and configured locale, it might be possible to use a bracket expression (ie. [“”], as your original code did). However, the most portable thing is to treat “ and ” separately, which will work even without multi-byte character support available.
input='“hello ’cruel’ world”'
input=${input//'“'/'"'}
input=${input//'”'/'"'}
input=${input//'’'/"'"}
printf '%s\n' "$input"
...correctly outputs:
"hello 'cruel' world"
On Using sed
To provide a literal answer -- you almost had a working sed-based approach in your question.
input=$(sed -r 's/[“”]/"/g' <<<"$input")
...adds the missing syntactic double quotes around the parameter expansion of $input, ensuring that it's treated as a single token regardless of how it might be string-split or glob-expanded.
But All That May Not Help...
The below is mentioned because your test script is manipulating content passed on the command line; if that's not the case in production, you can probably disregard the below.
If your script is invoked as ./yourscript “hello * ’cruel’ * world”, then information about exactly what the user entered is lost before the script is started, and nothing you can do here will fix that.
This is because $1, in that scenario, will only contain “hello; ’cruel’ and world” are in their own argv locations, and the *s will have been replaced with lists of files in the current directory (each such file substituted as a separate argument) before the script was even started. Because the shell responsible for parsing the user's command line (which is not the same shell running your script!) did not recognize the quotes as valid at the time when it ran this parsing, by the time the script is running, there's nothing you can do to recover the original data.
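A quick way to see this for yourself: the sketch below counts how many arguments a command receives. count_args is a hypothetical helper, and the * characters are left out so the result does not depend on the files in the current directory.

```bash
# count_args: print the number of arguments received
count_args() { echo "$#"; }

# Curly quotes are ordinary characters to the shell, so the words are split on spaces.
count_args “hello ’cruel’ world”       # prints 3

# Real (straight) double quotes around the whole thing keep it as one argument.
count_args "“hello ’cruel’ world”"     # prints 1
```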
Abstract: The way to use sed to change a variable is explored, but what you really need is a way to use and edit a file. It is covered ahead.
Sed
The (two) sed line(s) could be solved with this (note that -i is not used, it is not a file but a value):
input='“very dirty”
we’re'
sed 's/[“”]/\"/g;s/’/'\''/g' <<<"$input"
But it should be faster (for small strings) to use the internals of the shell:
input='“very dirty”
we’re'
input=${input//[“”]/\"}
input=${input//[’]/\'}
printf '%s\n' "$input"
$1
But there is an underlying problem with your script, you are trying to clean an input received from the command line. You are using $1 as the source of the string. Once somebody writes:
./script “very dirty”
we’re
That input is lost. It is broken into shell's tokens and "$1" will be “very only.
But I do not believe that is what you really have.
file
However, you are also saying that the input comes from a file. If that is the case, then read it in with:
input="$(<infile)" # not $1
sed 's/[“”]/\"/g;s/’/'\''/g' <<<"$input"
Or, if you don't mind editing (changing) the file, do this instead:
sed -i 's/[“”]/\"/g;s/’/'\''/g' infile
input="$(<infile)"
Or, if you are clear and certain that what is being given to the script is a filename, like:
./script infile
You can use:
infile="$1"
sed -i 's/[“”]/\"/g;s/’/'\''/g' "$infile"
input="$(<"$infile")"
Other comments:
Then:
Quote your variables.
Do not use the very old `…` syntax, use $(…) instead.
Do not use variables in UPPER case, those are reserved for environment variables.
And (unless you actually meant sh) use a shebang (first line) that targets bash.
The command enscript most definitely requires a file, not a variable.
Maybe you should use evince to open the PS file; there is no need for the step that makes a PDF unless you know you really need it.
I believe it is better to use a file to store the output of enscript and ps2pdf.
Do not hide the errors printed by the commands until everything is working as desired; then just call the script as:
./script infile 2>/dev/null
Or as required to make it less verbose.
Final script.
If you call the script with the name of the file that enscript is going to use, something like:
./script infile
Then the whole script will look like this (it runs in both bash and sh):
#!/usr/bin/env bash
Usage(){ echo "$0: This script requires a source file"; exit 1; }
[ $# -lt 1 ] && Usage
[ ! -e "$1" ] && Usage
infile="$1"
pdir="$HOME/Desktop"
open_pdf=evince
# Straighten out curly quotes
sed -i 's/[“”]/\"/g;s/’/'\''/g' "$infile"
tmpfile="$(mktemp "$pdir"/mutt_XXXXXXXX.pdf)"
outfile="${tmpfile%.*}.ps"
enscript --font=Courier10 "$infile" -2r \
--word-wrap --fancy-header=mutt -p "$outfile"
ps2pdf "$outfile" "$tmpfile"
"$open_pdf" "$tmpfile" >/dev/null 2>&1 &
sleep 5
rm "$tmpfile" "$outfile"

Bash script to replace or append

I'm new to Bash scripting and I'm having a bit of a hard time. I'm trying to alter the configuration values of a config file. If it finds an existing value I want it to update it, but if it doesn't exist I want it to append it. This is as far I as I got from various tutorials and snippets online:
# FUNCTION TO MODIFY CONFIG BY APPEND OR REPLACE
# $1 File
# $2 Find
# $3 Replace / Append
function replaceappend() {
grep -q '^$2' $1
sed -i 's/^$2.*/$3/' $1
echo '$3' >> $1
}
replaceappend "/etc/test.conf" "Port 20" "Port 10"
However as you might imagine this doesn't work. It seems to be with the logic behind it, I'm not sure how to capture the result of grep in order to choose either sed or echo.
Just use the return value of the command and use double-quotes instead of single quotes:
if ! sed -i "/$2/{s//$3/;h};"'${x;/./{x;q0};x;q1}' $1
then
echo "$3" >> $1
fi
SOURCE: Return code of sed for no match for the q command
This is treading outside my normal use of sed, so let me give an explanation of how this works, as I understand it:
sed "/$2/{s//$3/;h};"'${x;/./{x;q0};x;q1}' $1
The first /$2/ is an address - we will do the commands within {...} for any lines that match this. As a by-product it also becomes the last-used regular expression, which an empty pattern (//) reuses.
The command {s//$3/;h} says to replace the text matched by that last regular expression with $3 and then copy the pattern-space into the "hold-space", a type of buffer within sed.
The $ after the single quote is another address - it says to do this next command on the LAST line.
The command {x;/./{x;q0};x;q1} says:
x = swap the hold-space and the pattern-space
/./ = an address that matches only if the pattern-space (which now holds the old hold-space) contains at least one character, i.e. some earlier line matched and was saved by h
{x;q0} = swap the two spaces back so the last line is printed unchanged, then q0 = exit with status 0 (success)
x;q1 = otherwise (nothing was saved) swap back, then q1 = exit with status 1 (fail); the exit-code argument to q is a GNU sed extension
The double-quotes around the first part allow substitution for $2 and $3. The single quotes around the latter part prevents erroneous substitution for the $.
A bit complicated, but it seems to work AS LONG AS YOU HAVE SOMETHING IN THE FILE. An empty file will still succeed since you don't get any match on the last line.
To be honest, after all this complication... Unless the files you are working with are really long so that a double-pass would be really bad I would probably go back to the grep solution like this:
if grep -q "^$2" "$1"
then
sed -i "s/^$2.*$/$3/" "$1"
else
echo "$3" >>"$1"
fi
That's a WHOLE lot easier to understand and maintain later...
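Wrapped back into the function from the question, the grep-based version might look like this sketch. It assumes GNU sed (for -i without a suffix) and that neither the search text nor the replacement contains characters special to sed, such as / or &. The name replace_or_append and the variable names are illustrative.

```bash
# replace_or_append FILE KEY LINE
# If a line in FILE starts with KEY, replace that whole line with LINE; otherwise append LINE.
replace_or_append() {
    file=$1
    key=$2
    line=$3
    if grep -q -- "^$key" "$file"; then
        sed -i "s/^$key.*\$/$line/" "$file"    # rewrite every line that starts with KEY
    else
        printf '%s\n' "$line" >> "$file"       # nothing matched: add a new line
    fi
}

# Example call, mirroring the question:
# replace_or_append /etc/test.conf "Port 20" "Port 10"
```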

Use first 3 characters of a filename as a variable in shell script

this is my first post so hopefully I will make my question clear.
I am new to shell scripts and my task with this one is to add a new value to every line of a csv file. The value that needs to be added is based on the first 3 characters of the filename.
A bit of background: the csv files I am receiving are eventually being loaded into partitioned Oracle tables. The start of the file name (e.g. BATTESTFILE.txt) contains the partitioned site, so I need to write a script that takes the first 3 characters of the filename (in this example BAT) and adds them to the end of each line of the file.
The closest I have got so far is when I stripped the code to the bare basics of what I need to do:
build_files()
{
OLDFILE=${filename[@]}.txt
NEWFILE=${filename[@]}.NEW.txt
ABSOLUTE='path/scripts/'
FULLOLD=$ABSOLUTE$OLDFILE
FULLNEW=$ABSOLUTE$NEWFILE
sed -e s/$/",${j}"/ "${FULLOLD}" > "${FULLNEW}"
}
set -A site 'BAT'
set -A filename 'BATTESTFILE'
for j in ${site[@]}; do
for i in ${filename[@]}; do
build_files ${j}
done
done
Here I have set up an array site, as there will be 6 'sites' and this will make it easy to add additional sites to the code as the files come through to me. The same is to be said for the filename array.
This code works, but it isn't as automated as I need. One of my most recent attempts is below:
build_files()
{
OLDFILE=${filename[@]}.txt
NEWFILE=${filename[@]}.NEW.txt
ABSOLUTE='/app/dss/dsssis/sis/scripts/'
FULLOLD=$ABSOLUTE$OLDFILE
FULLNEW=$ABSOLUTE$NEWFILE
sed -e s/$/",${j}"/ "${FULLOLD}" > "${FULLNEW}"
}
set -A site 'BAT'
set -A filename 'BATTESTFILE'
for j in ${site[@]}; do
for i in ${filename[@]}; do
trust=echo "$filename" | cut -c1-3
echo "$trust"
if ["$trust" = 'BAT']; then
${j} = 'BAT'
fi
build_files ${j}
done
done
I found the code trust=echo "$filename" | cut -c1-3 through another question on StackOverflow as I was researching, but it doesn't seem to work for me. I added in the echo to test what trust was holding, but it was empty.
I am getting 2 errors back:
Line 17 - BATTESTFILE: not found
Line 19 - test: ] missing
Sorry for the long winded questions. Hopefully It contains helpful info and shows the steps I have taken. Any questions, comment away. Any help or guidance is very much appreciated. Thanks.
When you are new with shells, try avoiding arrays.
In an if statement use spaces before and after the [ and ] characters.
Get used to surrounding your shell variables with {} like ${trust}
I do not know how you fill your array; when the array is hardcoded, try replacing it with
SITE=file1
SITE="${SITE} file2"
And you must tell the shell that you want the right-hand side evaluated, with $(..) (better than backticks):
trust=$(echo "${filename}" | cut -c1-3)
Some guidelines and syntax help can be found at Google
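Putting the two syntax points together (spaces around [ and ], and $(...) for capturing output), a minimal sketch using the file name from the question could be written as follows. The variable site is an assumption standing in for the question's j.

```bash
filename=BATTESTFILE

# $(...) runs the pipeline and stores its output in trust
trust=$(echo "${filename}" | cut -c1-3)

# [ is a command, so it needs a space after it and a space before the closing ]
if [ "${trust}" = 'BAT' ]; then
    site='BAT'
fi

echo "${site}"
```

Without the spaces, the shell looks for a command literally named [BAT, which is the kind of failure behind the "test: ] missing" error in the question.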
Just use shell parameter expansion:
$ var=abcdefg
$ echo "${var:0:3}"
abc
Assuming you're using a reasonably capable shell like bash or ksh, for example
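Applied to the question, a sketch might look like this, assuming bash or ksh for the ${var:0:3} substring form and a comma as the separator. The function name append_site is hypothetical; it prints the result rather than writing a new file.

```bash
# append_site FILE: print FILE with the first 3 characters of its name added to every line
append_site() {
    name=${1##*/}              # strip any leading directory part
    site=${name:0:3}           # first three characters of the file name
    sed "s/\$/,${site}/" "$1"  # append ",SITE" at the end of each line
}

# Example (path is a placeholder):
# append_site path/scripts/BATTESTFILE.txt > path/scripts/BATTESTFILE.NEW.txt
```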
Just in case it is useful for anyone else now or in the future, I got my code to work as desired by using the below. Thanks Walter A below for his answer to my main problem of getting the first 3 characters from the filename and using them as a variable.
This gave me the desired output of taking the first 3 characters of the filename, and adding them to the end of each line in my csv file.
## Get the current Directory and file name, create a new file name
build_files()
{
OLDFILE=${i}.txt
NEWFILE=${i}.NEW.txt
ABSOLUTE='/app/dss/dsssis/sis/scripts/'
FULLOLD=$ABSOLUTE$OLDFILE
FULLNEW=$ABSOLUTE$NEWFILE
## Take the 3 characters from the filename and
## add them onto the end of each line in the csv file.
sed -e s/$/";${j}"/ "${FULLOLD}" > "${FULLNEW}"
}
## Loop to take the first 3 characters from the file names held in
## an array to be added into the new file above
set -A filename 'BATTESTFILE'
for i in ${filename[@]}; do
trust=$(echo "${i}" | cut -c1-3)
echo "${trust}"
j="${trust}"
echo "${i} ${j}"
build_files ${i} ${j}
done
Hope this is useful for someone else.

Handle special characters in bash for...in loop

Suppose I've got a list of files
file1
"file 1"
file2
a for...in loop breaks it up between whitespace, not newlines:
for x in $( ls ); do
echo $x
done
results:
file
1
file1
file2
I want to execute a command on each file. "file" and "1" above are not actual files. How can I do that if the filenames contains things like spaces or commas?
It's a little trickier than I think find -print0 | xargs -0 could handle, because I actually want the command to be something like "convert input/file1.jpg .... output/file1.jpg", so I need to transform the filename in the process.
Actually, Mark's suggestion works fine without even doing anything to the internal field separator. The problem is that the output of ls captured by command substitution, whether with backticks or $( ), undergoes word splitting, so the for loop cannot tell a space inside a name from a separator between names. Simply using
for f in *
instead of the ls solves the problem.
#!/bin/bash
for f in *
do
echo "$f"
done
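For the convert case described in the question, the glob loop can also derive the output name from each input name. The following is only a sketch: the function name plan_conversions is hypothetical, the .jpg extension and directory layout are taken from the question's example, and it prints the commands instead of running convert.

```bash
# plan_conversions INPUT_DIR OUTPUT_DIR: print one convert command per .jpg in INPUT_DIR
plan_conversions() {
    for f in "$1"/*.jpg; do
        [ -e "$f" ] || continue              # no matches: skip the unexpanded pattern
        out="$2/${f##*/}"                    # same base name, placed in the output directory
        printf 'convert "%s" "%s"\n' "$f" "$out"
    done
}

# Example: plan_conversions input output
```

Because the glob hands each name to the loop as a single word, a file such as "file 1.jpg" stays intact as long as "$f" and "$out" are quoted wherever they are used.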
UPDATE BY OP: this answer sucks and shouldn't be on top ... @Jordan's post below should be the accepted answer.
one possible way:
ls -1 | while read x; do
echo $x
done
I know this one is LONG past "answered", and with all due respect to eduffy, I came up with a better way and I thought I'd share it.
What's "wrong" with eduffy's answer isn't that it's wrong, but that it imposes what for me is a painful limitation: there's an implied creation of a subshell when the output of the ls is piped and this means that variables set inside the loop are lost after the loop exits. Thus, if you want to write some more sophisticated code, you have a pain in the buttocks to deal with.
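The limitation is easy to reproduce. The sketch below assumes bash with default settings; the function names are made up for the demonstration, and the second function uses process substitution, a different technique from the readline approach described next, as one way to keep the loop in the current shell.

```bash
# Loop fed by a pipe: the while loop runs in a subshell, so count is lost afterwards.
count_with_pipe() {
    count=0
    printf '%s\n' "$@" | while IFS= read -r x; do
        count=$((count + 1))
    done
    echo "$count"
}

# Loop fed by process substitution: the while loop runs in the current shell.
count_with_redirect() {
    count=0
    while IFS= read -r x; do
        count=$((count + 1))
    done < <(printf '%s\n' "$@")
    echo "$count"
}

count_with_pipe "file 1" file2        # prints 0 in bash
count_with_redirect "file 1" file2    # prints 2
```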
My solution was to take the "readline" function and write a program out of it in which you can specify any specific line number that you may want that results from any given function call. ... As a simple example, starting with eduffy's:
ls_output=$(ls -1)
# The cut at the end of the following line removes any trailing new line character
declare -i line_count=$(echo "$ls_output" | wc -l | cut -d ' ' -f 1)
declare -i cur_line=1
while [ $cur_line -le $line_count ] ;
do
# NONE of the values in the variables inside this do loop are trapped here.
filename=$(echo "$ls_output" | readline -n $cur_line)
# Now filename contains a filename from the preceding ls command
cur_line=cur_line+1
done
Now you have wrapped up all the subshell activity into neat little contained packages and can go about your shell coding without having to worry about the scope of your variable values getting trapped in subshells.
I wrote my version of readline in gnuc if anyone wants a copy, it's a little big to post here, but maybe we can find a way...
Hope this helps,
RT

Resources