With GNU Make, how can I combine multiple files into one? - makefile

I have some SQL files that I would like to process with sed and concatenate into a single file. Is there a slick way to do this with a single GNU Make recipe?
If I know the set of files at the time that I'm writing the Makefile, I could just write a multi-line recipe.
combined.sql: main.sql table1.sql table2.sql
	sed -e 's/latin1/utf8/' main.sql > $@
	sed -e 's/latin1/utf8/' table1.sql >> $@
	sed -e 's/latin1/utf8/' table2.sql >> $@
This seems too repetitious, and also won't be workable if I have a dynamically generated list of input files. How can I nicely do this in a minimally-redundant way that can extend to an arbitrary number of input files?

Using the fact that you have an automatic variable for your prerequisites and the fact that sed takes arbitrarily many input files you get:
combined.sql: main.sql table1.sql table2.sql
	sed -e 's/latin1/utf8/' $^ > $@
For the general case of a tool that doesn't take multiple file inputs you can still use $^ in a shell loop:
combined.sql: main.sql table1.sql table2.sql
	for file in $^; do \
		some_other_tool $$file; \
	done > $@
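To see why a single redirection after done is enough, here is a minimal shell sketch of the same loop outside of Make. The scratch files and the tr call standing in for some_other_tool are assumptions made only for illustration.

```shell
# Hypothetical inputs in a scratch directory (names are illustrative).
workdir=$(mktemp -d)
printf 'charset latin1\n' > "$workdir/main.sql"
printf 'table one latin1\n' > "$workdir/table1.sql"

# The redirection belongs to the whole loop, so the output file is
# opened once and every iteration writes to the same stream.
for file in "$workdir/main.sql" "$workdir/table1.sql"; do
    # tr stands in for a tool that reads only one input at a time.
    tr '[:lower:]' '[:upper:]' < "$file"
done > "$workdir/combined.sql"

cat "$workdir/combined.sql"
```

Inside a Makefile recipe the shell variable has to be written as $$file, as in the answer above, because Make expands a single $ itself.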

How about:
cat *.sql | sed -e 's/latin1/utf8/' > output.sql
You can put that into a shell script and have it take file name parameters etc. if you like.

Related

How to execute shell scripting in md file

We have a MarkDown file where we are storing versions of multiple components.
Below is the sample .md file
component1=1.2
components2=2.3
component3=`cat file1 | grep 'App_Version' | grep -P '(?<==).*' && rm -rf file1`
Here the component3 version is dynamic, so we are executing the command to get the version.
Need help on accomplishing this in correct way.
Markdown is not a scripting language, so you probably need one form or another of preprocessing. Example with GNU m4 (but any preprocessor with similar capabilities would do the job):
$ cat sample.m4
m4_changequote(`"""', `"""')m4_dnl
component1=1.2
components2=2.3
component3=m4_esyscmd("""grep -Po '(?<=App_Version=).*' file1 && rm -f file1""")m4_dnl
component4=foo
$ cat file1
App_Version=4.0.2
$ m4 -P sample.m4 > sample.md
$ cat sample.md
component1=1.2
components2=2.3
component3=4.0.2
component4=foo
$ ls file1
ls: cannot access 'file1': No such file or directory
Explanations:
The -P option of m4 modifies all builtin macro names so they all start with the m4_ prefix. It is not absolutely needed but it makes the source code easier to read.
The sample.m4 file is your source file, the one you edit. The:
m4 -P sample.m4 > sample.md
command preprocesses the source file to produce the markdown file.
The m4_changequote macro at the beginning of sample.m4 changes the quotes that m4 uses for text strings. Use any left and right quotes you want (""" in our example) as long as it is not used in your markdown text.
m4_dnl is the macro that suppresses the rest of the line, including the newline character.
m4_esyscmd("""cmd""") substitutes the output of the cmd shell script.
Note: I assumed that you wanted grep -Po '(?<=App_Version=).*' file1 instead of cat file1 | grep 'App_Version' | grep -P '(?<==).*' which looks like several anti-patterns at once.
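If m4 is not available, the same preprocessing idea can be sketched in plain shell. This is a different technique from the m4 answer, not a drop-in equivalent: it uses a here-document as the template and sed instead of grep -Po, since -P is a GNU grep extension. The scratch directory and the extra Other=1 line are assumptions for the demo.

```shell
# Hypothetical input file, mirroring file1 from the question.
workdir=$(mktemp -d)
printf 'Other=1\nApp_Version=4.0.2\n' > "$workdir/file1"

# sed -n with the p flag prints only lines where the substitution
# happened, which leaves just the value after "App_Version=".
version=$(sed -n 's/^App_Version=//p' "$workdir/file1")

# An unquoted here-document expands $version while the file is written.
cat > "$workdir/sample.md" <<EOF
component1=1.2
components2=2.3
component3=$version
EOF

cat "$workdir/sample.md"
```

The trade-off is that the template now lives inside a script, so any literal $ or backquote in the markdown text would need escaping.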

Convert directory of epub/pdf (and other text files) to .txt using pandoc

This is how far I've gotten:
ls $1 | while read x; do echo "pandoc '$x' -o $x" | sed 's/\.[^.]*$//' | sed 's/$/.txt/' | sed 's/$/ --wrap=preserve/' ; done
What this does is it prints out the command that you'd have to run for every file to convert a file to TXT using pandoc.
The problem is if you replace echo with eval it doesn't work. I hypothesize because you are modifying the command after you run it ... somehow? Yet still it's printing properly?
The result I got when I ran it with eval was just it copying EPUB files as opposed to converting them which explains my hypothesis.
So my question is, how can I run every command as it's actually printed like so:
pandoc 'Complicity - Iain Banks.epub' -o Complicity - Iain Banks.txt --wrap=preserve
Something like this should do that:
for f in *; do
pandoc "$f" -o "${f%.*}.txt" --wrap=preserve
done
What do you need ls/eval for?
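The part of that loop doing the renaming is the ${f%.*} parameter expansion. A small sketch of what it produces, with pandoc left out; txt_name is a hypothetical helper used only here:

```shell
# txt_name is a hypothetical helper used only for this illustration.
txt_name() {
    # ${1%.*} removes the shortest suffix matching ".*",
    # i.e. only the last extension of the name.
    printf '%s\n' "${1%.*}.txt"
}

txt_name 'Complicity - Iain Banks.epub'
txt_name 'archive.tar.gz'
```

Because the expansion is quoted, spaces in the file name survive intact, which is what the echo/eval pipeline in the question could not guarantee.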

need to clean file via SED or GREP

I have these files
NotRequired.txt (contains the lines that need to be removed)
Need2CleanSED.txt (big file that needs cleaning)
Need2CleanGRP.txt (big file that needs cleaning)
content:
more NotRequired.txt
[abc-xyz_pqr-pe2_123]
[lon-abc-tkt_1202]
[wat-7600-1_414]
[indo-pak_isu-5_761]
I am reading the file above and want to remove its lines from the Need2Clean???.txt files. I have tried both sed and grep, but without success.
myFile="NotRequired.txt"
while IFS= read -r HKline
do
sed -i '/$HKline/d' Need2CleanSED.txt
done < "$myFile"
myFile="NotRequired.txt"
while IFS= read -r HKline
do
grep -vE \"$HKline\" Need2CleanGRP.txt > Need2CleanGRP.txt
done < "$myFile"
It looks as if the variable and the [] characters are causing the problem.
What you're doing is extremely inefficient and error prone. Just do this:
grep -vF -f NotRequired.txt Need2CleanGRP.txt > tmp &&
mv tmp Need2CleanGRP.txt
Thanks to grep -F the above treats each line of NotRequired.txt as a string rather than a regexp so you don't have to worry about escaping RE metachars like [ and you don't need to wrap it in a shell loop - that one command will remove all undesirable lines in one execution of grep.
Never do command file > file, btw: the shell sets up the > file redirection before it runs command, so file is emptied before command gets a chance to read it! Always do command file > tmp && mv tmp file instead.
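A small self-contained run of that grep command may help. The two patterns come from the question; the contents of the file being cleaned are invented for the demo.

```shell
workdir=$(mktemp -d)
printf '%s\n' '[abc-xyz_pqr-pe2_123]' '[lon-abc-tkt_1202]' \
    > "$workdir/NotRequired.txt"
# Invented sample data: two lines contain an unwanted string, two do not.
printf '%s\n' 'keep this line' 'drop [lon-abc-tkt_1202] here' \
    '[abc-xyz_pqr-pe2_123]' 'keep abc too' \
    > "$workdir/Need2CleanGRP.txt"

# -F: patterns are fixed strings; -f: read them from a file;
# -v: print only the lines that match none of them.
grep -vF -f "$workdir/NotRequired.txt" "$workdir/Need2CleanGRP.txt" \
    > "$workdir/tmp" &&
    mv "$workdir/tmp" "$workdir/Need2CleanGRP.txt"

cat "$workdir/Need2CleanGRP.txt"
```

Note that the line containing only "abc" is kept: with -F the brackets are part of the string to find, not a character class.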
Your assumption is correct. The [...] construct looks for any characters in that set, so you have to preface ("escape") them with \. The easiest way is to do that in your original file:
sed -i -e 's:\[:\\[:' -e 's:\]:\\]:' "${myFile}"
If you don't like that, you can escape the brackets on the fly where the file is fed to the loop (this uses bash process substitution):
done < <(sed -e 's:\[:\\[:' -e 's:\]:\\]:' "${myFile}")
Finally, you can use sed on each HKline variable:
HKline=$(printf '%s\n' "$HKline" | sed -e 's:\[:\\[:' -e 's:\]:\\]:')
Try GNU sed:
sed -Ez 's/\n/\|/g;s!\[!\\[!g;s!\]!\\]!g; s!(.*).!/\1/d!' NotRequired.txt| sed -Ef - Need2CleanSED.txt
Two sed processes are chained by a shell pipe.
The first sed slurps NotRequired.txt all at once (-z), replaces each \n with | and escapes the [ and ] metacharacters as \[ and \], then wraps the result as /.../d. The second sed reads that as its script (-f -) and applies it to the input file, i.e. Need2CleanSED.txt. Output of the first process:
/\[abc-xyz_pqr-pe2_123\]|\[lon-abc-tkt_1202\]|\[wat-7600-1_414\]|\[indo-pak_isu-5_761\]/d
Add the -u (unbuffered) option if you want sed to load minimal amounts of input and flush its output more often instead of working in batches.

Using sed in makefile

I am trying to use sed in makefile as shown below. But it doesn't seem to produce the modified file. I have tried the sed command in the shell and made sure it works.
ana:
-for ana1 in $(anas) ; do \
for ana2 in $(anas) ; do \
sed "s/STF1/$$ana1/g" ./planalysis/src/analysis.arr > ./planalysis/src/spanalysis.arr ; \
sed "s/STF2/$$ana2/g" ./planalysis/src/spanalysis.arr > ./planalysis/src/spanalysis.arr ; \
# ... perform some analysis with the modified file..
# ...
done \
done
Is there something I'm doing wrong?
This command won't do what you expect:
sed "s/STF2/$$ana2/g" ./planalysis/src/spanalysis.arr > ./planalysis/src/spanalysis.arr ;
regardless of whether you execute it in a Makefile or in your shell.
If you execute
some-command my.file > my.file
You are likely to end up with an empty my.file, regardless of the command, because redirections are set up before the command executes. As soon as the shell does the redirection > my.file, that file is emptied, because that's what output redirection does. When the command eventually executes and attempts to read my.file, it will find that the file is empty. In the case of sed, and many other commands, an empty input produces an empty output, and I suppose that is what you are seeing.
Use a temporary file, or use sed -i (see man sed) or, as suggested by @AlexeySemenyuk in a comment, combine the edits into a single sed invocation:
sed -e "s/STF1/$$ana1/g" -e "s/STF2/$$ana2/g" \
./planalysis/src/analysis.arr > ./planalysis/src/spanalysis.arr
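The truncation is easy to reproduce outside of Make. The sketch below uses a scratch file and fixed replacement strings (a and b) as stand-ins for the real analysis file and the loop variables.

```shell
workdir=$(mktemp -d)
f="$workdir/my.file"

printf 'STF1 STF2\n' > "$f"
# The shell truncates $f for output before sed starts, so sed reads nothing.
sed 's/STF1/a/' "$f" > "$f"
size_after_clobber=$(wc -c < "$f" | tr -d ' ')

printf 'STF1 STF2\n' > "$f"
# Both edits in one sed call, written to a temporary file that is
# renamed afterwards, so the input is never clobbered while being read.
sed -e 's/STF1/a/' -e 's/STF2/b/' "$f" > "$f.tmp" && mv "$f.tmp" "$f"

echo "size after clobber: $size_after_clobber"
cat "$f"
```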

using sed to find and replace in bash for loop

I have a large number of words in a text file to replace.
This script is working up until the sed command where I get:
sed: 1: "*.js": invalid command code *
PS... Bash isn't one of my strong points - this doesn't need to be pretty or efficient
cd '/Users/xxxxxx/Sites/xxxxxx'
echo `pwd`;
for line in `cat myFile.txt`
do
export IFS=":"
i=0
list=()
for word in $line; do
list[$i]=$word
i=$[i+1]
done
echo ${list[0]}
echo ${list[1]}
sed -i "s/{$list[0]}/{$list[1]}/g" *.js
done
You're running BSD sed (under OS X), therefore the -i flag requires an argument specifying what you want the suffix to be.
Also, no files match the glob *.js.
This looks like a simple typo:
sed -i "s/{$list[0]}/{$list[1]}/g" *.js
Should be:
sed -i "s/${list[0]}/${list[1]}/g" *.js
(just like the echo lines above)
So myFile.txt contains a list of from:to substitutions, and you are looping over each of those. Why don't you create a sed script from this file instead?
cd '/Users/xxxxxx/Sites/xxxxxx'
sed -e 's/^/s:/' -e 's/$/:/' myFile.txt |
# Output from first sed script is a sed script!
# It contains substitutions like this:
# s:from:to:
# s:other:substitute:
sed -f - -i~ *.js
Your sed might not like the -f - which means sed should read its script from standard input. If that is the case, perhaps you can create a temporary script like this instead;
sed -e 's/^/s:/' -e 's/$/:/' myFile.txt >script.sed
sed -f script.sed -i~ *.js
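To make the generated script visible, here is a minimal run on scratch data. The from:to pairs and the app.js content are invented, and the result is printed to stdout rather than edited in place so that the -i differences between sed implementations do not matter.

```shell
workdir=$(mktemp -d)
# Invented substitution list and input file for the demo.
printf 'foo:bar\nold:new\n' > "$workdir/myFile.txt"
printf 'foo uses old\n' > "$workdir/app.js"

# Turn each "from:to" line into the sed command "s:from:to:".
sed -e 's/^/s:/' -e 's/$/:/' "$workdir/myFile.txt" > "$workdir/script.sed"

cat "$workdir/script.sed"
sed -f "$workdir/script.sed" "$workdir/app.js"
```

The generated commands carry no g flag, so each one replaces only the first match on a line; changing the second expression to 's/$/:g/' would make them global, matching the /g in the original loop.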
Another approach, if you don't feel very confident with sed and think you will forget in a week what those voodoo symbols mean, is to use IFS in a more efficient way:
while IFS=":" read -r PATTERN REPLACEMENT # read the fields separated by ":" from each line of myFile.txt
do
sed -i~ "s/${PATTERN}/${REPLACEMENT}/g" *.js
done < myFile.txt
The only pitfall I can see (there may be more) is that if either PATTERN or REPLACEMENT contains a slash (/), it will break your sed expression.
You can switch the sed separator to a non-printable character and you should be safe.
Anyway, if you know what is in your myFile.txt, you can just pick any character that does not occur in it.
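A sketch of the non-printable separator idea, assuming the SOH control character (octal 001) never appears in the data; replace_all is a hypothetical helper name.

```shell
# SOH (octal 001) is assumed never to occur in patterns or replacements.
sep=$(printf '\001')

# replace_all is a hypothetical helper: $1 = pattern, $2 = replacement,
# input is read from stdin.
replace_all() {
    sed "s${sep}$1${sep}$2${sep}g"
}

printf 'src/lib/util.js\n' | replace_all 'src/lib' 'dist/vendor'
```

This only removes the clash with the separator. The pattern is still a regular expression and & is still special in the replacement, so those characters would need escaping separately.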
