running external program in TCL - events

After developing an elaborate Tcl code to do smoothing based on Gabriel Taubin's smoothing without shape shrinkage, the code runs extremely slow. This is likely due to the size of the unstructured grid I am smoothing. I have to use Tcl because the grid generator I am using is Pointwise, and Pointwise's "macro language" is Tcl based. I'm still a bit new to this, but is there a way to run an external code from Tcl where Tcl sends the data to the software, the software runs the smoothing operation, and the output is sent back to Tcl to update the internal data inside the Pointwise grid generation tool? I will be writing the smoothing tool in another language which is significantly faster.

There are a number of options to deal with code that "runs extremely slow". I would start with determining how fast it must run. Are we talking milliseconds, seconds, minutes, hours or days? Next it is necessary to determine which part is slow. The time command is useful here.
But assuming you have decided that more performance is necessary and you have some metrics for your current program so you will know if you are improving, here are some things to try:
Try to improve the existing code. If you are using the expr command, make sure your expressions are given to the command as a single argument enclosed in braces. Beginners sometimes forget this and the improvement can be substantial.
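A minimal sketch of the difference (the variable names here are just placeholders):
# Unbraced: Tcl substitutes the variables first, then expr has to
# re-parse the resulting string on every call -- slow, and unsafe with user data.
set y [expr $a * $b + $c]
# Braced: expr receives a single literal argument that can be byte-compiled once.
set y [expr {$a * $b + $c}]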
Use the critcl package to code parts of the program in "C". Critcl allows you to put "C" code directly into your Tcl program and have that code pulled out, compiled and loaded into your program.
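A rough sketch of what that looks like (the procedure name and arguments are invented for illustration, and critcl needs a C compiler available when the script runs):
package require critcl
# The body below is plain C; critcl compiles it on the fly and exposes it
# as a Tcl command named weighted_sum.
critcl::cproc weighted_sum {double lambda double x double y} double {
    return lambda * x + (1.0 - lambda) * y;
}
puts [weighted_sum 0.33 1.0 2.0]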
Write a traditional "C" based Tcl extension. Tcl is very extensible and has a clean API for building extensions. There is sample code for extensions and source to many extensions is readily available.
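A bare-bones skeleton of such an extension might look like this (the command and package names are only illustrative):
#include <tcl.h>

static int
SmoothCmd(ClientData cd, Tcl_Interp *interp, int objc, Tcl_Obj *const objv[])
{
    /* Heavy numeric work goes here; read the arguments from objv. */
    Tcl_SetObjResult(interp, Tcl_NewStringObj("done", -1));
    return TCL_OK;
}

int
Smooth_Init(Tcl_Interp *interp)
{
    if (Tcl_InitStubs(interp, "8.5", 0) == NULL) {
        return TCL_ERROR;
    }
    Tcl_CreateObjCommand(interp, "smooth", SmoothCmd, NULL, NULL);
    return Tcl_PkgProvide(interp, "smooth", "1.0");
}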
Write a program to do the time-consuming part of the job, execute it as a separate process, and get the output back into your Tcl script. This is where the exec command comes in useful. Presumably you will have to write data out to somewhere the program can get it and read the output of the program back into your Tcl script. If you want to get fancy you can do two-way communication across a localhost TCP port. The setup in Tcl is quite simple. The "C" code in a program to do it is a bit more tedious, but many examples exist out on the Internet.
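The Tcl side of such a socket exchange is only a few lines; here is a sketch (the port number and the line-per-point protocol are invented for illustration):
set sock [socket localhost 9900]
fconfigure $sock -buffering line
foreach point $points {
    puts $sock $point          ;# send one grid point per line
}
puts $sock "END"
set smoothed {}
while {[gets $sock line] >= 0 && $line ne "END"} {
    lappend smoothed $line     ;# collect the smoothed points
}
close $sock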
Which option to choose depends very much on how much improvement is required and the amount of code that must be improved. You haven't given us much idea what those things are in your case, so all I can offer is rather vague general solutions.

For a loadable module, you can write a Tcl extension. An example is here:
File Last Modified Time with Milliseconds Precision
Alternatively, just write your program to take input from a file. Have Tcl write the input data to the file, run the program, then collect the output from the external program.
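A sketch of that round trip, assuming the external smoother is an executable called smoother that reads in.dat and writes out.dat (all names invented here):
set fh [open in.dat w]
foreach point $points { puts $fh $point }
close $fh

exec ./smoother in.dat out.dat

set fh [open out.dat r]
set smoothed [split [string trim [read $fh]] \n]
close $fh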

Related

Compile shell script to make it totally unreadable

I need to compile my shell script, because I want to protect its source code. I already read about shc, but I also read, that it isn't completely safe, because with a small amount of knowledge (or brain and google) any user can 'decompile' it. Is there a way to compile my script to make it executable, but completely unreadable and 'undecompileable'?
You can only make it harder to read for a human being. Scripts are plain text files; they have to be readable by the script interpreter.
I made my own bash obfuscator (obash) in a way that the keys required to extract the script are not stored inside the script (but that makes the script non-distributable). There is an option to generate a static reusable binary, but that will only execute on systems that have several levels of compatibility (kernel system call interface, glibc, basic binary compatibility).
Making the redistributable binary also implies storing a key in the generated binary, but the stored keys need to be manipulated before use.
You might like to try obash and see if it serves you any better than shc.

jq or xsltproc alternative for s-expressions?

I have a project which contains a bunch of small programs tied together using bash scripts, as per the Unix philosophy. Their exchange format originally looked like this:
meta1a:meta1b:meta1c AST1
meta2a:meta2b:meta2c AST2
Where the :-separated fields are metadata and the ASTs are s-expressions which the scripts pass along as-is. This worked fine, as I could use cut -d ' ' to split the metadata from the ASTs, and cut -d ':' to dig into the metadata. However, I then needed to add a metadata field containing spaces, which breaks this format. Since no field uses tabs, I switched to the following:
meta1a:meta1b:meta1c:meta 1 d\tAST1
meta2a:meta2b:meta2c:meta 2 d\tAST2
Since I envision more metadata fields being added in the future, I think it's time to switch to a more structured format rather than playing a game of "guess the punctuation".
Instead of delimiters and cut I could use JSON and jq, or I could use XML and xsltproc, but since I'm already using s-expressions for the ASTs, I'm wondering if there's a nice way to use them here instead?
For example, something which looks like this:
(echo '(("foo1" "bar1" "baz1" "quux 1") ast1)'
echo '(("foo2" "bar2" "baz2" "quux 2") ast2)') | sexpr 'caar'
"foo1"
"foo2"
My requirements are:
Straightforward use of stdio with minimal boilerplate, since that's where my programs read/write their data
Easily callable from shell scripts or provide a very compelling alternative to bash's process invocation and pipelining
Streaming I/O if possible; i.e. I'd rather work with one AST at a time rather than consuming the whole input looking for a closing )
Fast and lightweight, especially if it's being invoked a few times; each AST is only a few KB, but they can add up to hundreds of MB
Should work on Linux at least; cross-platform would be nice
The obvious choice is to use a Lisp/Scheme interpreter, but the only one I'm experienced with is Emacs, which is far too heavyweight. Perhaps another implementation is more lightweight and suited to this?
In Haskell I've played with shelly, turtle and atto-lisp, but most of my code was spent converting between String/Text/ByteString, wrapping/unwrapping Lisps, implementing my own car, cdr, cons, etc.
I've read a little about scsh, but don't know if that would be appropriate either.
You might give Common Lisp a try.
Straightforward use of stdio with minimal boilerplate, since that's where my programs read/write their data
(loop for (attributes ast) = (safe-read) do (print ...))
Read/write from standard input and output.
safe-read should disable execution of code at read-time. There is at least one implementation. Don't eval your AST directly unless you perfectly know what's in there.
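For illustration, a minimal sketch that only binds *read-eval* to nil (a real safe-read does more than this) and reads forms from standard input one at a time, assuming each form is a (metadata ast) pair:
(let ((*read-eval* nil))                        ; forbid #. read-time evaluation
  (loop for form = (read *standard-input* nil :eof)
        until (eq form :eof)
        do (destructuring-bind (metadata ast) form
             (declare (ignore ast))
             (print (first metadata)))))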
Easily callable from shell scripts or provide a very compelling alternative to bash's process invocation and pipelining
In the same spirit as java -jar ..., you can launch your Common Lisp implementation, e.g. sbcl, with a script as argument: sbcl --load file.lisp. You can even dump a core or an executable core of your application with everything preloaded (save-lisp-and-die).
Or, use cl-launch which does the above automatically, and portably, and generates shell scripts and/or makes executable programs from your code.
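For example, a build script along these lines (file and function names are placeholders) loads your code and dumps a standalone executable:
;; build.lisp
(load "process.lisp")
(sb-ext:save-lisp-and-die "process" :toplevel #'main :executable t)
Run it once with sbcl --non-interactive --load build.lisp; afterwards ./process starts without the compiler warm-up.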
Streaming I/O if possible; i.e. I'd rather work with one AST at a time rather than consuming the whole input looking for a closing )
If the whole input stream starts with a (, then read will read up to the closing ) character, but in practice this is rarely done: source code in Common Lisp is not enclosed in one pair of parentheses per file, but written as a sequence of forms. If your stream produces not one but many s-exps, the reader will read them one at a time.
Fast and lightweight, especially if it's being invoked a few times; each AST is only a few KB, but they can add up to hundreds of MB
Fast it will be, especially if you save a core. Lightweight, well, it is well known that Lisp images can take some disk space (e.g. 46MB), but this is rarely an issue. Why is this important? Maybe you have another definition of what lightweight means, because this is unrelated to the size of the ASTs you will be parsing. There should be no problem reading those ASTs, though.
Should work on Linux at least; cross-platform would be nice
See Wikipedia. For example, Clozure CL (CCL) runs on Mac OS X, FreeBSD, Linux, Solaris and Windows, 32/64 bits.
Working on a slightly different task, I again found the need to process a bunch of s-expressions. This time I needed to perform some non-trivial processing of the given s-expressions (extracting lists of symbols used, etc.), rather than having the option to pass them along as opaque strings.
I gave Racket a try and was pleasantly surprised; it was much nicer than the other Lisps I've used before (Emacs Lisp and various application-specific Scheme scripts), since it has nice documentation and a batteries included standard library.
Some of the relevant points for this kind of task:
"Ports" for reading and writing data. These can be (dynamically?) scoped across an expression, and default to stdio (i.e. (current-input-port) defaults to stdin and (current-output-port) defaults to stdout). Ports make stdio and file access about as nice to use as a shell: more verbose, but fewer gnarly edge-cases.
Various conversion functions like port->string, file->lines, read, etc. make it easy to get data at the appropriate form of granularity (characters, lines, strings, expressions, etc.).
I couldn't find a "standard" way to read multiple s-expressions, since read only returns one, so iteration/recursion is needed to do this in a streaming fashion (see the sketch at the end of this answer).
If streaming isn't needed, I found it easiest to read the whole input as a string, append "(\n" and "\n)", then use (with-input-from-string my-modified-input read) to get one big list.
I found Racket's startup time to be pretty slow, so I wouldn't recommend invoking a script over and over as part of a loop if speed is a concern. It was easy enough to move my looping into Racket and have the script invoked once though.
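Here is a minimal sketch of that streaming loop, assuming each datum on stdin is a two-element list of the form (metadata ast):
#lang racket
;; Read s-expressions from stdin one at a time and print the first
;; metadata field of each, stopping at end of file.
(let loop ()
  (define datum (read (current-input-port)))
  (unless (eof-object? datum)
    (match datum
      [(list metadata ast) (writeln (first metadata))])
    (loop)))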

How can I have different states of expansion in LuaTex/LuaLaTex for debugging for instance?

I am preparing LaTeX/TeX fragments with Lua programs that get their information from SQL requests to a database (LuaSQL).
I wish I could see the intermediate states of expansion for debugging purposes, but also to check what has been brought in by the SQL requests and the Lua processing.
My dream would be, for instance, to see the code of my LaTeX page as if I had typed it myself manually, with all the information given by the SQL requests and the Lua processing.
I would then have a first processing pass, with my Lua programs and SQL requests, to build valid and readable LuaLaTeX code that I could amend if necessary; then I would compile that file again to obtain the wanted PDF document.
Today, I use a Lua development environment, ZeroBrane Studio, to execute and test the Lua chunk before I integrate it into my LuaLaTeX code. For instance:
My Lua chunk:
for k, v in pairs(data.param) do
  print('\\gdef\\' .. k .. '{' .. v .. '}')
end
The Lua printout:
\gdef\pB{0.7}
\gdef\pAinterB{0.5}
\gdef\pA{0.4}
\gdef\pAuB{0.6}
LuaLaTeX code:
nothing visible! except that I can now use \pA, for instance, in the code
My dream would be to have, in the LuaLaTeX code:
\gdef\pB{0.7}
\gdef\pAinterB{0.5}
\gdef\pA{0.4}
\gdef\pAuB{0.6}
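For illustration, this is the kind of mirroring I have in mind, as a sketch (the file name generated.tex is just an example): each generated line would go both to TeX and to an external file I can read and amend later.
local dump = io.open("generated.tex", "a")
for k, v in pairs(data.param) do
  local line = '\\gdef\\' .. k .. '{' .. v .. '}'
  tex.print(line)          -- fed to the current LuaLaTeX run
  dump:write(line, "\n")   -- kept as human-readable LaTeX source
end
dump:close()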
Maybe a solution would lie in the use of the expl3 package? But since I am not familiar with it, nor with the precise TeX expansion process, I prefer to ask you experts before I invest heavily in understanding this module.
Addition:
Pushing the reflection further, a consequence would be that from a LaTeX source I get LaTeX source instead of, say, a PDF file. This implies using only the first steps of the four TeX processors described by Eijkhout in "TeX by Topic": the input processor and the expansion processor (but with a controlled depth of expansion), not the execution processor nor the visual processor. Moreover, there would be a need to show the intermediate state, which means a new processor able to turn tokens back into readable strings and correct TeX/LaTeX code that can be processed later.
Unless somebody has already done or seen something like that, I feel that my wish may be unfeasible in the short and medium term. What is your feeling? Should I abandon all hope?

Debugging a program without source code (Unix / GDB)

This is homework. Tips only, no exact answers please.
I have a compiled program (no source code) that takes in command line arguments. There is a correct sequence of a given number of command line arguments that will make the program print out "Success." Given the wrong arguments it will print out "Failure."
One thing that is confusing me is that the instructions mention two system tools (doesn't name them) which will help in figuring out the correct arguments. The only tool I'm familiar with (unless I'm overlooking something) is GDB so I believe I am missing a critical component of this challenge.
The challenge is to figure out the correct arguments. So far I've run the program in GDB and set a breakpoint at main but I really don't know where to go from there. Any pro tips?
Are you sure you have to debug it? It would be easier to disassemble it. When you disassemble it, look for cmp instructions.
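For instance, a plain disassembly listing can be produced and searched like this (the binary name is a placeholder):
objdump -d ./mystery > mystery.asm       # full disassembly
grep -n 'cmp\|call.*strcmp' mystery.asm  # comparison instructions and strcmp calls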
There exist not only tools to disassemble x86 binaries into assembler listings, but also some which attempt to show a more high-level, readable listing. Try googling and see what you find. I'd be specific, but then, that would be counterproductive if your job is to learn some reverse-engineering skills.
It is possible that the code is something like this: if Arg(1)='FOO' then print "Success". So you might not need to disassemble at all. Instead you might only need to find a tool which dumps out all the strings in the executable that look like sequences of ASCII characters; many utilities will do this, as long as the sequence you are supposed to input is made of characters easily typed at a keyboard. If the program has been very carefully constructed, the author won't have left "FOO" in plain sight if that was the "password", but will have tried to obscure it somewhat.
Personally I would start with an ltrace of the program with any arbitrary set of arguments. I'd then use the strings command and guess from that what some of the hidden argument literals might be. (Let's assume, for the moment, that the professor hasn't encrypted or obfuscated the strings and that they appear in the binary as literals.) Then try again with one or two arguments (or the requisite number, if you know it).
If you're lucky the program was compiled and provided to you without running strip. In that case you might have the symbol table to help. Then you could try single stepping through the program (read the gdb manuals). It might be tedious, but there are ways to set a breakpoint and tell the debugger to run through some function call (such as any from the standard libraries) and stop upon return. Do this repeatedly: identify where it's calling into standard or external libraries, set a breakpoint for the next instruction after the return, let gdb run the process through the call, and then inspect what the code is doing around that.
Coupled with the ltrace it should be fairly easy to see the sequencing of the strcmp() (or similar) calls. As you see the string against which your input is being compared, you can break out of the whole process and re-invoke gdb and the program with that one argument, trace through 'til the next one, and so on. Or you might learn some more advanced gdb tricks and actually modify your argument vector and restart main() from scratch.
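A generic session might look like this (the binary name and arguments are placeholders, and the register names assume x86-64):
strings ./mystery | less                    # printable strings in the binary
ltrace ./mystery foo bar 2>&1 | grep cmp    # library calls such as strcmp()
gdb ./mystery
(gdb) break strcmp
(gdb) run foo bar
(gdb) x/s $rdi                              # first argument to strcmp
(gdb) x/s $rsi                              # second argument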
It actually sounds like fun and I might have my wife whip up a simple binary for me to try this on. I might also create a little program to generate binaries of this sort. I'm thinking of a little #INCLUDE in the sources which provides the "passphrase" of arguments, and a makefile that selects three to five words from /usr/dict/words, generates that #INCLUDE file from a template, then compiles the binary using that sequence.

BASH shell process control - any other examples of controlling/scheduling work

I've inherited a medium sized project in which the main (batch) program is fed work through a large set of shell scripts that do a lot of process control (waiting for process to complete, sleeping, checking for conditions, etc) [ and reprocessed through perl scripts ]
Are there other examples of process control by shell scripts? I would like to see what other people have done as a comparison (as I'm not really fond of the 6,668-line shell script).
The comparison may lead to the conclusion that the current program works and doesn't need to be messed with, or, for maintenance reasons, that it's too cumbersome and doing it another way will be easier to maintain; but I need other examples.
To reduce the "generality" of the question here's an example of what I'm looking for: procsup
The Inquisitor project relies extensively on process control from shell scripts. You might want to look at its directory with the main function set or the directory with the tests (i.e. slave processes) that it runs.
This is quite a general question, and therefore giving specific answers may be a little bit difficult. (And you won't be happy with a 5,000-line-long example.) Most probably the architecture of your application is faulty and requires a rather complete rework.
As you probably already know, process control with bash is pretty simple:
./test_script.sh &
test_script_pid=$!
wait $test_script_pid # waits until it's done
./test_script2.sh
echo $? # Prints return code of previous command
You can do the same things with, for example, Python's subprocess module (or with Perl, obviously). If you have a complex architecture with a large number of different programs, then the process is obviously non-trivial.
That is an awfully big shell script. Have you considered refactoring it?
From the sound of it, there may be a lot of instances where you could replace several lines of code with a call to a shell function. If you can simplify the code in this way, then it will be easier to see where there are errors in the logic.
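A hypothetical sketch of that kind of factoring (all names invented here): a single function wraps the recurring "launch, wait, check the result" pattern.
run_step() {
    local name=$1; shift
    "$@" &                       # launch the step in the background
    local pid=$!
    wait "$pid"                  # block until it finishes
    local rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "step '$name' failed with code $rc" >&2
        return "$rc"
    fi
}

run_step extract ./extract_data.sh input.dat
run_step load    ./load_batch.sh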
I've used this tactic successfully with a humongous Perl script, and it turned out to have some serious logic errors and to be a security risk because it had embedded passwords that were obfuscated in an easily reversible way. The passwords that were exposed could have been used by persons unknown (well, a disgruntled employee) to shut down an entire global network.
Some managers were leaning towards making a security exception because this script was so important, but when the logic error was explained and it was clear that this script was providing incorrect data, it was decided that no data was better than dirty data. The guy who wrote that script taught himself programming with a Perl book and the writing of the script.
