Why is the function compose defined in the interpreter csi and not in the compiler csc? I know I can easily define it by myself, but I was just wondering why there is such a difference.
The compose procedure is from the unit data-structures (see the manual or API docs). You can load it by typing (use data-structures).
Why it's available in csi but not in csc by default is a bit of a hairy implementation detail: the csi interpreter simply needs to load a few modules itself in order to provide an interpreter environment. Due to the way it's implemented, everything that's loaded by csi also becomes available at the top-level. In CHICKEN 5, this situation has improved quite a bit and the interpreter starts with a clean top-level environment.
So what this means in practice is that you should always explicitly (use) all modules that your program needs, to ensure that it works in compiled mode as well as interpreted mode. Instead of putting your program at the top-level environment, you can also wrap your entire program inside a module. Modules always have a completely clean environment, so there will be no difference when you compile it versus when you interpret it.
In Python, you can use pdb.set_trace() in the code to launch a pdb debugger right there when the code reaches that point, without having to deal with debuggers or breakpoints. Is there an equivalent with gdb or any other debugger for Go? I see https://golang.org/doc/gdb#Naming but I don't see how to apply it the same way.
No, there is no such equivalent. Python is inherently interpreted [1] and pdb is simply part of any running instance of Python, so this is a lot easier there.
Once you are running under gdb or dlv, though, it's not that hard to set a breakpoint in some known function. Calling that function from the point at which you want to drop into the debugger will drop you into the debugger. So instead of pdb.set_trace just call debugging.Stop() and write a debugging package with a Stop function that just returns. Set your breakpoint there and run your program.
[1] Python can in theory be compiled or JIT-ted, but this tends not to work as well as with other languages due to the extremely dynamic nature of the language (method invocation, for instance). Adding a few small restrictions to the language, none of which make it less usable, would make compilation to fast code much easier. For further details see Does the Python 3 interpreter have a JIT feature? (Removing the Global Interpreter Lock would have a big payoff as well, but is also hard: see the PyPy FAQ.)
TLDP's Advanced Bash Scripting Guide states that shell scripts shouldn't be used for "situations where security is important, where you need to guarantee the integrity of your system and protect against intrusion, cracking, and vandalism."
What makes shell scripts unsuitable for such a use case?
Because of the malleability of the shell, it is difficult to verify that a shell script performs its intended function and only that function in the face of adversarial input. The way the shell behaves depends on the environment, plus the settings of its own numerous configuration variables. Each command line is subject to multiple levels of expansion, evaluation and interpolation. Some shell constructs run in subprocesses while the variables the construct contains are expanded in the parent process. All of this is counter to the KISS principle when designing systems that might be attacked.
Probably because it's just easy to screw up. When the PATH is not set correctly, your script will start executing the wrong commands. Putting a space somewhere in a string might cause it to become two strings later on. These can lead to exploitable security holes. In short: shells give you some guarantees as to how your script will behave, but they're too weak or too complex for truly secure programming.
(To this I would like to add that secure programming is an art in itself, and screwing up is possible in any language.)
I would disagree with that statement, as there is nothing about scripts that makes them inherently unsafe. Bash scripts are perfectly safe if some simple guidelines are followed:
Does the script contain info that others shouldn't be able to view? If so, make sure it's only readable by the owner.
Does the script depend on input data from somewhere? If so, ensure that input data cannot be tainted in any way, or that tainted data can be detected and discarded.
Does it matter if others were to try and run the script? If so, as with the first point, ensure that nobody else can execute it, and preferably not read it. chmod 0700 is generally a good idea for scripts that perform system functions.
And the cases where you'd want a script to have a setuid (via its interpreter) are extremely rare.
The two points that separate a script from a compiled program would be that the source is visible, and that an interpreter executes it. As long as the interpreter hasn't been compromised (such as having a setuid bit on it), you'd be fine.
When writing scripts to do system tasks, typos, screwups and general human error do to some extent represent a potential security failure, but that would also be the case with compiled programs (and a lot of people tend to ignore the fact that compiled programs can also be disassembled).
It is worth noting that in most (if not all) Linux flavors, most services (if not all; in fact, I can't think of any that aren't) are started via a shell script.
It's easier for bad boys to make a shell script work differently (it interacts a lot with other processes, PATH, shell functions, profile files).
It's harder for good boys to deal with sensitive data (passing passwords, etc.).
Is there a way to hide a system call from strace and a dynamic library call from ltrace? For example, the use of system (<stdlib.h>).
In the last class for my software construction this semester, the instructor revealed to us that we could have gotten away with using the system library function call in many parts of the command shell project we were assigned instead of the more complicated fork, exec, readdir, stat, dup, and pipe system calls we were told to use.
The way system works, he said, is you simply pass in a string of the command you want to execute: system("cmd [flags] [args]; cmd && cmd"); and there you are.
We were not supposed to use this function, but he said he didn't check our programs for it. One way to hide its use would have been to obscure it through macro definitions and such. However, ltrace is still able to track system down when used through macros. I believe it even finds it when it's called from a separate program, like execvp("./prgrm_with_system", ...).
My chance to use it is gone, but I am really curious about whether there is a way to hide system from even ltrace.
system() doesn't do anything that's magic. It doesn't even do anything that's smart (and using it is often a code smell). It also isn't a system call in the sense that the term "syscall" refers to.
You could trivially create your own version of system() using the underlying syscalls fork() and execve(), and bypass detection with ltrace... but strace would still show those calls happening.
You also could bypass ltrace with static linking, but since syscalls are by definition for things that require the OS kernel's help, you can't do without them entirely -- so tools such as strace, sysdig, truss, dtrace, and local equivalents can't be so easily avoided (without exploiting security vulnerabilities in the OS or the tools themselves).
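As a minimal sketch of the hand-rolled system() mentioned above, assuming a POSIX system with /bin/sh: the name my_system is made up for illustration, and unlike the real system() this version does no signal handling. With dynamic linking, ltrace would report fork, execve and waitpid but no call named system, while strace would still show the underlying process-creation and exec syscalls.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* my_system: hypothetical stand-in for system(), built from
 * fork/execve/waitpid. Returns the raw wait status, or -1 on error. */
int my_system(const char *command)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                      /* fork failed */

    if (pid == 0) {
        /* Child: let the shell interpret the command string. */
        char *const argv[] = { "sh", "-c", (char *)command, NULL };
        execve("/bin/sh", argv, environ);
        _exit(127);                     /* only reached if execve failed */
    }

    /* Parent: wait for the child and hand back its status. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}
```

The return value can be decoded with WIFEXITED and WEXITSTATUS, the same way as the result of system().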
The Unix philosophy teaches that we should develop small programs that do one thing well. It also teaches that we should separate policy from mechanics. I guess one way to take this is to design a text-based shell command first and build a gui on top of that later (if desired).
I truly like the idea that small programs can be composed (piped together) into more complex systems. I also like the fact that simple, focused designs should theoretically need less maintenance than a monolithic system that binds all its rules together.
How sound would it be to program something (in Ruby or Python for example) that relegates some of its functionality to shell commands called straight from the code? Taking this a step further, does it make sense to deliberately design a shell command that is intended to be called directly from code (compiled or scripted)? Obviously, this would only make sense if the shell command had some worthy console use.
I can't say from my experience that this is a practice I've seen much of. More times than not task-specific code relies on task-specific libraries. Of course, it's possible that, unbeknownst to me, I have made use of libraries which are actually just wrappers around shell commands. (Or rather the shell command is a wrapper around some library.)
The Unix paradigm is modularity. You should write your program as a bunch of modules, which can then be extracted into multiple programs if you want to. However, executing a new program whenever you'd like to make a function call is slow and impractical.
I am aware that this is nothing new and has been done several times. But I am looking for some reference implementation (or even just reference design) as a "best practices guide". We have a real-time embedded environment and the idea is to be able to use a "debug shell" in order to invoke some commands. Example: "SomeDevice print reg xyz" will request the SomeDevice sub-system to print the value of the register named xyz.
I have a small set of routines that is essentially made up of 3 functions and a lookup table:
a function that gathers a command line - it's simple; there's no command line history or anything, just the ability to backspace or press escape to discard the whole thing. But if I thought fancier editing capabilities were needed, it wouldn't be too hard to add them here.
a function that parses a line of text argc/argv style (see Parse string into argv/argc for some ideas on this)
a function that takes the first arg on the parsed command line and looks it up in a table of commands & function pointers to determine which function to call for the command, so the command handlers just need to match the prototype:
int command_handler( int argc, char* argv[]);
Then that function is called with the appropriate argc/argv parameters.
Actually, the lookup table also has pointers to basic help text for each command, and if the command is followed by '-?' or '/?' that bit of help text is displayed. Also, if 'help' is used for a command, the command table is dumped (possibly only a subset if a parameter is passed to the 'help' command).
Sorry, I can't post the actual source - but it's pretty simple and straightforward to implement, and functional enough for pretty much all the command line handling needs I've had for embedded systems development.
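For illustration only, here is a rough sketch of the parse-and-dispatch structure described above. It is not the original source: the names parse_line, dispatch_line, command_table and the echo command are all made up, the tokenizer only splits on whitespace (no quoting), and the line-gathering function is left out.

```c
#include <stdio.h>
#include <string.h>

#define MAX_ARGS 8

typedef int (*command_handler_t)(int argc, char *argv[]);

struct command_entry {
    const char *name;           /* first word on the command line */
    command_handler_t handler;  /* function to call */
    const char *help;           /* basic help text */
};

/* Example handler: prints its arguments, returns how many it printed. */
static int cmd_echo(int argc, char *argv[])
{
    for (int i = 1; i < argc; i++)
        printf("%s%s", argv[i], i + 1 < argc ? " " : "\n");
    return argc - 1;
}

static int cmd_help(int argc, char *argv[]);

static const struct command_entry command_table[] = {
    { "echo", cmd_echo, "echo <args...>: print the arguments" },
    { "help", cmd_help, "help: list all commands" },
};
#define NUM_COMMANDS (sizeof command_table / sizeof command_table[0])

/* Dumps the help text of every entry in the table. */
static int cmd_help(int argc, char *argv[])
{
    (void)argc;
    (void)argv;
    for (size_t i = 0; i < NUM_COMMANDS; i++)
        printf("%s\n", command_table[i].help);
    return 0;
}

/* Splits line in place on whitespace; returns the number of tokens. */
int parse_line(char *line, char *argv[], int max_args)
{
    int argc = 0;
    char *tok = strtok(line, " \t\r\n");
    while (tok != NULL && argc < max_args) {
        argv[argc++] = tok;
        tok = strtok(NULL, " \t\r\n");
    }
    return argc;
}

/* Looks up argv[0] in the table and calls the matching handler.
 * Returns the handler's result, or -1 for an empty or unknown command. */
int dispatch_line(char *line)
{
    char *argv[MAX_ARGS];
    int argc = parse_line(line, argv, MAX_ARGS);
    if (argc == 0)
        return -1;

    for (size_t i = 0; i < NUM_COMMANDS; i++) {
        if (strcmp(argv[0], command_table[i].name) != 0)
            continue;
        if (argc > 1 && (strcmp(argv[1], "-?") == 0 || strcmp(argv[1], "/?") == 0)) {
            printf("%s\n", command_table[i].help);
            return 0;
        }
        return command_table[i].handler(argc, argv);
    }
    printf("unknown command: %s\n", argv[0]);
    return -1;
}
```

Adding a command then only means writing a handler with the command_handler prototype and adding one row to the table. Note that the line buffer must be writable, since the tokenizer modifies it in place.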
You might bristle at this response, but many years ago we did something like this for a large-scale embedded telecom system using lex/yacc (nowadays I guess it would be flex/bison, this was literally 20 years ago).
Define your grammar, define ranges for parameters, etc... and then let lex/yacc generate the code.
There is a bit of a learning curve, as opposed to rolling a 1-off custom implementation, but then you can extend the grammar, add new commands & parameters, change ranges, etc... extremely quickly.
You could check out libcli. It emulates Cisco's CLI and apparently also includes a telnet server. That might be more than you are looking for, but it might still be useful as a reference.
If your needs are quite basic, a debug menu which accepts simple keystrokes, rather than a command shell, is one way of doing this.
For registers and RAM, you could have a sub-menu which just does a memory dump on demand.
Likewise, to enable or disable individual features, you can control them via keystrokes from the main menu or sub-menus.
One way of implementing this is via a simple state machine. Each screen has a corresponding state which waits for a keystroke, and then changes state and/or updates the screen as required.
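A minimal sketch of such a keystroke-driven state machine might look like the following. The key bindings ('m', 'f', 'b', digits), the state names and the feature_flags bitmask are all assumptions made for the example, and the screen redraw and the actual memory dump are left out.

```c
#include <stdint.h>

enum menu_state { MENU_MAIN, MENU_MEMORY, MENU_FEATURES };

struct debug_menu {
    enum menu_state state;   /* which screen is currently shown */
    uint8_t feature_flags;   /* one bit per feature, toggled from the menu */
};

/* Feeds one keystroke to the menu and updates its state. */
void menu_handle_key(struct debug_menu *menu, char key)
{
    switch (menu->state) {
    case MENU_MAIN:
        if (key == 'm')
            menu->state = MENU_MEMORY;
        else if (key == 'f')
            menu->state = MENU_FEATURES;
        break;                          /* any other key: stay here */

    case MENU_MEMORY:
        /* A real version would trigger the memory dump on some key here. */
        if (key == 'b')
            menu->state = MENU_MAIN;
        break;

    case MENU_FEATURES:
        if (key >= '0' && key <= '7')
            menu->feature_flags ^= (uint8_t)(1u << (key - '0'));
        else if (key == 'b')
            menu->state = MENU_MAIN;
        break;
    }
}
```

The main loop would simply read a character from the debug port, call menu_handle_key, and redraw the screen for the resulting state.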
vxWorks includes a command shell that embeds the symbol table and implements a C expression evaluator, so that you can call functions, evaluate expressions, and access global symbols at runtime. The expression evaluator supports integer and string constants.
When I worked on a project that migrated from vxWorks to embOS, I implemented the same functionality. Embedding the symbol table required a bit of gymnastics since it does not exist until after linking. I used a post-build step to parse the output of the GNU nm tool to create a symbol table as a separate load module. In an earlier version I did not embed the symbol table at all, but rather created a host-shell program that ran on the development host where the symbol table resided, and communicated with a debug stub on the target that could perform function calls to arbitrary addresses and read/write arbitrary memory. This approach is better suited to memory-constrained devices, but you have to be careful that the symbol table you are using and the code on the target are for the same build. Again that was an idea I borrowed from vxWorks, which supports both the target-based and host-based shell with the same functionality. For the host shell, vxWorks checksums the code to ensure the symbol table matches; in my case it was a manual (and error-prone) process, which is why I implemented the embedded symbol table.
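To give an idea of what an embedded symbol table can look like on the target side, here is a small sketch. It is an assumption about the general shape, not the format used in that project: the table is written by hand here as a stand-in for what a post-build step would generate from nm output, and led_set, adc_read and symbol_lookup are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Generic function pointer type used to store any function's address. */
typedef void (*generic_fn)(void);

struct symbol {
    const char *name;
    generic_fn addr;
};

/* Stand-ins for real target functions. */
int led_set(int on)       { return on ? 1 : 0; }
int adc_read(int channel) { return channel * 100; }

/* In a real build this table would be generated after linking. */
static const struct symbol symbol_table[] = {
    { "adc_read", (generic_fn)adc_read },
    { "led_set",  (generic_fn)led_set  },
};

/* Returns the address registered under name, or NULL if unknown. */
generic_fn symbol_lookup(const char *name)
{
    size_t count = sizeof symbol_table / sizeof symbol_table[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(symbol_table[i].name, name) == 0)
            return symbol_table[i].addr;
    }
    return NULL;
}
```

A shell would look up the name typed by the user and cast the result back to the function's real type before calling it, for example ((int (*)(int))symbol_lookup("adc_read"))(3).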
Although initially I only implemented memory read/write and function call capability I later added an expression evaluator based on the algorithm (but not the code) described here. Then after that I added simple scripting capabilities in the form of if-else, while, and procedure call constructs (using a very simple non-C syntax). So if you wanted new functionality or test, you could either write a new function, or create a script (if performance was not an issue), so the functions were rather like 'built-ins' to the scripting language.
To perform the arbitrary function calls, I used a function pointer typedef that took an arbitrarily large number (24) of arguments. Using the symbol table, you find the function address, cast it to the function pointer type, and pass it the real arguments plus enough dummy arguments to make up the expected number, and thus create a suitable (if wasteful) stack frame.
On other systems I have implemented a Forth threaded interpreter, which is a very simple language to implement, but has a less than user friendly syntax perhaps. You could equally embed an existing solution such as Lua or Ch.
For a small lightweight thing you could use Forth. It's easy to get going (Forth kernels are SMALL).
Look at figForth, LINa and GnuForth.
Disclaimer: I don't Forth, but OpenBoot and the PCI bus do, and I've used them and they work really well.
Alternative UIs
Deploy a web server on your embedded device instead. Even serial will work with SLIP, and the UI can be reasonably complex (or even serve up a JAR and get really, really complex).
If you really need a CLI, then you can point at a link and get a telnet.
One alternative is to use a very simple binary protocol to transfer the data you need, and then make a user interface on the PC, using e.g. Python or whatever is your favourite development tool.
The advantage is that it minimises the code in the embedded device, and shifts as much of it as possible to the PC side. That's good because:
It uses up less embedded code space—much of the code is on the PC instead.
In many cases it's easier to develop a given functionality on the PC, with the PC's greater tools and resources.
It gives you more interface options. You can use just a command line interface if you want. Or, you could go for a GUI, with graphs, data logging, whatever fancy stuff you might want.
It gives you flexibility. Embedded code is harder to upgrade than PC code. You can change and improve your PC-based tool whenever you want, without having to make any changes to the embedded device.
If you want to look at variables: if your PC tool is able to read the ELF file generated by the linker, then it can find out a variable's location from the symbol table. Even better, read the DWARF debug data and know the variable's type as well. Then all you need is a "read-memory" protocol message on the embedded device to get the data, and the PC does the decoding and displaying.
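As a sketch of how small the device side of such a "read-memory" message can be, consider the handler below. The message layout (one command byte, a 4-byte little-endian address, one length byte) is invented for this example, as are the names handle_request and device_memory. A byte array stands in for the device's address space so that the sketch also runs on a PC.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CMD_READ_MEMORY 0x01u
#define REQUEST_SIZE    6u     /* cmd + 4 address bytes + length byte */

/* Stand-in for the device's address space (assumption for the sketch). */
uint8_t device_memory[256];

/* Handles one request and writes the raw bytes read into resp.
 * Returns the number of response bytes, or 0 if the request is invalid. */
size_t handle_request(const uint8_t *req, size_t req_len,
                      uint8_t *resp, size_t resp_cap)
{
    if (req_len != REQUEST_SIZE || req[0] != CMD_READ_MEMORY)
        return 0;

    /* Decode the little-endian address and the byte count. */
    uint32_t addr = (uint32_t)req[1]
                  | (uint32_t)req[2] << 8
                  | (uint32_t)req[3] << 16
                  | (uint32_t)req[4] << 24;
    size_t len = req[5];

    /* Reject reads that do not fit the buffer or the address space. */
    if (len > resp_cap || addr > sizeof device_memory ||
        len > sizeof device_memory - addr)
        return 0;

    memcpy(resp, &device_memory[addr], len);
    return len;
}
```

Everything else (looking up the variable's address and type, formatting the value) stays on the PC side.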