Is there a way to see the original code of an executable sh script? (I am very new to Linux and trying to understand what things do and such.)
If you know how, I need a very clear step-by-step process so I can just type in the commands and run them.
Thanks for your help. Trying to learn (Windows man for 25 years here)
A shell script specifically can be viewed in its original text form by simply printing the contents of the file:
cat disk-space.sh.x
Several caveats:
If you mean an executable rather than a script, the situation is different. Scripts are read by an interpreter at runtime, which executes them line by line. Executables may be either scripts or ELF binaries. The latter have been transformed from the original source code into a machine-readable form, which is much harder for humans to read. The file check shown after this list can tell you which kind you have.
The extension of the file (.sh.x or .x) does not control whether the file contents are executed as a binary or script.
If the file really is a script it may have been obfuscated, meaning that the source code on your system has deliberately been changed to make the resulting file hard to read.
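If you are not sure which kind of file you have, the file utility will usually tell you. The exact wording of the output varies by system; it looks roughly like this:
$ file disk-space.sh.x
disk-space.sh.x: POSIX shell script, ASCII text executable
$ file /bin/ls
/bin/ls: ELF 64-bit LSB executable, x86-64, dynamically linked ...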
I noticed that the 64-Bit Command Line Anaconda Installer for macOS is a large 400+ MB Bash/Bourne shell script.
When I tried to read it, I noticed that its first 555 lines are readable text, but the rest of the script is in binary format, probably encrypted.
See https://www.anaconda.com/products/individual and https://repo.anaconda.com/archive/Anaconda3-2021.05-MacOSX-x86_64.sh.
I noticed similar scripts, such as Tcl scripts associated with electronic design automation software.
How do we transform source code files, such as scripts (shell scripts, or Tcl/Perl/Python/Ruby scripts, or C++/Java/Scala/Haskell/Lisp source code), into partially readable text and binary otherwise?
Can we just merge two parts, one in ASCII/text format, and the other in binary format?
That said, how do we obtain the binary executable for scripts, such as shell scripts or Tcl/Perl/Python/Ruby scripts?
I know how to obtain binary executables for C, C++, and FORTRAN.
Other than using a platform-specific (in terms of operating system and hardware configuration, such as processor type or instruction set architecture) compiler to compile scripts into binary executables, and concatenating the text files with the binary files, how else can I do it?
Are there software applications that do this? What techniques, in terms of algorithms, do these software applications use?
Thank you so much, and have a good day!
To answer one of your questions, here is a helpful guide to embedding a binary file into a shell/bash script:
https://www.xmodulo.com/embed-binary-file-bash-script.html
(code example below is taken from this link)
The body of the shell script needs to consist of commands that isolate and execute the binary data contained within it.
The trick is to place an "exit" command at the end of the written script followed by a unique delimiter line (which is "__PAYLOAD_BEGINS__" in the below example):
#!/bin/bash
# line number where payload starts
PAYLOAD_LINE=$(awk '/^__PAYLOAD_BEGINS__/ { print NR + 1; exit 0; }' $0)
# directory where a binary executable is to be saved
WORK_DIR=/tmp
# name of an embedded binary executable
EXE_NAME=dummy_executable
# extract the embedded binary executable
tail -n +${PAYLOAD_LINE} $0 | base64 -d | cat > ${WORK_DIR}/${EXE_NAME}
chmod +x ${WORK_DIR}/${EXE_NAME}
# run the executable as needed
${WORK_DIR}/${EXE_NAME}
exit 0
__PAYLOAD_BEGINS__
Then you can append the base64-encoded binary data to the script:
$ base64 dummy_executable >> script.sh
You could also append the binary data without base64-encoding it, but this is not recommended: with raw binary bytes in the file, you will not be able to safely edit the script in a text editor afterwards.
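Putting it together, a hypothetical end-to-end session might look like this (all file names here are illustrative; extractor.sh stands for the script body shown above):
$ cc -o dummy_executable hello.c         # build some binary to embed
$ cp extractor.sh script.sh              # script body ending in __PAYLOAD_BEGINS__
$ base64 dummy_executable >> script.sh   # append the encoded payload
$ chmod +x script.sh
$ ./script.sh                            # extracts the binary to /tmp and runs it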
Shell Scripts with Payload
In Anaconda3....sh there is nothing encrypted. There are multiple binary files appended to the end of the script. Creating such a file yourself is trivial. Open a terminal and run
cat script.sh file1 file2 ... > script-with-payload.sh
The only tricky part is to write a script.sh that can handle the payload.
For starters, write exit at the end of your script.sh, so that the shell does not try to interpret the binary part as shell commands when executing script-with-payload.sh.
Then, somewhere inside script.sh use something like tail, sed, or dd to extract the binary data at the end of the script.
For a concrete example see Combine a shell script and a zip file into a single executable for deployment or Self-extracting script in sh shell or How do Linux binary installers (.bin, .sh) work?.
In Anaconda3....sh they use dd commands to extract a Mach-O 64-bit x86_64 executable (14'807'207 bytes) and a tar.bz2 file (438'910'836 bytes). Comments in the script point out that the shell script was generated by shar.py.
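A very rough sketch of that dd approach (the offset and file names are made up; a real generator such as shar.py computes the offsets when the installer is built, and typically uses a larger block size than bs=1 for speed):
#!/bin/sh
# extract a raw, not-base64-encoded payload that starts at a known byte offset
PAYLOAD_OFFSET=12345    # illustrative value, computed at build time
dd if="$0" bs=1 skip="$PAYLOAD_OFFSET" of=/tmp/payload.tar.bz2 2>/dev/null
tar xjf /tmp/payload.tar.bz2 -C /tmp
exit 0
# raw binary data would follow here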
Remaining Questions
How do we transform source code files, such as [...] C++/Java/Scala/Haskell/Lisp [...] into partially readable text and binary otherwise?
C++, Java, and so on have to be compiled to be run, so distributing the uncompiled text file with an embedded payload doesn't really make sense.
how do we obtain the binary executable for scripts, such as shell scripts or Tcl/Perl/Python/Ruby scripts?
This is an entirely different question and has to be answered for each scripting language independently. The general answer is, you don't. Scripting languages are not meant to be compiled.
Are there software applications that do this?
Yes, by searching for bash payload or bash selfextracting you can find quite a few tools. However, most of them seem rather hacked together. The most official ones are GNU sharutils and makeself.
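For instance, makeself bundles a directory plus a startup script into a single self-extracting archive, and sharutils provides shar. Roughly (the binary may be installed as makeself or makeself.sh, and every path and label below is illustrative):
$ makeself.sh ./myapp myapp.run "My App installer" ./install.sh
$ shar file1 file2 > archive.shar   # unpack later with: sh archive.shar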
What techniques, in terms of algorithms, do these software applications use?
The principle is always the same: concatenate a script and some payload, then let the script extract the payload from itself. There is no "algorithm" involved.
It is possible to supply batch commands directly with the -b flag, but if the commands become very long, this is no longer an option. Is there a way to give the path to an .scm script that was written to a file, without having to move the file into the scripts directory?
Not as far as I know. What you give in the -b flag is a Scheme statement, which implies your function has already been loaded by the script executor process. You can of course add more directories that are searched for scripts using Edit>Preferences>Folders>Scripts.
If you write your script in Python, the problem is a bit different, since you can alter the Python path before loading the script code, but the command line remains a bit long.
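One hedged workaround (untested; it assumes the GIMP batch interpreter implied by the Scheme -b flag and the Preferences path above, and that the file's contents evaluate as batch Scheme) is to splice the file into the -b argument with command substitution:
$ gimp -i -b "$(cat /path/to/my-batch.scm)" -b '(gimp-quit 0)'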
I ask because I recently made a change to a KornShell (ksh) script that was executing. A short while after I saved my changes, the executing process failed. Judging from the error message, it looked as though the running process had seen some -- but not all -- of my changes. This strongly suggests that when a shell script is invoked, the entire script is not read into memory.
If this conclusion is correct, it suggests that one should avoid making changes to scripts that are running.
$ uname -a
SunOS blahblah 5.9 Generic_122300-61 sun4u sparc SUNW,Sun-Fire-15000
No. Shell scripts are read either line by line, or command by command where commands are separated by ;, with the exception of constructs such as if ... fi blocks, which are read in as a single chunk before being executed:
A shell script is a text file containing shell commands. When such a file is used as the first non-option argument when invoking Bash, and neither the -c nor -s option is supplied (see Invoking Bash), Bash reads and executes commands from the file, then exits. This mode of operation creates a non-interactive shell.
You can demonstrate that the shell waits for the fi of an if block to execute commands by typing them manually on the command line.
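For example (the '>' continuation prompts are the shell waiting for the rest of the block; nothing runs until fi is entered):
$ if true
> then
>     echo "this only runs once the whole block has been read"
> fi
this only runs once the whole block has been read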
http://www.gnu.org/software/bash/manual/bashref.html#Executing-Commands
http://www.gnu.org/software/bash/manual/bashref.html#Shell-Scripts
It's funny that most OSes I know do NOT read the entire content of a script into memory, but run it from disk. Reading it in fully first would allow you to safely make changes to the script while it is running. I don't understand why it is done this way, given that:
scripts are usually very small (and don't take much memory anyway)
at some point, as shown in this thread, people will make changes to a script that is already running anyway
But, acknowledging this, here's something to think about: if you decide that a script is not running OK (because you are writing/changing/debugging it), do you even care about the rest of that run? You can go ahead and make the changes, save them, and ignore all output and actions done by the current run.
But... sometimes, depending on the script in question, a subsequent run of the same script (modified or not) can become a problem, since the current/previous run is behaving abnormally. It will typically skip some steps, or suddenly jump to parts of the script it shouldn't. And THAT may be a problem. It may leave "things" in a bad state, particularly if file manipulation or creation is involved.
So, as a general rule: whether or not the OS supports the feature, it's best to let the current run finish, and THEN save the updated script. You can already make the changes, just don't save them yet.
It's not like the old days of DOS, where you had only one screen in front of you (one DOS screen), so you can't say you need to wait for the run to complete before you can even open the file again.
No, they are not, and there are good reasons for that.
One thing you should keep in mind is that a shell is not an interpreter, even if there are some similarities. Shells are designed to work with a stream of commands, whether from a TTY, a pipe, a FIFO, or even a socket.
The shell reads from its input source line by line until EOF is returned by the kernel.
Most shells have no special support for interpreting files; they work with a file as they would work with a terminal.
In fact, this is considered a nice feature, because you can do interesting stuff like this: How do Linux binary installers (.bin, .sh) work?
You can take a binary file and prepend a shell script to it. You can't do this with an interpreter, because it parses the whole file, or at least it would try to and fail. A shell just interprets the file line by line and doesn't care about the garbage at the end. You only have to make sure the execution of the script terminates before it reaches the binary part.
I have a Ruby script that I wrote that sorts some files in a jumble of directories based on their file extension. It would be very difficult to sort them using a GUI, and it's easier for me to just put the file in the topmost directory and let the sorter do the work.
Problem is, I'm a bit of a noob at Unix scripting. What I want is to be able to run that sorter script from anywhere on my computer, without having to
cd Desktop/Whatever/Foo
ruby sorterscript.rb
and could instead just write sortfolders at the command line to have the program run.
I've tested the script many times, and it works fine, I just want a bit more convenience.
Bonus: If possible, and not too difficult, it would be even better if I could have the program run, say, every hour automatically.
As far as your first question goes, you need to do a couple of things (a terminal sketch putting them together follows the list):
Add a shebang line to your script (make it the first line of the script):
#!/usr/bin/ruby (or whatever the path to the Ruby interpreter's executable is, I forget its exact location)
Make the script executable, either via the Finder's "Get Info" context menu, or via the command line, for example:
chmod 755 my_script.rb
Add the directory containing your script to the PATH environment variable via OS X's launchd.conf file, as described here. You need to add this line:
setenv PATH /path/to/my/script:$PATH (substitute the real path to your script)
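Taken together, the first steps might look like this in a terminal (the paths and the sortfolders name are placeholders for wherever your script actually lives):
$ head -1 ~/scripts/sortfolders   # confirm the shebang is the first line
#!/usr/bin/ruby
$ chmod 755 ~/scripts/sortfolders
$ sortfolders                     # found via PATH once the launchd.conf change takes effect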
As far as your bonus question goes, you can use cron to set up a recurring job. I never really do this, but here's Apple's cron man page to get you started.
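A rough sketch of the cron half: run crontab -e and add a line like the one below (the path is a placeholder):
# run the sorter at the top of every hour, logging output for troubleshooting
0 * * * * /Users/me/scripts/sortfolders >> /tmp/sortfolders.log 2>&1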
I have a bunch of scripts (which can't be modified) written on Windows. Windows allows relative paths in its #! commands. We are trying to run these scripts on Unix, but Bash only seems to respect absolute paths in its #! directives. I've looked around but haven't been able to locate an option in Bash or a program designed to replace an interpreter name. Is it possible to override that functionality -- perhaps even by using a different shell?
Typically you can just specify the interpreter binary when executing the script, which causes the #! line to be ignored. So, if you have a Python script that looks like:
#!..\bin\python2.6
# code would be here.
On Unix/Linux you can just say:
prompt$ python2.6 <scriptfile>
And it'll execute using the binary named on the command line. I view the hashbang line as a request to the operating system to use the binary specified on that line, but you can override it by not executing the script as a normal executable, i.e., by invoking the interpreter directly yourself.
Worst case you could write some wrapper scripts that would explicitly tell the interpreter to execute the code in the script file for all the platforms that you'd be using.
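A minimal wrapper sketch (the interpreter and path are assumptions carried over from the example above):
#!/bin/sh
# ignore the Windows-style #! inside the target script and invoke the
# interpreter found on this system's PATH explicitly, forwarding arguments
exec python2.6 /opt/tools/scriptfile "$@"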