Many shell commands architecture - bash

At work we use Docker and docker-compose. Our developers need to start many containers locally and import a large database; many services have to run together for development to work smoothly.
To keep this maintainable we define reusable commands as make targets. Is there another way to define and reuse many shell commands that works better than make?
Due to network limitations, running Docker locally is our only option.
We managed to solve this challenge and make our developers' lives easier by abstracting the complex shell commands away behind make targets, and to organize the numerous targets that control our Docker infrastructure and containers we split them across several files with a .mk extension.
There are about 40 make targets in total; some are low level, others are meant to be called by developers to perform certain tasks, for example:
make launch_app
make import_db_script
make build_docker_images
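To give a simplified (made-up) picture of the pattern, one of our .mk files looks roughly like this; the real targets and service names are different:
app.mk
build_docker_images:
	docker-compose build
launch_app:
	$(MAKE) build_docker_images
	docker-compose up -d app worker db
.PHONY: build_docker_images launch_app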
But lately things are starting to become a little slow. With make targets calling other make targets internally, each nested make call takes a significant amount of time, since every lower-level invocation has to re-read all the defined .mk files and redo its computations (as make -d shows), and that adds up to a considerable overhead.
Is there any way to manage a set of complex shell commands using anything other than make, while still keeping it easy for our developers to call?
Thanks in advance.

Well, you could always just write your shell commands in a shell script instead of a makefile. With shell functions, shell variables, and so on, it can be managed. You don't give examples of how complex your use of make constructs is.
StackOverflow is not really a place to ask open-ended questions like "what's the best XYZ". So instead I'll treat this question as, "how can I speed up my makefiles".
To me it sounds like you just have poorly written makefiles. Again, you don't show any examples, but it sounds like your rules invoke lots of sub-makes (i.e., your recipes run $(MAKE) ...). That means lots of processes started and lots of makefiles parsed. Why not have a single instance of make and use prerequisites, instead of sub-makes, to run other targets? You can still split the makefiles up into separate files and then use include ... to gather them all into a single instance of make.
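A minimal sketch of that structure, assuming the fragments are called docker.mk and db.mk (placeholder names):
Makefile
include docker.mk
include db.mk
docker.mk
build_docker_images:
	docker-compose build
# A prerequisite instead of running "$(MAKE) build_docker_images" in the recipe,
# so everything is handled by the one make instance:
launch_app: build_docker_images
	docker-compose up -d
.PHONY: build_docker_images launch_app
With this layout each make invocation parses every .mk file exactly once, instead of re-parsing everything for each nested $(MAKE) call.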
Also, if you don't need to rebuild the makefiles themselves you should be sure to disable the built-in rules that might try to do that. In fact, if you are just using make to run docker stuff you can disable all the built-in rules and speed things up a good bit. Just add this to your makefile:
MAKEFLAGS += -r
(see Options Summary for details of this option).
ETA (edited to add):
You don't say what version of GNU make you're using, or what operating system you're running on. You don't show any examples of the recipes you're using so we can see how they are structured.
The problem is that your issue, "things are slow", is not actionable, or even defined. As an example, the software I work on every day has 41 makefiles containing 22,500 lines (generated by CMake, which means they are not as efficient as they could be: they are generic makefiles that don't use GNU make-specific features). The time it takes for my build to run when there is nothing to actually do (so basically the entire time is spent parsing the makefiles) is 0.35 seconds.
In your comments you suggest you have 10 makefiles and 50 variables... I can't imagine how any detectable slowness could be caused by this size of makefile. I'm not surprised, given this information, that -r didn't make much difference.
So, there must be something about your particular makefiles which is causing the slowness: the slowness is not inherent in make. Obviously we cannot just guess what that might be. You will have to investigate this.
Use time make launch_app. How long does that take?
Now use time make -n launch_app. This will read all makefiles but not actually run any commands. How long does that take?
If make -n takes no discernible time then the issue is not with make, but rather with the recipes you've written and switching to a different tool to run those same recipes won't help.
If make -n takes a noticeable amount of time then something in your makefiles is slow. You should examine it for uses of $(shell ...) and possibly $(wildcard ...); those are where the slowness will happen. You can add $(info ...) statements around them to get output before and after they run: maybe they're running lots of times unexpectedly.
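For example, if one of your .mk files contains a line like the hypothetical one below, you can bracket it with $(info ...) to see when, and how many times, it is actually evaluated:
$(info evaluating DB_CONTAINER ...)
DB_CONTAINER := $(shell docker ps -q -f name=db)
$(info ... done evaluating DB_CONTAINER)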
Without specific examples of things that are slow, there's nothing else we can do to help.

Related

Is it true that "[s]hell scripts put the source code right out in the open for all the world to see"? How come? [duplicate]

TLDP's Advanced Bash Scripting Guide states that shell scripts shouldn't be used for "situations where security is important, where you need to guarantee the integrity of your system and protect against intrusion, cracking, and vandalism."
What makes shell scripts unsuitable for such a use case?
Because of the malleability of the shell, it is difficult to verify that a shell script performs its intended function and only that function in the face of adversarial input. The way the shell behaves depends on the environment, plus the settings of its own numerous configuration variables. Each command line is subject to multiple levels of expansion, evaluation and interpolation. Some shell constructs run in subprocesses while the variables the construct contains are expanded in the parent process. All of this is counter to the KISS principle when designing systems that might be attacked.
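A classic example of the subprocess point: a while loop fed by a pipe runs in a subshell (in bash, at least), so variable changes made inside it are silently lost in the parent:
count=0
printf 'a\nb\nc\n' | while read -r line; do count=$((count+1)); done
echo "$count"   # prints 0, not 3: the loop ran in a subshell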
Probably because it's just easy to screw up. When the PATH is not set correctly, your script will start executing the wrong commands. Putting a space somewhere in a string might cause it to become two strings later on. These can lead to exploitable security holes. In short: shells give you some guarantees as to how your script will behave, but they're too weak or too complex for truly secure programming.
(To this I would like to add that secure programming is an art in itself, and screwing up is possible in any language.)
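Two tiny illustrations of those points (the paths and values are made up):
# Word splitting: the unquoted expansion becomes two arguments
target="/tmp/app data/cache"   # note the space in the value
rm -rf $target                 # removes /tmp/app AND a relative data/cache
rm -rf "$target"               # removes only the single intended directory
# PATH: whoever controls PATH controls which "ls" this runs
PATH=/tmp:$PATH
ls                             # may now execute /tmp/ls instead of /bin/ls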
I would disagree with that statement, as there is nothing about scripts that makes them inherently unsafe. Bash scripts are perfectly safe if some simple guidelines are followed:
Does the script contain information that others shouldn't be able to view? If so, make sure it's only readable by the owner.
Does the script depend on input data from somewhere? If so, ensure that the input data cannot be tainted in any way, or that tainted data can be detected and discarded.
Does it matter if others were to try to run the script? If so, as with the first point, ensure that nobody else can execute it, and preferably that nobody else can read it. chmod 0700 is generally a good idea for scripts that perform system functions (a concrete sketch follows below).
And the cases where you'd want a script to be setuid (via its interpreter) are extremely rare.
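A couple of the guidelines above in concrete form (the file name and the allowed character set are only examples):
chmod 0700 /usr/local/sbin/maintenance.sh   # owner-only read/write/execute
# Discard input that isn't strictly what we expect (here: a lowercase username):
case "$1" in
    *[!a-z0-9_]*|"") echo "invalid argument" >&2; exit 1 ;;
esac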
The two points that separate a script from a compiled program would be that the source is visible, and that an interpreter executes it. As long as the interpreter hasn't been compromised (such as having a setuid bit on it), you'd be fine.
When writing scripts to do system tasks, typos, screw-ups, and general human error do to some extent represent a potential security failure, but that would also be the case with compiled programs (and a lot of people tend to ignore the fact that compiled programs can be disassembled too).
It is worth noting that in most (if not all) Linux flavors, most services are started via a shell script (in fact, I can't think of any that aren't).
it's easier for bad guys to make a shell script behave differently (it interacts a lot with other processes, PATH, shell functions, profile files)
it's harder for good guys to deal with sensitive data (passing passwords, etc.)

partial parallel and serial compilation with make

I am compiling LLVM with make. When I do a parallel compile I do not have enough RAM during the linking steps. Is it possible to do a parallel compilation for all the object files and a serial compilation during the linking step? For now I stop the compilation when my machine starts swapping and just restart the build process with make -j1; it would be neat if this could be done without human interaction.
I am not aware of any make implementation that dynamically adapts the degree of parallelism according to resource consumption. Indeed, although there is a potential to do that to some degree, the problem is not really solvable by make, because processes' resource consumption is not static. That is, a make could conceivably observe, say, that physical RAM was overcommitted, and react by holding off on starting new child processes, but it cannot easily protect against starting several child processes while resource consumption is low, which then balloon to demand more resources than are available.
However, depending on the makefile, there may be a workaround: you can name specific targets for make to build. You can use that to specify what targets you want to build in parallel, and then run a separate make to complete the build serially. That might look something like this:
make -j4 object1.o object2.o object3.o object4.o
make
That's rather unwieldy though. Supposing that there exists a target (or that you can create one) that represents all the object files but not any of the linked libraries / executables, then you can use that:
Makefile
OBJECTS = object1.o object2.o object3.o object4.o
all: my_program
my_program: $(OBJECTS)
	# ... linking recipe goes here (recipe lines must start with a tab) ...
# This target collects all the objects:
objects: $(OBJECTS)
.PHONY: all objects
Command line
make -j4 objects
make
If your memory consumption correlates somehow with the system load, you can limit make to take the load into account with the -l option. Relevant documentation:
When the system is heavily loaded, you will probably want to run fewer jobs than when it is lightly loaded. You can use the ‘-l’ option to tell make to limit the number of jobs to run at once, based on the load average. The ‘-l’ or ‘--max-load’ option is followed by a floating-point number.
This might or might not help in your case, but it can be worth checking.
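For example (the numbers are arbitrary and should be tuned to your machine):
make -j8 -l 4   # up to 8 jobs, but start no new job while the load average is above 4
Keep in mind that the load average is only an indirect proxy when the real constraint is RAM, since it lags behind memory pressure.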

calling perl script with system VS implementing package

Let me start with giving an example of what I'm dealing with first:
I often call existing Perl scripts written by previous engineers to process some data, and then proceed further with my own script. I use either system or backticks to call other people's scripts from within my script.
Now I'm wondering: if I rewrite those scripts as packages and use require or use to include them in my script, will it increase the processing speed? How big a difference would it be?
Benefits:
It would save the time taken to load the shell, load perl, and compile the script and the modules it uses. That's a couple of seconds minimum, but it could be much larger. (See the timing sketch below.)
If you had to serialize data to pass to the child, you also save the time taken to serialize and deserialize the data.
It would allow more flexible interfaces.
It would make error handling easier and more flexible.
Downsides:
Since everything is now in the same process, the child can have a much larger effect on the parent. e.g. A crash in the child will take down the parent.
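A rough way to see the per-invocation cost mentioned under the first benefit is to time a do-nothing perl run from the shell (Moose is just an example of a heavy module, if you have it installed):
time perl -e '1'           # fork/exec plus bare perl start-up
time perl -MMoose -e '1'   # start-up plus compiling a large module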

Using gmake to build large system

I'm working on trying to fix/redo the makefile(s) for a legacy system, and I keep banging my head against some things. This is a huge software system, consisting of numerous executables, shared objects, and libraries, using C, Fortran, and a couple of other compilers (such as Motif's UIL compiler). I'm trying to avoid recursive make and I would prefer "one makefile to build them all" rather than the existing "one makefile per executable/.so/.a" idea. (We call each executable, shared object, library, et al a "task," so permit me to use that term as I go forward.)
The system is scattered across tons of different source subdirectories, with includes usually in one of 6 main directories, though not always.
What would be the best way to deal with the plethora of source & target directories? I would prefer including a Task.mk file for each task that would provide the task-specific details, but target-specific variables don't give me enough control. For example, they don't allow me to change the source & target directories easily, at least from what I've tried.
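For what it's worth, the kind of per-task include I have in mind looks roughly like this; the task names, directories, and variables are purely illustrative:
Makefile
TASKS := taskA taskB
include $(addsuffix /Task.mk,$(TASKS))
taskA/Task.mk
# Namespace everything with the task name instead of relying on
# target-specific variables:
taskA_SRCDIR := src/taskA
taskA_OBJDIR := obj/taskA
taskA_SRCS   := $(wildcard $(taskA_SRCDIR)/*.c)
taskA_OBJS   := $(patsubst $(taskA_SRCDIR)/%.c,$(taskA_OBJDIR)/%.o,$(taskA_SRCS))
taskA: $(taskA_OBJS)
	$(CC) -o $@ $^
$(taskA_OBJDIR)/%.o: $(taskA_SRCDIR)/%.c
	$(CC) $(CFLAGS) -c -o $@ $<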
Some things I cannot do (i.e., am not allowed to do) include:
Change the structure of the project. There's too much that would need to be changed everywhere to pull that off.
Use a different make. The official configuration (which our customer has guaranteed, so we don't need to deal with unknown configurations) uses gmake 3.81, period.

calling shell commands from code by design?

The Unix philosophy teaches that we should develop small programs that do one thing well. It also teaches that we should separate policy from mechanism. I guess one way to take this is to design a text-based shell command first and build a GUI on top of it later (if desired).
I truly like the idea that small programs can be composed (piped together) into more complex systems. I also like the fact that simple, focused designs should theoretically need less maintenance than a monolithic system that binds all its rules together.
How sound would it be to program something (in Ruby or Python for example) that relegates some of its functionality to shell commands called straight from the code? Taking this a step further, does it make sense to deliberately design a shell command that is intended to be called directly from code (compiled or scripted)? Obviously, this would only make sense if the shell command had some worthy console use.
I can't say from my experience that this is a practice I've seen much of. More often than not, task-specific code relies on task-specific libraries. Of course, it's possible that, unbeknownst to me, I have made use of libraries which are actually just wrappers around shell commands. (Or rather, where the shell command is a wrapper around some library.)
The Unix paradigm is modularity. You should write your program as a bunch of modules, which can then be extracted into multiple programs if you want to. However, executing a new program whenever you'd like to make a function call is slow and impractical.
