Modern readable alternative to Cron Format - data-structures

Is there any modern (JSON, XML, etc.) readable alternative to the cron schedule format? I understand that, for instance, 0 0 12 1/1 * ? * is very powerful and can cover a lot of special cases, but it cannot be read by someone who has not worked with cron or a similar tool before.
I would like to use something like:
<schedule>
  <weekly dayOfWeek="Monday"/>
  <monthly day="1"/>
</schedule>
But I do not want to reinvent the wheel, so is there any standard or quasi-standard out there?
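To illustrate, here is a minimal sketch of the JSON equivalent I have in mind, with a few lines of Python to read it; the field names (type, dayOfWeek, day) are my own invention, not taken from any standard:

import json

# Hypothetical human-readable schedule; the field names are invented.
doc = '''
[
    {"type": "weekly",  "dayOfWeek": "Monday"},
    {"type": "monthly", "day": 1}
]
'''

for rule in json.loads(doc):
    if rule["type"] == "weekly":
        print("every week on", rule["dayOfWeek"])
    elif rule["type"] == "monthly":
        print("every month on day", rule["day"])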

Related

Choosing a better parallel architecture in Python

I am working on a data-wrangling problem using Python,
which processes a dirty Excel file into a clean Excel file.
I would like to process multiple input files by introducing concurrency/parallelism.
I have the following options: 1) using the threading module 2) using the multiprocessing module 3) the ParallelPython module.
I have a basic idea of the three methods; I would like to know which method is best and why.
In brief: processing a SINGLE dirty Excel file today takes 3 minutes.
Objective: to introduce parallelism/concurrency to process multiple files at once.
I am looking for the best method of parallelism to achieve the objective.
Since your process is mostly CPU-bound, multithreading won't be faster because of the GIL...
I would recommend multiprocessing or concurrent.futures, since they are a bit simpler than ParallelPython (only a bit :) )
example:
import concurrent.futures

with concurrent.futures.ProcessPoolExecutor() as executor:
    # files is the list of input paths; data_wrangler(path) cleans one file
    for file_path, clean_file in zip(files, executor.map(data_wrangler, files)):
        print('%s is now clean!' % file_path)
        # do something with clean_file if you want
Only if you need to distribute the load between servers would I recommend ParallelPython.
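If you do go that route, a minimal sketch of what it might look like with ParallelPython's pp module (the server address is a placeholder, and data_wrangler and files are the same names as above):

import pp

# ppservers lists remote pp nodes; the address here is a placeholder
job_server = pp.Server(ppservers=('192.168.0.10:60000',))
jobs = [job_server.submit(data_wrangler, (f,)) for f in files]
results = [job() for job in jobs]  # calling a job waits for its result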

Bash piping output and input of a program

I'm running a Minecraft server on my Linux box in a detached screen session. I'm not very fond of screen and would like to be able to constantly pipe the output of the server to a file (like a pipe) and pipe some input from a file to the server (so that I can send input to and read output from the server with remote programs, like a Python script). I'm not very experienced in bash, so could somebody tell me how to do this?
Thanks, NikitaUtiu.
It's not clear that you need screen at all. I don't know the Minecraft server, but generally for server software you can run it from a crontab entry and redirect output to log files.
Assuming your server kills itself at midnight Sunday night (we can discuss changing this if restarting once per week is too little or too much, or if you require ad-hoc restarts), here is a crontab entry that starts the server each Monday at 1 minute after midnight, to give you a basic idea of what to do.
01 00 * * 1 dtTm=`/bin/date +\%Y\%m\%d.\%H\%M\%S`; export dtTm; { /usr/bin/mineserver -o ..... your_options_to_run_mineserver_here ... ; } > /tmp/mineserver_trace_log.${dtTm} 2>&1
Consult your man page for crontab to confirm that day-of-week ranges are 0-6 (0 = Sunday), and change the day-of-week value if 0 is not Sunday on your system.
Normally I would break the code up so it is easier to read, but crontab entries have to be all on one line (with some weird exceptions), AND there is usually a limit of 1024 bytes to 8 KB on how long the line can be. Note that the ';' just before the closing '}' is critical: if it is left out, you'll get undecipherable error messages, or no error messages at all.
Basically, you're redirecting any output into a file (including stderr output). Now you can do a lot of stuff with the output: use more or less to look at the file, grep ERR ${logFile}, write scripts that grep for error messages and then send you emails when errors have been found, etc.
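As a sketch of that last idea, here is a minimal Python scan over the dated logs produced by the crontab entry above, using the same 'ERR' pattern; printing stands in for sending the email:

import glob

# Scan the dated mineserver logs for error lines.
for log_path in sorted(glob.glob('/tmp/mineserver_trace_log.*')):
    with open(log_path, errors='replace') as log:
        for line in log:
            if 'ERR' in line:
                print('%s: %s' % (log_path, line.rstrip()))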
You may have some sysadmin work on your hands to set things up so the mineserver user can run crontab entries. Also, if you're not comfortable using the vi or emacs editors, creating a crontab file may require help from others. Post to superuser.com to get answers to Linux admin issues.
Finally, there are two points I'd like to make about dated logfiles.
Good:
a. If your app dies, you never have to rerun it just to capture output and figure out why something has stopped working. For long-running programs this can save you a lot of time.
b. Keeping dated files gives you the ability to prove to yourself, your boss, and others that it used to work just fine: see, here are the log files.
c. Keeping the log files, assuming there is useful information in them, gives you the opportunity to mine those files for facts, i.e. the program used to take 1 sec for processing and now it is taking 1 hr, etc.
Bad:
a. You'll need to set up a mechanism to sweep old log files; otherwise at some point everything will have stopped, AND when you finally figure out what the problem was, you discover that /tmp (or whatever dir you chose to use) is completely full.
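A minimal sketch of such a sweep, assuming the dated-log pattern from the crontab entry above and an arbitrary 14-day retention:

import glob
import os
import time

# Delete dated logs older than 14 days.
cutoff = time.time() - 14 * 24 * 3600
for log_path in glob.glob('/tmp/mineserver_trace_log.*'):
    if os.path.getmtime(log_path) < cutoff:
        os.remove(log_path)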
There is a self-maintaining solution to dating the logfiles that I can tell you about if you find this approach useful. It will take a little explaining, so I don't want to spend the time writing it up unless you find the crontab solution useful.
I hope this helps!

Incorporating bash scripts into an R package?

Background
I am writing an R package to support reproducible research. At this point, the workflow is mostly held together by bash scripts, and I can run an analysis by sending a single command like ./runscript.sh. I use bash for the following:
file manipulation: tar, rsync, rename
running bash files locally and via ssh
running R scripts using R --vanilla that in turn call R functions
find and replace text within files using sed
submitting jobs via qsub
It seems to me that it would be much more efficient (cleaner and easier) to execute the entire workflow from an R function or R script. I am partial to R since I am more familiar with it and mostly work within emacs ESS.
Questions
Would it be worthwhile to encapsulate all of these uses of bash within R using the system and file.* functions?
Are there other R packages that I have not yet found that would be helpful for doing this?
Notes
Following Al3xa's answer, I realize that it is important to note that the speed penalty of using, e.g., the R versions of tar and gsub rather than bash on 1000-2000 files would likely be less than the current rate-limiting steps in the workflow: computations by JAGS (~10-20 min) and FORTRAN (>4 hrs).
I'm a big fan of using R as your "integrated" environment vs. bash scripts. I'm in the process of moving all of my bash and ruby scripts to Rscript as I need to make changes to them.
There are only a couple of reasons not to move everything into R that come to mind; I'm referring mainly to using Rscript to accomplish this.
1) Speed, which from my testing has only a moderate impact in any situation I've come across, and would be trivial relative to the times you mentioned.
2) Portability, in that paths to Rscript, etc. may be different across systems. I've had no problems writing things on OS X and moving them to a Linux server, but they might break on Windows.
The advantages in my book are:
1) Much easier for me to write. I don't have to switch back and forth between the slight idiosyncrasies of things like conditional statements and for loops.
2) More forgiving. I can't describe how much time I've spent trying to get bash scripts to work because I accidentally had a space where I shouldn't have. R is much nicer in that regard (yes, of course, we should all follow R's conventions perfectly, but I'd rather a stray space not stall me for hours).
3) I do better work. For tarring a file it doesn't matter, but I find I do better text manipulation in R than in awk/sed, for example.
Re: packages that are helpful: this doesn't exist, to my knowledge, but I'd love a version of make that's based on R. make's syntax is one of the most inflexible out there (tabs vs. spaces? really?), so I'd love to write an R-based alternative. Some day, I will...
Well, there are R functions like tar, gsub, etc. Anyway, I guess you're willing to create a cross-platform solution. You should prefer bash for the sake of speed, and use R only for R-specific functions. I don't find it useful to wrap all system-based commands within system and/or file.*; it would be much slower. If you're using Linux, I suggest littler over the Rscript interface.

Schedule tasks Linux vs Windows

I'm trying to convert the time parameter of crontab (Linux), which is formatted like this (* * * * *), to the corresponding time parameter of schtasks (Windows), which is rather more complex than that of crontab.
I'm in the middle of writing an application to do this, and it's getting more and more complex, since crontab allows so many possibilities/permutations for the time parameter.
I was wondering whether writing a conversion program is the right approach, or whether there is a better approach for achieving this?
Also, since parameters in crontab can be provided in many possible ways, e.g. (0 0 1,10,15 * *) would schedule the task for midnight on the 1st, 10th, and 15th of the month, is it possible at all for schtasks to take parameters that can do the same kind of scheduling?
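Here is a minimal sketch of the kind of field expansion my converter has to do (lists, ranges, steps, and *); it is an illustration, not a complete cron parser:

def expand_field(field, lo, hi):
    # Expand one cron field such as '1,10,15', '1-5', '*/15', or '*'
    # into a sorted list of concrete values in [lo, hi].
    values = set()
    for part in field.split(','):
        part, _, step = part.partition('/')
        step = int(step) if step else 1
        if part == '*':
            start, end = lo, hi
        elif '-' in part:
            start, end = (int(x) for x in part.split('-'))
        else:
            start = end = int(part)
        values.update(range(start, end + 1, step))
    return sorted(values)

print(expand_field('1,10,15', 1, 31))  # days of month: [1, 10, 15]
print(expand_field('*/15', 0, 59))     # minutes: [0, 15, 30, 45]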
Have you checked these cron ports to Windows?
CRONw
z-cron

Real life SHELL SCRIPTS usage?

I'm learning UNIX/Linux shell scripting and trying to think about its appropriate usage.
The only thing that comes to mind is that it would be nice for, let's say, backup operations and log management... but I'm sure it goes way beyond that... or does it?
I'm sure there are people on this site who use shell scripting on a daily basis.
Can you tell me what you use it for in your organization/business?
Thanks :)
Why use shell scripts
Basically, there are any number of tasks related to backup, maintenance, etc. that need to be automated, and shell scripts do that.
You can do almost everything in shell, but it is easy to write ugly and slow scripts.
The shell's first domain of expertise is starting and combining other programs. It is exceptionally well suited for:
file manipulation: list, move, copy, compress, archive
text-line manipulation: filter (grep), modify (sed), delete lines (sed), combine files (paste), sort (sort), deduplicate (sort -u)
All those operations are NOT shell operations, but the shell is the glue that puts them all together.
File operations are generally combined with flow-control instructions (while, if, for);
line operations are combined with pipes (|) and named pipes (mkfifo).
These are things you can do in less than 20 lines with shell commands.
I personally use it to batch miscellaneous daily/weekly commands and to start up long-running processes. Shell scripts can be unwieldy and hard to debug when they get big, and unknown variables evaluate to empty strings (icky).
Scripting languages such as Python, Perl, and Ruby become more attractive as the code becomes more complex.
I work on an actively developed software project that runs in a Unix environment. Unfortunately it uses a lot of different environment variables for configuration and stashes binary programs, data files, and shared libraries on version-dependent paths.
All that is a pain to set up.
But it gets worse: at any given time I might want to work with the stable version, the pretty-stable-but-more-up-to-date version, the bleeding-edge-every-new-feature version, or my personally hacked development version.
Switching between them is an even bigger hassle.
Enter a shell script which ensures that I am set up for exactly one version at a time. Ta-da!
BTW: the script I use for this makes extensive use of the accepted answer to "How do I manipulate $PATH elements in shell scripts?", so you know Stack Overflow works for me in the real world. Moreover, I've infected several other people with this technology.
I've seen and worked on full-blown applications (medical records and scheduling processing) written in Korn shell.
Batch programming, PostScript print filters, automatic mailers and automated airline checkin systems, regular stock price tracking, software installers, et al, et al.
A better question: what could NOT be programmed in shell?
For our company, we use shell scripts for the following:
backups - it would be very disastrous for us if we lost our data. Various parts of our backup, like database backup, offsite backup, continuous backups, etc., all use shell scripts that run daily; some run once a week.
updating dates - we do not use NTP, so we rely on sh scripts to update the date, due to firewall restrictions.
log cleanup
sending emails
I didn't think bash programming was particularly powerful until I saw that the OS startup scripts are all written in it. That made me re-examine my assumptions. I now have several dozen important shell scripts that I've written over the years that automate some common tasks.
For example, I wrote one that polls the current load average and then executes a provided command if it exceeds a certain value (useful for examining events that only happen once or twice a day); a sketch of this one appears below.
Another that I wrote iterates through all the mysql databases on the server and outputs a mysqldump for each one into its own appropriately-named .sql file.
Another iterates through a list of homedirs and changes the ownership of all the files under the corresponding public_html dir to match the user who should own them to be compliant with suPHP's restrictions.
Another examines the current hardware configuration and downloads, installs, and configures appropriate software for monitoring the health of the currently-attached RAID controller.
These are all relatively simple tasks that could be done by hand -- but whenever I find myself doing the same task more than once, I write a shell script to automate the process.
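Here is a minimal sketch of the load-average check from the first example above, assuming a Unix system; run it from cron to poll, and note that the script name and usage line are illustrative, not the original script:

import os
import subprocess
import sys

# Usage: python load_watch.py THRESHOLD COMMAND [ARGS...]
# Runs COMMAND once if the 1-minute load average exceeds THRESHOLD.
threshold = float(sys.argv[1])
if os.getloadavg()[0] > threshold:  # Unix only
    subprocess.run(sys.argv[2:])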
I also built a base-64 decoder in bash just to see if I could. It works, but it's terribly slow. I use shell scripting for simple tasks that primarily involve executing other programs. I often use Perl when a significant amount of string processing is required, and I use Python for the more complex scripting tasks. The more languages you know, the better you will be at choosing the right one for the job.
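For contrast, here is a minimal sketch of that same base-64 decoding in Python (reading base64 text from stdin, writing raw bytes to stdout):

import base64
import sys

# Decode base64 text from stdin to raw bytes on stdout.
sys.stdout.buffer.write(base64.b64decode(sys.stdin.read()))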
