Related
I have a custom build and deployment script which works over SSH and deploys to servers (running MacOS). The bash script does a lot of simple things like copying files, backing up the old ones and applying the correct SQL scripts for a forward-moving database. But there are some advanced things, like starting a remote SQL upgrade procedure which can be disconnected from; once the deployment script is started again, it only goes forward if the SQL script has been applied completely. (In short, there is some flow control happening, and bash is not really ideal for such stuff.)
The script is already huge and is a mess, since bash is not meant for this kind of detailed logic. Can you recommend some tools or libraries which would make things easier?
From what you tell us, I think you need a deployment tool, rather than a configuration management tool.
To simplify, I'll distinguish the two like this:
A deployment tool is a 'push' tool: When you press the button, the required actions are run to make the deployment. It's a one-step process (it can have multiple actions, but it's launched once).
A configuration management tool is usually a 'pull' tool, where your servers periodically check if their configuration is exactly as the CM server tells them to be - and apply changes, if needed. You configure your servers once, and after that the system assures that all is as it should be. It is also a great tool to easily clone systems.
For deployment tools, I personally know Fabric, a great Python tool. But there is also Capistrano in the Ruby world. I don't know of any others.
For CM tools, Puppet and Chef seem to be the preferred choice of people nowadays. Cfengine is an older tool, which had some problems (I don't know if that has changed).
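To make the "only go forward if the previous step completed" part of the question concrete, here is a rough sketch in plain Python. It is not Fabric's or Capistrano's API. The `run_steps` function, the step names and the JSON state file are all made up for illustration, and it assumes each step is a callable that raises an exception when it fails.

```python
import json
from pathlib import Path


def run_steps(steps, state_file):
    """Run (name, func) steps in order, skipping ones already recorded as done.

    A step is recorded only after its function returns without raising,
    so an interrupted run resumes at the first unfinished step.
    """
    state_path = Path(state_file)
    done = json.loads(state_path.read_text()) if state_path.exists() else []
    executed = []
    for name, func in steps:
        if name in done:
            continue  # finished in an earlier run
        func()  # an exception here aborts the run; this step is not recorded
        done.append(name)
        state_path.write_text(json.dumps(done))  # checkpoint after each step
        executed.append(name)
    return executed
```

Started again after a failure or a dropped connection, the same call skips whatever was checkpointed and picks up at the step that did not finish. A real deployment would also have to decide where the state file lives (locally or on the target server) and when to reset it.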
Here are my recommendations:
Puppet
Chef
cfengine
These are all free (as in beer) and allow you to do what you're wanting. They will require you to adapt your current bash script into modules to fit their design/framework. It's a bit of work, but in the long run it tends to be better since the frameworks take care of error checking, converging configurations and a lot of other things you'd have to manually insert into your own code were you doing this yourself.
I've also used Opsware previously for this sort of thing, but that costs a fair bit of cash and, for what you're trying to do, does not offer significantly more benefit.
In some cases moving from a bash script to a complete solution is not as straightforward as many cloud services claim.
With 'don't try new things when you're on a deadline' in mind:
it could also be a good time to refactor your bash scripts.
I have done automated, repeatable deployments in the past using PaaS or just using Git/SVN hooks using deployogi (which is bash): https://github.com/coderofsalvation/deployogi
I understand your situation, but I'm not sure whether it's fair to say that the bash language implies 'a mess' and 'complex'.
Every language allows you to hide complexity, no?
I guess code (in whatever language) gets overly complex when time does not allow us to refactor :)
PaaS is great. But always needed? I think not.
Question
Can anyone explain why it would be better to choose the puppet or chef vagrant provisioners, rather than the shell provisioner?
Background
I'm in the process of getting started with Vagrant. One of the things I'm having trouble with is deciding which provisioner to use. So far, I've had some success using the shell provisioner, but it has been more work than I expected to get it to run reliably.
At the moment, I'm not familiar with Ruby, Puppet or Chef, but I'm happy to learn any or all of them if I have to. My early experience playing with Puppet and Chef is that if someone else has a recipe that does exactly what you want, it works really well, but doing something non-standard means falling back to coding up a solution in Ruby.
I'm aware of articles comparing puppet and chef, and I'm less worried about which of them to use, rather than knowing when and why I should use them at all.
Full disclosure: I'm a Puppet Labs employee. But I chose Puppet as a product over 2 years before joining them.
I would recommend that you use Puppet or Chef over shell if your configurations are going to a) have any degree of complexity and b) change over time - or if you expect your installation environment itself to change in a way that might alter the way your deployment performs. Your scripts may be very good, but ultimately, unless you are following terrific programming practices around them, testing and QA'ing them, etc., they are going to fail at some point.
There's an entire body of literature around DevOps discussing this notion, but it comes down to the principle of "technical debt" - we tend to do things the easy way now, and thus perceive them as simpler, at the cost of increasing complexity and difficulty later.
One of Puppet's strengths is its deterministic nature - the manifest you write must be able to be programmatically transformed by Puppet into a model of the server you are building. This is perceived by people as being more "difficult" but I would argue that the difficulty is lessened if you average it out along the curve of your technology's lifecycle. In other words, Puppet forces you to do your thinking now, but then deploy to scale with ease, rather than thinking later and re-engineering as you go. Pay in cash now, rather than by credit, with interest, later.
If you're purely pulling down other peoples' manifests, you're going to run into trouble at some point - although we would like it not to be so, working with Puppet today that's certainly the case, because they are writing them to address the general case, and not your particular system. Many general-purpose manifests become useful only when you reach a better understanding of Puppet.
So rather than start there, I'd work my way through the excellent Learning Puppet guide to start to grasp the basics. Puppet's learning curve is steep, but it levels off after a short while.
There are other reasons to use other provisioners or tools, but I'd surely argue that you are better with Puppet or Chef than trying to ensure that your shell scripts are doing exactly what you think they are supposed to do, for as long as you need to spawn new environments.
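To illustrate the difference in plain Python (this is neither Puppet nor Chef code, and `ensure_line` is a made-up helper): a typical shell step runs a command and hopes the starting state was what it expected, while a CM-style resource describes the state you want and only acts when reality differs. A minimal sketch, assuming the "resource" is simply a line that should be present in a text file:

```python
from pathlib import Path


def ensure_line(path, line):
    """Make sure `line` is present in the file at `path`; return True if changed."""
    p = Path(path)
    lines = p.read_text().splitlines() if p.exists() else []
    if line in lines:
        return False  # already in the desired state, nothing to do
    lines.append(line)
    p.write_text("\n".join(lines) + "\n")
    return True
```

Running it twice leaves the file exactly as running it once does, which is what makes re-provisioning safe. The shell equivalent, `echo ... >> file`, appends a duplicate on every run unless you write the check yourself.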
Ah, with the freedom of choice comes the complication of choosing what is right for you.
Chef Solo - Chef solo is most ideal if you’re just getting started with chef or a chef server is simply too heavy for your situation. Chef solo allows you to embed all your cookbooks within your project as well, which is nice for projects which want to keep track of their cookbooks within the same repository. Chef solo runs standalone – it requires no chef server or any other server to talk to; it simply runs by itself on the VM.
Chef Server - Chef server is useful for companies or individuals which manage many projects, since it allows you to share cookbooks across multiple projects. The cookbooks themselves are stored on the server, and the client downloads the cookbooks upon running.
Puppet - The Puppet provisioner runs stand-alone Puppet manifests that are stored on the server and downloaded to the client VM when it is created. The provisioner does not require a Puppet server and runs on the VM itself.
Puppet Server - The Puppet Server provisioner connects to a Puppet server and configures your client VM using node configuration on that server.
Other tools, shell scripts, etc. - Do you use something other than that which is built into Vagrant?
Provisioners are simply subclasses of Vagrant::Provisioners::Base, meaning you can easily build your own, should the need arise.
You can also check out the documentation, docs.vagrantup.com/v2
I would choose the Shell provisioner, then let the shell script clone your Puppet/Chef repository from GitHub or Bitbucket. The script can set up an SSH key to allow an automated git clone. The benefit is that most cloud providers support this as well, so you can use the same script. This blog explains Git, Puppet and Vagrant well: the "one man and the cloud" blog.
Our dev team uses VS.NET for app development and TortoiseSVN/VisualSVN for version control. It seems that almost every day issues arise with the working copy or the repository getting screwed up, and folks just throw up their hands and call me when it happens. There are definitely human factors at work (SVN works as it should) but I'm tired of playing SVN helpdesk to the dev team. Can anyone recommend a better/more intuitive setup for version control?
Agent SVN works well for me. It integrates nicely with Visual Studio.
SVN is about as simple as version control systems get. Problems should only arise when dealing with merging operations...those can be tricky.
If you don't address the "human factors" it won't matter which version control system you use, you will always be the helpdesk. To address these kinds of problems, you typically need to:
Set up a wiki with common "recipes" for version control tasks.
Include a workflow diagram for how changes are made to your code (for those who don't like to read).
Host a training session that is specifically designed for your users (use the wiki material).
When helping someone with a problem, be sure to make them perform the actual fix. Don't just do it for them, talk them through it instead.
Make a point of directing users to product documentation when helping them.
Introducing a new version control system into any organization should include the items I listed. I realize it is extra work for those who get it done, but it does save you from long "support" hours down the road.
Can anyone recommend a better/more intuitive setup for version control?
Better? Yes. More intuitive? That's debatable. Look into distributed version control software, namely Mercurial or Git. Both have freely available plugins to integrate with Visual Studio. And if you can manage spending a little money, I've heard very good things about Fog Creek's Kiln.
As for your issues with SVN, I have a couple tips. The first is to make sure you keep everyone synced on the same version of the product. It tends to update frequently, and so this can be tricky, as you also don't want to fall too far behind the current version. The second is that we used to have big problems with Tortoise trying to cache icon overlays on mapped network drives. There is an option you can turn off somewhere that suddenly made things way more stable. But that was at my last job, and I don't remember the exact setting any more.
I think you already gave the answer in your question - sort out the "human factors" by providing appropriate training. Version control for software development doesn't get much simpler than SVN, so from the way your question is phrased, my guess would be that said human factors are just going to find other ways of making your life interesting.
If you have issues with your repository getting screwed up (like commits on tags, wrong commit messages...), one of the easiest ways is to play it the hard way: put hooks on the server to enforce policies. You can have a look at the official documentation.
Basically, this is an easy way to enforce naming / formatting and avoid a lot of human issues (committing on tags, messing with externals...)
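As a rough idea of what such a hook can look like, here is a sketch of a pre-commit check in Python. The policy itself (a minimum message length, and no changes under `tags/` other than additions) is just an assumption for the example, and `check_commit` is a made-up name. The `svnlook log` and `svnlook changed` subcommands ship with Subversion, and a pre-commit hook is called with the repository path and the transaction name.

```python
import subprocess
import sys


def check_commit(log_message, changed_lines, min_message_length=10):
    """Return a list of policy violations for a pending commit.

    `changed_lines` are lines in the format printed by `svnlook changed`,
    e.g. "U   trunk/src/main.c" (status columns, then the path).
    """
    problems = []
    if len(log_message.strip()) < min_message_length:
        problems.append("commit message is too short")
    for line in changed_lines:
        status, path = line[:4].strip(), line[4:].strip()
        in_tags = path.startswith("tags/") or "/tags/" in path
        # Simplification: additions (creating a tag) pass, anything else is rejected.
        if in_tags and not status.startswith("A"):
            problems.append("modification under tags/: " + path)
    return problems


def main(repos, txn):
    def look(subcommand):
        # Inspect the not-yet-committed transaction with svnlook.
        return subprocess.run(["svnlook", subcommand, "-t", txn, repos],
                              capture_output=True, text=True, check=True).stdout

    problems = check_commit(look("log"), look("changed").splitlines())
    for problem in problems:
        print(problem, file=sys.stderr)  # stderr is passed back to the committer
    return 1 if problems else 0  # a non-zero exit rejects the commit


if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv[1], sys.argv[2]))
```

Keeping the policy in a plain function like `check_commit` means you can try it out on sample input before wiring it into the repository's hooks directory.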
I have inherited a sprawling crontab that I need to maintain and update. I don't have much experience with it or bash scripting (I think I've got a decent grip on the basics) and I want to do a good job.
Short request: Any guidelines for 'refactoring' a messy crontab and set of bash scripts?
Long request: I've run into a number of issues, but there are so many people using cron files etc. that I feel like I must be missing some large repository of information, best practices and tools - or is this just a stylistic difference for this kind of programming? (My bias: why do something manually if I can use a tool to do it faster, consistently and well?).
Examples of issues so far:
Due to an external event, the crontab didn't run for a couple of days. Along with someone else, we manually went through the list, trying to figure out what didn't run, what we needed to rerun, and what scripts we needed to edit and run with earlier dates etc.
What I can't find:
There are plenty of (slightly pointless) 'cron generators' online. Where are the reverse? Something I can feed in a long crontab, two dates, and have it output which processes should have run when, or just how many times total?
This seems within my meager scripting capabilities, so shouldn't it exist already? ;)
Alternatively, if I ever have to do that again, is there some way of calling a bashscript so that any instances of date() are pre-set to an earlier time, rather than changing every date call within the script? (e.g. for all the missed reports and billing invoices)
It turns out a particular report hadn't been running for two years. It was just requested again, and lo, there it was in the crontab! The bash script just had broken path references to the relevant files.
What I can't find: some kind of path checker for bash files? Like a website link checker. Yes, I'll be going through these all manually eventually, but it'd show up at least some of the problem areas.
It sounds like sometimes there has been either too long or too short a gap between dependent processes, so updates have happened after the first has been run, or the first hasn't finished running before the second has been called. I've seen a few possible options for this (e.g. anacron runs in sequential order), but what would you recommend?
There are also a large number of essentially meaningless emails generated from the crontab (scripts throwing errors but running 'correctly', failing mostly silently, or just printing every step of non-essential scripts). I'll be manually going through scripts and trying to get them to provide more useful data, or 'succeed quietly', but y'know - any guidelines?
If my understanding or layout of the issue is confused, then I apologize, but hey - you see my problem then! I need to go from newbie, to knowing what to do to get this right, and not screw up a touchy system further. Thanks!
Not a full answer, but more resources that have been helpful:
http://blog.endpoint.com/2008/12/best-practices-for-cron.html
I am slowly going through this, and trying to implement each of the points. I hadn't thought to google 'best practices cron' til after my post. :P
For version control, I'm just going to use RCS in the meantime, as I edit scripts on a file-by-file basis, but I've been advised to get Git set up (or Mercurial if I was on a Windows system).
This actually sounds great:
http://everythingsysadmin.com/2010/09/xed-202-released.html
"xed is a perl script that locks a file, runs $EDITOR on the file, then unlocks it."...and puts it in RCS if it wasn't already.
Completely brainless version control. If I get my head around bash, I'd like to create an editing shortcut that automatically commits to whichever version control system I use.
Other tips I received from a system admin:
Dates: Rather than using say, date, or --date="last monday", use a fixed date and add a day/week etc to it each time it runs (if not more than current day obviously), because then if the script doesn't run, I can just re-run the script repeatedly until it catches up. Ah!
(And, this might sound obvious, but heaps of the reports I'll eventually be editing don't say prominently what dates the report is running for. Will fix.)
And I was reassured that I should try to get the cron emails as quiet as possible, so that I actually notice if there's an error email.
There are wrappers for better cron error reporting that I have not yet investigated, linked here: http://habilis.net/cronic/
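For my own notes, here is roughly how I understand the fixed-date tip above, sketched in Python rather than bash. The state file, the function names and the `report` callback are all hypothetical. The idea is that the script remembers the last date it handled and works forward one day at a time until it reaches today.

```python
from datetime import date, timedelta
from pathlib import Path


def pending_dates(state_file, today, first_run):
    """Return every date from the day after the last recorded run up to `today`."""
    p = Path(state_file)
    if p.exists():
        last = date.fromisoformat(p.read_text().strip())
    else:
        last = first_run - timedelta(days=1)  # nothing has run yet
    days = []
    while last < today:
        last += timedelta(days=1)
        days.append(last)
    return days


def run_catching_up(state_file, today, first_run, report):
    """Call `report(day)` for each missed day, recording progress after each one."""
    for day in pending_dates(state_file, today, first_run):
        report(day)  # if this raises, the day is not recorded and is retried next time
        Path(state_file).write_text(day.isoformat())
```

If cron misses a few days, the next run simply produces the missing reports in order, each one with its own explicit date.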
Herculean task ahead of you, best of luck. :)
I'd suggest finding all the tasks that run daily and shove them into their own scripts in /etc/cron.daily/. Same for weekly into /etc/cron.weekly, hourly, and monthly.
You might want to investigate use of anacron(8) for scheduling your jobs, if the machine won't always be online, but you still need some level of control over when the jobs are run. It's been the default cron-helper-tool for multiple distributions for a few years, so hopefully it's stable enough to rely on for your own tasks; but I could easily imagine that it might not perfectly meet your needs.
Faking the dates to scripts can be done with at least two packages on Ubuntu: datefudge and faketime. I have no experience with either, but both sound like they should be able to help. I hope you won't need it in the future. :)
Sorry, I know of no path-checker for bash scripts. It seems unlikely, since simple scripts are simple and easy to check by eye :) and complex scripts will be generating their pathnames at runtime anyhow. Maybe you could keep a database of pathnames used by each script and write a new script to verify that database regularly.
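That said, a crude checker for the simple cases is not much code. The sketch below is only a heuristic: it assumes you care about absolute paths written literally in the script, and the regular expression and function names are made up for the example. It will miss anything built from variables at runtime, and it can produce false positives.

```python
import os
import re

# Absolute paths such as /usr/local/bin/report.sh. A slash that follows a word
# character, "$", "}", "." or another "/" is treated as part of a relative path,
# a variable expansion or a URL, and is skipped.
ABSOLUTE_PATH = re.compile(r"(?<![\w$}./])/(?:[\w.+-]+/)*[\w.+-]+")


def literal_paths(script_text):
    """Return absolute paths that appear literally in the script, ignoring comment lines."""
    found = []
    for line in script_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # this also skips the shebang line
        for match in ABSOLUTE_PATH.findall(line):
            if match not in found:
                found.append(match)
    return found


def missing_paths(script_text):
    """Return the literal absolute paths that do not exist on this machine."""
    return [p for p in literal_paths(script_text) if not os.path.exists(p)]
```

Feeding it the text of each script named in the crontab would at least flag the obviously dead references, which is the "link checker" kind of first pass the question asks about.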
You could disable the cron email by setting MAILTO="". I'm not sure I like this. Maybe setting MAILTO to a logging-only account would help the deluge. Another option is getting really good at your procmail(1) rules so you can stuff them in another mailbox completely.
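A middle ground is a small wrapper that swallows the output when a job succeeds and only prints it when the job fails. Cron sends mail when a job produces output, so a silent success means no mail. This is a sketch of that idea in Python, not a reimplementation of any existing tool, and `run_quietly` is a made-up name. It assumes a non-zero exit code is the only sign of failure you care about.

```python
import subprocess
import sys


def run_quietly(command):
    """Run `command` (a list of arguments); return (exit_code, text_to_print).

    The text is empty on success, so cron has nothing to mail.
    """
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode == 0:
        return 0, ""
    report = "command failed with exit code {}: {}\n--- stdout ---\n{}--- stderr ---\n{}".format(
        result.returncode, " ".join(command), result.stdout, result.stderr)
    return result.returncode, report


if __name__ == "__main__" and len(sys.argv) > 1:
    code, text = run_quietly(sys.argv[1:])
    sys.stdout.write(text)
    sys.exit(code)
```

In the crontab you would put the wrapper in front of the real command. Scripts that exit 0 while printing errors would still need fixing by hand, since the wrapper trusts the exit code.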
Getting good at mutt color or score controls can help you spot the wheat amongst the chaff. (color index red black ERROR or similar commands might help you spot the problems more quickly.)
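On the "reverse cron generator" from the question (feed in a schedule and two dates, get back when the job should have run): I don't know of a ready-made tool, but a rough version is short enough to sketch in Python. This is a simplified reading of the five-field format and the function names are made up. It handles `*`, lists, ranges and steps, but not month or weekday names, `@daily`-style shortcuts, or environment lines in the crontab.

```python
from datetime import datetime, timedelta

FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 7)]  # min, hour, dom, month, dow


def parse_field(text, low, high):
    """Expand one cron field ('*', '*/15', '1-5', '0,30', '1-10/2') into a set of ints."""
    values = set()
    for part in text.split(","):
        rng, _, step = part.partition("/")
        if rng == "*":
            start, end = low, high
        elif "-" in rng:
            start, end = (int(x) for x in rng.split("-"))
        else:
            start = end = int(rng)
        values.update(range(start, end + 1, int(step or 1)))
    return values


def expected_runs(schedule, start, end):
    """List every minute in [start, end) at which the five-field `schedule` fires."""
    fields = schedule.split()
    minute, hour, dom, month, dow = (
        parse_field(f, lo, hi) for f, (lo, hi) in zip(fields, FIELD_RANGES))
    if 7 in dow:
        dow.add(0)  # cron accepts both 0 and 7 for Sunday
    dom_any, dow_any = fields[2] == "*", fields[4] == "*"
    runs = []
    t = start.replace(second=0, microsecond=0)
    if t < start:
        t += timedelta(minutes=1)  # round up to the next whole minute
    while t < end:
        cron_dow = (t.weekday() + 1) % 7  # Python: Monday=0; cron: Sunday=0
        dom_ok, dow_ok = t.day in dom, cron_dow in dow
        # Classic cron rule: if both day fields are restricted, either may match.
        day_ok = (dom_ok and dow_ok) if (dom_any or dow_any) else (dom_ok or dow_ok)
        if t.minute in minute and t.hour in hour and t.month in month and day_ok:
            runs.append(t)
        t += timedelta(minutes=1)
    return runs
```

Calling `expected_runs` for each crontab line over the outage window gives the list of runs that were missed. It walks the window minute by minute, which is slow in principle but fine for a gap of a few days.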
What do the clever programmers here do to keep track of handy programming tricks and useful information they pick up over their many years of experience? Things like useful compiler arguments, IDE short-cuts, clever code snippets, etc.
I sometimes find myself frustrated when looking up something that I used to know a year or two ago. My IE favorites probably represent a good chunk of the Internet in the late 1990s, so clearly that isn't effective (at least for me). Or am I just getting old?
So.. what do you do?
Two Things I do:
I blog about it - this allows me to go back and search my own blog.
We use the code snippet feature in Visual Studio.
Cheers.
I use:
Google Notebook - I take notes for projects, books I'm reading, etc
Delicious + Firefox plug in - Every time I see a good page I mark it.
Windows Journal (in tablet pc) - When I need to draw something and then copy/cut/paste it. I have more distractions here, the web is always very close :)
Small Moleskine paper notebook - It's always with me.
Big paper notebook - When I need more space to write and fewer distractions.
Obviously these are for all useful information, not just for snippets or tips and tricks.
Why not set up a Wiki?
If you are on Windows, I know that ScrewTurn Wiki is pretty simple to deploy on a desktop/laptop. No database to fuss around with.
Blog about it.
One of the nice side-effects of blogging is that if you use a sensible categorization or tagging system, it's quite easy to search for stuff within your blog. The fact that you wrote about it also makes it easier to remember problems you have encountered before ("hey, I blogged about that!").
That's a great benefit aside from, of course, being able to share this information publicly so that others might be able to find your solution to a particular problem using Google.
A number of people I know swear by Google Notebook
I send them to my Gmail account; that way I have them wherever I go, and they can be put into appropriate folders for later.
I second the blog about it technique...even Jeff said that's a major reason he blogs.
Also, regarding the wiki idea, if you set one up at work, be sure to encourage your coworkers to do the same. When someone finds something of interest they can just write a little "article" explaining what it is and how to do it... that way, not only are your own things easily available and quickly searchable, but you'll often find out things you never knew from other people in your group. That way it benefits everyone not just you.
I agree with emailing, the wiki and the blog. Emailing is the most useful. If you can't use Gmail and you're on Windows, install a desktop search utility (Windows Search, Google Desktop, Copernic, etc.).
I also like to jot it into a textfile and save it in my documents folder. Whatever desktop search utility you use will be able to find it easily. e.g.
//print spool stop.notes.txt
If the printer spooler stops, start it again by
- Services > Provision Networks > Restart Service
tags: printer provision no printer spooler cannot print remote desktop
Subscribe in Google Reader and then search later.
At my last place of work they wouldn't let me set up a wiki or anything - so I just made various word documents full of tips and instructions and gave that to my successor when I left.
Now though I'd use a private wiki, or maybe a blog.
For many years I've kept a Word doc named Knowledgebase.doc that contains all my notes with a decent table of contents. I like to keep everything in one searchable doc.
I use a sync tool to make sure the file is copied to all the machines I want it on.
I use TiddlyWiki stored in my Dropbox account. Although, recently, Evernote is getting my attention; it has a really useful feature: you send a Twitter direct message to the Evernote user (myen) and it adds a note with your message (a really quick way to add notes or URLs for post-processing). Imagine, you can use a command-line Twitter client to create notes! (or any Twitter client). I really like this feature.