How do you manage the application in your own ami? - amazon-ec2

We have some crawlers for gathering data from the internet.
EC2 spot instances are a very inexpensive solution for our application.
In our case, we set up the crawler with the following steps:
1 launch an instance from an Amazon quick start AMI
2 install the dependency libraries
3 send the crawler app to the instance
4 set up the launcher for our crawler so that it starts after boot completes
5 make an AMI from the instance
But we need to repeat step 3 whenever the crawler needs an update.
That affects other settings, such as the 'ami-id' in auto scaling
or in other spot instance request scripts.
Application management in an AMI is a deployment issue, so we would like suggestions to make it as easy as possible. There is another way to manage it: we use a source code management tool, and the deployment steps look like this:
3 git clone from source code repo.
3.1 compile the app from source
3.2 remove the previous build
3.3 install the latest build
4 the launcher always rebuilds the crawler from the latest release before it starts the crawler.
The new method avoids changing the ami-id, but it must check out the source code each time. As a result, it takes more time to fetch the source (which is growing every day).
How do you manage your artifacts on an AMI?
I'm not sure that always building from source is the best choice.
It only overcomes some deployment problems, and it does not address updating the crawler after the instance has been running.

If your crawler is not updated every hour of the day, I think you should write a script that combines both of your approaches, the previous one and the new one. Have the script check with your server whether the current build is the latest. If it is, go straight to normal crawling; if it is older, run the git clone and rebuild steps. That way, if you do not modify the crawler very often, you get efficient performance.
With this you avoid the rebuild most of the time; as you describe the rebuild process, you must currently be doing these steps mostly for no reason.
Hope this helps.
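A minimal sketch of that check, written in Python as an assumption (the question does not name a language). The function name `plan_launch`, the build identifiers, and the step labels are all hypothetical; how you obtain the installed and latest build identifiers (a version file, a commit hash, a small endpoint on your own server) is up to you.

```python
def plan_launch(installed_build, latest_build):
    """Return the steps the launcher should run, in order.

    Both arguments are opaque build identifiers (hypothetical: for example
    a release tag or a commit hash reported by your own server).
    """
    if installed_build == latest_build:
        # Up to date: skip the clone/compile/install cycle entirely.
        return ["start crawler"]
    # Out of date (or nothing installed yet): rebuild first, then start.
    return [
        "fetch source",
        "compile the app",
        "remove the previous build",
        "install the latest build",
        "start crawler",
    ]


if __name__ == "__main__":
    print(plan_launch("v41", "v41"))
    print(plan_launch("v41", "v42"))
```

The launcher then only pays the fetch and compile cost on boots where a newer build actually exists.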


Can you host a bitbucket pipeline internally?

We are currently using bitbucket cloud to host our grails-app repository. We want to set up some pipelines to do things like run unit tests and make sure the app compiles before being able to merge a branch to master.
I know this can be done fairly easily by letting Bitbucket host the pipeline and committing a well-written pipeline file. However, our app is very large: even on brand new MacBook Pros it takes 20 minutes to compile, and on some older ones it can take 2 hours or more. Grails, thankfully, only compiles files that have changed since the last compilation. However, that doesn't help in a Bitbucket pipeline that works off a fresh pull of the app every time it runs.
My idea was to set up a pipeline that runs internally for us, so that it already has the app pulled and can just switch to the desired branch and run from there. This still might take time when switching between 2 very diverged branches, but it's better than compiling from scratch every time.
I can't seem to find any documentation on hosting a pipeline internally with Bitbucket Cloud. Does anyone know if this is possible, and if so, where the documentation for it is?
It would also be acceptable to find a solution to the long compilation problem itself with bitbucket hosted pipelines.
A few weeks ago, self-hosted runners were made available as a public beta. Here are the details:
Additionally, if you're looking to retain some of your files from one build to the next to save doing the same work over and over again, have a look at caches: there are some built-in ones that you could use, but you can define your own custom ones as well. Essentially it's just a way of preserving the contents of a directory for a future build.
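To make the idea of a cache concrete, here is a rough Python sketch of "preserving the contents of a directory for a future build". It is only an illustration of the concept, not how Bitbucket implements it; in Bitbucket itself, caches are declared in the pipeline configuration rather than scripted by hand. The function names and directory arguments are hypothetical.

```python
import shutil
from pathlib import Path


def save_cache(build_dir, cache_dir):
    """After a build: copy the directory worth keeping into the cache."""
    shutil.copytree(build_dir, cache_dir, dirs_exist_ok=True)


def restore_cache(cache_dir, build_dir):
    """Before a build: copy cached files back, if a cache exists yet.

    Returns True if something was restored, False on the very first build.
    """
    if not Path(cache_dir).is_dir():
        return False  # nothing cached yet, the build starts from scratch
    shutil.copytree(cache_dir, build_dir, dirs_exist_ok=True)
    return True
```

A build that restores compiled output this way only has to recompile what changed since the cached run, which is the saving the question is after.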

Alfresco: How to update repository-tier workflow files without restarting the tomcat server?

I'm currently working on developing a custom workflow with many custom behaviors and scripts. I'm using the Alfresco Maven SDK to build and test my project as I develop it. This necessitates that I restart the repository-tier tomcat server every time I want to make a change/update my workflow files. I am getting quite frustrated with how long this takes each time, and it means that I'm wasting time while waiting for the server to restart, especially when I've made a small typo in one of my files.
I'm looking for a way (if it's possible) to update my files (in particular the bpmn process file) and apply these changes to my Alfresco instance without having to restart the tomcat servers each time. I've set to true in my service-context.xml, and I have also tried to redeploy the workflow from the admin-workflow-console, but my changes do not take place unless I manually restart the server.
I am using: Alfresco Community 5.2, Maven SDK 2.2
Any tips or suggestions would be very welcome!
Yes, you can do it from the workflow admin console, using the deploy command with the path to your workflow definition file.
Ex :: deploy alfresco/workflow/<workflow-definition>.xml
Refer to the docs for more information.

TFS Team Build - Testing to Production

I have scoured the internet to find out what I can on this, but have come away short. I need to know two things.
Firstly, is there a best practice for how TFS & Team Build should be used in a Development > Test > Production environment? I currently have my local VS get the latest files. Then I work on them & check them in. This creates a build that then pushes the published files into a location on the test server which IIS references. This creates my test environment. I wonder then what is the best practice for deploying this to a Live environment once testing is complete?
Secondly, off the back of the previous - my web application is connected to a database. So, the test version will point to a test database. But when this is then tested and put live, I will need that process to also make sure that any data connections are changed to the live database.
I am pretty much doing all this from scratch and am learning as I go along.
I'd suggest you look at Microsoft Release Management, since it's the tool that can help you do exactly the things you mentioned. It can also be integrated with TFS.
In general, release management is:
the process of managing, planning, scheduling and controlling a
software build through different stages and environments; including
testing and deploying software releases.
Specifically, the tool that Microsoft offers would enable you to automate the release process, from development to production, keeping track of what and how everything is done when a particular stage is reached.
There's an MSDN article, Automate deployments with Release Management, that gives a good overview:
Basically, for each release path, you can define your own stages, each one made of a workflow (the so-called deployment sequence) containing the activities you want to perform using pre-defined machines from a pool.
It's possible to insert manual interventions/approvals if necessary, and the whole thing can be triggered automatically once your build is done.
Since you are pretty much in control of the actions performed on each machine in each stage (through the use of built-in or custom actions/components), it is certainly also possible to change configuration files, for example to point at a different database or to test different scenarios.
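As an illustration of the kind of configuration change a custom action could make when promoting a build from test to live, here is a small Python sketch that rewrites a connection string in a web.config-style XML document. This is not part of Release Management, which has its own mechanisms; the function name, the connection string name `Default`, and the server names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical test-stage configuration, in the usual web.config layout.
TEST_CONFIG = """<configuration>
  <connectionStrings>
    <add name="Default" connectionString="Data Source=test-db;Initial Catalog=App" />
  </connectionStrings>
</configuration>"""


def set_connection_string(config_xml, name, value):
    """Return config_xml with the named connection string replaced."""
    root = ET.fromstring(config_xml)
    # Only touch <add> entries inside <connectionStrings>.
    for entry in root.findall("./connectionStrings/add"):
        if entry.get("name") == name:
            entry.set("connectionString", value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    live = set_connection_string(
        TEST_CONFIG, "Default", "Data Source=live-db;Initial Catalog=App"
    )
    print(live)
```

Running a step like this per stage keeps the published files identical between environments, with only the environment-specific values swapped in at deployment time.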
Another image to give you an idea of how it can be done:

SVN Post-Commit to Update Working Copy when Working Copy is on a Network Drive

I work for a fairly new web development company and we are currently testing subversion installations to implement a versioning system. One of the features we need the versioning system to perform is to update the development server with an edited file once it has been committed.
We would like to maintain one server for all of our SVN repositories, even though, due to system requirements, we need to maintain several separate development servers. I understand that the updates are fairly simple when the development server resides in the same location as SVN, but that is just not possible for us. So, we need to map separate network drives to the SVN server for each development server.
However, this errors on commit. Here is my working copy test directory, as referenced in the post-commit.bat file:
This, however, results in an error...
post-commit hook failed (exit code 1) with output: svn: Error resolving case of 'Z:\testweb'
I'm sure this is because the server is not the same user as me and therefore does not have the share I need mapped to "Z" - I just have no idea how to work around this. Can anyone help?
UPDATE: The more I look into these issues, the more it appears that the real solution to the problem is to use a CI server to do what I am attempting. I am currently looking into TeamCity and what it might do for us.
Don't do this through a post-commit hook. If you ever manage to get the hook to succeed, you'll be causing the person who did the commit to wait until the update is complete. Instead, I recommend that you use Jenkins which is a continuous build engine.
It is possible that you don't have anything to build. After all, if you're using PHP or JavaScript, there's nothing to compile. However, you can still use Jenkins to do the update for you.
I can't get into the nitty-gritty detail here, but one of the things you can do with Jenkins is redefine its working directory. You can do this by clicking on the Advanced button when you define a job, and it'll ask you where you want the working directory. In this case, you can specify your server's working directory.
Another thing you can do with Jenkins is have it automatically run tests, or do a somewhat smoother update. For example, you might have to restart your web server when you change a few files, or you might need to make sure that if you're changing 100 files, they all get changed at once, because otherwise your server isn't in a stable state. You could use Jenkins to do this too. And, if there are any problems, you can have Jenkins email the person responsible for the server to say that the update failed.
Jenkins is easy to set up and use. You can download it and start it up in 10 minutes. Setting up a job might take you another 15 minutes if you have never seen Jenkins before and have no idea how it works.
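To show the shape of such a job, here is a rough Python sketch of an update step that runs outside the commit and reports failures. It is not Jenkins' API, just the logic a job script might carry out; the function name and the `notify` callback are hypothetical, and `run` is injectable only so the logic can be exercised without a real working copy.

```python
import subprocess


def update_working_copy(path, notify, run=subprocess.run):
    """Run `svn update` on path; call notify(message) if it fails.

    Returns True on success, False on failure.
    """
    result = run(
        ["svn", "update", "--non-interactive", path],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Tell whoever looks after the server, instead of blocking the committer.
        notify(f"Update of {path} failed: {result.stderr.strip()}")
        return False
    return True
```

Because the CI server runs this under its own account, the working copy path has to be one that account can reach, which is the same constraint that broke the mapped drive in the hook.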

What's the workflow of Continuous Integration With Hudson?

I was referred to Hudson today.
I have heard about continuous integration before, but I have no idea what the heck a CI server is.
Hudson is really easy to install on Ubuntu, and in several minutes I managed to set up an instance of it.
But I don't quite understand the workflow of a CI server, or how I am supposed to use it.
Please tell me if you have experience with CI, thanks in advance.
I am currently using Mercurial as my SCM, and I wonder what is the right way to use it with Hudson.
I have installed the Mercurial Plugin of Hudson, and I create a new job with a local repository. When I commit in the repository the Hudson job is built with the latest version of my source code.
If what I used is a remote repository, what's the workflow like?
Is it something like the following?
Set up a Hudson job with the repository
Developer makes a local clone of the repository
Developer commits and pushes changes
The remote repository is updated with the incoming changeset
Run a Hudson build
There may be something I have misunderstood entirely; please help me by pointing it out.
Continuous Integration is the process of "integrating software" continuously i.e. as frequently as possible (ultimately after each set of changes) to avoid any big-bang integration and all subsequent problems by getting immediate feedback.
To implement Continuous Integration, you first need to automate the build of your software (where build means of course compiling sources, packaging them, but also compiling tests, running the tests, running quality checks, etc, anything that will help to get feedback on the health of your code). Then you need to trigger the build on the latest version of the sources on a particular event (a change in the repository, a temporal event), to generate reports and to send notifications upon failure (by mail, twitter, etc).
And this is precisely the responsibility of a CI engine: offering trigger mechanisms, being able to get the latest version of the sources, running the build, generating and publishing reports, sending notifications. CI engines do implement this.
And because running a build is CPU and Disk intensive, CI engines usually run on a dedicated machine (or even a farm of machines if you want to build lots of projects).
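The responsibilities listed above can be sketched as a single cycle in Python. This is an illustration of the idea, not Hudson's implementation; the function `run_ci_cycle` and its three callbacks are hypothetical stand-ins for "get the latest version of the sources", "run the build", and "send notifications".

```python
def run_ci_cycle(last_built, get_latest_revision, build, notify):
    """Run one trigger check and, if needed, one build.

    last_built: the revision that was built most recently (or None).
    Returns the revision that is now the most recently built one.
    """
    latest = get_latest_revision()
    if latest == last_built:
        # Nothing changed in the repository: no build is triggered.
        return last_built
    succeeded = build(latest)  # compile, run tests, quality checks, ...
    if not succeeded:
        notify(f"Build failed for revision {latest}")
    return latest
```

A CI engine effectively calls something like this on every trigger (a push to the repository or a timer), so a broken change is reported soon after it lands.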
Back to your question now. Once you've got Hudson running, configure it (Manage Hudson > Configure System): set up the JDK, build tools, etc. Then set up a Hudson job and follow the steps: configure the location of the source repository, the build tool, the trigger, and a notification channel, and you're done (you can do more complex things, but that's a start).
For more details on the setup, check:
The official Use Hudson guide for more details. << START HERE
Continuous Integration with Hudson - Tutorial.
Spot defects early with Continuous Integration.
Martin Fowler's overview of continuous integration is one of the canonical references. In my opinion, using automation to make sure your code base is healthy is one of the most useful things that you can set up.
Update: Sorry that I didn't have much time earlier to expand on my reply. #Pascal_Thivent is right that in order to use CI effectively, you need to be able to automate your builds, tests, etc. CI is actually a good forcing function for this. For me, it's one of those little warning flags if I start to think that it would be too painful to put a build into Hudson. It means that something is not quite right.
What I like about Hudson is that it's flexible enough to accommodate different workflows. We use it for both builds / unit tests and releases. And it eliminates a lot of the worry about certain release procedures only working in one person's environment.
What I don't like about Hudson is that it is occasionally unstable when new builds break plugins. I've had a couple of upgrades (2 out of 10 or so) go bad because of incompatibilities. I do two things now:
I never upgrade my team's Hudson server to the latest and greatest right away. I generally only upgrade when there are significant new features, or bug fixes.
I now have a basic Hudson instance set up with all my plugins on a virtual machine with some dummy builds that I fire up to test out any new upgrades before doing it on the public server.
