Ho to do blue/green deployment for hybris? - continuous-integration

I want to deploy hybris builds with zero down time. Our technical architecture consist of two frontend servers, two backend servers, two master/slave solr clusters, but a single DB server (MS SQL 2012). A new build may require patch execution which changes the DB schema.
Would it be possible to achieve this in a single DB landscape?
If two DB's are required (blue and green), then what is the best practice for DB replication in case of hybris?

Hybris does provide a rolling update feature (when you're running it in a cluster environment).
This is targeted to allow for zero downtime.
You can find more information on the hybris help pages, e.g.
https://help.hybris.com/6.5.0/hcd/8c455268866910149b25f7b53d1af3e1.html
Looking at the first picture there it seems to be pretty much fitting for the architecture you describe.
(But to be honest I have no experience with it, so I can't tell you whether or or how well it works :) )
If you have risky changes or end up needing to rollback your rolled out update you will have to do quite a bit of db cleanup etc.
From that perspective a blue/green setup might sound better although with db replication you would end up with the same problem (as your updated schema would be replicated as well I assume).

Hybris only adding new columns to db, never change their type or remove them. So single DB can be OK. I didn't test this using store front while updating system. I think it will be OK.
On the other hand you need development for empty/null check for new attributes in development.

Related

CouchDB view replication

Using CouchDB to create a hosted app for clients. I have a dev database I work from, as well as separate DBs for each client. Works well, problem is when I make a change on dev, I have to manually copy the view code into each separate DB. It's fine now that I have 2 clients. But my hope is to grow to 100 clients. One small change could take a very long time!
Am I missing something simple in regards to replicating ONLY the views?
Thanks!
Here is how I usually work.
I have my local dev db. create and update my design docs (containing the views).
Have a production deployment db that will be visible to all the clients. I usually use iriscouch. Keep no data in this db.
When setting up a client, make sure you setup one way replication from #2 to this client db.
So to deploy to all clients, I put my latest design docs on the master, then all the clients will then be updated. There are some caveats to this. You have to make sure when you deploy to the master db, that you respect the revisions, so the client dbs will know to update.
Here is a quote from the master, Jason Smith:
The Good Way: Work with _rev
I think your application has a concept of "upgrading" from one
revision to another. There is staging or development code, and there
is production code. Periodically you promote development code to
production. That sounds like two Git branches and it also sounds like
two doc ids. (Or two sets of doc ids.)
You can test and refactor your code all day long, in the temporary doc
(_design/dev). But in production (_design/pro), it's just like a long
Git history. Every revision built from the one previous, to the
beginning of time.
If you want to promote _design/dev, the latest deploy is
_rev=4-abcdef. So this will be the fifth revision deployed, right?
Hey! Stop reading the "_rev" field! But yeah, probably.
COPY /db/_design/dev
Destination: _design/pro?rev=4-abcdef
{"id":"_design/pro","rev":"5-12345whatever"}
Notice that each deployed _design/pro builds from the other, so it
will naturally float out to the slaves when they replicate.
In real-life, you may have add a middle step, pushing design documents
to production servers before actually publishing them. Once you push,
how long will it take couch to build new views? The answer is,
"Christ, who knows?"
Therefore you have to copy _design/dev to _design/staging and then
push that out into the wild. Then you have to query its views until
you are satisfied that they are fresh and fast. (You can compare
"update_seq" from /db vs. "update_seq" from /db/_design/ddoc/_info).
And only then do you HTTP copy from _design/staging to _design/pro and
let that propagate out.
Source
Its not as confusing as it may sound. But to simplify the process, you can use Reupholster
(I admit, I have written this tool). It is mainly for couchapps, but even if you are just promoting design docs, it might be worth you just using reupholster to deploy to your master db. Reupholster adds in some handy info to the design doc, like date/time svn or git info. That way when you look at a clients db you can tell which design doc they are on.
Good luck
You can replicate just the design docs;
http://wiki.apache.org/couchdb/Replication#Named_Document_Replication

Speeding up integration tests that rely on an Oracle DB

We have an Oracle database server specifically for our unit tests to run against. Is there a way to tune Oracle specifically for this kind of purpose? As the data is constantly being thrown away (since it's just test data). I wonder if there is a way to have an Oracle database in-memory and connect without the TCP/IP stack perhaps to speed up these tests.
Any suggestions?
The answer is likely yes, but changing the database environment from the production configuration to a integration configuration during testing introduces risk that the testing will give false results.
If the hangup is the database cleanup/reset stage, and you have Enterprise Edition, look into FLASHBACK DATABASE as (potentially) a quicker way to reset the database to a fixed point.
At worst, you don't need to waste time building the cleanup/reset scripts.
The TCP/IP stack is unlikely to be adding much to your overhead. You could, however, run the Oracle instance on the same server as your test cases, and access via ORACLE_SID (which I believe uses OS-level inter-process communication).
Before examining changes to Oracle, however, I'd look at what tests are getting run on your continuous integration server. If you haven't done it already, this means splitting the integration tests (which require a back end) from the unit tests (which don't), and running them on different schedules. There's rarely a reason to run a full suite of integration tests for every change.
Next: are you using any sort of object-relational mapper to access your database? If yes, and you're not relying on any particular Oracle quirks, you could replace Oracle with an in-memory database (you don't say what language you're using, so this may or may not be an option).
And finally, consider using the Oracle import/export facility to completely rebuild your database for each integration test run. It's probably quicker, and definitely more stable than trying to delete whatever rows you created (this assumes that your integration tests start with pre-populated data; if not, just drop and rebuild the schema).
There are a lot of things you could do to the Oracle instance for the scenario you mention like using the correct locking strategy/isolation level, disabling all kinds of undo-logs, etc. You should consult a good Oracle Tuning book for that (I like the one by Mark Gurry, but I'm not sure how up to date that is).
There is one other thing that might be important: If you constantly add and delete data from your db (I mean "totally empty the db"), make sure you're setting up the storage parameter for each table correctly. If you have the space, consider assigning an initial extent equal to the max size for your test cases. (Either in the db creation script, or define once and then just trancate the tables with the reuse storage option.) Then when you run the test cases, the db doesn't have to allocate additional storage space.
I faced a similar issue and I was able to speed up the unit tests by moving the redo logs, undo and users tablespace to a RAM disk. There is a free version of ramdisk software available for you to try out. Commercial versions that back up the files periodically are also very cheap.
In my case the unit tests only verify data integrity and hence this strategy is low risk even though it is does not replicate production setup. We have a separate scale and performance test stgrategy.
You can incorporate rebuilding of your table indexes as part of the test run. Choose to either take the time hit of rebuilding your indexes prior to the run or after the run. You'll eat up the same total amount of time, but you'll "feel" it less if you rebuild them after the test run.
ALTER INDEX index_name REBUILD
Will rebuild the indexes without dropping and re-creating them.

How do I deploy an Oracle database?

I have an ASP .NET application that connects to an Oracle or a SQL Server database. An installer has been developed to install a fresh database to an existing SQL Server using sql commands such as "restore database..." which simply restores a ".bak" file which we keep under source control.
I'm very new to Oracle and our application has only recently been ported to be compatible with 10g.
We are currently using the "exp.exe" tool to generate a ".dmp" file and then using the "imp.exe" to import it into a developers box.
How would you go about creating an "Oracle Database Installer"?
Would you create the database using script files and then populate the database with required default data?
Would you run the "imp.exe" tool behind the scenes?
Do we need to provide a clean interface for system administrators so that they can just select the destination server and have done, or should we just provide them with the ".dmp" file? What are the best practices?
Thanks.
The question is -- what do your customers know about Oracle?
Nothing? You should probably rethink this position. Oracle is very large and complex. If you assume your customers know nothing, you'll then start providing tutorials and help that's inappropriate.
Minimally Competent? If they're competent, they know enough to run imp by themselves. Also, they know enough to run a script that executes SQL.
Actual DBA's? Most organizations that can afford Oracle can afford real DBA's. Real DBA's can cope with a lot of things -- they do not need much hand-holding. Some of them like to assign storage parameters according to their shop standards.
You should provide a script with reasonable defaults. You should define your script in a way that someone can easily find all of your storage parameters and tweak them if necessary.
Your initial data can be via export/import or via a script. I prefer a script.
I have done this repeatedly from both sides (consumer and provider) as a DBA, developer, and architect.
As a provider, one of my grand accomplishments (in 1996) was the creation of an installation CD for a commercial insurance claims management software product targeted to the largest insurance carriers (a multi-million dollar item). That installation CD installed the Oracle 7.2 RDBMS engine, the FileNet optical storage system (scans paper documents and creates cataloged binary versions), and our custom claim-processing application (built in VB 4.0), all integrated and ready to run. As part of the installation process, the user could skip the Oracle software installation or customize it, and the user could customize/override the database configuration in all of its major details (database, schemas, tablespaces, sizes, disks, etc.).
I also provided the field service for this product, which included traveling to the client site as necessary. I tested the installation CD literally hundreds of times under every imaginable scenario that I could replicate, and we NEVER had a field failure that required even a phone call, let alone a trip (I did travel on four occasions, but for pre-sales stuff instead).
More recently (2007), I scripted the creation of an Oracle 10g database for an internal system at a megacorp. In production, the database was sized at 8 TB, mostly for a single transaction table with high data volume. In test, the database was sized around 1 TB for a modest server. In development, the database was sized around 100 MB to run on my laptop. The EXACT SAME SCRIPTS created all three environments, and I could extend them to handle a new environment/machine in about five minutes. This database involved extreme performance tuning, so customization of all pertinent characteristics was absolutely crucial.
Back to the insurance claims processing product--let me please add that I was originally hired to lead its conversion from a SQL Server database to an Oracle database. That conversion was identified as a business necessity because most potential clients did not view a SQL-Server-based product as a professional, serious solution. That is not quite as common today, but it still applies in general: a software product has a better chance of market penetration if it can accommodate multiple database options as preferred by the target customers (especially enterprise-class customers).
Likewise, the installation CD was also viewed as an essential element. However, that situation and many more have revealed to me that most "real" DBAs will not accept an import-based database installation. As a DBA and architect, I know that I definitely will not for the same reasons.
Simply put, an import-based database installation gives the customer almost no control over the resulting database. It is opaque to the customer, leaving them questioning what it did. It forces the customer to expend massive efforts to attempt to exercise what little control they can. It is notoriously fragile and error-prone (Oracle imports are well known for ownership and permission problems, constraint problems, etc.). Weighing all those impacts, an import-based database installation is unprofessional--it does not put the customers' needs first.
Scripting the database installation provides the right kind of transparency, configurability, selective repeatability, and overall customer control that professionalism demands. It also encourages you to properly understand the impacts of your database design decisions in a way that an import does not.
Best wishes.
Personally I favour SQL scripts to database creation and data loads where possible. I tend to use PL/SQL Developer. It has some good options to generate scripts from an existing database. Once you have these you can run the scripts using sqlplus or any application code that can execute arbitrary SQL (eg JDBC with Java). Toad is the more common (and more expensive) tool for Oracle development.
The only limitation of a SQL export is it can't export CLOB/BLOB fields. If you have those, you either need to do them separately (as a PL/SQL export) or do the whole thing as a PL/SQL export. Theres no dramas with this except the file is effectively a binary export (extension .pde) and is more limited in how you can execute it.
The other big advantage of SQL source files is they can be version controlled easily. It's really handy to be able to create a database environment by running one or two scripts.
The import and export tools for Oracle I think are more applicable for backup and restore operations.
Now, as for delivering that to a customer, from your comments it seems that you'll be giving this to DBAs. Pretty much any Oracle installation will have DBAs involved. They will be fine with SQL scripts to create the schema and do the data load. They will be doing a lot of site-specific configuration (eg tuning the SGA, temp tablespaces, # of concurrent connections, etc based on expected load).
You, as the vendor, can give guidance on any relevant configuration and you may get involved in support and possibly installation but ultimately it's up to them to figure out what works for them. Oracle runs on a large number of operating systems and hardware variants with infinite variations in network topology and firewall configuraiton. You can't factor in all of these to an installer or even a set of instructions (other than the guidelines mentioned previously).
The last time I was involved in the creation of a (oracle) db (for a reasonably large company with in-house DBAs) the DBAs wanted to know things like:
what we wanted to call the db,
what tablespaces we would need, and an estimate of how much data would be in each one
how many users would be connecting.
(From memory) they set up the db and tablespaces, then we provided a combination of simple scripts that they could run (or clear instructions if a task wasn't easy to automate)
As I say this was for an in-house app, so your mileage may vary, but in my case they wanted all instructions clearly spelt out so that (a) there was no possibily of a misunderstanding leading to the wrong thing being done, and (b) no culpability on their part if something didn't work ("we were just following the instructions")

Maintaining Multiple Databases Across Several Platforms

What's the best way to maintain a multiple databases across several platforms (Windows, Linux, Mac OS X and Solaris) and keep them in sync with one another? I've tried several different programs and nothing seems to work!
I think you should ask yourself why you have to go through the hassle of maintaining multiple databases across several platforms and have them in sync with one another. Sounds like there's a lot of redundancy there. Why not just have one instance of that database, since I'm sure it can be made accessible (e.g. via SOA approach) to multiple apps on multiple platforms anyway?
Why go through the hassle? Management claims it's more expensive?
Here's how to prove them wrong.
Pick one database, call it the "master" or "system of record".
Write scripts to export data from the master and load it into your copies. If you have a nice database (MySQL, SQL/Server, Oracle or DB2) there are nice tools to do this replication for you. If you have a mixture of databases, you'll have to resort to exporting changed data and reloading changed data. The idea is that this is a 1-way copy: master to replicants.
Fix each application, one at a time, to do updates in the master database only. Since each application has a JDBC (or ODBC or whatever) connection to a database, it can just as easily be a connection to the master database.
Once you've fixed the applications to update only the master, the replicas are worthless. Management can insist that it's cheaper to have them. And there they are -- clones of the master database -- just what management says you must have.
Your life is simpler because the apps are only updating the system of record. They're happy because you have all the cloned databases lying around.

How do you manage schema upgrades to a production database?

This seems to be an overlooked area that could really use some insight. What are your best practices for:
making an upgrade procedure
backing out in case of errors
syncing code and database changes
testing prior to deployment
mechanics of modifying the table
etc...
Liquibase
liquibase.org:
it understands hibernate definitions.
it generates better schema update sql than hibernate
it logs which upgrades have been made to a database
it handles two-step changes (i.e. delete a column "foo" and then rename a different column to "foo")
it handles the concept of conditional upgrades
the developer actually listens to the community (with hibernate if you are not in the "in" crowd or a newbie -- you are basically ignored.)
http://www.liquibase.org
opinion
the application should never handle a schema update. This is a disaster waiting to happen. Data outlasts the applications and as soon as multiple applications try to work with the same data ( the production app + a reporting app for example) -- chances are they will both use the same underlying company libraries... and then both programs decide to do their own db upgrade ... have fun with that mess.
I am a big fan of Red Gate products that help creating SQL packages to update database schemas. The database scripts can be added to source control to help with versioning and rollback.
In general my rule is: "The application should manage it's own schema."
This means schema upgrade scripts are part of any upgrade package for the application and run automatically when the application starts. In case of errors the application fails to start and the upgrade script transaction is not committed. The downside to this is that the application has to have full modification access to the schema (this annoys DBAs).
I've had great success using Hibernates SchemaUpdate feature to manage the table structures. Leaving the upgrade scripts to only handle actual data initialization and occasional removing of columns (SchemaUpdate doesn't do that).
Regarding testing, since the upgrades are part of the application, testing them becomes part of the test cycle for the application.
Afterthought: Taking on board some of the criticism in other posts here, note the rule says "it's own". It only really applies where the application owns the schema as is generally the case with software sold as a product. If your software is sharing a database with other software, use other methods.
That's a great question. ( There is a high chance this is going to end up a normalised versus denormalised database debate..which I am not going to start... okay now for some input.)
some off the top of my head things I have done (will add more when I have some more time or need a break)
client design - this is where the VB method of inline sql (even with prepared statements) gets you into trouble. You can spend AGES just finding those statements. If you use something like Hibernate and put as much SQL into named queries you have a single place for most of the sql (nothing worse than trying to test sql that is inside of some IF statement and you just don't hit the "trigger" criteria in your testing for that IF statement). Prior to using hibernate (or other orms') when I would do SQL directly in JDBC or ODBC I would put all the sql statements as either public fields of an object (with a naming convention) or in a property file (also with a naming convention for the values say PREP_STMT_xxxx. And use either reflection or iterate over the values at startup in a) test cases b) startup of the application (some rdbms allow you to pre-compile with prepared statements before execution, so on startup post login I would pre-compile the prep-stmts at startup to make the application self testing. Even for 100's of statements on a good rdbms thats only a few seconds. and only once. And it has saved my butt a lot. On one project the DBA's wouldn't communicate (a different team, in a different country) and the schema seemed to change NIGHTLY, for no reason. And each morning we got a list of exactly where it broke the application, on startup.
If you need adhoc functionality , put it in a well named class (ie. again a naming convention helps with auto mated testing) that acts as some sort of factory for you query (ie. it builds the query). You are going to have to write the equivalent code anyway right, just put in a place you can test it. You can even write some basic test methods on the same object or in a separate class.
If you can , also try to use stored procedures. They are a bit harder to test as above. Some db's also don't pre-validate the sql in stored procs against the schema at compile time only at run time. It usually involves say taking a copy of the schema structure (no data) and then creating all stored procs against this copy (in case the db team making the changes DIDn't validate correctly). Thus the structure can be checked. but as a point of change management stored procs are great. On change all get it. Especially when the db changes are a result of business process changes. And all languages (java, vb, etc get the change )
I usually also setup a table I use called system_setting etc. In this table we keep a VERSION identifier. This is so that client libraries can connection and validate if they are valid for this version of the schema. Depending on the changes to your schema, you don't want to allow clients to connect if they can corrupt your schema (ie. you don't have a lot of referential rules in the db, but on the client). It depends if you are also going to have multiple client versions (which does happen in NON - web apps, ie. they are running the wrong binary). You could also have batch tools etc. Another approach which I have also done is define a set of schema to operation versions in some sort of property file or again in a system_info table. This table is loaded on login, and then used by each "manager" (I usually have some sort of client side api to do most db stuff) to validate for that operation if it is the right version. Thus most operations can succeed, but you can also fail (throw some exception) on out of date methods and tells you WHY.
managing the change to schema -> do you update the table or add 1-1 relationships to new tables ? I have seen a lot of shops which always access data via a view for this reason. This allows table names to change , columns etc. I have played with the idea of actually treating views like interfaces in COM. ie. you add a new VIEW for new functionality / versions. Often, what gets you here is that you can have a lot of reports (especially end user custom reports) that assume table formats. The views allow you to deploy a new table format but support existing client apps (remember all those pesky adhoc reports).
Also, need to write update and rollback scripts. and again TEST, TEST, TEST...
------------ OKAY - THIS IS A BIT RANDOM DISCUSSION TIME --------------
Actually had a large commercial project (ie. software shop) where we had the same problem. The architecture was a 2 tier and they were using a product a bit like PHP but pre-php. Same thing. different name. anyway i came in in version 2....
It was costing A LOT OF MONEY to do upgrades. A lot. ie. give away weeks of free consulting time on site.
And it was getting to the point of wanting to either add new features or optimize the code. Some of the existing code used stored procedures , so we had common points where we could manage code. but other areas were this embedded sql markup in html. Which was great for getting to market quickly but with each interaction of new features the cost at least doubled to test and maintain. So when we were looking at pulling out the php type code out, putting in data layers (this was 2001-2002, pre any ORM's etc) and adding a lot of new features (customer feedback) looked at this issue of how to engineer UPGRADES into the system. Which is a big deal, as upgrades cost a lot of money to do correctly. Now, most patterns and all the other stuff people discuss with a degree of energy deals with OO code that is running, but what about the fact that your data has to a) integrate to this logic, b) the meaning and also the structure of the data can change over time, and often due to the way data works you end up with a lot of sub process / applications in your clients organisation that needs that data -> ad hoc reporting or any complex custom reporting, as well as batch jobs that have been done for custom data feeds etc.
With this in mind i started playing with something a bit left of field. It also has a few assumptions. a) data is heavily read more than write. b) updates do happen, but not at bank levels ie. one or 2 a second say.
The idea was to apply a COM / Interface view to how data was accessed by clients over a set of CONCRETE tables (which varied with schema changes). You could create a seperate view for each type operation - update, delete, insert and read. This is important. The views would either map directly to a table , or allow you to trigger of a dummy table that does the real updates or inserts etc. What i actually wanted was some sort of trappable level indirection that could still be used by crystal reports etc. NOTE - For inserts , update and deletes you could also use stored procs. And you had a version for each version of the product. That way your version 1.0 had its version of the schema, and if the tables changed, you would still have the version 1.0 VIEWS but with NEW backend logic to map to the new tables as needed, but you also had version 2.0 views that would support new fields etc. This was really just to support ad hoc reporting, which if your a BUSINESS person and not a coder is probably the whole point of why you have the product. (your product can be crap but if you have the best reporting in the world you can still win, the reverse is true - your product can be the best feature wise, but if its the worse on reporting you can very easily loose).
okay, hope some of those ideas help.
These are all weighty topics, but here is my recommendation for updating.
You did not specify your platform, but for NANT build environments I use Tarantino. For every database update you are ready to commit, you make a change script (using RedGate or another tool). When you build to production, Tarantino checks if the script has been run on the database (it adds a table to your database to keep track). If not, the script is run. It takes all the manual work (read: human error) out of managing database versions.
I've heard good things about iBATIS 3 Schema Migrations System:
User Guide: http://svn.apache.org/repos/asf/ibatis/java/ibatis-3/trunk/doc/en/iBATIS-3-Migrations.pdf
As Pat said, use liquibase. Especially when you have several developers with their own dev databases
making changes that will become part of the production database.
If there's only one dev, as on one project I'm on now(ha), I just commit the schema changes as SQL text files into a CVS repo, which I check out in batches on the production server when the code changes go in.
But liquibase is better organized than that!

Resources