I have been assigned to a team that uses Liquibase to version its database. I am new to the environment and have never worked with Liquibase before, so bear with me. The developers raised a problem, and I would like to find the best practice for handling it.
The issue is that developers add unintended changes (spaces or new lines) to a Liquibase script that is already in the repo, do not notice, and push them; Liquibase then sees the script as changed. This was an example one of the developers gave me. I know it sounds avoidable, since developers should be paying attention and git has ways of preventing this, but I was given the task of creating some way to roll back if this issue arises.
Is creating a rollback procedure the best approach, or is there some way to implement a block so that this cannot happen, and if so, what? Would the block be on the DB side, or is there something in Liquibase that can prevent this? Also, since I think a rollback procedure is important for those one-off accidents, what is the best way to create one for Liquibase, as in best practice for a production-level environment?
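One possible shape for the "block" asked about above, as a minimal sketch and not a Liquibase feature: a CI step hashes every changelog file and compares it with a manifest of hashes recorded when the file was first merged. The manifest format, the file names and both function names are assumptions made for illustration.

```python
import hashlib


def file_digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of a changelog file's raw bytes."""
    return hashlib.sha256(content).hexdigest()


def changed_applied_files(manifest: dict[str, str], current: dict[str, bytes]) -> list[str]:
    """List files whose content no longer matches the recorded digest.

    manifest: hypothetical mapping of file name -> digest recorded at merge time.
    current:  mapping of file name -> bytes as found in the branch under review.
    Files missing from the manifest are new and therefore allowed.
    """
    return sorted(
        name
        for name, content in current.items()
        if name in manifest and manifest[name] != file_digest(content)
    )


# Example: a trailing newline slipped into an already-merged changelog.
original = b"CREATE TABLE person (id INT PRIMARY KEY);"
manifest = {"001-create-person.sql": file_digest(original)}
branch = {
    "001-create-person.sql": original + b"\n",  # accidental whitespace edit
    "002-add-email.sql": b"ALTER TABLE person ADD email VARCHAR(255);",  # new file
}
print(changed_applied_files(manifest, branch))  # ['001-create-person.sql']
```

A CI job could fail the build whenever the returned list is non-empty, so the accidental edit never reaches the database.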
I am aware of some potential solutions, but they all feel awful to me.
1. In the pipeline (GitHub Actions), run a one-off task on Fargate to migrate the DB before the deployment.
2. Publish some kind of CloudFormation event as a deploy hook and use it as a Lambda trigger; the Lambda will do the migration.
3. Leverage Laravel crons with onOneServer() to continually check whether a migration is necessary.
4. [problem, no good] Docker entrypoint command to run DB migrations on task startup. (Bad: all instances will probably try to migrate the DB in quick succession.)
Each of these has various things I dislike.
Solution 1: this migrates the DB and then deploys. If the deploy fails, the DB is already migrated, and to fix it I would have to somehow run a DB migration rollback in the pipeline after the failure. It also feels really bad to rely on one-off tasks through the pipeline in general.
Solution 2: this has more moving parts than I think should be necessary, with multiple points of failure: the CloudFormation event and the Lambda function. Also, the deploy would presumably be the event trigger, which means the deploy could succeed but the Lambda DB migration fail, with the pipeline unaware, thus requiring a manual rollback of the deployment.
Solution 3: this feels hacky, and yet it seems to have the fewest moving parts and the least entropy. The major downside is that I think it would essentially require a once-a-minute cron spamming php artisan migrate (usually with nothing to migrate) so that it catches deploys with migrations. The benefit is that with onOneServer() it should actually solve the concern: we don't want multiple instances to all try to migrate the database on a deploy, just one. It also links the deploy and the migrations: if the deploy fails, there is no migration yet, and if the migration fails, it is at least easy to roll the task back to the older task version. The resource overhead of running php artisan migrate each minute when it has nothing to migrate should be very small, if noticeable at all. Still, it bothers me very much how inefficient it is resource-wise.
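The "just one instance should migrate" concern can be illustrated with a shared atomic lock. This is a minimal sketch and not Laravel's onOneServer() implementation: it uses an in-memory SQLite table as a stand-in for whatever shared store the instances can all reach, and the table, lock and instance names are made up for the example.

```python
import sqlite3


def try_acquire(conn: sqlite3.Connection, name: str, owner: str) -> bool:
    """Atomically claim a named lock; return False if it is already held."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO locks (name, owner) VALUES (?, ?)", (name, owner))
        return True
    except sqlite3.IntegrityError:
        # The primary key already exists, so another instance holds the lock.
        return False


def release_lock(conn: sqlite3.Connection, name: str, owner: str) -> None:
    """Release the lock, but only if this owner actually holds it."""
    with conn:
        conn.execute("DELETE FROM locks WHERE name = ? AND owner = ?", (name, owner))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE locks (name TEXT PRIMARY KEY, owner TEXT NOT NULL)")

# Three instances start after a deploy and all try to run the migration.
results = {}
for instance in ("task-a", "task-b", "task-c"):
    results[instance] = try_acquire(conn, "migrate", instance)
print(results)  # {'task-a': True, 'task-b': False, 'task-c': False}
```

Only the instance that gets True would go on to run the migration; the others skip it for this round.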
Is there another solution out there? I am anticipating someone may suggest to me to control instances with env variables, but I also don't want to do that. If we deploy and have 3 instances running, they should all be updated and they are all 'the same' instance states. Otherwise, I'd have to make a 2nd service that also runs 24/7 to check for migrations as its own special job. I guess that is solution 5:
5. Have a separate service task from the request-handling instances that runs 24/7 and whose only job is to run crons and migrate the DB after deploys. This also sucks, though, because you have a task running 24/7 to check for deploys, which are not that frequent.
I think solution 3 is my preferred solution, despite its resource overhead. I would love to hear some insight from others on this problem. I am in a situation where this pipeline really should be easy for non-ops-people to deal with if I get hit by a bus. Keeping it simple inside of the laravel app code seems like it fits that requirement. I know there are scheduled task / cloudformation event solutions, but keep in mind I have a big goal of as little entropy / moving parts as possible, within reason.
I have read every single blog post and every single google hit I can find on this subject, and have not found a clear obvious answer. I've come up with solution 3 myself and don't see it suggested anywhere.
Possibly, automated DB migrations in all circumstances are too ambitious, and a manual process should be developed and followed, especially if a DB migration contains a change that won't work on the old instances: migrating before the deploy would break those temporarily.
Running database migrations before deployment (option 1) is the industry standard & what you should be doing, regardless of your cloud platform, database engine or application language.
The short and long answer is that DB migrations are there for fault tolerance - if for whatever reason you need to reverse your deployment, you know exactly what has happened to be able to roll back.
Most ORMs and migration tools (e.g. Entity Framework for .NET or Liquibase for Java) allow you to roll back a migration with a simple command. Laravel's migrations for PHP can likewise be rolled back using php artisan migrate:rollback.
A step in your pipelines before deployment should apply the database migrations. If deployment then fails for any reason, you should manually roll back.
This is the intersection of your application & the database at an infrastructure level - unfortunately, expect some manual work to be needed if something fails.
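The ordering described above (migrate, then deploy, then roll back on failure) can be sketched as a small pipeline helper. This is an illustration under assumptions, not a definitive pipeline: the deploy command ./deploy.sh is hypothetical, and the rollback is automated here only to show the sequence, whereas the answer suggests doing that step manually.

```python
import subprocess
from typing import Callable, Sequence

Command = Sequence[str]

# Artisan commands; --force skips the confirmation prompt in production.
MIGRATE = ["php", "artisan", "migrate", "--force"]
ROLLBACK = ["php", "artisan", "migrate:rollback", "--force"]


def run(cmd: Command) -> bool:
    """Run a command and report whether it exited with status 0."""
    return subprocess.run(cmd).returncode == 0


def migrate_then_deploy(deploy: Command, runner: Callable[[Command], bool] = run) -> str:
    """Migrate first, then deploy; roll the migration back if the deploy fails."""
    if not runner(MIGRATE):
        return "migration failed, nothing deployed"
    if runner(deploy):
        return "released"
    if runner(ROLLBACK):
        return "deploy failed, migration rolled back"
    return "deploy failed, rollback failed: manual fix needed"


# Dry run with a fake runner that pretends every command succeeds.
print(migrate_then_deploy(["./deploy.sh"], runner=lambda cmd: True))  # released
```

Passing the runner in as a parameter keeps the ordering logic testable without touching a real database or deployment.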
Use the database migration commands:
php artisan migrate:fresh
This drops all tables and then runs all migrations again.
php artisan migrate:refresh
This rolls back all migrations and then runs them again.
php artisan migrate:rollback
This rolls back the last batch of migrations.
We are planning to move our application, which uses Flyway with Spring Boot, to production. Most of the time we face a validation exception during application start:
org.flywaydb.core.api.FlywayException: Validate failed:
Migration checksum mismatch for migration version .
To recover from this exception we need to correct the data in the database or, as a last resort, reset the database. Fighting this exception once we move to production will be a nightmare, so we want to follow best practice for configuring Flyway in production. We would like an answer from an expert who has used Flyway in production for several years. Thank you.
What is it?
The checksum validation in Flyway is basically a comparison between the checksum of the current migration file in your app and the checksum recorded when the same migration was run in the past. You can check this list in your database, in the flyway_schema_history table created and used by Flyway.
What does it mean?
It means that the script your app has when it starts is not the same as the one Flyway applied in the past, and since Flyway can't figure out whether that is correct or not, it fails. Ideally, you should never change a script you have already applied; you should always evolve the schema by creating new ones. That's the whole idea of migrations.
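The comparison described above can be mimicked in a few lines. This is a simplified model, not Flyway's code: CRC32 is used as a stand-in checksum (Flyway's exact algorithm may differ), and the history dictionary plays the role of the schema history table.

```python
import zlib


def checksum(script: str) -> int:
    """CRC32 of the script text; a stand-in, not necessarily Flyway's algorithm."""
    return zlib.crc32(script.encode("utf-8"))


def validate(history: dict[str, int], scripts: dict[str, str]) -> None:
    """Raise if an already-applied version no longer matches its recorded checksum."""
    for version, recorded in sorted(history.items()):
        if version in scripts and checksum(scripts[version]) != recorded:
            raise ValueError(f"Migration checksum mismatch for migration version {version}")


# First start: version 1 is applied and its checksum is recorded.
scripts = {"1": "CREATE TABLE users (name VARCHAR(50));"}
history = {version: checksum(sql) for version, sql in scripts.items()}
validate(history, scripts)  # passes: nothing has changed

# Later someone edits the applied script instead of adding version 2.
scripts["1"] = "CREATE TABLE users (name VARCHAR(60));"
try:
    validate(history, scripts)
except ValueError as err:
    print(err)  # Migration checksum mismatch for migration version 1
```

Had the change been shipped as a new version 2 script, version 1 would still match its recorded checksum and validation would pass.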
How to avoid it?
As said before, you should never change scripts that have already been executed. You should always create new ones. Of course, it can still happen in a dev environment, when you figure out that changes are needed.
Extra
I see that your error message says version ., which probably means that you haven't defined a proper name for your migration script. Flyway's default pattern is V<version>__<description>.sql, and a common good practice is a timestamped version such as VyyyyMMdd_HHmmss__action_you_performed_on_your_database; Flyway stores the two parts in its table as version and description.
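A timestamped name of the kind suggested above can be generated rather than typed by hand. The helper below is a hypothetical convenience, not part of Flyway; it only builds a file name of the form V<timestamp>__<description>.sql.

```python
import re
from datetime import datetime


def migration_filename(description: str, when: datetime) -> str:
    """Build a name like V20240131_093000__add_email_to_users.sql."""
    # Lower-case the description and collapse anything non-alphanumeric to "_".
    slug = re.sub(r"[^a-z0-9]+", "_", description.lower()).strip("_")
    return f"V{when:%Y%m%d_%H%M%S}__{slug}.sql"


print(migration_filename("Add email to users", datetime(2024, 1, 31, 9, 30, 0)))
# V20240131_093000__add_email_to_users.sql
```

In day-to-day use the second argument would be datetime.now(), so that each new script sorts after the ones already applied.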
Based on my experience using Flyway:
Flyway compares the checksum of your SQL script with the checksum recorded when it was previously run. This exception usually happens if you edit an SQL script that has already been applied by Flyway, causing a discrepancy in the checksum.
In a development environment, you can delete your database and run the migrations again.
In a production environment, you must never edit SQL scripts that have already been applied. Create new SQL scripts for any further changes instead.
At work we use Oracle (12c client) to store most of our data and I use SQL Developer to connect to the database environments.
Issue:
We have issues where tables are modified for one reason or another (people are too lazy to create a new table, so they add new columns and change data types or lengths). This in turn breaks the table for others who actually use it for its real purpose.
Update:
We have DEV, TST, UAT, and PRD environments. We test and have scripts approved before we promote to PRD. The problem lies in DEV: when we want to go back to an existing table to make a change, that table has already been modified for different reasons.
Question 1:
Is the versioning just for stored procedures or is it possible to track changes to table structures, functions, triggers, sequences, synonyms, etc.?
As Bob Jarvis indicates you need way more than a solution to your question. You need policies and practices enforced for all developers. Some ideas from places I have worked:
Every developer has a VM with a copy of the database installed. They can do whatever they like on it but must supply scripts to move their changes to production. These scripts are applied on a test instance and again on a QA instance before going to production.
Subversion works on all operating systems and Tortoise works well on Windows. Committing scripts to a repository works well; this is integrated with SQL Developer and can be done with Toad.
You have a permissions issue. Too many people have the privileges to alter tables. Remove these permissions and centralize them on one or two people. Changes are funnelled through them as scripts, and oversight can be applied there. Developers can have their own schema to test in, or a VM with a copy for development.
Run this script to see who can alter tables:
select * from DBA_TAB_PRIVS
WHERE PRIVILEGE = 'ALTER';
The key is a separation of concerns. Developers should have access to a schema where they can do what they need. The company needs to know who did what, when and where.
If you have more than one developer working on multiple changes to a dev environment then you need coordination and communication as well as source control. A weekly meeting to discuss overlap areas or a heads up chat message are just some ways to work together.
The approach I think works best is to have a DEV database where all the developers manage their own set of schemas.
Scripted builds are provided with test data loads to allow any developer to create his own working schema. He then works on there, tests his changes and then commits his changes via scripts to the source control. DEV databases do not need to be large, just need enough test cases to allow for unit tests.
Script all the changes so that they can be checked into a version control system, and merged with other changes. The goal is to have a system where devA checks in changeA, and then when merged with the main trunk, devB gets changeA as he builds his schemaA.
This approach requires care if the main project schema employs PUBLIC synonyms. You will need to consider this as you go forward.
I would also advise that each change checked in be accompanied by a back-out script.
The advantage of this approach is that devs can manage their own schemas. With a scripted approach they don't all need to have DBA knowledge, and they don't need to manage the database either. Having all of these on one database makes it easier to manage and control resources.
I've used this approach in teams with 50+ developers and it has worked very well.
This approach also paves the way for having devs check scripts in and having a deployment package created automatically.
There is so much that can be done to make the development-test-deploy-backout cycle easier to manage.
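The advice above to check in a back-out script with every change can be enforced mechanically. The check below is a small sketch built on an assumed naming convention (name.sql paired with name.backout.sql); the convention and the function name are illustrative, not something the approach prescribes.

```python
def missing_backouts(filenames: list[str]) -> list[str]:
    """Return change scripts that have no matching back-out script.

    Assumed (hypothetical) convention: 'name.sql' pairs with 'name.backout.sql'.
    """
    names = set(filenames)
    changes = [
        f for f in names
        if f.endswith(".sql") and not f.endswith(".backout.sql")
    ]
    # Strip the trailing ".sql" and look for the paired back-out file.
    return sorted(f for f in changes if f[:-4] + ".backout.sql" not in names)


checked_in = [
    "001_add_status_column.sql",
    "001_add_status_column.backout.sql",
    "002_new_index.sql",  # no back-out script was checked in for this one
]
print(missing_backouts(checked_in))  # ['002_new_index.sql']
```

Run as a commit hook or CI step, a non-empty result would reject the check-in until the back-out script is added.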
I am just wondering how Flyway deals with the fact that several dynos can try to run a database migration several times in a clustered environment such as Heroku thereby causing a conflict?
It seems Flyway uses locking in order to deal with this issue.
To quote the documentation:
Can multiple nodes migrate in parallel? Yes! Flyway uses the locking
technology of your database to coordinate multiple nodes. This ensures
that even if even multiple instances of your application attempt to
migrate the database at the same time, it still works. Cluster
configurations are fully supported.
This question explains a bit more about how the locking behaviour works. It appears to acquire a lock on the Flyway schema table with select * from dbschema.schema_version for update, which might cause problems for longer-running migrations, as the innodb_lock_wait_timeout setting might cause a timeout, at least in the case of MySQL.
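The effect of that lock can be modelled with threads. This is a toy simulation under assumptions, not Flyway's code: a threading.Lock stands in for the database lock on the schema table, and a dictionary stands in for the table itself. Each node waits for the lock, re-reads the history, and applies only what is still missing.

```python
import threading

history: dict[int, str] = {}    # stands in for the schema history table
table_lock = threading.Lock()   # stands in for the database lock on that table
PENDING = [1, 2, 3]             # migration versions shipped with the application


def migrate(node: str) -> None:
    """Take the lock, then apply only the versions that are still missing."""
    with table_lock:
        for version in PENDING:
            if version not in history:
                history[version] = node  # record which node applied it


nodes = [threading.Thread(target=migrate, args=(f"node-{i}",)) for i in range(4)]
for thread in nodes:
    thread.start()
for thread in nodes:
    thread.join()

# Every version was applied exactly once, whichever node got the lock first.
print(sorted(history))  # [1, 2, 3]
```

The nodes that lose the race simply wait; how long they can wait is where a setting like innodb_lock_wait_timeout becomes relevant for long-running migrations.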
I've just started learning .NET MVC so this may be a silly question, but I've yet to find a good answer.
I'm following the Code First approach using the Entity Framework to build my database for me. I've included the following in my Application_Start() method in order to allow me to edit my database by making changes to my Model objects.
Database.SetInitializer<ContactManagerDB>(new DropCreateDatabaseIfModelChanges<ContactManagerDB>());
I was just wondering what would happen if I pushed this application to a production environment and then made a few changes to my models and then updated the application? Would this really drop and recreate the database in the production environment?
What's the best practice for pushing changes to production env. using the Code First approach?
DropCreateDatabaseIfModelChanges should only be used early on in development, never on a production machine. If you pushed to a production machine and made schema changes, you'd lose all your data.
You could delete the EdmMetadata table in your production environment. In that case, EF would not know the current schema to compare to the new, so it would just assume you know what you are doing and it would not touch the database schema.
Code First on its own (that is, without Code First Migrations, which were added in EF 4.3) does not have the ability to upgrade your database while keeping your data intact.