I just started with Elasticsearch and I want to automate migrations between code versions.
For RDBMS I use tools like phinx that apply changes to the DB.
For example:
Create a migration file with up() & down() methods.
Write the commands to apply (for example, add an index).
After tests etc., run ./phinx migrate.
Is there a migration tool like this?
If not, is there another acceptable approach to handle changes to the cluster?
I have never heard of a tool like that specifically for ES indexes.
If your goal is to update the internal representation of your data, I think the best approach is just to create a script that:
Finds the affected documents
Reads their contents
Modifies them
Reindexes them as new documents
Then you can delete the old documents.
Updating a doc won't be more efficient than reindexing, since documents are immutable: an update is just a get + reindex (https://www.elastic.co/guide/en/elasticsearch/guide/current/update-doc.html).
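A minimal sketch of such a script in Python, using the official elasticsearch client; the index names, query, and transform() logic are placeholders you would adapt to your own data:

from elasticsearch import Elasticsearch, helpers

es = Elasticsearch("http://localhost:9200")

def transform(source):
    # modify the document body here
    source["migrated"] = True
    return source

# stream matching docs from the old index and bulk-index them into the new one
actions = (
    {"_index": "my_index_v2", "_id": hit["_id"], "_source": transform(hit["_source"])}
    for hit in helpers.scan(es, index="my_index_v1", query={"query": {"match_all": {}}})
)
helpers.bulk(es, actions)

# once the new index is verified, remove the old one (or just the old documents)
es.indices.delete(index="my_index_v1")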
Flyway with code-based (e.g. Java) migrations can be used to work with any data store. It is similar to migrating a relational DB, but requires a bit more work since you need to implement the calls to Elasticsearch with the relevant commands (e.g. create index) yourself.
https://flywaydb.org/documentation/concepts/migrations.html#java-based-migrations
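As a rough illustration (not an official example), a Java-based migration that creates an index could look something like this; the class name, index name, mapping, and Elasticsearch URL are placeholders:

package db.migration;

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import org.flywaydb.core.api.migration.BaseJavaMigration;
import org.flywaydb.core.api.migration.Context;

public class V2__Create_products_index extends BaseJavaMigration {
    @Override
    public void migrate(Context context) throws Exception {
        // create the index via Elasticsearch's REST API
        String mapping = "{\"mappings\":{\"properties\":{\"name\":{\"type\":\"text\"}}}}";
        HttpRequest request = HttpRequest.newBuilder(URI.create("http://localhost:9200/products"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(mapping))
                .build();
        HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}

Flyway then tracks the version history for you, just as it does for SQL migrations.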
Coming from an RDBMS background, a migration tool is very handy when you are working on a big project with a lot of migration files. I was facing the same issue with Elasticsearch: there is currently no stable migration tool in the community.
I have created a migration tool that will be handy if you are coming from a Python background: https://pypi.org/project/chalan/. The core idea is taken from the Alembic migration tool for SQLAlchemy.
Usage is simple. First install it:
pip install chalan
Then, to upgrade, run:
chalan upgrade
And to downgrade, run:
chalan downgrade
Please let me know if you face any issues with this tool, and feel free to suggest improvements.
For the source code, please refer to the GitHub repo: https://github.com/anandtripathi5/chalan
Hi everyone. I've been following the progress of this issue on GitHub, and I believe all is OK now. I just need you to tell me what to do in my deployments. I've just installed ClickHouse 21.8.9, and I'm trying to run some tests in order to extract data from MongoDB and fill an AggregatingMergeTree table in ClickHouse. I've been reading a lot of the ClickHouse documentation, so I know this is not the only way to accomplish what I want, but it's a valid way, so I want to test it. My ClickHouse installation comes from downloaded DEB files (I'm using Ubuntu 20.04 on my laptop). According to some changes I saw in the ClickHouse repo on GitHub, it seems I might have to re-compile ClickHouse; is that correct? What do you advise me to do? Thanks in advance.
PS: I've just tried with MongoDB 3.6 and MongoDB 4.0; the outcome is the same: it does not accept an empty username, and it cannot authenticate if I use credentials.
Strapi is a powerful tool, but I am unable to find any documentation/instructions that explain the best practices/strategy for migrating content from one environment to another, for example dev to staging to production.
If you have a lot of content, recreating all of it in each environment is not viable.
Please advise how to move the API and DB (Mongo) content.
If there is only a manual workaround for now, we should document it so that it can be used in the meantime.
For now there is no other way than to manually import/export your database:
GitHub issue: Import / export data
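For the Mongo side, a mongodump/mongorestore pair along these lines can move the data (the hosts and database names below are placeholders; the API code itself moves through your normal deployment/version control):

mongodump --host localhost --db strapi-dev --out ./dump
mongorestore --host staging-host --db strapi-staging ./dump/strapi-dev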
I am looking at this Amazon page - https://aws.amazon.com/rds/aurora/serverless/ - and it has this quote:
You pay on a per-second basis for the database capacity you use when
the database is active, and migrate between standard and serverless
configurations with a few clicks in the AWS Management Console.
I have a few normal Aurora clusters and want to switch them to serverless. I have looked and looked and cannot find the "migrate with a few clicks" bit in the AWS Management Console. I made a new serverless cluster just fine, so I could do a stop, backup, and restore with a short outage - but if I can do this without an outage, that would be far superior.
So where are these "few clicks" - or perhaps you will tell me the "few clicks" means stop, backup, and restore. Either way, I think a lot of folks could benefit from knowing which "few clicks" make this happen.
As a comment on #drchuck's approach - we've learned the hard way that AWS Database Migration Service does a bad job of creating the schema in the target database. However, there's a simple workaround:
1) Run mysqldump --no-data to get the exact schema from the source database.
2) Execute the dumped schema on the target database.
3) Within your DMS task, under target table preparation mode, choose "Truncate" instead of "Drop tables on target". (https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Tasks.Creating.html)
With this in place, DMS doesn't create the schema on the target side, and things work pretty well (all existing data is loaded, and then ongoing changes are synced in near real time).
We've used this approach for minimal downtime cutovers a few times.
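For reference, steps 1 and 2 can be as simple as the following (hosts, user, and database name are placeholders):

mysqldump --no-data --routines --triggers -h source-host -u admin -p mydb > schema.sql
mysql -h target-host -u admin -p mydb < schema.sql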
It took more than a while to figure out those few clicks.
I'm here because I too could not find them, and yes, I saw the exact quote on the AWS page you indicated saying that you could.
First you take a snapshot, and then you restore it. In the process of restoring it you can select a serverless instance. (At least under SOME conditions; I do not think a 5.7.12 snapshot can be restored to a serverless configuration - just confirmed, actually.)
I suspect that 5.7.12 support will come in due time.
Right now the magic bullet is to start with a 5.6.10a version, take a snapshot, and then restore that to a serverless instance.
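If you prefer the CLI over the console, the snapshot-and-restore route looks roughly like this (the identifiers are placeholders, and as noted it only works from a 5.6-compatible cluster):

aws rds create-db-cluster-snapshot --db-cluster-identifier my-aurora-cluster --db-cluster-snapshot-identifier my-aurora-snapshot
aws rds restore-db-cluster-from-snapshot --db-cluster-identifier my-serverless-cluster --snapshot-identifier my-aurora-snapshot --engine aurora --engine-mode serverless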
For what it's worth, after a long time:
Apparently Amazon Aurora Serverless is only compatible with MySQL 5.6 - this explains why 5.7 snapshots cannot be restored.
So the two options are:
downgrading the MySQL version to 5.6 first, or
dumping and importing the data (after reading the other answers, I'd go for the second option).
Further reading:
https://aws.amazon.com/rds/aurora/serverless/?nc1=h_ls#How_to_Get_Started
When I did not get an answer in a few days, I did the conversion two ways with different results, so I figured I would share my results here. I would still love to hear a better approach. (1) When I did the conversion using mysqldump and restore, with a short outage, things were fine. (2) When I used AWS Database Migration Service, it went pretty badly.
First, you have to set the binary log format to "ROW" and the retention to 24 hours. That necessitated server restarts on my old clusters. Then, when the data migration worked, I lost all my auto-increments, the NULL/NOT NULL constraints on my columns, the UNIQUE clauses, and the foreign keys in the new tables. Literally the only things that migrated correctly were the actual data and the PRIMARY KEY indications. Also, I would recommend migrating one database (i.e. schema) at a time, and don't try to migrate the MySQL internal schemas. I said "migrate everything" and the migration tool tried to migrate the MySQL stuff - sheesh.
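For the binary log part: the format is changed by setting binlog_format to ROW in the cluster parameter group, and the retention can be set from the MySQL side with the RDS stored procedure (24 hours is the value suggested for DMS):

CALL mysql.rds_set_configuration('binlog retention hours', 24);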
The one thing the AWS Database Migration Service did that was really cool was the migrate-and-monitor feature (made possible by the row-based binary logging). You could watch it moving rows.
Just for the record, AWS amended the quoted documentation in mid-2022 by changing 'few clicks' to 'few steps'.🤣
You pay on a per-second basis for the database capacity that you use
when the database is active, and migrate between standard and
serverless configurations with a few steps in the Amazon Relational
Database Service (Amazon RDS) console.
Currently the documentation states that there are two (multi-step) methods that can be used to migrate from provisioned to serverless, or from serverless to provisioned:
Snapshot restore.
Logical backup and restore.
Details here.
I am searching for a way to run different migrations in production and development.
I want to create a Spring web application with Maven.
In development I want to update the database schema AND load test data.
In production, when a new version of the application is deployed, I want to only change the schema and not load test data.
My first idea was to save the schema updates and the insert statements in different folders.
I think somebody has already solved this problem and can help me; thank you very much.
Basically, you have two options:
You could use different locations for your migrations in your flyway.locations property, e.g.:
for Test
flyway.locations=sql/structure,sql/test
for Production
flyway.locations=sql/structure
That way, you include your test data in the sql/test folder. You would have to take care with numbering, of course.
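If the application uses Spring Boot (an assumption on my part; the same idea works with Maven profiles and resource filtering), the per-environment locations can live in profile-specific property files, for example:

# application-dev.properties
spring.flyway.locations=classpath:sql/structure,classpath:sql/test

# application-prod.properties
spring.flyway.locations=classpath:sql/structure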
The second option (the one I prefer) is to not include test data in your migrations at all.
Rather, create your test data any way you want and make an SQL dump of it, which you keep separate from your migrations.
This works best if you have a separate database (instance, schema, whatever) containing your pristine test data, to which you apply each migration as part of your build process. This build job could then create a dump that always matches the current migration.
When preparing your test machine, you first apply your migrations, then you load the contents of the matching dump.
I think this is a lot cleaner than the first version, especially because your test data can be prepared using other tools (your application) and does not have to be handcoded.
I've just started learning .NET MVC so this may be a silly question, but I've yet to find a good answer.
I'm following the Code First approach using the Entity Framework to build my database for me. I've included the following in my Application_Start() method in order to allow me to edit my database by making changes to my Model objects.
Database.SetInitializer<ContactManagerDB>(new DropCreateDatabaseIfModelChanges<ContactManagerDB>());
I was just wondering what would happen if I pushed this application to a production environment and then made a few changes to my models and then updated the application? Would this really drop and recreate the database in the production environment?
What's the best practice for pushing changes to production env. using the Code First approach?
DropCreateDatabaseIfModelChanges should only be used early on in development, never on a production machine. If you pushed it to a production machine and made schema changes, you'd lose all your data.
You could delete the EdmMetadata table in your production environment. In that case, EF would not know the current schema to compare the new one against, so it would just assume you know what you are doing and would not touch the database schema.
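You can also disable initialization entirely in production (for example in a production-specific configuration path), so EF never attempts to create or drop the database:

Database.SetInitializer<ContactManagerDB>(null);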
Code First does not have the ability to upgrade your database while keeping your data intact.