Start sqitch migrations for a target database from the present snapshot of data

I wish to use the same sqitch migrations on different database instances, for which the initial snapshot of data can vary from one instance to another.
This seems like a common use case, even if we are concerned with only a single database.
Suppose the name of our database is D, and that we want to start sqitch migrations with the initial state of D -- call that D0, with successive states D1, D2, etc. How do we tell sqitch to start at a specific state of D?
I dealt with this issue for one database by simply dumping the schema and data and putting all the resulting SQL directives in the script used for my initial sqitch migration -- which corresponds to D0 above. But that approach seems overly cumbersome. Plus, what if, for a different instance of the schema, I wish to start the migrations at D17?
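For what it's worth, here is a rough sketch of the dump-as-first-change approach described above, assuming a PostgreSQL target; the change names and target URI are made up. sqitch's deploy command also has a --log-only option (recording changes as deployed without running their scripts), which, if it fits your workflow, is one way to start a particular instance at a later state such as D17:

# Capture the current state of D (D0) as the deploy script of the first change.
sqitch add baseline -n 'Initial snapshot of D (D0)'
pg_dump --no-owner --no-privileges D > deploy/baseline.sql

# On an instance that already matches everything up to some later change
# (here a hypothetical change named d17), record those changes as deployed
# without executing them, then deploy the rest normally.
sqitch deploy --log-only --to-change d17 db:pg://localhost/D
sqitch deploy db:pg://localhost/D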

Related

Cache and regularly update complex data

Let's start with the background. I have an API endpoint that I have to query every 15 minutes and that returns complex data. Unfortunately, this endpoint does not provide information about what exactly changed, so it requires me to compare the data that I have in the db against everything returned and then execute an update, add, or delete. This is pretty boring...
I came to an idea that I can simply remove all data from certain tables and rebuild everything from scratch... But I also have to return this cached data to my clients, so there might be a situation where the db is empty during a request from a client because it is being "refreshed/rebuilt". And that can't happen, because I have to return something.
So I came to an idea to either:
lock certain db tables so that the client has to wait while the db is being refreshed,
or
use CQRS: https://martinfowler.com/bliki/CQRS.html
Do you have any suggestions how to solve the problem?
It sounds like you're using a relational database, so I'll try to outline a solution using database terms. The idea, however, is more general than that. In general, it's similar to Blue-Green deployment.
Have two data tables (or two databases, for that matter); one is active, and one is inactive.
When the software starts the update process, it can wipe the inactive table and write new data into it. During this process, the system keeps serving data from the active table.
Once the data update is entirely done, the system can begin to serve data from the previously inactive table. In other words, the inactive table becomes the active table, and vice versa.
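A minimal sketch of that swap in Go, assuming PostgreSQL, database/sql, two pre-created tables cache_a/cache_b, and a one-row pointer table active_cache(table_name) that readers consult to find the active table (all of these names are invented for illustration):

package cache

import (
	"database/sql"
	"fmt"
)

// Refresh wipes the inactive table, rebuilds it from freshly fetched rows,
// and only then flips the pointer, so readers always see a complete data set.
func Refresh(db *sql.DB, inactiveTable string, rows map[string]string) error {
	if _, err := db.Exec(fmt.Sprintf("TRUNCATE %s", inactiveTable)); err != nil {
		return err
	}
	insert := fmt.Sprintf("INSERT INTO %s (key, value) VALUES ($1, $2)", inactiveTable)
	for k, v := range rows {
		if _, err := db.Exec(insert, k, v); err != nil {
			return err
		}
	}
	// Single-statement switch: the previously inactive table becomes the active one.
	_, err := db.Exec("UPDATE active_cache SET table_name = $1", inactiveTable)
	return err
}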

GORM Golang: the purpose of cloning the DB instance

In the past few weeks I have been learning about GORM as the database ORM. After looking inside the code, every command (limit, order, where, or, select, etc.) returns a new instance by cloning the current DB.
Does anyone here know the main purpose of cloning the DB instead of using the current instance?
When I chain select, where, limit, order, and join, that will be 5 clones of the DB instance. AFAIK, creating objects in memory is expensive.
The purpose is to be able to store "temporary" instances of your query so you can derive from them later. That is, if you have a number of queries which share some part of the chain, you should be able to do something like
q := db.Select("id, name").Limit(10).Order("created_at")
q1 := q.Where("status = ?", "active")
q2 := q.Where("status = ?", "archived")
(This is a rough example; I don't use GORM myself, so the exact API may differ slightly -- here db is assumed to be a *gorm.DB handle.)
Now, I believe that cloning objects in memory that won't be kept long doesn't hinder performance much compared to the cost of executing a SQL query, which implies a network round-trip…
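For illustration, here is a slightly fuller, self-contained sketch of the same idea, assuming the jinzhu/gorm (v1) API in which each chained call clones the handle, so the two derived queries don't contaminate each other (the model, DSN, and column names are made up):

package main

import (
	"log"

	"github.com/jinzhu/gorm"
	_ "github.com/jinzhu/gorm/dialects/postgres"
)

type User struct {
	ID     uint
	Name   string
	Status string
}

func main() {
	db, err := gorm.Open("postgres", "host=localhost dbname=app sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Shared base query; each chained call below returns a clone of the handle.
	q := db.Select("id, name").Limit(10).Order("created_at")

	var active, archived []User
	q.Where("status = ?", "active").Find(&active)     // derived query 1
	q.Where("status = ?", "archived").Find(&archived) // derived query 2
}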

Comparing across 2 database instances

I have a problem with 2 databases that I have created on my local machine. I keep changing one of the database instances (say SID A), while the other instance (say SID B) is only changed once every 2-3 weeks. I want to find out all the changes that I have made on the local DB (procedures, inserts, deletions, functions, etc.) in SID A. Both instances have 10 users, and the changes are present across all 10 users.
I have tried to do a "diff" in SQL Developer, but I end up getting a list of all the tables, procedures, etc. -- all to be created in SID B.
I have seen some tools, ready-made scripts, etc.
Is there a definite way that I am missing? I don't want to do a database export and import every time I want to migrate the changes.
Database: Oracle 10G
Thanks in advance for helping out.
One option is to use a tool like Red Gate's "Schema Compare for Oracle"; it's rock solid and will do exactly what you need it to, pretty much out of the box.
Before going down this sort of route though, I would suggest that you think about how you are deploying changes to your environments. For example, if you stored the incremental DML and DDL changes you made to schema A in source control, you could then play those in against schema B very easily.

Should I trust Redis for data integrity?

In my current project, I have PostgreSQL as my master DB and Redis as a kind of slave; e.g., when some user adds another as a friend, first the relationship will be stored in PostgreSQL and then the friend list in Redis will be updated. When some user's friend list is requested, it will be pulled out of Redis instead of PostgreSQL.
The question is: when I update the friend list in Redis, should I get a fresh copy out of PostgreSQL and replace the old list in Redis with the new one, or should I keep the old list and simply SADD the user id into it? The latter is of course better for performance, but intuitively the former does a better job of keeping data integrity. And if something like Celery is used, is the second method worth the risk?
This has nothing to do with Redis. When you are writing to two databases, a lot of things can go wrong even if both of them individually guarantee data integrity.
For the sake of discussion, replace Redis with MySQL in your question, and ask yourself - will data integrity be compromised?
You may have written to Postgres and then your process can die without writing to MySQL. Or perhaps there is a network outage. Or perhaps MySQL is down. In all these cases, Postgres and MySQL would start to differ.
It does not matter whether you replace the entire record or simply add one row. Both can lead to data corruption.
If you are concerned with data integrity, keep the data in a single authoritative system. Otherwise, you would need a two-phase commit protocol.
You should evaluate how important consistency is to your application and take things from there. It doesn't sound like anyone will cry if you lose a commit. You could have a background process that reads data from PostgreSQL and pushes it back into Redis, eventually cleaning up any inconsistencies. Alternatively, you could look at read-slave PostgreSQL instances replicating from the write master. This would get you better read scalability using well-tested synchronization technology.
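A loose sketch of that background reconciliation process in Go, assuming a friendships(user_id, friend_id) table, a friends:<id> Redis set per user, database/sql, and the go-redis client; all of those names are assumptions, not part of the question:

package friendcache

import (
	"context"
	"database/sql"
	"fmt"

	"github.com/redis/go-redis/v9"
)

// RebuildFriendSet re-derives one user's friend set from the authoritative
// PostgreSQL data and swaps it into Redis, healing any drift from lost writes.
func RebuildFriendSet(ctx context.Context, db *sql.DB, rdb *redis.Client, userID int64) error {
	rows, err := db.QueryContext(ctx, "SELECT friend_id FROM friendships WHERE user_id = $1", userID)
	if err != nil {
		return err
	}
	defer rows.Close()

	var friends []interface{}
	for rows.Next() {
		var id int64
		if err := rows.Scan(&id); err != nil {
			return err
		}
		friends = append(friends, id)
	}
	if err := rows.Err(); err != nil {
		return err
	}

	key := fmt.Sprintf("friends:%d", userID)
	if len(friends) == 0 {
		return rdb.Del(ctx, key).Err()
	}
	// Build the new set under a temporary key, then RENAME it over the live key
	// so readers never observe a half-built set.
	tmp := key + ":rebuild"
	pipe := rdb.TxPipeline()
	pipe.Del(ctx, tmp)
	pipe.SAdd(ctx, tmp, friends...)
	pipe.Rename(ctx, tmp, key)
	_, err = pipe.Exec(ctx)
	return err
}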

Two Phase Commit/Shared Transaction

The scenario is this
We have two applications, A and B, both of which are running in separate database (Oracle 9i) transactions
Application A - inserts some data into the database, then calls Application B
Application B - inserts some data into the database, related (via foreign keys) to A's data. Returns an "ID" to Application A
Application A - uses ID to insert further data, including the ID from B
Now, because these are separate transactions but both rely on data from each other's transactions, we need to commit between the calls to each application. This of course makes it very difficult to roll back if anything goes wrong.
How would you approach this problem, with minimal refactoring of the code? Surely this kind of thing is a common problem in the SOA world?
------ Update --------
I have not been able to find anything in Oracle 9i; however, Oracle 11g provides DBMS_XA, which does exactly what I was after.
You have three options:
Redesign the application so that you don't have two different processes (both with database connections) writing to the database and roll it into a single app.
Create application C that handles all the database transactions for A and B.
Roll your own two-phase commit. Application C acts as the coordinator. C signals A and B to ask if they're ready to commit. A and B do their processing, and respond to C with either a "ready" or a "fail" reply (note that there should be a timeout on C to avoid an infinite wait if one process hangs or dies). If both reply ready, then C tells them to commit; otherwise it sends a rollback signal. A rough sketch of this coordinator shape follows below.
Note that you may run into issues with option 3 if app A is relying on foreign keys from app B (which you didn't state, so this may not be an issue). Oracle's read consistency would probably prevent this from being allowed, since app A's transaction will begin before app B's. Just a warning.
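As a very rough illustration of option 3 (not an Oracle/XA implementation), here is what the coordinator C might look like in Go; the Participant interface and timeout handling are assumptions made purely for the sketch:

package twopc

import (
	"context"
	"fmt"
	"time"
)

// Participant is what applications A and B would each implement.
type Participant interface {
	Prepare(ctx context.Context) error // do the work, keep the transaction open, answer "ready" (nil) or "fail" (error)
	Commit() error
	Rollback() error
}

// Coordinate plays the role of application C.
func Coordinate(participants []Participant, timeout time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	// Phase 1: ask every participant whether it is ready; a timeout counts as failure.
	for _, p := range participants {
		if err := p.Prepare(ctx); err != nil {
			for _, q := range participants {
				q.Rollback() // best effort: tell everyone to undo their work
			}
			return fmt.Errorf("prepare failed, rolled back: %w", err)
		}
	}
	// Phase 2: everyone answered "ready", so tell them all to commit.
	for _, p := range participants {
		if err := p.Commit(); err != nil {
			return err // a failure here is exactly the awkward case 2PC cannot fully hide
		}
	}
	return nil
}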
A few suggestions:
Use compensating transactions. Basically, you make it possible to undo the transaction you did earlier. The hard part is figuring out which transactions to roll back.
Commit the data of applications A and B to the database using a flag indicating that it is only temporary. Then, after everything checks out fine, modify the flag to indicate that the data is final. During the night, run a batch job to flush out data that has not been finalized.
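To make the flag suggestion concrete, here is a small sketch, using PostgreSQL placeholder syntax for brevity even though the question is about Oracle; the orders table and its finalized and created_at columns are invented for illustration:

package pending

import "database/sql"

// InsertPending writes a row that is only provisionally committed.
func InsertPending(db *sql.DB, payload string) (int64, error) {
	var id int64
	err := db.QueryRow(
		"INSERT INTO orders (payload, finalized) VALUES ($1, false) RETURNING id",
		payload).Scan(&id)
	return id, err
}

// Finalize flips the flag once both applications have finished their part.
func Finalize(db *sql.DB, id int64) error {
	_, err := db.Exec("UPDATE orders SET finalized = true WHERE id = $1", id)
	return err
}

// NightlyCleanup flushes out data that was never finalized (e.g. the other application failed).
func NightlyCleanup(db *sql.DB) error {
	_, err := db.Exec("DELETE FROM orders WHERE finalized = false AND created_at < now() - interval '1 day'")
	return err
}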
You could probably insert the data from Application A into a 'temporary' area so that Application B can do the inserts of both A and B without changing much in either application. It's not particularly elegant, but it might do the trick.
In another scenario, you could add a 'confirmation' flag field to your data which is updated after the entire process has run successfully. If it fails at one point, it might be easier to track down the records you need to roll back (in effect, delete).
I like both solutions presented, so I avoided posting this for a while. But you could also make an update to the main table, having saved the state of the affected rows in some cache beforehand.
This could be combined with the two-tier (The traffic cop system Zathrus proposed)--because it really wouldn't be needed for neonski's solution of using a "sketchpad" table or tables. The drawback of this is that you would have to have your procs/logic consult the main table from the workarea or the workarea from the main table--or perhaps store your flag in the main table and set it back when you commit the data to the main table.
A lady on our team is designing something like that for our realtime system, using permanent work tables.
App_A =={0}=> database # App_A stores information for App_B
App_A ------> App_B # App_A starts App_B
App_B <={0}== database # App_B retrieves the information
App_B =={1}=> database # App_B stores more information
App_A <={2}== App_B # App_B returns 'ID' to App_A
App_A ={2,3}> database # App_A stores 'ID' and additional data
Is it just me, or does it seem like Application B is essentially just a subroutine of A? I mean, Application B doesn't do anything until A asks it, and Application A doesn't do anything until B returns an ID. Which means it makes little sense to have them in different applications, or even in separate threads.
