I have a script that runs periodically, and on each run I want it to get the records that were added to the DB since the last run. How can I achieve this?
I prefer not to change anything in the way records are added to the DB if I don't have to.
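One common pattern, assuming the table already has a monotonically increasing AUTO_INCREMENT id (or an indexed created_at timestamp), is to persist the last id the previous run saw and query only for rows beyond it. A minimal PyMySQL sketch; the table name, column names, connection details, and checkpoint file are all hypothetical:

```python
import pymysql

# Hypothetical: table `events` with an AUTO_INCREMENT `id` column, and a
# local file used to remember the last id seen by the previous run.
CHECKPOINT_FILE = "last_seen_id.txt"

def load_last_id():
    try:
        with open(CHECKPOINT_FILE) as f:
            return int(f.read().strip())
    except (FileNotFoundError, ValueError):
        return 0  # first run: start from the beginning

def save_last_id(last_id):
    with open(CHECKPOINT_FILE, "w") as f:
        f.write(str(last_id))

def fetch_new_records():
    last_id = load_last_id()
    conn = pymysql.connect(host="localhost", user="user",
                           password="secret", database="mydb")
    try:
        with conn.cursor() as cur:
            # Only rows added since the previous run.
            cur.execute("SELECT id, payload FROM events WHERE id > %s ORDER BY id",
                        (last_id,))
            rows = cur.fetchall()
        if rows:
            save_last_id(rows[-1][0])  # remember the highest id processed
        return rows
    finally:
        conn.close()

if __name__ == "__main__":
    for row in fetch_new_records():
        print(row)
```

This requires no change to the write path; the reading script just keeps its own checkpoint between runs.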
I have two Python scripts. Both of them use PyMySQL to read/write data in a table of a MySQL DB.
Script 1: It runs in an infinite while loop, fetches data from an outside API every second, and writes it to the MySQL table via an UPDATE SQL query. It's working completely fine; I can see at the DB level that the data is getting updated every second.
Script 2: It also runs in an infinite while loop, fetching data from the above-mentioned table via a SELECT SQL query and printing it. On the first run it fetches the correct, up-to-date data, but on subsequent iterations of the while loop it keeps returning the same data it fetched the first time and does not pick up the updates, even though Script 1 is continuously updating the table every second. If I stop the script and restart it, it shows the updated data again.
I just want to know the reason or the fix for this. It seems that because the DB connection is re-established when I restart Script 2, it fetches the latest data.
Kindly help with the same.
I have found the fix. Either of the below can be applied.
Adding autocommit=True into the connect() parameters.
Calling conn.commit() after the cursor.execute() command.
The cause is MySQL's REPEATABLE READ consistent read, as described here: https://dev.mysql.com/doc/refman/8.0/en/glossary.html#glos_consistent_read
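Under the default REPEATABLE READ isolation level, the first SELECT in a transaction establishes a consistent snapshot, and every later SELECT in that same never-ended transaction keeps reading the snapshot. A minimal sketch of Script 2 with either fix applied; connection parameters and table name are placeholders:

```python
import time
import pymysql

# Option 1: enable autocommit so each SELECT runs in its own transaction
# and therefore sees the rows committed by Script 1.
conn = pymysql.connect(host="localhost", user="user", password="secret",
                       database="mydb", autocommit=True)

while True:
    with conn.cursor() as cur:
        cur.execute("SELECT * FROM my_table")
        print(cur.fetchall())
    # Option 2 (instead of autocommit=True): end the read transaction
    # explicitly so the next SELECT starts a fresh snapshot.
    # conn.commit()
    time.sleep(1)
```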
I'm wondering whether it is possible to touch/update all records of some class so that they trigger the before and after save hooks. I have a lot of records in the database and it takes too long to update them all manually via the Parse control panel.
You could write a cloud job which iterates through everything, but it would need to make an actual change to each object or it won't save (because the objects won't be dirty). You're also limited on runtime so you should sort by updated date and run the job repeatedly until nothing is left to do...
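The answer above describes a Cloud Code job. Since the other examples in this document are Python, here is the same idea driven instead from an external script against the Parse REST API; the server URL, keys, class name, and the throwaway "touched" field are all placeholders, and the point is simply that each object gets a dummy field written so the save is not a no-op and the hooks fire:

```python
import json
import requests

# Hypothetical configuration for your Parse server.
SERVER = "https://YOUR_PARSE_SERVER/parse"
HEADERS = {
    "X-Parse-Application-Id": "YOUR_APP_ID",
    "X-Parse-REST-API-Key": "YOUR_REST_KEY",
    "Content-Type": "application/json",
}
CLASS_NAME = "MyClass"

def touch_all():
    while True:
        # Fetch a page of objects that have not been touched yet.
        resp = requests.get(
            f"{SERVER}/classes/{CLASS_NAME}",
            headers=HEADERS,
            params={"where": json.dumps({"touched": {"$exists": False}}),
                    "limit": 100})
        results = resp.json().get("results", [])
        if not results:
            break  # nothing left to do
        for obj in results:
            # Write a dummy field so the object is dirty and the
            # beforeSave/afterSave hooks actually run.
            requests.put(f"{SERVER}/classes/{CLASS_NAME}/{obj['objectId']}",
                         headers=HEADERS, json={"touched": True})

touch_all()
```

Because the query filters on the dummy field, the script can be stopped and re-run until everything has been processed, which mirrors the "run the job repeatedly until nothing is left" advice above.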
I have created a Magento module that will, based on some filters, create a CSV file with the order data. This report takes anytime from 15–40 min to run depending on the selected filters. Since there is a lot of data, I used straight queries to generate the report.
So what I am trying to do now is make sure that when this report is being generated, no one else can run it. So I need to be able to detect that the query is running. Any suggestions on the best approach to this?
Create a file called report.lock when you start the report: check whether this file already exists, and return an error if it does; otherwise create the file. Delete it once the report is complete.
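Outside of Magento/PHP, the lock-file pattern looks roughly like the sketch below in Python; the lock path and report function are placeholders. Opening with O_CREAT | O_EXCL makes the check-and-create step atomic, so two runs starting at the same instant cannot both acquire the lock:

```python
import os

LOCK_FILE = "report.lock"  # hypothetical path

def generate_csv_report():
    # Placeholder for the 15-40 minute report generation.
    pass

def run_report():
    try:
        # Atomically fail if the lock already exists (another run is active).
        fd = os.open(LOCK_FILE, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise RuntimeError("Report is already being generated, try again later.")
    try:
        os.write(fd, str(os.getpid()).encode())
        os.close(fd)
        generate_csv_report()
    finally:
        os.remove(LOCK_FILE)  # always release the lock, even on failure
```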
How do I specify the number of records to delete in the TIBCO JDBC Update activity in batch update mode?
I actually need to delete 25 million records from the database, so I wrote TIBCO code to do it, and it is taking a lot of time. I am planning to use batch mode for the delete query, but I don't know how to specify the number of records in the JDBC Update activity.
Help me if anyone has any idea. Thanks.
From the docs for the Batch Update checkbox:
This field is only meaningful if there are prepared parameters in the SQL statement (see Prepared Parameters).
In which case the input will be an array of records. It will execute the statement once for each record.
To avoid running out of memory, you will still need to iterate over the 25 million rows, but you can iterate in groups of 1,000 or 10,000.
If this is not something you would do often (deleting 25M rows sounds pretty one-off), an alternative is to use BW to create a file containing the delete statements and then give the file to a DBA to execute.
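Outside of BW, the chunked-delete idea is just a loop that removes a bounded batch until nothing is left, which keeps each statement's lock and undo footprint small. A hedged sketch against MySQL-style LIMIT syntax (table name, predicate, and batch size are placeholders; Oracle would need ROWNUM or a similar construct instead):

```python
import pymysql

BATCH_SIZE = 10000  # tune after load testing

conn = pymysql.connect(host="localhost", user="user", password="secret",
                       database="mydb", autocommit=True)
try:
    with conn.cursor() as cur:
        while True:
            # Delete a bounded batch; execute() returns the affected row count.
            deleted = cur.execute(
                "DELETE FROM big_table WHERE created_at < %s LIMIT %s",
                ("2020-01-01", BATCH_SIZE))
            if deleted == 0:
                break  # nothing left to delete
finally:
    conn.close()
```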
Please use the Subset feature of the JDBC palette. Let me know if you face any issues.
I would suggest two points:
If this is a one-time activity, it is not advisable to use TIBCO BW code for it; a SQL script would be the better alternative.
When you say 25 million records, what criteria is that based on? It can be achieved through subset iteration, but there should be proper load testing in the pre-prod environment to check that the process does not cause any memory/DB issues.
You can also try using a SQL procedure and invoking it through BW.
I have an Oracle database with 8 million records and I need to move them to MongoDB.
I know how to import some data into MongoDB from a JSON file using the import command, but I want to know whether there is a better way to achieve this given these issues:
Due to the limit on execution time, how do I handle it?
The database is growing every second, so what's the plan to make sure that every record has been moved?
Due to the limit on execution time, how do I handle it?
Don't do it with a JSON export/import. Instead, write a script that reads the data, transforms it into the correct format for MongoDB, and then inserts it there (a sketch follows the list of reasons below).
There are a few reasons for this:
Your tables / collections will not be organized the same way. (If they are, then why are you using MongoDB?)
This will allow you to monitor progress of the operation. In particular you can output to log files every 1000th entry or so to get some progress and be able to recover from failures.
This will test your new MongoDB code.
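A minimal sketch of such a script, assuming the python-oracledb and pymongo drivers and hypothetical table, column, and collection names; it batches inserts and logs progress every 1,000 rows so a failed run can be diagnosed and resumed:

```python
import logging
import oracledb   # assumed driver; cx_Oracle exposes the same cursor API
import pymongo

logging.basicConfig(level=logging.INFO)

BATCH_SIZE = 1000

ora = oracledb.connect(user="user", password="secret", dsn="localhost/XEPDB1")
mongo = pymongo.MongoClient("mongodb://localhost:27017")
coll = mongo["mydb"]["orders"]          # hypothetical target collection

cur = ora.cursor()
cur.execute("SELECT id, customer_id, total, created_at FROM orders ORDER BY id")
cols = [d[0].lower() for d in cur.description]

batch, copied = [], 0
for row in cur:
    doc = dict(zip(cols, row))          # reshape here if the schema differs
    doc["_id"] = doc.pop("id")          # reuse the Oracle PK as Mongo's _id
    batch.append(doc)
    if len(batch) == BATCH_SIZE:
        coll.insert_many(batch)
        copied += len(batch)
        batch = []
        logging.info("copied %d rows (last id %s)", copied, doc["_id"])
if batch:
    coll.insert_many(batch)
    logging.info("done, %d rows total", copied + len(batch))
```

The reshaping step is where the schema differences between the relational tables and the document model get handled, which is the main reason a plain export/import falls short.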
The database is growing every second, so what's the plan to make sure that every record has been moved?
There are two strategies here.
Track the entries that are updated and re-run your script on newly updated records until you are caught up.
Write to both databases while you run the script to copy data. Then, once the script has finished and everything is up to date, you can cut over to just using MongoDB.
I personally suggest #2; it is the easiest method to manage and test while maintaining up-time. It's still going to be a lot of work, but it will allow the transition to happen.
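A sketch of the dual-write idea in Python; the two save functions are placeholders for your existing Oracle write path and a new MongoDB write path. New writes go to both databases while the bulk copy runs, with Oracle remaining authoritative until the cut-over:

```python
import logging

def save_to_oracle(order):
    ...  # placeholder: existing Oracle insert/update

def save_to_mongodb(order):
    ...  # placeholder: pymongo upsert keyed on the Oracle primary key

def save_order(order):
    """Write path used by the application during the migration window."""
    save_to_oracle(order)              # existing store stays authoritative
    try:
        save_to_mongodb(order)         # shadow write to the new store
    except Exception:
        # Don't fail the request if the shadow write breaks; log it so the
        # copy script (or a re-run) can reconcile the missed record later.
        logging.exception("Mongo shadow write failed for order %s", order.get("id"))
```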