Why is my Python script loop not fetching updated data from the DB? - pymysql

I have two Python scripts. Both of them use PyMySQL to read/write data in a table of a MySQL DB.
Script 1: It runs in an infinite while loop, fetches data from an outside API every second, and writes it to the MySQL table via an UPDATE SQL query. It works completely fine; I can see at the DB level that the data is being updated every second.
Script 2: It runs in an infinite while loop, fetches data from the above-mentioned MySQL table via a SELECT SQL query, and prints it. On the first iteration it fetches the correct, up-to-date data from the DB, but on subsequent iterations of the while loop it keeps returning the same data it fetched the first time and never picks up the updates, even though Script 1 is continuously updating the data in the DB every second. If I stop the script and restart it, it shows the updated data again.
I just want to know the reason for this, or the fix. It seems that when the DB connection for Script 2 is re-established (by restarting the script), it fetches the latest data. Kindly help.

I have found the fix. Either of the below can be applied:
Adding autocommit=True into the connect() parameters.
Calling conn.commit() after the cursor.execute() command.
The underlying cause is the REPEATABLE READ isolation level: a consistent read keeps returning the snapshot taken at the first read of the transaction, as described here: https://dev.mysql.com/doc/refman/8.0/en/glossary.html#glos_consistent_read
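For reference, a minimal sketch of Script 2's loop with both fixes indicated; the connection parameters and the table/column names (ticker_data, symbol, price) are made up for illustration:

import time
import pymysql

# Fix 1: autocommit=True ends the implicit transaction after each statement,
# so every SELECT sees a fresh snapshot.
conn = pymysql.connect(host="localhost", user="user", password="secret",
                       database="mydb", autocommit=True)

try:
    while True:
        with conn.cursor() as cursor:
            cursor.execute("SELECT price FROM ticker_data WHERE symbol = %s", ("BTCUSD",))
            print(cursor.fetchone())
        # Fix 2 (alternative): without autocommit, explicitly end the transaction
        # so the next SELECT is not served from the old REPEATABLE READ snapshot.
        # conn.commit()
        time.sleep(1)
finally:
    conn.close()

Either way, the transaction PyMySQL implicitly keeps open is closed between iterations, so each SELECT reads current data instead of the snapshot taken on the first read.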

Related

H2 database table getting cleared automatically

I am using an H2 database to test my Spring Boot application. I do not use a file to store the data; instead I just use the in-memory database. In the properties file, my JDBC URL looks like below:
spring.datasource.url=jdbc:h2:mem:;MODE=MSSQLServer;INIT=runscript from 'classpath:/schema.sql'\\;runscript from 'classpath:/data.sql'
Now, when I run the tests, I have the following test scenario:
Add some entities in a table (this adds some rows to the table)
Search those entities by some criteria
Do the assertion
Sometimes this runs successfully, but sometimes the search query returns an empty list, which causes the test to fail.
I added print statements just to check whether my entities are getting inserted properly, so in the insert function, after each insertion, I run the query below:
SELECT * FROM tableName;
This returns the correct list, meaning each insertion is going into the table correctly. Now in the search function, before running the actual search query, I run the same query again:
SELECT * from tableName;
Surprisingly, this one returns empty as well, which means there is no data in the table. Please suggest what I should check for.
Pretty sure @Evgenij Ryazanov's comment is correct here.
Closing the last connection to a database closes the database.
When using in-memory databases this means the content is lost.
After step 1 (add some entities in a table), is the connection closing?
If so, to keep the database open, add ;DB_CLOSE_DELAY=-1 to the database URL.
e.g.
spring.datasource.url=jdbc:h2:mem:;DB_CLOSE_DELAY=-1;MODE=MSSQLServer;INIT=runscript from 'classpath:/schema.sql'\\;runscript from 'classpath:/data.sql'
Note, this can create a memory leak!
see: http://www.h2database.com/html/features.html#in_memory_databases

Why does Informatica fetch more records from the source when the source itself has fewer records?

I have an issue in the production environment: one of the workflows has been running for more than one day, inserting records into a SQL Server DB. It's just a direct load mapping; there is no SQ override either. The Monitor shows the SQ count as 7 million, and it is inserting the same number of records into the target, but the source DB shows only around 3 million records. How can this be possible?
Have you checked whether the Source Qualifier is joining more than one table? A screenshot of the affected mapping pipeline and an obfuscated logfile would help.
Another thought: given your job ran for a day, were there any jobs run in that time to purge old records from the source table?
Cases where I've seen this kind of thing happen:
There's a SQL query override doing something different than I thought (e.g. joining some tables).
I'm looking at a different source - verify the connections and make sure to check the same object, in the same database, on the same server that PowerCenter is connecting to.
It's a reusable session being executed multiple times by different workflows. In that case the Source/Target statistics shown in the Workflow Monitor may refer to another execution.

Oracle DB links and retrieving stale data

I have 2 databases, DBa and DBb, and 2 record sets, RecordsA and RecordsB. The concept is that in our app you can add records from A to B. I am having an issue where I add a record from A to B and then query the records again: the particular property on the added record is stale/incorrect.
RecordsA lives on DBa and RecordsB lives on DBb. I make a stored proc call to add the record to the B side and modify a column's value on DBa, which performs the insert/update on DBb over a dblink. The problem is, when I do an insert/update followed by an immediate get call on DBa (which calls DBb), that modified property is incorrect: it's null, as if the insert never went through. However, if I put a breakpoint before the pull call and wait about 1 second, the correct data is returned. This makes me wonder whether there are some latency issues with dblinks.
This seems like an async issue, but we verified that no async calls are being made and everything runs on the same thread. Would this type of behavior be likely with a DB link? That is, could inserting/updating a record on a remote server and retrieving it right away hit some latency where the record wasn't quite updated at the time of the re-pull?

ElasticSearch: get new records

I have a script that runs periodically, and on each run I want it to get the records that were added to the DB since the last run. How can I achieve this?
I prefer not to change anything in the way records are added to the DB if I don't have to.

Exporting 8 million records from Oracle to MongoDB

I have an Oracle database with 8 million records and I need to move them to MongoDB.
I know how to import data into MongoDB from a JSON file using the import command, but I want to know whether there is a better way to achieve this, given these issues:
Due to the limit on execution time, how should I handle it?
The database is being updated every second, so what's the plan to make sure that every record gets moved?
Due to the limit on execution time, how should I handle it?
Don't do it with the JSON export/import. Instead, write a script that reads the data, transforms it into the correct format for MongoDB, and then inserts it there (a sketch follows the list of reasons below).
There are a few reasons for this:
Your tables / collections will not be organized the same way. (If they are, then why are you using MongoDB?)
This will allow you to monitor the progress of the operation. In particular, you can log every 1000th entry or so to track progress and be able to recover from failures.
This will test your new MongoDB code.
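A minimal sketch of such a migration script, assuming the python-oracledb and pymongo drivers and a hypothetical customers table; the query, the document shape, and the batch size would need to be adapted to the real schema:

import oracledb                      # assumed driver; cx_Oracle works the same way
from pymongo import MongoClient

BATCH_SIZE = 1000

ora = oracledb.connect(user="scott", password="tiger", dsn="dbhost/orclpdb1")
mongo = MongoClient("mongodb://localhost:27017")
target = mongo["shop"]["customers"]  # hypothetical target collection

cursor = ora.cursor()
cursor.execute("SELECT id, name, email, created_at FROM customers ORDER BY id")

batch, copied = [], 0
for row in cursor:
    # Reshape each relational row into the document layout MongoDB should store.
    batch.append({"_id": row[0], "name": row[1], "email": row[2], "createdAt": row[3]})
    if len(batch) >= BATCH_SIZE:
        target.insert_many(batch)
        copied += len(batch)
        batch = []
        print(f"copied {copied} records")  # crude progress log, as suggested above

if batch:
    target.insert_many(batch)
    copied += len(batch)

print(f"done, {copied} records copied")
cursor.close()
ora.close()
mongo.close()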
The database is being updated every second, so what's the plan to make sure that every record gets moved?
There are two strategies here.
Track the entries that are updated and re-run your script on newly updated records until you are caught up.
Write to both databases while you run the script to copy data. Then, once the script is done and everything is up to date, you can cut over to just using MongoDB.
I personally suggest #2; it's the easiest method to manage and test while maintaining uptime. It's still going to be a lot of work, but it allows the transition to happen (a rough sketch of the dual-write idea follows).
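And a minimal sketch of the dual-write idea behind #2, again with made-up table, collection, and column names; while the copy is in flight, the application calls a helper like this instead of writing to Oracle alone:

def save_customer(ora_conn, mongo_coll, customer):
    # Write to the current system of record (Oracle) first.
    cur = ora_conn.cursor()
    cur.execute(
        "INSERT INTO customers (id, name, email) VALUES (:1, :2, :3)",
        (customer["id"], customer["name"], customer["email"]),
    )
    ora_conn.commit()
    cur.close()
    # Mirror the same write into MongoDB so it stays in sync
    # while the bulk copy script catches up.
    mongo_coll.insert_one(
        {"_id": customer["id"], "name": customer["name"], "email": customer["email"]}
    )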
