Magento: determine if a report script is running

I have created a Magento module that, based on some filters, creates a CSV file with the order data. This report takes anywhere from 15–40 minutes to run depending on the selected filters. Since there is a lot of data, I used straight queries to generate the report.
What I am trying to do now is make sure that while this report is being generated, no one else can run it. So I need to be able to detect that the query is running. Any suggestions on the best approach to this?

Create a file called report.lock when the report starts. When the report is launched, check whether that file exists: if it does, return an error; otherwise create the file and proceed. Delete it once the report is complete.
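The module itself would be PHP, but here is a minimal sketch of that lock-file pattern (shown in Java; the /tmp/report.lock path and the runReport method are placeholders, not from the question). Creating the lock atomically avoids a race between checking for the file and creating it:

    import java.io.IOException;
    import java.nio.file.*;

    public class ReportLock {
        private static final Path LOCK = Paths.get("/tmp/report.lock");

        public static void main(String[] args) throws IOException {
            try {
                // createFile is atomic: it throws if the file already exists,
                // so two concurrent runs cannot both acquire the lock.
                Files.createFile(LOCK);
            } catch (FileAlreadyExistsException e) {
                System.err.println("Report is already running; try again later.");
                return;
            }
            try {
                runReport(); // placeholder for the long-running CSV generation
            } finally {
                // Always remove the lock, even if the report fails part-way.
                Files.deleteIfExists(LOCK);
            }
        }

        private static void runReport() {
            // ... build the CSV from the order data ...
        }
    }

One thing to keep in mind with this approach: if the process dies before the cleanup runs (e.g. the server is killed), the stale lock blocks future runs, so it can be worth checking the lock file's age before refusing to start.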

Related

Limit Power Automate Flow Trigger to Specific Excel Table

Is there a way to control a Power Automate Flow based on which table is updated in an Excel file?
I have one Flow that reads table data when the Excel file has been modified. I also have another Flow that updates a row in this same file, but in a different table, when a SharePoint list is modified. The Flow that runs when the list is modified ends up triggering the Flow that runs when the Excel file is modified, which I don't need or want.
Is there any way to qualify which table is being updated, to allow or prevent the Flow from continuing to run? Is there something available in MS Graph, maybe?
This is not blocking any work; it's more of an annoyance, and it counts against the daily limits.
In Power Automate, I don't see any way to identify which table has been updated to allow/prevent the Flow from continuing.
In my experience, no, it's not possible.
To work around this, you could keep that table in a separate file and then sync it with the main file whenever it's updated.
Annoying as hell, but it will work.

Spring Batch - how to avoid re-loading (writing) data that was loaded in the previous run

I have a basic Spring Batch app that loads data from a CSV file into MySQL. The program does load the file into the DB during the first run. However, when I accidentally re-ran the job/app, it threw a primary key violation (for the right reasons).
What is the best way to avoid reloading data that is already present on the target system? When the batch job is scheduled, if for any reason the source file has not changed since the previous run, I want to see a "0 records processed" message rather than a primary key violation error. Hope that makes sense.
More information:
Thanks. I have probably not understood the answer, so let me explain my requirement in a better way. I have a file containing data from an external data source (say, new-hire data) with a fixed name of hire.csv. The file should be updated with only the delta changes for every run. As there is a possibility of the manual error of not removing all previously loaded rows, some new hires from the previous run could also be present in the current run. Is there a mechanism available within ItemReader or ItemProcessor to skip records that are already present in the target DB? I could do "insert into tb where not in (select from tb)", but that runs for every row, which I don't want. Hope it is clearer now. Thanks again.
However, when I accidentally re-ran the job/app, it threw a primary key violation (for the right reasons). What is the best way to avoid reloading data that is already present on the target system?
The file you are ingesting should be an (identifying) job parameter. This way, when the first run succeeds, the job instance is complete and cannot be run again. This is by design in Spring Batch, for this very use case: preventing a job from accidentally being executed twice by mistake.
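As a rough sketch of what the launching side can look like (the job name importHireJob and the parameter key input.file are made-up names for illustration), marking the parameter as identifying means a second launch with the same file resolves to the same job instance, which Spring Batch refuses to run again once it has completed:

    import org.springframework.batch.core.*;
    import org.springframework.batch.core.launch.JobLauncher;

    public class HireImportLauncher {

        private final JobLauncher jobLauncher;
        private final Job importHireJob; // illustrative job, not from the question

        public HireImportLauncher(JobLauncher jobLauncher, Job importHireJob) {
            this.jobLauncher = jobLauncher;
            this.importHireJob = importHireJob;
        }

        public JobExecution launch(String inputFile) throws Exception {
            // The third argument marks the parameter as identifying, so it is
            // part of the JobInstance identity. Launching twice with the same
            // file ends in JobInstanceAlreadyCompleteException, not a reload.
            JobParameters params = new JobParametersBuilder()
                    .addString("input.file", inputFile, true)
                    .toJobParameters();
            return jobLauncher.run(importHireJob, params);
        }
    }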
Edit: adding further options based on the comments.
If deleting the file is an option, you can use a job listener or a final step to delete the file after ingesting it. With this option, you need to add a second identifying parameter (since the file name is always hire.csv) to make sure you have a different job instance for each run. This option does not require a different file name for each run.
If the file can be renamed to hire-${timestamp}.csv and will be unique, then deleting the file after ingesting it and using a single job parameter with the filename is enough.
Side note: I have seen people use a business key to identify records in the input file and an item processor that queries the database and filters out items that have already been ingested. This works for small datasets but performs poorly with large datasets, because of the additional query for each item.
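For completeness, a minimal sketch of that kind of filtering processor, assuming an illustrative Hire item type and a hire table keyed by id (none of which come from the question); the performance caveat above applies, since it issues one query per item:

    import org.springframework.batch.item.ItemProcessor;
    import org.springframework.jdbc.core.JdbcTemplate;

    // Returning null from an ItemProcessor filters the item out, so only
    // records that are not yet in the target table reach the writer.
    public class ExistingRecordFilterProcessor implements ItemProcessor<Hire, Hire> {

        private final JdbcTemplate jdbcTemplate;

        public ExistingRecordFilterProcessor(JdbcTemplate jdbcTemplate) {
            this.jdbcTemplate = jdbcTemplate;
        }

        @Override
        public Hire process(Hire item) {
            Integer count = jdbcTemplate.queryForObject(
                    "SELECT COUNT(*) FROM hire WHERE id = ?",
                    Integer.class, item.getId());
            return (count != null && count > 0) ? null : item;
        }
    }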

Check for queries not used in an Oracle report

I'm using Oracle Report Builder 9.0.4.1.0 and I have a heavy report that defines a large number of queries. I suspect that not all of those queries are used in the report and that some are not linked to any layout object.
Is there an easy way to detect which queries (or other objects) aren't used at all in a specific report, instead of deleting a query, compiling and running, and verifying one by one whether each is used or not?
Thanks
If there is an easy way to do that, I don't know it. A long time ago, when Reports 1.x was in use, the report was saved in the database, so you could write a query to fetch the metadata you're interested in. I never did that, but it would have been an option. Now, all you have is an RDF (or a JSP) file.
However, a few suggestions, if I may.
Open the Paper Layout editor. Click each repeating frame and check its Property Palette, as it contains information about the group it belongs to; that group can be viewed in the Data Model layout.
As there aren't that many repeating frames, you should be able to eliminate the queries that don't have any frames, i.e. the ones that don't contribute to the final result.
Another option is to put a condition
WHERE 1 = 2
into every query so that they won't return any rows. Run the report and check what's missing, then remove that condition so that you get values back. Move on to the second query, and so forth. That's a little tedious and time consuming, but it should still be faster than deleting queries.
You can also output the report results to an XML file. Each query that returns data will produce something within its XML tags.

Parse Cloud Code: touch all records in the database

I'm wondering whether it is possible to touch/update all records in some class so that they trigger the before and after save hooks. I have a lot of records in the database, and it takes too much time to update them all manually via the Parse control panel.
You could write a cloud job that iterates through everything, but it would need to make an actual change to each object or it won't save (because the objects won't be dirty). You're also limited on runtime, so you should sort by updated date and run the job repeatedly until nothing is left to do.

Exporting 8 million records from Oracle to MongoDB

I have an Oracle database with 8 million records and I need to move them to MongoDB.
I know how to import data into MongoDB from a JSON file using the import command, but I want to know whether there is a better way to achieve this, given the following issues:
Given the limit on execution time, how do I handle it?
The database is being updated every second, so what's the plan to make sure that every record gets moved?
Given the limit on execution time, how do I handle it?
Don't do it with a JSON export/import. Instead, write a script that reads the data, transforms it into the correct format for MongoDB, and then inserts it there.
There are a few reasons for this:
Your tables / collections will not be organized the same way. (If they are, then why are you using MongoDB?)
This will allow you to monitor the progress of the operation. In particular, you can write to a log file every 1000th entry or so to track progress and be able to recover from failures.
This will test your new MongoDB code.
The database is being updated every second, so what's the plan to make sure that every record gets moved?
There are two strategies here.
1. Track the entries that are updated and re-run your script on newly updated records until you are caught up.
2. Write to both databases while you run the script to copy the data. Then, once the script is done and everything is up to date, you can cut over to just using MongoDB.
I personally suggest #2; it is the easiest method to manage and test while maintaining uptime. It's still going to be a lot of work, but it will allow the transition to happen.
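As a rough illustration of the kind of copy script described above (not a definitive implementation), here is a Java sketch using JDBC and the MongoDB sync driver; the connection strings, the orders table, its columns, and the target document shape are all assumptions:

    import com.mongodb.client.MongoClient;
    import com.mongodb.client.MongoClients;
    import com.mongodb.client.MongoCollection;
    import org.bson.Document;

    import java.sql.*;
    import java.util.ArrayList;
    import java.util.List;

    public class OracleToMongo {

        public static void main(String[] args) throws SQLException {
            try (Connection oracle = DriverManager.getConnection(
                         "jdbc:oracle:thin:@//dbhost:1521/ORCL", "user", "password");
                 MongoClient mongo = MongoClients.create("mongodb://localhost:27017")) {

                MongoCollection<Document> orders =
                        mongo.getDatabase("shop").getCollection("orders");

                try (Statement stmt = oracle.createStatement();
                     ResultSet rs = stmt.executeQuery(
                             "SELECT order_id, customer_id, total, created_at FROM orders")) {

                    List<Document> batch = new ArrayList<>();
                    long copied = 0;

                    while (rs.next()) {
                        // Reshape the relational row into the document model you
                        // actually want in MongoDB (nested fields, renamed keys, ...).
                        // java.sql.Timestamp is converted to java.util.Date here.
                        batch.add(new Document("_id", rs.getLong("order_id"))
                                .append("customerId", rs.getLong("customer_id"))
                                .append("total", rs.getBigDecimal("total"))
                                .append("createdAt",
                                        new java.util.Date(rs.getTimestamp("created_at").getTime())));

                        if (batch.size() == 1000) {
                            orders.insertMany(batch);
                            copied += batch.size();
                            batch.clear();
                            System.out.println("Copied " + copied + " records");
                        }
                    }
                    if (!batch.isEmpty()) {
                        orders.insertMany(batch);
                        copied += batch.size();
                    }
                    System.out.println("Done, copied " + copied + " records in total");
                }
            }
        }
    }

For 8 million rows you would likely also want to set a JDBC fetch size and keep track of the last copied key, so the script can resume after a failure instead of starting over.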
