How to retrieve more than 5,000 records from CRM using KingswaySoft for SSIS packages? - dynamics-crm

I am trying to migrate data between two CRM (Dynamics 365) databases, but in KingswaySoft there is a limit of 5,000 records per batch. Can anyone please suggest an approach that lets me send any number of records?

We will page through all records in the source entity. The Batch Size setting on the CRM Source component is just used to specify how many records you want to retrieve per service call, not the total number you will get from the source entity. Hope this clarifies things a bit more.
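For context, here is a minimal sketch of the paging pattern the component performs under the hood, written against the plain Dynamics 365 SDK (QueryExpression with a paging cookie). KingswaySoft does the equivalent for you, so this is purely illustrative:

using System.Collections.Generic;
using Microsoft.Xrm.Sdk;
using Microsoft.Xrm.Sdk.Query;

static IEnumerable<Entity> RetrieveAll(IOrganizationService service, string entityName)
{
    var query = new QueryExpression(entityName)
    {
        ColumnSet = new ColumnSet(true),
        PageInfo = new PagingInfo { PageNumber = 1, Count = 5000 } // "Batch Size": records per service call
    };

    while (true)
    {
        EntityCollection page = service.RetrieveMultiple(query);
        foreach (Entity record in page.Entities)
            yield return record;

        if (!page.MoreRecords)
            yield break;

        // request the next page using the cookie returned by the previous call
        query.PageInfo.PageNumber++;
        query.PageInfo.PagingCookie = page.PagingCookie;
    }
}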

Related

Advice on Setup

I started my first data analysis job a few months ago. I am in charge of a SQL database and of taking that data and creating dashboards in Power BI. Our SQL database is replicated from an online web portal we use for data entry; we do not add data to the database ourselves, instead the data is written to tables based on what is entered through the web portal. Since this database is replicated by another company, I created our own database that is connected via a linked server. I have built many views to pull only the needed data from the initial database (I did this to limit the amount of data sent to Power BI, for performance). My view count keeps climbing, and I am wondering whether, in terms of performance, this is the best way forward. The largest view is about 32,000 rows and the smallest around 1,000 rows.
Some of the views that I am writing end up joining 5-6 tables together due to the structure built by the data web portal company that controls the database.
My suggestion would be to create a data warehouse schema (star schema), keeping as a principle one star schema per domain: for example, one for sales, one for subscriptions, one for purchases, etc. Use the logic of data marts.
Identify your dimensions and your facts and keep evolving that schema. You will find that you end up with far fewer tables.
Your data is not that big, so you can use whatever ETL strategy you like: truncate-and-load or incremental.

Data integration from Magento to QuickBooks

I'm currently new to Talend and I'm learning through videos and documentation, so I'm just not sure how to approach/implement this with best practices.
Goal
Integrate Magento and QuickBooks using Talend.
My thoughts
My first thought was to set up a direct DB connection to Magento, take the relevant data I need, process it, and send it to QuickBooks using the REST APIs (specifically the bulk APIs, in batches).
But then I thought it would be a bit hectic to query the Magento database directly (multiple joins), so another option is to use Magento's REST API instead.
As I'm not very familiar with the tool, I'm struggling a little to find the most suitable approach, so any help is appreciated.
What I've done so far
I've saved my auth (for QuickBooks) and DB (Magento) credentials in a file, and using tFileInputDelimited and tContextLoad I'm storing them in context variables so they are accessible globally.
I've successfully configured the database connection and DB input, but I've not used metadata for the connection (should I, and if yes, how can I pass dynamic values there?). I've used my context variable values in the DB connection settings.
I've taken only the relevant fields for now, but if I want more fields a simple query won't be enough, because Magento stores data across multiple tables for Customer etc. It's not a big deal, but I think it will increase my work.
That's what I've built so far; my next step is to send the data to QuickBooks via REST, getting an access_token and saving it to a context variable, and then storing the QuickBooks reference back in the Magento DB.
I've also decided to use the QuickBooks bulk APIs, but I'm not sure how to process data in chunks in Talend (I checked multiple resources with no luck), i.e. if Magento returns 500 rows I want to process them in chunks of 30, since the QuickBooks batch maximum is 30, send each chunk via REST, and store the returned QuickBooks reference IDs back in Magento (so I can update them later).
Also, this is all local for now; how can I do the same in production, and how can I maintain separate development and production environments?
Resources I'm referring
For REST and Auth best practices - https://community.talend.com/t5/How-Tos-and-Best-Practices/Using-OAuth-2-0-with-Talend-to-Access-Goo...
Nice example for batch processing here:
https://community.talend.com/t5/Design-and-Development/Batch-processing-in-talend-job/td-p/51952
Redirect your input to a tFileOutputDelimited.
Enter the output filename, tick the option "Split output in several files" under "Advanced settings", and enter 1000 in the field "Rows in each output file" (in your case you would use 30, to match the QuickBooks batch limit). This will create n files based on the filename, with that many rows in each.
On the next subjob, use a tFileList to iterate over this file list to get records from each file.

Faster large data exports in Laravel to avoid timeouts

I need to generate a report from the database with thousands of records. This report is generated on a monthly basis, and at times the user might want a report spanning, say, 3 months. With the current records, a single month's data set can reach around 5,000 rows.
I am currently using vue-excel, which makes an API call to the Laravel API; the API returns the resource, which vue-excel then exports. The resource returns not only the model data but also related data sets that I need to fetch.
This works fine for smaller data sets, say around 3,000 records, but for anything larger the server times out.
I have also tried Laravel Excel with the query concern; I actually timed them and both take about the same amount of time, because Laravel Excel was also mapping to fetch the relations for me.
So basically, my question is: is there a better way to do this, so I can get this data faster and avoid the timeouts?
Just put this at the start of the function:
ini_set('max_execution_time', 84000); // 84000 is in seconds
This will override the built-in maximum script runtime.

MS Dynamics CRM 365 (online) - Performance issue inserting custom entity records with ExecuteMultipleRequest

I'm calling ExecuteMultipleRequest to insert 25 records of a custom entity at a time. Each batch is taking roughly 20 seconds.
Some info about the custom entity:
I did not create its schema and can't have it changed;
It has 124 attributes (columns);
On each CreateRequest the entity has 6 attribute values filled: 2 Lookup and 4 Money.
ExecuteMultipleRequest is being called from a middleware component in a corporate network, which connects to the CRM in the cloud. The CRM instance used is a sandbox, so it may have some restrictions (CPU/bandwidth/IO/etc.) that I'm not aware of.
I can issue concurrent requests, but considering I can only have 2 concurrent requests per organization (https://msdn.microsoft.com/en-au/library/jj863631.aspx#limitations), it would only cut the time in half. That is still not a viable time.
For each new custom CRM process created I need to load at most 5000 entity records, in less than 10 minutes.
What can I do to improve the performance of this load? Where should I be looking at?
Would a DataImport (https://msdn.microsoft.com/en-us/library/hh547396.aspx) be faster than ExecuteMultipleRequest?
I only really have suggestions for this; you would probably have to experiment and investigate to see what works for you.
Can you run your middleware application in a physical location closer to your CRM Online site?
ExecuteMultipleRequest supports much larger batch sizes, up to 1,000 (see the sketch after these suggestions).
Have you compared this to just using single execute requests?
Do you have lots of processes (workflows, plugins) that fire in CRM while the data import is running? This can have a big performance impact. Perhaps these can be disabled during the import, e.g. you could pre-process the data before import so a plugin wouldn't need to be executed.
The concurrent requests limitation only applies to ExecuteMultipleRequest; have you tried running lots of parallel single execute requests?
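To illustrate the larger-batch-size suggestion, here is a minimal sketch (my own code, not your middleware) that pushes the creates through ExecuteMultipleRequest in batches of up to 1,000, with ReturnResponses switched off to cut response overhead:

using System.Collections.Generic;
using System.Linq;
using Microsoft.Xrm.Sdk;
using Microsoft.Xrm.Sdk.Messages;

static void BulkCreate(IOrganizationService service, IList<Entity> records, int batchSize = 1000)
{
    for (int i = 0; i < records.Count; i += batchSize)
    {
        var request = new ExecuteMultipleRequest
        {
            Settings = new ExecuteMultipleSettings
            {
                ContinueOnError = true,   // one bad record doesn't abort the whole batch
                ReturnResponses = false   // skip per-request responses to reduce payload
            },
            Requests = new OrganizationRequestCollection()
        };

        foreach (Entity record in records.Skip(i).Take(batchSize))
            request.Requests.Add(new CreateRequest { Target = record });

        var response = (ExecuteMultipleResponse)service.Execute(request);
        // with ReturnResponses = true you could inspect response.Responses / response.IsFaulted here
    }
}

The same loop can be run from a couple of parallel workers if you want to use the two concurrent ExecuteMultipleRequest slots mentioned in the question.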

(ASP.NET) How would you go about creating a real-time counter which tracks database changes?

Here is the issue.
A site I've recently taken over tracks "miles" you ran in a day. A user can log into the site and add that they ran 5 miles; this is then added to the database.
At the end of the day, around 1 am, a service runs which calculates all the miles all the users ran that day and outputs a text file to App_Data. That text file is then displayed in Flash on the home page.
I think this is kind of ridiculous. I was told they had to do this due to massive performance issues. They won't tell me exactly how they were doing it before or what the major performance issue was.
So what approach would you guys take? The first thing that popped into my mind was a web service which gets the data via an AJAX call. Perhaps every time a new "mile" entry is added, a trigger is fired and updates the "GlobalMiles" table.
I'd appreciate any info or tips on this.
Thanks so much!
Answering this question is a bit difficult since we don't know all of your requirements and something clearly didn't work before. So here are some different ideas.
First, revisit your assumptions. Generating a static report once a day is a perfectly valid solution if all you need is daily reports. Why hit the database multiple times throughout the day if all that's needed is a snapshot (for instance, lots of blog software used to write HTML files when a post was published rather than serving the entry from the database each time -- many still do as an optimization)? Is the "real-time" feature something you are adding?
I wouldn't jump to AJAX right away. Use the same input method, just move the report from static to dynamic. Doing too much at once is a good way to get yourself buried. When changing existing code I try to find areas I can change in isolation with the least amount of impact on the rest of the application. Once you have the dynamic report you can add AJAX (and please use progressive enhancement).
As for the dynamic report itself you have a few options.
Of course you can just SELECT SUM(), but it sounds like that would cause the performance problems if each user has a large number of entries.
If your database supports it, I would look at using an indexed view (sometimes called a materialized view). It allows fast updates to the real-time sum data:
CREATE VIEW vw_Miles WITH SCHEMABINDING AS
SELECT SUM([Count]) AS TotalMiles,
       COUNT_BIG(*) AS [EntryCount],
       UserId
FROM dbo.Miles      -- SCHEMABINDING requires two-part (schema-qualified) names
GROUP BY UserId
GO
CREATE UNIQUE CLUSTERED INDEX ix_Miles ON vw_Miles(UserId)
If the overhead of that is too much, @jn29098's solution is a good one: roll it up using a scheduled task. If there are a lot of entries for each user, you could add only the delta since the last time the task ran:
UPDATE GlobalMiles SET [TotalMiles] = [TotalMiles] +
(SELECT SUM([Count])
FROM Miles
WHERE UserId = @id
AND EntryDate > @lastTaskRun
GROUP BY UserId)
WHERE UserId = @id
If you don't care about storing the individual entries but only the total you can update the count on the fly:
UPDATE Miles SET [Count] = [Count] + @newCount WHERE UserId = @id
You could use this method in conjunction with the sproc that adds the entry and have the best of both worlds.
Finally, your trigger method would work as well. It's an alternative to the indexed view where you do the update yourself on a table instead of SQL Server doing it automatically. It's also similar to the previous option in that you move the global update out of the sproc and into a trigger.
The last three options make it more difficult to handle the situation when an entry is removed, although if that's not a feature of your application then you may not need to worry about that.
Now that you've got materialized, real-time data in your database, you can dynamically generate your report. Then you can add the fancy AJAX on top.
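As a rough illustration of the dynamic-report step, here is a minimal sketch of an endpoint that reads the indexed view above; the ASP.NET Web API controller and the "MilesDb" connection string name are assumptions for the example, not part of your existing site:

using System;
using System.Configuration;
using System.Data.SqlClient;
using System.Web.Http;

public class MilesController : ApiController
{
    [HttpGet]
    public long TotalMiles()
    {
        string connectionString =
            ConfigurationManager.ConnectionStrings["MilesDb"].ConnectionString;

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(
            // NOEXPAND makes non-Enterprise editions read the view's clustered index
            "SELECT ISNULL(SUM(TotalMiles), 0) FROM dbo.vw_Miles WITH (NOEXPAND)",
            connection))
        {
            connection.Open();
            return Convert.ToInt64(command.ExecuteScalar());
        }
    }
}

The home page can then poll this endpoint with a small AJAX call instead of reading the nightly text file.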
If they are truly having performance issues due to too many hits on the database, then I suggest you take all the input and cram it into a message queue (MSMQ). Then you can have a service on the other end that picks up the messages and does a bulk insert of the data. This way you have fewer DB hits. You can output the text file on that update too.
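A minimal sketch of that queue-and-drain idea, assuming a private MSMQ queue and a made-up MileEntry DTO (both names are hypothetical):

using System;
using System.Collections.Generic;
using System.Messaging; // classic MSMQ API: add a reference to System.Messaging.dll

public class MileEntry
{
    public int UserId { get; set; }
    public decimal Miles { get; set; }
    public DateTime LoggedAt { get; set; }
}

public static class MilesQueue
{
    private const string QueuePath = @".\Private$\MilesEntries"; // hypothetical queue name

    private static MessageQueue Open()
    {
        return MessageQueue.Exists(QueuePath)
            ? new MessageQueue(QueuePath)
            : MessageQueue.Create(QueuePath);
    }

    // Web app: enqueue the entry instead of hitting the database on every request.
    public static void Enqueue(MileEntry entry)
    {
        using (var queue = Open())
        {
            queue.Send(entry); // default XmlMessageFormatter serializes the DTO
        }
    }

    // Background service: drain up to maxBatch messages, then bulk-insert the result.
    public static List<MileEntry> Drain(int maxBatch = 500)
    {
        var batch = new List<MileEntry>();
        using (var queue = Open())
        {
            queue.Formatter = new XmlMessageFormatter(new[] { typeof(MileEntry) });
            while (batch.Count < maxBatch)
            {
                try
                {
                    using (Message message = queue.Receive(TimeSpan.FromSeconds(1)))
                    {
                        batch.Add((MileEntry)message.Body);
                    }
                }
                catch (MessageQueueException ex)
                    when (ex.MessageQueueErrorCode == MessageQueueErrorCode.IOTimeout)
                {
                    break; // queue is empty for now
                }
            }
        }
        return batch; // hand this off to SqlBulkCopy or a set-based INSERT
    }
}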
I would create a summary table, rolled up once an hour or nightly, that calculates the total miles run. For individual requests you could pull from the summary table plus any miles logged between the last rollup and the time the user views the page, to get the total for that user.
How many users are you talking about and how many log records per day?
