Design approach for feeds - algorithm

We have feeds running between external systems and our system that bring in investment data. These feeds run every 15 minutes. Each time a feed runs successfully, we update its LastRun timestamp column. To force a feed to run, we set that feed's LastRun timestamp to NULL.
I am working on a new workflow that will let my users create investments in our own system. Once the investment is created in the original external system, the feed will bring it in, and I will link it to the investment I created. While linking, I will force-run the investment-related feeds to pull in the rest of the investment data.
The issue I am having is: what if a feed is already running when I set its LastRun timestamp to NULL? It will not know that the linking has happened; it will simply overwrite the LastRun timestamp when it finishes and be on its way. Any solution to this?
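For concreteness, the force-run mechanism described above might look something like this (SQL Server syntax; the Feeds table and FeedName column are invented names, not the real schema):

-- Force the investments feed to run on its next 15-minute cycle
UPDATE Feeds SET LastRun = NULL WHERE FeedName = 'Investments';

-- The race: if the feed is already mid-run, it finishes and executes
-- UPDATE Feeds SET LastRun = GETDATE() WHERE FeedName = 'Investments';
-- which silently overwrites the NULL and cancels the forced run.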

One thing you can do is create a table with id, status and dt_created columns, where you record each new investment created in your system and set the status flag to 'no'. When the feed runs, it checks the status flag: if it is 'no', the feed processes that investment and, after running, updates the flag to 'yes'.
Hope this solves your problem.
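A minimal sketch of that table and the feed-side check, assuming SQL Server and invented names (PendingInvestments, InvestmentId):

-- Queue of investments created locally that still need feed processing
CREATE TABLE PendingInvestments (
    Id           INT IDENTITY PRIMARY KEY,
    InvestmentId INT NOT NULL,
    Status       CHAR(1) NOT NULL DEFAULT 'N',   -- 'N' = not processed, 'Y' = processed
    DtCreated    DATETIME NOT NULL DEFAULT GETDATE()
);

-- At the start of each run the feed picks up the unprocessed rows...
SELECT Id, InvestmentId FROM PendingInvestments WHERE Status = 'N';

-- ...and only after the run completes marks them as done
-- (in practice mark just the Ids picked up above, so rows inserted mid-run are not skipped)
UPDATE PendingInvestments SET Status = 'Y' WHERE Status = 'N';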

Related

can't find what I want in logminer, in fact, can't find anything recent

I am the sysadmin for a school, so I'm an IT generalist, jack of all trades, master of none, right? Our student information system runs on top of Oracle 11g. Would like to know how to use logminer to find out, at the very least, when something was changed in the database that shouldn't have been changed.
I have configured a test server to play with, so rest your mind, our production system isn't at risk while I play here.
The server is Windows. I go to a command prompt, type sqlplus / as sysdba.
Execute DBMS_LOGMNR.ADD_LOGFILE blah, blah multiple times to add the log files.
alter session set NLS_DATE_FORMAT = 'mm-dd-yyyy HH24:mi:ss'; so the time stamps tell me more than just the date.
Then I go to the application on my test server and make a change to a student demographic record. I want to find this change using logminer.
I do a select timestamp,sql_undo from V$LOGMNR_CONTENTS WHERE TIMESTAMP > TO_DATE('04-11-2013 11:59:00'); (I made the change just now, around 3 pm)
I get no rows.
If I do the same thing, but with a time just after midnight, I get thousands of rows, as the app has routines that kick off at midnight doing maintenance, like recalculating students' class ranks, for instance.
So why am I not finding the change I made logged? I believe I'm looking in the right log files, or I wouldn't see the activity at midnight.
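For reference, a complete LogMiner session of the kind described above usually looks roughly like this; the redo log file names below are placeholders, and the dictionary option may differ in your setup:

-- Add the redo/archive logs to mine (repeat ADDFILE for each additional file)
EXECUTE DBMS_LOGMNR.ADD_LOGFILE(LOGFILENAME => 'C:\oracle\redo\REDO01.LOG', OPTIONS => DBMS_LOGMNR.NEW);
EXECUTE DBMS_LOGMNR.ADD_LOGFILE(LOGFILENAME => 'C:\oracle\redo\REDO02.LOG', OPTIONS => DBMS_LOGMNR.ADDFILE);

-- Start the session; without a dictionary option the SQL comes back with
-- object numbers instead of table and column names
EXECUTE DBMS_LOGMNR.START_LOGMNR(OPTIONS => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG);

ALTER SESSION SET NLS_DATE_FORMAT = 'mm-dd-yyyy HH24:mi:ss';

SELECT timestamp, seg_name, sql_undo
  FROM V$LOGMNR_CONTENTS
 WHERE timestamp > TO_DATE('04-11-2013 11:59:00');

-- When finished
EXECUTE DBMS_LOGMNR.END_LOGMNR;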
Though your latest entry is recorded, it won't appear in V$LOGMNR_CONTENTS until a sufficient number of updates has accumulated. For example, if you do 100 updates, you may only see 80; to flush out the remaining 20, you need some more updates so that they show up. We had a similar problem where LogMiner was not showing the latest updates, especially when there were very few of them. We had to create a dummy table and generate updates against it regularly so that LogMiner was always actively showing the latest changes and nothing sat in the buffer. In our use case creating the dummy table was fine; I am not sure whether it is in yours.
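If you want to try that dummy-table workaround, a minimal version (names made up) could be as simple as:

-- Throwaway table whose only job is to generate regular redo activity
CREATE TABLE logminer_heartbeat (id NUMBER, beat_time DATE);
INSERT INTO logminer_heartbeat VALUES (1, SYSDATE);
COMMIT;

-- Run this periodically (e.g. from a DBMS_SCHEDULER job) so there is always
-- fresh activity pushing earlier changes through to V$LOGMNR_CONTENTS
UPDATE logminer_heartbeat SET beat_time = SYSDATE WHERE id = 1;
COMMIT;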

Locking records returned by context? Or perhaps a change to my approach

I'm not sure whether I need a way to lock records returned by the context or simply need a new approach.
Here's the story. We currently have a small number of apps that integrate with our CRM. Some of them open an XrmServiceContext and return a few thousand records to perform updates. These scripts call SaveChanges along the way, but there will still be accounts near the end that are saved a couple of minutes after the context returned them. If a user updates a record during this window, their changes are overwritten by the script.
Is there a way of locking the records until the context has saved the update back or is there a better approach I should be taking?
Kit
In my opinion, this type of database transaction issue is what CRM currently lacks the most. There is no way to ensure that someone else doesn't monkey with your data; it's always a last-one-in-wins world in CRM.
That said, my suggestion would be to update only the attributes you care about. If you're returning all columns for an entity, then when you update that entity you're potentially going to update all of its attributes, even if you only changed one of them.
If you're dealing with a system where you can't tolerate the last-one-in-wins mentality, then you're probably better off not using CRM.
Update 1
CRM 2015 SP1 and above support Optimistic Updates, which allow the use of a version number to ensure that no one has updated the record since you retrieved it.
You have several options here; it just depends on what you want to do. First of all, though, if you can move some of these automated processes to off-hours, that's the best option.
Another option would be to retrieve each record one by one instead of 1000+ at a time.
If you are only updating a percentage of the records retrieved, then you would be better off checking before saving whether an update has occurred (by comparing the modified date). If the modified date has changed, do a single retrieve of the fresh record and then save.
As a first thought, I would create a field or status that indicates a pending operation and then use JScript in the form OnLoad event to warn the user or lock the form. When your process completes, it could clear the flag.

How to model data planning

I want to build a data model which supports:
Data history - store every change of data. This is not a problem: http://en.wikibooks.org/wiki/Java_Persistence/Advanced_Topics#History
Data planning - the user should be able to prepare a record whose validity starts sometime in the future (for example, I know that a customer's name changes in May, so I prepare a record valid from the 1st of May).
How can I do point 2?
How can I do these things together (points 1 & 2)?
If you really need point 2 - and I would think very hard about this, because in my experience users will never use it, and you will spend a lot of effort supporting something no one ever uses - then:
Make no changes at all directly in the table. All changes go through history.
Behind the scenes, you periodically run a batch updater. It goes through the history, finds all unapplied changes (set a status flag in the history so you can find them quickly), checks each one's date to make sure it is due, and applies it.
You are going to have to deal with merges. What if the user says their name changes in one month, then goes in and changes their name effective today? You have a conflict. How do you resolve it? You can either prevent any immediate changes until past ones are done (or at least require all new changes to have a date after the last unapplied one), or you can change it now and change it again in a month.
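A rough sketch of that approach, assuming SQL Server and invented Customer/CustomerHistory names: every change lands in the history table with a validity date and a status flag, and a scheduled job applies whatever has come due.

-- Every change is written here first; nothing touches Customer directly
CREATE TABLE CustomerHistory (
    HistoryId  INT IDENTITY PRIMARY KEY,
    CustomerId INT NOT NULL,
    Name       NVARCHAR(200) NOT NULL,
    ValidFrom  DATETIME NOT NULL,       -- e.g. 1st of May for the planned name change
    Applied    BIT NOT NULL DEFAULT 0   -- status flag: 0 = not yet applied
);

-- Batch updater, run periodically: apply every change that has come due
-- (if one customer has several unapplied rows, you hit the merge problem described above)
UPDATE c
   SET c.Name = h.Name
  FROM Customer c
  JOIN CustomerHistory h ON h.CustomerId = c.CustomerId
 WHERE h.Applied = 0
   AND h.ValidFrom <= GETDATE();

UPDATE CustomerHistory
   SET Applied = 1
 WHERE Applied = 0
   AND ValidFrom <= GETDATE();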
Storing the history of data changes can be handled in the background. Look into data warehousing and slowly changing dimensions (http://en.wikipedia.org/wiki/Slowly_changing_dimension), using a stored procedure to insert new records and expire their predecessors, which become the "expired records". Once you have allowed for SCD, it's quite easy to find the historic expired records you're after.

VS2010 Database Project Deployment, to fail if data loss may occur or not?

I have a database project for a web app, and currently I have it configured to fail if data loss may occur during deployment. I feel safer this way. However I've run into a problem. I actually need to deploy changes on some things where I'm okay with the possible data loss, i.e. shortening column lengths where nothing would actually get deleted, but the system thinks it would.
I have 2 questions.
The first is this: other than enabling or disabling the catch-all go/no-go option, is there any way to have more granular control over this process, i.e. to specify columns that are okay to drop or shorten?
The second is: how do you guys handle these situations? Initially I had hoped that adding a pre-deployment script to drop the columns would be sufficient; however, it seems to catch drops etc. in those files as well.
No, there isn't any way to control it at a more granular level, unfortunately.
I disable it when I know I'll be deploying something that will cause data loss but is what I want, then re-enable it afterwards. I would also always check the change script that comes out when deploying to production.
Just update the column in a pre-deployment script to the truncated length?
E.g. to truncate mycol to 20 characters:
UPDATE mytable
SET mycol = LEFT(mycol, 20)
WHERE mycol <> LEFT(mycol, 20)
The Microsoft guidance is to move the data out into a temporary table in pre-deployment, let the deployment engine run its check (which passes because the table is now empty) and upgrade the schema, then move the data back in a post-deployment script.
For more information, see Barclay Hill's posts on the subject:
Managing data motion during your deployments (Part 1)
Managing data motion during your deployments (Part 2)
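That data-motion pattern looks roughly like this, with mytable/mycol (and an id column) standing in for your real objects; the pre-deployment script parks the rows so the engine sees an empty table and alters the schema without complaint, and the post-deployment script puts the data back:

-- Pre-deployment script
SELECT * INTO tmp_mytable FROM mytable;
DELETE FROM mytable;

-- ...the deployment engine then alters mytable, e.g. shortens mycol to varchar(20)...

-- Post-deployment script
INSERT INTO mytable (id, mycol)
SELECT id, LEFT(mycol, 20) FROM tmp_mytable;
DROP TABLE tmp_mytable;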

(ASP.NET) How would you go about creating a real-time counter which tracks database changes?

Here is the issue.
A site I've recently taken over tracks the "miles" you ran in a day. A user can log into the site and add that they ran 5 miles; this is then added to the database.
At the end of the day, around 1 am, a service runs which calculates all the miles all the users ran that day and outputs a text file to App_Data. That text file is then displayed in Flash on the home page.
I think this is kind of ridiculous. I was told they had to do this due to massive performance issues. They won't tell me exactly how they were doing it before or what the major performance issue was.
So what approach would you guys take? The first thing that popped into my mind was a web service which gets the data via an AJAX call. Perhaps every time a new "mile" entry is added, a trigger is fired and updates the "GlobalMiles" table.
I'd appreciate any info or tips on this.
Thanks so much!
Answering this question is a bit difficult since we don't know all of your requirements or what exactly didn't work before. So here are some different ideas.
First, revisit your assumptions. Generating a static report once a day is a perfectly valid solution if all you need is daily reports. Why hit the database multiple times throughout the day if all that's needed is a snapshot? (For instance, lots of blog software used to write HTML files when a post was published rather than serving up the entry from the database each time; many still do as an optimization.) Is the "real-time" feature something you are adding?
I wouldn't jump to AJAX right away. Use the same input method; just move the report from static to dynamic. Doing too much at once is a good way to get yourself buried. When changing existing code, I try to find areas that I can change in isolation with the least amount of impact on the rest of the application. Once you have the dynamic report, you can add AJAX (and please use progressive enhancement).
As for the dynamic report itself you have a few options.
Of course you can just SELECT SUM(), but it sounds like that would cause the performance problems if each user has a large number of entries.
If your database supports it, I would look at using an indexed view (sometimes called a materialized view). It keeps the real-time sum data updated automatically and cheaply as entries come in:
-- Indexed (materialized) view: SQL Server maintains the per-user totals
-- automatically as rows are inserted into dbo.Miles
CREATE VIEW dbo.vw_Miles WITH SCHEMABINDING AS
    SELECT UserId,
           SUM([Count]) AS TotalMiles,
           COUNT_BIG(*) AS [EntryCount]  -- required for an indexed view with GROUP BY
      FROM dbo.Miles                     -- two-part name required by SCHEMABINDING
     GROUP BY UserId
GO
-- The unique clustered index is what actually materializes the view
CREATE UNIQUE CLUSTERED INDEX ix_Miles ON dbo.vw_Miles (UserId)
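Reading from it is then cheap. One caveat I believe applies here: outside Enterprise edition, SQL Server will generally only use the view's index if you ask for it with the NOEXPAND hint:

-- Per-user total
SELECT TotalMiles FROM dbo.vw_Miles WITH (NOEXPAND) WHERE UserId = @id;

-- Site-wide total for the home page counter
SELECT SUM(TotalMiles) AS SiteTotal FROM dbo.vw_Miles WITH (NOEXPAND);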
If the overhead of that is too much, @jn29098's solution is a good one: roll it up using a scheduled task. If there are a lot of entries for each user, you could add only the delta since the last time the task ran.
-- Scheduled task: add only the miles logged since the last run
UPDATE GlobalMiles
   SET [TotalMiles] = [TotalMiles] +
       ISNULL((SELECT SUM([Count])
                 FROM Miles
                WHERE UserId = @id
                  AND EntryDate > @lastTaskRun), 0)
 WHERE UserId = @id
If you don't care about storing the individual entries but only the total you can update the count on the fly:
UPDATE Miles SET [Count] = [Count] + @newCount WHERE UserId = @id
You could use this method in conjunction with the sproc that adds the entry and have the best of both worlds.
Finally, your trigger method would work as well. It's an alternative to the indexed view where you do the update yourself on a table instead of SQL Server doing it automatically. It's also similar to the previous option in that you move the global update out of the sproc and into a trigger.
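If you go the trigger route, a minimal sketch (assuming the same Miles table and a one-row-per-user GlobalMiles table) might look like:

CREATE TRIGGER trg_Miles_Insert ON dbo.Miles
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    -- Add the newly inserted miles to each affected user's running total
    UPDATE g
       SET g.TotalMiles = g.TotalMiles + i.NewMiles
      FROM dbo.GlobalMiles g
      JOIN (SELECT UserId, SUM([Count]) AS NewMiles
              FROM inserted
             GROUP BY UserId) i ON i.UserId = g.UserId;
END;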
The last three options make it more difficult to handle the situation when an entry is removed, although if that's not a feature of your application then you may not need to worry about that.
Now that you have materialized, real-time data in your database, you can generate your report dynamically, and then add the fancy AJAX on top.
If they are truly having performance issues due to too many hits on the database, then I suggest that you take all the input and cram it into a message queue (MSMQ). Then you can have a service on the other end that picks up the messages and does a bulk insert of the data. This way you have fewer DB hits. You could also output the text file as part of that update.
I would create a summary table that's rolled up once/hour or nightly which calculates total miles run. For individual requests you could pull from the nightly summary table plus any additional logged miles for the period between the last rollup calculation and when the user views the page to get the total for that user.
How many users are you talking about and how many log records per day?
