While doing a POC around microservices architecture, one of the challenges I need to explain is how to obtain reporting data from the different services in an efficient way.
I would appreciate being pointed in the right direction.
If the data spans multiple microservices, then it depends on the business use case. In my opinion there are a couple of ways to do it:
Approach 1: query the microservices' databases (not a preferred approach)
If your microservices are not very load intensive, you can query the data from all of the services' databases at off-peak times and insert the records into your warehouse database. This is not a preferred approach, since you are still putting additional load on the services, but it is the easiest. The reporting data also may not be real time.
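A minimal sketch of such an off-peak pull, assuming plain JDBC, a hypothetical orders service database, and a hypothetical fact_orders warehouse table (all names and URLs are placeholders):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Hypothetical nightly job: copy yesterday's orders from a service database
// into a reporting/warehouse table. URLs, credentials and table names are assumptions.
public class NightlyOrderExport {
    public static void main(String[] args) throws Exception {
        try (Connection src = DriverManager.getConnection(
                 "jdbc:postgresql://orders-db:5432/orders", "report_ro", "secret");
             Connection dwh = DriverManager.getConnection(
                 "jdbc:postgresql://warehouse-db:5432/dwh", "etl", "secret")) {

            PreparedStatement read = src.prepareStatement(
                "SELECT id, customer_id, total, created_at FROM orders " +
                "WHERE created_at >= current_date - 1 AND created_at < current_date");
            PreparedStatement write = dwh.prepareStatement(
                "INSERT INTO fact_orders (order_id, customer_id, total, created_at) " +
                "VALUES (?, ?, ?, ?)");

            try (ResultSet rs = read.executeQuery()) {
                while (rs.next()) {
                    write.setLong(1, rs.getLong("id"));
                    write.setLong(2, rs.getLong("customer_id"));
                    write.setBigDecimal(3, rs.getBigDecimal("total"));
                    write.setTimestamp(4, rs.getTimestamp("created_at"));
                    write.addBatch();
                }
            }
            // A single batched insert keeps the extra load on both databases short-lived.
            write.executeBatch();
        }
    }
}
```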
Approach 2: event sourcing / CQRS
This is the preferred approach, since your write and read models are completely separate. In brief, the way it works is that the events generated by your different microservices also update your read models, called materialized views. If you have a requirement that the reporting data be near real time, this is the way to go. You can shape your reporting model however you like, and you can build multiple reporting models from the same events. It is a more complex approach and requires the application to be designed accordingly, but the benefits are considerable. You may want to read more about Event Sourcing and CQRS if you are interested.
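As a rough, framework-free illustration of the read side: an event handler keeps a denormalized reporting view up to date as events arrive. The OrderPlaced event and SalesByRegionView names below are hypothetical:

```java
import java.math.BigDecimal;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical event published by an "orders" microservice.
record OrderPlaced(String orderId, String region, BigDecimal total) {}

// Denormalized read model (materialized view) shaped purely for reporting.
class SalesByRegionView {
    private final Map<String, BigDecimal> totals = new ConcurrentHashMap<>();

    // Called by whatever delivers the events (message broker consumer, event store subscription, etc.).
    public void on(OrderPlaced event) {
        totals.merge(event.region(), event.total(), BigDecimal::add);
    }

    public BigDecimal totalFor(String region) {
        return totals.getOrDefault(region, BigDecimal.ZERO);
    }
}
```

In a real system the view would usually live in its own reporting database rather than in memory, but the shape is the same: events in, a query-friendly model out.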
Approach 3: read-only replicas
If you are using cloud services, you can create read-only replicas of your databases and use them for reporting. This is a widely accepted approach, since you are not impacting the transactional databases, but it may be expensive because you are paying for additional databases.
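If the replica is exposed as a separate endpoint, the application side can be as simple as keeping two connection sources and sending reporting queries to the replica. The endpoints and credentials below are placeholders:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical endpoints: the primary handles transactions, while the read-only
// replica serves reporting queries so the transactional database is not impacted.
public class ConnectionRouter {
    private static final String PRIMARY = "jdbc:mysql://orders-primary:3306/orders";
    private static final String REPLICA = "jdbc:mysql://orders-replica:3306/orders";

    public static Connection forWrites() throws SQLException {
        return DriverManager.getConnection(PRIMARY, "app", "secret");
    }

    public static Connection forReporting() throws SQLException {
        return DriverManager.getConnection(REPLICA, "report", "secret");
    }
}
```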
It could be while creating a table, or while running other operations such as inserts, updates, and deletes on a table.
I understand that options like BloomFilter and BlockCache can have an impact, but I would like to know about other techniques that improve overall throughput. Also, can anyone show how to add a BloomFilter to an HBase table? I'd like to try it for practice.
Any help is appreciated.
Your question is too general. In order to know how to properly build your data store in HBase, you should understand its internal storage logic and how data is distributed across regions; that is probably the place to start. I would recommend getting acquainted with the LSM-tree and how HBase implements it in this article. After that, I would advise reading about proper data schema design here, as it plays the main role in your performance. A correct schema with a good row key keeps your data evenly distributed across the nodes and helps you avoid hotspotting. Then you can start looking at optimization techniques such as bloom filters, BlockCache, and custom secondary indexes.
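To answer the concrete part of the question (adding a bloom filter), here is a sketch using the HBase 2.x Java client API; the table and column family names are placeholders:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
import org.apache.hadoop.hbase.regionserver.BloomType;
import org.apache.hadoop.hbase.util.Bytes;

// Creates a table whose column family uses a ROW bloom filter and the block cache.
// "survey_responses" and "d" are just example names.
public class CreateTableWithBloomFilter {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin()) {

            TableDescriptorBuilder table =
                TableDescriptorBuilder.newBuilder(TableName.valueOf("survey_responses"));

            table.setColumnFamily(
                ColumnFamilyDescriptorBuilder.newBuilder(Bytes.toBytes("d"))
                    .setBloomFilterType(BloomType.ROW)   // BloomType.ROWCOL is the other option
                    .setBlockCacheEnabled(true)
                    .build());

            admin.createTable(table.build());
        }
    }
}
```

The same setting can also be applied from the HBase shell via the BLOOMFILTER attribute on the column family.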
I am in the process of creating a survey engine that will store millions of responses to various large surveys.
There will be multiple agencies, each with 10-100 users, and each will be able to administer a 3000+ question survey.
If each agency were to have hundreds of thousands of sessions, each with 3000+ responses, I'm thinking Hadoop would be a good candidate for pulling the sessions and their response data and running various analyses on them (aggregations, etc.).
The sessions, survey questions, and responses are all currently held in a SQL database. I was thinking I would keep that and also store the data in parallel, so that when a new session is taken under an agency it is appended to the Hadoop 'file', and when the entire dataset is called up it is included.
Would this implementation work well with Hadoop, or am I still well within the limits of a relational database?
I don't think anyone is going to be able to tell you definitively yes or no here, and I'm not sure I fully grasp what your program will be doing from the wording of the question. In general, though, Hadoop MapReduce excels at batch processing huge volumes of data; it is not meant to be an interactive (i.e. real-time) tool. So if your system:
1) Will be running scheduled jobs to analyze survey results, generate trends, summarize data, etc., then yes, M/R would be a good fit for this (see the sketch after this list).
2) Will allow users to search through surveys by specifying what they are interested in and get reports in real time based on their input, then no, M/R would probably not be the best tool. You might want to take a look at HBase. Hive is a query-based tool, though I haven't used it yet and I'm not sure how "real-time" it can get. Also, Drill is an up-and-coming project that looks promising for interactively querying big data.
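For case 1, a rough sketch of what such a scheduled batch job might look like in classic Hadoop MapReduce, assuming the responses are exported as one sessionId,questionId,answer line per response (a made-up format for illustration):

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Batch aggregation: how many responses were recorded for each question.
public class ResponsesPerQuestion {

    public static class ResponseMapper
            extends Mapper<LongWritable, Text, Text, LongWritable> {
        private static final LongWritable ONE = new LongWritable(1);
        private final Text questionId = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Assumed line format: sessionId,questionId,answer
            String[] fields = value.toString().split(",", 3);
            if (fields.length == 3) {
                questionId.set(fields[1]);
                context.write(questionId, ONE);
            }
        }
    }

    public static class SumReducer
            extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text key, Iterable<LongWritable> values, Context context)
                throws IOException, InterruptedException {
            long sum = 0;
            for (LongWritable v : values) {
                sum += v.get();
            }
            context.write(key, new LongWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "responses-per-question");
        job.setJarByClass(ResponsesPerQuestion.class);
        job.setMapperClass(ResponseMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```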
I have an existing Oracle 11g DB with a high-transaction-volume application running on it. I have another application (a CMS), and I am not sure whether, performance-wise, it makes sense to reuse the existing Oracle DB or go with a separate database on another physical machine. The two apps share no common data.
My question is: does Oracle 11g (Enterprise) have features which would allow two entirely separate data sets to be accessed simultaneously, with the only performance limitation being the physical/virtual server resources available?
This question doesn't apply because my data sets are completely unrelated (and they're on MySQL). I checked out Oracle's suggestions for application performance, but this paper doesn't address optimizing performance for separate applications with separate data sets running on the same database.
The direct answer to your question is: no, Oracle doesn't have features to do that kind of separation if you don't consider any kind of change to your infrastructure.
As far as I can see, your options, with Oracle, would be:
1) Single instance.
1.1) Just one node (your case now, right?). Oracle Enterprise scales by adding nodes, so this option won't scale, and the two schemas/data sets in the same database will get in each other's way.
1.2) Add more nodes. You can add more nodes to share load (using RAC). Administration would be more complex and licensing costs would go up. But in this case, scalability is only limited by your budget.
2) Two separate instances on separate machines. Equivalent to using a new database in MySQL (setting aside the differences in capabilities and pricing).
MySQL is inferior to Oracle in many ways but clearly superior in setup costs. I'm not so sure about maintenance/development costs.
Four years ago I built a webapp which is still used by some friends. The problem with that app is that it now has a huge database and loads very slowly. I know that is my own fault: MySQL queries are mixed in all over the place (even during layout generation).
At the moment I know a bit about OO. I'd like to use this knowledge in my old app, but I don't know how to do it without rewriting everything from the beginning. Using MVC for my app is very difficult at this point.
If you were in my place, or if you had the task of improving the speed of my old app, how would you do it? Do you have any tips for me? Any working scenarios?
It all depends on context. The ideal would be to change the entire application, introducing best practices and standards at once, but it is probably better to adopt an evolutionary approach:
1 - Identify the major bottlenecks in the application using a profiling tool or load test.
2 - Estimate the effort required to refactor each item.
3 - Identify the pages whose performance matters most to the end user.
4 - Based on that information, create a task list and set the priority of each item.
Attack one problem at a time, making small increments, and always try to spend 80% of your time solving the 20% most critical problems.
Hard to give specific advice without a specific question, but here are some general optimization/organization techniques:
Profile to find hot spots in your code
You mention MySQL queries being slow to load; try to optimize them
Possibly move database access to stored procedures to help modularize your code
Look for repeated code and try to move it into objects one piece at a time (a small sketch of this follows)
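As an illustration of the last point, the idea is to pull the same query out of every page that needs it and put it behind one object, so it can later be optimized (index, cache, stored procedure) in a single place. This is a Java sketch purely to show the shape; the original app may well be PHP, and the table and method names are invented:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// One object owns the query that used to be copy-pasted across pages.
class UserRepository {
    private final Connection connection;

    UserRepository(Connection connection) {
        this.connection = connection;
    }

    List<String> findActiveUserNames() throws SQLException {
        List<String> names = new ArrayList<>();
        try (PreparedStatement stmt = connection.prepareStatement(
                 "SELECT name FROM users WHERE active = 1 ORDER BY name");
             ResultSet rs = stmt.executeQuery()) {
            while (rs.next()) {
                names.add(rs.getString("name"));
            }
        }
        return names;
    }
}
```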
I often see the phrases 'business logic' and 'application logic' in the context of web development (I assume they also apply to programming in general rather than just web development).
This is quite new to me, so I don't really know what it means. Could anyone please explain what exactly is meant by these terms? Are they just buzzwords used by programmers?
Say you write a system which solves a business need for a customer.
The sum of all your code is the application logic, or system architecture - basically the entirety of the system you're building.
The business logic is the subset of code which models and drives actual business processes: "What happens when an order for Product X is placed? How is the cost of Product Y calculated?" I.e. the bits of code where you probably need some input from the customer/domain expert/project stakeholder.
Ideally, the business logic is separated into its own tier or layer (see the Wikipedia article on N-tier architecture). The rest of the code can often simply be thought of as infrastructure to help that business logic execute (database wrapper, helper functions, service facades, external integration, GUI, etc).
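A tiny sketch of that separation (all class names and the discount rule here are invented): the pricing rule is business logic a stakeholder would confirm, while the service and repository around it are application/infrastructure code.

```java
import java.math.BigDecimal;

// Business logic: the rule a domain expert would confirm or correct.
class PricingPolicy {
    // "Orders of 10 items or more get a 5% discount" - a business rule.
    BigDecimal priceFor(int quantity, BigDecimal unitPrice) {
        BigDecimal total = unitPrice.multiply(BigDecimal.valueOf(quantity));
        if (quantity >= 10) {
            total = total.multiply(new BigDecimal("0.95"));
        }
        return total;
    }
}

// Application/infrastructure logic: wiring, persistence, delivery.
// It decides how the result is stored and communicated, not what it is.
interface OrderRepository {
    void save(String orderId, BigDecimal total);
}

class OrderService {
    private final PricingPolicy pricing = new PricingPolicy();
    private final OrderRepository repository;

    OrderService(OrderRepository repository) {
        this.repository = repository;
    }

    void placeOrder(String orderId, int quantity, BigDecimal unitPrice) {
        BigDecimal total = pricing.priceFor(quantity, unitPrice);
        repository.save(orderId, total); // infrastructure concern
    }
}
```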
Business logic is basically the rules of the system according to the functional specification. For example, Object A of type B must have attributes C and D, but not E.
Application Logic is more of a technical specification, like using Java servlets and OJB to persist to an Oracle database.
In the end, they are buzzwords that help describe the tiers of technology in an application, hopefully in an effort to keep the various tiers separated and produce a better application design.
It might not be very accurate, but I use the following reasoning to determine whether something is application logic, business logic, or something else: