Multi-platform social networking application development architecture [closed] - performance

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
I was thinking about social media applications like Facebook or LinkedIn. I have read lots of articles on sites like http://highscalability.com/ and didn't find a clear answer.
The biggest apps today use their own custom systems: custom file systems, customized DB engines, customized web servers. They don't use stock IIS, Apache, MSSQL, MySQL, Windows, or Linux, and they use many programming languages for different problems. That's fine for them because of their load; they have to account for every bit. They started in small environments, hit problems, saw bottlenecks, and so they built new solutions.
Now we can find articles about their current systems, but there is no answer about the best way to start.
I need to learn the answer to: "What kind of architecture is a correct start?"
I have some ideas about it, but we need to be sure.
We are thinking of the following:
Use MySQL for the relational database, with a caching mechanism like memcached in front of it, and a REST API as the business layer, coded in Python. All of this would run on a suitable Linux distro. Once that environment is in place, we can use any language or system for the UIs: a PHP site for the web, or a native application for iOS or Android.
We need your advice. Thank you so much.
(I am a long-time reader, but this is my first question. I hope that's no problem.)

Following a similar question last year, I compiled the techniques and technologies used by some of the larger social networking sites.
The following architecture concepts are prevalent among such sites:
Scalability
Caching (heavily, across multiple tiers and layers)
Data sharding (preferably by data-locality criteria)
In-Memory DBs for often referenced data
Efficient wire-level protocols (as opposed to what an enterprise typically considers state of the art)
asynchronous processing
Flexibility
service oriented architecture as a baseline principle
decoupled and layered components
asynchronous processing
Reliability
asynchronous processing
replication
cell architecture (independently operated subsets, e.g. by geographical criteria)
NB: If you start a new site, you are unlikely to have the kind of scaling or reliability requirements that these extremely large sites face. Hence the best advice is to start small but keep it flexible. One approach is to use an application framework that starts out simple but has flexibility to scale later, e.g. Django.
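For instance, the data-sharding point above often begins as nothing more than a hash-based routing function keyed on a locality criterion such as the user id. A minimal sketch, with hypothetical shard names:

```python
import hashlib

# Hypothetical shard map; in production these would be database hosts.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]

def shard_for(key):
    # Hash the routing key so all rows for one user land on one shard.
    # That is the data-locality idea: related data stays together.
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:4], "big") % len(SHARDS)]
```

Because `shard_for("user:42")` always returns the same host, a user's profile, posts, and friends can be co-located and fetched without cross-shard queries.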

Related

Why isn't Vert.x being more used? [closed]

Closed 3 years ago.
[This is not a dev-related question.]
I have been using Vert.x since 2017 and I think the framework is great. It performs better than Spring Boot: more requests per second and lower CPU usage.
It works great for event-driven programming and concurrent applications.
However, I don't see the community growing. Does anyone know what cons are keeping developers away from Vert.x? I'm about to start a new application and I'm worried that Vert.x is dying.
Disclaimer: I work for Red Hat in the Vert.x core team.
Thanks for sharing your good experience with Vert.x.
There is no secret sauce behind community growth: you need marketing money and a dedicated evangelist team. Vert.x has neither of these BUT:
rest assured the project is not dead (we're releasing 4.0 in the coming months, and Vert.x has become the network engine for Quarkus)
the community is still very strong and vibrant (users help each other on the forum, and significant features are actually contributions)
for a few years now Red Hat has offered commercial support
Rome wasn't built in a day: I first heard about Spring a few months after starting my career in IT 15 years ago...
I think that over the past 20 years (maybe more), the most widely used technologies have been those that let the developer stop thinking for himself and churn out a large number of features as quickly as possible.
In other words, it's mainly the frameworks that handle everything for you: JSF and Struts, which hid frontend complexity from backend devs who were not qualified for it; Spring, which hides all the problems of exposure and resiliency behind a mountain of annotations and abstraction layers. We could observe the same thing in the PHP world with Zend, Symfony, Laravel and the like, and lately we can say the same for frontend devs with Angular.
Using a toolkit like Vert.x, in my opinion, and even if we find it simple, requires a better understanding of what we're doing. We need to be aware of the reactor pattern, the asynchronous paradigm, reactive programming, single-threaded and concurrent programming, and so on. We need to stop designing standard blocking RESTful APIs to solve every issue. We need better control of communication issues and failover across our microservices. Even though toolkits like Akka, Vert.x, Quarkus and Micronaut have put a lot of effort into good documentation, industrialization tools, and surrounding libraries that handle many things for you, there is still an entrance ticket that management sometimes considers (wrongly, in my view) an obstacle to production.
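Vert.x itself is JVM-based, but the reactor idea described above can be sketched with Python's asyncio: a single event loop, handlers that must never block, and many requests interleaving on one thread.

```python
import asyncio

async def handle_request(req_id):
    # Simulates non-blocking I/O (a DB call, an HTTP call, ...);
    # a blocking call here would stall every other request on the loop.
    await asyncio.sleep(0.01)
    return "response-%d" % req_id

async def main():
    # Five "requests" run concurrently on one thread, not five threads.
    return await asyncio.gather(*(handle_request(i) for i in range(5)))

results = asyncio.run(main())
```

The whole batch completes in roughly one 10 ms wait rather than five sequential ones, which is the performance argument for the reactor pattern in a nutshell.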
Finally, I think that when a toolkit seems to answer your need exactly, and when there is a strong community behind it (it doesn't have to be the biggest, but it should be made up of available experts and great OSS companies like Red Hat), you shouldn't wait to give it a try. It's often a better answer than big frameworks that pack too many things into the same box.

What are the benefits of developing a chatbot using MXNet rather than API.AI (DialogFlow), LUIS, WIT.AI or any other AI framework? [closed]

Closed 5 years ago.
I want to develop my own chatbot for my retail store project. I have checked different frameworks like API.AI (DialogFlow), LUIS, WIT.AI and the Watson virtual agent, but I also came across MXNet. If I develop my own chatbot using MXNet, what will be the advantages over the ready-made APIs discussed above?
MXNet is a deep learning framework which can do general model training and inference. What API.AI, Amazon Lex, WIT.AI, etc. do is provide a platform that builds on such training and inference, but each is itself a separate engine, not a deep learning framework.
API.AI, for instance, offers dialog and context constructs, which allow a conversation to take place while filling in data slots as the conversation proceeds, but this is out of the scope of a deep learning engine. The chatbot platform will utilize deep learning engines (and their models) for its subtasks such as speech recognition and conversion of spoken/written text to canonical form.
Advantages of MXNet over the existing frameworks
The MXNet deep learning framework can be used to implement, train, and deploy deep neural networks that solve text categorization and sentiment analysis problems.
**Improved handling of synonyms, hypernyms, and hyponyms**
Suppose a user asks for a soda, but your chatbot only knows specific terms such as Coca-Cola or Pepsi, which are hyponyms of soda. Hypernyms, synonyms, and hyponyms can be handled in English because there are many NLP resources, called thesauri and ontologies, but they usually cover only general language. Therefore Coca-Cola, a very specific domain term, is unlikely to be part of this kind of resource.
You could try to find an existing thesaurus that fits your problem, or build one on your own. Resources built by domain experts are expensive but highly accurate. With machine learning, particularly deep learning techniques, you can create linguistic (language-based) resources that could be good enough for your use case.
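As a toy version of what such a resource looks like, a hand-built domain thesaurus mapping hyponyms to their hypernym might be sketched as follows (all terms are illustrative, standing in for the learned resource described above):

```python
# Specific product terms (hyponyms) map to the concept (hypernym)
# the bot actually understands.
HYPERNYMS = {
    "coca-cola": "soda",
    "pepsi": "soda",
    "sprite": "soda",
    "espresso": "coffee",
    "latte": "coffee",
}

def normalize(token):
    # Unknown tokens pass through unchanged, lowercased.
    return HYPERNYMS.get(token.lower(), token.lower())

def normalize_utterance(text):
    # Map every token of a user utterance to its canonical concept.
    return [normalize(t) for t in text.split()]
```

With this in place, "a Pepsi please" reaches the intent classifier as a request for "soda", which is exactly the gap a domain-specific resource fills.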
Final Conclusion
If we build a chatbot from scratch using MXNet, we need machine learning experience, resources, and time. It's open source, so we can't get immediate support either. The alternative is to use a combination of a tool that solves the general NLP problems (e.g. Dialogflow, Wit.ai, IBM Watson agent assist) plus custom server-side logic for the more powerful features.

Ultimate way to host web applications [closed]

Closed 9 years ago.
I know this topic was widely discussed all over the Internet, but as an amateur in these matters I dare to ask my question.
I am looking for a flexible solution to host my web applications. Flexible means it would be sufficient for any kind of project: a small website or a Reddit-scale giant.
So far, I was thinking this way: I start with shared hosting, then my website gets more and more popular so I buy a VPS, then a dedicated server, and then people finally notice how amazing my idea is and I have to build my own datacenter.
Currently I am at the stage of moving my web apps from several shared hosts to a VPS, and it got me thinking: is there a better, more comfortable solution? I do not want to keep moving to another host, constantly upgrading my hosting plan, and still worrying about performance.
But there are services like Amazon AWS (and some others) which promise a comfortable solution to the problem. They say I will be charged only for what I use and that, with auto-scaling enabled, my apps will grow with (almost) no limits. Most importantly, I would not have to worry about building my own infrastructure.
Of course, a service like this still requires management, but as far as I can tell that would be a task for a rather small team or even one person. (Right?)
Reddit is a great example of what I am talking about. One of the most popular websites on the Internet is 100% hosted on AWS.
So my question is: Is service like AWS the dream platform for entrepreneurs who are willing to create "something big"? Is it the most flexible and the most comfortable solution there is? What are disadvantages I do not see?
In my experience a dedicated server gives you a lot of wiggle room. I use one for big web apps that do a lot of data crunching, but also for small websites.
If you have a decent multi-CPU server with plenty of RAM, large high-speed SATA 3 SSDs, and a fiber-optic network connection, your server will be blazing away.
If you use a Linux-based server, it becomes easy to cluster machines.
Then, when you need even more speed, you can put the SQL server on its own dedicated machine and you will have doubled your capacity instantly.
SQL servers cluster easily across multiple machines, so there will be room to grow, and Linux servers also cluster easily (at least for people who know what they are doing).
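A minimal sketch of the read/write split that moving SQL to its own box enables; the host names and routing rule here are hypothetical, and a real deployment would use an actual database driver plus replication-aware routing:

```python
import random

PRIMARY = "db-primary.internal"                    # hypothetical hosts
REPLICAS = ["db-replica-1.internal", "db-replica-2.internal"]

def pick_host(sql):
    # Crude read/write split: SELECTs fan out over replicas,
    # everything else (INSERT/UPDATE/DELETE/DDL) goes to the primary.
    if sql.lstrip().lower().startswith("select"):
        return random.choice(REPLICAS)
    return PRIMARY
```

Adding a replica then scales read capacity by editing one list, which is the "room to grow" being described.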
I definitely recommend having your own dedicated server where you decide how it grows, rather than being stuck on someone else's turf and getting pinched on speed when a high-paying customer needs premium speed.
Just my 2 cents

Best practices and literature for web application load testing [closed]

Closed 6 years ago.
As a web developer I've been asked (a couple of times in my career) about the performance of sites that we've built.
Sometimes you'll get semi-vague questions like "will the site continue to perform well, even during product launch week?", "can the site handle a million users?", and even "how is the site doing?"
Of course, these questions are very legitimate, and I have always tried to answer these questions to the best of my ability, using a combination of
historic data (google analytics / IIS logs)
web load test tools
server performance counters
experience
gut feeling
common sense
a little help from our sysadmins
my personal understanding of the software architecture in question
I have usually been able to come up with reasonable answers to these questions.
However, web app performance can be influenced by many things (database dependencies, caching strategies, concurrency issues, user behaviour, etcetera).
I'm a programmer and not a statistician, and my approach to this problem has always felt deeply unscientific. So I did a little more research... and all of my Google results seem to focus on tools and features and metrics (and MORE metrics), when I am really looking for a way to make sense of these things.
The question:
What are some good resources (books?) on best practices for web load testing that will help a developer answer these types of questions?
First, your question proves you understand the problem. Creating the tools, scripts, etc. to generate the load can be tricky enough, but the real challenge lies in evaluating the results and deciding what to monitor.
A very easy answer to your question: generate load on a production-like environment that matches current or expected usage. If it runs OK without any crashes or slow performance, that is usually good enough. After that, increase the load to see where your limits are.
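That ramp-up idea can be sketched as a small load generator. Here `request` is a hypothetical stand-in callable; in a real test it would issue an HTTP request against the production-like environment.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_load(request, users, requests_per_user):
    # Simulate `users` concurrent users, each issuing a series of
    # requests, and report simple latency percentiles.
    latencies = []

    def one_user():
        for _ in range(requests_per_user):
            start = time.perf_counter()
            request()
            latencies.append(time.perf_counter() - start)

    with ThreadPoolExecutor(max_workers=users) as pool:
        for _ in range(users):
            pool.submit(one_user)

    latencies.sort()
    return {
        "count": len(latencies),
        "p50": latencies[len(latencies) // 2],
        "p95": latencies[int(len(latencies) * 0.95) - 1],
    }
```

Increase `users` step by step and watch where p95 starts climbing; that knee is the limit the answer talks about.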
When you reach your limit, in my experience it becomes purely a project-budget question: will we invest more time/money/resources to evaluate the cause?
I work as a test professional and I recommend treating load testing as a vital part of the development process, but unfortunately that is not always in line with what management decides.
So the answer to your question is that almost everyone needs to be involved in this process:
developers to monitor their code; system admins to monitor CPU, memory usage, etc.; DBAs; networking guys; and so on. They all probably need their own sources of knowledge to get all this information recorded and analysed.
A few book tips:
The Art of Application Performance Testing: Help for Programmers and Quality Assurance - http://www.amazon.com/exec/obidos/ASIN/0596520662/
The Art of Capacity Planning: Scaling Web Resources - http://www.amazon.com/exec/obidos/ASIN/0596518579/
Performance Testing Guidance for Web Applications - http://www.amazon.com/exec/obidos/ASIN/0735625700/
Have you seen:
Performance Testing Guidance for Web Applications by J.D. Meier, Carlos Farre, Prashant Bansode, Scott Barber, and Dennis Rea
It's even available on the web for free.
http://msdn.microsoft.com/en-us/library/bb924375.aspx
You could emulate typical user behaviour and use one of the cloud services to simulate a huge number of users on the website, to see how well it copes with them. I hear Amazon's service is decent.
I can recommend two books published in 2010:
The first is "ASP.NET Site Performance Secrets" by Matt Perdeck, published in late fall 2010. It is written more from a performance-optimization standpoint, but also has detailed material on load testing. It is available as a free PDF ebook.
The second is ".NET Performance Testing and Optimization - The Complete Guide" by Paul Glavich and Chris Farrell. It is a pretty complete source on performance and load testing.

How do you decide between different emerging technologies? [closed]

Closed 8 years ago.
I'm facing developing a new web app in the future and I'm wondering how to decide which framework to use. I've settled on Python as my language of choice, but there are still many frameworks to choose from! More generally, how do you choose between different similar technologies that are still in the works, as the latest round of web frameworks are? I'm curious what your process is for deciding on technologies you've never used.
Recognize that no choice is perfect -- or even very good.
No matter what you choose, someone will have a suggestion that -- they claim -- is better.
No matter what you choose, some part of your tech stack will fail to live up to your expectations.
The most important thing is "shared nothing" so that the components can be replaced.
After that, the next most important thing is automatically generated features that reduce or prevent programming.
Look at Django. Lots of automatic admin features make life very pleasant.
There are a number of things you can do:
Download the frameworks and build something similar with them for comparison.
Look for comparisons by other people, but attempt to understand the bias of the reviewer.
Observe the community at work; see what people are building and the issues they run into when using the technology. Forums, blogs, mailing lists, etc. are good places to check.
Go to conferences and meet like-minded developers.
You can also take the approach of using stable versions rather than alpha bits; after a while you might move closer to the bleeding edge. People associated with the project in question are generally more biased than those approaching it from other platforms, so be careful whom you trust.
Consider the impact of using a bleeding-edge framework versus an established one. Sometimes it's important to your customers that you are on one perceived as stable; at other times this doesn't matter. How comfortable are you with fixing the framework itself? Great developers will learn the internals, or at least know enough to keep things moving while a bug report is sent to the framework mailing list.
Consider some general best practices in building abstractions and reusable code on the Python platform. You may be able to save yourself some work when moving to another platform. However, don't be a reuse junkie, as this can limit how effectively you use the framework. The 37signals guys are right when they talk about extracting frameworks from working code rather than building frameworks from scratch.
I know this is an old posting, but I am in a similar situation (again) and I think there are other people who may want to look for different opinions, and hear of (somewhat) successful experiences.
Since baudtack mentioned Python, I will try to answer this along the lines of my experiences using Python. Here is what has been working for me:
determine the scope of your project - outlining what your application is supposed to do, without introducing any programming or design notes, will clarify your goals greatly
determine how you would like to work with your code, stack and data:
a. what sort of programming paradigm do you want to work with? i.e. object-oriented, functional, etc. Do you want to play to your own programming style or follow somebody else's?
b. use the semantic web or not? Do you want greater control over URIs and their design? (I found web.py great for this, by the way; it is my choice for creating REST APIs in Python.)
c. do you want to be trapped by framework requirements, or do you want a better separation of the application from the web component, i.e. use a framework to expose your application as a set of modules? My problem with Django was that I ended up not programming Python but having to learn more Django than I needed. If that works for you, then that is the way to go.
d. data stores... some sort of SQL vs. non-RDBMS (XML databases like eXist-db with full XQuery support) vs. OODBMS vs. a combination of the above? How complicated do you need this to be? How much control/separation do you need over how data gets stored and recalled in your application?
e. testing: unit tests... thank goodness for Python! If your web app has the potential to grow (as they often do), having a sane and coherent testing platform to begin with will help a lot in the future. I wish I had learned about this earlier on; oh well, better late than never.
f. how much control over the server do you need? Hosting considerations? How much control over an Apache instance? OS-specific needs? I found that using shared hosting providers like WebFaction has been great, but I eventually needed more flexibility and bandwidth. In other words, what can you get for your budget? If you have USD 50 to spend each month, it may be better to consider a virtual hosting solution like Linode.
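On the unit-testing point in item e, even a trivial `unittest` case for a hypothetical web-app helper shows how cheap it is to start a sane testing platform early:

```python
import unittest

def slugify(title):
    # Hypothetical web-app helper: "Hello World" -> "hello-world".
    return "-".join(title.lower().split())

class SlugifyTest(unittest.TestCase):
    def test_basic(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_collapses_whitespace(self):
        self.assertEqual(slugify("  A   B "), "a-b")
```

Running `python -m unittest` picks such cases up automatically, so the habit costs almost nothing at the start and pays off as the app grows.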
Finally, I echo S.Lott's sentiment that no solution is perfect, and all are subject to obsolescence.
Experience trumps hearsay. I've found that prototyping is a huge help: make a prototype that uses the features you expect to be the most important for the various frameworks. This helps root out any features that may not work "as advertised."
In general though, kudos for being willing to look at new technologies.
I have a set of criteria in different categories:
Activity & Documentation
Is there an active user base?
Is there an active development base?
Is the support responsive and information accessible?
Are there user and development guides and reference material?
These are essential; there needs to be traceability on all of these to build confidence in the solution.
Ease of use
Are basic features easy and complex features possible? I typically give a new framework a test drive and try to roll out a set of use cases to see how intuitive the framework is to use.
Is installation intuitive and simple for a local/dev installation and production deployment?
How is it backed up and upgraded?
What is the effort and UX for implementing a "Hello World" type blog post, static page, menu item, and plugin?
How are versions dealt with for the core & plugins?
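One way to make criteria like these concrete is a small weighted scoring matrix. The frameworks, weights, and 1-5 scores below are purely illustrative, not a real evaluation:

```python
# Illustrative weights for the criteria categories above.
WEIGHTS = {"community": 3, "docs": 2, "ease_of_use": 3, "upgrades": 1}

# Hypothetical scores from a hands-on test drive of each candidate.
CANDIDATES = {
    "framework-a": {"community": 4, "docs": 3, "ease_of_use": 5, "upgrades": 3},
    "framework-b": {"community": 5, "docs": 5, "ease_of_use": 3, "upgrades": 4},
}

def score(scores):
    # Weighted sum across all criteria.
    return sum(weight * scores[name] for name, weight in WEIGHTS.items())

ranked = sorted(CANDIDATES, key=lambda c: score(CANDIDATES[c]), reverse=True)
```

Writing the weights down forces you to state what you value before the test drive, which keeps the comparison honest.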
Example (on the topic of Automated Testing/Continuous Integration solutions)
Several years ago I evaluated several automated-testing solutions. At the time Jenkins and TeamCity were the front-runners, and in the end I chose TeamCity because of the UX, the active user and development base, and the quality of the accessible documentation.
Example (CMS for a blog)
These criteria are also why I prefer to use WordPress over other options. While WordPress has its shortcomings, its user and development base is strong and active, which leads to a software architecture with more potential to evolve over time and maintain its relevance, and to a development community that provides quality plugins and themes to choose from.
