Worth caching a dynamic comments system?

I plan to build a website where users can leave comments (like Digg, Facebook, etc.). I am wondering whether it is worth caching comments.
Do users usually read more comments than they write?
In your experience, what is a typical read/write ratio for dynamic comments?

A properly designed comments component will have the same interface whether you cache comments or not, and caching should be trivial to retrofit if you implement it sensibly.
Start without caching and measure the performance of your actual application. If it's a bottleneck, then add caching and measure the difference.
Don't waste your effort until you know you have a problem.
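To illustrate the "same interface" point, here is a minimal sketch. The class and method names (`CommentStore`, `DbCommentStore`, `CachedCommentStore`, `list_comments`, `add_comment`) are made up for this example, and the "database" is just an in-memory dict standing in for whatever storage you actually use.

```python
from typing import Dict, List, Protocol


class CommentStore(Protocol):
    """The interface callers depend on (hypothetical names)."""

    def list_comments(self, post_id: int) -> List[str]: ...
    def add_comment(self, post_id: int, text: str) -> None: ...


class DbCommentStore:
    """Stand-in for the real database layer (an in-memory dict here)."""

    def __init__(self) -> None:
        self._rows: Dict[int, List[str]] = {}
        self.reads = 0  # counts how often the "database" is read

    def list_comments(self, post_id: int) -> List[str]:
        self.reads += 1
        return list(self._rows.get(post_id, []))

    def add_comment(self, post_id: int, text: str) -> None:
        self._rows.setdefault(post_id, []).append(text)


class CachedCommentStore:
    """Wraps any CommentStore; same interface, so callers do not change."""

    def __init__(self, inner: CommentStore) -> None:
        self._inner = inner
        self._cache: Dict[int, List[str]] = {}

    def list_comments(self, post_id: int) -> List[str]:
        if post_id not in self._cache:
            self._cache[post_id] = self._inner.list_comments(post_id)
        return list(self._cache[post_id])

    def add_comment(self, post_id: int, text: str) -> None:
        self._inner.add_comment(post_id, text)
        self._cache.pop(post_id, None)  # invalidate the cached list on write
```

Because both classes expose the same two methods, you can ship `DbCommentStore` first and wrap it in `CachedCommentStore` later only if measurements say you need to.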

Related

instantaneous language translator

I am developing a new application for iPhone. The app must support two languages: French and Flemish.
If I store the same data in both languages in my database, that will be data redundancy, which is not what a database is meant for, right?
So I am thinking about an instantaneous translator. For example, the default language and the data in the DB are in French; if the user chooses Flemish, all the data retrieved from the database (in French) is translated into Flemish before being shown to the user.
Is this a good approach? If yes, is there a translator in the iOS SDK? Is it the optimal solution?
Waiting for your suggestions. Thanks in advance.
To add to Dr.Kameleon's answer, I'd advise you to store both languages in your database. The same content in 2 languages is different content. But I'd also advise you to have a proper, manual translation, and not use automated translation for any professional grade app.
Why don't you try some service like Google Translator with an API publicly available?
Hint: I don't think Google's service is still open for the public (obviously because of extensive abuse, but I think Altavista had something like an alternative)
UPDATE :
Google Translate API v2 (paid service only, as far as I know...)
Bing Translation API (seemingly free)
Not (personally) tested :
Mygengo Translation API
Speaklite Translate API
WebServiceX Translate API
And an example script to access Altavista's BabelFish translation service :
http://code.activestate.com/recipes/64937-babelizer-api-for-simple-access-to-babelfishaltavi/
It depends on what you're optimizing on. Storing the information twice isn't as bad an idea as it might at first appear. There are many cases where it can be worthwhile to have redundant information in a database for computational efficiency, for example, and this may well be one of them.
The major cost of storing the data in both languages is that... well, you're storing the data in both languages. This means that you'll take about twice as much space to store your text blocks. If you have enough text that storage space is actually an issue for you, then that's obviously a concern. If you don't, it's really not.
On the other side, there are a few benefits to storing both.
Accuracy. No automatic translator is going to be as good at coming up with quality translations as a reasonably competent human translator. Of course, if you aren't hiring a human translator, and are just depending on machine translation anyway, then that's not so much of an issue.
Speed. Autotranslation isn't entirely trivial in processing time for large documents. CPU cycles spent on translation are cycles not spent on other things, and because those cycles must be spent between request and response, it'll make your latency worse regardless. If you have plenty of CPU cycles and the text blocks you're putting out are relatively small, that's less of an issue.
Security and Reliability. If you are intending to use an outside service to run these translations for you, suddenly your service is dependent on that service to run, and any time you go outside for anything, you're opening up a potential security hole or two (how bad those holes are depends on how you're doing it, but they'll be there). Alternatively, if you're intending to run the translation in-house, you have to keep a translation service up and running, which may not involve security problems, but will involve additional maintenance.
So... while it's possible that your case is one where you'll want to save it in only one language (particularly if you have a lot of text overall to deal with, it comes out in small chunks, and you don't care all that much about the user experience of your Flemish-speaking users), it's also quite possible that it's not.
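If you do store both languages, one common layout is a separate text table keyed by item and language, so nothing else about the item is duplicated. The sketch below uses SQLite from Python only to keep it self-contained; the table names (`item`, `item_text`), the language codes (`fr`, `nl`) and the `get_text` helper are assumptions for illustration, not something from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (id INTEGER PRIMARY KEY);
CREATE TABLE item_text (
    item_id INTEGER NOT NULL REFERENCES item(id),
    lang    TEXT    NOT NULL,   -- assumed codes: 'fr' (French), 'nl' (Flemish)
    body    TEXT    NOT NULL,
    PRIMARY KEY (item_id, lang)
);
""")
conn.execute("INSERT INTO item (id) VALUES (1)")
conn.executemany(
    "INSERT INTO item_text (item_id, lang, body) VALUES (?, ?, ?)",
    [(1, "fr", "Bonjour"), (1, "nl", "Goedendag")],
)


def get_text(conn, item_id, lang, fallback="fr"):
    """Return the text in the requested language, else in the fallback language."""
    for code in (lang, fallback):
        row = conn.execute(
            "SELECT body FROM item_text WHERE item_id = ? AND lang = ?",
            (item_id, code),
        ).fetchone()
        if row:
            return row[0]
    return None


print(get_text(conn, 1, "nl"))
```

The fallback means a missing translation degrades to French instead of showing nothing, which is handy while the human translations are still being written.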

Website performance certification

My client is looking for performance audit, something similar to yearly security audit.
Are there any reputed services or vendors that measure and analyze a given website performance and more importantly certify the performance data.
My client's intent is to share such data with future customers.
I suggest using a waterfall graph to show the performance. I have used WebPageTest and am pretty happy with it. It is also credible because a lot of big companies use it.
Here is a sample run for SO: http://www.webpagetest.org/result/111031_H2_21NJR/1/details/
So, for example, time to first byte was 200 ms. This means that the browser can't start rendering anything until at least 200 ms have passed. Keeping it below 800 ms is generally a good idea.
If you are looking at companies that do this kind of performance test, I would be cautious, because they will all say yes and then just go to a similar website and say, "Here is your performance analysis."

When is it too late to optimize for performance?

I know that you shouldn't optimize too early, and that you should aim for maintainability instead. My question is, at what point is it too late?
I'm working on a website similar to Yahoo Answers, and my database structure is exactly what I feel it should be: tables for users, questions, answers, question_comments, answer_comments, etc.
My question is, IF the site were to grow, how would this architecture scale? I'm thinking of putting both questions and answers in a single table (posts), separating them by type, and then putting both question_comments and answer_comments in the same table (comments). I believe this is similar to Stack Overflow's DB schema.
I know what you guys are gonna say: "Don't worry about it until it becomes an actual problem." But wouldn't it be a little too late to worry about it then?
Thanks
The reason why it's a bad practice to optimize early is you don't know where your bottlenecks will be until your website sees a significant amount of traffic. How your users access and interact with your site is an unknown at this point.
It's almost always best to start with a 'good' architecture (normalized database, MVC architecture, DRY, well-written frontend code, etc) and go from there. It will be much easier to scale a clean, organized architecture than one that was prematurely optimized.
At best, right now you can do some load testing with ab (ApacheBench) or another load testing tool to see where your current bottlenecks are. It certainly won't find all of them, but it will find some.
If you're really worried about this (and you shouldn't be yet), install Nagios or Munin on your server to monitor performance. Use a third-party tool to measure page load time daily. Once you start seeing issues, then you can profile and tune.
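If you just want a feel for what a load test reports before picking a tool, here is a rough sketch in Python. It is not a replacement for ab; the function names (`timed_call`, `load_test`, `fake_request`) are made up, and `fake_request` only sleeps, so you would swap in a real HTTP call against your own site.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(request_fn):
    """Run one request and return how long it took, in seconds."""
    start = time.perf_counter()
    request_fn()
    return time.perf_counter() - start


def load_test(request_fn, total_requests=50, concurrency=5):
    """Issue total_requests calls using `concurrency` worker threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(
            pool.map(lambda _: timed_call(request_fn), range(total_requests))
        )
    latencies.sort()
    return {
        "count": len(latencies),
        "median_s": statistics.median(latencies),
        "p95_s": latencies[int(0.95 * (len(latencies) - 1))],
        "max_s": latencies[-1],
    }


def fake_request():
    # Placeholder: replace with a real request to the page you want to test.
    time.sleep(0.01)


if __name__ == "__main__":
    print(load_test(fake_request, total_requests=20, concurrency=4))
```

Looking at the median next to the 95th percentile and the maximum is usually more telling than an average, because a few very slow requests are what users notice.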
You absolutely should optimize if a fast service is a fundamental requirement of the application.
If sub-second responses are not a requirement, then you can write clean code and optimize later.
A good example of this was JavaScript before the latest generation of browsers: people who wrote nice, clean, extensible JS for their pages had terrible performance and had to start from scratch.
One huge table is generally harder to maintain. People usually cut their tables into partitions and even their databases into shards.
I don't see how putting all comments into the same table would save you a join. Really, putting questions and answers into the same table won't save you a join either; you'll just be joining the table to itself.
If you want to save on joins, I'd expect you to use a document-oriented NoSQL database, such as MongoDB. That's where you can store a question with all related answers and comments in a single 'record', fetchable with one operation.
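To make the single 'record' idea concrete, this is roughly what such a document could look like, written as a plain Python dict so it runs without any database. The field names and contents are invented for the example; in a document store the whole thing would come back from one lookup by `_id`.

```python
import json

# Hypothetical shape of one question document with everything nested inside.
question_doc = {
    "_id": 42,
    "title": "When is it too late to optimize?",
    "user": "alice",
    "comments": [
        {"user": "bob", "text": "Good question."},
    ],
    "answers": [
        {
            "user": "carol",
            "text": "Measure first.",
            "comments": [
                {"user": "dave", "text": "Agreed."},
            ],
        },
    ],
}


def count_comments(doc):
    """Count comments on the question plus comments on each answer."""
    own = len(doc.get("comments", []))
    nested = sum(len(a.get("comments", [])) for a in doc.get("answers", []))
    return own + nested


print(json.dumps(question_doc, indent=2))
print(count_comments(question_doc))
```

The trade-off is that anything cutting across documents (for example, "all comments by dave") gets harder than it is with a relational comments table.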
Databases need to be designed with performance in mind, rather than waiting until you have a problem later. Avoiding premature optimization doesn't mean ignoring performance in the design; it means not getting ridiculously excessive about it. However, there are known performance killers for every database backend, and it is foolish to design around one of those when a different technique would be faster and take the same amount of time to code if you are familiar with it. So before designing any database, read up on performance tuning, and you will never write database code the same way again.

How to represent data in an efficient way ? (Graphically Talking)

Before you read further, just to let you know: this question is vague and does not need one precise answer. On the contrary, the more answers I get, the better it will be for me.
The question is: how do you represent data in an efficient way?
I am not talking about representing data in a database or in any language.
I am talking about when a program, a report, or a page needs to be shown to a user (static, like reports, or dynamic, like web pages): how should one represent the data so that the user picks up as much information as possible from (almost) the first look? Are there any best practices, pitfalls to avoid, and so on?
Edit: Any books or links that can help or that deal with this subject are welcome.
"how one should represent the data in order to the user to catch as many information as possible from - almost - the first look."
To me, this screams that you need to be speaking to your end users more. My suggestion would be to mock up the initial layout using something like Balsamiq Mockups (this can be done even if it's a public-facing site). Using the mockups will help you visualise the design of the overall page.
"First-look" type views indicate a dashboard, which provides overall, high-level results.
Now, just to be clear, this is about the design and layout of the page; don't confuse it with web UI tools, e.g. jQuery UI, that bring fancy effects to the page.
In terms of links, my suggestion would be to read thoroughly through Designing User Interfaces For Business Web Applications from Smashing Magazine (incl. the related links). The one that is probably most relevant is 12 Standard Screen Patterns.
It is a brilliant read and should be, IMO, added to your saved bookmarks.
Effectiveness always matters more than efficiency. Before I give my opinion, I will assume that your question already starts from a solution that is effective from the user's perspective.
First, data retrieval is about the storage of the computer system. If your data can reside entirely in the fastest storage (such as main memory), keeping it there is a better strategy than any other. But performance problems mostly arise because there is not enough main memory, so data has to be retrieved from secondary storage (the slower kind), replacing other data in main memory, to produce what you want. So you have to deal with multi-level storage systems.
Second, when you are dealing with multi-level storage systems (as in most computer systems), efficiency depends on how much you can reduce accesses to secondary storage. It's not only about the gain from loading data from slower storage into faster storage; there is also a cost when other data gets evicted.
In XML, DOM and SAX are the two extremes of dealing with multi-level storage systems. In database systems, fully cached indexes are a good solution for performance (when the indexes are small enough). In operating systems, the file cache has always been one of the most challenging problems in computer science.
You can precalculate some data before it is required. You can use more efficient data structures to improve data retrieval. You can bluntly allocate more main memory to your application. You can... well, buy more memory modules or an SSD. Whatever solution you choose, it's definitely an art of fusion in computer science.
Algorithms, data structures, database systems, operating systems, even compiler theory: these hard metals can help you build a sword which kicks the dragon's ass.

How can I sell non-functional testing tool to my company?

I need to present a performance testing tool to my company's management team. Some of them think performance testing is not necessary for us because our customers never request it or give us performance requirements.
However, one of our current big projects has run into performance problems: response times are very long, and the server goes down when it handles many concurrent users.
I think I need to prepare to present its benefits, both tangible and intangible. Does anyone have experience with performance testing tools? How can they improve your productivity?
Management cares about money. Show them how your tool will save them money and you will get their approval. Everything else is usually trivial to them.
Expanding on what @LWoodyiii had to say: when presenting a case for anything, whether that be hiring more people or investing in a performance testing tool (or outsourcing your performance testing, for that matter), it needs to be presented in terms of money saved. By doing a little legwork, you should be able to back into the $ saved amount.
If you had never had any performance problems, then it would be more difficult to quantify $ saved. But in your case, it should be a little easier to figure this out, as you have already had some significant performance issues. You should be able to put a $ amount to your existing performance problem. You should be able to quantify the revenue lost (lost transactions, lost customers, decreased transaction throughput, etc...) due to the degradation of service. You can also factor in costs associated with fixing and resolving the performance issue. Then it is a matter of comparing the costs of having performance problems vs implementing a performance testing program (tool, training and resource costs).
It also probably would not hurt to spice up the presentation with some anecdotal performance horror stories that were well publicized in the news and how much those outages cost those firms.
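The comparison described above is simple arithmetic, and it can help to lay it out explicitly before the presentation. Every figure below is a made-up placeholder (hours of downtime, revenue per hour, hourly rate, license and training costs); plug in your own numbers from the project that had the problem.

```python
def cost_of_incident(hours_down, revenue_per_hour, fix_hours, hourly_rate):
    """Lost revenue during the outage plus the engineering time spent fixing it."""
    return hours_down * revenue_per_hour + fix_hours * hourly_rate


def cost_of_testing(tool_license, training, engineer_hours, hourly_rate):
    """Tool, training, and staff time for a performance testing program."""
    return tool_license + training + engineer_hours * hourly_rate


# Hypothetical figures, for illustration only.
incident = cost_of_incident(
    hours_down=8, revenue_per_hour=2000, fix_hours=120, hourly_rate=60
)
program = cost_of_testing(
    tool_license=5000, training=2000, engineer_hours=80, hourly_rate=60
)

print(f"Cost of one incident:    ${incident}")
print(f"Cost of testing program: ${program}")
print(f"Net saved if the program prevents one incident: ${incident - program}")
```

With these placeholder numbers a single avoided incident already pays for the program; the point for management is that the incident side is based on something that actually happened to you.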
It sounds like you haven't used profilers yourself. That would be a good start. You didn't mention your environment, but Red Gate makes a wonderful profiler for .NET.
http://www.red-gate.com/products/ants_performance_profiler/index.htm
Whatever environment you're in, you can probably find a decent profiler with a trial period. Use the trial period to profile your app and get to know how profilers work and how they can help make your app better.
One thing to demonstrate about productivity is how they can let you focus on the biggest bottlenecks and have the most impact on improving performance with the least effort. With a good profiler you won't bother optimizing code that is already performant.
Of course, if your company really doesn't care about performance, they won't want you doing any optimization anyway. There are lots of companies like this, and it stinks.
I think performance is one of those things that is really difficult to present to someone else, especially management. You need a clear and simple way to show its usefulness.
I have experience with profilers like JBuilder and YourKit, but not with other performance tools. However, I think the numbers they show are not sufficient on their own to demonstrate their usefulness.
If you can build a nice practical example, that would be great. Show the same case in both scenarios. If you can show that the old version's response time is long and that, after the performance improvement, the same operation takes much less time, then that is a good way to prove your claim.
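A before/after demo of that kind can be very small. The sketch below is only an illustration in Python: the two functions (`find_common_slow`, `find_common_fast`) and the data sizes are invented, and in your presentation you would time a real operation from your own project instead.

```python
import timeit


def find_common_slow(a, b):
    """Before: membership test against a list scans the list every time."""
    return [x for x in a if x in b]


def find_common_fast(a, b):
    """After: build a set once, so each membership test is a hash lookup."""
    b_set = set(b)
    return [x for x in a if x in b_set]


a = list(range(2000))
b = list(range(1000, 3000))

# Same inputs, same result; only the elapsed time differs.
slow = timeit.timeit(lambda: find_common_slow(a, b), number=5)
fast = timeit.timeit(lambda: find_common_fast(a, b), number=5)

print(f"before: {slow:.4f}s  after: {fast:.4f}s  speedup: {slow / fast:.1f}x")
```

Showing that both versions return the same result, side by side with the two timings, tends to land better with a non-technical audience than raw profiler output.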
