Multi-Language Websites - language-translation

Can anyone recommend a good way to translate a website into Spanish? We tried the Google Translate plugin, but the translation was so rough (very inaccurate, bordering on embarrassing for the company) that we had to hire a company to refine it. That makes updating the site going forward extremely inefficient.
We're in health insurance, so the language we're translating is highly specialized and needs to be accurate for our members. To make it even more complicated, the Google Translate plugin translates instantly, so the translation is live before we have a chance to refine it. In other words, there's no way to review the translation before the content becomes visible to users in the production environment. This is a legal and regulatory requirement for Covered California and the Affordable Care Act, so it has to be a top-notch implementation.
Short of a proxy solution that intercepts the content before it hits the production site or a separate site coded in Spanish, I'm not sure what other solutions exist if any. Ideas? The separate site solution is also problematic because it requires a bilingual staff and it doubles the work because both environments have to mirror each other exactly at all times.
Recommendations? Ideas? Any suggestions based on experience are most welcome!

Hire a developer; he will explain everything you need. You won't manage this on your own. If you already have one, hire a new one who knows how to do it. The question is very specific, but any engine or framework (take PHP as an example), or even a custom PHP engine, can be adapted to work the way you want.
Preview before publishing? Easy! Letting a moderator or admin change translation values? Easy! The main thing is that you will have to supply each sentence (or even paragraph) yourself... I don't want to describe the whole mechanism; hire a developer and he will do everything you need. $)
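To make the "preview before publishing" idea a bit more concrete, here is a minimal sketch in Python rather than PHP. The class and method names are made up for illustration and are not from any particular CMS or framework. A translation stays a draft until a moderator approves it, and the public site falls back to the source text in the meantime.

```python
from dataclasses import dataclass, field


@dataclass
class TranslationStore:
    """Hypothetical store of per-string translations with a review state."""
    # Maps (source_text, language) -> {"text": str, "approved": bool}
    entries: dict = field(default_factory=dict)

    def submit(self, source: str, lang: str, translated: str) -> None:
        """A translator saves a draft; it is not yet visible to the public."""
        self.entries[(source, lang)] = {"text": translated, "approved": False}

    def approve(self, source: str, lang: str) -> None:
        """A moderator or admin signs off on the draft."""
        self.entries[(source, lang)]["approved"] = True

    def render(self, source: str, lang: str, preview: bool = False) -> str:
        """Return the translation if approved (or when previewing), else the source text."""
        entry = self.entries.get((source, lang))
        if entry and (entry["approved"] or preview):
            return entry["text"]
        return source


store = TranslationStore()
store.submit("Your deductible", "es", "Su deducible")
public_view = store.render("Your deductible", "es")                 # still the English source
staff_view = store.render("Your deductible", "es", preview=True)    # draft visible to reviewers
```

Once `approve` is called for that string, the public view switches to the Spanish text, so nothing unreviewed reaches production.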

Body Text extraction from websites e.g. extract only article heading and text not all text in site

I am looking for algorithms that allow text extraction from websites. I do not mean "strip html", or any of the hundreds of libraries that allow this.
So for example for a news article I would like to identify the heading and all the text, but not the comments section and so on.
Are there any algorithms for that out there? Thank you!
In computer science literature this problem is usually referred to as the page segmentation or boilerplate detection problem. See the report Boilerplate Detection using Shallow Text Features and its related blog post. I also have a few reports and software sites bookmarked that address the problem. Also, see this Stack Overflow question.
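To give a rough feel for what "shallow text features" means, here is a heavily simplified sketch. It is not the algorithm from the report: the two features (word count and link density) are commonly used ones, and the thresholds are made-up assumptions for illustration only.

```python
def link_density(num_words: int, num_linked_words: int) -> float:
    """Fraction of a block's words that sit inside links."""
    return num_linked_words / num_words if num_words else 0.0


def looks_like_content(text: str, linked_text: str = "") -> bool:
    """Guess whether a text block is article content rather than boilerplate.

    `text` is all the text in the block; `linked_text` is the part inside <a> tags.
    The thresholds below are hypothetical and would need tuning on real pages.
    """
    words = len(text.split())
    linked = len(linked_text.split())
    return words >= 10 and link_density(words, linked) < 0.33


nav_block = "Home World Politics Tech"
paragraph = ("Elon Musk is demanding that Tesla office workers "
             "return to in-person work or leave the company.")

nav_is_content = looks_like_content(nav_block, linked_text=nav_block)
paragraph_is_content = looks_like_content(paragraph)
```

Short blocks made up mostly of links (menus, footers) get rejected, while longer runs of plain text are kept.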
There are a few open source tools available that do similar article extraction tasks.
https://github.com/jiminoc/goose which was open-sourced by Gravity.com
It has info on the wiki as well as the source you can view. There are dozens of unit tests that show the text extracted from various articles.
"Content extraction" is a very difficult topic. There are no common standards to identify the "main-article" content (there are several approaches to make HTML easier for crawlers to read, e.g. schema.org, but none of these is very widely used).
So it turns out that if you want good results, it's probably best to define your own XPath selectors for each (news) website you want to scrape. There are some APIs for HTML content extraction, but as I said, it's very hard to develop an algorithm that works for every site.
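As a sketch of the per-site selector approach: the site name, class names, and sample markup below are invented for illustration. It uses the standard library's `xml.etree.ElementTree`, which only supports a subset of XPath and only parses well-formed markup, so for real-world HTML you would most likely need a dedicated HTML parser instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-site rules: site key -> (heading XPath, body paragraph XPath).
SITE_SELECTORS = {
    "example-news": (".//h1[@class='headline']", ".//div[@class='article-body']/p"),
}


def extract_article(site: str, markup: str) -> dict:
    """Extract heading and body text using the selectors registered for `site`."""
    heading_path, body_path = SITE_SELECTORS[site]
    root = ET.fromstring(markup)  # assumes well-formed (XHTML-like) markup
    heading = root.find(heading_path)
    paragraphs = root.findall(body_path)
    return {
        "heading": "".join(heading.itertext()).strip() if heading is not None else "",
        "text": "\n".join("".join(p.itertext()).strip() for p in paragraphs),
    }


SAMPLE = """<html><body>
<h1 class="headline">Tesla ends remote work</h1>
<div class="article-body"><p>First paragraph.</p><p>Second paragraph.</p></div>
<div class="comments"><p>First!</p></div>
</body></html>"""

article = extract_article("example-news", SAMPLE)
```

The comments block is ignored simply because no selector points at it, which is the whole appeal of hand-written rules; the cost is maintaining one entry per site.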
Some APIs you could use:
alchemyapi.com
diffbot.com
boilerpipe-web.appspot.com
aylien.com
textracto.com
What you're trying to do is called "content extraction". It turns out to be a surprisingly hard problem to solve well, and many naive solutions do quite badly.
Instapaper and Readability both have to solve this, and you may learn something from looking at their solutions. They also both provide services that you may be able to take advantage of - perhaps you can outsource your problem to them and let their API take care of it. :)
Failing that, a search for "html content extraction" returns a great deal of useful results, including a number of papers on the subject.
I compared a few different libraries, and had really great luck with Mozilla's Readability library (Node), or its Python wrapper.
For example, take this CNN article: https://edition.cnn.com/2022/06/01/tech/elon-musk-tesla-ends-work-from-home/index.html
Readability successfully returns only the relevant data:
New York (CNN Business) Elon Musk is demanding that Tesla office workers return to in-person work or leave the company. The policy, disclosed in leaked emails Musk sent to Tesla's executive staff Tuesday, was first reported by electric vehicle news site Electrek. "Anyone who wishes to do remote work must be in the office for a minimum (and I mean *minimum*) of 40 hours per week or depart Tesla. This is less than we ask of factory workers," Musk wrote, adding that the office must be the employee's primary workplace where the other workers they regularly interact with are based — "not a remote branch office unrelated to the job duties." Musk said he would personally review any request for exemption from the policy, but that for the most part, "If you don't show up, we will assume you have resigned."
etc.
I think your best shot is to study what information you can get from the metadata and write a good HTML parser. oEmbed could be a good standard =)
https://oembed.com/#section7
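For the metadata route, one thing a parser can look for is an oEmbed discovery link in the page head. The sketch below only finds such `<link>` elements by their `type` attribute; it does not call the endpoint, and the sample page and URLs are made up.

```python
from html.parser import HTMLParser

# MIME types used for oEmbed discovery links.
OEMBED_TYPES = {"application/json+oembed", "text/xml+oembed"}


class OEmbedLinkFinder(HTMLParser):
    """Collects the href of every <link> tag that advertises an oEmbed endpoint."""

    def __init__(self):
        super().__init__()
        self.endpoints = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        if attr.get("type") in OEMBED_TYPES and attr.get("href"):
            self.endpoints.append(attr["href"])


def find_oembed_endpoints(html: str) -> list:
    finder = OEmbedLinkFinder()
    finder.feed(html)
    return finder.endpoints


PAGE = """<html><head>
<link rel="stylesheet" type="text/css" href="/site.css">
<link rel="alternate" type="application/json+oembed"
      href="https://example.com/oembed?format=json" title="Example article">
</head><body><p>Article text</p></body></html>"""

endpoints = find_oembed_endpoints(PAGE)
```

Many news sites don't publish oEmbed links at all, so this would be one signal among several rather than a complete solution.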

How Long: Converting HTML to Joomla pages

I would really appreciate your help with finding out how long it takes a programmer with 1-3 years of experience to convert a few HTML pages into Joomla 1.5 dynamic pages. I know that some of it depends on how complex the pages are, but I'm talking about average pages. That's my first question. My other question is how long it will take a programmer with 1-3 years of experience to install all of these components: video module, photo gallery module, VirtueMart shopping cart. I pay programmers to do this work, but I have to make as sure as I can that I'm not overpaying them. Thanks in advance for answering these two questions...George
Depends on the complexity and quality of the HTML/CSS design. Usually 1+ hours; if you want additional modules styled (K2, etc.) you need to add extra time, and if the style is different for every page it takes longer, plus configuration. Basically the conversion is not that difficult: just replace the main text with content and add some blocks/regions. I would say about 8 hours on average.
As already mentioned, it depends... there's not really anywhere near enough info to give even a rough estimate.
Is Joomla already installed? If it is installed and the desired template is in place, then cutting-and-pasting some page content can take a few minutes if it's just text. If not, allow 2-3 hours for basic installation, including debugging and testing of the standard components like sending email. Then another 1-2 hours for basic installation of components. Testing, debugging and setup of Virtuemart can take a lot longer, depending on what options for shipping and payments you want.
If you're using a good pre-built template that you're 100% happy with then there should be little to do there, but just positioning things and discovering which menus work best in which module positions can take a lot of time. Often a template has no support for particular components, so further styling for the additions is required. Purchased templates vary wildly in quality. Some are just not worth the effort, and sometimes template developers take quick-and-dirty shortcuts to get components to look good in their demo, which can take hours to sort out before the template is useful for anything else.
If you want the Joomla site to look like your old site, or have a custom template built, or radically convert an existing template it can take the length of a piece of string. (One of my clients will easily spend 20 hours endlessly asking me to slightly change spacing, fonts, and colors after signing off on a design and promising that he wouldn't make any more changes. I guess because he can't visualize how things look until he sees the completed site.)
There are plenty of good photo galleries that are bug-free. That shouldn't take long, especially if it comes with a template you already like.
So you may be wondering why all the estimates above vary so much. It just depends on what you've got and what you are really looking for, and what experience the "programmer" has, if it is even in Joomla, or some other CMS, or PHP, or whatever.
Step one in a project like this is to find a programmer you trust not to rip you off; get reliable references if it's someone new. Then get as good an estimate from him or her as he can give, bearing in mind he might have no experience of how you work or how detailed you've been in laying out the plan. Getting an estimate from someone else is not worth much. Then get progress reports as you go along to see where the hours are going so you can judge how to proceed cost-effectively.
It's hard to answer your questions; the info you provided is insufficient to make an estimate.
You didn't mention anything about customization, complexity of the functionality, or integration of those components. The time frame also depends on the programmer's experience and knowledge of Joomla!, not just HTML knowledge.
Usually, installing a component is easy. For the components you mentioned I would need something like 6-8 h, but that is just the installation process. From there to a good Joomla! website is a long way. The most time-consuming part is the customization and integration of all the functionality, and this depends on the client's requests.
You mentioned Virtuemart; this can also be a bottleneck. A Virtuemart installation depends on the shop categories and number of products, shipping integration, payment methods, image processing, and so on.
Another issue can be the template integration. For a good website it is better to have the same look across all the components (VM, photo gallery, etc.). Is your template a purchased one or a custom development?
But to answer your questions:
1) 5 to 10 pages should take 2-4 h (text editing, custom typography)
2) 40 to 80 h for the video module, photo gallery module, and Virtuemart shopping cart
Keep in mind that this is just a rough estimate.
Roland
In my opinion, it should take no more than 2-3 hours with minimal configuration included. The only thing that would need more time than this is the video component. Configuration of these components should make the time vary.

Ways to enhance a trial user's first time experience

I am looking for some ideas on enhancing a trial user's experience when he uses a product for the first time. The product is aimed at a particular domain and has various features/workflows. Experienced users of the product naturally find interesting ways to combine features to get the results they want (somewhat like using an IDE from a programmer's perspective). Trial users get to use all features of the product in a limited fashion (for example, if there is search functionality, the trial user might see only the top 20 results, or may be allowed to search only 100 times). My question is: What are the best ways to help a trial user explore and understand the possibilities of the product in the trial period, especially in the first 20-60 minutes before the user gives up on the product?
Edit 1: The product is a desktop app (served via JNLP, so no install required) and as pointed out in the comments, the expectations can be different in this case. That said, many webapps do take a virtual desktop form and so, all suggestions are welcome.
Check out how blinksale.com handles this. It's an invoicing app, but to prevent it from looking too empty for a new account, they show static images in places where you'd actually have content if you used the app. Makes it look less barren at first until you get your own data in.
If you can, avoid feature-limiting a trial. It stops the user from experiencing what the product is ACTUALLY like. It also prevents a user from finding out if a feature actually works like they want/expect/need it to.
If you have a trial version, and you can, optimise it for first-time use. Focus on and highlight the features that allow the user to quickly and easily get benefits or useful output from the system.
Allow users to export any data they enter into a trial system, and indicate that this is possible and easy. You don't want them to be put off from trying something because of a potential for wasted effort.
Avoid requiring users to do lots of configuration before using a trial. Prepopulate settings based on typical/common/popular settings. You may also want to consider having default settings for different types of usage, e.g. "If you want to see what the system is like for scenario X, use configuration J. If you want to see what the system is like for use case Y, use configuration K." where J & K are collections of settings best suited to a particular type of usage.
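The "configuration J / configuration K" idea could look something like the sketch below. The setting names and values are invented; the point is only that each preset overrides a few sensible defaults so the trial user never starts from an empty configuration screen.

```python
# Hypothetical defaults chosen from typical/popular settings.
DEFAULTS = {"language": "en", "results_per_page": 20, "autosave": True}

# Hypothetical presets: each one is a small set of overrides for a usage scenario.
PRESETS = {
    "J": {"results_per_page": 50},                 # suited to scenario X
    "K": {"autosave": False, "language": "es"},    # suited to use case Y
}


def trial_settings(preset=None) -> dict:
    """Return the defaults, with the chosen preset's overrides applied on top."""
    settings = dict(DEFAULTS)                 # copy so DEFAULTS is never mutated
    settings.update(PRESETS.get(preset, {}))  # unknown or missing preset: plain defaults
    return settings


scenario_x = trial_settings("J")
no_choice = trial_settings()
```

A user who picks nothing still gets a working setup, and picking a scenario is a single choice rather than a page of options.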
I'll speak from personal experience while evaluating trial applications.
The most annoying trial applications are those that keep popping up nag screens or constantly remind me that I'm using a trial. Trials that act exactly like the real product from the beginning to the end of the trial period are just awesome. Limited features are annoying; the only exception I can think of is a rarely used feature that would let people exploit the trial (by using this "once-in-a-lifetime" feature and then uninstalling). If you have, for example, a video editing software trial that puts a "trial" watermark on the output, I'd uninstall it as soon as I noticed it. In my opinion a trial should integrate seamlessly into the user's workflow so that once the trial ends they think "Hey, I have been using this awesome program almost every day since I got the trial, I absolutely have to buy it." Sure, some people will exploit it, but in the end you should target the group that will use your product in their daily workflow rather than one-time users. Even if a user "trials" it twice a year, he will keep coming back to your product and might even buy it after the 2nd or 3rd "one-time use".
(Sorry for the wall of the text and rant)
As for how to improve the first session: I usually find my way around programs easily, but a one-time pop-up or screen (or one with a check-box to never show it again) with videos showing off the best features and the intended workflow is quite helpful. Links to sample documents might also help. If your application can present itself (for example, a slide show about your slide-show program), you could include such a document. People don't like to read long and boring help files, but if you have a designer on your team, you could ask him to make a short, colourful intro PDF. Also, don't throw all the features at the user at once. Split the information into simple categories, and if the user is interested in one specific category, keep feeding him more specific information. That's why videos are so good: with 3-6 videos of roughly 3-5 minutes each you can tell a lot. Also, depending on how complex your program is, you could include a picture showing where specific things are located on the screen.
Just my personal opinion, I have never made a trial myself. Hope it helps.
An interactive walk through/lab exercise that really highlights the major and exciting offerings of your application.
Example: Yahoo mail does the same when the users opt to use new mail interface
There are so many ways you can go with this. I still can't claim to have found the best approach.
However, my plan from the beginning with my online (Silverlight) software was to give away something thousands of people will find useful and can use for free. The free version is pretty well representative of the professional product, with only a few features missing that enhance productivity (I'm working on those professional features now). And then I do have a nag popup that comes up every 5 minutes suggesting that you should buy it. That popup can be dismissed as many times as you want. I know that popup will annoy some people but I suppose that's the trade off. There is no perfect plan. But I don't think the occasional nag popup scares that many people away, especially when it can be dismissed with a single click.
I was inspired by Balsamiq Mockups, which has been hugely successful over the past couple years. My trial/nag popup way of doing things was copied almost exactly from Balsamiq. I honestly don't know if this is the ideal plan, but it has obviously worked for them. By the way, I think another reason for Balsamiq's success is that the demo doesn't have to be downloaded & installed. Since the demo is in Flash, there's a very high conversion rate of users actually trying it and becoming addicted to it.

How can the CAPTCHA process be more user friendly or better implemented?

I have used CAPTCHA on my various web sites in the standard manner, where I generate some obfuscated string of characters (odd pair of words, random number, etc.) in an image for the user to manually reproduce in a text box. I am also aware of recaptcha.net, which extends the basic functionality of screening bots from humans by also helping to digitize books. I just came across another way of performing CAPTCHA with the AJAX Fancy CAPTCHA jQuery plugin, which, rather than asking the user to reproduce a string, asks the user to drag an image that is readily recognizable (scissors, pencil, book, etc.) into an area that is equally recognizable. When I saw this I had to say to myself "WOW...that's cool!"
Question: Does anyone out there have any other examples of a neat and different way of performing CAPTCHA without having to generate a random string of characters into an image for the user to try and read (or regenerate until they can) so that they can manually type it into a box?
I'd like to see ReCAPTCHA implemented for images that a computer can't tell whether or not they're pornography. Web filter companies could pay free porn sites to use this system to better fill out their blacklists. The free porn sites could then make more porn, and the web filters would have more porn to block.
You can have your users tell dogs and cats apart. Microsoft's Asirra.
I know I am not particularly helpful in this answer, feel free to downvote me if it's the case, but I want to present my technical opinion (albeit of a non-expert) on captchas.
As someone said, the captcha is an antipattern of the web. Its purpose is to let you demonstrate that you are human, by doing something that only a human (purposely) can do.
Fact is that, despite the captchas, the only achieved result has been to improve pattern recognition for software, producing better bots. In this sense, it can be said that the final, real purpose of captchas was not to select humans from bots, but select better bots (or cheap workers) from lousy ones.
What you are asking is actually a matter of current research. I've seen stuff like selecting cats from dogs, solving simple math problems, recognizing apples from oranges, counting the number of people in a photo, but in the end I doubt you will get something more proficient or user friendly than what's currently available. In the end, the pure fact of having to solve a captcha is user-unfriendly.
A CAPTCHA should be a last resort, having tried other alternatives. For example you can use a honeypot technique, that uses a form field that’s invisible to a user but visible to a bot – if it gets filled in, you know it’s not from a human.
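A server-side honeypot check can be very small. In this sketch the field name `website` is an arbitrary assumption; the form would include that input hidden from humans (for example with CSS), and any submission that fills it in is treated as a bot.

```python
# Hypothetical name of the hidden form field; humans never see or fill it.
HONEYPOT_FIELD = "website"


def is_probably_bot(form: dict) -> bool:
    """Return True when the hidden honeypot field contains anything."""
    return bool(form.get(HONEYPOT_FIELD, "").strip())


human_post = {"name": "Ana", "comment": "Thanks!", "website": ""}
bot_post = {"name": "x", "comment": "buy now", "website": "http://spam.example"}

human_flagged = is_probably_bot(human_post)
bot_flagged = is_probably_bot(bot_post)
```

It is worth making sure the hidden field is also skipped by screen readers and autofill, otherwise real users could trip the check.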
In some cases you can experiment with softer CAPTCHAs like riddles or simple math problems. The best tactic - from a User Experience perspective - is to start as soft as possible, and only ramp up if bots become a real problem.
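A "soft" challenge of the simple-math kind might be sketched like this. The function names are made up, and in a real site the expected answer would be kept server-side (for example in the session) rather than sent to the browser.

```python
import random


def make_challenge(rng=random):
    """Return a (question, expected_answer) pair for a simple addition problem."""
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    return f"What is {a} + {b}?", a + b


def check_answer(expected: int, submitted: str) -> bool:
    """Compare the user's text input with the expected number, tolerating whitespace."""
    try:
        return int(submitted.strip()) == expected
    except ValueError:
        return False  # non-numeric input is simply a wrong answer


question, expected = make_challenge()
accepted = check_answer(expected, f" {expected} ")
```

This stops only the laziest bots, which fits the "start soft, ramp up if needed" advice above.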
Have you tried Friendly Captcha? It solves the puzzle by itself (without specific input by the user). The user only has to interact with the page (moving the mouse, jumping between input fields) and the puzzle will be resolved bit by bit.
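The answer doesn't say how the puzzle works internally. One general technique for a puzzle the browser can solve without user input is proof of work, and the sketch below assumes that kind of puzzle. It is an illustration of the general idea only, not Friendly Captcha's actual algorithm or API, and the challenge string and difficulty are made up.

```python
import hashlib
from itertools import count


def _digest(challenge: str, nonce: int) -> str:
    return hashlib.sha256(f"{challenge}:{nonce}".encode()).hexdigest()


def solve(challenge: str, difficulty: int = 3) -> int:
    """Client side: search for a nonce whose hash starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    for nonce in count():
        if _digest(challenge, nonce).startswith(prefix):
            return nonce


def verify(challenge: str, nonce: int, difficulty: int = 3) -> bool:
    """Server side: checking a solution costs a single hash."""
    return _digest(challenge, nonce).startswith("0" * difficulty)


nonce = solve("session-123")          # hypothetical per-form challenge string
solution_ok = verify("session-123", nonce)
```

Solving costs the client some CPU time while verification is cheap, which is what makes mass automated submissions more expensive without asking the user to do anything.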

What are some great web based interfaces that you use on a day to day basis?

I definitely appreciate a good interface and as a developer, I try to create them for my users. But appreciating a good interface and designing one are different things. I'm looking for good interfaces (such as IMHO StackOverflow, Gmail) as examples of good UI from which I can model my own UIs.
I personally think that Netflix has an excellent web UI. Responsive, easy to navigate. Not much CRUD going on, but I find it very comfortable.
Pretty much anything by google, really. They're all very simple and to the point, focusing on usability.
You should get yourself a copy of both Don't Make Me Think and The Non-Designer's Design Book for your base knowledge/insight.
From there, it's much easier for you to dissect and analyze the layouts you already know and like, and recreate them for your own amusement.
edit: To mitigate misunderstanding, the point I'm trying to make is that you probably don't need as many good examples of nice layouts, if you know what to look for. For example, I can be shown a thousand haute couture dresses, and I still couldn't make one myself, because I don't know what to look for.
My favorites
Stack Overflow: This is a WIKI so it's not a rep point grab. I just really love the interface on this site. Been to too many crappy Q/A sites
Google Reader
MSDN: It's gotten a ton better in recent years and is a great way to grab little esoteric details about various APIs
iStockPhoto.com: it's simple, effective and handles a large amount of information and data without getting bogged down. It also doesn't get in the way of the info you are looking for.
A good user interface fulfills a specific need of its users effectively.
As an example, here is a site (translation) that I have created for finding out what food is available in the cafeterias of the University of Helsinki. The typical use case is that when a student is hungry, he needs to know what food is available in the neighborhood student cafeterias (which are cheap for students), so that he can choose where to eat and what. He knows where each of those cafeterias is, but does not know what food they have today.
That site shows all the needed information at once. Because the students typically have a couple of cafeterias where they go, they can either bookmark the page with those cafeterias selected, or save the selection as a cookie. After that they can reach their goal without any navigation on the web site.
I don't use it on a day-to-day basis, but I'm very impressed with the Perseus Project digital library.
Here's a link to a poem from Catullus' Carmina in Latin as an example of the interface. Some features that I really like:
Click on the bar near the top to jump to any poem in the work. Larger chunks of the bar represent larger sections of the work (poems, chapters, however that particular work is logically broken up by the author).
Click on a Latin word in the poem to bring up a window (be patient; it seems to take a while) with lexicon entries, user voting and statistics on the word form (i.e. what the inflection means in the context of the sentence; it can be ambiguous in Latin) and so forth.
There are a number of resources down the right column, including various English translations, notes, references, etc. Any of them can be either shown in the right column, or swapped out with whatever is in the main content area in the center.
One of my personal favs: newspond.com
