How do Google or Amazon implement the auto-suggestion in their search boxes? I am looking for the algorithm along with the technology stack.
PS: I have searched the net and found this and this and many, many more. But I am interested not in what they do, but in how they do it. A NoSQL database to store the phrases? Is it sorted or hashed according to keywords? So, to rephrase the question: given the list of different searches (ignoring personalization, geographic location, etc.), how do they store, manage, and suggest them so well?
This falls under the domain of statistical language processing. Take a look at the spelling suggestion article by Norvig; auto-completion will use a similar mechanism.
The idea is that from past searches you know the probability of phrases (better called bigrams, trigrams, n-grams). For each prefix typed so far, the auto-complete selects the phrase having the maximum value of
P(phrase | word_typed) = P(word_typed | phrase) P(phrase) / P(word_typed)
where P(phrase | word_typed) is the probability that phrase is the right phrase given that the word typed so far is word_typed.
Norvig's article is a very accessible and great explanation of this concept.
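A minimal sketch of that idea in Python, assuming you have a log of past searches to estimate phrase probabilities from (the search log here is invented for illustration; a production system would keep a trie or sorted index over millions of phrases, not a flat list):

from collections import Counter

# Hypothetical log of past searches; real systems mine this from query logs.
past_searches = [
    "how to tie a tie", "how to tie a tie", "how to tie a scarf",
    "how to train your dragon", "how to train a puppy",
]

phrase_counts = Counter(past_searches)
total = sum(phrase_counts.values())

def suggest(prefix, k=4):
    # If a phrase starts with the typed prefix, P(word_typed | phrase) = 1,
    # so among matching candidates the Bayes rule above reduces to ranking
    # by the phrase's own prior probability P(phrase).
    candidates = [(count / total, phrase)
                  for phrase, count in phrase_counts.items()
                  if phrase.startswith(prefix)]
    return [phrase for _, phrase in sorted(candidates, reverse=True)[:k]]

print(suggest("how to t"))  # most frequent matching phrases first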
Google takes your input and returns the top four results according to the rank IDs given to different keywords, which vary dynamically with hit and miss counts (if there are fewer results, the remaining parameters are returned as empty strings).
It then makes a search query and returns four fields (URL, title, and two others) as JSON; the omnibox then populates the data using the prepopulate functions in the Chrome trunk.
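You can observe this from the outside by calling the suggest endpoint the browser uses. A rough sketch; note this endpoint is unofficial and undocumented, so the URL, parameters, and response shape may change without notice:

import json, urllib.parse, urllib.request

query = "how to t"
# Unofficial suggest endpoint observed in browsers; not a supported API.
url = ("https://suggestqueries.google.com/complete/search?"
       + urllib.parse.urlencode({"client": "firefox", "q": query}))
with urllib.request.urlopen(url) as resp:
    data = json.loads(resp.read().decode("utf-8"))
print(data[1][:4])  # the top suggestions for the typed prefix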
I run a Magento based store website.
At the side of every product page we have delivery information.
Because of this, Google Webmaster Tools picks up words such as 'delivery', 'orders', and 'returns' as significant keywords, rather than more relevant, industry-specific keywords.
Does it matter that it gives 'delivery' a higher significance rating?
Should I remove the delivery info from the side of each page?
Or is there a way to disavow keywords to tell Google that 'delivery' isn't relevant?
Or maybe turn the text info at the side into a graphic instead?
Many thanks!
Before SEO, you should always consider what is best for your user. If displaying shipping information in the sidebar is going to enhance the user's experience, leave it. If the information could be put on a separate page and a link to it added to the sidebar, do that.
Having said that, I wouldn't worry about it. Unless you're trying to rank for the keywords 'return' or 'delivery', you're not likely to notice any sort of algorithm penalty that comes from having the words appear all over the website.
Furthermore, a keyword stuffing penalty is applied to each page individually, so be careful about stuffing keywords into tags in the sidebar, as it will increase the keyword density on every page.
Preface: this might seem like a very beginner-level question, maybe stupid or ill-formulated. That's why I don't require a definitive answer, just a hint, a point I can start from.
I am thinking of a script that would allow me to parse product pages of different online retailers, such as Amazon, for instance. The following information is to be extracted from the product page:
product image
price
availability (in stock/out of stock)
The key point of the algorithm is that, once implemented, it should work for any retailer and any product page. So it is meant to be universal.
What techniques would allow implementation of such an algorithm? Is it even possible to write such a universal parser?
If the information on the product page is marked up in a structured, machine-readable way, e.g. using schema.org microdata, then you can just parse the page HTML into a DOM tree, traverse the tree to locate the microdata elements, and extract the data you want from them.
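For example, a price marked up with schema.org microdata often looks like <span itemprop="price" content="9.99">$9.99</span> and can be pulled out generically. A minimal sketch using BeautifulSoup; the itemprop names come from the schema.org Product/Offer vocabulary, and the HTML handling is deliberately simplistic:

from bs4 import BeautifulSoup  # pip install beautifulsoup4

def extract_product(html):
    soup = BeautifulSoup(html, "html.parser")

    def prop(name):
        tag = soup.find(attrs={"itemprop": name})
        if tag is None:
            return None
        # Microdata puts the value in a content/src attribute or in the text.
        return tag.get("content") or tag.get("src") or tag.get_text(strip=True)

    return {
        "image": prop("image"),
        "price": prop("price"),
        "availability": prop("availability"),
    }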
Unfortunately, many sites still don't use such structured data markup — they just present the information in a human-readable form, with no consideration given for machine parsing. In such cases, you'll need to customize your data extraction code for each site, so that it knows where the information you want is located on the page. Parsing the HTML and then working with the DOM is still often a good first step, but the rest will have to be site-specific (and may need to be updated whenever the site changes its design).
You can also try to come up with heuristic methods for locating relevant data, like, say, assuming that a number following a $ sign is probably a price. Such methods are also likely to occasionally produce incorrect matches (like mistaking the "$10" in "Order now and save $10!" for a price). You can adjust and refine your heuristics to be smarter about such things, but no matter how good you get, there will always be some new and unexpected case that you haven't anticipated.
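A sketch of that kind of heuristic, with exactly the failure mode just described:

import re

def guess_prices(text):
    # Naive heuristic: any $-prefixed number might be a price.
    return re.findall(r"\$\s?\d+(?:[.,]\d{2})?", text)

print(guess_prices("Price: $19.99. Order now and save $10!"))
# ['$19.99', '$10'] -- both match, but only the first is really a price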
We develop a platform for building travel itineraries.
A travel plan (trip) is composed of places ordered in a user-defined flow.
We want to use the Google Places API for searching places. We would like to store a place_id and use it when retrieving the trip info: the place_id will be used to fetch that place's details from Google.
A place_id will be saved for future use only if a user decides to include that place in his trip itinerary.
Is this permitted according to the terms of use?
Thanks!
Orit
Yes, you can:
Place IDs are exempt from the caching restrictions stated in Section 10.1.3 of the Google Maps APIs Terms of Service. You can therefore store place ID values indefinitely.
Referencing a Place with a Place ID
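Storing the place_id and later exchanging it for fresh details is the intended pattern. A minimal sketch against the Place Details web service; you need your own API key, and the place ID below is the sample ID used in Google's documentation:

import json, urllib.parse, urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder

def fetch_place_details(place_id):
    # Exchange a stored place_id for the place's current details.
    url = ("https://maps.googleapis.com/maps/api/place/details/json?"
           + urllib.parse.urlencode({"place_id": place_id, "key": API_KEY}))
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))["result"]

details = fetch_place_details("ChIJN1t_tDeuEmsRUsoyG83frY4")
print(details["name"], details.get("formatted_address"))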
I am currently asking myself the exact same question.
Reading through the Google Places API documentation (as you also did, I guess), I haven't found a clear/explicit answer to that question.
However, several parts of the documentation make me think that place_id can be saved and used later on to retrieve a place result.
In the "Place Details Results" section, it is said that the "id" property is deprecated and should be replaced by place_id. As "id" was said to be a stable identifier, I conclude that place_id is also a stable identifier.
Another interesting part is the one about "alt_ids": it is said that a given place can see its place_id change over time: when the scope changes, a new place_id is attributed to the place. So I would say that:
a place_id is unique and stable for a given place and a given scope (APP|GOOGLE), as long as the place exists.
a given place will remain searchable using any of the place_ids previously attributed to it.
using an APP-scoped place_id, there is no guarantee that the result sent in the response has the same place_id (this is not a problem, but it needs to be kept in mind from a development point of view).
In the end, unfortunately, I have no definitive answer; these are just my current assumptions. I hope somebody will help us with this question.
Regards
Philippe
I have a piece of code that generates a pretty long SQL statement, with some dynamic elements to it. This code was written from the formatted query that I passed as a parameter to my database-querying function: up to whitespace, the two queries are exactly the same. Indeed, if I copy the generated SQL and apply :%s/\s\{2,}/ /g to it in Vim, the output is identical to the original query (comments removed) with :%j followed by :%s/\s\{2,}/ /g applied... However, the queries produce different outputs!
Actually, they produce different outputs some of the time. When I tried investigating this in my querying tool, the VBA-generated SQL still didn't work as expected, whereas the original did. When I applied the above whitespace-removing transformations to the VBA-generated query, it did work; but what's weird is that the originally generated query (with the extra whitespace) suddenly started working too! However, it's inconsistent: there's no deterministic pattern (under my control) that guarantees the extra-whitespace version will work. (My guess is that this may be a caching phenomenon, courtesy of the database server.)
Anyway, I guess my question concerns whitespace: I was always under the impression that whitespace is irrelevant to SQL beyond delimitation. Is this not the case, or is something else going on here? Maybe the generated SQL string is too long (over 6 KB)... Any ideas?
One idea to make the query less complex from the app's perspective is to take the complex part and bury it in a view; then, from the app, just:
CREATE VIEW myview AS SELECT ... ;  -- the complex part goes here
SELECT c1, c2, ... FROM myview;
Are there any tools out there that will index source code, client side, and provide blazing fast search results?
"How can I index our internal source code?" is related, but covers server-side tools.
Everything and Locate32 are nice indexing tools on the Windows platform. Just one problem: they only index the file names.
DocFetcher is another solution; it tries to index the content of the files, but it has big memory issues, as it cannot index the content of bigger files and just skips them.
I'm also searching for something to index my data, and I want a tool like Locate32, which is super nice to integrate with the Windows shell, but it would be nice if it also indexed the content of files: just brute-force word indexing, no magic done to the data, letting me do plain wildcard searches, like words starting with, ending with, and containing.
But the search is still on... (for an app, that is).
Install ctags.
Then ctags -R in the root of your source tree. Many editors, including Vim, can use the resulting tags file to give near-instant search results.
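For example, with Exuberant/Universal Ctags and Vim (the function name is just a placeholder):

ctags -R .           # index the whole source tree into a "tags" file
vim -t my_function   # open Vim directly at the definition of my_function
# inside Vim: Ctrl-] jumps to the tag under the cursor, Ctrl-T jumps back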
I know this is an old question, but maybe this will help someone else.
Take a look at CodeIDX: http://sourceforge.net/projects/codeidx/.
Using CodeIDX you can index multiple directories using filetype filters and search the created index.
You can open multiple searches at the same time and the results can be viewed in a preview.
Using GNU Global you can get browsable, searchable source code. You can run it locally too, or use all the tools that go with it (like less, to go straight to a function definition).
See http://www.tamacom.com/tour/kernel/linux/ for an example of the Linux Kernel.
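A typical local workflow looks like this; the commands follow the GNU Global documentation, and the function name is just an example:

gtags                         # build the GTAGS/GRTAGS/GPATH index in the project root
global -x parse_args          # show where parse_args is defined
global -rx parse_args         # show every location that references it
export LESSGLOBALTAGS=global  # let less use Global as its tag engine, then:
less -t parse_args            # open the pager straight at the definition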