Save queries from AJAX autosuggest search - ajax

I want to save search queries from an AJAX autosuggest search textbox. When the user types in a character the search results are immediately shown.
The problem is to decide when a string is considered to be a query. When searching for "Lemon" it's not desirable to log L, Le, Lem, Lemo, Lemon. In this case only Lemon should be saved.
A misspelled word is sometimes of interest too. "Lemmon" would be worth saving, since it would give the website owner valuable feedback about search queries that return no items when the user was probably expecting some.
Any ideas?

You cannot decide programmatically when a string is a query, but the user can. You have to watch the user's actions and save the string when the user considers it a real query.
For example:
You display some autosuggestions, and the user clicks on one. You then save only this click as the search query (and maybe what the user typed into the search box).
When the user submits the form, you save the query as a "Searchable Word" and compare it to your autosuggest list.
You have a database of useful words, and when the user types in one of these, you save it (with a counter, perhaps).
You should combine the first two solutions. That takes a fairly intelligent database, but in return you get intelligent data!
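A minimal sketch of the first two ideas combined, assuming the server receives separate events for keystrokes, suggestion clicks and form submits. The class and method names (`QueryLog`, `on_keystroke`, `on_suggestion_click`, `on_submit`) are made up for illustration and do not come from any particular framework.

```python
from dataclasses import dataclass, field


@dataclass
class QueryLog:
    """Records a search query only when the user commits to it."""
    suggestions: set                      # known autosuggest terms, lowercase
    entries: list = field(default_factory=list)

    def on_keystroke(self, typed):
        # Intermediate prefixes ("L", "Le", "Lem", ...) are never stored.
        return None

    def on_suggestion_click(self, typed, clicked):
        # The clicked suggestion is the query; keep what was typed as context.
        self.entries.append(
            {"query": clicked, "typed": typed, "source": "click", "known": True}
        )

    def on_submit(self, typed):
        query = typed.strip()
        if not query:
            return
        # "known" is False for strings such as "Lemmon" that are not in the
        # suggestion list, which is the feedback the site owner wants.
        self.entries.append(
            {
                "query": query,
                "typed": query,
                "source": "submit",
                "known": query.lower() in self.suggestions,
            }
        )
```

Typing "L", "Le", "Lem" and then submitting "Lemmon" would leave exactly one entry in `entries`, flagged as not known.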


ElasticSearch: A way to know which term hit in which field?

There are use cases where I would really like to know which term was matched in which field by my search. With this information I could tell the user on my webpage which field caused the hit. I would also like to know which term took part in the hit. In my case it is a database identifier, so I would take the matched term (an ID), fetch the respective database record and display useful information to the user.
I currently know two ways: Highlighting and the explain API. However, the first requires stored values which seems unnecessary. The second is meant for debugging only and is rather expensive so I wouldn't want it to run with every query.
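For reference, a rough sketch of the highlighting route: a search body that asks for highlights on every searched field, and a helper that reads the matched terms back out of a hit. The field names and the sample hit used with it are made up, and the helper assumes Elasticsearch's default `<em>` highlight tags.

```python
import re

EM_RE = re.compile(r"<em>(.*?)</em>")


def build_search_body(query_text, fields, explain=False):
    """Search body requesting highlights on every searched field."""
    body = {
        "query": {"multi_match": {"query": query_text, "fields": list(fields)}},
        "highlight": {"fields": {name: {} for name in fields}},
    }
    if explain:
        # Expensive; intended for debugging rather than for every query.
        body["explain"] = True
    return body


def matched_terms_by_field(hit):
    """Map each highlighted field of a hit to the terms wrapped in <em> tags."""
    result = {}
    for field_name, fragments in hit.get("highlight", {}).items():
        terms = []
        for fragment in fragments:
            for term in EM_RE.findall(fragment):
                if term not in terms:
                    terms.append(term)
        result[field_name] = terms
    return result
```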
I don't know of any other way, which is confusing: the highlighting algorithms need the information I want anyway, so can't I just get at it somehow?
On a related note, I would also be interested in the opposite case: Which term did not hit at all? This information would allow for features like "terms that didn't match your query" like Google does sometimes (where the respective words are shown in grey-strikeout).
Thanks for hints!

Query multiple strings in a field in kibana3?

I am using Logstash 1.4.1, elasticsearch 1.1.1, kibana 3.1 for analyzing my logs. I get the parsed fields (from log) in Kibana 3.
Now, I often have to query a particular field for many strings. E.g., auth_message is a field and I may have to query it for some 20 different strings (all together or separately).
If together:
auth_message: "login failed" OR "user XYZ" OR "authentication failure" OR .........
If separate queries:
auth_message: "login failed"
auth_message: "user XYZ"
auth_message: "authentication failure"
The user cannot remember 20 strings to search for in a field. Is there a way to store them, or to present them to the user so he can select the strings he wants to search for?
Can this be done using ELK?
First, "pin" your query. That is, once you have made a query you are satisfied with, click the small colored circle to make the drop-down menu appear, and click the "pin" button.
Then, in every panel of your interface, go to Configure -> Queries, and in the dropdown list choose which queries should be charted in this panel. You can select all, pinned, unpinned, or particular queries among the pinned ones, and you can save your dashboard with the pinned queries.
If I understand correctly, you would like users to be able to select any of your queries, or all of them. I don't see an easy way to do that, but you can save all of your criteria either as a single pinned global query or as multiple separate pinned queries, then configure all of your panels to display only unpinned data. Finally, have your users reload the whole interface. If you chose the global query, they unpin it and edit it to remove unwanted terms; if you chose one subquery per criterion, they unpin each one they need.
Alternatively, if some combinations of terms are often needed, you could save one kibana dashboard for each.
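If the list of strings is kept somewhere outside Kibana, the single global query can be generated instead of typed by hand. A small sketch, assuming the standard Lucene query-string grouping syntax `field:(a OR b)`; the helper name `build_or_query` is made up.

```python
def build_or_query(field, phrases):
    """Build one Lucene query string matching any of the given phrases."""
    if not phrases:
        raise ValueError("at least one phrase is required")
    quoted = []
    for phrase in phrases:
        # Escape backslashes and double quotes so each phrase stays one unit.
        escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
        quoted.append('"{}"'.format(escaped))
    return "{}:({})".format(field, " OR ".join(quoted))


print(build_or_query("auth_message", ["login failed", "user XYZ"]))
```

The resulting string can be pasted into the query box and pinned as described above.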

Sphinx reverse search - when new item is added, execute searches on existing stored keywords

I have an app where people can list stuff to sell/swap/give away, with 200-character descriptions. Let's call them sellers.
Other users can search for things - let's call them buyers.
I have a system set up using Django, MySQL and Sphinx for text search.
Let's say a buyer is looking for "t-shirts". They don't get any results they want. I want the app to give the buyer the option to check a box to say "Tell me if something comes up".
Then when a seller lists a "Quicksilver t-shirt", this would trigger a sort of reverse search on all saved searches to notify those buyers that a new item matching their query has been listed.
Obviously I could trigger Sphinx searches on every saved search every time any new item is listed (in a loop) to look for matches - but this would be insane and intensive. This is the effect I want to achieve in a sane way - how can I do it?
You literally build a reverse index!
Store the 'searches' in the database, and build an index on them.
So 't-shirts' would be a document in this index.
Then when a new product is submitted, you run a query against this index. Use 'Quorum' syntax or even match-any, so that a saved search is returned even if only one keyword matches.
So in your example, the query would be "Quicksilver t-shirt"/1 which means match Quicksilver OR t-shirt. But the same holds with much longer titles, or even the whole description.
The result of that query would be a list of (single word*) original searches that matched. Note this also assumes you have your index set up to treat - as a word character.
*Note it's slightly more complicated if you allow more complex queries: multiple keywords, negations, OR, brackets, phrases, etc. In that case the reverse search just gives you POTENTIAL matches, so you need to confirm that each one still matches. That is still a number of queries, but you don't need to run all of them.
btw, I think the technical term for these 'reverse' searches is Prospective Search
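The idea can be sketched in plain Python, without Sphinx, as an in-memory inverted index over the saved searches. This is only an illustration of the two steps above (match-any to find candidates, then confirm each candidate); the class name `SavedSearchIndex` is made up, the tokenizer treats - as a word character, and there is no stemming, so "t-shirts" would not match "t-shirt" here.

```python
import re
from collections import defaultdict

TOKEN_RE = re.compile(r"[a-z0-9-]+")  # '-' counts as a word character


def tokenize(text):
    return TOKEN_RE.findall(text.lower())


class SavedSearchIndex:
    """Inverted index over saved searches: keyword -> ids of saved searches."""

    def __init__(self):
        self.searches = {}                 # search id -> set of keywords
        self.postings = defaultdict(set)   # keyword -> set of search ids

    def add_search(self, search_id, query):
        keywords = set(tokenize(query))
        self.searches[search_id] = keywords
        for keyword in keywords:
            self.postings[keyword].add(search_id)

    def match(self, item_text):
        item_tokens = set(tokenize(item_text))
        # Step 1 (match-any): every saved search sharing at least one keyword.
        candidates = set()
        for token in item_tokens:
            candidates |= self.postings.get(token, set())
        # Step 2 (confirm): keep searches whose keywords all occur in the item.
        return sorted(
            sid for sid in candidates if self.searches[sid] <= item_tokens
        )


index = SavedSearchIndex()
index.add_search(1, "t-shirt")
index.add_search(2, "quicksilver hoodie")
print(index.match("Quicksilver t-shirt"))
```

Search 2 is a candidate because it shares "quicksilver", but it is dropped in the confirmation step since "hoodie" is missing from the new item.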

Exact phrase search using lucene without increasing number of fields

For a phrase search, we want to bring up results only if there's an exact match (without ignoring stopwords). If it's a non-phrase search, we are fine displaying results even if the root form of the word matches etc.
We currently pass our data through StandardTokenizer, StopFilter, PorterStemFilter and LowerCaseFilter. Because of this, when a user searches for "password management", the search brings up results containing "password manager".
If I remove the stem filter, I will no longer be able to match the root form of a word for non-phrase queries. I was wondering whether I should index the same data in two fields of the document.
I have asked the same question at Different indexing and search strategies on same field without doubling index size?. However, folks at the office are not happy about indexing the same data in two fields (we currently have around 20 text fields in a Lucene document). Is there any way to support both of the cases listed above using TokenFilters?
Say, for a StopFilter, make changes so that it emits both the input token and ? (for ignored word) with same position increments. Similarly for StemFilter, it emits both the input token and stemmed token with same position increments. Basically input and output tokens (even ignored ones) have same positions.
Is it safe to go ahead with this approach? Has anyone else faced the requirements listed here? Are there any Filters readily available which do something similar to what I mentioned in my approach?
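To make the proposal concrete, here is a toy simulation of such a token stream in plain Python (not Lucene code). Each token is a `(term, position_increment)` pair; an increment of 0 stacks a token on the same position as the previous one. The stop word list and the `TOY_STEMS` lookup table are stand-ins for StopFilter and PorterStemFilter, made up for this sketch.

```python
STOPWORDS = {"a", "an", "the", "of"}
# Stand-in for a real stemmer: only these words are "stemmed".
TOY_STEMS = {"management": "manag", "manager": "manag", "passwords": "password"}


def analyze(text):
    """Emit the original token plus a stacked variant at the same position."""
    tokens = []
    for word in text.lower().split():
        tokens.append((word, 1))            # original token advances position
        if word in STOPWORDS:
            tokens.append(("?", 0))         # placeholder for the ignored word
        else:
            stem = TOY_STEMS.get(word, word)
            if stem != word:
                tokens.append((stem, 0))    # stem shares the original's position
    return tokens


def positions(tokens):
    """Turn position increments into absolute positions."""
    out, pos = [], -1
    for term, increment in tokens:
        pos += increment
        out.append((term, pos))
    return out


print(positions(analyze("password management")))
```

Here "management" and its stem "manag" end up at the same position. Whether a phrase query can then tell exact tokens from stems depends on how the query side is analyzed, which this sketch does not cover.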
I don't understand what you mean by "input and output tokens." Are you storing the data twice - once as stemmed and once non-stemmed?
If you aren't storing it twice, I don't think your method will work. Suppose the stored word is jumping and they search for jumped. Your query parser can emit jump and jumped but it still won't match jumping unless you have a value stored as jump.
And if you're going to store the value once as stemmed and once as non-stemmed, then why not just store it in two fields? Then you won't have to deal with weird tokenizer changes.

Sharepoint 2010: Full text plus faceted search over an External Content List using Search Services (or possibly FAST)

I have an External List over a products table in our database. I want to be able to build a search form over it via a full text search; in addition to being able to filter down on properties on my initial search.
For example, say I'm looking for DVDs under 10.00 in product DB. I want to be able to have a search box where I enter "DVD OR Movie", but I also want to be able to have a price box where I could enter a max price of 9.99.
My impression of SP2010 search solutions is that it's easy enough to perform a full text search over an EL with Search Services, but filtering by additional attributes at the same time doesn't appear to be possible out of the box. I know that with FAST I can do a full text search, then filter down the results on the result page via each item's properties. However, we're building custom functionality on the results page that allows users to add an item from the search result set to another list, so I can't use Search Services' or FAST's results page.
I'm thinking my best bet is CAML, but my readings on the subject lead me to believe CAML doesn't support full text search. I could also try LinqToSharepoint, but that doesn't support full text search either.
Given my circumstances, do I have any other options besides CAML or Linq? Any constructive input is greatly appreciated.
One solution is to use the FAST FSIS product. This is the full version of FAST. It will require some extra configuration to index the data in the way you want. That version of FAST will allow you to explicitly define your fields.
