Google Mini - phrase matching - google-search-appliance

Is there a setting in the Google Mini search appliance that allows you to control how sensitive the results are? For example, searching for the phrase "building permit" returns results, but searching for "building permit zoo" returns none. Can I get it to give me results containing any of the words? Thank you.

As far as I'm aware there's no actual sensitivity parameter, but instead of the standard q parameter you can use the alternate advanced search parameters:
as_q: with all of the words
as_epq: with the exact phrase
as_oq: with at least one of the words
as_eq: without the words
These work (mostly) the same as the advanced search parameters on google.com: http://www.google.co.uk/advanced_search?hl=en
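As a rough illustration of the parameters listed above, here is a minimal Python sketch that builds a search URL using as_oq instead of q. The host name mini.example.com and the /search path are placeholders I made up for the example, and a real appliance will likely need further parameters (collection, front end, and so on) that are not shown here.

```python
from urllib.parse import urlencode

# Hypothetical appliance address; replace with your own Google Mini host.
BASE_URL = "http://mini.example.com/search"


def build_search_url(all_words="", exact_phrase="", any_words="", without_words=""):
    """Build a search URL from the advanced parameters instead of `q`."""
    params = {
        "as_q": all_words,         # with all of the words
        "as_epq": exact_phrase,    # with the exact phrase
        "as_oq": any_words,        # with at least one of the words
        "as_eq": without_words,    # without the words
    }
    # Drop empty parameters so the URL only carries what was asked for.
    params = {name: value for name, value in params.items() if value}
    return BASE_URL + "?" + urlencode(params)


# "Any of the words" search for the example from the question.
print(build_search_url(any_words="building permit zoo"))
```

Putting all three words in as_oq asks for documents containing at least one of them, which is the behaviour the question is after.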

Related

Searching for a term as both a single string and multi worded string

I'm setting up my Elasticsearch instance in a schema-less manner (no up-front mappings), and the application requires that users be able to search against a field containing a word that may or may not be tokenized into multiple strings. For example, the field may contain the word "ONETWO". The spec requires that a user be able to search for "ONETWO", "ONE", and "TWO" and retrieve that same document. There doesn't seem to be any easy way to accomplish this even with a custom tokenizer (and I don't think there SHOULD be an easy way to do this -- or any way at all). I just want to confirm my thoughts.
It's very easy to meet your requirement with a custom analyzer that uses the n-gram tokenizer. You can also pass the tokens through a lowercase token filter, so that even though your text is ONETWO, a user searching for one, One, or ONE still gets a result. For this you need to apply a different analyzer at search time; read more about it at https://www.elastic.co/guide/en/elasticsearch/reference/current/search-analyzer.html.
Refer to https://devticks.com/how-to-improve-your-full-text-search-in-elasticsearch-with-ngram-tokenizer-e346f29f8ddb for more information, and let me know if you need anything else.
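To see why n-grams plus lowercasing make "ONETWO" findable by "one" or "TWO", here is a small pure-Python sketch of the idea. It is not Elasticsearch code; the gram sizes (3 to 6) and the function names are assumptions chosen for the illustration.

```python
def ngrams(text, min_gram=3, max_gram=6):
    """Return every lowercase substring of length min_gram..max_gram."""
    lowered = text.lower()
    grams = set()
    for size in range(min_gram, max_gram + 1):
        for start in range(len(lowered) - size + 1):
            grams.add(lowered[start:start + size])
    return grams


# Index time: the field value is broken into lowercase n-grams.
index_terms = ngrams("ONETWO")


def matches(query):
    """Search time: only lowercase the query, do not n-gram it again."""
    return query.lower() in index_terms


for query in ("ONETWO", "ONE", "two", "xyz"):
    print(query, matches(query))
```

The query is deliberately not split into n-grams, which mirrors the advice to use a different analyzer at search time than at index time.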

autocomplete and search in Elasticsearch

Is it possible to search on two incomplete words in the same field using Elasticsearch in Rails? I mean a situation where I could successfully find, for example, the phrase "victorian buildings" by typing "vict bui" into the search input (only the beginnings of the words, also with fuzziness).
Partial match (word_start, text_start etc. available in Searchkick) doesn't work in this project. I've also tried using wildcard queries, but it also failed. Maybe writing some custom mappings/settings would be a good idea?
Can I ask you for any suggestions on what to search/read to do this task?
Try this example
"%#{params[:place]}%"
Since % is a wildcard, a LIKE on '%%' matches everything,
and you get all the records in the result.
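The behaviour of that LIKE pattern can be checked with a small sketch using Python's built-in sqlite3 module. The places table and its rows are invented for the example; the point is only how a '%term%' pattern behaves for a partial term, an empty term, and two partial words.

```python
import sqlite3

# Hypothetical in-memory table, used only to demonstrate the LIKE pattern.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE places (name TEXT)")
conn.executemany(
    "INSERT INTO places (name) VALUES (?)",
    [("victorian buildings",), ("modern bridge",)],
)


def search_places(term):
    """Return the names containing `term`, wrapped in % wildcards."""
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT name FROM places WHERE name LIKE ? ORDER BY name", (pattern,)
    )
    return [row[0] for row in rows]


print(search_places("vict"))      # substring match
print(search_places(""))          # pattern is '%%', so every row matches
print(search_places("vict bui"))  # two partial words are not matched as such
```

Note that the last call finds nothing, because the pattern looks for the literal substring "vict bui"; a plain LIKE does not cover the two-prefix case from the question.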

Search algorithm options for ontology querying?

I have developed a tool that enables searching of an ontology I authored. It submits the searches as SPARQL queries.
I have received some feedback that my search implementation is all-or-none, or "binary". In other words, if a user's input doesn't exactly match a term in the ontology, they won't get any hit at all.
I have been asked to add some more flexible, or "advanced" search algorithms. Indexing and bag-of-words searching were suggested.
Can anyone give some examples of implementing search methods on an ontology that don't require a literal match?
First of all, what kind of entities are you trying to match (literals, or string casts of URIs?), and what kind of SPARQL queries are you running now? Something like this?
?term ?predicate "user input" .
If you are searching across literals, you can make the search more flexible right off the bat by using case-insensitive regular expression filtering, although this will probably make your searches slower, and it won't catch cases where some of the word tokens are present but in a different order. In the following example, you should probably constrain the types of ?term and ?predicate first, or even filter on a string datatype on ?someLiteral:
?term ?predicate ?someLiteral .
FILTER(regex(?someLiteral, "user input", "i"))
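To make the limitation of the regex filter concrete, here is a minimal Python sketch comparing a case-insensitive substring match with a simple bag-of-words overlap score. It runs on plain strings rather than on a triplestore, the sample literals are made up, and the user input is escaped here so it is treated as literal text (a SPARQL regex would interpret it as a pattern).

```python
import re


def regex_match(literal, user_input):
    """Case-insensitive substring match, similar in spirit to the regex filter."""
    return re.search(re.escape(user_input), literal, re.IGNORECASE) is not None


def token_overlap(literal, user_input):
    """Bag-of-words score: fraction of query tokens found in the literal."""
    literal_tokens = set(re.findall(r"\w+", literal.lower()))
    query_tokens = set(re.findall(r"\w+", user_input.lower()))
    if not query_tokens:
        return 0.0
    return len(literal_tokens & query_tokens) / len(query_tokens)


# Hypothetical literals standing in for labels from an ontology.
for literal in ("Heart Attack", "attack of the heart", "heart disease"):
    print(literal, regex_match(literal, "heart attack"), token_overlap(literal, "heart attack"))
```

The regex only hits the first literal, while the overlap score also ranks the reordered and partially matching literals, which is the kind of non-binary result the question asks about.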
Several triplestores offer support for full-text searching and result scoring. These are often extensions to the SPARQL language.
For example, Virtuoso and some others offer a bif:contains predicate. Virtuoso also offers the faceted search web interface (plus a service, I think.) I have been pleased with the web-based full text search in Blazegraph and Stardog, but I can't say anything at this point about using them with a SPARQL query to get a score on a search pattern. Some (GraphDB) even support explicit integration with Lucene or Solr*, so you may be able to take advantage of their search languages.
Finally... are you using a library like the OWL API or RDF4J to access your ontology? If so, you could certainly save the relationships between your terms and any literals in a Java native data structure, and then directly use a fuzzy search component like Lucene to index each literal as a "document" and then search the user input across the index.
Why don't you post your ontology and give an example of a search you would like to perform in a non-binary way? I (or someone else) can try to show you a minimal implementation.
*Solr integration only appears to be offered in the commercially-licensed version of GraphDB

Wild card searches with query_string

Is it possible to enable wild card queries by default using query_string?
I'm having to manually append * to each of the terms. I had a look at the documentation but couldn't find anything.
No, there is no way to enable it by default. You can allow or disallow leading wildcards with the "allow_leading_wildcard" option. The way it works is that ES tries to match tokens: if you search for car it will match only car, whereas if you search for car* it will also match cars (of course this depends on the analysis).
I don't know your use case, but you should look into how Elasticsearch deals with language; it should help. Also note that leading wildcards can cause performance issues, which is why it is sometimes better to disable them.
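Since the wildcard has to be appended per term, one option is to do it in the client before sending the query. The following Python sketch does that and builds a query_string request body; the helper names are made up, the handling of operators is deliberately naive (quoted phrases and field prefixes are not treated specially), and sending the body to a cluster is left out.

```python
import json

# Boolean operators of the query_string syntax that must not get a wildcard.
OPERATORS = {"AND", "OR", "NOT"}


def add_trailing_wildcards(query):
    """Append * to every plain term, leaving operators and existing wildcards alone."""
    terms = []
    for term in query.split():
        if term in OPERATORS or term.endswith("*"):
            terms.append(term)
        else:
            terms.append(term + "*")
    return " ".join(terms)


def build_query_body(user_input):
    """Build a query_string search body with leading wildcards disallowed."""
    return {
        "query": {
            "query_string": {
                "query": add_trailing_wildcards(user_input),
                "allow_leading_wildcard": False,
            }
        }
    }


print(json.dumps(build_query_body("car AND bike"), indent=2))
```

Only trailing wildcards are added here, in line with the note above that leading wildcards can be expensive.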

How to search usenet for programming questions?

I've been using Usenet searches since about 1995 to get programming information, mostly for Microsoft APIs, first via DejaNews and now via Google Groups, which bought out DejaNews. Over the last few years I've noticed a steady decline in the quantity of Usenet search results from Google, and today I find I'm completely unable to get a working Usenet search on their advanced group search page. I'm used to searching on "microsoft.*", sometimes supplemented with "microsoft" or "microsoft*". Just try to find a post from the 1996-1998 time period on "database" in either the comp.* or microsoft.* hierarchies, and if you can do it, please show your search expression. There should be thousands of results.
http://groups.google.com/groups/search?safe=off&q=database+group%3Amicrosoft*&btnG=Rechercher&as_mind=1&as_minm=1&as_miny=1996&as_maxd=1&as_maxm=1&as_maxy=1999&as_drrb=b&sitesearch=
seems to work nicely... 994 results (not thousands, but still...)
It appears to be a problem with the advanced search form. I can't get the one at
http://groups.google.fr/advanced_search?hl=fr&q=&hl=fr&
to work either. But I can use the basic form with "database group:microsoft*" and I get many results as expected.
http://www.google.ca/groups/search?safe=off&q=database+group%3Acomp.*&btnG=Search&sitesearch=
returns 3,000 results
The advanced search isn't working for me either:
Broken advanced search results URL
However, removing lr=selected from the query string in that URL makes it work, for some reason:
Working advanced search results URL
In fact, hitting the search button again on the broken advanced search results page will return those results as well for me.
Or actually, it's only partly working, since entering multiple comma-separated groups in the advanced search form (or using the group: search operator) doesn't quite work as expected and ends up adding all the words in the additional group names as search keywords too.
You could try learning Julian dates and use the daterange search operator:
Search results using daterange:
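For the daterange suggestion, the Julian day numbers can be computed instead of learned. This is a minimal Python sketch under the assumption that daterange takes whole Julian day numbers in the form daterange:START-END; the function names are made up for the example.

```python
from datetime import date

# Offset between Python's proleptic Gregorian ordinal and the Julian day number.
# Sanity check: 2000-01-01 has Julian day number 2451545.
JDN_OFFSET = 1721425


def julian_day_number(day):
    """Return the Julian day number of a calendar date."""
    return day.toordinal() + JDN_OFFSET


def daterange_query(terms, start, end):
    """Build a search expression restricted to dates between start and end."""
    return f"{terms} daterange:{julian_day_number(start)}-{julian_day_number(end)}"


print(daterange_query("database group:microsoft*", date(1996, 1, 1), date(1999, 1, 1)))
```

The resulting string can be pasted into the basic search form, which sidesteps the advanced search page altogether.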

Resources