I am used to full-text search engines doing this for me, but imagine I input "mongo fac" into Elasticsearch. I expect it to find "mongo factory" and "mongodb something" equally; however, it does not.
Assume I have a single field called title. I have three documents with the titles:
Mongo Factory
Mongodb something
cheese
I have a single boolean should clause with:
array('prefix' => array(
    'title' => 'mongo fac'
)),
Using the default analyzers and no special configuration, "Mongo Factory" will be found but not "Mongodb something".
What I want is for "Mongodb something" to appear in the results as well; basically, I want Elasticsearch to tokenize the keywords, so that besides searching for "mongo fac" it also searches for "mongo" and "fac".
Other than tokenizing the input myself, what else can I do to get Elasticsearch to work the way I want, preferably using its own tokenizer as a means to tokenize my keywords?
For reference to others who come across this question: I didn't find a valid solution in the end, so I just wrote a function to tokenise the words myself and form separate prefix queries, and it works as it should.
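For anyone who wants a sketch of that workaround: split the input on whitespace and add one prefix clause per token to a bool/should query. Here is a minimal illustration using the official Elasticsearch Python client (the index name and the client choice are my assumptions, not from the original post):

from elasticsearch import Elasticsearch

es = Elasticsearch()  # assumes a local default node

def prefix_search(keywords, index="my_index", field="title"):
    # One prefix clause per whitespace-separated token, so "mongo fac"
    # also matches titles whose words begin with "mongo" or with "fac".
    tokens = keywords.lower().split()
    body = {
        "query": {
            "bool": {
                "should": [{"prefix": {field: t}} for t in tokens],
                "minimum_should_match": 1,
            }
        }
    }
    return es.search(index=index, body=body)

# "Mongo Factory" matches both tokens; "Mongodb something" matches "mongo".
results = prefix_search("mongo fac")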
Related
I am using Spring Boot and Elasticsearch, and I am trying to get three-character searches to work, but searches only match on five characters or more.
If I have a user name of 'Bob Smith', I can find the match by searching for 'Smith', but searching for 'Bob' does not find a match.
I suspect this is something that needs to be changed in my 'SearchMappingConfig implements HibernateOrmSearchMappingConfigurer' class, but I can't find any information about changing the size of the tokens needed to successfully match a result.
My '@Entity' classes have '@FullTextField(analyzer = "english")' annotations on the fields I want included in the token searches.
How do I change the length of the search match?
Ideally I would like any three letters to form a match, so a search for 'Ron' would match 'Ronald' and 'Laronda'.
Elasticsearch 7.14
Spring Boot 2.7.6
I have been reading Spring Boot and Elasticsearch documentation but cannot find any information about changing the match length.
Hibernate Search is able to use either an Elasticsearch or a Lucene backend. Our existing project uses Lucene, and it would have been a large undertaking to replace that.
The recommended solution is to create new analyzers so that incoming data is split into smaller tokens, but I didn't want to change analyzers on my existing database.
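(For reference, a hedged sketch of that rejected analyzer approach, shown against a plain Elasticsearch index via the Python client rather than the Lucene backend this project uses; the index and field names are made up. A 3-gram filter is what lets 'Ron' match both 'Ronald' and 'Laronda'; an edge_ngram filter would only match prefixes like 'Ronald'.)

from elasticsearch import Elasticsearch

es = Elasticsearch()  # hypothetical local cluster, for illustration only

es.indices.create(
    index="users",  # made-up index name
    body={
        "settings": {
            "analysis": {
                "filter": {
                    "trigram": {"type": "ngram", "min_gram": 3, "max_gram": 3}
                },
                "analyzer": {
                    "trigram_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "trigram"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                "username": {"type": "text", "analyzer": "trigram_analyzer"}
            }
        },
    },
)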
A lot of the documentation I was able to find pointed to using the Elasticsearch query builder or the Hibernate 5 method of using a wildcard.
I tested our Elasticsearch and found that the wildcard solution would work.
I ended up using the Hibernate 6 method for wildcard searching and it works well.
SearchResult<DataClass> result = searchSession.search(DataClass.class)
        .where(f -> f.wildcard()
                .fields(
                        "firstname",
                        "lastname",
                        "username",
                        "currentLegalName")
                .matching("*" + searchText.toLowerCase() + "*"))
        .fetch(10);
long totalHitCount = result.total().hitCount();
logger.debug("Search results size {}", totalHitCount);
I am exploring deepset Haystack and found it very interesting for multiple use cases like a chatbot, a search engine, document search, etc.
However, I have not found any reference showing how to create multiple indexes for different documents and search based on those indexes. I thought of using meta tags for a conditional search (on a particular area) by tagging the documents first and then using the params parameter of the query API, but that doesn't seem to work and throws an error (I used the vanilla docker-compose based setup).
You can indeed use multiple indices in the same document store if you want to support multiple use cases. The write_documents method of the document store has an index parameter, so you can store documents for your different use cases in different indices. In the same way, you can pass an index parameter to the query method.
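A minimal sketch of that index-per-use-case approach (assuming Haystack v1.x with the Elasticsearch-backed document store from the standard docker-compose setup; the index names and sample documents are made up):

from haystack.document_stores import ElasticsearchDocumentStore

document_store = ElasticsearchDocumentStore(host="localhost")

# Recent Haystack versions use "content" as the text field;
# older versions use "text", as in the formats shown below.
chatbot_docs = [{"content": "Hello! How can I help you?"}]
country_docs = [{"content": "Algeria is...", "meta": {"name": "75_Algeria75.txt"}}]

# Store each use case's documents in its own index.
document_store.write_documents(chatbot_docs, index="chatbot")
document_store.write_documents(country_docs, index="documents")

# Search only the index that belongs to the current use case.
hits = document_store.query(query="capital town", index="documents")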
As you expected, there is an alternative solution that uses the meta field of documents. However, the format needs to be slightly different. Your query needs to have the following format:
{"query": "What's the capital town?", "params": {"filters": {"name": "75_Algeria75.txt"}}}
and your documents need to have the following format:
{'text': 'Algeria is...', 'meta': {'name': '75_Algeria75.txt'}}
I am using Liferay 7.1 together with Elasticsearch, and all I want to do is search for (EXAMPLE): "This is a test".
But in this case "is" and "a" are stop words; they get filtered out, and therefore I get results that I do not want, like "This test rocks".
I am using a BooleanQuery like this:
BooleanQuery keywordQuery = new BooleanQueryImpl();
keywordQuery.addTerms(KEYWORDS, keyword, false);
The keyword in this case is "this is a test".
Can anyone tell me how to make the BooleanQuery not filter out stop words?
Best regards,
Daniel
Stop words are a concept of the analysis phase at indexing time, so your index does not contain "is" and "a". Therefore, there is no query-time parameter that makes stop words searchable.
What you could do is use a different search index attribute that contains the full content including stop words. This depends on your configuration: maybe there is already an attribute indexed without stop-word filtering, or you may need to add one using an Index Post-Processor or by modifying your Elasticsearch mapping configuration.
Please check your document structure (e.g. with ElasticHQ) to inspect which attributes have stop words removed.
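For illustration, one way to get such an attribute on the Elasticsearch side is a multi-field whose sub-field is analyzed without a stop-word filter. This is a hedged sketch with made-up index and field names, using the Python client; in Liferay you would apply the equivalent change through an Index Post-Processor or a mapping contribution:

from elasticsearch import Elasticsearch

es = Elasticsearch()

# "content" keeps stop-word-filtered analysis; "content.exact" uses the
# standard analyzer, which removes no stop words by default, so "is"
# and "a" remain searchable.
es.indices.create(
    index="my_index",
    body={
        "mappings": {
            "properties": {
                "content": {
                    "type": "text",
                    "analyzer": "english",  # example analyzer that strips stop words
                    "fields": {
                        "exact": {"type": "text", "analyzer": "standard"}
                    },
                }
            }
        }
    },
)

# Phrase-match against the sub-field so the stop words must be present.
es.search(index="my_index", body={
    "query": {"match_phrase": {"content.exact": "This is a test"}}
})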
Before asking this question, I searched a lot about my problem. I need to do full-text search on MongoDB in a Spring application. Up to now I have only tried something with regex, but it does not cover my requirements. For example, given the search string 'increased world population', my search should return documents that match the whole string well, as well as documents containing at least one word from it. I know Lucene does full-text search, but I don't know how to combine it with my Spring Data MongoDB setup, and I don't know whether Spring Data already offers full-text search. I need a tutorial that explains this.
What I have done up to now:
Criteria textCriteri = Criteria.where("title").regex(searchStr.trim().replaceAll(" +", " "), "i");
Query query = new Query(locationCriteria).addCriteria(textCriteri).limit(Consts.MONGO_QUERY_LIMIT);
List<MyObject> advs = mongoTemplate.find(query, MyObject.class);
You can create a 'text' index in MongoDB and search through that; see http://docs.mongodb.org/manual/core/index-text/
Depending on your search queries, you may want to use a more powerful search engine like Elasticsearch (as you mentioned Lucene).
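The text index itself lives in MongoDB, independent of Spring; here is a minimal sketch with PyMongo (the database, collection, and field names are made up to follow the question's example):

from pymongo import MongoClient, TEXT

client = MongoClient()  # assumes a local mongod
coll = client.mydb.articles  # made-up database/collection names

# A text index on "title"; $text queries match documents containing at
# least one stemmed word from the search string and can score by relevance.
coll.create_index([("title", TEXT)])

cursor = coll.find(
    {"$text": {"$search": "increased world population"}},
    {"score": {"$meta": "textScore"}},
).sort([("score", {"$meta": "textScore"})])

On the Spring side, Spring Data MongoDB can express the same query with its TextCriteria and TextQuery classes once the index exists.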
I'm integrating Elasticsearch into an asset tracking application. When I set up the mapping initially, I envisioned the 'brand' field being a single-term field like 'Hitachi' or 'Ford'. Instead, I'm finding that the brand field in the actual data contains multiple terms like "MB 7 A/B", "B-7" or even "Brush Bull BB72X".
I have an autocomplete component set up now that I configured to do autocomplete against an edgeNGram field and perform the actual search against an nGram field. It's completely useless the way I set it up, because users expect the search results to be restricted to what the autocomplete matches.
Any suggestions on the best way to setup my mapping to support autocomplete and subsequent searches against a multiple term field like this? I'm considering a terms query against a keyword field, or possibly a match query with 'and' as the operator? I also have to deal with hyphens like "B-7".
You can use the phrase suggester; the suggesters guide is here:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-suggesters.html
and the phrase suggester guide is here:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-suggesters-phrase.html
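A minimal example of a phrase-suggest request against the brand field, sketched with the Python client (the index name is hypothetical; phrase suggesters usually work best against a shingle/bigram sub-field):

from elasticsearch import Elasticsearch

es = Elasticsearch()

# Ask the phrase suggester for corrections of a misspelled multi-term brand.
response = es.search(
    index="assets",  # hypothetical index name
    body={
        "suggest": {
            "brand_suggest": {
                "text": "brush bul bb72x",
                "phrase": {"field": "brand"},
            }
        }
    },
)

for option in response["suggest"]["brand_suggest"][0]["options"]:
    print(option["text"], option["score"])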