How to make Country-State-City like search in elastic Search - elasticsearch

I want to make dependent search like when user type country and select country then on next dropdown/text search result would be from that particular Countries state, after selecting state on next text search would only based on that selected state. can anyone help to achieve this kind thing via elastic search.
i am new to elastic Search and i had basic idea of it, but didn't get idea how to do this kind of stuff where i need to search from child and feel data like map

Fist, it is important to understand how Elasticsearch store its data. You can find this kind of info here: https://www.elastic.co/guide/en/elasticsearch/reference/master/documents-indices.html
So, basically what you need is build a query with two must terms.
One for the object type (Country, State, etc).
Other for the name ("Los Angeles", "Massachussets", etc). If you want a autocomplete feature you could add a wildcard query in your list. https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-wildcard-query.html
Obs: When you store your State object do not forget to store the Country name together. Since Elasticsearch is non relational you have to have Country name indexed in the State document.
Hope that it helps

Related

Elasticsearch: Add field on the fly like driving distance by a user searching

Is there a way to dynamically append a driving distance of the CONSUMER in ES?
I am trying to create an app and sort PROVIDERS by driving distance.
One thing I could think of is get all the PROVIDERS within a range then put them in an array. Iterate to add driving distance then sort array. But this is not efficient.
Any suggestions?
thank you!
You can use runtime fields to achieve what you want, A runtime field is a field that is evaluated at query time. Runtime fields enable you to:
Add fields to existing documents without reindexing your data Start
working with your data without understanding how it’s structured
Override the value returned from an indexed field at query time
Define fields for a specific use without modifying the underlying
schema
For more information you can check Elasticsearch blog post here, and for runtime fields here.

Elastic Search and Search Ranking Models

I am new to Elastic Search. I would like to know if the following steps are how typically people use ES to build a search engine.
Use Elastic Search to get a list of qualified documents/results based on a user's input.
Build and use a search ranking model to sort this list.
Use this sorted list as the output of the search engine to the user.
I would probably add a few steps
Think about your information model.
What kinds of documents are you indexing?
What are the important fields and what field types are they?
What fields should be shown in the search result?
All this becomes part of your mapping
Index documents
Are the underlying data changing or can you index it just once?
How are you detecting new docuemtns/deletes/updates?
This will be included in your connetors, that can be set up in multiple ways, for example using the Documents API
A bit of trial and error to sort out your ranking model
Depending on your use case, the default ranking may be enough.
have a look at the Search API to try out different ranking.
Use the search result list to present the results to the end user

Good way to exclude records in SOLR or Elasticsearch

For a matchmaking portal, we have one requirement where in, if a customer viewed complete profile details of a bride or groom then we have to exclude that profile from further search results. Currently, along with other detail we are storing the viewed profile ids in a field (Comma Separated) against that bride or groom's details.
Eg., if A viewed B, then in B's record under the field saw_me we will add A (comma separated).
while searching let say the currently searching members id is 123456 then we will fire a query like
Select * from profiledetails where (OTHER CON) AND 123456 not in saw_me;
The problem here is the saw_me field value is growing like anything, is there any better way to handle this requirement? Please guide.
If this is using Solr:
first, DON'T add the 'AND NOT ...' clauses along with the main query in q param, add them to fq. This have many benefits (the fq will be cached)
Until you get to a list of values that is maybe 1000s this approach is simple and should work fine
After you reach a point where the list is huge, maybe it time to move to a post filter with a high cost ( so it is looked up last). This would look up docs to remove in an external source (redis, db...).
In my opinion no matter how much the saw_me field grows, it will not make much difference in search time.Because tokens are indexed inversely and doc_values are created at index time in column major fashion for efficient read and has support for caching from OS. ES handles these things for you efficiently.

Elasticsearch with multiple search criteria

I am try to build a full text search engine using elasticsearch. We have a application which has conferences running across the globe. We have the future and past conferences data. For a POC we have already loaded the conferences details into elasticsearch and it contains fields like title,date,venue,geo_location of the venue as document.
I am able to do simple search using match all query. And also by using function_score I can get the current on going conferences and also using user geo location i can get nearby conferences to users location.
But there are some uses cases where i got stuck and could not proceed. Use cases are.
1) If user try to search with "title + location" then I should not use the user current geo location rather whatever user has provided the city_name use that place geo location and retrieve those doc. Here I know some programming is also required.
2) User search with "title + year", for ex. cardio 2014. User interested to see all the caridology conf of 2014 and it should retrieve that year documents only. But using function score it is retrieving the current years documents.
First of all let me know that above two use cases can be handled in single query. I am thinking to handle it one request, but got stuck.
A proper solution would require you to write your own query parser in your application (outside of elasticsearch) that will parse the query and extract dates, locations, etc. Once all features are extracted, the parser should generate a bool query where each feature would become an appropriate must clause. So, the date would became a range query, the location - geo_location query and everything else would go into a match query for full text matching. Then this query can be sent to elasticsearch.

Elasticsearch indexed database table column structure

I have a question regarding the setup of my elasticsearch database index... I have created a table which I have rivered to index in elasticsearch. The table is built from a script that queries multiple tables to denormalize data making it easier to index by a unique id 1:1 ratio
An example of a set of fields I have is street, city, state, zip, which I can query on, but my question is , should I be keeping those fields individually indexed , or be concatenating them as one big field like address which contains all of the previous fields into one? Or be putting in the extra time to setup parent-child indexes?
The use case example is I have a customer with billing info coming from one direction, I want to query elasticsearch to see if that customer already exists, or at least return the closest result
I know this question is more conceptual than programming, I just can't find any information of best practices.
Concatenation
For the first part of your question: I wouldn't concatenate the different fields into a field containing all information. Having multiple fields gives you the advantage of calculating facets and aggregates on those fields, e.g. how many customers are from a specific city or have a specific zip. You can still use a match or multimatch query to query for information from different fields.
In addition to having the information in separate fields I would use multifields with an analyzed and not_analyzed part (fieldname.raw). This again allows for aggregates, facets and sorting.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/0.90/mapping-multi-field-type.html
Think of 'New York': if you analyze it will be stored as ['New', 'York'] and you will not be able to see all People from 'New York'. What you'd see are all people from 'New' and 'York'.
_all field
There is a special _all field in elasticsearch which does the concatenation in the background. You don't have to do it yourself. It is possible to enable/disable it.
Parent Child relationship
Concerning the part whether to use nested objects or parent child relationship: I think that using a parent child relationship is more appropriate for your case. Nested objects are stored in a 'flattened' way, i.e. the information from the nested objects in arrays is stored as being part of one object. Consider the following example:
You have an order for a client:
client: 'Samuel Thomson'
orderline: 'Strong Thinkpad'
orderline: 'Light Macbook'
client: 'Jay Rizzi'
orderline: 'Strong Macbook'
Using nested objects if you search for clients who ordered 'Strong Macbook' you'd get both clients. This because 'Samuel Thomson' and his orders are stored altogether, i.e. ['Strong' 'Thinkpad' 'Light' 'Macbook'], there is no distinction between the two orderlines.
By using parent child documents, the orderlines for the same client are not mixed and preserve their identity.

Resources