Full-text search as well as terms search on the same field in Elasticsearch

I come from a MySQL background, so I don't know much about Elasticsearch and how it works.
Here are my requirements:
There will be a table of result records with a sorting option on every column. There will be a filter option where the user selects multiple values for multiple columns (e.g., City should be one of City1, City2, City3 and Category should be one of Cat2, Cat22, Cat6). There will also be a search bar where the user enters some text, and a full-text search will be applied to some fields (e.g., City, Area, etc.).
This image will give a better understanding.
The problem I'm facing is full-text search. I have tried several mappings, but every time I have to compromise on either full-text search or terms search, so I suspect there is no way to apply both kinds of search to the same field. But as I said, I don't know much about Elasticsearch, so if anyone has a solution, it would be appreciated.
Here is the mapping I currently use. It enables sorting and terms search, but full-text search does not work.
{
  "mappings": {
    "my_type": {
      "properties": {
        "city": {
          "type": "string",
          "index": "not_analyzed"
        },
        "category": {
          "type": "string",
          "index": "not_analyzed"
        },
        "area": {
          "type": "string",
          "index": "not_analyzed"
        },
        "zip": {
          "type": "string",
          "index": "not_analyzed"
        },
        "state": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}

You can update the mapping to use multi-fields, with two sub-mappings: one for full-text search and another for terms search. Here's a sample mapping for city.
{
  "city": {
    "type": "string",
    "index": "not_analyzed",
    "fields": {
      "fulltext": {
        "type": "string"
      }
    }
  }
}
The default (top-level) mapping is for terms search, so when a terms search is required, you can simply query the "city" field. When you need full-text search, the query must be performed on "city.fulltext". Hope this helps.
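To make the combination concrete, here is a minimal Python sketch that only builds a request body; it does not talk to a cluster. It assumes Elasticsearch 2.0 or later (for the filter clause of the bool query) and assumes that area gets a fulltext sub-field just like the city sample above. The helper name build_search_body is hypothetical.

```python
import json


def build_search_body(filters, text=None, text_fields=(), sort_field=None, order="asc"):
    """Combine exact-value filters, optional full-text search, and sorting."""
    bool_query = {
        # Exact matching against the not_analyzed main fields, e.g. "city".
        "filter": [{"terms": {field: values}} for field, values in filters.items()]
    }
    if text:
        # Full-text matching against the analyzed sub-fields, e.g. "city.fulltext".
        bool_query["must"] = [
            {"multi_match": {"query": text, "fields": list(text_fields)}}
        ]
    body = {"query": {"bool": bool_query}}
    if sort_field:
        # Sort on the not_analyzed field so whole values are compared.
        body["sort"] = [{sort_field: {"order": order}}]
    return body


body = build_search_body(
    filters={
        "city": ["City1", "City2", "City3"],
        "category": ["Cat2", "Cat22", "Cat6"],
    },
    text="york",
    text_fields=["city.fulltext", "area.fulltext"],  # area.fulltext is assumed
    sort_field="city",
)
print(json.dumps(body, indent=2))
```

The filters and the sort use the top-level fields, while the search-bar text goes only to the fulltext sub-fields, so neither kind of search has to be compromised.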

Full-text search won't work on not_analyzed fields and sorting won't work on analyzed fields.
You need to use multi-fields.
It is often useful to index the same field in different ways for different purposes. This is the purpose of multi-fields. For instance, a string field could be mapped as a text field for full-text search, and as a keyword field for sorting or aggregations:
For example:
{
  "mappings": {
    "my_type": {
      "properties": {
        "city": {
          "type": "text",
          "fields": {
            "raw": {
              "type": "keyword"
            }
          }
        } ...
      }
    }
  }
}
Use the dot notation to sort by city.raw:
{
  "query": {
    "match": {
      "city": "york"
    }
  },
  "sort": {
    "city.raw": "asc"
  }
}

Related

Merging fields in Elastic Search

I am pretty new to Elasticsearch. I have a dataset with multiple fields such as name, product_info, description, etc. When searching for a document, the search term can come from any of these fields (let us call them the "search core fields").
If I start storing the data in Elasticsearch, should I derive a field that concatenates all the "search core fields", and then index this field alone?
I came across the _all mapping concept and am a little confused. Does it do the same thing?
No, you don't need to create a new field with concatenated terms.
You can just use _all with a match query to search for text in any field.
As for _all: yes, it searches the text from every field.
The _all field was deprecated in ES 6 and removed in ES 7, so it is only an option on older versions. The main reason for this is that it used too much storage space.
However, you can define your own "all" field using the copy_to feature. You specify in your mapping which fields should be copied to your custom "all" field, and then you search on that field.
You can define your mapping like this:
PUT my-index
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "copy_to": "custom_all"
      },
      "product_info": {
        "type": "text",
        "copy_to": "custom_all"
      },
      "description": {
        "type": "text",
        "copy_to": "custom_all"
      },
      "custom_all": {
        "type": "text"
      }
    }
  }
}
PUT my-index/_doc/1
{
  "name": "XYZ",
  "product_info": "ABC product",
  "description": "this product does blablabla"
}
And then you can search on your "all" field like this:
POST my-index/_search
{
  "query": {
    "match": {
      "custom_all": {
        "query": "ABC",
        "operator": "and"
      }
    }
  }
}
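If the list of "search core fields" is long or changes over time, the mapping and query above can be generated rather than written by hand. The following Python sketch only builds the request bodies shown above; the function names build_copy_to_mapping and build_all_query are hypothetical, and it assumes every source field is a plain text field.

```python
def build_copy_to_mapping(source_fields, target="custom_all"):
    """Return a mapping in which every source field is copied into `target`."""
    properties = {
        name: {"type": "text", "copy_to": target} for name in source_fields
    }
    # The catch-all field itself is an ordinary text field.
    properties[target] = {"type": "text"}
    return {"mappings": {"properties": properties}}


def build_all_query(text, target="custom_all", operator="and"):
    """Return a match query against the catch-all field."""
    return {"query": {"match": {target: {"query": text, "operator": operator}}}}


mapping = build_copy_to_mapping(["name", "product_info", "description"])
query = build_all_query("ABC")
print(mapping)
print(query)
```

The two dictionaries correspond to the bodies of the PUT my-index and POST my-index/_search requests above.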

I want to find exact term of sub string, exact term not just part of the term

I have a group of JSON documents from Wikidata (http://www.wikidata.org) to index into Elasticsearch for search.
Each document has several fields. For example, it looks like this:
{
  "eId": "Q25338",
  "eLabel": "The Little Prince, Little Prince",
  ...
}
What I want is for the user to search by 'exact term', not by part of a term. That is, if a user searches for 'prince', I don't want this document in the search results. When the user types the whole term 'the little prince' or 'little prince', I want this document to be included in the search results.
Should I pre-process all the comma-separated labels (some eLabel values have tens of elements in the list), split them into separate documents, and make each one a keyword field?
If not, how can I write a mapping that makes this search work as expected?
My current Mappings.json:
"mappings": {
  "entity": {
    "properties": {
      "eLabel": { # want to replace
        "type": "text",
        "index_options": "docs",
        "analyzer": "my_analyzer"
      },
      "eid": {
        "type": "keyword"
      },
      "subclass": {
        "type": "boolean"
      },
      "pLabel": {
        "type": "text",
        "index_options": "docs",
        "analyzer": "my_analyzer"
      },
      "prop_id": {
        "type": "keyword"
      },
      "pType": {
        "type": "keyword"
      },
      "way": {
        "type": "keyword"
      },
      "chain": {
        "type": "integer"
      },
      "siteKey": {
        "type": "keyword"
      },
      "version": {
        "type": "integer"
      },
      "docId": {
        "type": "integer"
      }
    }
  }
}
Regarding whether you should pre-process the comma-separated labels and index each one as a keyword:
Yes, this is exactly what you should do. A keyword field won't split the comma-separated list for you; it treats the value as one whole string. But if you preprocess it and make the resulting field a keyword field, that will work very well, since exact matching is what the keyword field type is designed for. I'd recommend using a term query to search for exact matches. (Unlike a match query, a term query does not analyze the incoming query and is thus more efficient.)
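A minimal Python sketch of that preprocessing step follows. It rests on several assumptions that are not in the question: eLabel is remapped as a keyword field, the split labels are stored as an array of values in one document (separate documents, as the asker suggests, would work along the same lines), and both the labels and the user input are lowercased so that 'little prince' matches "Little Prince". The function names are hypothetical, and labels that themselves contain a comma would need a smarter splitter.

```python
def normalize(text):
    """Trim and lowercase so stored labels and user input compare equal."""
    return text.strip().lower()


def split_labels(raw_label):
    """Split a comma-separated label string into distinct normalized labels."""
    labels = []
    for part in raw_label.split(","):
        label = normalize(part)
        if label and label not in labels:
            labels.append(label)
    return labels


def exact_label_query(user_input, field="eLabel"):
    """Build a term query; the input is normalized the same way as the labels."""
    return {"query": {"term": {field: normalize(user_input)}}}


doc = {
    "eId": "Q25338",
    "eLabel": split_labels("The Little Prince, Little Prince"),
}
print(doc["eLabel"])  # ['the little prince', 'little prince']
print(exact_label_query("Little Prince"))
```

With the labels stored this way, a term query for 'prince' finds nothing in this document, because 'prince' is not equal to any of the stored values.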

Using both term and match query on same text field?

I have an index with a text field.
"state": {
  "type": "text"
}
Now suppose there are two documents:
"state": "vail"
and
"state": "eagle vail"
For one of my requirements,
- I need a term-level query, such that if I type "vail", the search results return only states equal to "vail" and not "eagle vail".
But for another requirement, a different search on the same index,
- I need a match query for full-text search, such that if I type "vail", "eagle vail" is returned as well.
So my question is: how do I do both term-level and full-text search on this field? For a term-level query, I would have to map it as "keyword" so that it won't be analyzed.
You can use the multi-fields feature to achieve this. Here is a mapping:
{
  "mappings": {
    "my_type": {
      "properties": {
        "state": {
          "type": "text",
          "fields": {
            "raw": {
              "type": "keyword"
            }
          }
        }
      }
    }
  }
}
In this case, state acts as a text field (tokenized), whereas state.raw is a keyword field (single token). When indexing a document, you only set state; state.raw is populated automatically.
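With that mapping, the two requirements map to two different query bodies. The Python sketch below only constructs those bodies and does not send them anywhere; the function names are hypothetical.

```python
def exact_state_query(value):
    """Term-level query on the keyword sub-field: matches the whole value only."""
    return {"query": {"term": {"state.raw": value}}}


def fulltext_state_query(text):
    """Full-text query on the analyzed field: matches any document containing the token."""
    return {"query": {"match": {"state": text}}}


print(exact_state_query("vail"))     # intended to return "vail" but not "eagle vail"
print(fulltext_state_query("vail"))  # intended to return both "vail" and "eagle vail"
```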

Mapping in elasticsearch

Good morning. In my code I can't search for data that contains several words separated by spaces; searching for a single word works fine. I think the problem is in the mapping. I use Postman. When I send a GET request to http://192.168.1.153:9200/sport_scouts/video/_mapping, I get:
{
  "sport_scouts": {
    "mappings": {
      "video": {
        "properties": {
          "hashtag": {
            "type": "string"
          },
          "id": {
            "type": "long"
          },
          "sharing_link": {
            "type": "string"
          },
          "source": {
            "type": "string"
          },
          "title": {
            "type": "string"
          },
          "type": {
            "type": "string"
          },
          "user_id": {
            "type": "long"
          },
          "video_preview": {
            "type": "string"
          }
        }
      }
    }
  }
}
That looks fine: title has type string. But if I search for two or more words, I get an empty array. My code, in a trait:
public function search($data) {
    $this->client();
    $params['body']['query']['filtered']['filter']['or'][]['term']['title'] = $data;
    $search = $this->client->search($params)['hits']['hits'];
    dump($search);
}
Then I call it in my controller. Can you help me with this problem?
Your indexed data can't be found because of a mismatch between the analysis applied during indexing and the strict term filter used when querying.
With your mapping configuration, you are using the default analyzer, which (among other operations) tokenizes the text. So every multi-word value you insert is split at punctuation and whitespace. If you insert, for example, "some great sentence", Elasticsearch indexes the following terms for your document: "some", "great", "sentence", but not the term "great sentence". So if you apply a term filter for "great sentence", or for any other part of the original value that contains whitespace, you will not get any results.
Please see the Elasticsearch docs on how to configure your mapping for indexing without analysis (https://www.elastic.co/guide/en/elasticsearch/guide/current/mapping-intro.html#_index_2) or consider using a match query instead of a term filter with the existing mapping (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-query.html).
Please be aware that if you switch to not_analyzed, you lose much of the useful fuzzy full-text query functionality. Of course, you can set up a mapping that does both, analyzed and not_analyzed, in different fields. Then it is up to you to decide which field to query.
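The mismatch can be illustrated with a toy inverted index in Python. This is only a rough stand-in, not the real Elasticsearch analyzer: it lowercases the text and splits on anything that is not an ASCII letter or digit, and all function names are hypothetical.

```python
import re


def analyze(text):
    """Rough approximation of default analysis: lowercase, split on non-alphanumerics."""
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for token in analyze(text):
            index.setdefault(token, set()).add(doc_id)
    return index


def term_lookup(index, value):
    """Like a term filter: the value is looked up as-is, without analysis."""
    return index.get(value, set())


def match_lookup(index, text):
    """Like a match query: the input is analyzed, then any token may match."""
    hits = set()
    for token in analyze(text):
        hits |= index.get(token, set())
    return hits


index = build_index({1: "some great sentence"})
print(term_lookup(index, "great sentence"))   # set()
print(term_lookup(index, "great"))            # {1}
print(match_lookup(index, "great sentence"))  # {1}
```

The index holds "some", "great" and "sentence" as separate entries, so the unanalyzed lookup for "great sentence" has nothing to hit, while the analyzed lookup finds the document through its individual tokens.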

How can I get a search term with a space to be one search term

I have an Elasticsearch index with a field called "name", mapped as follows:
"name": {
  "type": "string",
  "fields": {
    "raw": {
      "type": "string",
      "index": "not_analyzed"
    }
  }
},
Now let's say I have a record "Brooklyn Technical High School".
I would like it to show up when somebody searches for "brooklyn t*". For example: http://myserver/_search?q=name:brooklyn+t*
However, it seems to tokenize the search term and search for both "brooklyn" and "t", because I get back results like "Ps 335 Granville T Woods".
I would like it to search the not_analyzed field using the whole search term. Enclosing it in quotes doesn't seem to help either.
You need to use a term query.
A term query won't analyze/tokenize the string before it applies the search.
{
  "query": {
    "term": {
      "user": "kimchy"
    }
  }
}
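The snippet above queries a user field, which is not part of the question's mapping, and a term query matches a whole term exactly rather than expanding "t*". As an alternative sketch (a swapped-in technique, not the term query suggested above), a prefix query against the not_analyzed sub-field name.raw also skips analysis of the input. The Python below only builds the request body and checks the intended behavior locally; the function name is hypothetical. Because name.raw holds the original string, the prefix is case-sensitive in this sketch.

```python
def prefix_query(field, prefix):
    """Build a prefix query; the prefix is not analyzed, so it is used as typed."""
    return {"query": {"prefix": {field: prefix}}}


body = prefix_query("name.raw", "Brooklyn T")
print(body)

# Local illustration of what a whole-value prefix is meant to select.
records = ["Brooklyn Technical High School", "Ps 335 Granville T Woods"]
matches = [name for name in records if name.startswith("Brooklyn T")]
print(matches)  # ['Brooklyn Technical High School']
```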
