Apply a language analyzer to an object-type property/field in Elasticsearch

I need to apply my Spanish analyzer to an object-type field, but it returns an error. How can I apply it? The field contains an array of JSON objects.
Thanks!
$params = [
'index' => 'cl',
'body' => [
'analysis' => [
'analyzer' => [
'spanish' => [
'type' => 'custom',
'tokenizer' => 'standard',
'filter' => ['lowercase', 'asciifolding', 'spanish_stop', 'spanish_stemmer']
]
],
'filter' => [
'spanish_stemmer' => [
'type' => 'stemmer',
'name' => 'spanish'
],
'spanish_stop' => [
'type' => 'stop',
'stopwords' => ['_spanish_', 'del', 'de', 'el', 'ella', 'los', 'los', 'las', 'la', 'a', 'un', 'y']
]
]
],
'mappings' => [
'people' => [
'_all' => ['enabled' => true],
'properties' => [
'user_info' => ['type' => 'text', 'analyzer' => 'spanish', 'search_analyzer' => 'spanish'],
'resumes' => ['type' => 'object', 'analyzer' => 'spanish', 'search_analyzer' => 'spanish'],
'resume_details' => ['type' => 'object', 'analyzer' => 'spanish', 'search_analyzer' => 'spanish'],
'applications' => ['type' => 'object', 'analyzer' => 'spanish', 'search_analyzer' => 'spanish'],
],
],
],
]
];
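One thing that may explain the error: `analyzer` is a parameter of `text` fields, and an `object` field does not accept it. A minimal sketch, assuming the objects inside `resumes` carry text sub-fields called `title` and `summary` (hypothetical names, since the question does not show them), declares the analyzer on each leaf field instead. It also places `analysis` under `settings`, which is where index analysis configuration is expected.

```php
<?php
// Hypothetical sub-field names: the question does not show what `resumes` contains.
$spanishText = ['type' => 'text', 'analyzer' => 'spanish', 'search_analyzer' => 'spanish'];

$params = [
    'index' => 'cl',
    'body' => [
        'settings' => [
            // analysis configuration belongs under settings
            'analysis' => [
                'analyzer' => [
                    'spanish' => [
                        'type' => 'custom',
                        'tokenizer' => 'standard',
                        'filter' => ['lowercase', 'asciifolding', 'spanish_stop', 'spanish_stemmer'],
                    ],
                ],
                'filter' => [
                    'spanish_stemmer' => ['type' => 'stemmer', 'name' => 'spanish'],
                    'spanish_stop' => ['type' => 'stop', 'stopwords' => '_spanish_'],
                ],
            ],
        ],
        'mappings' => [
            'people' => [
                'properties' => [
                    'user_info' => $spanishText,
                    // the object itself has no analyzer; its text leaves do
                    'resumes' => [
                        'type' => 'object',
                        'properties' => [
                            'title' => $spanishText,
                            'summary' => $spanishText,
                        ],
                    ],
                ],
            ],
        ],
    ],
];
```

The same pattern would apply to `resume_details` and `applications`, each with its own real sub-field names.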

Related

Elasticsearch mapping with nested objects

This is my mapping. product_base is a nested object, and manufacturer is a nested child of product_base. When I create a document with more than one product_base entry, the entries appear as product_base.0, product_base.1, and so on, and I can't filter on them normally. Is my mapping wrong?
For example, the filter below returns null, because in my Elasticsearch document product_base appears as product_base.0, product_base.1, ...
'path' => 'product_base.lang',
'query' => [
    'bool' => [
        'must' => [
            ['match' => ['product_base.lang.id_product' => 1]],
            ['match' => ['product_base.lang.id_lang' => 2]],
        ],
    ],
],
$params = [
    'index' => 'product',
    'body' => [
        'mappings' => [
            '_source' => [
                'enabled' => true,
            ],
            'properties' => [
                'id_product' => ['type' => 'integer'],
                'id_product_base' => ['type' => 'integer'],
                'lang' => [
                    'type' => 'nested',
                    'properties' => [
                        'id_product' => ['type' => 'integer'],
                        'id_lang' => ['type' => 'byte'],
                        'name' => ['type' => 'text']
                    ],
                ],
                'product_base' => [
                    'type' => 'nested',
                    'properties' => [
                        'id_product_base' => ['type' => 'integer'],
                        'id_manufacturer' => ['type' => 'integer'],
                        'id_product_zone_group' => ['type' => 'text'],
                        'status' => ['type' => 'byte'],
                        'manufacturer' => [
                            'type' => 'nested',
                            'properties' => [
                                'id_manufacturer' => ['type' => 'integer'],
                                'name' => ['type' => 'text'],
                                'status' => ['type' => 'byte'],
                            ],
                        ],
                    ]
                ]
            ]
        ]
    ]
];

How can I do a nested search in Elasticsearch?

I am having trouble constructing a search query in ES 7.4.
Here is my mapping:
[
'settings' => [
'number_of_shards' => 1,
'number_of_replicas' => 1,
'analysis' => [
'filter' => [
'filter_stemmer' => [
'type' => 'stemmer',
'language' => 'english'
]
],
'analyzer' => [
'g_analyzer' => [
'type' => 'custom',
'filter' => ['lowercase', 'stemmer'],
'tokenizer' => 'standard'
],
"no_stopwords" => [
"type" => "standard",
"stopwords" => []
],
]
]
],
'mappings' => [
'_source' => [
'enabled' => true
],
'properties' => [
'id' => [
'type' => 'integer'
],
'title' => [
'type' => 'text',
"analyzer" => "g_analyzer",
],
'description' => [
'type' => 'text',
"analyzer" => "g_analyzer",
],
'jobStatus' => [
'type' => 'text'
],
'videoId' => [
'type' => 'text',
],
'thumbnail' => [
'type' => 'text'
],
'playlistId' => [
'type' => 'text'
],
'channelId' => [
'type' => 'text'
],
'publishedDate' => [
"type" => "date",
],
'created_at' => [ // date video was created
"type" => "date",
],
'updated_at' => [ //date video was updated
"type" => "date",
],
'url' => [
'type' => 'text'
],
'subtitles' => [
'type' => 'nested',
'properties' => [
'id' => [
'type' => 'integer'
],
'start_time' => [
'type' => 'float'
],
'end_time' => [
'type' => 'float'
],
'text' => [
'type' => 'text',
"analyzer" => "g_analyzer",
],
'langcode' => [
'type' => 'text'
],
]
]
]
]
];
What query do I need to search for the text "bill gates" in the subtitles and return the subtitle in which it was found, as well as the subtitles immediately before and after the hit?
I don't have your sample documents or expected output, so I can't test locally and give you a complete query. Since you are using the nested datatype, though, you need to use nested queries.
Nested queries are used to query nested fields, and the official documentation has some examples. See if you can follow them, post what you tried, and we can help from there.
I figured out how to do the nested query:
$body = [
    'query' => [
        'nested' => [
            'inner_hits' => [
                'size' => 3
            ],
            'path' => 'subtitles',
            'query' => [
                'bool' => [
                    'must' => [
                        [
                            'match' => ['subtitles.text' => $searchTerm]
                        ]
                    ]
                ]
            ]
        ]
    ],
];
Doing this adds an inner_hits section to each result, containing the subtitles in which the search terms were actually found.
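The inner hits alone do not include the neighbouring subtitles. Each nested inner hit carries a `_nested.offset`, which is its position in the parent's `subtitles` array, so one possible way to get the subtitle before and after is to look them up in the parent `_source`. The sketch below assumes `_source` still contains the full `subtitles` array; `subtitlesWithContext` is a hypothetical helper name, and the response is hand-built for illustration.

```php
<?php
// Collect each matched subtitle plus its neighbours from a nested-query response.
// Assumes the parent _source still contains the full `subtitles` array.
function subtitlesWithContext(array $response): array
{
    $results = [];
    foreach ($response['hits']['hits'] as $doc) {
        $all = $doc['_source']['subtitles'] ?? [];
        $innerHits = $doc['inner_hits']['subtitles']['hits']['hits'] ?? [];
        foreach ($innerHits as $inner) {
            $i = $inner['_nested']['offset']; // position in the subtitles array
            $results[] = [
                'previous' => $all[$i - 1] ?? null, // null at the start of the list
                'match'    => $all[$i],
                'next'     => $all[$i + 1] ?? null, // null at the end of the list
            ];
        }
    }
    return $results;
}

// Hand-built response fragment for illustration, not real server output.
$response = [
    'hits' => ['hits' => [[
        '_source' => ['subtitles' => [
            ['id' => 1, 'text' => 'welcome back'],
            ['id' => 2, 'text' => 'bill gates said'],
            ['id' => 3, 'text' => 'thank you'],
        ]],
        'inner_hits' => ['subtitles' => ['hits' => ['hits' => [
            ['_nested' => ['field' => 'subtitles', 'offset' => 1]],
        ]]]],
    ]]],
];

$context = subtitlesWithContext($response);
```

This relies on the subtitles being indexed in playback order; if they are not, sorting by `start_time` before the lookup would be needed.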

How to include a mapped field's subfield in the results in Elasticsearch?

For aggregation, I keep a raw value of my field, but I can't access this value in my query results. For example, I have the brand Tommy Hilfiger and its raw value tommy-hilfiger stored as brand.keyword. How can I include this value in the search results?
'body' => [
'settings' => [
'analysis' => [
'filter' => [
'remove_spaces_inside' => [
'type' => 'pattern_replace',
'pattern' => '\\s+',
'replacement' => ' '
],
'convert_spaces' => [
'type' => 'pattern_replace',
'pattern' => '\\s+',
'replacement' => '-'
],
],
'char_filter' => [
'convert_amp' => [
'type' => 'pattern_replace',
'pattern' => '&',
'replacement' => 'and'
]
],
'analyzer' => [
'slug' => [
'char_filter' => ['convert_amp'],
'tokenizer' => 'keyword',
'filter' => ['trim', 'lowercase', 'asciifolding', 'remove_spaces_inside', 'convert_spaces']
],
'format' => [
'char_filter' => ['convert_amp'],
'tokenizer' => 'keyword',
'filter' => ['trim', 'remove_spaces_inside']
]
]
]
],
'mappings' => [
'my_type' => [
'properties' => [
'brand' => [
'type' => 'string',
'fields' => [
'keyword' => [
'type' => 'string',
'analyzer' => 'slug',
'index_options' => 'docs',
]
]
]
]
]
]
]
Update:
In my case, I store the brand in two forms: the default "Tommy Hilfiger" for full-text search and a formatted keyword (slug) "tommy-hilfiger" for exact search. I can aggregate data by slug, but I can't get this field back in my query results. For example, this query returns all records with the brand Tommy Hilfiger, but only with the default values, not the slug.
'body' => [
'_source' => [
'brand',
'brand.keyword'
],
'query' => [
'bool' => [
'must' => [
[
'terms' => [
'brand.keyword' => [
'tommy-hilfiger',
]
]
]
]
]
]
]
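`_source` returns the document as it was sent for indexing, and a multi-field such as `brand.keyword` exists only in the index, so it does not show up there. One possible workaround is to reproduce the slug in application code from the returned `brand`. The sketch below approximates the `slug` analyzer from the question; `slugify` is a hypothetical helper, and it assumes ASCII input because the `asciifolding` step is not reproduced.

```php
<?php
// Approximates the `slug` analyzer from the question in application code.
// Assumption: ASCII input; the `asciifolding` step is not reproduced here.
function slugify(string $brand): string
{
    $s = str_replace('&', 'and', $brand);  // char_filter: convert_amp
    $s = strtolower(trim($s));              // filters: trim, lowercase
    $s = preg_replace('/\s+/', ' ', $s);    // filter: remove_spaces_inside
    return preg_replace('/\s+/', '-', $s);  // filter: convert_spaces
}

$slug = slugify('Tommy Hilfiger');
```

Another option along the same lines would be to compute the slug before indexing and store it as its own field, so that it becomes part of `_source`.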

Multi match and highlighting in elasticsearch

When I match a single field in the query, highlighting works fine in Elasticsearch.
When I use:
$params = [
'index' => 'my_index',
'type' => 'articles',
'body' => [
'from' => '0',
'size' => '10',
'query' => [
'bool' => [
'must' => [
'match' => [ 'content' => 'what I want to search' ]
]
]
],
'highlight' => [
'pre_tags' => ['<mark>'],
'post_tags' => ['</mark>'],
'fields' => [
'content' => [ 'fragment_size' => 150, 'number_of_fragments' => 3 ]
]
],
]
];
everything works. But when I try to match multiple fields, the search still works correctly and the highlighting disappears:
'match' => [ 'content' => 'what I want to search' ],
'match' => [ 'type' => 1 ]
Do you know how to get highlighting to work when I want to search on two different fields with two different queries?
Try this:
$params = [
    'index' => 'my_index',
    'type' => 'articles',
    'body' => [
        'from' => '0',
        'size' => '10',
        'query' => [
            'bool' => [
                // each clause is its own array, so none overwrites another
                'must' => [
                    ['match' => ['content' => 'what I want to search']]
                ],
                // exact match on type, applied as a filter
                'filter' => [
                    ['term' => ['type' => 1]]
                ]
            ]
        ],
        'highlight' => [
            'pre_tags' => ['<mark>'],
            'post_tags' => ['</mark>'],
            'fields' => [
                'content' => ['fragment_size' => 150, 'number_of_fragments' => 3]
            ]
        ],
    ]
];
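A likely reason the highlighting vanishes in the two-match version from the question: a PHP array literal keeps only the last value for a repeated key, so the `content` clause never reaches Elasticsearch and there is nothing left to highlight in that field. The sketch below shows the effect on the `must` array alone; the variable names are illustrative.

```php
<?php
// Duplicate string keys in a PHP array literal: the last value wins.
$broken = [
    'match' => ['content' => 'what I want to search'],
    'match' => ['type' => 1],
];

// A list of clauses, each in its own array, keeps both.
$working = [
    ['match' => ['content' => 'what I want to search']],
    ['match' => ['type' => 1]],
];

echo json_encode($broken), PHP_EOL;
echo json_encode($working), PHP_EOL;
```

Encoding both arrays as JSON makes the difference visible before the request is sent.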

Ignore Results Outside of Distance Range

I am working with Elasticsearch for an application that deals with "posts". I currently have it working with a geo_point so that it returns all posts ordered by distance from the end user. While this works, I need to add one more aspect to the system.
Posts can be paid for. For instance, if I pay for my post and choose "Local" as the area range, the post should only be shown to end users who are 20 miles away or less.
I have a field in my index named spotlight_range. Is there a way to write a query that ignores all records where spotlight_range = 'Local' and the distance is > 20 miles? I need to do this for several different spotlight ranges. For instance, Regional may be 100 miles or less, etc.
My current query looks like this
$params = [
'index' => 'my_index',
'type' => 'posts',
'size' => 25,
'from' => 0,
'body' => [
'sort' => [
'_geo_distance' => [
'post_location' => [
'lat' => '44.4759',
'lon' => '-73.2121'
],
'order' => 'asc',
'unit' => 'mi'
]
],
'query' => [
'filtered' => [
'query' => [
'match_all' => []
],
'filter' => [
'geo_distance' => [
'distance' => '100mi',
'post_location' => [
'lat' => '44.4759',
'lon' => '-73.2121'
]
]
]
]
]
]
];
My index is setup with the following fields.
'id' => ['type' => 'integer'],
'title' => ['type' => 'string'],
'description' => ['type' => 'string'],
'price' => ['type' => 'integer'],
'shippable' => ['type' => 'boolean'],
'username' => ['type' => 'string'],
'post_location' => ['type' => 'geo_point'],
'post_location_string' => ['type' => 'string'],
'is_spotlight' => ['type' => 'boolean'],
'spotlight_range' => ['type' => 'string'],
'created_at' => ['type' => 'date', 'format' => 'yyyy-MM-dd HH:mm:ss'],
'updated_at' => ['type' => 'date', 'format' => 'yyyy-MM-dd HH:mm:ss']
My end goal is not specifically to search for distance < X and range = Y, but rather to filter posts of every range type by the distance I specify for that type. The search should return ALL range types, while filtering out anything beyond the distance specified for its range type, based on the user's lat/lon passed into the query.
I have been looking for a solution to this online without much luck.
I would add a circle geo_shape to the document, centered on post_location and with a radius corresponding to the spotlight_range, since you know both pieces of information at indexing time. That way you can encode into each post its corresponding "reach".
...
'post_location' => ['type' => 'geo_point'],
'spotlight_range' => ['type' => 'string'],
'reach' => ['type' => 'geo_shape'], // add this
So a "local" document would look something like this once indexed:
{
"spotlight_range": "local",
"post_location": {
"lat": 42.1526,
"lon": -71.7378
},
"reach" : {
"type" : "circle",
"coordinates" : [-71.7378, 42.1526],
"radius" : "20mi"
}
}
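Building the `reach` value at indexing time could look roughly like the sketch below. The radius map uses the two figures mentioned in the question (Local at 20 miles, Regional at 100 miles); `buildReach` and `SPOTLIGHT_RADIUS` are hypothetical names, and any other range types would need their own entries.

```php
<?php
// Radii taken from the question's examples; extend the map for other ranges.
const SPOTLIGHT_RADIUS = ['local' => '20mi', 'regional' => '100mi'];

// Hypothetical helper: builds the `reach` circle for a post at indexing time.
function buildReach(float $lat, float $lon, string $range): array
{
    $key = strtolower($range);
    if (!isset(SPOTLIGHT_RADIUS[$key])) {
        throw new InvalidArgumentException("Unknown spotlight range: $range");
    }
    return [
        'type' => 'circle',
        'coordinates' => [$lon, $lat], // GeoJSON order: longitude first
        'radius' => SPOTLIGHT_RADIUS[$key],
    ];
}

$reach = buildReach(42.1526, -71.7378, 'Local');
```

Note the coordinate order: the shape takes longitude before latitude, the reverse of the `lat`/`lon` pair used for `post_location`.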
Then the query would feature another geo_shape centered on the user's location with the chosen radius and would only retrieve documents whose reach intersects the circle shape in the query.
$params = [
'index' => 'my_index',
'type' => 'posts',
'size' => 25,
'from' => 0,
'body' => [
'sort' => [
'_geo_distance' => [
'post_location' => [
'lat' => '44.4759',
'lon' => '-73.2121'
],
'order' => 'asc',
'unit' => 'mi'
]
],
'query' => [
'filtered' => [
'query' => [
'match_all' => []
],
'filter' => [
'geo_shape' => [
'reach' => [
'relation' => 'INTERSECTS',
'shape' => [
'type' => 'circle',
'coordinates' => [-73.2121, 44.4759],
'radius' => '20mi'
]
]
]
]
]
]
]
];
