How to display the snippets in elastic search? - elasticsearch

I am working on a project that displays matching snippets together with their exact locations in the source document. We index PDF and HTML documents into Elasticsearch. When a user searches the indexed documents, we first need to display the text surrounding each match. Then, when the user clicks an entry, we should open the document and scroll to the exact location, with the matched text highlighted.
Any help will be highly appreciated.
CM

You can use highlight for highlighting the matched text.
GET /index/type/_search
{
  "query": {
    "match_phrase": {
      "field": "some text"
    }
  },
  "highlight": {
    "fields": {
      "field": {}
    }
  }
}
See the Elasticsearch highlighting documentation for details.
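To turn the highlighted fragments into a snippet list, the response can be post-processed on the client. The following is a minimal Python sketch, not a definitive implementation. The helper names are hypothetical, the sample response is hand-written to mirror the usual shape of a search response, and it assumes the default <em>...</em> highlight tags.

```python
import re


def build_highlight_query(field, phrase):
    """Build a match_phrase search body that also requests highlights."""
    return {
        "query": {"match_phrase": {field: phrase}},
        "highlight": {"fields": {field: {}}},
    }


def extract_snippets(response, field):
    """Collect (document id, highlighted fragment) pairs from a response."""
    snippets = []
    for hit in response.get("hits", {}).get("hits", []):
        for fragment in hit.get("highlight", {}).get(field, []):
            snippets.append((hit["_id"], fragment))
    return snippets


def matched_terms(fragment):
    """Return the pieces of text wrapped in the default <em> tags."""
    return re.findall(r"<em>(.*?)</em>", fragment)


# Hand-written sample shaped like a search response; not real server output.
sample_response = {
    "hits": {
        "hits": [
            {
                "_id": "1",
                "highlight": {
                    "field": ["here is <em>some</em> <em>text</em> to show"]
                },
            }
        ]
    }
}

for doc_id, fragment in extract_snippets(sample_response, "field"):
    print(doc_id, fragment, matched_terms(fragment))
```

Each fragment can be rendered as the snippet for its document id. Jumping to the exact position inside a PDF or HTML document still has to be handled by the viewer, for example by searching the opened document for the matched terms.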

Related

How can I query Elasticsearch to output the exact position of a searched keyword or sentence?

I indexed several documents into my Elasticsearch cluster and queried it with some keywords and sentences. The output displayed the entire documents in which the sentences or keywords were found.
I want a query to return just the paragraph in which the sentence or keyword was found, along with the page number where it was found.
You can use the highlighting functionality together with source filtering, so that the response shows only the fields you need and hides the rest.
You can set _source to false so that only the highlighted field is returned. If you want to search on one field and highlight a different one, set require_field_match to false. Please refer to the Elasticsearch documentation for more details.
GET /_search
{
  "_source": false,
  "query": {
    "match": { "content": "kimchy" }
  },
  "highlight": {
    "require_field_match": false,
    "fields": {
      "content": {}
    }
  }
}
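The same request can be assembled programmatically when the searched field and the highlighted fields vary. This is a small Python sketch under stated assumptions: the function name is hypothetical, and the fragment_size and number_of_fragments values are illustrative choices for paragraph-sized snippets, not recommendations from the answer.

```python
import json


def build_snippet_request(search_field, text, highlight_fields,
                          fragment_size=300, number_of_fragments=3):
    """Build a search body that returns only highlighted fragments."""
    return {
        # Suppress the full document source; only highlights come back.
        "_source": False,
        "query": {"match": {search_field: text}},
        "highlight": {
            # Allow highlighting fields other than the one that was queried.
            "require_field_match": False,
            # Illustrative values: fragment length in characters and the
            # maximum number of fragments returned per field.
            "fragment_size": fragment_size,
            "number_of_fragments": number_of_fragments,
            "fields": {name: {} for name in highlight_fields},
        },
    }


body = build_snippet_request("content", "kimchy", ["content"])
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of GET /_search. A page number would only be available if it was indexed as its own field; the highlighter does not derive it.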

Elastic exact matching and substring matching together

I know that Elasticsearch has a "keyword" type for exact matching. For example:
"address": { "type": "keyword"}
That's cool, exact matching works!
But I would like to have both exact matching and sub-string matching, so I decided to create the following mapping:
"address": { "type": "text" , "index": true }
Problem
If I have a "text" type, how can I search for an exact matching string (not a sub-string)? I've tried several approaches, but none of them works:
GET testing_index/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "address": "washington"
        }
      }
    }
  }
}
or
GET testing_index/_search
{
  "query": {
    "match": {
      "address": "washington"
    }
  }
}
I just need a universal mapping that lets me:
find an exact string
find sub-strings
I hope Elasticsearch can do this.
By default, text fields use the standard analyzer, which drops most punctuation, breaks text into individual words, and lowercases them. For instance, the standard analyzer turns the string “Quick Brown Fox!” into the terms [quick, brown, fox]. As you can imagine, this makes it difficult to write an exact match query against a text field. For your use case, I suggest one of two options:
Store the field as keyword, and accomplish sub-string-like matching using wildcard or fuzzy queries. Wildcard queries, in particular those with a leading wildcard, are notoriously slow, so proceed with caution.
Store the field twice: once as keyword and once as text. The obvious downside is a larger index.
For more background, see the "Term Query" Elasticsearch documentation, and in particular the section on "Why doesn’t the term query match my document?": https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-term-query.html
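The second option is commonly expressed as a multi-field: one text field with a keyword sub-field. The Python sketch below is only an illustration under assumptions. The sub-field name "raw" is hypothetical, the mapping uses the typeless form of recent Elasticsearch versions, and simple_analyze is a rough stand-in for the standard analyzer, not its real implementation.

```python
import re


def simple_analyze(text):
    """Rough approximation of the standard analyzer:
    lowercase the text and split it on non-alphanumeric characters."""
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


# Assumed mapping: "address" is analyzed text, "address.raw" keeps the
# original string untouched for exact matching. "raw" is a made-up name.
mapping = {
    "mappings": {
        "properties": {
            "address": {
                "type": "text",
                "fields": {"raw": {"type": "keyword"}},
            }
        }
    }
}


def exact_query(value):
    """Exact match against the unanalyzed keyword sub-field."""
    return {"query": {"term": {"address.raw": value}}}


def full_text_query(value):
    """Analyzed, word-level match against the text field."""
    return {"query": {"match": {"address": value}}}


print(simple_analyze("Quick Brown Fox!"))
# A term query is not analyzed, so "Washington" is compared as-is against
# the lowercased terms stored for the text field and finds nothing there.
print("Washington" in simple_analyze("Washington DC"))
print("washington" in simple_analyze("Washington DC"))
```

With such a mapping, exact_query("Washington DC") targets the stored string as a whole, while full_text_query("washington") matches on individual words. True infix matching would still need wildcard queries or an n-gram analyzer.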

Elasticsearch doesn't suggest anything if the exact word is used as text?

I'm using the term suggester of Elasticsearch. My index contains a document that has a field name whose value is crick.
{
  "suggest": {
    "my-suggest": {
      "text": "crick",
      "term": {
        "field": "name",
        "sort": "score"
      }
    }
  }
}
It returns no match; it only returns a value when the text is misspelled.
If I pass the exact text, it returns nothing. Any idea?
You are not setting suggest_mode.
The suggest mode controls which suggestions are included, or for which terms in the suggest text suggestions should be made. Three possible values can be specified:
missing: Only provide suggestions for suggest text terms that are not in the index. This is the default.
popular: Only suggest suggestions that occur in more docs than the original suggest text term.
always: Suggest any matching suggestions based on terms in the suggest text.
Since you haven't specified suggest_mode, it defaults to missing.
Use these settings:
{
  "suggest": {
    "my-suggest": {
      "text": "crick",
      "term": {
        "field": "name",
        "sort": "score",
        "suggest_mode": "always"
      }
    }
  }
}
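For switching between the three modes from client code, the request body can be generated and the options read back from the response. This is a Python sketch with hypothetical helper names. The sample response is hand-written to mirror the usual shape of a suggest response and does not claim to show what the server would return for this index.

```python
SUGGEST_MODES = ("missing", "popular", "always")


def build_term_suggest(text, field, suggest_mode="missing", name="my-suggest"):
    """Build a term-suggester request body with an explicit suggest_mode."""
    if suggest_mode not in SUGGEST_MODES:
        raise ValueError("suggest_mode must be one of %s" % (SUGGEST_MODES,))
    return {
        "suggest": {
            name: {
                "text": text,
                "term": {
                    "field": field,
                    "sort": "score",
                    "suggest_mode": suggest_mode,
                },
            }
        }
    }


def suggestion_texts(response, name="my-suggest"):
    """Flatten the option texts of every entry of one named suggester."""
    texts = []
    for entry in response.get("suggest", {}).get(name, []):
        for option in entry.get("options", []):
            texts.append(option["text"])
    return texts


# Hand-written sample; the option text and numbers are made up.
sample_response = {
    "suggest": {
        "my-suggest": [
            {
                "text": "crick",
                "offset": 0,
                "length": 5,
                "options": [{"text": "cricket", "score": 0.6, "freq": 1}],
            }
        ]
    }
}

body = build_term_suggest("crick", "name", suggest_mode="always")
print(body["suggest"]["my-suggest"]["term"]["suggest_mode"])
print(suggestion_texts(sample_response))
```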

Multi word partial search

I am very new to Elasticsearch.
I would like to know how to perform a partial multi-word search.
For example, my document is:
{
  "title": "harry porter"
}
I need this document to be returned when searching with the following string:
1.) har por
This is the same as the SQL query (select * from books where title like '%har%' or title like '%por%').
Using a completion suggester will provide most of the features you want. It will find words starting with an arbitrary string, like "har" or "por".
Check out this question for a full example on how to set up a completion suggester.
As described in the documentation, you can achieve multi-word search (i.e. returning "harry porter" from a search for "por") by setting the preserve_position_increments option to false on the completion field:
PUT books
{
  "mappings": {
    "book": {
      "properties": {
        "suggest": {
          "type": "completion",
          "preserve_position_increments": false
        },
        "title": {
          "type": "keyword"
        }
      }
    }
  }
}
Refer to the Edge NGram Tokenizer.
It helps with partial multi-word search (similar to autocomplete suggestions). Hope this helps!
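To see why edge n-grams make "har por" match "harry porter", it helps to look at the terms such a tokenizer would emit. The Python sketch below only imitates the idea on the client side. The function names and the min_gram and max_gram values are illustrative assumptions, and the word splitting is a simplification of what a real analyzer does.

```python
import re


def edge_ngrams(word, min_gram=1, max_gram=5):
    """Return the prefixes of a word between min_gram and max_gram long."""
    longest = min(max_gram, len(word))
    return [word[:n] for n in range(min_gram, longest + 1)]


def index_terms(text, min_gram=1, max_gram=5):
    """Imitate index-time analysis: lowercase, split, emit edge n-grams."""
    terms = set()
    for word in re.findall(r"[0-9a-z]+", text.lower()):
        terms.update(edge_ngrams(word, min_gram, max_gram))
    return terms


def matches_any_prefix(query, title):
    """True if any query word equals one of the indexed prefixes (OR logic).
    Query words longer than max_gram will not match in this simple sketch."""
    terms = index_terms(title)
    return any(word in terms for word in re.findall(r"[0-9a-z]+", query.lower()))


print(edge_ngrams("harry", 1, 3))
print(matches_any_prefix("har por", "harry porter"))
print(matches_any_prefix("arr", "harry porter"))
```

Note that this is prefix matching: "har" and "por" match, but an inner fragment such as "arr" does not, so it is narrower than the like '%har%' query in the question. Matching inner fragments would need full n-grams or wildcard queries instead.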

Full-text schema in ElasticSearch

I'm (extremely) new to ElasticSearch so forgive my potentially ridiculous question. I currently use MySQL to perform full-text searches, and want to move this to ElasticSearch. Currently my table has a fulltext index spanning three columns:
title,description,tags
In ES, each document would therefore have title, description and tags fields, allowing me to do a fulltext search for a general phrase, or filter on a given tag.
I also want to add further searchable fields such as username (so I can retrieve posts by a given user). So, how do I specify that a fulltext search should match title OR description OR tags but not username?
From the OR filter example, I'd assume I'd have to use something like this:
{
  "filtered": {
    "query": {
      "match_all": {}
    },
    "filter": {
      "or": [
        {
          "term": { "title": "foobar" }
        },
        {
          "term": { "description": "foobar" }
        },
        {
          "term": { "tags": "foobar" }
        }
      ]
    }
  }
}
Coming at this new, it doesn't seem like this is very efficient. Is there a better way of doing this, or do I need to move the username field to a separate index?
This is fine.
In general, I would suggest getting familiar with Elasticsearch mapping types and options.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping.html
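As an alternative way to express "match title OR description OR tags but not username", a multi_match query lists only the fields that should be searched. The Python sketch below is an illustration, not the answer's method. The function name is hypothetical, and the optional username filter assumes that username is indexed as an exact-value (keyword) field. To my knowledge the filtered query and the or filter used in the question were removed in Elasticsearch 5.0, so the sketch uses a bool query instead.

```python
import json

# Only these fields take part in the full-text search; username is left out.
FULL_TEXT_FIELDS = ["title", "description", "tags"]


def build_post_search(text, username=None):
    """Full-text search over the listed fields, optionally limited to one user."""
    full_text = {"multi_match": {"query": text, "fields": FULL_TEXT_FIELDS}}
    if username is None:
        return {"query": full_text}
    return {
        "query": {
            "bool": {
                "must": [full_text],
                # Assumes "username" is mapped as an exact-value field.
                "filter": [{"term": {"username": username}}],
            }
        }
    }


print(json.dumps(build_post_search("foobar"), indent=2))
print(json.dumps(build_post_search("foobar", username="alice"), indent=2))
```

Because the search names its fields explicitly, username can stay in the same index without being matched by the full-text query.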
