how to index a file with elasticsearch 5.5.1 - elasticsearch

I'm new to Elasticsearch. I have successfully installed Elasticsearch with Kibana, X-Pack and ingest-attachment, and both Elasticsearch and Kibana are running. I have kept the install simple for now, using default options on a Windows 2012 server. I have a directory on another drive, w\mydocs, which at the moment contains just 3 plain-text files, but I will want to add other file types like PDF and DOC later. So now I want to get these files into an Elasticsearch index. I have tried using the following link as a guide: Attaching pdf docs in Elasticsearch. However, I cannot get it to work.
Here's how I have set up the index and pipeline:
PUT _ingest/pipeline/docs
{
  "description": "documents",
  "processors" : [
    {
      "attachment" : {
        "field": "data",
        "indexed_chars" : -1
      }
    }
  ]
}
PUT myindex
{
  "mappings" : {
    "documents" : {
      "properties" : {
        "attachment.data" : {
          "type": "text",
          "analyzer": "standard"
        }
      }
    }
  }
}
Then, to index the first document, I use the following:
PUT localhost:9200/documents/1?pipeline=docs -d #/w/mydocs/README.TXT
and the error that I receive is:
{
  "error": {
    "root_cause": [
      {
        "type": "parse_exception",
        "reason": "request body is required"
      }
    ],
    "type": "parse_exception",
    "reason": "request body is required"
  },
  "status": 400
}

You still have to send valid JSON to Elasticsearch, even when indexing binary data. This means that you have to encode your document as base64 and then put it into a JSON document like this:
{
  "data" : "base64encodedcontentofyourfile"
}
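For example, on a Unix-like shell you could base64-encode the file and build that JSON body in one step with curl. This is only a sketch, reusing the index, type and pipeline names from the question; base64 -w 0 is the GNU coreutils flag that suppresses line wrapping:

# Base64-encode the file and wrap it in the JSON body the "docs" pipeline reads from the "data" field
curl -X PUT "localhost:9200/myindex/documents/1?pipeline=docs" \
  -H "Content-Type: application/json" \
  -d "{\"data\": \"$(base64 -w 0 /w/mydocs/README.TXT)\"}"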

I was advised not to use ingest-attachment, but to use FSCrawler instead. I managed to get FSCrawler working without having to convert anything to base64.

Related

when applying ingest pipeline it fails with parse exception request body or source parameter is required

Execute this request
PUT _ingest/pipeline/add_test_pipeline
{
  "description": "A description for your pipeline",
  "processors": [
    {
      "set": {
        "field": "fieldname",
        "value": "1"
      }
    }
  ]
}
and get error:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "parse_exception",
        "reason" : "request body or source parameter is required"
      }
    ],
    "type" : "parse_exception",
    "reason" : "request body or source parameter is required"
  },
  "status" : 400
}
The request works on one cluster but not on another.
From discuss.elastic.co
When you issue the PUT command to create a new document using your pipeline, you need to specify the document in the request body:
PUT filebeat-7.6.2-2020.05.02-000001/_doc/ny1z03EBxpMbNnRZgGlQ?pipeline=add_test_pipeline
...will not work and will produce the error you're seeing, because you're not sending any document in the body. The pipeline alone will not create a document out of the blue; you need to provide one, even if it's empty, like this:
PUT filebeat-7.6.2-2020.05.02-000001/_doc/ny1z03EBxpMbNnRZgGlQ?pipeline=add_test_pipeline
{
}
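A convenient way to check what a pipeline does without indexing anything is the simulate API, which runs the pipeline against sample documents passed in the request body (the field name and value below are just placeholders):

POST _ingest/pipeline/add_test_pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "some_field": "some value"
      }
    }
  ]
}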

Elasticsearch: Alternative to flattened datatype in Elasticsearch 7.1

I have two Elasticsearch versions: 7.3 and 7.1. I am using the flattened data type in Elasticsearch 7.3, and I also want to use it in Elasticsearch 7.1 so that I can store my data the same way I do in 7.3.
I researched the flattened data type and learned that it is supported in 7.x, but when I tried it in 7.1 it gave me a mapper_parsing_exception error.
What I tried is shown below.
In Elasticsearch 7.3
Index Creation
PUT demo-flattened
Response:
{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "demo-flattened"
}
Insert Mapping
PUT demo-flattened/_mapping
{
  "properties": {
    "host": {
      "type": "flattened"
    }
  }
}
Response:
{
  "acknowledged": true
}
In Elasticsearch 7.1
PUT demo-flattened
Response:
{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "demo-flattened"
}
Insert Mapping
PUT demo-flattened/_mapping
{
  "properties": {
    "host": {
      "type": "flattened"
    }
  }
}
Response:
{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "No handler for type [flattened] declared on field [host]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "No handler for type [flattened] declared on field [host]"
  },
  "status": 400
}
I want to use the flattened data type in Elasticsearch 7.1. Is there any alternative, given that the flattened data type is only supported from Elasticsearch 7.3 onwards?
Any help or suggestions will be appreciated.
First, flattened is available in 7.1 with X-Pack (X-Pack is a paid feature), so what I think is that you can use the object type with the enabled flag set to false.
This will let you store that field as it is, without any indexing.
{
  "properties": {
    "host": {
      "type": "object",
      "enabled": false
    }
  }
}
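To illustrate what enabled: false means in practice, here is a small sketch against the demo-flattened index from the question (the document values are made up). The object is kept in _source and returned on GET, but none of its leaf values are indexed, so a term query on host.name or any other sub-field will match nothing:

PUT demo-flattened/_doc/1
{
  "host": {
    "name": "node-1",
    "os": "linux"
  }
}

GET demo-flattened/_doc/1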
Check the version of your Elasticsearch. If it's the OSS version, then it won't work for you.
You can check it by running GET / in Kibana. You would get something like:
{
  "version" : {
    "number" : "7.10.2",
    "build_flavor" : "oss"
  }
}
But for an Elasticsearch that does support the flattened type, you would get something like:
{
  "version" : {
    "number" : "7.10.2",
    "build_flavor" : "default"
  }
}
You can find more details in the official Kibana GitHub issue: No handler for type [flattened] declared on field [state] #52324.
Internally, it works like this:
Because of similarities in the way values are indexed, flattened fields share much of the same mapping and search functionality as keyword fields.
Here, you have only one field, called host. You can replace it with keyword.
What they share:
Mapping:
"labels": {
  "type": "flattened"
}
Data:
"labels": {
  "priority": "urgent",
  "release": ["v1.2.5", "v1.3.0"],
  "timestamp": {
    "created": 1541458026,
    "closed": 1541457010
  }
}
During indexing, tokens are created for each leaf value in the JSON object. The values are indexed as string keywords, without analysis or special handling for numbers or dates
To query them, you can use "term": {"labels": "urgent"} or "term": {"labels.release": "v1.3.0"}.
When it is keyword, you can have them as separate fields.
{
  "host": {
    "type": "keyword"
  }
}
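A keyword field is then queried just like the flattened examples above, e.g. with a term query. A quick sketch (the index name and value here are hypothetical):

GET demo-flattened/_search
{
  "query": {
    "term": {
      "host": "node-1"
    }
  }
}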
Reference

elasticsearch routing on specific field

Hi, I want to set custom routing on a specific field, "userId", on my ES v2.0, but it is giving me an error. I don't know how to set custom routing on ES v2.0. Please help me out; thanks in advance. Below is the error message I get when creating custom routing with an existing index.
{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "Mapping definition for [_routing] has unsupported parameters: [path : userId]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "Mapping definition for [_routing] has unsupported parameters: [path : userId]"
  },
  "status": 400
}
In ES 2.0, the _routing.path meta-field has been removed. So now you need to do it like this instead:
In your mapping, you can only specify that routing is required (but you cannot specify path anymore):
PUT my_index
{
  "mappings": {
    "my_type": {
      "_routing": {
        "required": true
      },
      "properties": {
        "name": {
          "type": "string"
        }
      }
    }
  }
}
And then when you index a document, you can specify the routing value in the query string like this:
PUT my_index/my_type/1?routing=bar
{
  "name": "foo"
}
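Because routing is now required, the same routing value also has to be supplied when retrieving the document; a GET without it fails with a routing_missing_exception:

GET my_index/my_type/1?routing=bar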
You can still use custom routing based on a field from the data being indexed. You can set up a simple pipeline and then use that pipeline every time you index a document, or you can change the index settings to use the pipeline whenever the index receives a document indexing request.
Read about pipelines here.
Do read above and below that section of the docs for more clarity. Pipelines are not meant for setting custom routing, but they can be used for the purpose. Field-based custom routing was disabled for a good reason: the field to be used can turn out to have null values, leading to unexpected behavior. So take care of that issue yourself.
For routing, here is a sample pipeline PUT:
PUT localhost:9200/_ingest/pipeline/nameRouting
Content-Type: application/json
{
  "description" : "Sets the routing value from the name field in the document",
  "processors" : [
    {
      "set" : {
        "field": "_routing",
        "value": "{{_source.name}}"
      }
    }
  ]
}
The index settings will be:
{
  "settings" : {
    "index" : {
      "default_pipeline" : "nameRouting"
    }
  }
}
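With default_pipeline set, a plain index request is enough and the pipeline fills in _routing from the name field transparently. A minimal sketch (the index name and _doc type are placeholders):

PUT my_index/_doc/1
{
  "name": "foo"
}

GET my_index/_doc/1?routing=foo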

Copy field to payload

When using suggesters in Elastic, it is possible to give a payload when indexing a document. Each time the suggester is used, its payload is returned with the suggestions.
I would like to add the value of a document's id field to the payload. While it is easy to do so at index time, I would rather handle it in the mapping, because I don't want to change the way I convert documents to JSON.
I tried the following:
POST test
{
  "mappings" : {
    "type1" : {
      "properties" : {
        "id": { "type": "string", "copy_to": ["field1_suggest.payload"] },
        "field1" : { "type" : "string", "copy_to": ["field1_suggest"] },
        "field1_suggest": { "type": "completion", "payloads": true }
      }
    }
  }
}
POST test/type1/1
{
  "id": "payload",
  "field1": "my value"
}
This fails, since "payload" is not a real field of field1_suggest:
"error": "MapperParsingException[attempt to copy value to non-existing object [field1_suggest.payload]]", "status": 400
How can I automatically include fields in the payload? If it is not possible, I guess I will have to use mainstream queries to get suggestions for completion...
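For reference, the index-time approach mentioned above would look roughly like this: the suggester input and payload are supplied explicitly in each document (this is the completion-suggester syntax of Elasticsearch 1.x/2.x; payloads were removed in 5.0). It works, but it changes the document JSON, which is exactly what the question wants to avoid:

POST test/type1/1
{
  "id": "payload",
  "field1": "my value",
  "field1_suggest": {
    "input": ["my value"],
    "payload": { "id": "payload" }
  }
}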

Error when creating an index using mappings file

I installed Elasticsearch 1.5.2 on CentOS as a service.
I tried to add mappings using PUT:
curl -XPUT $ES_HOST/my_index -d '
{
  "mappings": {
    "my_type": {
      "properties": {
        "field": {
          "type": "nested"
        }
      }
    }
  }
}'
That request works fine and creates a new index with the correct mappings.
Instead of putting the mappings manually, I want to store them on the server in config files. For that I created the file /etc/elasticsearch/mappings/my_index/all_mappings.json with the same content as the previous request body. After that I try to create the index with curl -XPUT $ES_HOST/my_index, but this error occurs:
{
  "error": "MapperParsingException[mapping [all_mappings]]; nested: MapperParsingException[Root type mapping not empty after parsing! Remaining fields: [mappings : {my_type={properties={field={type=nested}}}}]]; ",
  "status": 400
}
I tried removing the mappings field from the config JSON, but nothing changed.
The name of the file is the mapping name. So, for /mappings/my_index/all_mappings.json you should have the index my_index and the type called all_mappings.
Also, the content of the file should be:
{
  "properties": {
    "field": {
      "type": "nested"
    }
  }
}
That being said, do the following:
create a my_type.json file under the /etc/elasticsearch/mappings/my_index folder
put the following inside that file:
{
  "properties": {
    "field": {
      "type": "nested"
    }
  }
}
call PUT /my_index
check the mapping: GET /my_index/_mapping
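If the file is picked up correctly, the mapping check should return something like this (a sketch of an ES 1.x _mapping response):

{
  "my_index": {
    "mappings": {
      "my_type": {
        "properties": {
          "field": {
            "type": "nested"
          }
        }
      }
    }
  }
}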
