Exclude only 1 field from indexing - elasticsearch

I need your help with ElasticSearch field indexes.
I'm sending Jsons to ES and I have only 1 field that I send XML and I never search on this field.
I need your help to exclude ONLY this field from indexing. The field that I would like to exclude is: rawXml
Please let me know if you need more details from me.
Is this possible?
This is the Json i'm sending now :
{
"_index": "logstash-2015.05.05",
"_type": "XXXXX",
"_id": "XXXXXX",
"_score": null,
"_source": {
"#version": 1,
"#timestamp": "2015-05-05T00:00:11.489Z",
"host": "XXXXXXXXX",
"application_name": "XXXXXXX",
"logger_name": "XXXXXXXXX",
"level": "DEBUG",
"thread_name": "XXXXXXX",
"name": "XXXXXXXX",
"rawXml": "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n<XXXXX xmlns=\"http://XXXXXXXX\" Version=\"14.000\">\n <POS>\n <Source DKNumber=\"XXXXXX\"/>\n </POS>\n <Unique>\n <ID PCC=\"XXXXX\" GDS=\"XXXXX\" NAME=\"XXXXXX\" TYPE=\"XXXXXX\"/>\n </Unique>\n</OpenHotelHubSessionCreateRQ>\n",
"type": "XXXXXXXX",
"message": "Event: XXXXXXX",
"requestContext": {
"wml-locale": "en_AU",
"accountId": XXXXXXX,
"agentName": "XXXXXXX",
"deviceType": "XXXXXX",
"cwtTraveller": {
"clientTopGuid": "XXXXX",
"travellerGuid": "XXXXXXX",
"clientTopTravelerTypeGuid": "XXXXXX",
"clientSubGuid": "XXXXXXXX"
},
"providerSessionId": null,
"variantId": X
},
"path": "XXXXXXXXXXX",
"hostname": "XXXXXXXXX"
},
"sort": [
1430784011489
]
}

Related

how to skip particular fields from inserting in elasticsearch? Don't want to fields like "message", "event", "log"

In the record, I am not inserting fields like "message", "event", or "log". These fields are autogenerated while inserting records from CSV file using logstash somehow which I would like not to get there.
The record in the index looks like follows:
"_index": "jmeter2",
"_id": "dsfdsfdsf",
"_score": 1,
"_source": {
"Samples": "1083",
"Received KB/sec": "178.9",
"99th pct": "1350",
"log": {
"file": {
"path": "/Users/abc/Downloads/opt/jenkins/workspace/agg_report2.csv"
}
},
"host": {
"name": "dfdsfdsffs"
},
"#timestamp": "2022-11-22T07:15:29.052181Z",
"95th pct": "659",
"Min": "112",
"Max": "3829",
"#version": "1",
"Throughput": "7.2",
"Label": "ACTIVITY_DETAIL",
"90th pct": "338",
"Build_number": "abcd1111",
"Error %": "0.00%",
"Median": "207",
"message": "ACTIVITY_DETAIL,1083,270,207,338,659,1350,112,3829,0.00%,7.2,178.9,251.61",
"event": {
"original": "ACTIVITY_DETAIL,1083,270,207,338,659,1350,112,3829,0.00%,7.2,178.9,251.61"
},
"Average Response Time": "270",
"Stddev": "251.61"
}
}
You can add a remove_field statement to your csv filter :
filter {
csv {
remove_field => [ "message", "event", "log" ]
}
}
https://www.elastic.co/guide/en/logstash/current/plugins-filters-mutate.html#plugins-filters-mutate-remove_field

Elasticsearch - Delete query among nested object

I'm new to Elasticsearch, and I cannot find a Delete query.
Here is an example of an document in myIndex :
{
"_index": "myIndex",
"_type": "_doc",
"_id": "IPc5kn8Bq7SuVr5qM9dq",
"_score": 1,
"_source": {
"code": "1234567",
"matches": [
{
"hostname": "hostnameA.com",
"url": "https://www.hostnameA.com/....",
},
{
"hostname": "hostnameB.com",
"url": "https://www.hostnameB.com/....",
},
{
"hostname": "hostnameC.com",
"url": "https://www.hostnameC.com/....",
},
{
"hostname": "hostnameD.com",
"url": "https://www.hostnameD.com/....",
},
]
}
}
Let's say this index contains 10k documents.
I would like a query to remove all the item from my array matches where the hostname is equal to hostnameC.com, and keeping all the others.
Anyone would have an idea to help me?

How to get proposed time using Microsoft Graph API?

When the other person declined the meeting and proposed a new time. In Outlook, you can see the proposed time.
Now I am trying to use Microsoft Graph API to get that proposed time.
For example, the original meeting date is 2018-03-08, and the other person declined and proposed a new date 2018-03-12.
I tried
GET /beta/me/messages/{messageId}=?$expand=microsoft.graph.eventMessage/event
However, I cannot find the proposed time from the result returned. How can I get it? Thanks
{
"#odata.context": "https://graph.microsoft.com/beta/$metadata#users('576552d5-3bc0-42a6-a53d-bfceb405db23')/messages/$entity",
"#odata.type": "#microsoft.graph.eventMessage",
"#odata.etag": "W/\"DAAAABYAAACpTc/InBsuTYwTUBb+VIb4AADoRAyI\"",
"id": "AAMkADBlZTUwNTkxLWVmODgtNDVhNC1iZjhlLTdjNjA1ODZlMDI5MgBGAAAAAACUbnk-iwQZRbXMgkfKtmYhBwCpTc-InBsuTYwTUBb_VIb4AAAAAAEMAACpTc-InBsuTYwTUBb_VIb4AADnwc8mAAA=",
"createdDateTime": "2018-03-06T22:29:10Z",
"lastModifiedDateTime": "2018-03-06T22:29:11Z",
"changeKey": "DAAAABYAAACpTc/InBsuTYwTUBb+VIb4AADoRAyI",
"categories": [],
"receivedDateTime": "2018-03-06T22:29:11Z",
"sentDateTime": "2018-03-06T22:29:05Z",
"hasAttachments": false,
"internetMessageId": "<MWHPR15MB18399806CC97C61817C9A2B18BD90#MWHPR15MB1839.namprd15.prod.outlook.com>",
"subject": "New Time Proposed: Test",
"bodyPreview": "",
"importance": "normal",
"parentFolderId": "AAMkADBlZTUwNTkxLWVmODgtNDVhNC1iZjhlLTdjNjA1ODZlMDI5MgAuAAAAAACUbnk-iwQZRbXMgkfKtmYhAQCpTc-InBsuTYwTUBb_VIb4AAAAAAEMAAA=",
"conversationId": "AAQkADBlZTUwNTkxLWVmODgtNDVhNC1iZjhlLTdjNjA1ODZlMDI5MgAQAOnuCMgoRLdGs-1scw6i7EU=",
"conversationIndex": "AdO1mnWL6e4IyChEt0az/WxzDqLsRQAABO5D",
"isDeliveryReceiptRequested": null,
"isReadReceiptRequested": false,
"isRead": false,
"isDraft": false,
"webLink": "https://outlook.office365.com/owa/?ItemID=AAMkADBlZTUwNTkxLWVmODgtNDVhNC1iZjhlLTdjNjA1ODZlMDI5MgBGAAAAAACUbnk%2FiwQZRbXMgkfKtmYhBwCpTc%2FInBsuTYwTUBb%2BVIb4AAAAAAEMAACpTc%2FInBsuTYwTUBb%2BVIb4AADnwc8mAAA%3D&exvsurl=1&viewmodel=ReadMessageItem",
"inferenceClassification": "focused",
"unsubscribeData": [],
"unsubscribeEnabled": false,
"meetingMessageType": "meetingDeclined",
"type": "singleInstance",
"isOutOfDate": false,
"isAllDay": false,
"isDelegated": false,
"body": {
"contentType": "html",
"content": "Hi"
},
"sender": {
"emailAddress": {
"name": "Rose",
"address": "rose#example.com"
}
},
"from": {
"emailAddress": {
"name": "Rose",
"address": "rose#example.com"
}
},
"toRecipients": [
{
"emailAddress": {
"name": "Jack",
"address": "jack#example.com"
}
}
],
"ccRecipients": [],
"bccRecipients": [],
"replyTo": [],
"mentionsPreview": null,
"flag": {
"flagStatus": "notFlagged"
},
"startDateTime": {
"dateTime": "2018-03-08T04:00:00.0000000",
"timeZone": "UTC"
},
"endDateTime": {
"dateTime": "2018-03-08T05:00:00.0000000",
"timeZone": "UTC"
},
"location": {
"displayName": "Test",
"locationType": "default",
"uniqueIdType": "unknown"
},
"recurrence": null,
"event#odata.context": "https://graph.microsoft.com/beta/$metadata#users('576552d5-3bc0-42a6-a53d-bfceb405db23')/messages('AAMkADBlZTUwNTkxLWVmODgtNDVhNC1iZjhlLTdjNjA1ODZlMDI5MgBGAAAAAACUbnk-iwQZRbXMgkfKtmYhBwCpTc-InBsuTYwTUBb_VIb4AAAAAAEMAACpTc-InBsuTYwTUBb_VIb4AADnwc8mAAA%3D')/microsoft.graph.eventMessage/event/$entity",
"event": {
"#odata.etag": "W/\"qU3PyJwbLk2ME1AW/lSG+AAA6EQMeA==\"",
"id": "AAMkADBlZTUwNTkxLWVmODgtNDVhNC1iZjhlLTdjNjA1ODZlMDI5MgBGAAAAAACUbnk-iwQZRbXMgkfKtmYhBwCpTc-InBsuTYwTUBb_VIb4AAAAAAENAACpTc-InBsuTYwTUBb_VIb4AADnwkmiAAA=",
"createdDateTime": "2018-03-06T22:28:32.3852279Z",
"lastModifiedDateTime": "2018-03-06T22:29:11.4604154Z",
"changeKey": "qU3PyJwbLk2ME1AW/lSG+AAA6EQMeA==",
"categories": [],
"originalStartTimeZone": "Pacific Standard Time",
"originalEndTimeZone": "Pacific Standard Time",
"iCalUId": "040000008200E00074C5B7101A82E0080000000000000000000000000000000000000000310000007643616C2D5569640100000033324633333433392D433744452D344338362D393046452D44424639314131363444323900",
"reminderMinutesBeforeStart": 15,
"isReminderOn": true,
"hasAttachments": false,
"subject": "Test",
"bodyPreview": "Test",
"importance": "normal",
"sensitivity": "normal",
"isAllDay": false,
"isCancelled": false,
"isOrganizer": true,
"responseRequested": true,
"seriesMasterId": null,
"showAs": "busy",
"type": "singleInstance",
"webLink": "https://outlook.office365.com/owa/?itemid=AAMkADBlZTUwNTkxLWVmODgtNDVhNC1iZjhlLTdjNjA1ODZlMDI5MgBGAAAAAACUbnk%2FiwQZRbXMgkfKtmYhBwCpTc%2FInBsuTYwTUBb%2BVIb4AAAAAAENAACpTc%2FInBsuTYwTUBb%2BVIb4AADnwkmiAAA%3D&exvsurl=1&path=/calendar/item",
"onlineMeetingUrl": null,
"responseStatus": {
"response": "organizer",
"time": "0001-01-01T00:00:00Z"
},
"body": {
"contentType": "html",
"content": "Hi"
},
"start": {
"dateTime": "2018-03-08T04:00:00.0000000",
"timeZone": "UTC"
},
"end": {
"dateTime": "2018-03-08T05:00:00.0000000",
"timeZone": "UTC"
},
"location": {
"displayName": "Test",
"locationType": "default",
"uniqueId": "Test",
"uniqueIdType": "private"
},
"locations": [
{
"displayName": "Test",
"locationType": "default",
"uniqueId": "Test",
"uniqueIdType": "private"
}
],
"recurrence": null,
"attendees": [
{
"type": "required",
"status": {
"response": "declined",
"time": "2018-03-06T22:29:05Z"
},
"emailAddress": {
"name": "Rose",
"address": "rose#example.com"
}
}
],
"organizer": {
"emailAddress": {
"name": "Jack",
"address": "jack#example.com"
}
}
}
}
The API doesn't expose this information directly. The related entities just don't have the fields defined to show this information. However, the data is obviously there, you just need to know how to get to it. The answer lies in extended properties.
Basically since Graph doesn't expose these values, we need to dig into the MAPI properties that store this data. After a little digging through the Exchange server protocol docs I found PidLidAppointmentProposedStartWhole and PidLidAppointmentProposedEndWhole. We just need to translate those MAPI property definitions into the Graph extended property syntax.
From the doc, those are both in the PSETID_Appointment namespace, which uses the GUID {00062002-0000-0000-C000-000000000046}. PidLidAppointmentProposedStartWhole uses id 0x8250, and PidLidAppointmentProposedEndWhole uses id 0x8251. So those should translate to:
PidLidAppointmentProposedStartWhole: 'SystemTime {00062002-0000-0000-C000-000000000046} Id 0x8250'
PidLidAppointmentProposedEndWhole: 'SystemTime {00062002-0000-0000-C000-000000000046} Id 0x8251'
If we use those in an $expand clause as per Get singleValueLegacyExtendedProperty, we get something like:
GET /me/mailfolders/inbox/messages?$expand=singleValueExtendedProperties($filter=id eq 'SystemTime {00062002-0000-0000-C000-000000000046} Id 0x8250' or id eq 'SystemTime {00062002-0000-0000-C000-000000000046} Id 0x8251')
And the response looks something like this (other properties omitted):
{
"subject": "New Time Proposed: Let's meet",
"singleValueExtendedProperties": [
{
"id": "SystemTime {00062002-0000-0000-c000-000000000046} Id 0x8250",
"value": "2018-03-20T20:00:00Z"
},
{
"id": "SystemTime {00062002-0000-0000-c000-000000000046} Id 0x8251",
"value": "2018-03-20T21:00:00Z"
}
]
}

Update records in ElasticSearch

I had like to update the logdate column for ALL records in a specific index. From what I have read so far, it seems that it is not possible? I am correct?
Here's a sample of a document:
{
"_index": "logstash-01-2015",
"_type": "ufdb",
"_id": "AU__EvrALg15uxY1Wxf9",
"_score": 1,
"_source": {
"message": "2015-08-14 06:50:05 [31946] PASS level2 10.249.10.70 level2 ads http://ad.360yield.com/unpixel.... GET",
"#version": "1",
"#timestamp": "2015-09-24T11:17:57.389Z",
"type": "ufdb",
"file": "/usr/local/ufdbguard/logs/ufdbguardd.log",
"host": "PROXY-DEV",
"offset": "3983281700",
"logdate": "2015-08-14T04:50:05.000Z",
"status": "PASS",
"group": "level2",
"clientip": "10.249.10.70",
"category": "ads",
"url": "http://ad.360yield.com/unpixel....",
"method": "GET",
"tags": [
"_grokparsefailure"
]
}
}
You are correct, that is not possible.
There's been an open issue asking Update by Query for long time, and I'm not sure it's going to be implemented anytime soon since it is very problematic for the underlying lucene engine. It requires deleting all documents and reindexing them.
An Update by Query Plugin is available on github, but it's experimental and I never tried it.
UPDATE 2018-05-02
The original answer is quite old. Update By Query is now supported.
You can use the partial update API.
To test it, I created a trivial index:
PUT /test_index
Then created a document:
PUT /test_index/doc/1
{
"message": "2015-08-14 06:50:05 [31946] PASS level2 10.249.10.70 level2 ads http://ad.360yield.com/unpixel.... GET",
"#version": "1",
"#timestamp": "2015-09-24T11:17:57.389Z",
"type": "ufdb",
"file": "/usr/local/ufdbguard/logs/ufdbguardd.log",
"host": "PROXY-DEV",
"offset": "3983281700",
"logdate": "2015-08-14T04:50:05.000Z",
"status": "PASS",
"group": "level2",
"clientip": "10.249.10.70",
"category": "ads",
"url": "http://ad.360yield.com/unpixel....",
"method": "GET",
"tags": [
"_grokparsefailure"
]
}
Now I can do a partial update on the document with:
POST /test_index/doc/1/_update
{
"doc": {
"logdate": "2015-09-25T12:20:00.000Z"
}
}
If I retrieve the document:
GET /test_index/doc/1
I will see that the logdate property has been updated:
{
"_index": "test_index",
"_type": "doc",
"_id": "1",
"_version": 2,
"found": true,
"_source": {
"message": "2015-08-14 06:50:05 [31946] PASS level2 10.249.10.70 level2 ads http://ad.360yield.com/unpixel.... GET",
"#version": "1",
"#timestamp": "2015-09-24T11:17:57.389Z",
"type": "ufdb",
"file": "/usr/local/ufdbguard/logs/ufdbguardd.log",
"host": "PROXY-DEV",
"offset": "3983281700",
"logdate": "2015-09-25T12:20:00.000Z",
"status": "PASS",
"group": "level2",
"clientip": "10.249.10.70",
"category": "ads",
"url": "http://ad.360yield.com/unpixel....",
"method": "GET",
"tags": [
"_grokparsefailure"
]
}
}
Here is the code I used to test it:
http://sense.qbox.io/gist/236bf271df6d867f5f0c87eacab592e41d3095cf

How to search exact text in nested document in elasticsearch

I have a index like this,
"_index": "test",
"_type": "products",
"_id": "URpYIFBAQRiPPu1BFOZiQg",
"_score": null,
"_source": {
"currency": null,
"colors": [],
"api": 1,
"sku": 9999227900050002,
"category_path": [
{
"id": "cat00000",
"name": "B1"
},
{
"id": "abcat0400000",
"name": "Cameras & Camcorders"
},
{
"id": "abcat0401000",
"name": "Digital Cameras"
},
{
"id": "abcat0401005",
"name": "Digital SLR Cameras"
},
{
"id": "pcmcat180400050006",
"name": "DSLR Package Deals"
}
],
"price": 1034.99,
"status": 1,
"description": null,
}
And i want to search only exact text ["Camcorders"] in category_path field.
I did some match query, but it search all the products which has "Camcorders" as a part of the text. Can some one help me to solve this.
Thanks
To search in nested field use like following query
{
"query": {
"term": {
"category_path.name": {
"value": "b1"
}
}
}
}
HOpe it helps..!
you could add one more nested field raw_name with not_analyzed analyzer and match against it.

Resources