Check whether a field name exists in an AWS OpenSearch index - elasticsearch

I want to check whether a field exists in an OpenSearch index or not. I expect the response to be a Boolean True or False.
In Elasticsearch, to check whether an index exists, we use the following code:
if es.indices.exists(index="index"):
Is there any function/logic to check whether a field name is present in an AWS OpenSearch index? If the field does not exist, I have to return the response "Field not found in the index".
My Sample Code:
from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth
import boto3

host = 'my-test-domain.us-east-1.es.amazonaws.com'
region = 'us-east-1'
credentials = boto3.Session().get_credentials()
auth = AWSV4SignerAuth(credentials, region)
index_name = 'movies'

client = OpenSearch(
    hosts=[{'host': host, 'port': 443}],
    http_auth=auth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection
)
q = 'miller'
query = {
    'size': 5,
    'query': {
        'multi_match': {
            'query': q,
            'fields': ['title^2', 'director']
        }
    }
}

response = client.search(
    body=query,
    index=index_name
)

print('\nSearch results:')
print(response)
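One possible approach (a sketch I'm adding, not from the original post; it assumes the opensearch-py get_mapping API and that index_name is a concrete index rather than an alias): fetch the index mapping and walk its properties tree to see whether the field is defined.

def field_exists(client, index_name, field_name):
    # Fetch the mapping and walk the 'properties' tree.
    # Dotted paths like 'actor.name' are resolved one level at a time.
    mapping = client.indices.get_mapping(index=index_name)
    properties = mapping[index_name]['mappings'].get('properties', {})
    parts = field_name.split('.')
    for i, part in enumerate(parts):
        if part not in properties:
            return False
        if i < len(parts) - 1:
            properties = properties[part].get('properties', {})
    return True

if not field_exists(client, index_name, 'title'):
    print("Field not found in the index")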

Related

ElasticSearch: Bucketing on a subfield

I have an index whose mapping has a company_tag keyword field and a data.company_name keyword field. The company_tag is not a sub-field; data is an object field. However, when I try the following
from elasticsearch_dsl import Search, A

s = Search()
a = A('terms', field='data__company_name')
s.aggs.bucket("company_name", a)
response = s.execute()
response.success()
The success is True, but response.aggs.to_dict() returns:
{'company_name': {'doc_count_error_upper_bound': 0,
                  'sum_other_doc_count': 0,
                  'buckets': []}}
If I do
s = Search()
a = A('terms', field='company_tag')
s.aggs.bucket("company_name", a)
response = s.execute()
response.success()
I do get a success and non-empty buckets.
Why is that?
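A likely cause (my own note, not part of the original thread): elasticsearch_dsl only translates double underscores into dots in keyword argument names, such as Q('term', data__company_name='x'); a plain string value like field='data__company_name' is sent to Elasticsearch verbatim, matches no field, and a terms aggregation on a nonexistent field silently returns empty buckets. Passing the dotted path directly should work:

from elasticsearch_dsl import Search, A

s = Search()
# Use the literal dotted path; the double-underscore spelling is only
# rewritten to a dot when used as a kwarg name, not as a string value.
a = A('terms', field='data.company_name')
s.aggs.bucket("company_name", a)
response = s.execute()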

Not able to read data from elasticsearch using alpakka

I was trying to read documents (JSON data stored in ES) from Elasticsearch using Alpakka.
I found this: alpakka-Elasticsearch.
It says that you can stream messages from or to Elasticsearch using the ElasticsearchSource, ElasticsearchFlow or the ElasticsearchSink.
I tried to implement the ElasticsearchSource method, so my code looks like this:
val url = "http://localhost:9200"
val connectionSettings = ElasticsearchConnectionSettings(url)
val sourceSettings = ElasticsearchSourceSettings(connectionSettings)
val elasticsearchParamsV7 = ElasticsearchParams.V7("category_index")

val copy = ElasticsearchSource
  .typed[CategoryData](
    elasticsearchParamsV7,
    query = query,
    sourceSettings
  ).map { message: ReadResult[CategoryData] =>
    println("Inside message==================> " + message)
    WriteMessage.createIndexMessage(message.id, message.source)
  }.runWith(
    ElasticsearchSink.create[CategoryData](
      elasticsearchParamsV7, ElasticsearchWriteSettings(connectionSettings)
    )
  )

println("Final data==============>. " + copy)
At the end, the copy value returns a Future[Done].
But I was not able to read data from ES.
Is there something I am missing?
Also, is there any other way to do the same using the Akka HTTP client API? What is the preferred way to use ES with Akka?
To read data from Elasticsearch, something like this should be enough:
val matchAllQuery = """{"match_all": {}}"""

val result = ElasticsearchSource
  .typed[CategoryData](
    elasticsearchParamsV7,
    query = matchAllQuery,
    sourceSettings
  ).map { message: ReadResult[CategoryData] =>
    println("Read message==================> " + message)
  }.runWith(Sink.seq)

result.onComplete(res => res.foreach(col => println(s"Read: ${col.size} records")))
If the type CategoryData does not correctly match what is stored in the index, the query may not return results.
If in doubt, it's possible to read raw JSON:
val elasticsearchSourceRaw = ElasticsearchSource
  .create(
    elasticsearchParamsV7,
    query = matchAllQuery,
    settings = sourceSettings
  )

Iterating through map of objects in terraform with route 53 records

I am trying to figure out how to read additional values in Terraform using for / for_each, on Terraform 0.12.26.
dns.tfvars
mx = {
  "mywebsite.org." = {
    ttl = "3600"
    records = [
      "home.mywebsite.org.",
      "faq.mywebsite.org."
    ]
  }
  "myotherwebsite.org." = {
    ttl = "3600"
    records = [
      "home.myotherwebsite.org."
    ]
  }
}
variables.tf
variable "mx" {
type = map(object({
ttl = string
records = set(string)
}))
}
mx.tf
locals {
  mx_records = flatten([
    for mx_key, mx in var.mx : [
      for record in mx.records : {
        mx_key = mx_key
        record = record
        ttl    = mx.ttl
      }
    ]
  ])
}

resource "aws_route53_record" "mx_records" {
  for_each = { for mx in local.mx_records : mx.mx_key => mx... }

  zone_id = aws_route53_zone.zone.zone_id
  name    = each.key
  type    = "MX"
  ttl     = each.value.ttl
  records = [
    each.value.record
  ]
}
In mx.tf, I can comment out the second value, faq.mywebsite.org, and the code works perfectly. I cannot figure out how to set up my for loop and for_each statements to get it to "loop" through the second value. The first error I received is stated below:
Error: Duplicate object key

  on mx.tf line 13, in resource "aws_route53_record" "mx_records":
  13:   for_each = { for mx in local.mx_records : mx.mx_key => mx }
    |----------------
    | mx.mx_key is "mywebsite.org."

Two different items produced the key "mywebsite.org." in this 'for'
expression. If duplicates are expected, use the ellipsis (...) after the value
expression to enable grouping by key.
To my understanding, I do not have two duplicate values helping to form the key, so I should not have to use the ellipsis, but I tried using it anyway to see if it would apply properly. After adding the ellipsis after the value expression, I got this error:
Error: Unsupported attribute

  on mx.tf line 20, in resource "aws_route53_record" "mx_records":
  20:     each.value.record
    |----------------
    | each.value is tuple with 2 elements

This value does not have any attributes.
Any advice on this issue would be appreciated.
UPDATE
Error: [ERR]: Error building changeset: InvalidChangeBatch: [Tried to create resource record set [name='mywebsiteorg.', type='MX'] but it already exists]
status code: 400, request id: dadd6490-efac-47ac-be5d-ab8dad0f4a6c
It's trying to create the record, but it was already created because of the first record in the list.
I think you could just construct a map of your objects with the key being the index of the mx_records list (note the idx being the index):
resource "aws_route53_record" "mx_records" {
for_each = { for idx, mx in local.mx_records : idx => mx }
zone_id = aws_route53_zone.zone.zone_id
name = each.value.mx_key
type = "MX"
ttl = each.value.ttl
records = [
each.value.record
]
}
The above for_each expression changes your local.mx_records from list(objects) to map(objects), where the map key is idx and the value is the original object.
Update:
I verified in Route53 that you can't have duplicate records with the same name and type. Thus you may try using the original mx variable:
resource "aws_route53_record" "mx_records" {
for_each = { for idx, mx in var.mx : idx => mx }
zone_id = aws_route53_zone.zone.zone_id
name = each.key
type = "MX"
ttl = each.value.ttl
records = each.value.records
}
Moreover, if you want to avoid the flatten function and the for-loop local variable, you can access the objects in the map as:
resource "aws_route53_record" "mx_records" {
for_each = var.mx
zone_id = aws_route53_zone.zone.zone_id
name = each.key
type = "MX"
ttl = each.value["ttl"]
records = each.value["records"]
}

boto3 ec2 create instance with a name

I am new to AWS and am using boto3 to launch an instance. However, I notice that when I create the instance, the "Name" field is empty. I create it as follows:
import time

import boto3


def create_instance(ami, instance_type, device_name, iam_role, volume_type,
                    volume_size, security_groups, key_name, user_data):
    s = boto3.Session(region_name="eu-central-1")
    ec2 = s.resource('ec2')
    res = ec2.create_instances(
        IamInstanceProfile={'Name': iam_role},
        ImageId=ami,
        InstanceType=instance_type,
        SecurityGroupIds=security_groups,
        KeyName=key_name,
        UserData=user_data,
        MaxCount=1,
        MinCount=1,
        InstanceInitiatedShutdownBehavior='terminate',
        BlockDeviceMappings=[{
            'DeviceName': device_name,
            'Ebs': {
                'DeleteOnTermination': True,
                'VolumeSize': volume_size,
                'VolumeType': volume_type
            }
        }]
    )
    instance = res[0]
    # Poll until the instance leaves the 'pending' state.
    while instance.state['Name'] == 'pending':
        time.sleep(5)
        instance.load()
    return instance.public_ip_address, instance.public_dns_name
There does not seem to be a simple way to specify the name of the launched instance. How can one do this?
Put a tag with the key Name and your instance name as the value:
TagSpecifications=[
    {
        'ResourceType': 'instance',
        'Tags': [
            {
                'Key': 'Name',
                'Value': '<What you want>'
            },
        ]
    },
],
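For context, a minimal sketch of where this goes (the tag value here is a placeholder): TagSpecifications is an additional keyword argument to create_instances, so the instance is tagged at launch and the EC2 console's Name column is filled in.

res = ec2.create_instances(
    ImageId=ami,
    InstanceType=instance_type,
    MaxCount=1,
    MinCount=1,
    # Tag the instance at launch; the 'Name' tag is what the
    # EC2 console displays in its Name column.
    TagSpecifications=[
        {
            'ResourceType': 'instance',
            'Tags': [{'Key': 'Name', 'Value': 'my-instance-name'}],
        },
    ],
)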

NEST search does not find any results

I just found out about NEST. I have already inserted a number of documents into Elasticsearch. Right now I want to search the data based on my type's subscribeId. I ran the search through curl and it works just fine, but when I try it using NEST, no results are found.
My curl request, which works:
http://localhost:9200/20160902/_search?q=subscribeId:aca0ca1a-c96a-4534-ab0e-f844b81499b7
My NEST code:
var local = new Uri("http://localhost:9200");
var settings = new ConnectionSettings(local);
var elastic = new ElasticClient(settings);

var response = elastic.Search<IntegrationLog>(s => s
    .Index(DateTime.Now.ToString("yyyyMMdd"))
    .Type("integrationlog")
    .Query(q => q
        .Term(p => p.SubscribeId, new Guid("aca0ca1a-c96a-4534-ab0e-f844b81499b7"))
    )
);
Can someone point out what I did wrong?
A key difference between your curl request and your NEST query is that the former uses a query_string query and the latter a term query. A query_string query input undergoes analysis at query time, whilst a term query input does not, so depending on how subscribeId is analyzed (or not), you may see different results. Additionally, your curl request searches across all document types within the index 20160902.
To perform the exact same query in NEST as your curl request would be
void Main()
{
    var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
    var connectionSettings = new ConnectionSettings(pool)
        // set up NEST with the convention to use the type name
        // "integrationlog" for the IntegrationLog POCO type
        .InferMappingFor<IntegrationLog>(m => m
            .TypeName("integrationlog")
        );

    var client = new ElasticClient(connectionSettings);

    var searchResponse = client.Search<IntegrationLog>(s => s
        .Index("20160902")
        // search across all types. Note that documents found
        // will be deserialized into instances of the
        // IntegrationLog type
        .AllTypes()
        .Query(q => q
            // use query_string query
            .QueryString(qs => qs
                .Fields(f => f
                    .Field(ff => ff.SubscribeId)
                )
                .Query("aca0ca1a-c96a-4534-ab0e-f844b81499b7")
            )
        )
    );
}

public class IntegrationLog
{
    public Guid SubscribeId { get; set; }
}
This yields
POST http://localhost:9200/20160902/_search
{
    "query": {
        "query_string": {
            "query": "aca0ca1a-c96a-4534-ab0e-f844b81499b7",
            "fields": [
                "subscribeId"
            ]
        }
    }
}
This specifies the query_string query in the body of the request, which is analogous to using the q query string parameter to specify the query.
