Can't get geo_point to work with Bonsai on Heroku - heroku

I'm trying to use a geo_point field on Heroku/Bonsai, but it just doesn't want to work.
It works locally, but whenever I check the mapping for my index on Heroku/Bonsai, it says my field is a string: "coordinates":{"type":"string"}
My mapping looks like this:
tire.mapping do
  ...
  indexes :coordinates, type: "geo_point", lat_lon: true
  ...
end
And my to_indexed_json like this:
def to_indexed_json
  {
    ...
    coordinates: map_marker.nil? ? nil : [map_marker.latitude, map_marker.longitude].join(','),
    ...
  }.to_json
end
In the console on Heroku I tried MyModel.mapping and MyModel.index.mapping and the first one correctly has :coordinates=>{:type=>"geo_point", :lat_lon=>true}.

Here's how I got this to work. The index name is 'myindex' and the type name is 'myindextype'.
On the local machine, run:
curl -XGET https://[LOCAL_ES_URL]/myindex/myindextype/_mapping
Save the output to a .json file, for example typedefinition.json (or build one by hand):
{
"myindextype":{
"properties":{
"dataone":{"type":"string"},
"datatwo":{"type":"double"},
"location":{"type":"geo_point"},
"datathree":{"type":"long"},
"datafour":{"type":"string"}
}
}
}
On Heroku, enter the command
heroku config
and get the BONSAI_URL. Put it in the following commands in place of [BONSAI_URL]. (https://asdfasdfdsf:asdfadf@asdfasdfasdf.us-east-1.bonsai.io/myindex)
curl -XDELETE https://[BONSAI_URL]/myindex
curl -XPOST https://[BONSAI_URL]/myindex
curl -XPUT -d@typedefinition.json https://[BONSAI_URL]/myindex/myindextype/_mapping
curl -XGET https://[BONSAI_URL]/myindex/myindextype/_mapping
These commands, in order:
Delete the index if it exists.
Create an empty index.
Use the .json file as the mapping definition.
Get the new mapping to make sure it worked.
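The last step can also be checked in code rather than by eye. The following is a minimal sketch, assuming the mapping response has the same shape as typedefinition.json above; the helper name field_type is hypothetical and not part of any Elasticsearch client. Newer Elasticsearch versions may nest the mapping under the index name, in which case the response would need to be unwrapped first.

```python
import json


def field_type(mapping, type_name, field):
    """Return the mapped type of `field`, or None if the field is not mapped."""
    properties = mapping.get(type_name, {}).get("properties", {})
    return properties.get(field, {}).get("type")


# Sample taken from the typedefinition.json shown above (shortened).
sample = json.loads("""
{"myindextype": {"properties": {
    "dataone": {"type": "string"},
    "location": {"type": "geo_point"}
}}}
""")

print(field_type(sample, "myindextype", "location"))
```

If the field still comes back as "string", the mapping was most likely created implicitly by indexing a document before the explicit mapping was applied.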

Related

Escape character in index name when using elasticdump on cli

elasticdump is a great tool for moving and saving indices. Copying data from one index to another is simply this command:
$ elasticdump --input=http://user:pass@localhost:9200/my-index --key=... --output=https://user:pass@my-cluster:9200/my-index --tlsAuth --type=data
In my application I create indices dynamically. Somehow I ended up with an unescaped character, ’, in some index names. One resulting index name is, for example, mens’s_clothing.
When I now want to copy the data from that index to another, elasticdump gives me this error:
$ elasticdump --input=https://...@my-cluster:9200/mens’s_clothing --key=... --output=https://...@localhost:9200/mens’s_clothing --tlsAuth --type=data
...
... Error Emitted => Request path contains unescaped characters
Fair enough. Let's URL-encode the index name:
$ elasticdump --input=https://...@my-cluster:9200/mens%E2%80%99s_clothing --key=... --output=https://...@localhost:9200/mens%E2%80%99s_clothing --tlsAuth --type=data
...
{
_index: 'mens%E2%80%99s_clothing',
_type: '_doc',
_id: '367f9125_1_1',
status: 400,
error: {
type: 'invalid_index_name_exception',
reason: 'Invalid index name [mens%E2%80%99s_clothing], must be lowercase',
index_uuid: '_na_',
index: 'mens%E2%80%99s_clothing'
}
}
Elasticsearch returns this error, so it seems elasticdump does not realize that the index name is encoded and passes it on literally. When I create the index with curl using the encoded name, it works:
$ curl -X PUT "https://...@localhost:9202/men%E2%80%99s_clothing"
{"acknowledged":true,"shards_acknowledged":true,"index":"men’s_clothing"}
Does anyone know how to tell elasticdump that the index name is URI-encoded?
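One observation that may help narrow this down: the percent-encoded form contains uppercase hex digits, so if the encoded string is taken literally as the index name, a "must be lowercase" check would reject it. The sketch below uses only Python's standard urllib.parse to show the encoded and decoded forms side by side. It illustrates the encoding only; it does not show how to make elasticdump decode the name.

```python
from urllib.parse import quote, unquote

index_name = "mens\u2019s_clothing"  # U+2019 RIGHT SINGLE QUOTATION MARK

encoded = quote(index_name, safe="")  # percent-encode the UTF-8 bytes
decoded = unquote(encoded)            # what the server sees after decoding

print(encoded)
print(decoded == index_name)          # the round trip preserves the name
print(encoded == encoded.lower())     # is the literal encoded string lowercase?
```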

Backup and restore some records of an elasticsearch index

I want to back up some records (e.g. only the latest 1 million) of an Elasticsearch index and restore this backup on a different machine. Ideally this would be done using built-in Elasticsearch features.
I've tried Elasticsearch snapshot and restore (code below), but it looks like it backs up the whole index rather than selected records.
curl -H 'Content-Type: application/json' -X PUT "localhost:9200/_snapshot/es_data_dump?pretty=true" -d '
{
"type": "fs",
"settings": {
"compress" : true,
"location": "es_data_dump"
}
}'
curl -H 'Content-Type: application/json' -X PUT "localhost:9200/_snapshot/es_data_dump/snapshot1?wait_for_completion=true&pretty=true" -d '
{
"indices" : "index_name",
"type": "fs",
"settings": {
"compress" : true,
"location": "es_data_dump"
}
}'
The backup format could be anything, as long as it can be restored successfully on a different machine.
You can use the _reindex API. It accepts any query. After reindexing, you have a new index that serves as the backup and contains only the requested records, which you can then copy wherever you want.
Complete information is here: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html
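As a rough illustration of the _reindex suggestion, the sketch below builds the JSON body that would be sent with a POST to /_reindex. The destination index name, the timestamp field and the range query are hypothetical placeholders; only the source/dest/query layout is taken from the reindex documentation linked above.

```python
import json


def reindex_body(source_index, dest_index, query):
    """Build a _reindex request body that copies only documents matching `query`."""
    return {
        "source": {"index": source_index, "query": query},
        "dest": {"index": dest_index},
    }


# Hypothetical names and field; adjust them to your own data.
body = reindex_body(
    "index_name",
    "index_name_backup",
    {"range": {"timestamp": {"gte": "now-7d"}}},
)
print(json.dumps(body, indent=2))
```

The resulting smaller index could then be snapshotted on its own with the snapshot commands from the question.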
In the end, I fetched the required data using python driver because that is what I found the easiest for the given use case.
For that, I ran an Elasticsearch query and stored its response in a file in newline-separated format and then I later restored data from it using another python script. A maximum of 10000 entries are returned this way along with the scroll ID to be used to fetch next 10000 entries and so on.
from elasticsearch import Elasticsearch
es = Elasticsearch(timeout=30, max_retries=10, retry_on_timeout=True)
# _query is the query dict that selects the records to back up (defined elsewhere)
page = es.search(index=['ct_analytics'], body={'size': 10000, 'query': _query, 'stored_fields': '*'}, scroll='5m')
# keep scrolling until a page comes back empty
while len(page['hits']['hits']) > 0:
    es_data = page['hits']['hits']  # Store this as you like
    scrollId = page['_scroll_id']
    page = es.scroll(scroll_id=scrollId, scroll='5m')
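The "store this as you like" step is left open above. One possible way to do the newline-separated storage described earlier is sketched here; the function names dump_hits and load_hits and the sample hits are hypothetical, and an in-memory buffer stands in for a real file so the example runs without a cluster.

```python
import io
import json


def dump_hits(hits, fh):
    """Write each hit as one JSON object per line."""
    for hit in hits:
        fh.write(json.dumps(hit) + "\n")


def load_hits(fh):
    """Read hits back from a newline-separated JSON file."""
    return [json.loads(line) for line in fh if line.strip()]


# Hypothetical hits in the shape found under page['hits']['hits'].
hits = [
    {"_id": "1", "_source": {"msg": "a"}},
    {"_id": "2", "_source": {"msg": "b"}},
]
buffer = io.StringIO()  # stands in for a file opened with open(...)
dump_hits(hits, buffer)
buffer.seek(0)
restored = load_hits(buffer)
print(restored == hits)
```

In the scroll loop, dump_hits would be called once per page with the same open file handle, so the file grows by one line per document.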

How to compress with 'best_compression' elasticsearch data

How can I compress all Elasticsearch data (existing as well as new) with the "best_compression" option?
I know that since version 5.0 I can't put "index.codec: best_compression" in the elasticsearch.yml file. The log indicates that this is deprecated and that I should use
curl -XPUT 'http://localhost:9200/_all/_settings?preserve_existing=true' -d '{"index.codec" : "best_compression"}'
But when I run it, I get the following error:
{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Can't update non dynamic settings [[index.codec]] for open indices [[logstash-dns-2018.07.30/xHq6UfgsSD2M1dBZhV3cOg], [logstash-2018.07.27/7U7uUsEORFqXtJtrk4KvDw], [logstash-dns-2018.07.27/Xbx15QXOQ5KJAK7iop_54Q], [logstash-http-2018.07.27/q0Rs65a3TjW4NJfcljUHEw], [logstash-flow-2018.07.30/0Erbh2TcRgmFJLMLr8Ka8w], [logstash-2018.07.30/boOd8BdrQV2QoziKaZ_2lw], [logstash-alert-2018.07.27/o5yqwdNqR5yAcbJ-HCNVHw], [logstash-alert-2018.07.30/pp6ZWKLISECVzUiCDDeydQ], [logstash-tls-2018.07.30/rZi6KfC7RtqOVjUt7CCqDQ], [logstash-ssh-2018.07.27/wKi-p6slSqO0-vbwRqS1ZA], [.kibana/XaFQRcEXTW6jLUCmBijzKQ], [logstash-tls-2018.07.27/hbiXYCzjRumh3ND6up9vNw], [logstash-flow-2018.07.27/XfspJr1TS4y6MnCgAmRq1g], [logstash-fileinfo-2018.07.27/9VWyBHsqRmO4QsnN-gdt_w], [logstash-http-2018.07.30/U9JO9Cp-QQO7gvRNoHt7FQ], [logstash-fileinfo-2018.07.30/nlwHeDOsQ3ii8CLxcgE3Ag]]"}],"type":"illegal_argument_exception","reason":"Can't update non dynamic settings [[index.codec]] for open indices [[logstash-dns-2018.07.30/xHq6UfgsSD2M1dBZhV3cOg], [logstash-2018.07.27/7U7uUsEORFqXtJtrk4KvDw], [logstash-dns-2018.07.27/Xbx15QXOQ5KJAK7iop_54Q], [logstash-http-2018.07.27/q0Rs65a3TjW4NJfcljUHEw], [logstash-flow-2018.07.30/0Erbh2TcRgmFJLMLr8Ka8w], [logstash-2018.07.30/boOd8BdrQV2QoziKaZ_2lw], [logstash-alert-2018.07.27/o5yqwdNqR5yAcbJ-HCNVHw], [logstash-alert-2018.07.30/pp6ZWKLISECVzUiCDDeydQ], [logstash-tls-2018.07.30/rZi6KfC7RtqOVjUt7CCqDQ], [logstash-ssh-2018.07.27/wKi-p6slSqO0-vbwRqS1ZA], [.kibana/XaFQRcEXTW6jLUCmBijzKQ], [logstash-tls-2018.07.27/hbiXYCzjRumh3ND6up9vNw], [logstash-flow-2018.07.27/XfspJr1TS4y6MnCgAmRq1g], [logstash-fileinfo-2018.07.27/9VWyBHsqRmO4QsnN-gdt_w], [logstash-http-2018.07.30/U9JO9Cp-QQO7gvRNoHt7FQ], [logstash-fileinfo-2018.07.30/nlwHeDOsQ3ii8CLxcgE3Ag]]"},"status":400}
Solved:
Close all indices:
curl -XPOST 'http://localhost:9200/_all/_close'
Apply best_compression to all:
curl -XPUT 'http://localhost:9200/_all/_settings' -d '{"index.codec" : "best_compression"}'
Open all indices:
curl -XPOST 'http://localhost:9200/_all/_open'
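The order of these three calls matters because, as the error message says, index.codec is a non-dynamic setting and cannot be changed on open indices. The sketch below only builds the request sequence so the order is explicit; the function name compression_steps is hypothetical and nothing is sent to a cluster. As far as I know, the new codec is applied to segments as they are written or merged, so already existing data may only shrink after a merge.

```python
def compression_steps(base_url, target="_all"):
    """Return the (method, url, body) requests in the order they should be sent."""
    return [
        ("POST", f"{base_url}/{target}/_close", None),
        ("PUT", f"{base_url}/{target}/_settings", {"index.codec": "best_compression"}),
        ("POST", f"{base_url}/{target}/_open", None),
    ]


for method, url, body in compression_steps("http://localhost:9200"):
    print(method, url, body)
```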

How to delete all attributes from the schema in solr?

Deleting all documents from solr is
curl http://localhost:8983/solr/trans/update?commit=true -d "<delete><query>*:*</query></delete>"
Adding a (static) attribute to the schema is
curl -X POST -H 'Content-type:application/json' --data-binary '{ "add-field":{"name":"trans","type":"string","stored":true, "indexed":true}}' http://localhost:8983/solr/trans/schema
Deleting one attribute is
curl -X POST -H 'Content-type:application/json' -d '{ "delete-field":{"name":"trans"}}' http://arteika:8983/solr/trans/schema
Is there a way to delete all attributes from the schema?
From at least version 6.6 up to the current version 7.5 of the Schema API, you can pass multiple commands in a single POST (see the 6.6 and 7.5 documentation, respectively). There are multiple accepted formats, but the most intuitive one (I think) is just passing an array for the action you want to perform:
curl -X POST -H 'Content-type: application/json' -d '{
"delete-field": [
{"name": "trans"},
{"name": "other_field"}
]
}' 'http://arteika:8983/solr/trans/schema'
So how do we obtain the names of the fields we want to delete? That can be done by querying the schema:
curl -X GET -H 'Content-type: application/json' 'http://arteika:8983/solr/trans/schema'
The relevant parts are the copyFields, dynamicFields and fields keys of the schema object in the response.
I automated clearing all copy field rules, dynamic field rules and fields as follows. You can of course use any kind of script that is available to you. I used Python 3 (might work with Python 2, I did not test that).
import json
import requests
# load schema information
api = 'http://arteika:8983/solr/trans/schema'
schema = requests.get(api).json()['schema']
headers = {'Content-type': 'application/json'}
# delete copy field rules
payload = {'delete-copy-field': [{'source': o['source'], 'dest': o['dest']} for o in schema['copyFields']]}
requests.post(api, data=json.dumps(payload), headers=headers)
# delete dynamic fields
payload = {'delete-dynamic-field': [{'name': o['name']} for o in schema['dynamicFields']]}
requests.post(api, data=json.dumps(payload), headers=headers)
# delete fields
payload = {'delete-field': [{'name': o['name']} for o in schema['fields']]}
requests.post(api, data=json.dumps(payload), headers=headers)
Just a note: I received status 400 responses at first, with null error messages. I had a bit of a hard time figuring out how to fix those, so I'm sharing what worked for me. Changing the default of updateRequestProcessorChain in solrconfig.xml to false (default="${update.autoCreateFields:false}") and restarting the Solr service made those errors go away for me. The fields I was deleting were created automatically, which may have something to do with it.

Couchdb view Queries

Could you please help me create a view? The requirement is the equivalent of:
select * from personaccount where name="srini" and user="pup" order by lastloggedin
I need to pass name and user as input to the view, and the data should be sorted by lastloggedin.
Below is the view I have created, but it is not working:
{
"language": "javascript",
"views": {
"sortdatetimefunc": {
"map": "function(doc) {
emit({
lastloggedin: doc.lastloggedin,
name: doc.name,
user: doc.user
},doc);
}"
}
}
}
And this is the curl command I am using:
http://uta:password@localhost:5984/personaccount/_design/checkdatesorting/_view/sortdatetimefunc?key={\"name:srini\",\"user:pup\"}
My questions are:
Sorting is done on the key and I want to sort on lastloggedin, so I have included it in the emit function as well.
But I am passing only name and user as parameters. Do we need to pass all the fields that are part of the key?
First of all, thank you for the reply. I have done the same and I am getting errors. Please help.
Could you please try this on your PC? I am posting all the commands:
curl -X PUT http://uta:password@localhost:5984/person-data
curl -X PUT http://uta:password@localhost:5984/person-data/srini -d '{"Name":"SRINI", "Idnum":"383896", "Format":"NTSC", "Studio":"Disney", "Year":"2009", "Rating":"PG", "lastTimeOfCall": "2012-02-08T19:44:37+0100"}'
curl -X PUT http://uta:password@localhost:5984/person-data/raju -d '{"Name":"RAJU", "Idnum":"456787", "Format":"FAT", "Studio":"VFX", "Year":"2010", "Rating":"PG", "lastTimeOfCall": "2012-02-08T19:50:37+0100"}'
curl -X PUT http://uta:password@localhost:5984/person-data/vihar -d '{"Name":"BALA", "Idnum":"567876", "Format":"FAT32", "Studio":"YELL", "Year":"2011", "Rating":"PG", "lastTimeOfCall": "2012-02-08T19:55:37+0100"}'
Here's the view I created, as you suggested:
{
"_id": "_design/persondestwo",
"_rev": "1-0d3b4857b8e6c9e47cc9af771c433571",
"language": "javascript",
"views": {
"personviewtwo": {
"map": "function (doc) {\u000a emit([ doc.Name, doc.Idnum, doc.lastTimeOfCall ], null);\u000a}"
}
}
}
I ran this command with curl:
curl -X GET http://uta:password@localhost:5984/person-data/_design/persondestwo/_view/personviewtwo?startkey=["SRINI","383896"]&endkey=["SRINI","383896",{}]descending=true&include_docs=true
I got this error:
[4] 3000
curl: (3) [globbing] error: bad range specification after pos 99
[5] 1776
[6] 2736
[3] Done descending=true
[4] Done(3) curl -X GET http://uta:password@localhost:5984/person-data/_design/persondestwo/_view/personviewtwo?startkey=["SRINI","383896"]
[5] Done endkey=["SRINI","383896"]
I don't know what this error means.
I have also tried passing the parameters as shown below, but it does not help:
curl -X GET http://uta:password@localhost:5984/person-data/_design/persondestwo/_view/personviewtwo?key={\"Name\":\"SRINI\",\"Idnum\": \"383896\"}&descending=true
But I get different errors about escape sequences.
Overall, I just want a view that satisfies this query:
select * from person-data where Name="SRINI" and Idnum="383896" orderby lastTimeOfCall
My concern is how to pass multiple parameters from the curl command, as I get a lot of errors when I do it the above way.
First off, you need to use an array as your key. I would use:
function (doc) {
emit([ doc.name, doc.user, doc.lastLoggedIn ], null);
}
This basically outputs all the documents in order by name, then user, then lastLoggedIn. You can use the following URL to query.
/_design/checkdatesorting/_view/sortdatetimefunc?startkey=["srini","pup"]&endkey=["srini","pup",{}]&include_docs=true
Second, notice I did not output doc as the value of your query. It takes up much more disk space, especially if your documents are fairly large. Just use include_docs=true.
Lastly, refer to the CouchDB Wiki, it's pretty helpful.
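I just stumbled upon this question. The errors you are getting are caused by not escaping this command:
curl -X GET http://uta:password@localhost:5984/person-data/_design/persondestwo/_view/personviewtwo?startkey=["SRINI","383896"]&endkey=["SRINI","383896",{}]descending=true&include_docs=true
The & character has a special meaning on the command line (it sends the preceding command to the background, which is why the output shows job numbers such as [4] 3000) and should be escaped when it is part of an actual parameter. In addition, curl itself treats [ ] and { } in a URL as globbing patterns, which is what the "[globbing] error" refers to; the -g (--globoff) option turns that off.
So you should put quotes around the whole URL, escape the quotes inside it, and pass -g. Note also the & that was missing before descending=true, and that with descending=true CouchDB expects startkey and endkey to be swapped:
curl -g -X GET "http://uta:password@localhost:5984/person-data/_design/persondestwo/_view/personviewtwo?startkey=[\"SRINI\",\"383896\",{}]&endkey=[\"SRINI\",\"383896\"]&descending=true&include_docs=true"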
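An alternative that avoids shell quoting altogether is to build the URL in code and percent-encode the JSON keys. This is a minimal sketch using only the Python standard library; the helper name view_url is hypothetical, credentials are left out of the URL on purpose, and the database, design document and view names are the ones from the question.

```python
import json
from urllib.parse import urlencode


def view_url(base, design, view, **params):
    """Build a view URL; each value is JSON-encoded, then percent-encoded."""
    encoded = {
        key: json.dumps(value, separators=(",", ":"))
        for key, value in params.items()
    }
    return f"{base}/_design/{design}/_view/{view}?{urlencode(encoded)}"


url = view_url(
    "http://localhost:5984/person-data",
    "persondestwo",
    "personviewtwo",
    startkey=["SRINI", "383896"],
    endkey=["SRINI", "383896", {}],
    include_docs=True,
)
print(url)
```

Because the brackets, braces and quotes are percent-encoded, the resulting URL contains nothing that curl would treat as a globbing pattern; the remaining & separators still need the URL to be quoted if it is pasted into a shell.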
