Elasticsearch 2.0: how to delete by query in Java - elasticsearch

I am trying to upgrade to ES 2.0. I have downloaded ES 2.0 and installed it on my Windows machine.
In my pom.xml, I have the following:
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>2.0.0-rc1</version>
</dependency>
<dependency>
<groupId>org.elasticsearch.plugin</groupId>
<artifactId>delete-by-query</artifactId>
<version>2.0.0-rc1</version>
</dependency>
In my Java code, I did delete by query in the following way when using ES 1.7.3:
StringBuilder b = new StringBuilder("");
b.append("{");
b.append(" \"query\": {");
b.append(" \"term\": {");
b.append(" \"category\": " + category_value );
b.append(" }");
b.append(" }");
b.append("}");
client = getClient();
DeleteByQueryResponse response = client.prepareDeleteByQuery("myindex")
.setTypes("mydocytype")
.setSource(b.toString())
.execute()
.actionGet();
I am hoping to replace this:
DeleteByQueryResponse response = client.prepareDeleteByQuery("myindex")
.setTypes("mydocytype")
.setSource(b.toString())
.execute()
.actionGet();
with the ES 2.0 way. I Googled but failed to find an example for it, and the online API documentation seems too abstract to me. How can I do it?
Another question: Do I have to install delete-by-query plugin in Elasticsearch server?
Thanks for any pointer!
UPDATE
I followed Max's suggestion, and here is what I have now:
First, when creating the client, make the settings look like the following:
Settings settings = Settings.settingsBuilder()
.put("cluster.name", "mycluster")
.put("plugin.types", DeleteByQueryPlugin.class.getName())
.build();
Second, at the place doing delete-by-query:
DeleteByQueryResponse rsp = new DeleteByQueryRequestBuilder(client, DeleteByQueryAction.INSTANCE)
.setIndices("myindex")
.setTypes("mydoctype")
.setSource(b.toString())
.execute()
.actionGet();
I also installed delete by query plugin by running the following in the root directory of ES:
bin\plugin install delete-by-query
I get errors if I do not install this plugin.
After all these steps, ES related parts work just fine.
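As a side note on the query source itself: the StringBuilder code above appends category_value without quoting it, which only yields valid JSON if the value is a number or already carries its own quotes. A minimal sketch of a helper that quotes and escapes the value is shown below, assuming category_value is a plain string. The class and method names (TermQuerySource, quote, termQuery) are hypothetical and not part of any Elasticsearch API; the resulting string would be passed to setSource(...) as before.

```java
/** Hypothetical helper (not an Elasticsearch class): builds the JSON source for one term query. */
final class TermQuerySource {

    /** Wraps s in double quotes and escapes the characters JSON requires to be escaped. */
    static String quote(String s) {
        StringBuilder out = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters must be written as \\uXXXX escapes.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.append('"').toString();
    }

    /** Returns {"query":{"term":{field:value}}} with both strings quoted. */
    static String termQuery(String field, String value) {
        return "{\"query\":{\"term\":{" + quote(field) + ":" + quote(value) + "}}}";
    }

    public static void main(String[] args) {
        // "books" is a placeholder value for illustration only.
        System.out.println(termQuery("category", "books"));
    }
}
```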

plugin.types has been deprecated in ES 2.1.0 (source), so the accepted solution will result in a NullPointerException.
The solution is to use the addPlugin method:
Client client = TransportClient.builder().settings(settings())
.addPlugin(DeleteByQueryPlugin.class)
.build()
.addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("host"), 9300));

I believe you can use this:
DeleteByQueryResponse rsp = new DeleteByQueryRequestBuilder(client, DeleteByQueryAction.INSTANCE)
.setTypes("mydocytype")
.setSource(b.toString())
.execute()
.actionGet();
You have to add the plugin type to your settings:
Settings settings = Settings.settingsBuilder()
.put("plugin.types", DeleteByQueryPlugin.class.getName())
.build();
If you have a remote server, you also have to install the plugin on it.

From Elasticsearch 5 onwards:
final BulkIndexByScrollResponse response = DeleteByQueryAction.INSTANCE.newRequestBuilder(super.transportClient)
.filter(
QueryBuilders.boolQuery()
.must(QueryBuilders.termQuery("_type", "MY_TYPE")) // Trick to define and ensure the type.
.must(QueryBuilders.termQuery("...", "...")))
.source("MY_INDEX")
.get();
return response.getDeleted() > 0;
Official documentation

First, add elasticsearch-2.3.3/plugins/delete-by-query/delete-by-query-2.3.3.jar to the build path.
Then:
Client client = TransportClient.builder().settings(settings)
.addPlugin(DeleteByQueryPlugin.class)
.build()
.addTransportAddress(new InetSocketTransportAddress(
InetAddress.getByName("192.168.0.224"), 9300));

Related

groovy command curl on windows Jenkins

I have a Groovy script that works on a Linux Jenkins:
import groovy.json.JsonSlurper
try {
List<String> artifacts = new ArrayList<String>()
// Jira: get the summary of each issue of type Story with label demo in project 11411
def artifactsUrl = 'https://companyname.atlassian.net/rest/api/2/search?jql=project=11411%20and%20issuetype%20in%20(Story)%20and%20labels%20in%20(demo)+&fields=summary' ;
def artifactsObjectRaw = ["curl", "-u", "someusername@xxxx.com:tokenkey" ,"-X" ,"GET", "-H", "Content-Type: application/json", "-H", "accept: application/json","-K", "--url","${artifactsUrl}"].execute().text;
def parser = new JsonSlurper();
def json = parser.parseText(artifactsObjectRaw );
//insert all result into list
for(item in json.issues){
artifacts.add( item.fields.summary);
}
//return list to extended result
return artifacts ;
}catch (Exception e) {
println "There was a problem fetching the artifacts " + e.message;
}
This script returns all the names from Jira through the API.
But when I tried to run this Groovy script on a Windows Jenkins, it did not work because Windows does not have the curl command:
def artifactsObjectRaw = ["curl", "-u","someusername@xxxx.com:tokenkey" ,"-X" ,"GET", "-H", "Content-Type: application/json", "-H", "accept: application/json","-K","--url","${artifactsUrl}"].execute().text;
How should I perform this command?
The following code:
import groovy.json.JsonSlurper
try {
def baseUrl = 'https://companyname.atlassian.net'
def artifactsUrl = "${baseUrl}/rest/api/2/search?jql=project=MYPROJECT&fields=summary"
def auth = "someusername@somewhere.com:tokenkey".bytes.encodeBase64()
def headers = ['Content-Type': "application/json",
'Authorization': "Basic ${auth}"]
def response = artifactsUrl.toURL().getText(requestProperties: headers)
def json = new JsonSlurper().parseText(response)
// the below will implicitly return a list of summaries, no
// need to define an 'artifacts' list beforehand
def artifacts = json.issues.collect { issue -> issue.fields.summary }
} catch (Exception e) {
e.printStackTrace()
}
is pure Groovy, i.e. no need for curl. It gets the items from the Jira instance and returns a List<String> of summaries. Since we don't want any external dependencies like HTTPBuilder (as you are doing this from Jenkins), we have to do the basic auth encoding manually.
Script tested (the connecting and getting json part, did not test the extraction of summary fields) with:
Groovy Version: 2.4.15 JVM: 1.8.0_201 Vendor: Oracle Corporation OS: Linux
against an atlassian on demand cloud instance.
I removed your jql query as it didn't work for me but you should be able to add it back as needed.
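The manual basic auth encoding mentioned above can also be sketched in plain Java using only the JDK, for readers who would rather keep that step in a small helper. This is an illustrative sketch, not the answer's code: the class name BasicAuthHeader and the admin/admin credentials are placeholders.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper: builds the value of an HTTP "Authorization" header for basic auth. */
final class BasicAuthHeader {

    /** Returns "Basic " followed by base64("user:token"), encoded as UTF-8. */
    static String basicAuth(String user, String token) {
        String credentials = user + ":" + token;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // Placeholder credentials for illustration only.
        System.out.println(basicAuth("admin", "admin")); // prints "Basic YWRtaW46YWRtaW4="
    }
}
```

The returned string is what goes into the 'Authorization' request header, in the same way the Groovy headers map does.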
Install curl and add its location to the Windows PATH environment variable.
Please follow the link to download curl on Windows.
I would consider using the HTTP Request plugin when making HTTP requests.
Since you are using a plugin, it does not matter whether your Jenkins host runs Windows or Linux.

ElasticsearchStatusException contains unrecognized parameter: [ccs_minimize_roundtrips]]]

I am trying to do a simple search on an Elasticsearch server and am getting the following error:
ElasticsearchStatusException[Elasticsearch exception [type=illegal_argument_exception, reason=request [/recordlist1/_search] contains unrecognized parameter: [ccs_minimize_roundtrips]]]
The query String :
{"query":{"match_all":{"boost":1.0}}}
I am using :
elasticsearch-rest-high-level-client (maven artifact)
SearchRequest searchRequest = new SearchRequest(INDEX);
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.query(QueryBuilders.matchAllQuery());
searchRequest.source(searchSourceBuilder);
try
{
System.out.print(searchRequest.source());
SearchResponse response = getConnection().search(searchRequest,RequestOptions.DEFAULT);
SearchHit[] results=response.getHits().getHits();
for(SearchHit hit : results)
{
String sourceAsString = hit.getSourceAsString();
System.out.println( gson.fromJson(sourceAsString, Record.class).year);
}
}
catch(ElasticsearchException e)
{
e.getDetailedMessage();
e.printStackTrace();
}
catch (java.io.IOException ex)
{
ex.getLocalizedMessage();
ex.printStackTrace();
}
This usually occurs when porting from Elasticsearch 6.x to 7.x: a 7.x client sends the ccs_minimize_roundtrips parameter, which a 6.x server does not recognize.
You should lower the Elasticsearch client version to 6.7.1 and try running it again.
Since you are using Maven, make sure your dependencies look like this:
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>6.7.1</version>
</dependency>
I ran into this same issue when I mistakenly still had my 6.5 cluster running while using the 7.2 API. Once I started up my 7.2 cluster, the exception went away.
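Since several answers here come down to a client/server major version mismatch, a small guard that compares the two major versions before issuing requests may help catch it early. The sketch below assumes both version strings are already known (for example, the server's from the version.number field returned by a GET on the root endpoint, and the client's from the Maven dependency). VersionCheck and its methods are hypothetical names, not part of the Elasticsearch client.

```java
/** Hypothetical guard: compares the major versions of an Elasticsearch client and server. */
final class VersionCheck {

    /** Extracts the leading major number from a version string such as "7.2.0". */
    static int major(String version) {
        int dot = version.indexOf('.');
        String head = dot < 0 ? version : version.substring(0, dot);
        return Integer.parseInt(head.trim());
    }

    /** True when client and server share the same major version. */
    static boolean sameMajor(String clientVersion, String serverVersion) {
        return major(clientVersion) == major(serverVersion);
    }

    public static void main(String[] args) {
        // Version strings are illustrative; in practice they come from the build and the cluster.
        System.out.println(sameMajor("7.2.0", "6.5.4")); // prints "false"
        System.out.println(sameMajor("6.7.1", "6.5.4")); // prints "true"
    }
}
```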
The problem here is the version change: you were probably using Elasticsearch 6.x.x and are now using 7.x.x.
You can definitely solve this by running a 7.x.x Elasticsearch server.
Elasticsearch 6.x.x used to have document types (you could give a type to your documents), but from Elasticsearch 7.x.x onwards there is no type, or rather only the default type _doc, so you need to use _doc as your type when creating a mapping.
Maybe you can find this in the stack trace of the exception:
Suppressed: org.elasticsearch.client.ResponseException: method [POST], host [http://127.0.0.1:9200], URI [/recordlist1/_search?rest_total_hits_as_int=true&typed_keys=true&ignore_unavailable=false&expand_wildcards=open%2Cclosed&allow_no_indices=true&ignore_throttled=false&search_type=query_then_fetch&batched_reduce_size=512], status line [HTTP/1.1 400 Bad Request]
{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"request [/_search] contains unrecognized parameters: [ignore_throttled], [rest_total_hits_as_int]"}],"type":"illegal_argument_exception","reason":"request [/_search] contains unrecognized parameters: [ignore_throttled], [rest_total_hits_as_int]"},"status":400}
So you can try this GET request with curl, which produces the same error message:
curl -XGET 'http://127.0.0.1:9200/recordlist1/_search?rest_total_hits_as_int=true&typed_keys=true&ignore_unavailable=false&expand_wildcards=open%2Cclosed&allow_no_indices=true&ignore_throttled=false&search_type=query_then_fetch&batched_reduce_size=512'
I tried deleting 'rest_total_hits_as_int=true' ... Case closed.
You should check your ES server's version with elasticsearch -V and the client's version in Maven.
The high-level client adds rest_total_hits_as_int=true by default, and I found no way to set it to false.
You can refer to
org.elasticsearch.client.RequestConverters#addSearchRequestParams Line:395 <v6.8.10>
I had no other choice but to match the client version to the server.
Why is it so exciting?
Ehn... after all, it is "High Level".
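The diagnostic step of deleting parameters from the URL and retrying with curl can be sketched as a small Java helper that drops named parameters from a query string. This is only an aid for reproducing the request by hand. It does not change what the high-level client sends, and QueryParams is a hypothetical name.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical diagnostic helper: removes named parameters from a URL query string. */
final class QueryParams {

    /** Returns the query string without any "name=value" pair whose name is in the given set. */
    static String without(String query, Set<String> names) {
        return Arrays.stream(query.split("&"))
                .filter(pair -> !pair.isEmpty())
                // Compare only the part before the first '='.
                .filter(pair -> !names.contains(pair.split("=", 2)[0]))
                .collect(Collectors.joining("&"));
    }

    public static void main(String[] args) {
        Set<String> rejected = new HashSet<>(
                Arrays.asList("rest_total_hits_as_int", "ignore_throttled"));
        String query = "rest_total_hits_as_int=true&typed_keys=true&ignore_throttled=false";
        System.out.println(without(query, rejected)); // prints "typed_keys=true"
    }
}
```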

How do I use the DSC package resource to install MongoDB?

I tried what seemed to be the straight-forward approach, and added a Package resource in my node configuration for the MongoDB MSI. I got the following error: "Could not get the https stream for file".
Here's the package configuration I tried:
package MongoDB {
Name = "MongoDB 3.6.11 2008R2Plus SSL (64 bit)"
Path = "https://fastdl.mongodb.org/win32/mongodb-win32-x86_64-2008plus-ssl-3.6.11-signed.msi"
ProductId = "88F7AA23-BDD2-4EBE-9985-EBB5D2E23E83"
Arguments = "ADDLOCAL=`"all`" SHOULD_INSTALL_COMPASS=`"0`" INSTALLLOCATION=`"C:\MongoDB\Server\3.6`""
}
(I had $ConfigurationData references in there, but substituted for literals for simplicity)
I get the following error:
Could not get the https stream for file
Possible TLS version issue? I found that Invoke-WebRequest needed the following to get it to work with that same mongo download URL. Is there a way to do this with the package resource?
[Net.ServicePointManager]::SecurityProtocol = "tls12, tls11, tls"
Using nmap to interrogate both nodejs.org and fastdl.mongodb.org (which is actually on cloudfront) it was indeed true that TLS support differed. Node still supports TLS version 1.0, which so happens to work with PowerShell. But MongoDB's site only supports TLS versions 1.1 or 1.2.
As I mentioned in my question, I suspected that setting the .NET security protocol would work, and indeed it does. There's no way to add arbitrary script to the DSC package resource, so I needed to make a script block just to run this code, and have the package resource depend on it.
This is what I got to work:
Node $AllNodes.Where{$_.Role -contains 'MongoDBServer'}.NodeName {
Script SetTLS {
GetScript = { @{ Result = $true } }
SetScript = { [Net.ServicePointManager]::SecurityProtocol = "tls12, tls11, tls" }
TestScript = { $false } #Always run
}
package MongoDB {
Ensure = 'Present'
Name = 'MongoDB 3.6.11 2008R2Plus SSL (64 bit)'
Path = 'https://fastdl.mongodb.org/win32/mongodb-win32-x86_64-2008plus-ssl-3.6.11-signed.msi'
ProductId = ''
Arguments = 'ADDLOCAL="all" SHOULD_INSTALL_COMPASS="0" INSTALLLOCATION="C:\MongoDB\Server\3.6"'
DependsOn = '[Script]SetTLS'
}
...

Search Guard With Spring Data ES

I followed the steps mentioned here to secure my local ES installation using Search Guard (no tag exists for it on SO). Now it is reachable via Postman only through basic authentication, with the default username/password admin/admin.
Now, I need to allow my Spring Data ES project to be able to access this ES installation.
I tried:
Settings esSettings = Settings.settingsBuilder()
.put("path.home", ".")
.put("cluster.name", clusterName)
.put("searchguard.ssl.transport.enabled", true)
.put("searchguard.ssl.transport.keystore_filepath", "kirk-keystore.jks")
.put("searchguard.ssl.transport.truststore_filepath", "truststore.jks")
.put("searchguard.ssl.transport.enforce_hostname_verification", false)
.put("request.headers.sg.impersonate.as", "admin")
.build();
TransportClient client = TransportClient.builder().settings(esSettings)
.build().addTransportAddress(
new InetSocketTransportAddress(InetAddress.getByName(elasticsearchHost), elasticsearchPort));
client.prepareGet().putHeader("Authorization", "Basic " + Base64.encodeBase64("admin:admin".getBytes())).get();
return client;
Added header as suggested here.
But all I get is:
[elasticsearch[Meteorite][generic][T#3]] INFO org.elasticsearch.client.transport -
[Meteorite] failed to get node info for {#transport#-1}{127.0.0.1}{127.0.0.1:9300}, disconnecting...
org.elasticsearch.transport.NodeDisconnectedException: [][127.0.0.1:9300][cluster:monitor/nodes/liveness] disconnected
I need to get and post new data in ES (2.4.4).

path.home is not configured in elasticsearch

Exception in thread "main" java.lang.IllegalStateException: path.home is not configured
at org.elasticsearch.env.Environment.<init>(Environment.java:101)
at org.elasticsearch.node.internal.InternalSettingsPreparer.prepareEnvironment(InternalSettingsPreparer.java:81)
at org.elasticsearch.node.Node.<init>(Node.java:128)
at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:145)
at org.elasticsearch.node.NodeBuilder.node(NodeBuilder.java:152)
at JavaAPIMain.main(JavaAPIMain.java:43)
//adding document to elasticsearch using java
Node node = nodeBuilder().clusterName("myapplication").node();
Client client = node.client();
client.prepareIndex("kodcucom", "article", "1")
.setSource(putJsonDocument("ElasticSearch: Java",
"ElasticSeach provides Java API, thus it executes all operations " +
"asynchronously by using client object..",
new Date(),
new String[]{"elasticsearch"},
"Hüseyin Akdoğan")).execute().actionGet();
How about trying this one:
NodeBuilder.nodeBuilder()
.settings(Settings.builder()
.put("path.home", "/path/to/elasticsearch/home/dir"))
.node();
Credits: https://github.com/elastic/elasticsearch/issues/15325
Always ask Google about your error message first. There are more than 5k results for your problem.
If you are using IntelliJ or Eclipse, edit the run configuration and add the line below to your VM options:
-Des.path.home={dropwizard installation directory}
For example, on my Mac:
-Des.path.home=/Users/supreeth.vp/elasticsearch-2.3.4/bin
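A -D flag like the one above surfaces in Java as a system property. The sketch below shows one way an application might resolve a home directory from es.path.home with an explicit fallback, failing early with the same message as the exception when neither is set. PathHome and resolve are hypothetical names. The returned path would then be passed as the "path.home" setting, as in the settings-based answer above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: resolves a home directory from -Des.path.home or a fallback value. */
final class PathHome {

    /** Prefers the es.path.home system property; uses the fallback when it is unset or empty. */
    static Path resolve(String fallback) {
        String configured = System.getProperty("es.path.home");
        String chosen = (configured == null || configured.isEmpty()) ? fallback : configured;
        if (chosen == null || chosen.isEmpty()) {
            throw new IllegalStateException("path.home is not configured");
        }
        return Paths.get(chosen).toAbsolutePath().normalize();
    }

    public static void main(String[] args) {
        // Falls back to the current working directory when -Des.path.home is not given.
        System.out.println(resolve(System.getProperty("user.dir")));
    }
}
```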

Resources