I have a 4-node Elasticsearch cluster and a .NET console application that fills the cluster with data from SQL. Everything works fine as long as I keep the rate of records being added (or deleted) fairly low. If I increase the number of threads, I eventually see timeout errors from my console app. The cluster has a total of 48 cores, and the average time it takes to index a record is about 0.1 seconds.
I have been able to get it to do about 7000 records (documents) per second. I never see any exceptions thrown from Elasticsearch.Net that indicate low resources. I never see any of the indexing queues overloaded. The servers never peak at more than about 10% CPU. It looks like the issue is not the cluster or its configuration but something in the NEST connection. Here is my code for the connection:
//set up the es client
Uri node = new Uri(ConfigurationManager.AppSettings["ESConnectionString"]);
var connectionPool = new SniffingConnectionPool(new[] { node });
ConnectionSettings settings = new ConnectionSettings(connectionPool);
settings.SetDefaultPropertyNameInferrer(p => p); //ditch the camelcase
settings.SniffOnConnectionFault(true);
settings.SniffOnStartup(true);
settings.SniffLifeSpan(TimeSpan.FromMinutes(1));
settings.SetPingTimeout(3000);
settings.SetTimeout(5000);
settings.MaximumRetries(5);
//settings.SetMaximumAsyncConnections(20);
settings.SetDefaultIndex("dummyindex");
settings.SetBasicAuthentication(ConfigurationManager.AppSettings["ESUser"], ConfigurationManager.AppSettings["ESPass"]);
ElasticClient client = new ElasticClient(settings);
I have the cluster set up with http.basic authentication, but I have tried with it turned on and off and there is no difference.
Here are some of the pertinent settings from the ES nodes:
discovery.zen.minimum_master_nodes: 2
discovery.zen.fd.ping_timeout: 30s
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["CACHE01","CACHE02","CACHE03","CACHE04"]
cluster.routing.allocation.node_concurrent_recoveries: 5
indices.recovery.max_bytes_per_sec: 50mb
http.basic.enabled: true
http.basic.user: "admin"
http.basic.password: "XXXXXXX"
At this point I can't figure out whether the issue is the .NET client or the servers. Everything points to the client, but I'm at a loss for what to try next.
I don't think I can use the Bulk API because I'm essentially just replicating changes from a SQL Server, and in order to keep them in sync I execute each change as soon as it's received.
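For what it's worth, replicating changes one at a time and using the Bulk API are not necessarily exclusive: changes can be buffered for a very short interval and sent in the order received. Below is a minimal sketch of how such a batch could be serialized into the newline-delimited body that the `_bulk` endpoint expects. The `(op, doc_id, doc)` tuple shape and the function name are hypothetical, and older Elasticsearch versions also expect a `_type` in each action line, so treat this as an illustration rather than a drop-in solution.

```python
import json

def build_bulk_body(changes, index):
    """Serialize ordered change events into a _bulk NDJSON request body.

    `changes` is a list of (op, doc_id, doc) tuples (a hypothetical shape
    for the rows replicated from SQL); op is "index", "update" or "delete".
    """
    lines = []
    for op, doc_id, doc in changes:
        if op not in ("index", "update", "delete"):
            raise ValueError("unsupported operation: %r" % (op,))
        # Action line: which operation, on which index and document id.
        lines.append(json.dumps({op: {"_index": index, "_id": doc_id}}))
        if op == "index":
            lines.append(json.dumps(doc))           # full document source
        elif op == "update":
            lines.append(json.dumps({"doc": doc}))  # partial update
        # "delete" has no source line.
    # The bulk body must end with a trailing newline.
    return "\n".join(lines) + "\n"
```

The order of `changes` is preserved in the body, so a burst of inserts, updates and deletes goes out as one request instead of one HTTP round trip per change.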
It seems that when I'm inserting new documents I can go at a much faster pace than when updating. I have read the update docs, and they almost read as if partial updates are better than full updates, but then there is the whole get-update-delete-reindex cycle that seems to happen with every update.
According to the es docs I'm not supposed to tweak the thread pools or the performance settings. I don't think I'm hitting any of those limits anyway. The ES error logs don't indicate any issue either.
Anyone have advice on what I can do to track down the connection errors?
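A rough back-of-the-envelope check on the numbers quoted above (about 0.1 seconds per document, about 7000 documents per second) may help narrow this down. By Little's law, the number of requests in flight equals throughput times latency. The sketch below only does that arithmetic; the function names are made up for illustration, and whether a client-side connection cap (in .NET Framework, `ServicePointManager.DefaultConnectionLimit` is one such cap) is actually the bottleneck here is an assumption, not something the question confirms.

```python
def required_in_flight(throughput_per_s, latency_s):
    """Little's law: concurrent requests needed to sustain a throughput."""
    return throughput_per_s * latency_s

def max_throughput(in_flight, latency_s):
    """Upper bound on requests per second for a fixed concurrency limit."""
    return in_flight / latency_s

if __name__ == "__main__":
    # Figures taken from the question: 7000 docs/s at roughly 0.1 s each.
    print(required_in_flight(7000, 0.1))
    # Hypothetical cap of 20 concurrent connections, same latency.
    print(max_throughput(20, 0.1))
```

If the client cannot keep several hundred single-document requests in flight, requests queue on the client side and can time out there even though the servers look idle, which would be consistent with the low CPU figures.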
UPDATE:
This is the actual error:
Error: Unexpected result (SaveToES). Elasticsearch.Net.Exceptions.MaxRetryException: Sniffing known nodes in the cluster caused a maxretry exception of its own ---> Elasticsearch.Net.Exceptions.SniffException: Sniffing known nodes in the cluster caused a maxretry exception of its own ---> Elasticsearch.Net.Exceptions.MaxRetryException: Retry timeout 00:00:05 was hit after retrying 1 times: 'GET _nodes/_all/clear?timeout=3000'.
InnerException: WebException, InnerMessage: The operation has timed out, InnerStackTrace: at System.Net.HttpWebRequest.GetResponse()
at Elasticsearch.Net.Connection.HttpConnection.DoSynchronousRequest(HttpWebRequest request, Byte[] data, IRequestConfiguration requestSpecificConfig)
InnerException: WebException, InnerMessage: The operation has timed out, InnerStackTrace: at System.Net.HttpWebRequest.GetResponse()
at Elasticsearch.Net.Connection.HttpConnection.DoSynchronousRequest(HttpWebRequest request, Byte[] data, IRequestConfiguration requestSpecificConfig) ---> System.AggregateException: One or more errors occurred. ---> System.Net.WebException: The operation has timed out
at System.Net.HttpWebRequest.GetResponse()
at Elasticsearch.Net.Connection.HttpConnection.DoSynchronousRequest(HttpWebRequest request, Byte[] data, IRequestConfiguration requestSpecificConfig)
--- End of inner exception stack trace ---
--- End of inner exception stack trace ---
at Elasticsearch.Net.Connection.RequestHandlers.RequestHandlerBase.ThrowMaxRetryExceptionWhenNeeded[T](TransportRequestState1 requestState, Int32 maxRetries)
at Elasticsearch.Net.Connection.RequestHandlers.RequestHandler.RetryRequest[T](TransportRequestState1 requestState)
at Elasticsearch.Net.Connection.RequestHandlers.RequestHandler.DoRequest[T](TransportRequestState1 requestState)
at Elasticsearch.Net.Connection.RequestHandlers.RequestHandler.RetryRequest[T](TransportRequestState1 requestState)
at Elasticsearch.Net.Connection.RequestHandlers.RequestHandler.DoRequest[T](TransportRequestState1 requestState)
at Elasticsearch.Net.Connection.RequestHandlers.RequestHandler.Request[T](TransportRequestState1 requestState, Object data)
at Elasticsearch.Net.Connection.Transport.Elasticsearch.Net.Connection.ITransportDelegator.Sniff(ITransportRequestState ownerState)
--- End of inner exception stack trace ---
--- End of inner exception stack trace ---
at Elasticsearch.Net.Connection.Transport.Elasticsearch.Net.Connection.ITransportDelegator.Sniff(ITransportRequestState ownerState)
at Elasticsearch.Net.Connection.Transport.Elasticsearch.Net.Connection.ITransportDelegator.SniffClusterState(ITransportRequestState requestState)
at Elasticsearch.Net.Connection.Transport.Elasticsearch.Net.Connection.ITransportDelegator.SniffOnConnectionFailure(ITransportRequestState requestState)
at Elasticsearch.Net.Connection.RequestHandlers.RequestHandler.RetryRequest[T](TransportRequestState1 requestState)
at Elasticsearch.Net.Connection.RequestHandlers.RequestHandler.DoRequest[T](TransportRequestState1 requestState)
at Elasticsearch.Net.Connection.RequestHandlers.RequestHandler.Request[T](TransportRequestState1 requestState, Object data)
at Elasticsearch.Net.Connection.Transport.DoRequest[T](String method, String path, Object data, IRequestParameters requestParameters)
at Elasticsearch.Net.ElasticsearchClient.DoRequest[T](String method, String path, Object data, IRequestParameters requestParameters)
at Elasticsearch.Net.ElasticsearchClient.IndicesCreatePost[T](String index, Object body, Func2 requestParameters)
at Nest.RawDispatch.IndicesCreateDispatch[T](ElasticsearchPathInfo1 pathInfo, Object body)
at Nest.ElasticClient.<CreateIndex>b__281_0(ElasticsearchPathInfo1 p, ICreateIndexRequest d)
at Nest.ElasticClient.Nest.IHighLevelToLowLevelDispatcher.Dispatch[D,Q,R](D descriptor, Func3 dispatch)
at Nest.ElasticClient.CreateIndex(Func2 createIndexSelector)
at DCSCache.esvRepository.CreateIndex(String IndexName, String IndexVersion)
at DCSCache.esvRepository.Save(esv ItemToSave, String IndexName, String IndexVersion)
I have a Spring Boot application which uses the Azure SDK. I want to set the retry count for authentication to just one, since it currently uses the default value of 3 and I want the exception for incorrect credentials to be thrown without much delay.
com.azure.core.http.policy.RetryPolicy : Retry attempts have been exhausted after 3 attempts.
I tried debugging and found this: https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/resourcemanager/docs/AUTH.md but the retry policy there only specifies how long to wait before retrying, not how many times to retry. Checking further, RetryPolicy creates a new ExponentialBackoff instance, and there I see this comment:
Creates an instance of ExponentialBackoff with a maximum number of retry attempts configured by the environment property Configuration.PROPERTY_AZURE_REQUEST_RETRY_COUNT, or three if it isn't configured or is less than or equal to 0. This strategy starts with a delay of 800 milliseconds and exponentially increases with each additional retry attempt to a maximum of 8 seconds.
At this point, I'm not sure how to proceed. Can someone point out how to set the retry count for this particular method only?
public AzureResourceManager getAzureResourceManagerClient(String clientId, String clientSecret, String tenantId,
String subscriptionId) {
AzureProfile profile = new AzureProfile(tenantId, subscriptionId, AzureEnvironment.AZURE);
TokenCredential clientSecretCredential = new ClientSecretCredentialBuilder()
.clientId(clientId)
.clientSecret(clientSecret)
.tenantId(tenantId)
.authorityHost(profile.getEnvironment().getActiveDirectoryEndpoint())
.build();
return AzureResourceManager.configure()
.authenticate(clientSecretCredential, profile)
.withSubscription(subscriptionId);
}
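To make the quoted comment concrete, here is a small sketch of the schedule it describes: the retry count comes from an environment property and falls back to three when unset or not positive, and the delay starts at 800 milliseconds and grows up to 8 seconds. This is a model of the description only, not the SDK's code. The environment variable name `AZURE_REQUEST_RETRY_COUNT`, the doubling factor, and the absence of jitter are assumptions on my part.

```python
import os

def max_retries_from_env(env=None, default=3):
    """Retry count from the environment; `default` if unset, invalid or <= 0."""
    env = os.environ if env is None else env
    raw = env.get("AZURE_REQUEST_RETRY_COUNT")  # assumed property name
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return default
    return value if value > 0 else default

def backoff_delays(max_retries, base_s=0.8, cap_s=8.0):
    """Delay before each retry: starts at base_s, doubles, capped at cap_s."""
    return [min(base_s * 2 ** attempt, cap_s) for attempt in range(max_retries)]

if __name__ == "__main__":
    retries = max_retries_from_env({})  # nothing configured
    print(retries, backoff_delays(retries))
```

Under these assumptions the default of three retries adds several seconds of waiting before the failure surfaces, which is why lowering the count for the credential check shortens the time to the exception.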
I randomly get RangeError: Maximum call stack size exceeded when sending data to connected client(s) via Socket.IO rooms. A few forum threads (for example, "Node.js + Socket.io Maximum call stack size exceeded") state that this exception may occur when the data object contains a self-referencing array, but in my code I get the exception both with plain string data and with a data object.
Below are sample code snippets.
Sending plain text
socket.emit('STATUS','OK');
Stack Trace
Error in SendPlainText : RangeError: Maximum call stack size exceeded
at TLSSocket.Socket._writeGeneric (net.js:1:1)
at TLSSocket.Socket._write (net.js:783:8)
at doWrite (_stream_writable.js:397:12)
at writeOrBuffer (_stream_writable.js:383:5)
at TLSSocket.Writable.write (_stream_writable.js:290:11)
at TLSSocket.Socket.write (net.js:707:40)
at Sender.sendFrame (/node_v0_10_36/node_modules/ws/lib/Sender.js:390:20)
at Sender.send (/node_v0_10_36/node_modules/ws/lib/Sender.js:312:12)
at WebSocket.send (/node_v0_10_36/node_modules/ws/lib/WebSocket.js:377:18)
at send (/node_v0_10_36/node_modules/engine.io/lib/transports/websocket.js:114:17)
Sending object data
var clients = socketio.sockets.adapter.rooms['ROOMID'];
if(clients != undefined && clients != null)
{
console.log('Sending data to client');
socketio.sockets.in('ROOMID').emit('DATA', data);
}
Stack Trace
Error in SendData : RangeError: Maximum call stack size exceeded
at /node_v0_10_36/node_modules/engine.io-parser/lib/index.js:236:12
at proxy (/node_v0_10_36/node_modules/after/index.js:23:13)
at /node_v0_10_36/node_modules/engine.io-parser/lib/index.js:255:7
at /node_v0_10_36/node_modules/engine.io-parser/lib/index.js:231:7
at Object.exports.encodePacket (/node_v0_10_36/node_modules/engine.io-parser/lib/index.js:79:10)
at encodeOne (/node_v0_10_36/node_modules/engine.io-parser/lib/index.js:230:13)
at map (/node_v0_10_36/node_modules/engine.io-parser/lib/index.js:253:5)
at Object.exports.encodePayload (/node_v0_10_36/node_modules/engine.io-parser/lib/index.js:235:3)
at XHR.Polling.send (/node_v0_10_36/node_modules/engine.io/lib/transports/polling.js:246:10)
at Socket.flush (/node_v0_10_36/node_modules/engine.io/lib/socket.js:431:20)
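One way to rule the self-referencing explanation in or out is to walk the payload before emitting it and check whether any container is reachable from itself. The sketch below shows the idea in Python on nested dicts and lists; the same traversal can be written in JavaScript against plain objects and arrays. The function name is hypothetical, and this check does not explain the failure on a plain string such as 'OK', which by definition cannot contain a cycle.

```python
def has_cycle(value, _ancestors=None):
    """Return True if a dict/list/tuple structure contains itself somewhere."""
    ancestors = set() if _ancestors is None else _ancestors
    if isinstance(value, dict):
        children = value.values()
    elif isinstance(value, (list, tuple)):
        children = value
    else:
        return False  # scalars (strings, numbers, None) cannot form a cycle
    if id(value) in ancestors:
        return True   # this container is already on the current path
    ancestors.add(id(value))
    found = any(has_cycle(child, ancestors) for child in children)
    ancestors.discard(id(value))  # leaving this branch of the walk
    return found
```

Only containers on the current path count as a cycle, so an object that is merely referenced twice (shared but not nested inside itself) is not flagged.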
I'm writing a Kafka Streams 2.3.0 application to count the number of events in a session window and, hopefully, to print only the final record when a session times out.
Serde<String> stringSerde = Serdes.serdeFrom(new StringSerializer(), new StringDeserializer());
Serde<MuseObject> museObjectSerde = Serdes.serdeFrom(new MuseObjectSerializer(), new MuseObjectDeserializer());
StreamsBuilder builder = new StreamsBuilder();
builder
.stream(INPUT_TOPIC, Consumed.with(stringSerde, museObjectSerde))
.map((key, value) -> {
return KeyValue.pair(value.getSourceValue("vid"), value.toString());
})
.groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
.windowedBy(SessionWindows.with(Duration.ofSeconds(INACTIVITY_GAP)).grace(Duration.ZERO))
.count(Materialized.with(Serdes.String(), Serdes.Long()))
.suppress(Suppressed.untilWindowCloses(Suppressed.BufferConfig.unbounded()))
.toStream()
.print(Printed.toSysOut());
However the application crashes when a session times out:
12:35:03.859 [kafka-producer-network-thread | kafka-streams-test-kgu-4c3f2398-8f67-429d-82ce-6062c86af466-StreamThread-1-producer] ERROR o.a.k.s.p.i.RecordCollectorImpl - task [1_0] Error sending record to topic kafka-streams-test-kgu-KTABLE-SUPPRESS-STATE-STORE-0000000008-changelog due to The server experienced an unexpected error when processing the request.; No more records will be sent and no more offsets will be recorded for this task. Enable TRACE logging to view failed record key and value.
org.apache.kafka.common.errors.UnknownServerException: The server experienced an unexpected error when processing the request.
12:35:03.862 [kafka-streams-test-kgu-4c3f2398-8f67-429d-82ce-6062c86af466-StreamThread-1] ERROR o.a.k.s.p.i.AssignedStreamsTasks - stream-thread [kafka-streams-test-kgu-4c3f2398-8f67-429d-82ce-6062c86af466-StreamThread-1] Failed to commit stream task 1_0 due to the following error:
org.apache.kafka.streams.errors.StreamsException: task [1_0] Abort sending since an error caught with a previous record (key user01\x00\x00\x01m!\xCE\x99u\x00\x00\x01m!\xCE\x80\xD1 value null timestamp null) to topic kafka-streams-test-kgu-KTABLE-SUPPRESS-STATE-STORE-0000000008-changelog due to org.apache.kafka.common.errors.UnknownServerException: The server experienced an unexpected error when processing the request.
at org.apache.kafka.streams.processor.internals.RecordCollectorImpl.recordSendError(RecordCollectorImpl.java:138)
I've tried commenting out the ".suppress..." line. It works fine without suppress() and prints something like this:
[KSTREAM-FILTER-0000000011]: [user01#1568230244561/1568230250869], MuseSession{vid='user01', es='txnSuccess', count=6, start=2019-06-26 17:11:02.937, end=2019-06-26 18:07:10.685, sessionType='open'}
What did I miss in using suppress()? Is there another way to keep only the session records that have timed out?
Any help is appreciated. Thanks in advance.
suppress() requires at least broker version 0.11.0 and message format 0.11.
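For readers trying to picture what the topology in the question is meant to emit, here is a simplified simulation of session-window counting with final-result-only output. It is a plain Python model, not Kafka Streams code. The rules it uses are assumptions made for illustration: events within the inactivity gap of a session are merged into it, stream time is the highest timestamp seen so far, and a session is emitted once stream time has passed its end plus the gap plus the grace period.

```python
def final_session_counts(events, gap, grace=0):
    """Simulate per-key session counts, emitting each session once it closes.

    `events` is a list of (key, timestamp) pairs in arrival order.
    Returns a list of (key, start, end, count) tuples in emission order.
    """
    open_sessions = {}            # key -> list of [start, end, count]
    emitted = []
    stream_time = float("-inf")
    for key, ts in events:
        stream_time = max(stream_time, ts)
        if ts < stream_time - gap - grace:
            continue              # too late: its session has already closed
        # Merge the event with every open session within the inactivity gap.
        start, end, count = ts, ts, 1
        remaining = []
        for s in open_sessions.get(key, []):
            if s[0] - gap <= ts <= s[1] + gap:
                start, end, count = min(start, s[0]), max(end, s[1]), count + s[2]
            else:
                remaining.append(s)
        remaining.append([start, end, count])
        open_sessions[key] = remaining
        # Emit every session, for any key, that stream time has passed.
        for k, sessions in open_sessions.items():
            still_open = []
            for s in sessions:
                if s[1] + gap + grace < stream_time:
                    emitted.append((k, s[0], s[1], s[2]))
                else:
                    still_open.append(s)
            open_sessions[k] = still_open
    return emitted
```

Note that in this model, as in Kafka Streams, stream time only advances when new records arrive, so the last session of a key is not emitted until some later record pushes stream time past its close.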
I am trying to bulk insert data from SQL into an Elasticsearch index. Below is the code I am using; the total number of records is around 1.5 million. I think it has something to do with the connection settings, but I am not able to figure it out. Can someone please help with this code or suggest a better way to do it?
public void InsertReceipts()
{
    IEnumerable<Receipts> receipts = GetFromDB(); // get receipts from SQL DB
    const string index = "receipts";
    var config = ConfigurationManager.AppSettings["ElasticSearchUri"];
    var node = new Uri(config);
    var settings = new ConnectionSettings(node).RequestTimeout(TimeSpan.FromMinutes(30));
    var client = new ElasticClient(settings);
    foreach (var receiptBatch in receipts.Batch(20000)) // using MoreLinq for Batch
    {
        // One descriptor per batch. BulkDescriptor is not thread-safe, so the
        // operations are added sequentially instead of with Parallel.ForEach.
        var bulkIndexer = new BulkDescriptor();
        foreach (var receipt in receiptBatch)
        {
            bulkIndexer.Index<OfficeReceipt>(i => i
                .Document(receipt)
                .Id(receipt.TransactionGuid)
                .Index(index));
        }
        var response = client.Bulk(bulkIndexer);
        if (!response.IsValid)
        {
            // ServerError is null when the request never reached the server.
            _logger.LogError(response.DebugInformation);
        }
    }
}
The code works fine but takes around 10 minutes to complete. When I try to increase the batch size, it fails with the error below:
Invalid NEST response built from a unsuccessful low level call on
POST: /_bulk
Invalid Bulk items: OriginalException: System.Net.WebException: The
underlying connection was closed: An unexpected error occurred on a
send. ---> System.IO.IOException: Unable to write data to the
transport connection: An existing connection was forcibly closed by
the remote host. ---> System.Net.Sockets.SocketException: An existing
connection was forcibly closed by the remote host
A good place to start is with batches of 1,000 to 5,000 documents or, if your documents are very large, with even smaller batches.
It is often useful to keep an eye on the physical size of your bulk requests. One thousand 1KB documents is very different from one thousand 1MB documents. A good bulk size to start playing with is around 5-15MB in size.
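The two rules of thumb above (a document-count limit and a physical-size limit) can be combined in one batching helper. Here is a minimal sketch, assuming documents are JSON-serializable dicts; the function name and default limits are illustrative only, with the defaults taken from the low end of the ranges quoted above.

```python
import json

def batches_by_size(docs, max_docs=5000, max_bytes=5 * 1024 * 1024):
    """Yield lists of docs, closing a batch at max_docs or max_bytes."""
    batch, batch_bytes = [], 0
    for doc in docs:
        size = len(json.dumps(doc).encode("utf-8"))  # serialized size in bytes
        # Close the current batch if adding this doc would break either limit.
        if batch and (len(batch) >= max_docs or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(doc)
        batch_bytes += size
    if batch:
        yield batch  # final, possibly smaller, batch
```

A single document larger than `max_bytes` still goes out in a batch of its own rather than being dropped, so the byte limit is a target and not a hard guarantee.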
I had a similar problem. It was solved by adding the following line before the ElasticClient connection is established:
System.Net.ServicePointManager.Expect100Continue = false;
var settings = new ConnectionSettings(Constants.ElasticSearch.Node);
var client = new ElasticClient(settings);
var response = client.Search<DtoTypes.Customer.SearchResult>(s =>
s.From(0)
.Size(100000)
.Query(q => q.MatchAll()));
It works when the size is smaller, but I want to retrieve all documents in an index that has over 100k documents. There must be a configuration setting I'm missing to get around a limit. I've also tried Take() instead of Size().
The Debug Info returned back is
"Invalid NEST response built from a unsuccesful low level call on
POST: /_search\r\n# Audit trail of this API call:\r\n - BadResponse:
Node: http://127.0.0.1:9200/ Took: 00:00:00.2964038\r\n# ServerError:
ServerError: 500Type: search_phase_execution_exception Reason: \"all
shards failed\"\r\n# OriginalException: System.Net.WebException: The
remote server returned an error: (500) Internal Server Error.\r\n at
System.Net.HttpWebRequest.GetResponse()\r\n at
Elasticsearch.Net.HttpConnection.Request[TReturn](RequestData
requestData) in
C:\users\russ\source\elasticsearch-net\src\Elasticsearch.Net\Connection\HttpConnection.cs:line
138\r\n# Request:\r\n\r\n#
Response:\r\n\r\n"
Elasticsearch has a soft limit on the number of results it will return. If you want more than 10,000 results in one go, you should use the scan and scroll functionality :)
From the Elasticsearch documentation:
"Note that from + size can not be more than the
index.max_result_window index setting which defaults to 10,000. See
the Scroll API for more efficient ways to do deep scrolling."
Reference:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-from-size.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html
https://nest.azurewebsites.net/nest/search/scroll.html
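To show the shape of the scroll approach the answer points to, here is a minimal, client-agnostic sketch. The `send(method, path, body)` callable is a hypothetical HTTP helper that returns the parsed JSON response; it is not part of NEST or any Elasticsearch client. The request paths and bodies follow the Scroll API as documented for Elasticsearch 2.x and later, and older versions pass the scroll id differently, so check the docs for your version.

```python
def fits_result_window(from_, size, max_result_window=10000):
    """True if a plain from/size search stays within index.max_result_window."""
    return from_ + size <= max_result_window

def scroll_all(send, index, page_size=1000, keep_alive="1m"):
    """Yield every hit in `index` by following the scroll cursor page by page.

    `send(method, path, body)` is a hypothetical helper that performs the
    HTTP request and returns the response as a dict.
    """
    response = send("POST", "/%s/_search?scroll=%s" % (index, keep_alive),
                    {"size": page_size, "query": {"match_all": {}}})
    while True:
        hits = response["hits"]["hits"]
        if not hits:
            break  # an empty page means the scroll is exhausted
        for hit in hits:
            yield hit
        # Ask for the next page using the scroll id from the last response.
        response = send("POST", "/_search/scroll",
                        {"scroll": keep_alive, "scroll_id": response["_scroll_id"]})
```

With the numbers from the question, `fits_result_window(0, 100000)` is false under the default window of 10,000, which matches the failing search; paging with a cursor keeps each individual request small instead.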