Api Error: TSocket read 0 bytes when using Hue with HBase - hadoop

Here are my settings in the Hue config (hue.ini):
[hbase]
# Comma-separated list of HBase Thrift servers for
# clusters in the format of '(name|host:port)'.
hbase_clusters=(Cluster|MasterIP:ThriftPort)
# Hard limit of rows or columns per row fetched before truncating.
## truncate_limit = 500
But when I connect to the Hue web page and switch to the HBase tab, it shows the following.
Log:
[08/Dec/2013 19:30:13 +0000] middleware INFO Processing exception: Api Error: TSocket read 0 bytes: Traceback (most recent call last):
File "/home/ubuntu/workspaces/hue/hue-master/build/env/lib/python2.7/site-packages/Django-1.4.5-py2.7.egg/django/core/handlers/base.py", line 111, in get_response
response = callback(request, *callback_args, **callback_kwargs)
File "/home/ubuntu/workspaces/hue/hue-master/apps/hbase/src/hbase/views.py", line 69, in api_router
return api_dump(HbaseApi().query(*url_params))
File "/home/ubuntu/workspaces/hue/hue-master/apps/hbase/src/hbase/api.py", line 47, in query
raise PopupException(_("Api Error: %s") % e.message)
PopupException: Api Error: TSocket read 0 bytes
[08/Dec/2013 19:30:13 +0000] thrift_util INFO Thrift saw a transport exception: TSocket read 0 bytes
[08/Dec/2013 19:30:13 +0000] thrift_util WARNING Out of retries for thrift call: getTableNames
[08/Dec/2013 19:30:13 +0000] thrift_util DEBUG Thrift call: hbased.Hbase.Client.getTableNames(args=(), kwargs={})
[08/Dec/2013 19:30:13 +0000] thrift_util INFO Thrift exception; retrying: TSocket read 0 bytes
[08/Dec/2013 19:30:13 +0000] thrift_util DEBUG Thrift call: hbased.Hbase.Client.getTableNames(args=(), kwargs={})
[08/Dec/2013 19:30:13 +0000] thrift_util INFO Thrift exception; retrying: TSocket read 0 bytes
[08/Dec/2013 19:30:13 +0000] thrift_util DEBUG Thrift call: hbased.Hbase.Client.getTableNames(args=(), kwargs={})
[08/Dec/2013 19:30:13 +0000] access INFO 118.163.58.205 tracy - "POST /hbase/api/getTableList/Cluster HTTP/1.1"

Are the HBase Master and the Thrift server still up? (Do you have their logs?)
Also, are you using security, and which HBase version are you using?

Could you check the local times on the instances?
I had the same issue, and it is fixed now after setting the same timezone on all instances and aligning the Hue timezone with it.
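For example (a sketch, not from the original answer; the timezone value is only a placeholder): compare the output of date on each instance, and keep Hue's own timezone consistent via the [desktop] section of hue.ini:
[desktop]
  # Hue's time zone; keep it in line with the OS timezone on every node
  time_zone=America/Los_Angeles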

Add this to your hbase-site.xml:
<property>
  <name>hbase.thrift.support.proxyuser</name>
  <value>true</value>
</property>
<property>
  <name>hbase.regionserver.thrift.http</name>
  <value>true</value>
</property>
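Independently of the Hue configuration, it can help to confirm that the Thrift server is reachable at all. A minimal sketch, assuming the happybase Python package is installed and substituting the MasterIP and ThriftPort from hue.ini (9090 is the Thrift default):
# Hypothetical connectivity check against the HBase Thrift server.
# Replace 'MasterIP' and 9090 with the host and port configured in hue.ini.
import happybase

connection = happybase.Connection('MasterIP', port=9090)
print(connection.tables())  # wraps the same getTableNames call that Hue keeps retrying
If this also fails with a transport error, the problem is on the Thrift server side rather than in Hue.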

Related

Telegram bot Python error in webhooks on Heroku

I deployed a bot on Heroku, but it gives an error:
2021-04-03T02:56:36.610385+00:00 app[web.1]: 2021-04-03 02:56:36,604 - apscheduler.scheduler - INFO - Scheduler started
2021-04-03T02:56:37.316107+00:00 app[web.1]: 2021-04-03 02:56:37,315 - telegram.ext.updater - ERROR - Error while bootstrap set webhook: Bad webhook: webhook can be set up only on ports 80, 88, 443 or 8443
2021-04-03T02:56:37.316196+00:00 app[web.1]: 2021-04-03 02:56:37,316 - telegram.ext.updater - ERROR - Failed bootstrap phase after 0 retries (Bad webhook: webhook can be set up only on ports 80, 88, 443 or 8443)
2021-04-03T02:47:04.325287+00:00 app[web.1]: 2021-04-03 02:47:04,317 - telegram.ext.updater - ERROR - unhandled exception in Bot:1718309867:updater
Well, I tried all the ports (80, 88, 8443), but it gives me the same error again.
My webhook and port code:
# change PORT for Heroku
PORT = int(os.environ.get('PORT', '8443'))
updater = Updater(SECRET_KEY, use_context=True)
# hook to Heroku
updater.start_webhook(listen="0.0.0.0",
                      port=PORT,
                      url_path=SECRET_KEY)
updater.bot.set_webhook('https://myapp.herokuapp.com/' + SECRET_KEY)
PS: On localhost it works very well.
Given the date of this question, I assume that you're using PTB v13.4+. In that case, please see this channel post, i.e. change
updater.start_webhook(listen="0.0.0.0",
                      port=PORT,
                      url_path=SECRET_KEY)
updater.bot.set_webhook('https://myapp.herokuapp.com/' + SECRET_KEY)
to
updater.start_webhook(listen="0.0.0.0",
                      port=PORT,
                      url_path=SECRET_KEY,
                      webhook_url='https://myapp.herokuapp.com/' + SECRET_KEY)
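For completeness, a minimal self-contained sketch of the v13.4+ pattern. It keeps the question's SECRET_KEY name and app URL; reading the token from an environment variable and the /start handler are illustrative assumptions, not part of the original code:
import os
from telegram.ext import Updater, CommandHandler

SECRET_KEY = os.environ['SECRET_KEY']  # bot token (assumed to be set as a Heroku config var)
PORT = int(os.environ.get('PORT', '8443'))  # Heroku injects PORT at runtime

def start(update, context):
    update.message.reply_text('Bot is up')

updater = Updater(SECRET_KEY, use_context=True)
updater.dispatcher.add_handler(CommandHandler('start', start))

# In v13.4+, start_webhook sets the webhook itself when webhook_url is given,
# so no separate bot.set_webhook(...) call is needed.
updater.start_webhook(listen='0.0.0.0',
                      port=PORT,
                      url_path=SECRET_KEY,
                      webhook_url='https://myapp.herokuapp.com/' + SECRET_KEY)
updater.idle()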

Is the S3NativeFileSystem call killing my PySpark application on AWS EMR 4.6.0?

My Spark application is failing when it has to access numerous CSV files (~1000 files at ~63 MB each) from S3 and pipe them into a Spark RDD. The actual process of splitting up the CSVs seems to work, but an extra call to S3NativeFileSystem seems to be causing an error and crashing the job.
To begin, the following is my PySpark Application:
from pyspark import SparkContext
sc = SparkContext("local", "Simple App")
from pyspark.sql import SQLContext
sqlContext = SQLContext(sc)
import time
startTime = float(time.time())
dataPath = 's3://PATHTODIRECTORY/'
sc._jsc.hadoopConfiguration().set("fs.s3.awsAccessKeyId", "MYKEY")
sc._jsc.hadoopConfiguration().set("fs.s3.awsSecretAccessKey", "MYSECRETKEY")
def buildSchemaDF(tableName, columnList):
currentRDD = sc.textFile(dataPath + tableName).map(lambda line: line.split("|"))
currentDF = currentRDD.toDF(columnList)
return currentDF
loadStartTime = float(time.time())
lineitemDF = buildSchemaDF('lineitem*', ['l_orderkey','l_partkey','l_suppkey','l_linenumber','l_quantity','l_extendedprice','l_discount','l_tax','l_returnflag','l_linestatus','l_shipdate','l_commitdate','l_receiptdate','l_shipinstruct','l_shipmode','l_comment'])
lineitemDF.registerTempTable("lineitem")
loadTimeElapsed = float(time.time()) - loadStartTime
queryStartTime = float(time.time())
qstr = """
SELECT
lineitem.l_returnflag,
lineitem.l_linestatus,
sum(l_quantity) as sum_qty,
sum(l_extendedprice) as sum_base_price,
sum(l_discount) as sum_disc,
sum(l_tax) as sum_tax,
avg(l_quantity) as avg_qty,
avg(l_extendedprice) as avg_price,
avg(l_discount) as avg_disc,
count(l_orderkey) as count_order
FROM
lineitem
WHERE
l_shipdate <= '19981001'
GROUP BY
l_returnflag,
l_linestatus
ORDER BY
l_returnflag,
l_linestatus
"""
tpch1DF = sqlContext.sql(qstr)
queryTimeElapsed = float(time.time()) - queryStartTime
totalTimeElapsed = float(time.time()) - startTime
tpch1DF.show()
queryResults = [qstr, loadTimeElapsed, queryTimeElapsed, totalTimeElapsed]
distData = sc.parallelize(queryResults)
distData.saveAsTextFile(dataPath + 'queryResults.csv')
print 'Load Time: ' + str(loadTimeElapsed)
print 'Query Time: ' + str(queryTimeElapsed)
print 'Total Time: ' + str(totalTimeElapsed)
To take it step by step I start off by spinning up a Spark EMR Cluster with the following AWS CLI command (carriage returns added for readability):
aws emr create-cluster --name "Big TPCH Spark cluster2" --release-label emr-4.6.0
--applications Name=Spark --ec2-attributes KeyName=blazing-test-aws
--log-uri s3://aws-logs-132950491118-us-west-2/elasticmapreduce/j-1WZ39GFS3IX49/
--instance-type m3.2xlarge --instance-count 6 --use-default-roles
After the EMR cluster finishes provisioning, I copy my PySpark application onto the master node at '/home/hadoop/pysparkApp.py'. With it copied over, I'm able to add the step for spark-submit.
aws emr add-steps --cluster-id j-1DQJ8BDL1394N --steps
Type=Spark,Name=SparkTPCHTests,Args=[--deploy-mode,cluster,
--conf,spark.yarn.submit.waitAppCompletion=true,--num-executors,5,
--executor-cores,5,--executor-memory,20g,/home/hadoop/tpchSpark.py],
ActionOnFailure=CONTINUE
Now if I run this step over only a few of the aforementioned CSV files the final results will be generated, but the script will still claim to have failed.
I think it's associated with an extra call to S3NativeFileSystem, but I'm not certain. These are the YARN log messages that lead me to that conclusion. The first call appears to work just fine:
16/05/15 23:18:00 INFO HadoopRDD: Input split: s3://data-set-builder/splitLineItem2/lineitemad:0+64901757
16/05/15 23:18:00 INFO latency: StatusCode=[200], ServiceName=[Amazon S3], AWSRequestID=[ED8011CE4E1F6F18], ServiceEndpoint=[https://data-set-builder.s3-us-west-2.amazonaws.com], HttpClientPoolLeasedCount=0, RetryCapacityConsumed=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=2, ClientExecuteTime=[77.956], HttpRequestTime=[77.183], HttpClientReceiveResponseTime=[20.028], RequestSigningTime=[0.229], CredentialsRequestTime=[0.003], ResponseProcessingTime=[0.128], HttpClientSendRequestTime=[0.35],
While the second one does not seem to execute properly, resulting in "Partial Results" (206 Error):
16/05/15 23:18:00 INFO S3NativeFileSystem: Opening 's3://data-set-builder/splitLineItem2/lineitemad' for reading
16/05/15 23:18:00 INFO latency: StatusCode=[206], ServiceName=[Amazon S3], AWSRequestID=[10BDDE61AE13AFBE], ServiceEndpoint=[https://data-set-builder.s3.amazonaws.com], HttpClientPoolLeasedCount=0, RetryCapacityConsumed=0, RequestCount=1, HttpClientPoolPendingCount=0, HttpClientPoolAvailableCount=2, Client Execute Time=[296.86], HttpRequestTime=[295.801], HttpClientReceiveResponseTime=[293.667], RequestSigningTime=[0.204], CredentialsRequestTime=[0.002], ResponseProcessingTime=[0.34], HttpClientSendRequestTime=[0.337],
16/05/15 23:18:02 INFO ApplicationMaster: Waiting for spark context initialization ...
I'm lost as to why it's even making the second call to S3NativeFileSystem when the first one appears to have responded effectively and even split the file. Is this something that is a product of my EMR configuration? I know S3Native has file limit issues and that a straight S3 call is optimal, which is what I've tried to do, but this call seems to be there no matter what I do. Please help!
Also, here are a few other error messages from my YARN log, in case they are relevant.
1)
16/05/15 23:19:22 ERROR ApplicationMaster: SparkContext did not initialize after waiting for 100000 ms. Please check earlier log output for errors. Failing the application.
16/05/15 23:19:22 INFO ApplicationMaster: Final app status: FAILED, exitCode: 13, (reason: Timed out waiting for SparkContext.)
2)
16/05/15 23:19:22 ERROR DiskBlockObjectWriter: Uncaught exception while reverting partial writes to file /mnt/yarn/usercache/hadoop/appcache/application_1463354019776_0001/blockmgr-f847744b-c87a-442c-9135-57cae3d1f6f0/2b/temp_shuffle_3fe2e09e-f8e4-4e5d-ac96-1538bdc3b401
java.io.FileNotFoundException: /mnt/yarn/usercache/hadoop/appcache/application_1463354019776_0001/blockmgr-f847744b-c87a-442c-9135-57cae3d1f6f0/2b/temp_shuffle_3fe2e09e-f8e4-4e5d-ac96-1538bdc3b401 (No such file or directory)
at java.io.FileOutputStream.open(Native Method)
at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
at org.apache.spark.storage.DiskBlockObjectWriter.revertPartialWritesAndClose(DiskBlockObjectWriter.scala:162)
at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.stop(BypassMergeSortShuffleWriter.java:226)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
16/05/15 23:19:22 ERROR BypassMergeSortShuffleWriter: Error while deleting file /mnt/yarn/usercache/hadoop/appcache/application_1463354019776_0001/blockmgr-f847744b-c87a-442c-9135-57cae3d1f6f0/2b/temp_shuffle_3fe2e09e-f8e4-4e5d-ac96-1538bdc3b401
16/05/15 23:19:22 WARN TaskMemoryManager: leak 32.3 MB memory from org.apache.spark.unsafe.map.BytesToBytesMap#762be8fe
16/05/15 23:19:22 ERROR Executor: Managed memory leak detected; size = 33816576 bytes, TID = 14
16/05/15 23:19:22 ERROR Executor: Exception in task 13.0 in stage 1.0 (TID 14)
java.io.FileNotFoundException: /mnt/yarn/usercache/hadoop/appcache/application_1463354019776_0001/blockmgr-f847744b-c87a-442c-9135-57cae3d1f6f0/3a/temp_shuffle_b9001fca-bba9-400d-9bc4-c23c002e0aa9 (No such file or directory)
at java.io.FileOutputStream.open(Native Method)
at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
at org.apache.spark.storage.DiskBlockObjectWriter.open(DiskBlockObjectWriter.scala:88)
at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:140)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
The order of precedence for Spark configuration is:
SparkContext (code/application) > spark-submit > spark-defaults.conf
So, a couple of things to point out here:
Use YARN cluster as the deploy mode and master in your spark-submit command:
spark-submit --deploy-mode cluster --master yarn ...
OR
spark-submit --master yarn-cluster ...
Remove the "local" string from the line sc = SparkContext("local", "Simple App") in your code. Use
conf = SparkConf().setAppName(appName)
sc = SparkContext(conf=conf)
to initialize the Spark context.
Ref - http://spark.apache.org/docs/latest/programming-guide.html
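Put together, the top of the script from the question would then start roughly like this (a sketch; the app name is kept from the question and the rest of the script stays unchanged):
from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext

# Let spark-submit / YARN decide the master instead of hard-coding "local"
conf = SparkConf().setAppName("Simple App")
sc = SparkContext(conf=conf)
sqlContext = SQLContext(sc)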

elasticsearch nest SniffingConnectionPool not working

I'm using Nest.ElasticClient to connect to an Elasticsearch cluster. The cluster is located in an Azure VM with just one node.
The cluster is accessible from outside the VM at http://xxxx.cloudapp.net:9200, and it is also reachable by ElasticClient when not using SniffingConnectionPool, but it is not reachable by ElasticClient when using SniffingConnectionPool.
Here is the network config
network.host: [_local_, _site_]
Below is the source code I'm using to build the client and check whether an index exists.
var pool = new SniffingConnectionPool(urls.Select(url => new Uri(url)));
ConnectionSettings config = new ConnectionSettings(pool);
client = new Nest.ElasticClient(config);
IExistsResponse indexExistsResponse = client.IndexExists(indexName);
Here is the debug message when I use the client to check whether an index exists (the hostname and IP address are redacted):
Invalid NEST response built from a unsuccessful low level call on HEAD: /globalleads
# Audit trail of this API call:
- SniffOnStartup: Took: 00:00:00.9846171
- SniffSuccess: Node: http://xxxx.cloudapp.net:9200/ Took: 00:00:00.9595496
- PingFailure: Node: http://10.85.xxx.xx:9200/ Exception: PipelineException Took: 00:00:21.4154660
- SniffOnFail: Took: 00:00:21.1967809
- SniffFailure: Node: http://10.85.xxx.xx:9200/ Exception: PipelineException Took: 00:00:21.1787333
# OriginalException: Elasticsearch.Net.ElasticsearchClientException: One or more errors occurred. ---> System.AggregateException: One or more errors occurred. ---> Elasticsearch.Net.PipelineException: Failed sniffing cluster state. ---> System.AggregateException: One or more errors occurred. ---> Elasticsearch.Net.PipelineException: An error occurred trying to establish a connection with the specified node.
at Elasticsearch.Net.RequestPipeline.Sniff() in D:\dev\git\elasticsearch-net-2.x\src\Elasticsearch.Net\Transport\Pipeline\RequestPipeline.cs:line 326
--- End of inner exception stack trace ---
--- End of inner exception stack trace ---
at Elasticsearch.Net.RequestPipeline.Sniff() in D:\dev\git\elasticsearch-net-2.x\src\Elasticsearch.Net\Transport\Pipeline\RequestPipeline.cs:line 341
at Elasticsearch.Net.RequestPipeline.SniffOnConnectionFailure() in D:\dev\git\elasticsearch-net-2.x\src\Elasticsearch.Net\Transport\Pipeline\RequestPipeline.cs:line 301
at Elasticsearch.Net.Transport`1.Ping(IRequestPipeline pipeline, Node node) in D:\dev\git\elasticsearch-net-2.x\src\Elasticsearch.Net\Transport\Transport.cs:line 179
at Elasticsearch.Net.Transport`1.Request[TReturn](HttpMethod method, String path, PostData`1 data, IRequestParameters requestParameters) in D:\dev\git\elasticsearch-net-2.x\src\Elasticsearch.Net\Transport\Transport.cs:line 68
--- End of inner exception stack trace ---
--- End of inner exception stack trace ---
# Audit exception in step 2 PingFailure:
Elasticsearch.Net.PipelineException: An error occurred trying to establish a connection with the specified node.
at Elasticsearch.Net.RequestPipeline.Ping(Node node) in D:\dev\git\elasticsearch-net-2.x\src\Elasticsearch.Net\Transport\Pipeline\RequestPipeline.cs:line 248
# Audit exception in step 4 SniffFailure:
Elasticsearch.Net.PipelineException: An error occurred trying to establish a connection with the specified node.
at Elasticsearch.Net.RequestPipeline.Sniff() in D:\dev\git\elasticsearch-net-2.x\src\Elasticsearch.Net\Transport\Pipeline\RequestPipeline.cs:line 326
# Request:
<Request stream not captured or already read to completion by serializer. Set DisableDirectStreaming() on ConnectionSettings to force it to be set on the response.>
# Response:
<Response stream not captured or already read to completion by serializer. Set DisableDirectStreaming() on ConnectionSettings to force it to be set on the response.>
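A hint is visible in the audit trail above: the sniff succeeds against http://xxxx.cloudapp.net:9200/, but the pool then switches to the node's internal publish address (http://10.85.xxx.xx:9200/), which is not reachable from outside the Azure VM, so the subsequent ping and re-sniff fail. One possible remedy (an assumption, not a confirmed fix) is to advertise a reachable address in elasticsearch.yml, for example:
network.host: [_local_, _site_]
# advertise the public name so sniffing clients outside the VM can reach the node
network.publish_host: xxxx.cloudapp.net
Alternatively, for a single-node cluster, a StaticConnectionPool (no sniffing) avoids the problem entirely.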

Jersey + Spring Boot can't respond with the correct HTTP status [duplicate]

I've encountered the same issue as in this question, using Spring Boot 1.3.0 and not having my controllers annotated with @RestController, just @Path and @Service. As the OP in that question says,
this is, to me, anything but sensible
I also can't understand why they would have it redirect to /error. And it is very likely that I'm missing something, because I can only give back 404s or 200s to the client.
My problem is that his solution doesn't seem to work with 1.3.0, so I have the following request flow: let's say my code throws a NullPointerException. It'll be handled by one of my ExceptionMappers:
@Provider
public class GeneralExceptionMapper implements ExceptionMapper<Throwable> {

    private static final Logger LOGGER = LoggerFactory.getLogger(GeneralExceptionMapper.class);

    @Override
    public Response toResponse(Throwable exception) {
        LOGGER.error(exception.getLocalizedMessage());
        return Response.status(Response.Status.INTERNAL_SERVER_ERROR).build();
    }
}
And my code returns a 500, but instead of sending it back to the client, it tries to redirect it to /error. If I don't have another resource for that, it'll send back a 404.
2015-12-16 18:33:21.268 INFO 9708 --- [nio-8080-exec-1] o.glassfish.jersey.filter.LoggingFilter : 1 * Server has received a request on thread http-nio-8080-exec-1
1 > GET http://localhost:8080/nullpointerexception
1 > accept: */*
1 > host: localhost:8080
1 > user-agent: curl/7.45.0
2015-12-16 18:33:29.492 INFO 9708 --- [nio-8080-exec-1] o.glassfish.jersey.filter.LoggingFilter : 1 * Server responded with a response on thread http-nio-8080-exec-1
1 < 500
2015-12-16 18:33:29.540 INFO 9708 --- [nio-8080-exec-1] o.glassfish.jersey.filter.LoggingFilter : 2 * Server has received a request on thread http-nio-8080-exec-1
2 > GET http://localhost:8080/error
2 > accept: */*
2 > host: localhost:8080
2 > user-agent: curl/7.45.0
2015-12-16 18:33:37.249 INFO 9708 --- [nio-8080-exec-1] o.glassfish.jersey.filter.LoggingFilter : 2 * Server responded with a response on thread http-nio-8080-exec-1
2 < 404
And client's side (curl):
$ curl -v http://localhost:8080/nullpointerexception
* STATE: INIT => CONNECT handle 0x6000572d0; line 1090 (connection #-5000)
* Added connection 0. The cache now contains 1 members
* Trying ::1...
* STATE: CONNECT => WAITCONNECT handle 0x6000572d0; line 1143 (connection #0)
* Connected to localhost (::1) port 8080 (#0)
* STATE: WAITCONNECT => SENDPROTOCONNECT handle 0x6000572d0; line 1240 (connection #0)
* STATE: SENDPROTOCONNECT => DO handle 0x6000572d0; line 1258 (connection #0)
> GET /nullpointerexception HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.45.0
> Accept: */*
>
* STATE: DO => DO_DONE handle 0x6000572d0; line 1337 (connection #0)
* STATE: DO_DONE => WAITPERFORM handle 0x6000572d0; line 1464 (connection #0)
* STATE: WAITPERFORM => PERFORM handle 0x6000572d0; line 1474 (connection #0)
* HTTP 1.1 or later with persistent connection, pipelining supported
< HTTP/1.1 404 Not Found
* Server Apache-Coyote/1.1 is not blacklisted
< Server: Apache-Coyote/1.1
< Content-Length: 0
< Date: Wed, 16 Dec 2015 17:33:37 GMT
<
* STATE: PERFORM => DONE handle 0x6000572d0; line 1632 (connection #0)
* Curl_done
* Connection #0 to host localhost left intact
So it's always a 404, unless I do have such an /error resource. But then what am I supposed to return? All I have at that point is a GET request to /error. And I don't want those extra requests consuming resources and polluting my logs.
What am I missing? And if nothing is missing, what should I do about my exception handling?
You can set the Jersey property ServerProperties.RESPONSE_SET_STATUS_OVER_SEND_ERROR to true.
Whenever response status is 4xx or 5xx it is possible to choose between sendError or setStatus on container specific Response implementation. E.g. on servlet container Jersey can call HttpServletResponse.setStatus(...) or HttpServletResponse.sendError(...).
Calling sendError(...) method usually resets entity, response headers and provide error page for specified status code (e.g. servlet error-page configuration). However if you want to post-process response (e.g. by servlet filter) the only way to do it is calling setStatus(...) on container Response object.
If property value is true the method Response.setStatus(...) is used over default Response.sendError(...).
Type of the property value is boolean. The default value is false.
You can set the Jersey property simply by calling property(key, value) in your ResourceConfig subclass constructor.

Where's my time going mysteriously?

I have a Ruby script that uses the rsolr gem to generate XML and POST it to Apache Solr (the update command), served by Jetty. My script logs timings using the following code:
unless docs.empty?
  begin
    log.info("Adding to solr")
    response = solr.add(docs)
    log.info("#{(id_2*100.0)/last_id}% Done")
    if response['responseHeader']['status'] != 0
      log.fatal("Document ids not sent")
      #log.fatal(Solr::Request::AddDocument.new(docs_single).to_s)
      log.close
      exit
    end
    log.info("#{Time.now.to_f - starttime}s to feed Solr. #{id_1} to #{id_2}")
  rescue Exception => e
    log.fatal("Document ids not sent => ")
    #log.fatal(Solr::Request::AddDocument.new(docs_single).to_s)
    #log.fatal(docs)
    log.close
    exit
  end
The generated log looks like this:
I, [2011-10-09T15:03:42.617048 #30092] INFO -- : Executing - SELECT * FROM solr_feeddata_2 WHERE id >= 5879999 AND id < 5881999
I, [2011-10-09T15:03:44.086661 #30092] INFO -- : External Data1 fetch time: 1.45462989807129
I, [2011-10-09T15:03:44.109514 #30092] INFO -- : External Data2 fetch time: 0.0226790904998779
I, [2011-10-09T15:03:44.109611 #30092] INFO -- : 1.49255704879761s to fetch details from database. 5879999 to 5881999
I, [2011-10-09T15:03:44.109702 #30092] INFO -- : Adding data1, data2, building docs
I, [2011-10-09T15:03:45.912603 #30092] INFO -- : 3.29554414749146s to build documents. 5879999 to 5881999
I, [2011-10-09T15:03:45.912730 #30092] INFO -- : Adding to solr
I, [2011-10-09T15:04:24.797620 #30092] INFO -- : 61.180855194502% Done
I, [2011-10-09T15:04:24.797744 #30092] INFO -- : 42.180694103241s to feed Solr. 5879999 to 5881999
According to this log, Solr took 42.18 - 3.29 - 1.49 - 2 = 35.4 s to respond (see the comment below).
At the same time my Solr log for this particular update goes like
INFO: {add=[5879999, 5880000, 5880001, 5880002, 5880003, 5880004, 5880005, 5880007, ... (1468 adds)]} 0 5780
Oct 9, 2011 3:04:24 PM org.apache.solr.core.SolrCore execute
INFO: [core0] webapp=/solr path=/update params={wt=ruby} status=0 QTime=5780
Oct 9, 2011 3:04:42 PM org.apache.solr.update.processor.LogUpdateProcessor finish
This clearly shows that Solr took 5.78 s to add the docs, initiate sending the response, and close the log updater.
Both services run on the same machine, inside the network, and their ping summary is:
rtt min/avg/max/mdev = 0.008/0.010/0.022/0.006 ms
This pattern is clearly visible for every batch processed. Despite my sincere efforts to crack this mystery, I am not able to find the reason for this behavior.
My Solr mergeFactor is 10, autoCommit is off.
