fabric-sdk-go error: No peers to connect to - go

We are using the IBM Managed Blockchain based on Fabric 1.2 with the Go SDK at commit 5e291d3a34f59beb9a8ae2bcbba388515648dc73. When we try to invoke the chaincode we have installed on the peers, we get a "no peers to connect to" error:
[fabsdk/fab] 2019/01/25 12:25:57 UTC - dispatcher.(*Dispatcher).Start.func1 -> DEBU Listening for events...
[fabsdk/fab] 2019/01/25 12:25:57 UTC - client.(*Client).connect -> DEBU ... got error in connection response: no peers to connect to
[fabsdk/common] 2019/01/25 12:25:57 UTC - retry.(*RetryableInvoker).Invoke -> DEBU Failed with err [error registering for TxStatus event: no peers to connect to] on attempt #1. Checking if retry is warranted...
[fabsdk/util] 2019/01/25 12:25:57 UTC - lazyref.(*Reference).setTimerRunning -> DEBU Timer started
[fabsdk/common] 2019/01/25 12:25:57 UTC - retry.(*RetryableInvoker).Invoke -> DEBU ... retry for err [error registering for TxStatus event: no peers to connect to] is NOT warranted after 1 attempt(s).
[fabsdk/util] 2019/01/25 12:25:57 UTC - lazyref.checkTimeStarted -> DEBU Starting timer
After adding some extra logging of our own, it looks like the failure happens because the peer discovery service (https://github.com/hyperledger/fabric-sdk-go/blob/master/pkg/client/common/discovery/dynamicdiscovery/chservice.go#L72) doesn't return any peers in its response. The targets seem to be set correctly.
After dumping the gRPC response, we get:
(*discovery.Response)(0xc4205cd600)(
  results: <
    members: <
      peers_by_org: <
        key: "org1"
        value: <>
      >
    >
  >
)
We also set up a different Fabric network of our own with the same chaincode, which works properly; there the same dump shows this instead (some parts redacted):
(*discovery.Response)(0xc42045ed20)(
  results: <
    members: <
      peers_by_org: <
        key: "Org1MSP"
        value: <
          peers: <
            state_info: <
              payload: "<redacted>"
            >
            membership_info: <payload: "<redacted>" >
            identity: "<redacted>"
          >
          peers: <
            state_info: <
              payload: "<redacted>"
              signature: "<redacted>"
            >
            membership_info: <
              payload: "<redacted>"
              signature: "<redacted>"
            >
            identity: "<redacted>"
          >
        >
      >
      peers_by_org: <
        key: "Org2MSP"
        value: <
          peers: <
            state_info: <
              payload: "<redacted>"
              signature: "<redacted>"
            >
            membership_info: <
              payload: "<redacted>"
            >
            identity: "<redacted>"
          >
          peers: <
            state_info: <
              payload: "<redacted>"
              signature: "<redacted>"
            >
            membership_info: <
              payload: "<redacted>"
              signature: "<redacted>"
            >
            identity: "<redacted>"
          >
        >
      >
    >
  >
)
I am unsure if the issue is in the configuration of the SDK or the IBM Managed fabric network. If it is the IBM network, then it seems that somehow the peers aren't aware that they are members of an organization.
Has anyone seen this behaviour before?
I searched the official Hyperledger Rocket.Chat for the "no peers to connect to" error, and some results came up, but they seemed to have other causes, such as the peers being excluded by the SDK, rather than the discovery response itself coming back empty.

Make sure you have anchor peers configured in the channel.
Make sure the peers have their external endpoints configured, e.g. CORE_PEER_GOSSIP_EXTERNALENDPOINT=peer1.org1.example.com:7051.
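For a network you run yourself, that corresponds roughly to the following sketch (hostnames, MSP IDs, and paths are placeholders, not values from the managed network):
# configtx.yaml fragment: the anchor peer is declared per organization and
# pushed to the channel with a channel config update
Organizations:
  - &Org1
    Name: org1
    ID: org1
    MSPDir: crypto-config/peerOrganizations/org1.example.com/msp
    AnchorPeers:
      - Host: peer1.org1.example.com
        Port: 7051

# matching peer environment (e.g. in docker-compose), so gossip advertises
# an endpoint that other organizations can reach
CORE_PEER_GOSSIP_EXTERNALENDPOINT=peer1.org1.example.com:7051
Without an external endpoint the discovery service will typically leave a peer out of its results entirely, which matches the empty "org1" entry in the first dump above.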

Nomad Artifact download issue

Operating system: Windows 10
Nomad version: 0.11.3
Nomad Java SDK: 0.11.3.0
Nomad runs in dev mode.
I am trying to download a git repo using a Nomad job, but I am getting the following error after the repo is downloaded into the job's allocation folder.
Error:
2 errors occurred:
* failed to parse config:
* Root value must be object: The root value in a JSON-based configuration must be either a JSON object or a JSON array of objects.
Job file (if appropriate)
{
  "id": "get_git_job",
  "name": "get_git_job",
  "datacenters": [
    "dc1"
  ],
  "taskGroups": [
    {
      "name": "get_git_group",
      "tasks": [
        {
          "name": "get_git_task",
          "driver": "raw_exec",
          "resources": {
            "cpu": 500,
            "memoryMb": 2000
          },
          "artifacts": [
            {
              "getterSource": "github.com/hashicorp/nomad",
              "relativeDest": "local/repo"
            }
          ],
          "leader": false,
          "shutdownDelay": 0
        }
      ]
    }
  ],
  "dispatched": false
}
Nomad Client logs (if appropriate)
[INFO] client.alloc_runner.task_runner.task_hook.logmon.nomad.exe: opening fifo: alloc_id=10becf73-7abc-39c6-2114-38eea708103b task=get_git_task #module=logmon path=//./pipe/get_git_task-48748a1a.stdout timestamp=2020-12-02T16:32:33.755+0530
[DEBUG] client.alloc_runner.task_runner.task_hook.artifacts: downloading artifact: alloc_id=10becf73-7abc-39c6-2114-38eea708103b task=get_git_task artifact=github.com/hashicorp/nomad
[INFO] client.alloc_runner.task_runner.task_hook.logmon.nomad.exe: opening fifo: alloc_id=10becf73-7abc-39c6-2114-38eea708103b task=get_git_task #module=logmon path=//./pipe/get_git_task-48748a1a.stderr timestamp=2020-12-02T16:32:33.761+0530
[DEBUG] client: updated allocations: index=518 total=25 pulled=22 filtered=3
[DEBUG] client: allocation updates: added=0 removed=0 updated=22 ignored=3
[DEBUG] client: allocation updates applied: added=0 removed=0 updated=22 ignored=3 errors=0
[DEBUG] nomad.deployments_watcher: deadline hit: deployment_id=64d58e2c-d695-27a8-3daa-134d90e10807 job="<ns: "default", id: "get_git_job">" rollback=false
[DEBUG] worker: dequeued evaluation: eval_id=0aa4f715-be9c-91de-e1ed-a1d9b41093bc
[DEBUG] worker.service_sched: reconciled current state with desired state: eval_id=0aa4f715-be9c-91de-e1ed-a1d9b41093bc job_id=get_git_job namespace=default results="Total changes: (place 0) (destructive 0) (inplace 0) (stop 0)
Desired Changes for "get_git_group": (place 0) (inplace 0) (destructive 0) (stop 0) (migrate 0) (ignore 1) (canary 0)"
[DEBUG] worker.service_sched: setting eval status: eval_id=0aa4f715-be9c-91de-e1ed-a1d9b41093bc job_id=get_git_job namespace=default status=complete
[DEBUG] worker: updated evaluation: eval="<Eval "0aa4f715-be9c-91de-e1ed-a1d9b41093bc" JobID: "get_git_job" Namespace: "default">"
[DEBUG] worker: ack evaluation: eval_id=0aa4f715-be9c-91de-e1ed-a1d9b41093bc
[WARN] client.alloc_runner.task_runner: some environment variables not available for rendering: alloc_id=10becf73-7abc-39c6-2114-38eea708103b task=get_git_task keys=
[ERROR] client.alloc_runner.task_runner: running driver failed: alloc_id=10becf73-7abc-39c6-2114-38eea708103b task=get_git_task error="2 errors occurred:
* failed to parse config:
* Root value must be object: The root value in a JSON-based configuration must be either a JSON object or a JSON array of objects.
"
[INFO] client.alloc_runner.task_runner: not restarting task: alloc_id=10becf73-7abc-39c6-2114-38eea708103b task=get_git_task reason="Error was unrecoverable"
[INFO] client.gc: marking allocation for GC: alloc_id=10becf73-7abc-39c6-2114-38eea708103b
[DEBUG] nomad.client: adding evaluations for rescheduling failed allocations: num_evals=1
[DEBUG] worker: dequeued evaluation: eval_id=0490184c-d395-3e65-d38b-8dabd70b9b9d
[DEBUG] worker.service_sched: reconciled current state with desired state: eval_id=0490184c-d395-3e65-d38b-8dabd70b9b9d job_id=get_git_job namespace=default results="Total changes: (place 0) (destructive 0) (inplace 0) (stop 0)
Can anyone help with this?
The question was resolved with the help of the Nomad team. The solution is to add a command configuration to the task, because the raw_exec driver requires one; see the sketch below.
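For example, the task object with a minimal config block added (the command and args are placeholders for whatever the task should actually run on Windows, and the lowercase "config" key is assumed to follow the same camelCase convention as the rest of the job file):
{
  "name": "get_git_task",
  "driver": "raw_exec",
  "config": {
    "command": "cmd",
    "args": ["/c", "dir", "local\\repo"]
  },
  "resources": {
    "cpu": 500,
    "memoryMb": 2000
  },
  "artifacts": [
    {
      "getterSource": "github.com/hashicorp/nomad",
      "relativeDest": "local/repo"
    }
  ],
  "leader": false,
  "shutdownDelay": 0
}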

The debugger does not pause at the next breakpoint and trace doesn't report anything

When I use curl to POST a SOAP message into the message flow, it takes 9 seconds to respond; the debugger won't stop at breakpoints and the user trace doesn't report anything. When I send the same request from Postman or SoapUI (the first message takes the same amount of time, subsequent messages take around 70-200 ms), the debugger and user trace work as intended. What is the cause of this behaviour?
IBM App Connect Enterprise 11.0.0.8
curl --trace-time output:
03:29:07.484000 * Trying <host>...
03:29:07.484000 * TCP_NODELAY set
03:29:07.531000 * Connected to <host_name> (<host>) port 7800 (#0)
03:29:07.531000 > POST /service HTTP/1.1
03:29:07.531000 > Host: <host_name>:7800
03:29:07.531000 > User-Agent: curl/7.55.1
03:29:07.531000 > Accept: */*
03:29:07.531000 > Content-Length: 508
03:29:07.531000 > Content-Type: application/x-www-form-urlencoded
03:29:07.531000 >
03:29:07.546000 * upload completely sent off: 508 out of 508 bytes
03:29:16.671000 < HTTP/1.1 200 OK
03:29:16.671000 < Cache-Control: no-cache
03:29:16.671000 < Pragma: no-cache
03:29:16.687000 < Expires: -1
03:29:16.687000 < X-AspNet-Version: 4.0.30319
03:29:16.687000 < X-Powered-By: ASP.NET
03:29:16.687000 < Date: Fri, 22 May 2020 01:29:14 GMT
03:29:16.687000 < Content-Type: text/xml; charset=utf-8
03:29:16.703000 < Server: IBM App Connect Enterprise
03:29:16.703000 < Content-Length: 243
EDIT: I'm still trying to resolve the problem, this time with the help of Wireshark and the user trace:
curl:
02:56:37.781000 > POST /service HTTP/1.1
After a few milliseconds Wireshark detects the POST message from the "curl machine", so there is no problem with the connection.
After a delay of around 10 s, the 'SOAP Input' node receives the data. Why does it take so long?
2020-05-23 02:56:37.257076 6220 UserTrace BIP11304I: The Parser of type 'MQROOT' has been deleted from address '0x131f1312190'. This thread now has '0' cached parsers.
2020-05-23 02:56:40.591580 3684 UserTrace BIP11303I: A Parser of type 'MQROOT' has been created at address '0x131f13144a0'. This thread now has '36' cached parsers.
2020-05-23 02:56:45.143380 3684 UserTrace BIP11501I: Received data from input node 'SOAP Input'.
The input node 'SOAP Input' has received data and has propagated it to the message flow 'link'.
2020-05-23 02:56:45.143880 3684 UserTrace BIP6060I: Node 'link.SOAP Input' used parser type 'Properties' to process a portion of the incoming message of length '0' bytes beginning at offset '0'.
Fixed by restarting the machine. Any clue what exactly caused this problem?

Kafka stream app failing to fetch offsets for partition

I created a Kafka cluster with 3 brokers and the following setup:
Created 3 topics, each with replication factor 3 and 2 partitions.
Created 2 producers, each writing to one of the topics.
Created a Streams application that processes messages from the 2 input topics and writes to the 3rd topic (roughly the topology sketched below).
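A minimal sketch of what that topology looks like; the outer join is inferred from the KSTREAM-OUTEROTHER changelog in the warning below, and the value types, joiner, and window size are assumptions:
import java.util.Properties;
import java.util.concurrent.TimeUnit;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.JoinWindows;
import org.apache.kafka.streams.kstream.KStream;

public class Stream3App {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "Stream3");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> left = builder.stream("initial11_topic");
        KStream<String, String> right = builder.stream("initial12_topic");

        // The windowed outer join keeps a "this" and an "other" window store;
        // the KSTREAM-OUTEROTHER-...-store-changelog topic backs the latter.
        left.outerJoin(right,
                (l, r) -> l + "|" + r,
                JoinWindows.of(TimeUnit.MINUTES.toMillis(5)))
            .to("final11_topic");

        new KafkaStreams(builder.build(), props).start();
    }
}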
It was all running fine till now but I suddenly started getting the following warning when starting the Streams application:
[WARN ] 2018-06-08 21:16:49.188 [Stream3-4f7403ad-aba6-4d34-885d-60114fc9fcff-StreamThread-1] org.apache.kafka.clients.consumer.internals.Fetcher [Consumer clientId=Stream3-4f7403ad-aba6-4d34-885d-60114fc9fcff-StreamThread-1-restore-consumer, groupId=] Attempt to fetch offsets for partition Stream3-KSTREAM-OUTEROTHER-0000000005-store-changelog-0 failed due to: Disk error when trying to access log file on the disk.
Due to this warning, the Streams application is not processing anything from the 2 input topics.
I tried the following:
Stopped all brokers, deleted kafka-logs directory for each broker and restarted the brokers. It didn't solve the issue.
Stopped zookeeper and all brokers, deleted zookeeper logs as well as kafka-logs for each broker, restarted zookeeper and brokers and created the topics again. This too didn't solve the issue.
I am not able to find anything related to this error in the official docs or on the web. Does anyone have an idea why I am suddenly getting this error?
EDIT:
Out of the 3 brokers, 2 (broker-0 and broker-2) continuously emit these logs:
Broker-0 logs:
[2018-06-09 02:03:08,750] INFO [ReplicaFetcher replicaId=0, leaderId=1, fetcherId=0] Retrying leaderEpoch request for partition initial11_topic-1 as the leader reported an error: NOT_LEADER_FOR_PARTITION (kafka.server.ReplicaFetcherThread)
[2018-06-09 02:03:08,750] INFO [ReplicaFetcher replicaId=0, leaderId=1, fetcherId=0] Retrying leaderEpoch request for partition initial12_topic-0 as the leader reported an error: NOT_LEADER_FOR_PARTITION (kafka.server.ReplicaFetcherThread)
Broker-2 logs:
[2018-06-09 02:04:46,889] INFO [ReplicaFetcher replicaId=2, leaderId=1, fetcherId=0] Retrying leaderEpoch request for partition initial11_topic-1 as the leader reported an error: NOT_LEADER_FOR_PARTITION (kafka.server.ReplicaFetcherThread)
[2018-06-09 02:04:46,889] INFO [ReplicaFetcher replicaId=2, leaderId=1, fetcherId=0] Retrying leaderEpoch request for partition initial12_topic-0 as the leader reported an error: NOT_LEADER_FOR_PARTITION (kafka.server.ReplicaFetcherThread)
Broker-1 shows following logs:
[2018-06-09 01:21:26,689] INFO [GroupMetadataManager brokerId=1] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2018-06-09 01:31:26,689] INFO [GroupMetadataManager brokerId=1] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2018-06-09 01:39:44,667] ERROR [KafkaApi-1] Number of alive brokers '0' does not meet the required replication factor '1' for the offsets topic (configured via 'offsets.topic.replication.factor'). This error can be ignored if the cluster is starting up and not all brokers are up yet. (kafka.server.KafkaApis)
[2018-06-09 01:41:26,689] INFO [GroupMetadataManager brokerId=1] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
I again stopped zookeeper and brokers, deleted their logs and restarted. As soon as I create the topics again, I start getting the above logs.
Topic details:
[zk: localhost:2181(CONNECTED) 3] get /brokers/topics/initial11_topic
{"version":1,"partitions":{"1":[1,0,2],"0":[0,2,1]}}
cZxid = 0x53
ctime = Sat Jun 09 01:25:42 EDT 2018
mZxid = 0x53
mtime = Sat Jun 09 01:25:42 EDT 2018
pZxid = 0x54
cversion = 1
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 52
numChildren = 1
[zk: localhost:2181(CONNECTED) 4] get /brokers/topics/initial12_topic
{"version":1,"partitions":{"1":[2,1,0],"0":[1,0,2]}}
cZxid = 0x61
ctime = Sat Jun 09 01:25:47 EDT 2018
mZxid = 0x61
mtime = Sat Jun 09 01:25:47 EDT 2018
pZxid = 0x62
cversion = 1
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 52
numChildren = 1
[zk: localhost:2181(CONNECTED) 5] get /brokers/topics/final11_topic
{"version":1,"partitions":{"1":[0,1,2],"0":[2,0,1]}}
cZxid = 0x48
ctime = Sat Jun 09 01:25:32 EDT 2018
mZxid = 0x48
mtime = Sat Jun 09 01:25:32 EDT 2018
pZxid = 0x4a
cversion = 1
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 52
numChildren = 1
Any clue?
I found the issue. It was due to the following incorrect config in broker-1's server.properties:
advertised.listeners=PLAINTEXT://10.23.152.109:9094
The port in broker-1's advertised.listeners had mistakenly been changed to the same port as broker-2's advertised.listeners, so two brokers were advertising the same endpoint.
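Each broker has to advertise its own distinct endpoint. After the fix the entries looked roughly like this (the exact port numbers are illustrative; what matters is that no two brokers advertise the same one):
# broker-0 server.properties
advertised.listeners=PLAINTEXT://10.23.152.109:9092
# broker-1 server.properties (was wrongly set to 9094, colliding with broker-2)
advertised.listeners=PLAINTEXT://10.23.152.109:9093
# broker-2 server.properties
advertised.listeners=PLAINTEXT://10.23.152.109:9094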

jersey+springboot can't return correct http status [duplicate]

I've encountered the same issue as in this question, using Spring Boot 1.3.0 and not having my controllers annotated with @RestController, just @Path and @Service. As the OP in that question says,
this is, to me, anything but sensible
I also can't understand why they would have it redirect to /error. And it is very likely that I'm missing something, because I can only give back 404s or 200s to the client.
My problem is that his solution doesn't seem to work with 1.3.0, so I have the following request flow: let's say my code throws a NullPointerException. It will be handled by one of my ExceptionMappers:
import javax.ws.rs.core.Response;
import javax.ws.rs.ext.ExceptionMapper;
import javax.ws.rs.ext.Provider;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

@Provider
public class GeneralExceptionMapper implements ExceptionMapper<Throwable> {

    private static final Logger LOGGER = LoggerFactory.getLogger(GeneralExceptionMapper.class);

    @Override
    public Response toResponse(Throwable exception) {
        LOGGER.error(exception.getLocalizedMessage());
        return Response.status(Response.Status.INTERNAL_SERVER_ERROR).build();
    }
}
And my code returns a 500, but instead of sending it back to the client, it tries to redirect it to /error. If I don't have another resource for that, it'll send back a 404.
2015-12-16 18:33:21.268 INFO 9708 --- [nio-8080-exec-1] o.glassfish.jersey.filter.LoggingFilter : 1 * Server has received a request on thread http-nio-8080-exec-1
1 > GET http://localhost:8080/nullpointerexception
1 > accept: */*
1 > host: localhost:8080
1 > user-agent: curl/7.45.0
2015-12-16 18:33:29.492 INFO 9708 --- [nio-8080-exec-1] o.glassfish.jersey.filter.LoggingFilter : 1 * Server responded with a response on thread http-nio-8080-exec-1
1 < 500
2015-12-16 18:33:29.540 INFO 9708 --- [nio-8080-exec-1] o.glassfish.jersey.filter.LoggingFilter : 2 * Server has received a request on thread http-nio-8080-exec-1
2 > GET http://localhost:8080/error
2 > accept: */*
2 > host: localhost:8080
2 > user-agent: curl/7.45.0
2015-12-16 18:33:37.249 INFO 9708 --- [nio-8080-exec-1] o.glassfish.jersey.filter.LoggingFilter : 2 * Server responded with a response on thread http-nio-8080-exec-1
2 < 404
And on the client side (curl):
$ curl -v http://localhost:8080/nullpointerexception
* STATE: INIT => CONNECT handle 0x6000572d0; line 1090 (connection #-5000)
* Added connection 0. The cache now contains 1 members
* Trying ::1...
* STATE: CONNECT => WAITCONNECT handle 0x6000572d0; line 1143 (connection #0)
* Connected to localhost (::1) port 8080 (#0)
* STATE: WAITCONNECT => SENDPROTOCONNECT handle 0x6000572d0; line 1240 (connection #0)
* STATE: SENDPROTOCONNECT => DO handle 0x6000572d0; line 1258 (connection #0)
> GET /nullpointerexception HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.45.0
> Accept: */*
>
* STATE: DO => DO_DONE handle 0x6000572d0; line 1337 (connection #0)
* STATE: DO_DONE => WAITPERFORM handle 0x6000572d0; line 1464 (connection #0)
* STATE: WAITPERFORM => PERFORM handle 0x6000572d0; line 1474 (connection #0)
* HTTP 1.1 or later with persistent connection, pipelining supported
< HTTP/1.1 404 Not Found
* Server Apache-Coyote/1.1 is not blacklisted
< Server: Apache-Coyote/1.1
< Content-Length: 0
< Date: Wed, 16 Dec 2015 17:33:37 GMT
<
* STATE: PERFORM => DONE handle 0x6000572d0; line 1632 (connection #0)
* Curl_done
* Connection #0 to host localhost left intact
So it's always a 404. And if I do have such an /error resource, then what am I supposed to return from it? All I have at that point is a GET request to /error. I also don't want those extra requests consuming resources and polluting my logs.
What am I missing? And if nothing, what should I do with my exception handling?
You can set the Jersey property ServerProperties.RESPONSE_SET_STATUS_OVER_SEND_ERROR to true.
Whenever response status is 4xx or 5xx it is possible to choose between sendError or setStatus on container specific Response implementation. E.g. on servlet container Jersey can call HttpServletResponse.setStatus(...) or HttpServletResponse.sendError(...).
Calling sendError(...) method usually resets entity, response headers and provide error page for specified status code (e.g. servlet error-page configuration). However if you want to post-process response (e.g. by servlet filter) the only way to do it is calling setStatus(...) on container Response object.
If property value is true the method Response.setStatus(...) is used over default Response.sendError(...).
Type of the property value is boolean. The default value is false.
You can set the Jersey property simply by calling property(key, value) in your ResourceConfig subclass constructor, for example:
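A minimal sketch of what that can look like in a Spring Boot + Jersey setup (the class name and the explicit registration are illustrative):
import org.glassfish.jersey.server.ResourceConfig;
import org.glassfish.jersey.server.ServerProperties;
import org.springframework.stereotype.Component;

@Component
public class JerseyConfig extends ResourceConfig {

    public JerseyConfig() {
        // Register the ExceptionMapper from the question (plus your resources).
        register(GeneralExceptionMapper.class);
        // Have Jersey call setStatus(...) instead of sendError(...), so the 500
        // built by the mapper is returned as-is instead of being forwarded to
        // the container's /error page.
        property(ServerProperties.RESPONSE_SET_STATUS_OVER_SEND_ERROR, true);
    }
}
With this in place the client receives the 500 from the ExceptionMapper directly and no second request to /error is made.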

Rainbows worker is being killed after timeout even though it replied

I have a Sinatra app running on Rainbows.
I log the following:
before do
  logger.info("#{Process.pid} #{Time.now} #{request.ip} #{request.path_info} #{params.to_s}")
end
and
after do
  logger.info("#{Process.pid} #{Time.now} #{request.ip} #{request.path_info} #{params.to_s} => #{response.headers['X-API-Status']} (#{response.successful?})")
end
and in my logs I can read:
25988 2012-11-13 11:57:52 +0100 192.168.90.1 /req {"u"=>"810000027"}
25988 2012-11-13 11:57:59 +0100 192.168.90.1 /req {"u"=>"810000027"} => 200 (true)
192.168.90.1 - - [13/Nov/2012 11:57:59] "POST /req HTTP/1.1" 200 14 7.5862
25988 2012-11-13 11:57:59 +0100 192.168.90.1 /req {"u"=>"810000027"}
25988 2012-11-13 11:57:59 +0100 192.168.90.1 /req {"u"=>"810000027"} => 200 (true)
192.168.90.1 - - [13/Nov/2012 11:57:59] "POST /req HTTP/1.1" 200 14 0.0223
E, [2012-11-13T11:58:04.099913 #25875] ERROR -- : worker=2 PID:25988 timeout (12s > 11s), killing
E, [2012-11-13T11:58:04.106428 #25875] ERROR -- : reaped #<Process::Status: pid 25988 SIGKILL (signal 9)> worker=2
My worker (pid 25988) is being killed as if it had not responded to the first request... but it obviously has! It even handled another request afterwards (and I use the Base concurrency model, so no concurrency).
My Rainbows configuration is:
Rainbows! do
  timeout(10)
end
listen(3000)
pid('/tmp/rainbows.pid')
stderr_path('/var/log/rainbows.log')
stdout_path('/var/log/rainbows.log')
working_directory('/opt/app')
worker_processes(4)
Do you have any idea what is happening? Or how I could investigate further?
Thanks!
Actually the issue lay in the request being "kept alive" too long by the client (Flash). Apparently there is no way to properly close the TCP connection in AS3...
I fixed my issue with:
Rainbows! do
  keepalive_timeout(0)
end
which seems appropriate for my use case anyway.
