I've been experimenting with Chromecast. I'm running into playback issues.
For the first 5-6 minutes or so, everything is fine: the log is just a stream of PROGRESS, TIME_UPDATE, and SEGMENT_DOWNLOADED events.
player.html?cache=500:102 aj {type: "PROGRESS", currentMediaTime: 398.742094}
player.html?cache=500:102 jj {type: "SEGMENT_DOWNLOADED", downloadTime: 175, size: 33646}
player.html?cache=500:102 aj {type: "TIME_UPDATE", currentMediaTime: 398.9985}
[Violation] 'setInterval' handler took 229ms
player.html?cache=500:102 aj {type: "PROGRESS", currentMediaTime: 401.334166}
player.html?cache=500:102 aj {type: "TIME_UPDATE", currentMediaTime: 401.510657}
cast_receiver_framework.js:48 [Violation] 'timeupdate' handler took 455ms
[Violation] 'setTimeout' handler took 1131ms
cast_receiver_framework.js:66 [440.120s] [cast.receiver.MediaManager] Time drifted: -4588.799999999999
cast_receiver_framework.js:66 [440.800s] [cast.receiver.MediaManager] Sending broadcast status message
cast_receiver_framework.js:66 [440.954s] [cast.receiver.IpcChannel] IPC message sent: {"namespace":"urn:x-cast:com.google.cast.media","senderId":":","data":"{\"type\":\"MEDIA_STATUS\",\"status\":[{\"mediaSessionId\":1,\"playbackRate\":1,\"playerState\":\"PLAYING\",\"currentTime\":408.382866,\"supportedMediaCommands\":15,\"volume\":{\"level\":1,\"muted\":false},\"activeTrackIds\":[],\"currentItemId\":1,\"repeatMode\":\"REPEAT_OFF\"}],\"requestId\":0}"}
[Violation] 'setTimeout' handler took 1043ms
[Violation] 'updateend' handler took 177ms
Most of the time, the "time drift" message coincides with the moment the player stalls, and it never recovers, so I assume it is somehow related to the problem.
Then the server typically takes longer and longer to respond to chunk requests (20-60 seconds), and playback never resumes. I'm not sure how the server side is related to the issue; it's puzzling me.
Any advice on how to debug this would be appreciated.
For the sake of helping others: after a lot of testing, it turns out the issue described above only occurs while debugging.
If a debugger is attached to the receiver, the receiver eventually crashes.
That is inconvenient, but at least we can now work around the issue, and customers won't be impacted.
Related
I am working on WebSocket application with STOMP and RabbitMQ as the broker. It is a chat-like application. I tried to dump thousands of messages to check if the socket will crash or observe any flicker in the UI.
JavaScript code:
// Dump 1,000 test messages over the STOMP connection as fast as possible
if (stompClient) {
    for (let i = 1; i <= 1000; i++) {
        const chatMessage = {
            sender: name,
            content: "File Processing at " + i + " %",
            type: 'Process'
        };
        stompClient.send("/app/chat.sendMessage", {}, JSON.stringify(chatMessage));
    }
}
I followed this tutorial.
Here is what I observed:
A fresh session sends and receives the messages within seconds, but if I send again and again, the time taken increases on every run: 1s, 3s, 5s, 20s, up to 1min 30s for the same 1,000 messages.
After sending 4-5 times, it takes more than 30s; once the messages are finally received, the WebSocket session is terminated, and sometimes a flicker is observed.
Re-connecting to the WebSocket after the timeout takes even longer than before to send and receive, flickering is observed, and the UI gets stuck for a while.
I am very new to this concept. Can anyone tell me why is this happening and how to overcome this?
I see a SENDER channel go into RETRY mode after long retries (LONGRTS) start. It remains in RETRY mode and is re-attempted every LONGTMR (1200) seconds. My question is: does the Sender channel come back to RUNNING as soon as a message arrives, without waiting out LONGTMR, or does it wait for the full LONGTMR interval?
A SENDER channel will go into STATUS(RETRY) - a.k.a. Retry Mode - when the connection to its partner fails.
To begin with, on the assumption that many network failures are very short-lived, a SENDER channel will make a small number of fairly closely spaced attempts to re-make the network connection. By default it will try 10 times, 60 seconds apart. These are known as the "short retries".
This count and interval are held in the SENDER channel attributes called SHORTRTY and SHORTTMR.
If after these first 10 attempts, the SENDER channel has still not managed to get reconnected to the network partner, it will now move to "long retries". It is now operating with the assumption that the network outage is a longer one, for example the partner queue manager machine is having maintenance applied, or there has been some other major outage, and not just a network blip.
The SENDER channel will now make what it hopes is an effectively infinite number of more widely spaced attempts to re-make the connection. By default it will try 999999999 times, 1200 seconds apart.
This count and interval are held in the SENDER channel attributes called LONGRTY and LONGTMR.
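Putting the two phases together, the retry schedule can be sketched as a simple function of the attempt number. This is an illustration only, using the default attribute values described above (SHORTRTY=10, SHORTTMR=60, LONGRTY=999999999, LONGTMR=1200):

```python
# Sketch of the SENDER channel retry schedule using the default
# channel attribute values described above. Illustrative only.

SHORTRTY, SHORTTMR = 10, 60            # short retries: 10 attempts, 60 s apart
LONGRTY, LONGTMR = 999_999_999, 1200   # long retries: effectively forever, 1200 s apart

def retry_delay(attempt: int) -> int:
    """Seconds the channel waits before retry attempt `attempt` (1-based)."""
    if attempt <= SHORTRTY:
        return SHORTTMR                # still in "short retry" mode
    if attempt <= SHORTRTY + LONGRTY:
        return LONGTMR                 # now in "long retry" mode
    raise RuntimeError("all retries exhausted; channel stops retrying")

# The first 10 attempts are 60 s apart; everything after is 1200 s apart.
print([retry_delay(n) for n in (1, 10, 11, 100)])  # [60, 60, 1200, 1200]
```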
You can see how many attempts remain by using the DISPLAY CHSTATUS command and looking at the SHORTRTS and LONGRTS fields. These show how many of the 10 (or 999999999) attempts are left. If you see SHORTRTS(0), you know the SENDER channel is in "long retry mode".
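For example, in runmqsc (the channel name TO.PARTNER is a placeholder; substitute your own):

```
DISPLAY CHSTATUS('TO.PARTNER') STATUS SHORTRTS LONGRTS
```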
If, on any of these attempts to re-make the connection, it is successful, it will stop retrying and you will see the SENDER channel show STATUS(RUNNING). Note that the success is due to the network connection having been successfully made, and is nothing to do with whether a message arrives or not.
It will not continue making retry attempts after it successfully connects to the partner (until the next time the connection is lost of course).
If your channel is in STATUS(RETRY) you should look in the AMQERR01.LOG to discover the reason for the failure. It may be something you can fix at the SENDER end or it may be something that needs to be fixed at the RECEIVER end, for example restarting the queue manager or the listener.
With ZeroMQ and cppzmq 4.3.2, I want to drop old messages for all my sockets, including:
PAIR
Pub/Sub
REQ/REP
So I use m_socks[channel].setsockopt(ZMQ_CONFLATE, 1) on all my sockets before binding/connecting.
Test
However, when I made the following test, it seems that the old messages are still flushed out on each reconnection. In this test,
I use a thread to keep sending generated sinewave to a receiver thread
Every 10 seconds I double the sinewave's frequency
Then after 10 seconds I stop the process
Below is the pseudocode of the sender
// on sender end
auto thenSec = high_resolution_clock::now();
while (m_isRunning) {
    // generate sine wave, double the frequency every 10 s or so
    auto nowSec = high_resolution_clock::now();
    if (duration_cast<seconds>(nowSec - thenSec).count() > 10) {
        m_sine.SetFreq(m_sine.GetFreq() * 2);
        thenSec = nowSec;
    }
    m_sine.Generate(audio);
    // send to rendering thread
    m_messenger.send("inproc://sound-ear.pair",
                     (const void*)(audio),
                     audio_size,
                     zmq::send_flags::dontwait);
}
Note that I already use DONTWAIT to mitigate blocking.
On the receiver side I have a zmq::poller_event handler that simply receives the last message on event polling.
In the stop sequence I reset the sinewave frequency to its lowest value, say, 440Hz.
Expected
The expected behaviour would be:
If I stop both the sender and the receiver after 10s when the frequency is doubled,
and I restart both,
then I should see the sinewave reset to 440Hz.
Observed
But the observed behaviour is that the received sinewave is still of the doubled frequency after restarting the communication, i.e., 880Hz.
Question
Am I doing it wrong or should I use some kind of killswitch to force drop all messages in this case?
OK, I think I solved it myself. Kind of.
Actual solution
I finally realized that the behaviour I want is to flush all messages when I stop the rendering. According to the official FAQ ("How can I flush all messages that are in the ZeroMQ socket queue?"), this can only be achieved by:
setting the ZMQ_LINGER option of both the sender's and the receiver's sockets to 0, meaning nothing is kept when those sockets are closed;
closing the sockets on both the sender and receiver ends, which also means re-creating the pollers and all references to the sockets.
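In pseudocode, the stop/restart sequence following that recipe looks roughly like this (a sketch only; the names are mine, not real cppzmq calls in this exact shape):

```
// stop sequence (run on both ends)
sock.setsockopt(ZMQ_LINGER, 0);   // discard anything still queued on close
poller.remove(sock);              // drop every reference to the socket
sock.close();

// restart sequence
sock = ctx.socket(PAIR);
sock.setsockopt(ZMQ_LINGER, 0);
sock.connect("inproc://sound-ear.pair");
poller.add(sock);
```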
This seems a lot of unnecessary work if I'm to restart rendering my data again, right after the stop sequence. But I found no other way to solve this cleanly.
Initial effort
It seems to me that ZMQ_CONFLATE does not make a difference on PAIR sockets. I really have to tweak the high water marks on the sender and receiver ends using ZMQ_SNDHWM and ZMQ_RCVHWM.
However, I said "kind of solved" because tweaking the HWM is ultimately not an optimal solution for a realtime application:
even with ZMQ_SNDHWM / ZMQ_RCVHWM set to the minimum of 1, there is still sizable latency in realtime terms;
also, the consumer thread can run into underruns, i.e., perceivable jitter, with the lowest HWM.
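For reference, the semantics I was after (keep only the newest message, silently drop anything older) behaves like a one-slot buffer. A minimal stdlib sketch of that idea, not ZeroMQ code, just the concept ZMQ_CONFLATE implements on supported socket types:

```python
from collections import deque

# A one-slot buffer: appending a new message silently drops the old one,
# so a consumer only ever sees the most recent value.
latest = deque(maxlen=1)

for freq in (440, 880, 1760):   # producer keeps overwriting
    latest.append(freq)

print(latest[0])  # the consumer only sees the newest value: 1760
```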
If I'm not doing anything wrong, I guess the optimal solution for my scenario would still be shared memory. That's a pity, because I really enjoyed the simplicity of ZMQ's multicast messaging patterns and hate having to deal with thread locking littered everywhere.
When I send a message to the broker, this exception occasionally occurs:
MQBrokerException: CODE: 2 DESC: [TIMEOUT_CLEAN_QUEUE]broker busy, start flow control for a while
This means the broker is too busy (when TPS > 15,000) to handle so many send requests.
What would be the most likely cause of this: disk, CPU, or something else? And how can I fix it?
There are several possible causes.
The root cause is that some messages have been waiting a long time with no worker thread processing them, so RocketMQ triggers its fast-failure mechanism.
So the causes are:
1. Too many threads are working, and they process message storage so slowly that queued requests time out.
2. The storage jobs themselves take a long time. This may be because:
2.1 Storing a message is slow, especially when SYNC_FLUSH is used.
2.2 Syncing messages to the slave takes a long time when SYNC_MASTER is used.
In /broker/src/main/java/org/apache/rocketmq/broker/latency/BrokerFastFailure.java you can see:
final long behind = System.currentTimeMillis() - rt.getCreateTimestamp();
if (behind >= this.brokerController.getBrokerConfig().getWaitTimeMillsInSendQueue()) {
    if (this.brokerController.getSendThreadPoolQueue().remove(runnable)) {
        rt.setStopRun(true);
        rt.returnResponse(RemotingSysResponseCode.SYSTEM_BUSY, String.format("[TIMEOUT_CLEAN_QUEUE]broker busy, start flow control for a while, period in queue: %sms, size of queue: %d", behind, this.brokerController.getSendThreadPoolQueue().size()));
    }
}
In common/src/main/java/org/apache/rocketmq/common/BrokerConfig.java, the getWaitTimeMillsInSendQueue() method returns:
public long getWaitTimeMillsInSendQueue() {
    return waitTimeMillsInSendQueue;
}
The default value of waitTimeMillsInSendQueue is 200, so you can simply set it higher to let requests wait longer in the queue. But if you want to solve the problem completely, you should follow Jaskey's advice and check your code.
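For example, in the broker's configuration file (the 400 ms value here is only an illustration; the property name comes from BrokerConfig above):

```
# broker.conf -- raise the send-queue wait threshold from the 200 ms default
waitTimeMillsInSendQueue=400
```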
I've created a real time game for Google Play Game Services. It's in the later alpha stages right now. I have a question about sendReliableMessage. I've noticed certain cases where the other peer doesn't receive the message. I am aware that there is a callback onRealTimeMessageSent and I have some code in my MainActivity:
@Override
public void onRealTimeMessageSent(int statusCode, int tokenId, String recipientParticipantId) {
    if (statusCode == GamesStatusCodes.STATUS_OK) {
        // sent successfully; nothing to do
    } else {
        lastMessageStatus = statusCode;
        sendToast("lastMessageStatus:" + Integer.toString(lastMessageStatus));
    }
}
My game's render loop checks the value of lastMessageStatus on every iteration, and if it is anything other than STATUS_OK, I'm painting a T-Rex right now.
My question is: is checking the sent status really enough? I could also write code where the sender waits for an acknowledgement message: each message would be stamped with a UUID, and if the ACK is not received within a timeout, the sender would resend the message. Is an ACK-based system necessary to maintain a persistent connection?
I've also noticed cases where there is some lag before the opposite peer receives a reliable message, and I was wondering whether there is a timeout on sendReliableMessage. The Google Play Services documentation doesn't seem to indicate that there is any timeout at all.
Thank you
Reliable messages are just that: reliable. There are not many use cases for the onRealTimeMessageSent callback with reliable messages because, as you said, it does not guarantee that the recipient has processed the message yet, only that it was sent.
It may seem annoying, but an ACK-based system is the best way to know for sure that your user has received the message. A UUID is one good way to do this. I have done this myself and found it to work great (although now you have round-trip latency).
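A minimal sketch of such an ACK tracker, pure bookkeeping with no Play Games APIs; the class name, method names, and the timeout value are all made up for illustration:

```python
import time
import uuid

RESEND_TIMEOUT = 2.0  # seconds; made-up value, tune for your game


class AckTracker:
    """Track sent messages and report which ones need resending."""

    def __init__(self):
        self.pending = {}  # message id -> (payload, time last sent)

    def stamp_and_track(self, payload: bytes) -> str:
        """Assign a UUID to a message and remember when it was sent."""
        msg_id = str(uuid.uuid4())
        self.pending[msg_id] = (payload, time.monotonic())
        return msg_id

    def on_ack(self, msg_id: str) -> None:
        """Peer acknowledged the message; stop tracking it."""
        self.pending.pop(msg_id, None)

    def due_for_resend(self) -> list:
        """Return (msg_id, payload) pairs whose ACK timed out."""
        now = time.monotonic()
        due = []
        for msg_id, (payload, sent_at) in self.pending.items():
            if now - sent_at > RESEND_TIMEOUT:
                due.append((msg_id, payload))
                self.pending[msg_id] = (payload, now)  # restart the clock
        return due


# Usage idea: call stamp_and_track() before sendReliableMessage, prepend
# the id to the payload, and have the peer echo the id back as an ACK.
tracker = AckTracker()
mid = tracker.stamp_and_track(b"move:3,4")
tracker.on_ack(mid)
assert not tracker.pending  # acknowledged, nothing left to resend
```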
As for a timeout, none is implemented in the Real-Time Messaging API. I have personally found round-trip latency (send message, receive ACK in callback) to be about 200ms, and I have never found a way to make a message fail to deliver eventually, even when deliberately using bad network conditions.