Watson ASR python WebSocket


I use Python WebSockets, implemented with the websocket-client library, to perform live speech recognition with Watson ASR. This solution worked until about a month ago, when it stopped working: there is not even a handshake. Weirdly enough, I haven't changed the code (below). A colleague using a different account has the same problem, so we don't believe anything is wrong with our accounts. I've contacted IBM about this, but since there is no handshake, they have no way to track whether something is wrong on their side. The WebSocket code is shown below.
import websocket
(...)
ws = websocket.WebSocketApp(
    self.api_url,
    header=headers,
    on_message=self.on_message,
    on_error=self.on_error,
    on_close=self.on_close,
    on_open=self.on_open
)
Here the URL is 'wss://stream.watsonplatform.net/speech-to-text/api/v1/recognize', headers contains the authorization tokens, and the remaining arguments are the callback handlers. At the moment, this call runs and then waits until the connection times out. Is anyone else who runs live ASR with Watson in Python through the websocket-client library seeing this problem?

@zedavid Over a month ago we switched to IAM, so the username and password were replaced with an IAM apikey. You should migrate your Cloud Foundry Speech to Text instance to IAM. There is a Migration page that will help you understand more about this. You can also create a new Speech to Text instance, which will be a resource-controlled instance by default.
Once you have the new instance you will need to get an access_token which is similar to the token in Cloud Foundry. The access_token will be used to authorize your request.
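As a rough sketch of that token exchange, the snippet below only builds the HTTP request and the WebSocket header; it sends nothing. The endpoint URL, the grant type string, and the Bearer header are assumptions based on IBM Cloud's public IAM documentation rather than anything stated in this thread, and both helper names are made up for illustration.

```python
import urllib.parse
import urllib.request

# Assumption: the IAM token endpoint documented by IBM Cloud; check the
# current documentation before relying on it.
IAM_TOKEN_URL = "https://iam.cloud.ibm.com/identity/token"


def build_token_request(apikey):
    """Build (but do not send) the POST that trades an apikey for an access_token."""
    form = urllib.parse.urlencode({
        "grant_type": "urn:ibm:params:oauth:grant-type:apikey",  # assumed grant type
        "apikey": apikey,
    }).encode("ascii")
    return urllib.request.Request(
        IAM_TOKEN_URL,
        data=form,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
        method="POST",
    )


def websocket_headers(access_token):
    """Header dict to pass as header= to websocket.WebSocketApp (assumed Bearer scheme)."""
    return {"Authorization": "Bearer " + access_token}


request = build_token_request("YOUR APIKEY")
print(request.get_method(), request.full_url)
```

Sending the request with urllib.request.urlopen and reading the access_token field from the JSON reply would complete the exchange; the SDK mentioned in this answer does all of this for you.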
Finally, we recently released support for Speech to Text and Text to Speech in the Python SDK. I encourage you to use that rather than writing the code for the token exchange and WebSocket connection management yourself.
# Imports assumed from the SDK release of that time (package watson_developer_cloud)
from os.path import join, dirname
import threading
from watson_developer_cloud import SpeechToTextV1
from watson_developer_cloud.websocket import RecognizeCallback, AudioSource
service = SpeechToTextV1(
    iam_apikey='YOUR APIKEY',
    url='https://stream.watsonplatform.net/speech-to-text/api')
# Example using websockets
class MyRecognizeCallback(RecognizeCallback):
    def __init__(self):
        RecognizeCallback.__init__(self)
    def on_transcription(self, transcript):
        print(transcript)
    def on_connected(self):
        print('Connection was successful')
    def on_error(self, error):
        print('Error received: {}'.format(error))
    def on_inactivity_timeout(self, error):
        print('Inactivity timeout: {}'.format(error))
    def on_listening(self):
        print('Service is listening')
    def on_hypothesis(self, hypothesis):
        print(hypothesis)
    def on_data(self, data):
        print(data)
# Example using threads in a non-blocking way
mycallback = MyRecognizeCallback()
audio_file = open(join(dirname(__file__), '../resources/speech.wav'), 'rb')
audio_source = AudioSource(audio_file)
recognize_thread = threading.Thread(
    target=service.recognize_using_websocket,
    args=(audio_source, "audio/l16; rate=44100", mycallback))
recognize_thread.start()

Thanks for the headers information. Here's how it worked for me.
I am using websocket-client 0.54.0, which is currently the latest version. I generated a token using
curl -u <USERNAME>:<PASSWORD> "https://stream.watsonplatform.net/authorization/api/v1/token?url=https://stream.watsonplatform.net/speech-to-text/api"
Using the returned token in the code below, I was able to complete the handshake:
import websocket
try:
    import thread
except ImportError:
    import _thread as thread
import time
import json
def on_message(ws, message):
    print(message)
def on_error(ws, error):
    print(error)
def on_close(ws):
    print("### closed ###")
def on_open(ws):
    # Send three test messages from a background thread, then close the socket
    def run(*args):
        for i in range(3):
            time.sleep(1)
            ws.send("Hello %d" % i)
        time.sleep(1)
        ws.close()
        print("thread terminating...")
    thread.start_new_thread(run, ())
if __name__ == "__main__":
    # headers["Authorization"] = "Basic " + base64.b64encode(auth.encode()).decode('utf-8')
    websocket.enableTrace(True)
    ws = websocket.WebSocketApp("wss://stream.watsonplatform.net/speech-to-text/api/v1/recognize",
                                on_message=on_message,
                                on_error=on_error,
                                on_close=on_close,
                                header={
                                    # Replace <TOKEN> with the token returned by the curl call
                                    "X-Watson-Authorization-Token": "<TOKEN>"})
    ws.on_open = on_open
    ws.run_forever()
Response:
--- request header ---
GET /speech-to-text/api/v1/recognize HTTP/1.1
Upgrade: websocket
Connection: Upgrade
Host: stream.watsonplatform.net
Origin: http://stream.watsonplatform.net
Sec-WebSocket-Key: Yuack3TM04/MPePJzvH8bA==
Sec-WebSocket-Version: 13
X-Watson-Authorization-Token: <TOKEN>
-----------------------
--- response header ---
HTTP/1.1 101 Switching Protocols
Date: Tue, 04 Dec 2018 12:13:57 GMT
Content-Type: application/octet-stream
Connection: upgrade
Upgrade: websocket
Sec-Websocket-Accept: 4te/E4t9+T8pBtxabmxrvPZfPfI=
x-global-transaction-id: a83c91fd1d100ff0cb2a6f50a7690694
X-DP-Watson-Tran-ID: a83c91fd1d100ff0cb2a6f50a7690694
-----------------------
send: b'\x81\x87\x9fd\xd9\xae\xd7\x01\xb5\xc2\xf0D\xe9'
Connection is already closed.
### closed ###
Process finished with exit code 0
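The send: line in the trace is the raw frame produced by the first ws.send call. As a small illustration, it can be decoded by hand following the framing rules of RFC 6455, section 5.2. The helper name below is made up, it is not part of websocket-client, and it only handles short, masked, single-fragment text frames like the one in the trace.

```python
def decode_masked_text_frame(frame: bytes) -> str:
    """Decode a short, masked, single-fragment WebSocket text frame."""
    if frame[0] != 0x81:
        raise ValueError("expected FIN=1 with opcode 0x1 (text)")
    if not frame[1] & 0x80:
        raise ValueError("client-to-server frames must be masked")
    length = frame[1] & 0x7F
    if length >= 126:
        raise ValueError("extended payload lengths are not handled in this sketch")
    mask = frame[2:6]                 # 4-byte masking key chosen by the client
    payload = frame[6:6 + length]     # masked payload bytes
    # Unmask: XOR every payload byte with the masking key, cycling over 4 bytes
    return bytes(b ^ mask[i % 4] for i, b in enumerate(payload)).decode("utf-8")


# The frame copied from the trace above
traced = b'\x81\x87\x9fd\xd9\xae\xd7\x01\xb5\xc2\xf0D\xe9'
print(decode_masked_text_frame(traced))  # prints: Hello 0
```

So the handshake succeeded and the first test message left the client; the connection was closed only after that.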
According to RFC 6455, the server should respond with 101 Switching Protocols.
The RFC's example of the server's handshake looks as follows:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
Sec-WebSocket-Protocol: chat
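For anyone who wants to check a handshake like this by hand: RFC 6455 defines Sec-WebSocket-Accept as the Base64-encoded SHA-1 hash of the client's Sec-WebSocket-Key concatenated with a fixed GUID. A minimal sketch follows; the function name is made up, and the key used is the sample key from the RFC itself, which yields the Accept value shown above.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def expected_accept(sec_websocket_key: str) -> str:
    """Return the Sec-WebSocket-Accept value a server must send for the given key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key from RFC 6455, section 1.3
print(expected_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

The same function can be applied to the Sec-WebSocket-Key from a real trace to confirm that the server's response belongs to that request.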
Additionally, when I use ws:// instead of wss://, I run into the operation timeout issue.
Update: Example with Live Speech Recognition - https://github.com/watson-developer-cloud/python-sdk/blob/master/examples/microphone-speech-to-text.py

Related

Spring WebClient / Reactor-netty broken when a server responds early?

I don't know much about Webflux / Reactor / Netty. I'm using Spring's WebClient to do all the heavy lifting. But it appears not to work correctly when a server responds back early with an error.
My understanding is that when you are POSTing data to a server, the server can respond at any time with an HTTP 4XX error. The client is supposed to stop sending the HTTP body and read that error.
I have a very simple WebClient that POSTs data to a server. It looks like this:
FileResponse resp = client.post().uri(uri)
        .contentType(MediaType.APPLICATION_OCTET_STREAM)
        .header("Authorization", "Bearer " + authorizationToken)
        .accept(MediaType.APPLICATION_JSON)
        .bodyValue(data)
        .retrieve()
        .bodyToMono(FileResponse.class)
        .block();
The body can contain a large amount of data (100+KB). Apparently the server looks at the header, validates the authorization token, and only if it's valid, reads the body. If the authorization token is not valid (expired, etc) it immediately responds with an "HTTP 401 Unauthorized" with the response body "{"message": "Invalid user/password"}" while the client is still sending the body. The server then closes the socket which results in the WebClient throwing this:
2022-08-10 15:56:03,474 WARN [reactor.netty.http.client.HttpClientConnect] (reactor-http-nio-1) [id: 0xa7b48bb8, L:/5.6.7.8:51122 - R:dubcleoa030/1.2.3.4:5443] The connection observed an error: java.io.IOException: An existing connection was forcibly closed by the remote host
at java.base/sun.nio.ch.SocketDispatcher.read0(Native Method)
at java.base/sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:43)
at java.base/sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:276)
at java.base/sun.nio.ch.IOUtil.read(IOUtil.java:233)
at java.base/sun.nio.ch.IOUtil.read(IOUtil.java:223)
at java.base/sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:358)
at deployment.bp-global.war//io.netty.buffer.PooledByteBuf.setBytes(PooledByteBuf.java:253)
at deployment.bp-global.war//io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1133)
at deployment.bp-global.war//io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:350)
at deployment.bp-global.war//io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:148)
at deployment.bp-global.war//io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714)
at deployment.bp-global.war//io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
at deployment.bp-global.war//io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
at deployment.bp-global.war//io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
at deployment.bp-global.war//io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
at deployment.bp-global.war//io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at deployment.bp-global.war//io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:834)
I've made the same request with curl, and it's handled properly. Curl sees the server's early response, stops sending the body and processes the response from the server. I've chopped out a lot of fluff from the curl output but here is the important stuff...
Trying 1.2.3.4...
* TCP_NODELAY set
* Connected to 1.2.3.4 port 5443 (#0)
> POST /api/folders/file/?path=/out HTTP/1.1
> Host: 1.2.3.4:5443
> User-Agent: curl/7.61.1
> accept-encoding: gzip
> Content-Type: application/octet-stream
> Authorization: Bearer youshallnotpass
> accept: application/json
> Content-Length: 298190
>
< HTTP/1.1 401 Unauthorized
< Server: Cleo Harmony/5.7.0.3 (Linux)
< Date: Wed, 10 Aug 2022 21:59:23 GMT
< Content-Length: 36
< Content-Language: en
< Content-Type: application/json
< Connection: keep-alive
* HTTP error before end of send, stop sending
* Closing connection 0
{"message": "Invalid user/password"}
I'm not sure if the issue is with Spring's WebClient or the underlying reactor-netty stuff. But am I crazy or does it just look broken if the server responds early? If I am correct that it's broken, any thoughts on a work-around?
Thank you!
Todd
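The client behaviour that curl shows in the question ("HTTP error before end of send, stop sending") can be sketched with plain sockets. This is only an illustration in Python against a toy local server, not how WebClient or reactor-netty work internally; every name in it (reject_after_headers, post_watching_for_early_reply, demo, the /upload path, the half-second grace period) is made up for the example.

```python
import select
import socket
import threading


def reject_after_headers(listener):
    """Toy server: read only the request headers, answer 401, then close."""
    conn, _ = listener.accept()
    with conn:
        received = b""
        while b"\r\n\r\n" not in received:
            chunk = conn.recv(4096)
            if not chunk:
                return
            received += chunk
        body = b'{"message": "Invalid user/password"}'
        conn.sendall(
            b"HTTP/1.1 401 Unauthorized\r\n"
            b"Content-Type: application/json\r\n"
            b"Content-Length: " + str(len(body)).encode() + b"\r\n"
            b"Connection: close\r\n\r\n" + body
        )


def post_watching_for_early_reply(host, port, payload, chunk_size=16384, grace=0.5):
    """Send a POST in chunks, but stop as soon as the server has already replied."""
    head = (
        "POST /upload HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "Content-Type: application/octet-stream\r\n"
        f"Content-Length: {len(payload)}\r\n\r\n"
    ).encode()
    with socket.create_connection((host, port)) as sock:
        sock.sendall(head)
        sent = 0
        timeout = grace  # give the server a moment to reject the headers
        while sent < len(payload):
            readable, _, _ = select.select([sock], [], [], timeout)
            if readable:  # a reply arrived before the body was complete
                break
            chunk = payload[sent:sent + chunk_size]
            sock.sendall(chunk)
            sent += len(chunk)
            timeout = 0  # after the first chunk, just poll between chunks
        reply = b""
        while True:  # the toy server closes the connection after replying
            data = sock.recv(4096)
            if not data:
                break
            reply += data
    return sent, reply


def demo():
    """Run the toy server on a free local port and POST a large body to it."""
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    server = threading.Thread(target=reject_after_headers, args=(listener,), daemon=True)
    server.start()
    host, port = listener.getsockname()
    try:
        return post_watching_for_early_reply(host, port, b"x" * 298190)
    finally:
        server.join(timeout=2)
        listener.close()


if __name__ == "__main__":
    sent, reply = demo()
    print(sent, reply.split(b"\r\n", 1)[0])
```

Because the client checks for readable data before writing each chunk, it never pushes body bytes at a server that has already answered, so the server can close the socket without triggering a connection reset on the client side.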

I set up a small stand-alone command-line Spring Boot program so I could test various aspects of this issue. What I found is that if I send the body from memory (byte[]), the issue occurs. If I send the body from a file resource as shown below, everything works correctly.
Our current very large product is using Spring Boot 2.3.3. My stand-alone test program gave me the ability to quickly upgrade Spring Boot to 2.7.2. Everything works correctly in Spring Boot 2.7.2. So it was definitely a bug that was fixed at some point.
Unfortunately our large project cannot be upgraded to Spring Boot 2.7.2 overnight. It will be a large effort which will require extensive testing from our QA department. So for now, as a work-around, I'll write the body payload to a temporary file so I can get WebClient to read it as a file resource which works in 2.3.3 as shown below. Just in case any other poor developer runs into this and needs a work-around, try this...
InputStreamResource resource = new InputStreamResource(new FileInputStream(tempFilename));
String resp = client.post().uri(uri)
        .contentType(MediaType.APPLICATION_OCTET_STREAM)
        .header("Authorization", "Bearer youshallnotpass")
        .accept(MediaType.APPLICATION_JSON)
        .body(BodyInserters.fromResource(resource))
        .retrieve()
        .bodyToMono(String.class)
        .block();
EDIT: I started going through the releases to determine when this was fixed. The issue appears to be fixed in Spring Boot 2.3.4 (Netty 4.1.52). I cannot find any reference to this bug in the release notes of Spring Boot or Netty, so I'm not sure where the issue was. Perhaps it was in a deeper dependency. But it was definitely fixed with Spring Boot 2.3.4.

This may have happened because of a conflict in dependencies (for example, an Azure dependency may conflict over the same package).
Update your Spring Boot version. I updated to the latest version and it works for me.

How to print the WebSocket HTTP Upgrade Request?

Reference: websocket_client_sync.cpp
Table 1.30. WebSocket HTTP Upgrade Request
GET / HTTP/1.1
Host: www.example.com
Upgrade: websocket
Connection: upgrade
Sec-WebSocket-Key: 2pGeTR0DsE4dfZs2pH+8MA==
Sec-WebSocket-Version: 13
User-Agent: Boost.Beast/216
Question: Based on the example websocket_client_sync.cpp, which function sends an HTTP Upgrade Request similar to the one shown above, and how can I print that request?
Thank you

This is a duplicate, but I can't mark it as such because this answer was never accepted¹:
boost async ws server checking client information
In short, use the overload of accept that takes a request object that you have previously read.
The linked answer has a complete live demo.
¹ answering on Stack Overflow can be pretty thankless at times
UPDATE
To the comment, I apologize for missing the client/server distinction initially. The client has a similar overload on handshake that allows you to inspect the upgrade response:
http::response<http::string_body> res;
ws.handshake(res, host, "/");
std::cout << res;
Printing e.g.
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: /wp5bsjYquNsAIhi4lKoIuDm0TY=
However, the request is not directly exposed. I suppose it's best monitored with a network packet sniffer or from the server side. If the goal is to manipulate the upgrade request, you should use a RequestDecorator.
PS: I just checked and the request decorator is applied nearly-at-the-end (although some things related to per-message-deflate might be added on later in the handshake_op). So you might be content with just supplying a decorator that inspects the request:
http::response<http::string_body> res;
ws.set_option(websocket::stream_base::decorator(
    [](http::request<http::empty_body>& req) {
        std::cout << "--- Upgrade request: " << req << std::endl;
    }));
ws.handshake(res, host, "/");
std::cout << "--- Upgrade response: " << res << std::endl;
Which prints e.g.
--- Upgrade request: GET / HTTP/1.1
Host: localhost:10000
Upgrade: websocket
Connection: upgrade
Sec-WebSocket-Key: Quyn+IEvycAhcRtlvPIS4A==
Sec-WebSocket-Version: 13
--- Upgrade response: HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: aRDrnkHhNfaPqGdsisX51rjj+fI=

Microsoft Speech API with Python requests?

I'm trying to use the requests package in Python to call the Microsoft Bing Speech Transcription API. The call works in Postman, but there I have to select the file to upload manually (Postman provides a GUI for this), and I'm not sure how that file selection maps onto the actual HTTP request (and, by extension, onto the Python requests call). Postman can convert its internal queries into code, and according to Postman the HTTP request it makes is:
POST /recognize?scenarios=smd&appid=[REDACTED]&locale=en-US&device.os=wp7&version=3.0&format=json&form=BCSSTT&instanceid=[REDACTED]&requestid=[REDACTED] HTTP/1.1
Host: speech.platform.bing.com
Authorization: [REDACTED]
Content-Type: application/x-www-form-urlencoded
Cache-Control: no-cache
Postman-Token: [REDACTED]
undefined
And the equivalent request if made through the Python requests library would be:
import requests
url = "https://speech.platform.bing.com/recognize"
querystring = {"scenarios":"smd","appid":[REDACTED],"locale":"en-US","device.os":"wp7","version":"3.0","format":"json","form":"BCSSTT","instanceid":[REDACTED],"requestid":[REDACTED]}
headers = {
    'authorization': [REDACTED],
    'content-type': "application/x-www-form-urlencoded",
    'cache-control': "no-cache",
    'postman-token': [REDACTED]
}
response = requests.request("POST", url, headers=headers, params=querystring)
print(response.text)
However note that in neither case does the generated code actually pass in the audio file to be transcribed (clearly Postman doesn't know how to display raw audio data), so I'm not sure how to add this crucial information to the request. I assume that in the case of the HTTP request code the audio stream goes in the spot displayed as "undefined". In the Python requests command, from reading the documentation it seems like the response = requests.request(...) line should be replaced by:
response = requests.request("POST", url, headers=headers, params=querystring, files={'file': open('PATH/TO/AUDIO/FILE', 'rb')})
But when I run this query I get "Request timed out (> 14000 ms)". Any idea for how I can successfully call the Microsoft Speech API through Python? Any help would be much appreciated, thanks.

Make this line your POST request:
r = requests.post(url, headers=headers, params=querystring, data=open('PATH/TO/WAV/FILE', 'rb').read())
And that should do the trick.
According to the Microsoft documentation, the audio file's binary data is the body of the POST request, so it must be sent using the data parameter of the requests library. The files parameter would wrap it in a multipart form instead.
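To see the difference between the two parameters without sending anything, requests can prepare a request and expose the body it would put on the wire. This is a small sketch; the audio bytes are a made-up stand-in for the contents of a real WAV file, and the file name is hypothetical.

```python
import requests

url = "https://speech.platform.bing.com/recognize"
audio = b"RIFF....WAVEfmt "  # stand-in bytes; a real call would read the WAV file

# data=: the bytes become the request body unchanged
raw = requests.Request("POST", url, data=audio).prepare()

# files=: the bytes are wrapped in a multipart/form-data envelope
multipart = requests.Request(
    "POST", url, files={"file": ("speech.wav", audio)}
).prepare()

print(raw.body == audio)                  # True: body is exactly the audio bytes
print(multipart.headers["Content-Type"])  # multipart/form-data with a boundary
print(multipart.body == audio)            # False: boundaries and part headers were added
```

Nothing is transmitted here, since prepare() only builds the request. With files=, the service receives multipart boundaries and part headers around the audio rather than the raw audio stream it expects.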

Ruby: Persistent HTTP client not receiving response on second request

I am trying to create an HTTP client that uses persistent connections. My code works when I send my first request and get my first response. However, when I send a second request, I never get a second response, and I am not sure why. I got the same error when I was coding in C.
Here is the code
require 'socket'
include Socket::Constants
socket = Socket.new( AF_INET, SOCK_STREAM, 0 )
sockaddr = Socket.pack_sockaddr_in( 80, 'www.google.com' )
socket.connect( sockaddr )
# This Works
socket.write( "GET / HTTP/1.0\r\n\r\n" )
results = socket.read
# This Works
socket.write( "GET / HTTP/1.0\r\n\r\n" )
# THIS DOESN'T WORK
results = socket.read
I do not want to use built-in libraries like Net::HTTP. What do I need to do to make this work?

You cannot make two HTTP/1.0 requests on the same connection unless you've told the server that you expect to do so (HTTP/1.1 connections are persistent by default). This is how HTTP persistent connections work. At a minimum, make sure to add this to your request headers:
Connection: keep-alive
Servers have differing support for persistent connections, although it's become common for servers to support basic persistent connections. Here's an SO question that asks What exactly does a “persistent connection” mean?
Start there, and you'll find what you need to know to make persistent connections work correctly. You may have to check the HTTP response headers for an indication that the server will honor your request, or you may have to check that the server didn't simply close the connection when it was finished writing the first response. On your final request through a persistent connection, you should also specify the header:
Connection: close
Also check out these resources:
IETF HTTP 1.1 specification
W3 HTTP 1.1 section 8: Persistent Connections
Safari Books Online HTTP: The Definitive Guide - Persistent Connections
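One more detail matters when doing this by hand: on a persistent connection you cannot read until end-of-file, because the server keeps the socket open. A read with no length, like socket.read in the question, waits for the connection to close. The client has to read the headers, take the Content-Length, and read exactly that many body bytes before sending the next request. Here is a minimal sketch in Python against a local test server; the handler, helper names, and response body are made up for the example, and chunked responses are not handled.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Hello(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the stdlib server keep connections alive

    def do_GET(self):
        body = b"hello\n"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def read_response(reader):
    """Read one response: status line, headers, then exactly Content-Length bytes."""
    status = reader.readline().decode("iso-8859-1").strip()
    length = 0
    while True:
        line = reader.readline().decode("iso-8859-1").strip()
        if not line:  # blank line ends the headers
            break
        name, _, value = line.partition(":")
        if name.lower() == "content-length":
            length = int(value)
    return status, reader.read(length)


def fetch_twice():
    """Send two GET requests over a single TCP connection."""
    server = HTTPServer(("127.0.0.1", 0), Hello)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    request = (
        f"GET / HTTP/1.1\r\nHost: {host}:{port}\r\n"
        "Connection: keep-alive\r\n\r\n"
    ).encode()
    results = []
    try:
        with socket.create_connection((host, port)) as sock, \
                sock.makefile("rb") as reader:
            for _ in range(2):
                sock.sendall(request)
                results.append(read_response(reader))
    finally:
        server.shutdown()
        server.server_close()
    return results


if __name__ == "__main__":
    for status, body in fetch_twice():
        print(status, body)
```

The same idea carries over to Ruby or C: parse the headers, then read a fixed number of bytes instead of reading until the peer closes the socket.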

How to completely stop the rails action from responding any kind of HTTP status?

RAILS 3.2.13. JRUBY 1.7.15. RUBY 1.9.3 as interpreter
Is there a way to stop a Rails action from responding at all? I want to avoid sending any kind of response back to an anonymous hacker. There is a constraint check before index, and if the check fails, the index method should ideally not respond at all. This is a small part of a REST API, and the goal is to keep the action from sending any HTTP status back.
Any constructive suggestions are welcome.
i.e.
def index
  # kill method, do not send any response at all. not even 500 error
end
Thank you all for your help.

Although a REST API should probably send a valid HTTP response, you can suppress any output by closing the underlying output stream. This is exposed via Rack's hijacking API (assuming your server supports it):
def index
  if env['rack.hijack?']
    env['rack.hijack'].call
    io = env['rack.hijack_io']
    io.close
  end
end
This results in an empty reply, i.e. the server just closes the connection without sending any data:
$ curl -v http://localhost:3000/
* Connected to localhost port 3000
> GET / HTTP/1.1
> User-Agent: curl/7.37.1
> Host: localhost:3000
> Accept: */*
>
* Empty reply from server
* Connection #0 to host localhost left intact
curl: (52) Empty reply from server
