Issue on using Yahoo! Web Services and APIs (YQL)

I am using the Yahoo API and Web Services.
For example:
http://query.yahooapis.com/v1/public/yql?q=select symbol,DaysLow,DaysHigh,PreviousClose from yahoo.finance.quotes where symbol in ("INDUSINDB.NS,YESBANK.NS,CANBK.NS,AXISBANK.NS,SBIN.NS,KOTAKBANK.NS,HDFCBANK.NS,BANKBAROD.NS,UNIONBANK.NS,BANKINDIA.NS,ICICIBANK.NS,PNB.NS")&diagnostics=false&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys
http://query.yahooapis.com/v1/public/yql?q=select symbol,DaysLow,DaysHigh,PreviousClose from yahoo.finance.quotes where symbol in ("ACC.NS,AMBUJACEM.NS,ASIANPAIN.NS,AXISBANK.NS,BAJAJAUTO.NS,BANKBAROD.NS,BHARTIART.NS,BHEL.NS,BPCL.NS,CAIRN.NS,CIPLA.NS,COALINDIA.NS,DLF.NS,DRREDDY.NS,GAIL.NS,GRASIM.NS,HCLTECH.NS,HDFC.NS,HDFCBANK.NS,HEROHONDA.NS,HINDALCO.NS,HINDUNILV.NS,ICICIBANK.NS,IDFC.NS,INFY.NS,ITC.NS,JINDALSTE.NS,JPASSOCIA.NS,KOTAKBANK.NS,LT.NS,M%26M.NS,MARUTI.NS,NTPC.NS,ONGC.NS,PNB.NS,POWERGRID.NS,RANBAXY.NS,RELIANCE.NS,RELINFRA.NS,SAIL.NS,SBIN.NS,SESAGOA.NS,SIEMENS.NS,STER.NS,SUNPHARMA.NS,TATAMOTOR.NS,TATAPOWER.NS,TATASTEEL.NS,TCS.NS,WIPRO.NS")&diagnostics=false&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys
These web services return XML, but sometimes they do not return any results.
The browser shows:
This XML file does not appear to have any style information associated with it. The document tree is shown below.
Can anyone please help me with this?

First: the message "This XML file does not appear to have any style information associated with it. ..." does not prevent the results from being displayed. It seems that there is a lot of traffic and the query takes too long.
YQL data tables are just conversions of CSV files, so you can query the underlying CSV directly. Moreover, when there is a lot of traffic, the YQL data tables are often down, whereas the CSV files are almost up to date.
You can get the same data (as CSV instead of XML) with the following query:
http://download.finance.yahoo.com/d/quotes.csv?f=smp&s=INDUSINDB.NS,YESBANK.NS,CANBK.NS,AXISBANK.NS,SBIN.NS,KOTAKBANK.NS,HDFCBANK.NS,BANKBAROD.NS,UNIONBANK.NS,BANKINDIA.NS,ICICIBANK.NS,PNB.NS
CSV files are more reliable (they are the direct source of the information rather than a conversion) and faster.
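As a rough illustration of the CSV approach, here is a minimal Python sketch that builds the same kind of URL and parses a response body. The function names are made up for this example, the exact quoting of the CSV rows is an assumption, and the reading of the format codes (s = symbol, m = day's range, p = previous close) is inferred from the query above rather than taken from documentation. No network request is made.

```python
import csv
import io
from urllib.parse import urlencode

# Endpoint taken from the query above.
BASE_URL = "http://download.finance.yahoo.com/d/quotes.csv"


def build_quotes_url(symbols, fields="smp"):
    """Build the CSV quotes URL for a list of ticker symbols."""
    query = {"f": fields, "s": ",".join(symbols)}
    # Keep commas literal; characters such as '&' (M&M.NS) are percent-encoded.
    return BASE_URL + "?" + urlencode(query, safe=",")


def parse_quotes(csv_text):
    """Parse the CSV body into a list of rows (lists of strings)."""
    return [row for row in csv.reader(io.StringIO(csv_text)) if row]


def symbols_without_data(rows):
    """Return symbols whose data fields contain no digits (for example all N/A)."""
    return [
        row[0]
        for row in rows
        if len(row) > 1
        and not any(ch.isdigit() for field in row[1:] for ch in field)
    ]
```

A caller would fetch the URL with whatever HTTP client it prefers, pass the body to parse_quotes, and use symbols_without_data to spot tickers for which nothing was returned.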

The snippet you have provided returns N/A for all fields. In general, the Yahoo APIs seem to have serious limitations for bourses outside the West. I have experimented with YQL as well as the REST APIs and am unable to access data for Indian stocks.
Try substituting RIL.BO in most of the examples here: http://www.gummy-stuff.org/Yahoo-data.htm
They work well for GOOG, YHOO and AAPL. For Indian stocks like RIL.BO, all I get is N/A. Your query repeatedly returns:
INDUSINDB.NS N/A - N/A N/A
YESBANK.NS N/A - N/A N/A
CANBK.NS N/A - N/A N/A
AXISBANK.NS N/A - N/A N/A
SBIN.NS N/A - N/A N/A
KOTAKBANK.NS N/A - N/A N/A
HDFCBANK.NS N/A - N/A N/A
BANKBAROD.NS N/A - N/A N/A
UNIONBANK.NS N/A - N/A N/A
BANKINDIA.NS N/A - N/A N/A
ICICIBANK.NS N/A - N/A N/A
PNB.NS N/A - N/A N/A
Pity! Yahoo! doesn't appear to be serious about its APIs, or this could be so useful. The documentation is not helpful either.

Related

Slow Ingestion of Network Packets

I am using a packet capture and analysis tool named Cisco Joy to generate network flows.
Here is the link: https://github.com/cisco/joy
Joy uses a configuration file to capture packets on a network interface and writes JSON files to an output directory.
I have configured Cisco Joy with AF_PACKET to generate the network flows.
I have been replaying network packets with tcpreplay on a virtual network interface at a speed of 3 Gbps, but Joy is not receiving the packets at the same speed.
Actual: 450889 packets (397930525 bytes) sent in 1.06 seconds
Rated: 374307598.1 Bps, 2994.46 Mbps, 424122.22 pps
Flows: 12701 flows, 11947.01 fps, 450298 flow packets, 591 non-flow
Statistics for network device: vth0
Successful packets: 450889
Failed packets: 0
Truncated packets: 0
Retried packets (ENOBUFS): 0
Retried packets (EAGAIN): 0
Packets Received from Cisco Joy: 260850
So tcpreplay sent about 450k packets, but Cisco Joy received only around 260k.
I have changed the size of the buffer in which the packets are captured, but that did not resolve the problem. Does anyone have any clue about this?

How to make a circuit-breaker in Istio?

I am trying to configure a circuit breaker in Istio. This is the yaml.
trafficPolicy:
  connectionPool:
    http:
      http1MaxPendingRequests: 1
      maxRequestsPerConnection: 1
    tcp:
      maxConnections: 1
  outlierDetection:
    baseEjectionTime: 1m
    consecutive5xxErrors: 1
    interval: 1s
I have a list of thread groups in JMeter that will be continuously hitting the service associated with the above circuit breaker. Upon receiving an error response, it should make the service unavailable for 1 minute. But that is not happening.
Am I misunderstanding how it works? Is there any way to achieve that?
I think you are confusing outlier detection with a circuit breaker based on connectionPool settings.
The settings you are applying in connectionPool configure a circuit breaker: if any of the limits is breached, the circuit trips and new requests get an immediate 503 response from the Istio proxy. That is, the new requests are not sent to the application.
However, the proxy will accept new requests as soon as it can (when limits are not breached by accepting the new request).
There is no such thing as circuit breaking for 1 minute in this context.
Outlier detection is different. It works by ejecting a particular error-prone pod from the load balancing pool.
Suppose you have 4 replica pods running for your deployment, and one of the pods is returning 5xx errors. (The 503 errors sent by the proxy, as in the connection pool breach case, are not counted here; this count is of your application's errors.) In this case Istio waits for consecutive5xxErrors (1 in your case), and once this is breached it removes that pod from load balancing for baseEjectionTime the first time.
That is, it waits for baseEjectionTime (1m in your case). Until then, no new requests are sent to the error-prone pod. After 1 minute it adds the pod back to the load balancing pool. But if this pod breaches consecutive5xxErrors (1 in your case) again, Istio removes it from load balancing for 2 x baseEjectionTime, which would be 2 minutes in your case.
This keeps going until your pod goes back to returning non-5xx responses.
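The growing ejection time described above can be sketched in a few lines of Python. This is a simplified model of the behavior as described in this answer, not Istio's or Envoy's actual implementation; the function name and the use of seconds are assumptions made for illustration, and any upper limit on the ejection time is ignored.

```python
def ejection_seconds(base_ejection_seconds, times_ejected):
    """Ejection duration for a pod that is being ejected for the Nth time.

    Simplified model: the duration grows linearly with the number of
    ejections (1x the base the first time, 2x the second time, and so on).
    """
    if times_ejected < 1:
        raise ValueError("times_ejected must be at least 1")
    return base_ejection_seconds * times_ejected


# baseEjectionTime: 1m, as in the question.
for n in (1, 2, 3):
    print(n, ejection_seconds(60, n))
```

Under this model a pod that keeps failing stays out of the pool for longer each time, which is different from the connection pool limits, where requests are accepted again as soon as the limits allow.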
With the information you provided, I think the problem might be that the parameter maxEjectionPercent is not set in your DestinationRule:
maxEjectionPercent - Maximum % of hosts in the load balancing pool for the upstream service that can be ejected. Defaults to 10%.
Since it defaults to 10%, only 10% of your deployment can be ejected by the circuit breaker. For testing purposes you might try setting it to 100%, similar to what the documentation does to demonstrate this:
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: httpbin
spec:
  host: httpbin
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 1
      http:
        http1MaxPendingRequests: 1
        maxRequestsPerConnection: 1
    outlierDetection:
      consecutive5xxErrors: 1
      interval: 1s
      baseEjectionTime: 3m
      maxEjectionPercent: 100  # 👈
I have tested the example in the docs and it works fine for me.
Another possible issue might be sidecar injection. Please verify that your pod actually has a sidecar (you should see 2 out of 2 containers ready inside the pod):
$ kgp
NAME                             READY   STATUS    RESTARTS   AGE
fortio-deploy-576dbdfbc4-9crcf   2/2     Running   0          46m
httpbin-74fb669cc6-mg9rh         2/2     Running   0          48m

H2O H2OServerError: HTTP 500 Server Error when training model

When trying to train a DRF classifier in h2o (version 3.20.0.5), I get the error "H2OServerError: HTTP 500 Server Error" with no further explanation.
---------------------------------------------------------------------------
H2OServerError Traceback (most recent call last)
<ipython-input-44-f52d1cb4b77a> in <module>()
4 training_frame=train_u, validation_frame=val_u,
5 weights_column='weight',
----> 6 max_runtime_secs=max_train_time_hrs*60*60)
7
8
/home/mapr/python-virtual-envs/ml1c/venv/lib/python2.7/site-packages/h2o/estimators/estimator_base.pyc in train(self, x, y, training_frame, offset_column, fold_column, weights_column, validation_frame, max_runtime_secs, ignored_columns, model_id, verbose)
224 rest_ver = parms.pop("_rest_version") if "_rest_version" in parms else 3
225
--> 226 model_builder_json = h2o.api("POST /%d/ModelBuilders/%s" % (rest_ver, self.algo), data=parms)
227 model = H2OJob(model_builder_json, job_type=(self.algo + " Model Build"))
228
/home/mapr/python-virtual-envs/ml1c/venv/lib/python2.7/site-packages/h2o/h2o.pyc in api(endpoint, data, json, filename, save_to)
101 # type checks are performed in H2OConnection class
102 _check_connection()
--> 103 return h2oconn.request(endpoint, data=data, json=json, filename=filename, save_to=save_to)
104
105
/home/mapr/python-virtual-envs/ml1c/venv/lib/python2.7/site-packages/h2o/backend/connection.pyc in request(self, endpoint, data, json, filename, save_to)
400 auth=self._auth, verify=self._verify_ssl_cert, proxies=self._proxies)
401 self._log_end_transaction(start_time, resp)
--> 402 return self._process_response(resp, save_to)
403
404 except (requests.exceptions.ConnectionError, requests.exceptions.HTTPError) as e:
/home/mapr/python-virtual-envs/ml1c/venv/lib/python2.7/site-packages/h2o/backend/connection.pyc in _process_response(response, save_to)
728 # Note that it is possible to receive valid H2OErrorV3 object in this case, however it merely means the server
729 # did not provide the correct status code.
--> 730 raise H2OServerError("HTTP %d %s:\n%r" % (status_code, response.reason, data))
731
732
H2OServerError: HTTP 500 Server Error:
Server error java.lang.NullPointerException:
Error: Caught exception: java.lang.NullPointerException
Request: None
The code snippet in question is shown below:
max_train_time_hrs = 8
drf_proc.train(
    x=train_features, y=train_response,
    training_frame=train_u, validation_frame=val_u,
    weights_column='weight',
    max_runtime_secs=max_train_time_hrs*60*60)
The output from running the h2o.init() command looks like
Checking whether there is an H2O instance running at http://172.18.4.62:54321. connected.
Warning: Your H2O cluster version is too old (7 months and 24 days)! Please download and install the latest version from http://h2o.ai/download/
H2O cluster uptime: 06 secs
H2O cluster timezone: Pacific/Honolulu
H2O data parsing timezone: UTC
H2O cluster version: 3.20.0.5
H2O cluster version age: 7 months and 24 days !!!
H2O cluster name: H2O_88021
H2O cluster total nodes: 4
H2O cluster free memory: 15.34 Gb
H2O cluster total cores: 8
H2O cluster allowed cores: 8
H2O cluster status: accepting new members, healthy
H2O connection url: http://172.18.4.62:54321
H2O connection proxy: None
H2O internal security: False
H2O API Extensions: AutoML, XGBoost, Algos, Core V3, Core V4
Python version: 2.7.12 fin
While I realize that there is a warning that the version of h2o I am using is "too old", the version of the h2o python package I am using and the cluster I am connecting to still match and this cannot be upgraded due to other h2o applications that access this cluster and expect a certain version (all of these applications appear to have no problem running on the cluster). Meanwhile, any web browser is unable to connect to the H2O connection url.
Any ideas about what could be going on here or debugging steps that could be looked into?
15GB of memory might not be enough for a training process you expect to last 8hrs. (Aside: I'd recommend using early stopping, rather than, or as well as, max_runtime_secs.)
As a debugging step, I would recommend watching in the Flow interface (point your browser to port 54321 - see the connection URL in your h2o.init() output). Especially watch how memory usage is rising over time.
(Sometimes a "500" error just means it has gone unstable, and lack of memory is a common trigger.)
If you are getting the error immediately, that is less likely to be the problem (unless you have a huge dataset).
In that case I'd try to narrow down if a particular column or data row could be causing the problem. E.g.
Experiment 1: first half of columns in train_features
Experiment 2: second half of columns in train_features
Experiment 3: first half of rows in train_u
Experiment 4: second half of rows in train_u
Experiment 5/6 (if still no luck): the same for val_u
If one experiment of a pair crashes but the other doesn't, then repeat the experiment on the crashing half.
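The halving experiments above amount to a bisection, which could be sketched like this. It is a minimal sketch under stated assumptions: `crashes` is a hypothetical callback, not part of h2o, that would run training on the given subset of columns (or rows) and return True if it fails, for example by wrapping drf_proc.train in a try/except for H2OServerError.

```python
def split_halves(items):
    """Split a list into its first and second half."""
    mid = len(items) // 2
    return items[:mid], items[mid:]


def narrow_down(items, crashes):
    """Repeatedly halve `items`, keeping a half for which `crashes` returns True.

    `crashes` is a caller-supplied (hypothetical) function that tries the
    experiment on a subset and reports whether it failed.
    """
    current = list(items)
    while len(current) > 1:
        first, second = split_halves(current)
        if crashes(first):
            current = first
        elif crashes(second):
            current = second
        else:
            # Neither half fails on its own, so the problem needs items
            # from both halves; stop and return what is left.
            break
    return current
```

For example, calling narrow_down(train_features, crashes) would return a single column name if one column alone triggers the error, or a larger group if the failure only appears with a combination.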

Intermittent DynamoDB DAX errors: NoRouteException during cluster refresh

Via CloudFormation, I have a setup including DynamoDB tables, DAX, VPC, Lambdas (living in VPC), Security Groups (allowing access to port 8111), and so on.
Everything works, except when it doesn't.
I can access DAX from my VPC'd Lambdas 99% of the time. Except occasionally they get NoRouteException errors... seemingly randomly. Here's the output from CloudWatch for a single Lambda function doing the exact same thing each time (a DAX get). Notice how it works, fails, and then works again:
/aws/lambda/BigOnion_accountGet START RequestId: 2b732899-f380-11e7-a650-cbfe0f7dfb3d Version: $LATEST
/aws/lambda/BigOnion_accountGet END RequestId: 2b732899-f380-11e7-a650-cbfe0f7dfb3d
/aws/lambda/BigOnion_accountGet REPORT RequestId: 2b732899-f380-11e7-a650-cbfe0f7dfb3d Duration: 58.24 ms Billed Duration: 100 ms Memory Size: 768 MB Max Memory Used: 48 MB
/aws/lambda/BigOnion_accountGet START RequestId: 3b63a928-f380-11e7-a116-5bb37bb69bee Version: $LATEST
/aws/lambda/BigOnion_accountGet END RequestId: 3b63a928-f380-11e7-a116-5bb37bb69bee
/aws/lambda/BigOnion_accountGet REPORT RequestId: 3b63a928-f380-11e7-a116-5bb37bb69bee Duration: 35.01 ms Billed Duration: 100 ms Memory Size: 768 MB Max Memory Used: 48 MB
/aws/lambda/BigOnion_accountGet START RequestId: 4b7fa7f2-f380-11e7-a0c8-513a66a11e7a Version: $LATEST
/aws/lambda/BigOnion_accountGet 2018-01-07T07:56:40.643Z 3b63a928-f380-11e7-a116-5bb37bb69bee caught exception during cluster refresh: { Error: NoRouteException: not able to resolve address
at DaxClientError (/var/task/index.js:545:5)
at AutoconfSource._resolveAddr (/var/task/index.js:18400:23)
at _pull (/var/task/index.js:18421:20)
at _pullFrom.then.catch (/var/task/index.js:18462:18)
time: 1515311800643,
code: 'NoRouteException',
retryable: true,
requestId: null,
statusCode: -1,
_tubeInvalid: false,
waitForRecoveryBeforeRetrying: false }
/aws/lambda/BigOnion_accountGet 2018-01-07T07:56:40.682Z 3b63a928-f380-11e7-a116-5bb37bb69bee Error: NoRouteException: not able to resolve address
at DaxClientError (/var/task/index.js:545:5)
at AutoconfSource._resolveAddr (/var/task/index.js:18400:23)
at _pull (/var/task/index.js:18421:20)
at _pullFrom.then.catch (/var/task/index.js:18462:18)
/aws/lambda/BigOnion_accountGet END RequestId: 4b7fa7f2-f380-11e7-a0c8-513a66a11e7a
/aws/lambda/BigOnion_accountGet REPORT RequestId: 4b7fa7f2-f380-11e7-a0c8-513a66a11e7a Duration: 121.24 ms Billed Duration: 200 ms Memory Size: 768 MB Max Memory Used: 48 MB
/aws/lambda/BigOnion_accountGet START RequestId: 5b951673-f380-11e7-9818-f1effc29edd5 Version: $LATEST
/aws/lambda/BigOnion_accountGet END RequestId: 5b951673-f380-11e7-9818-f1effc29edd5
/aws/lambda/BigOnion_accountGet REPORT RequestId: 5b951673-f380-11e7-9818-f1effc29edd5 Duration: 39.42 ms Billed Duration: 100 ms Memory Size: 768 MB Max Memory Used: 48 MB
/aws/lambda/BigOnion_siteCreate START RequestId: 0ec60080-f380-11e7-afea-a95d25c6e53f Version: $LATEST
/aws/lambda/BigOnion_siteCreate END RequestId: 0ec60080-f380-11e7-afea-a95d25c6e53f
/aws/lambda/BigOnion_siteCreate REPORT RequestId: 0ec60080-f380-11e7-afea-a95d25c6e53f Duration: 3.48 ms Billed Duration: 100 ms Memory Size: 768 MB Max Memory Used: 48 MB
Any ideas what it could be?
It's presumably not the VPC and security access, as 9/10 times access is perfectly fine. I have a wide range of CIDR IPs, so I don't think it's anything related to ENI provisioning... but what else?
The only hint I have is the initial error which states "caught exception during cluster refresh". What exactly is a "cluster refresh" and how could it lead to these failures?
A "cluster refresh" is a background process used by the DAX Client to ensure that its knowledge of the cluster membership state somewhat matches reality, as the DAX client is responsible for routing requests to the appropriate node in the cluster.
Normally a failure on refresh is not an issue because the cluster state rarely changes (and thus the existing state can be reused), but on startup, the client "blocks" to get an initial membership list. If that fails, the client can't proceed, as it doesn't know which node can handle which requests.
There can be a slight delay creating the VPC-connected ENI during a Lambda cold-start, which means the client cannot reach the cluster (hence, "No route to host") during initialization. Once the Lambda container is running it shouldn't be an issue (you might still get the exception in the logs if there's a network hiccup, but it shouldn't affect anything).
If it only happens for you during a cold-start, retrying after a slight delay should be able to work around it.
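A retry with a short delay could look roughly like the following. It is written in Python purely for illustration, although the Lambda in the question runs Node.js, and it does not use any DAX client API: `operation` is a hypothetical callable standing in for whatever creates the client or issues the first request, and the attempt count and delay are arbitrary assumptions.

```python
import time


def call_with_retry(operation, attempts=3, delay_seconds=0.2,
                    retryable=(Exception,), sleep=time.sleep):
    """Call `operation`, retrying after a growing delay if it raises.

    `operation` is a hypothetical zero-argument callable. `sleep` can be
    replaced in tests so that no real waiting happens.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retryable as error:
            last_error = error
            if attempt < attempts:
                # Wait a little longer after each failed attempt.
                sleep(delay_seconds * attempt)
    raise last_error
```

In practice one would narrow `retryable` to the error type that corresponds to NoRouteException, so that unrelated failures are not retried.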

Seeing a lot of "stream error: stream ID 1221; PROTOCOL_ERROR" since 23rd March from a single Linode DC

We've had a lot of these errors from only one of our DCs, which runs the same code as the others globally.
For example:
2017/03/31 13:22:03 Error sending request to BigQuery (attempt 1 of 3): Post https://www.googleapis.com/bigquery/v2/projects/coull-delta/datasets/demand/tables/table_name/insertAll?alt=json: stream error: stream ID 1221; PROTOCOL_ERROR
It uses the BigQuery Go package from Google - https://godoc.org/google.golang.org/api
Does anyone have any ideas, or has anyone seen this before? As I said, exactly the same code is running in other places with no issue.
