Is it possible to set up CORS Anywhere on localhost? - codeigniter

I am building a web scraper as a small project (using CodeIgniter). Because of the CORS policy, the browser does not let me fetch data from some sites.
To get around that, I am using Rob Wu's CORS Anywhere: I prepend cors_url to the URL I am scraping.
Everything works fine until I hit the limit of 200 requests per hour. After the 200th request, I get HTTP status code 429 (Too Many Requests).
According to the documentation, we can deploy our own instance of server.js on Heroku. What I want instead is to set it up locally alongside my local Apache server (localhost), just to test things out first.
Some sample code:
var url = "http://example.com/";
var cors_url = "https://cors-anywhere.herokuapp.com/";
$.ajax({
method:'GET',
url : cors_url + url,
success : function(response){
//data_scraping_logic...
}
});

1. Install the latest Node.js.
2. Save the example code from the repo as cors.js (pasted below).
3. Run npm install cors-anywhere.
4. Run node cors. The proxy is now running on localhost:8080.
Sample code:
// Listen on a specific host via the HOST environment variable
var host = process.env.HOST || '0.0.0.0';
// Listen on a specific port via the PORT environment variable
var port = process.env.PORT || 8080;
var cors_proxy = require('cors-anywhere');
cors_proxy.createServer({
originWhitelist: [], // Allow all origins
// requireHeader: ['origin', 'x-requested-with'],
// removeHeaders: ['cookie', 'cookie2']
}).listen(port, host, function() {
console.log('Running CORS Anywhere on ' + host + ':' + port);
});
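With the proxy running locally, the client code from the question only needs a different prefix. The following is a minimal sketch, assuming the proxy listens on localhost:8080 as in the steps above; proxiedUrl is a hypothetical helper name for illustration, not part of CORS Anywhere.

```javascript
// Hypothetical helper: prefix a target URL with the local CORS Anywhere proxy.
// Assumes the proxy started from cors.js is listening on localhost:8080.
var LOCAL_CORS_URL = 'http://localhost:8080/';

function proxiedUrl(targetUrl, proxyBase) {
  var base = proxyBase || LOCAL_CORS_URL;
  // Make sure the proxy prefix ends with a slash before appending the target.
  if (base.charAt(base.length - 1) !== '/') {
    base += '/';
  }
  return base + targetUrl;
}

console.log(proxiedUrl('http://example.com/'));
// prints "http://localhost:8080/http://example.com/"
```

In the $.ajax call from the question, this would replace cors_url + url with proxiedUrl(url).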

Related

Unable to see http traffic from/to my NodeJS app in Charles [mac]

I am running Charles to inspect HTTP traffic between a Node.js client and a service running locally on my machine (a Mac). I can access the service, but I don't see any trace in Charles. I have tried replacing localhost with my machine's IP name, but there is still no trace. If I type the service URL into Chrome, I do see a trace. Does anyone know how to fix this?
Here is my Node.js code:
var thrift = require('thrift'); // I use Apache Thrift
var myService = require('./gen-nodejs/MyService'); // this is code generated by thrift compiler
var transport = thrift.TBufferedTransport();
var protocol = thrift.TBinaryProtocol();
var connection = thrift.createHttpConnection("localhost", 5331, {
transport : transport,
protocol : protocol,
path: '/myhandler',
});
connection.on('error', function(err) {
console.log(err);
});
// Create a client with the connection
var client = thrift.createHttpClient(myService, connection);
console.log('calling getTotalJobCount...');
client.getTotalJobCount(function(count)
{
console.log('total job count = ' + count);
});
I fixed this myself. Charles only intercepts traffic that goes through the system proxy, which is 127.0.0.1:8888 on my Mac. Here is the corrected code:
// give path to the proxy in argument to createHttpConnection
var connection = thrift.createHttpConnection('127.0.0.1', 8888, {
transport : transport,
protocol : protocol,
path: 'http://localhost:5331/myhandler', // give the actual URL you want to connect to here
});
In addition, you need to pass thrift.TBufferedTransport instead of thrift.TBufferedTransport(), and thrift.TBinaryProtocol instead of thrift.TBinaryProtocol(), so that the constructors themselves are passed rather than called.
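The same idea applies to a plain Node HTTP client: connect to the proxy and put the full target URL in the request path. Below is a minimal sketch that only builds the request options; viaProxy is a hypothetical helper, and 127.0.0.1:8888 is the proxy address mentioned above.

```javascript
// Hypothetical helper: build http.request options that send a request for
// targetUrl through an HTTP proxy (for example Charles on 127.0.0.1:8888).
function viaProxy(targetUrl, proxyHost, proxyPort) {
  var target = new URL(targetUrl);
  return {
    host: proxyHost,
    port: proxyPort,
    method: 'GET',
    // A forward proxy expects the absolute URL as the request target.
    path: target.href,
    // The Host header still names the real destination.
    headers: { Host: target.host }
  };
}

var options = viaProxy('http://localhost:5331/myhandler', '127.0.0.1', 8888);
console.log(options.path);
// prints "http://localhost:5331/myhandler"
```

The resulting object can be passed to http.request(options, callback) from Node's built-in http module.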

Request working in CURL but not in Ajax

I have a Scrapyd server running and am trying to schedule a job.
When I try the following with curl, it works fine:
curl http://XXXXX:6800/schedule.json -d project=stackoverflow -d spider=careers.stackoverflow.com -d setting=DOWNLOAD_DELAY=2 -d arg1=val1
After that, I wrote a small Angular UI for this, which makes an AJAX request to do the same thing:
var baseurl = GENERAL_CONFIG.WebApi_Base_URL[$scope.server];
var URI = baseurl +"schedule.json"; //http://XXXXX:6800/schedule.json
var headers = {'content-type': 'application/x-www-form-urlencoded'}
console.log(URI)
$http.post(URI, $scope.Filters, { headers: headers }).success(function (data, status) {
console.log(data)
}).error(function (data, status) {
console.log(status);
alert("AJAX failed!");
});
but I am getting a "No 'Access-Control-Allow-Origin' header is present on the requested resource." error.
Can anyone help me resolve this?
And why does it work with curl but not from my AJAX call?
Thanks,
This is because of a browser protection called the same-origin policy. It blocks AJAX requests to a different combination of scheme, hostname, and port number. curl has no such protection.
To get around it, you will either have to serve both the API and the client app from the same domain and port, or add the CORS header 'Access-Control-Allow-Origin' to the server's responses.
Another option is to use JSONP. It may be suitable if you only need to fetch JSON data, but it only supports GET requests, so it is not suitable for REST APIs. In Angular, use $http.jsonp for this.
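The same-origin rule described above can be sketched as a small check. This is only an illustration of the comparison, not the browser's actual implementation; the host names and ports in the example calls are made up, since the question does not show the real ones.

```javascript
// Two URLs share an origin only if scheme, hostname and port all match.
function sameOrigin(a, b) {
  var ua = new URL(a);
  var ub = new URL(b);
  return ua.protocol === ub.protocol &&
         ua.hostname === ub.hostname &&
         ua.port === ub.port;
}

// Same host, different port: treated as cross-origin.
console.log(sameOrigin('http://localhost:8000/app', 'http://localhost:6800/schedule.json')); // false
// Same scheme, host and port: same origin, so no CORS header is needed.
console.log(sameOrigin('http://localhost:8000/app', 'http://localhost:8000/api')); // true
```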

Ajax calls from node to django

I'm developing a Django system and I need to create a real-time chat service. For that I used Node.js and socket.io.
To get some information from Django into Node, I made some AJAX calls that worked very well when every address was localhost, but now that I have deployed the system to WebFaction I have started to get errors.
The Django server is at a URL like example.com and the Node server is at chat.example.com. When I make an AJAX GET call to Django, I get this error in the browser:
XMLHttpRequest cannot load http://chat.example.com/socket.io/?EIO=3&transport=polling&t=1419374305014-4. Origin http://example.com is not allowed by Access-Control-Allow-Origin.
I have probably misunderstood some concept, but I'm having a hard time figuring out which one.
The snippet where I think the problem lies is this one:
socket.on('id_change', function(data){
sessionid = data['sessionid'];
var options = {
// host must be a bare hostname, without the http:// scheme
host: 'www.example.com',
path: '/get_username/',
method: 'POST',
headers: {
'Content-Type': 'application/x-www-form-urlencoded',
// byte length, which can differ from string length for non-ASCII text
'Content-Length': Buffer.byteLength(sessionid)
}
};
var request = http.request(options, function(response) {
response.on('data', function(msg){
console.log('Received something');
if(response.statusCode == 200){
//do something here
}
});
});
request.write(sessionid);
request.end();
});
I managed to serve socket.io.js and make connections to the Node server, so that part of the setup is OK.
Thank you very much!
You're running into a cross-origin resource sharing (CORS) problem. See this post for more information: How does Access-Control-Allow-Origin header work? Note that the header has to be sent by the server that receives the cross-origin request, and its value is the origin of the page making the request.
I am NOT a Django coder at all, but from this reference page (https://docs.djangoproject.com/en/1.7/ref/request-response/#setting-header-fields) it looks like you need to do something like this wherever you generate responses:
response = HttpResponse()
response['Access-Control-Allow-Origin'] = 'http://chat.example.com'
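For comparison, the same header can be set from a Node server. This is a minimal sketch using only the built-in http module; the allowed-origin list and the corsHeaders helper are assumptions for illustration and are not part of socket.io's API.

```javascript
var http = require('http');

// Assumption: pages served from this origin are allowed to call the server.
var ALLOWED_ORIGINS = ['http://example.com'];

// Hypothetical helper: return the CORS headers for a request origin,
// or an empty object if that origin is not allowed.
function corsHeaders(origin) {
  if (ALLOWED_ORIGINS.indexOf(origin) === -1) {
    return {};
  }
  return {
    'Access-Control-Allow-Origin': origin,
    // Tell caches that the response depends on the Origin request header.
    'Vary': 'Origin'
  };
}

var server = http.createServer(function (req, res) {
  var headers = corsHeaders(req.headers.origin);
  headers['Content-Type'] = 'text/plain';
  res.writeHead(200, headers);
  res.end('ok');
});

// server.listen(3000); // uncomment to run; the port is arbitrary
```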

How to use secure websockets with sock js

I am using SockJS-client. The SockJS constructor takes a relative URL, as in:
var ws= new SockJS('/spring-websocket-test/sockjs/echo', undefined,{protocols_whitelist: [transport]});
Where do we indicate that wss:// should be used instead of ws://? If I try an absolute URL, I get this error:
XMLHttpRequest cannot load ws://localhost:8080/appname/app. Cross origin requests are only supported for HTTP.
_ws_onclose. wasClean: false code: 1002 reason: Can't connect to server
I am not sure why I am getting this error. Is any similar configuration needed in the Spring server implementation?
I went through the SockJS client code, and SockJS takes care of it: if the URL is HTTPS it uses WSS, otherwise it uses WS.
var that = this;
var url = trans_url + '/websocket';
if (url.slice(0, 5) === 'https') {
url = 'wss' + url.slice(5);
} else {
url = 'ws' + url.slice(4);
}
There is no need to pass ws or wss (in fact, not even http/https) to the SockJS constructor; a relative URL is sufficient. The SockJS client library takes care of it.
One more surprising thing I found: it appends "/websocket" to the end of the URL. This gave me a clue as to why I was not able to connect with a Java client using the Jetty websocket-client APIs.
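The scheme mapping in the client snippet above can be pulled into a standalone function to see what it produces. This is a sketch that mirrors that snippet; toWebSocketUrl is a hypothetical name and the input URLs are made up.

```javascript
// Mirrors the SockJS snippet above: append the '/websocket' suffix and
// swap the http/https scheme for ws/wss.
function toWebSocketUrl(transUrl) {
  var url = transUrl + '/websocket';
  if (url.slice(0, 5) === 'https') {
    return 'wss' + url.slice(5);
  }
  return 'ws' + url.slice(4);
}

console.log(toWebSocketUrl('https://localhost:8443/sockjs/echo'));
// prints "wss://localhost:8443/sockjs/echo/websocket"
console.log(toWebSocketUrl('http://localhost:8080/sockjs/echo'));
// prints "ws://localhost:8080/sockjs/echo/websocket"
```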
Try this code.
var ws= new SockJS('http://localhost:8080/sockjs/echo', undefined,{protocols_whitelist: [transport]});

Securing (ssl) a nodejs/express web service and calling it cross-domain with $.ajax()

I've set up a nodejs app like this:
var express = require('./../../../Library/node_modules/express');
var https = require('https');
var fs = require('fs'); // needed for readFileSync below
var app = express();
var httpsOptions = {
key: fs.readFileSync('privatekey.pem'),
cert: fs.readFileSync('certificate.pem') // SELF-SIGNED CERTIFICATE
};
var server = https.createServer(httpsOptions);
app.get('/myservice', function(req, res) {
...
});
server.listen(8443);
I have opened the 8443 port in my server for inbound requests.
From a browser, if I open https://mydomain:8443/myservice I get the untrusted connection warning, which seems logical.
Then, from a test.html that I run on my local computer (to test the cross-domain issue), I do something like this:
function testService(){
var data = { some JSON };
$.ajax({
url: 'https://myserver:8443/myservice',
dataType: "jsonp",
data: data,
jsonpCallback: "_mycallback",
cache: false,
timeout: 5000,
success: function(data) {
alert(data);
},
error: function(jqXHR, textStatus, errorThrown) {
alert('Error: ' + textStatus + " " + errorThrown);
}
});
}
My problem is that this request times out, and I don't think it even reaches the service.
Any idea why?
Once this request does reach the server, hopefully thanks to your kind responses, what will happen with the browser warning for the untrusted certificate? Will it stop $.ajax() from silently calling the server and receiving the response?
The reason that your client's JSONP request times out could be practically anything. Because of the way JSONP works, you can only ever know whether the request fails or succeeds, and when it fails it will always show up as a timeout. That said, it's pretty much guaranteed to fail if you haven't saved the server's self-signed cert on the client. To do so, make sure that you tell your browser to always trust the server's certificate. In Firefox you can also go to Preferences->Encryption->View Certificates->Your Certificates->Import... to add the certificate to Firefox. Other browsers should have a similar interface.
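JSONP loads the URL through a script tag, so the server has to answer with JavaScript that calls the callback named by the client. Here is a minimal sketch of that response body; jsonpBody is a hypothetical helper for illustration (Express also offers res.jsonp for the same purpose).

```javascript
// Hypothetical helper: wrap a JSON payload in a call to the client's callback.
function jsonpBody(callbackName, payload) {
  // Only accept plain identifiers so the callback name cannot inject script.
  if (!/^[A-Za-z_$][\w$]*$/.test(callbackName)) {
    throw new Error('invalid callback name');
  }
  return callbackName + '(' + JSON.stringify(payload) + ');';
}

console.log(jsonpBody('_mycallback', { status: 'ok' }));
// prints: _mycallback({"status":"ok"});
```

If the route handler for /myservice returns plain JSON instead of a body like this, the callback is never invoked and the JSONP request reports an error even though the server was reached.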
To solve a potential cross domain issue, try adding the following to your app.get('/myservice'):
res.header("Access-Control-Allow-Origin", "*");
res.header("Access-Control-Allow-Methods", "GET");
Additionally, different browsers handle these things differently. In my experience Firefox is sometimes more lenient than Chrome, but I would definitely test in both.
To test the HTTPS issue, first I would try just setting up a regular expressjs server (no encryption) and not using https:// in your request. If the request then succeeds you know that the problem is the SSL. If so, make sure that when your browser gives a security warning you enable any options allowing you to permanently add that site to your trusted hosts.
Also, I believe that this line:
var server = https.createServer(httpsOptions);
should be:
var server = https.createServer(httpsOptions, app);
(From: http://expressjs.com/api.html#app.listen)
You may also want to add the following code below var server = https.createServer(httpsOptions); for debugging (so that you can easily see if your server receives the request):
app.get('*', function(req, res, next) {
// You *should* also be able to add the response headers here, although I haven't tried.
console.log('Request received', req, res);
next();
})
Hopefully that helps!
