I'm trying to scrape some data using the mechanize library in Ruby, and I first have to get past a "Terms and Conditions" page. To that end I'm clicking an "I agree" button:
require 'mechanize'
agent = Mechanize.new
agent.agent.http.verify_mode = OpenSSL::SSL::VERIFY_NONE
agent.get('https://apply.hobartcity.com.au/Common/Common/terms.aspx')
form = agent.page.form_with(:id => 'aspnetForm')
button = form.button_with(:name => 'ctl00$ctMain$BtnAgree')
page = form.submit(button)
But when I run the above code I get this error on the form submission step:
Uncaught exception: unsupported content-encoding: gzip,gzip
When I access that second page with a browser the response headers are
HTTP/1.1 200 OK
Cache-Control: private
Content-Type: text/html; charset=utf-8
X-UA-Compatible: IE=9,10,11
Date: Tue, 16 Feb 2016 22:44:27 GMT
Cteonnt-Length: 16529
Content-Encoding: gzip
Content-Length: 5436
I assume mechanize can work with gzip content encoding, so I'm not sure where the error is coming from. Any ideas what's going on here?
Ruby 2.1.7, mechanize 2.7.4.
I didn't figure out what the actual cause of the problem was, but I was able to work around it by overriding the content-encoding:
agent.content_encoding_hooks << lambda { |httpagent, uri, response, body_io|
response['Content-Encoding'] = 'gzip'
}
agent.submit(form, button)
No more error.
I'm trying to implement a download link so that users can download a record as a .txt file.
At first it was a simple <a> tag:
download
I could download the file from the server in .txt format, but I found that the request does not carry the Auth header. So I tried to fetch it with an HTTP GET instead.
service.js
getCdrFile(url) {
return this.http.get<any>(`${this.env.service}/service/api/downloadFile?` + url);
}
component.js
downloadFile(url) {
this.service.getCdrFile(url).subscribe(
data => {
console.log(data);
},
error => {
console.log(error);
}
);
}
I can successfully call the API with the auth header, but nothing happens after I click the download button, although the txt data is displayed in the "Response" tab of Chrome developer tools. Also, console.log(data); inside my HTTP request prints nothing.
Is there any way I can download the file? Thanks!
(and here is my response detail)
# GENERAL
Request URL: http://localhost:8080/service/api/downloadFile?fileType=daily&trsDate=20190918
Request Method: GET
Status Code: 200
Remote Address: 127.0.0.1:8080
Referrer Policy: no-referrer-when-downgrade
# RESPONSE HEADER
Accept-Ranges: bytes
Access-Control-Allow-Origin: *
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Connection: keep-alive
Content-Disposition: attachment; filename=20190918.txt
Content-Type: application/json
Date: Wed, 02 Oct 2019 03:51:01 GMT
Expires: 0
Pragma: no-cache
Server: nginx/1.15.2
Transfer-Encoding: chunked
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
You can request the file as a Blob, create a blob URL from it, and trigger the download on the fly.
Service:
Modify your service to receive a blob response
getImage() {
return this.httpClient.get(
your_image_link,
{
responseType: 'blob', // <-- add this
headers: {your_headers}
}
);
}
Component:
On click of your link on the page, call your service to get the response blob of your file
Create a blob URL with the URL.createObjectURL method
Create a dummy anchor element and assign it the blob URL and the name of the file to download
Trigger a click event on the anchor element
Remove the blob URL from the browser with the URL.revokeObjectURL method
downloadImage() {
this.service.getImage().subscribe(img => {
const url = URL.createObjectURL(img);
const a = document.createElement('a');
a.download = "filename.txt";
a.href = url;
a.click();
URL.revokeObjectURL(url);
});
}
Stackblitz: https://stackblitz.com/edit/angular-kp3saz
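The snippet above hardcodes the download name. Since the response in the question carries a Content-Disposition header, the name could be taken from there instead. The following is only a sketch: filenameFromContentDisposition is a hypothetical helper, it handles just the plain and quoted filename= forms, and in Angular the header is only readable if the request is made with observe: 'response' (and, for cross-origin requests, if the server exposes the header via Access-Control-Expose-Headers).

```javascript
// Hypothetical helper: pull the filename out of a Content-Disposition value.
// Only the plain and quoted "filename=" forms are handled; the RFC 5987
// "filename*=" form is ignored and falls through to the fallback.
function filenameFromContentDisposition(header, fallback) {
  if (!header) {
    return fallback;
  }
  const match = /filename\s*=\s*(?:"([^"]*)"|([^;\s]+))/i.exec(header);
  if (!match) {
    return fallback;
  }
  const name = match[1] !== undefined ? match[1] : match[2];
  return name || fallback;
}

// Header value taken from the response in the question.
console.log(
  filenameFromContentDisposition('attachment; filename=20190918.txt', 'download.txt')
);
```

The result could then be assigned to a.download in place of the fixed "filename.txt".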
You have two ways to download the file from the server.
1. After getting the response from the HTTP call, convert it to base64, create a dummy anchor tag, and download it.
2. Modify the backend so that it returns the file as a download response.
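A rough sketch of the first option, assuming the file content has already been received as text (in Angular that would likely mean requesting it with responseType: 'text'). The helper name toBase64DataUrl is hypothetical; TextEncoder and btoa are standard in browsers and in recent Node versions.

```javascript
// Hypothetical helper: encode text as a base64 data: URL that can be
// assigned to the href of a temporary anchor element.
function toBase64DataUrl(text, mimeType) {
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes
  let binary = '';
  for (const byte of bytes) {
    // btoa expects a "binary string" with one character per byte.
    binary += String.fromCharCode(byte);
  }
  return 'data:' + mimeType + ';base64,' + btoa(binary);
}

console.log(toBase64DataUrl('hi', 'text/plain'));

// In the browser the result would be used roughly like this:
//   const a = document.createElement('a');
//   a.href = toBase64DataUrl(data, 'text/plain');
//   a.download = '20190918.txt';
//   a.click();
```

For large files a Blob URL is probably the better fit, since a data URL holds the whole file in one string.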
I'm trying to make an HTTPS request using the Typhoeus::Request object and I can't get it working.
The code I'm running is something like this:
url = "https://some.server.com/"
req_opts = {
:method => :get,
:headers => {
"Content-Type"=>"application/json",
"Accept"=>"application/json"
},
:params=>{},
:params_encoding=>nil,
:timeout=>0,
:ssl_verifypeer=>true,
:ssl_verifyhost=>2,
:sslcert=>nil,
:sslkey=>nil,
:verbose=>true
}
request = Typhoeus::Request.new(url, req_opts)
response = request.run
The response I'm getting is this:
HTTP/1.1 302 Found
Location: https://some.server.com:443/
Date: Sat, 27 Apr 2019 02:25:05 GMT
Content-Length: 5
Content-Type: text/plain; charset=utf-8
Why is this happening?
It's hard to know because your example URL is not reachable, but I see two things. First, you are not passing an SSL cert or key, and you probably don't need to set the SSL options at all; why are you setting them? Second, 302 indicates a redirect, so you can try following redirects.
Try the following options:
req_opts = {
:method => :get,
:headers => {
"Content-Type"=>"application/json",
"Accept"=>"application/json"
},
:params=>{},
:params_encoding=>nil,
:timeout=>0,
:followlocation => true,
:ssl_verifypeer=>false,
:ssl_verifyhost=>0,
:verbose=>true
}
See the following sections for more info
https://github.com/typhoeus/typhoeus#following-redirections
https://github.com/typhoeus/typhoeus#ssl
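Before enabling followlocation it may be worth checking whether the Location header points anywhere new. In the response from the question it differs from the requested URL only by the explicit default port. A minimal sketch in JavaScript (redirectsToSameUrl is a hypothetical helper and uses the standard URL parser, nothing from Typhoeus):

```javascript
// Hypothetical check: does a redirect's Location normalise to the URL
// that was originally requested?
function redirectsToSameUrl(requestUrl, locationHeader) {
  // Resolve relative Location values against the request URL.
  const target = new URL(locationHeader, requestUrl);
  // The URL parser drops default ports, so ":443" disappears for https.
  return target.href === new URL(requestUrl).href;
}

// Values taken from the request and response in the question.
console.log(redirectsToSameUrl('https://some.server.com/', 'https://some.server.com:443/'));
```

If the check reports that both URLs are the same, following the redirect might simply loop, which would point at the server or a proxy in front of it rather than at the client options.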
I'm trying to connect to Parse.com 's REST-API via NSURLConnection to track AppOpened metadata.
I get 200 OK back from the API and the headers are the same as the cURL headers, but my API calls do not show up in the data browser on Parse.com. Is NSURLConnection doing something silly I don't know about? The API response is the same, but one request shows up while the other doesn't.
NSLog output:
<NSHTTPURLResponse: 0x7ff5eb331ca0> { URL: https://api.parse.com/1/events/AppOpened } { status code: 200, headers {
"Access-Control-Allow-Methods" = "*";
"Access-Control-Allow-Origin" = "*";
Connection = "keep-alive";
"Content-Length" = 3;
"Content-Type" = "application/json; charset=utf-8";
Date = "Sun, 04 Jan 2015 22:42:54 GMT";
Server = "nginx/1.6.0";
"X-Parse-Platform" = G1;
"X-Runtime" = "0.019842";
} }
cURL output:
HTTP/1.1 200 OK
Access-Control-Allow-Methods: *
Access-Control-Allow-Origin: *
Content-Type: application/json; charset=utf-8
Date: Sun, 04 Jan 2015 23:03:51 GMT
Server: nginx/1.6.0
X-Parse-Platform: G1
X-Runtime: 0.012325
Content-Length: 3
Connection: keep-alive
{}
It's the same output. What am I doing wrong? Does anyone have experience with this?
Turns out Parse was showing funny API keys the moment I copied them out of the cURL example they provide in their lovely docs. Don't know whose analytics I screwed over but I'm terribly sorry and it wasn't my fault!
Always copy your API keys out of [Your-Parse-App-Name]->Settings->Keys
It probably was just a stupid glitch that happened on the Server.
Only in Opera, I receive "JSON.parse: Unterminated string" when going to http://www.underfashion.nl/babys
The string is indeed unterminated: it does not end with "]}.
In the other browsers (IE, FF, Chrome) it works fine and the entire string is received.
The string is very long: 217529 chars. Could that be the problem? The other browsers receive 220374 chars, ending with "]}
I have tried three AJAX approaches to get the data, all resulting in the same strings:
The first:
var value = (function () {
var val = null;
$.ajax({'async': false, 'global': false, 'url': uf_urlsearch,
'success': function (data) { val = data;
alert("Data Loaded: " + data.slice(-100) + "<br/>Numofchars: " + data.length);
}
});
return val;
})();
The second:
$.get(uf_urlsearch, function(data){
alert("Data Loaded: " + data.slice(-100));
});
The third:
uf_XMLHttpProductlist.onreadystatechange=function(){
if (uf_XMLHttpProductlist.readyState==4 && uf_XMLHttpProductlist.status==200){
//Get the returned menu-items in Responsetext, expected to look like this:
...
};//if (uf_XMLHttp.readyState==4 && uf_XMLHttp.status==200){
};//uf_XMLHttp.onreadystatechange=function()
uf_urlsearch = "http://www.underfashion.nl/php/get_productlist.php?"+uf_PHPsearchstring;
uf_XMLHttpProductlist.open("GET",uf_urlsearch,true);
uf_XMLHttpProductlist.send();
};
Anyone see any solution?
Best regards,
To inspect the network activity, go to Opera Menu -> Tools -> Advanced -> Opera Dragonfly. Then enter the URL in your address bar.
In the Network tab you can see the list of resources. Select the XHR button, and you will see the get_productlist.php resource. For what it's worth, I didn't have any issue with your Web site. The HTTP request was:
GET /php/get_productlist.php?afdeling=babys HTTP/1.1
User-Agent: Opera/9.80 (Macintosh; Intel Mac OS X 10.7.4; U; fr) Presto/2.10.289 Version/12.00
Host: www.underfashion.nl
Accept-Language: fr,en;q=0.9,en-US;q=0.8,ja;q=0.7,pt;q=0.6,de;q=0.5,zh-CN;q=0.4,es;q=0.3,it;q=0.2,nl;q=0.1,sv;q=0.1,nb;q=0.1,da;q=0.1,fi;q=0.1,zh-TW;q=0.1,ko;q=0.1,pl;q=0.1,pt-PT;q=0.1,ru;q=0.1,ar;q=0.1,cs;q=0.1,hu;q=0.1,tr;q=0.1,ca;q=0.1,el;q=0.1,he;q=0.1,hr;q=0.1,ro;q=0.1,sk;q=0.1,th;q=0.1,uk;q=0.1
Accept-Encoding: gzip, deflate
Referer: http://www.underfashion.nl/babys
Cookie: JSESSIONID=9ABC3B0357487E01298EBC7A02B5FDCD; __atuvc=1%7C25; __utma=137714676.906129982.1340200451.1340200451.1340200451.1; __utmb=137714676.1.10.1340200451; __utmc=137714676; __utmz=137714676.1340200451.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utmv=
Connection: Keep-Alive
X-Requested-With: XMLHttpRequest
Accept: */*
Now the HTTP Response is interesting:
HTTP/1.1 200 OK
Date: Wed, 20 Jun 2012 13:54:11 GMT
Server: Apache/2.2.14 (Ubuntu)
X-Powered-By: PHP/5.3.2-1ubuntu4.15
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 11469
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/html
Then comes the JSON content. Do you see what is wrong in the HTTP response above? Yup:
Content-Type: text/html
The MIME type for JSON is defined in RFC 4627. Please send JSON content with the following MIME type:
Content-Type: application/json
That said, you say you still have the issue (I don't) on some specific URIs. Could you share which ones?
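To narrow down which URIs are affected, a small diagnostic along these lines might help. It is only a sketch: parseJsonBody is a hypothetical helper that reports whether the Content-Type is a JSON type and, when parsing fails, how long the body was and how it ended, so a truncated response can be told apart from a mislabeled one.

```javascript
// Hypothetical helper: try to parse a response body as JSON and report
// enough detail to see whether the body arrived truncated.
function parseJsonBody(contentType, body) {
  const isJsonType = /^application\/json\b/i.test(contentType || '');
  try {
    return { ok: true, isJsonType: isJsonType, value: JSON.parse(body) };
  } catch (err) {
    return {
      ok: false,
      isJsonType: isJsonType,
      length: body.length,   // compare with the length other browsers report
      tail: body.slice(-20)  // a complete product list should end with "]}
    };
  }
}

// A deliberately truncated body, labeled text/html as in the response above.
console.log(parseJsonBody('text/html', '{"items":["a","b'));
```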
My goal is to upload a file via AJAX.
I use this JavaScript library: http://valums.com/wp-content/uploads/ajax-upload/demo-jquery.htm
There is a link on my page like the "Upload" button on the example page.
When I click it, the "Open file" dialog opens.
I choose a file and the form is submitted automatically.
This is my JavaScript code.
var upload_btn = $('#upload-opml');
new AjaxUpload(upload_btn.attr('id'), {
action: upload_btn.attr('href'),
name: 'opml',
onComplete: function (file, response) {
//
}
});
This is server code in Ruby on Rails.
def upload_opml
render :text => 'hello'
end
Headers, taken from Firebug.
>> Response headers
Server nginx/0.7.62
Date Wed, 09 Jun 2010 19:03:28 GMT
Content-Type text/html; charset=utf-8
Connection keep-alive
Etag "5d41402abc4b2a76b9719d911017c592"
X-Runtime 18
Content-Length 5
Cache-Control private, max-age=0, must-revalidate
Set-Cookie _RssWebApp_session=BAh7CDoPc2Vzc2lvbl9pZCIlMzJhMTQ0ZWZhOGM3YmIwODFhZmFmNjkwYTI1YWQ2ZjQ6EF9jc3JmX3Rva2VuIjEvZHVzdm1NOVlMTUF6bEw3cGRFT2I3RzZvcVJZUU42bCtMNS9PVVYrNHdBPToMdXNlcl9pZGkG--13f1950a9530591881404fbfab7b1246f98f0d81; path=/; HttpOnly
>> Request headers
Host readbox.cz
User-Agent Mozilla/5.0 (X11; U; Linux i686; ru; rv:1.9.2) Gecko/20100115 Firefox/3.6
Accept text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language ru,en-us;q=0.7,en;q=0.3
Accept-Encoding gzip,deflate
Accept-Charset windows-1251,utf-8;q=0.7,*;q=0.7
Keep-Alive 115
Connection keep-alive
Referer http://readbox.cz/view
Cookie _RssWebApp_session=BAh7CDoPc2Vzc2lvbl9pZCIlMzJhMTQ0ZWZhOGM3YmIwODFhZmFmNjkwYTI1YWQ2ZjQ6EF9jc3JmX3Rva2VuIjEvZHVzdm1NOVlMTUF6bEw3cGRFT2I3RzZvcVJZUU42bCtMNS9PVVYrNHdBPToMdXNlcl9pZGkG--13f1950a9530591881404fbfab7b1246f98f0d81; login=1; APE_Cookie=%7B%22frequency%22%3A11%7D; show-tsl=0
But in Firefox I get an error
Script http://readbox.cz (document.domain=http://readbox.cz) was denied permission to get property HTMLDocument.readyState from http://readbox.cz (document.domain has not been set).
[Break on this error] if (doc.readyState && doc.readyState != 'complete') {
In Google Chrome
Unsafe JavaScript attempt to access frame with URL http://readbox.cz/subscriptions/upload_opml from frame with URL http://readbox.cz/view#/posts/all. Domains, protocols and ports must match.
/javascripts/ajaxupload.js?1276107673:574
Uncaught TypeError: Cannot read property 'readyState' of undefined
Domain readbox.info points to 127.0.0.1. It's for development.
I had the same problem and fixed it by editing the ajaxupload library, with this commit:
https://github.com/felipelalli/ajax-upload/commit/9307f5eb6ded1ec63eac828a7ef4b8187acb9617
I already sent a pull request to the author.
I had this problem when I was using the sandbox developer environment (OpenSocial for Orkut). The fix just checks whether "doc" is undefined. The upload works fine, but the callback no longer gets a response (the response is undefined).
I don't know exactly what the cause is, but I think it is some kind of limitation of the dev environment.
If you want to download the fix, please check it out: https://github.com/felipelalli/ajax-upload/commits/3.9.1
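The idea of the check can be sketched as follows. This is not the code from the linked commit; safeReadyState and isStillLoading are hypothetical names, and the second function only mirrors the condition shown in the Firefox error above (doc.readyState && doc.readyState != 'complete').

```javascript
// Hypothetical guard: return the frame document's readyState, or
// undefined when the document itself is not available.
function safeReadyState(doc) {
  if (typeof doc === 'undefined' || doc === null) {
    return undefined;
  }
  return doc.readyState;
}

// Mirrors the library's "still loading" condition without throwing
// when doc is undefined.
function isStillLoading(doc) {
  const state = safeReadyState(doc);
  return typeof state === 'string' && state !== 'complete';
}

console.log(isStillLoading(undefined));
console.log(isStillLoading({ readyState: 'loading' }));
```

With a guard like this, the "Cannot read property 'readyState' of undefined" error from Chrome would not be raised, at the cost of the callback receiving no response, as described above.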