403 Forbidden error in Firefox only, works in Chrome and Safari

I have a Firefox quicksearch bookmark that runs a Maxmind query. This worked until recently. I type 'ip 82.176.230.15' (for example) into the URL bar and it queries Maxmind to retrieve the location of the IP:
http://www.maxmind.com/app/locate_demo_ip?ips=82.176.230.15
Within the past week, for reasons unknown, I now get a 403/Forbidden error when I try to access Maxmind.
"You don't have permission to access /app/locate_demo_ip on this server"
Strangely, the same URL is accessible in Chrome and Safari. I can also access the same URL with Firefox, Chrome, or Safari on my Mac.
I've deleted all cookies, disabled all add-ons, and still can't get it to work. Any idea what could be happening? I know the 403 has to come from the server, so I don't know why it would work in other browsers. And it's been going on for days, so it's definitely not a transient glitch on their server.

Get an HTTP debugger like Firebug or Fiddler (not sure Fiddler works with Firefox out of the box, but it probably will if you set it up right).
Look at the difference between using your quicksearch bookmark and just typing the URL. The server can return 403 whenever it likes; see if there's any difference between the two requests, and what it is.
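If setting up Firebug or Fiddler is a hassle, you can also replay the request outside the browser entirely. A minimal sketch with Python's requests library (any HTTP client will do; the status you get back depends on what the server currently dislikes):
import requests

# Replay the quicksearch request and print exactly what was sent, so you
# can diff these headers against the ones your browser sends.
r = requests.get('http://www.maxmind.com/app/locate_demo_ip?ips=82.176.230.15')
print(r.status_code)
print(dict(r.request.headers))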

I recently had the same issue and was able to fix it.
In my case the problem was in the headers that Firefox sent.
Specifically, it was this header:
"User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:100.0) Gecko/20100101 Firefox/100.0"
What makes the website refuse the connection is the "(X11; Ubuntu; Linux x86_64; rv:100.0)" part of the string, and I have no idea why.
I found a workable solution: you can change Firefox's settings so this header mimics another browser (Chrome or Safari), which can make sites with this problem work.
Here is how to do it:
Type about:config into the URL bar and press Enter.
Create a new string entry with the key general.useragent.override and set its value to Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.115 Safari/537.36. This is the string Google Chrome sends as its User-Agent header, probably to prevent exactly this kind of issue.
Save the setting and reload your page; it should work now.
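If you want to confirm that the User-Agent really is the trigger before overriding it browser-wide, a quick sketch with Python's requests library (assuming the server still behaves as described in this answer):
import requests

URL = 'http://www.maxmind.com/app/locate_demo_ip?ips=82.176.230.15'
# The stock Firefox-on-Ubuntu string reported above to trigger the 403.
FIREFOX_UA = ('Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:100.0) '
              'Gecko/20100101 Firefox/100.0')
# The Chrome-style string suggested for general.useragent.override.
CHROME_UA = ('Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 '
             '(KHTML, like Gecko) Chrome/102.0.5005.115 Safari/537.36')

for name, ua in [('Firefox UA', FIREFOX_UA), ('Chrome UA', CHROME_UA)]:
    r = requests.get(URL, headers={'User-Agent': ua})
    print(name, '->', r.status_code)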

Related

How to use Nokogiri in Ruby to parse a link containing a # character

I use Nokogiri in Ruby to parse a link like this:
require 'nokogiri'
require 'open-uri'
link = 'http://vnreview.vn/danh-gia-di-dong#cur=2'
doc = Nokogiri::HTML(open(link, 'User-Agent' => 'Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.64 Safari/537.31').read, nil, 'UTF-8')
but Nokogiri returns the source of http://vnreview.vn/danh-gia-di-dong (without the fragment).
How can I parse the link with #cur=1, #cur=2, and so on?
The fragment is not sent to the server with the HTTP request; that is, if you open http://www.example.com/#fragment in a browser, the following request is made:
GET / HTTP/1.1
Host: example.com
Then, after receiving the response, the browser appends the fragment to the URL and performs some client-side actions (for example, scrolling to the element with id="fragment", or executing JavaScript callbacks).
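You can see the same split programmatically; a small illustration with Python's urllib.parse (URL libraries in other languages behave the same way):
from urllib.parse import urlsplit

parts = urlsplit('http://vnreview.vn/danh-gia-di-dong#cur=2')
print(parts.path)      # /danh-gia-di-dong  (the part the server sees)
print(parts.fragment)  # cur=2  (kept client-side, never sent over the wire)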
If the page content differs based on the fragment, that's done via JavaScript. Nokogiri is not capable of running JavaScript, so you need some other tool, like selenium-webdriver or capybara-webkit; see the sketch after this paragraph.
Another option is to inspect the AJAX requests on the page you're trying to parse; you'll probably find JSON with the data you need, and you can then download that JSON directly. It's also possible the content is already on the page and merely hidden via CSS (like tabs in Twitter Bootstrap).
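For the browser-rendering route, here is a minimal sketch with Selenium's Python bindings (the selenium-webdriver gem mentioned above works the same way in Ruby; this assumes Chrome and a recent Selenium are installed):
from selenium import webdriver

driver = webdriver.Chrome()
# The browser executes the page's JavaScript, so whatever content the
# #cur=2 fragment selects is present in the rendered DOM.
driver.get('http://vnreview.vn/danh-gia-di-dong#cur=2')
html = driver.page_source  # rendered HTML; hand this to your parser
driver.quit()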

Google crawler does not apply the #! to _escaped_fragment_ mapping in an AJAX application

I have a single-page application that is supposed to use #! (hash bang) for navigation. I have now read Google's specification on Making AJAX Applications Crawlable. How can I test that my application works in the required way?
I entered my application in the Google Plus debugger, e.g. http://www.mysite.org/de#!foo=bar. However, Apache's access log tells me that the Google crawler does not translate #! to _escaped_fragment_, hence the debugger still retrieves /de without the hash bang:
66.249.81.165 - - [06/Mar/2014:15:54:06 +0100] "GET /de HTTP/1.1" 200 177381 "Mozilla/5.0 (compatible; X11; Linux x86_64; Google-StructuredDataTestingTool; +http://www.google.com/webmasters/tools/richsnippets)"
(Note: still GET /de, with no _escaped_fragment_ parameter.) I'd expect Google to retrieve something like this instead:
... "GET /de?_escaped_fragment_=foo=bar HTTP/1.1" ...
(As far as I know, the parameter is _escaped_fragment_= with the = at the end.)
Have you tried adding:
<meta name="fragment" content="!" />
to your HTML head?
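With that meta tag in place, the crawler is supposed to request the _escaped_fragment_ form of the URL, which you can then verify in the access log. For reference, a rough Python sketch of the #! mapping as the specification describes it (the fragment after #! moves into a query parameter, with %, #, & and + percent-encoded):
def escaped_fragment_url(url):
    # Rough sketch of Google's #! -> _escaped_fragment_ mapping.
    if '#!' not in url:
        return url  # no hash bang, nothing to map
    base, _, fragment = url.partition('#!')
    for ch in ('%', '#', '&', '+'):  # '%' first, to avoid double-encoding
        fragment = fragment.replace(ch, '%{:02X}'.format(ord(ch)))
    sep = '&' if '?' in base else '?'
    return base + sep + '_escaped_fragment_=' + fragment

print(escaped_fragment_url('http://www.mysite.org/de#!foo=bar'))
# http://www.mysite.org/de?_escaped_fragment_=foo=bar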

Omit display of HTTP requests by URL or other means

When using Fiddler for web debugging with Visual Studio, the vast majority of requests appear to be Visual Studio keepalives, which have nothing to do with development of the website.
I just discovered the Filters tab, which includes a "Show only if URL contains" option, but I don't see anything like "Do not show if URL contains".
The traffic in question resembles:
GET /67e56dbd9660475b992bdb4884bf024c/arterySignalR/poll?transport=longPolling&connectionToken=AQAAANCMnd8BFdERjHoAwE%2FCl%2BsBAAAAA9mo0FfMdkuV%2FOrook6XLgAAAAACAAAAAAADZgAAwAAAABAAAACdwdngu4Q3YaxNPSSSB6SaAAAAAASAAACgAAAAEAAAAEpHLB83IL2dS4l5v3LvZ4woAAAAPAHEqYMxK%2Fwwk%2Be%2FEq3MMrbOM4ao8Nhip4toaFxOxM0ARXitnQCueRQAAADELXsi%2FlcBeN%2BcFxQKtcMb7Yvd3A%3D%3D&messageId=d-B39A7C95-E4%2C0%7CE7%2C4%7CE8%2C0&requestUrl=http%3A%2F%2Flocalhost%3A56602%2FReticleDatabase%3Fsubmit%3DSearch%26process%3D%26device%3D%26lev_no%3D999%26xadj%3Dtrue%26xadj%3Dfalse%26xmag1%3Dtrue%26xmag1%3Dfalse&browserName=Internet+Explorer&tid=8&callback=jQuery18206701631324945791_1391540298842&_=1391540397878 HTTP/1.1
Accept: application/javascript, */*;q=0.8
Referer: http://localhost:56602/ReticleDatabase?submit=Search&process=&device=&lev_no=999&xadj=true&xadj=false&xmag1=true&xmag1=false
Accept-Language: en-US
User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)
Accept-Encoding: gzip, deflate
Host: localhost:61010
Connection: Keep-Alive
How can I filter out (i.e., not display) this junk traffic in Fiddler?
Fiddler offers many ways to filter data. The most powerful mechanism is FiddlerScript. Click Rules > Customize Rules, scroll to OnBeforeRequest, and add:
if (oSession.uriContains("SignalR/poll?")) { oSession["ui-hide"] = "FiddlerScript hides signalR"; }
Save the file.
(It's not entirely clear that SignalR's longPolling requests really have "nothing to do with development of the website", but if you don't want to see them, they're easily hidden.)
Incidentally, the next build of Fiddler's Filters tab will include a Hide URLs containing option. Thanks for the suggestion.

Why does requests library fail on this URL?

I have a URL. When I try to access it programmatically, the backend server fails (I don't run the server):
import requests
r = requests.get('http://www.courts.wa.gov/index.cfm?fa=controller.managefiles&filePath=Opinions&fileName=875146.pdf')
r.status_code # 200
print r.content
When I look at the content, it's an error page, though the status code is 200. If you click the link, it'll work in your browser -- you'll get a PDF -- which is what I expect in r.content. So it works in my browser, but fails in Requests.
To diagnose, I'm trying to eliminate differences between my browser and the Requests library. So far I've:
Disabled Javascript
Disabled (and deleted) cookies
Set the User-Agent to be the same in each
But I can't get the thing to work properly in Requests, or make it fail in my browser by disabling something. Can somebody with a better grasp of browser magic help me diagnose and solve this?
Does the request work in Chrome? If so, you can open the web inspector and right-click the request to copy it as a curl command. Then you'll have access to all the headers, params, and request body, which you can play around with to see which are triggering the failure you're seeing with the requests library.
You're probably running into a server that discriminates based on User-Agent. This works:
import requests

S = requests.Session()
# Present a browser-like User-Agent; the default python-requests value is apparently rejected.
S.headers.update({'User-Agent': 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)'})
r = S.get('http://www.courts.wa.gov/index.cfm?fa=controller.managefiles&filePath=Opinions&fileName=875146.pdf')
with open('dl.pdf', 'wb') as f:
    f.write(r.content)

Why does IE8's user agent return 'Opera'?

My code at http://www.mgxvideo.com/mgxcopy-dev/get_browser.php returns Opera when I run IE8. My source is:
<?php
$browser = get_browser(null, true);
echo $browser['browser'];
?>
It doesn't. The get_browser() function makes educated (but ill-informed) guesses based on the User-Agent string and your browser capabilities (browscap) file. That file is likely outdated, probably because it was created before IE8 was released; update it to a current version.
The real IE8 user-agent string looks something like this:
Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)
You should use a lower-level tool, such as a packet trace, server logging, or a header dump, to see what is actually being sent.
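For example, a minimal header dump; this is a Python sketch for illustration (in PHP, echoing $_SERVER['HTTP_USER_AGENT'] does the same job):
from http.server import BaseHTTPRequestHandler, HTTPServer

class EchoUserAgent(BaseHTTPRequestHandler):
    # Tiny server that echoes back the User-Agent header it receives,
    # so you can see exactly what each browser really sends.
    def do_GET(self):
        ua = self.headers.get('User-Agent', '(none)')
        self.send_response(200)
        self.send_header('Content-Type', 'text/plain')
        self.end_headers()
        self.wfile.write(ua.encode('utf-8'))

HTTPServer(('localhost', 8000), EchoUserAgent).serve_forever()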
