proxy.pac - exception for images - firefox

I'm a web developer and I use Squid as a proxy, which I have entered in Firefox as the proxy server.
So when I enter http://www.example.com in Firefox, I see the site from my local machine, because Squid is configured accordingly.
Now the problem is that some of our customers have GBs of images, and it's a pain to load them all onto my machine. So basically I want to use my offline webpage, but load the images from the live server, so I don't have a broken site without images.
In order to do this I've tried to create a proxy.pac and configured it this way:
function FindProxyForURL(url, host) {
    if (shExpMatch(url, "*.jpg")) {
        return "DIRECT";
    } else {
        return "PROXY 192.168.178.31:3128; DIRECT";
    }
}
Unfortunately it doesn't really work. What am I doing wrong, and how can I achieve my goal?

According to the Mozilla document on PAC files:
The path and query components of https:// URLs are stripped. In Chrome, you can disable this by setting PacHttpsUrlStrippingEnabled to false; in Firefox, the preference is network.proxy.autoconfig_url.include_path.
What this means is that when you enter a URL such as https://www.example.com/image.jpg, what gets passed to the PAC script is just https://www.example.com. As a result, you're never going to enter the first branch of your if statement.
In Firefox, you can change this by going to the about:config page and setting network.proxy.autoconfig_url.include_path to true.
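Once that preference is set, the full URL (including the path) is passed to FindProxyForURL, so your original pattern should match. Purely as a sketch (the extra image extensions are an assumption about what formats your customers use):
function FindProxyForURL(url, host) {
    // With network.proxy.autoconfig_url.include_path = true, `url` includes the
    // path, so the shell-expression match can see the file extension.
    if (shExpMatch(url, "*.jpg") || shExpMatch(url, "*.jpeg") ||
        shExpMatch(url, "*.png") || shExpMatch(url, "*.gif")) {
        return "DIRECT"; // fetch images from the live server
    }
    return "PROXY 192.168.178.31:3128; DIRECT"; // everything else through Squid
}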

Related

Create a multi-website proxy with `http-proxy`

I'm using node-http-proxy to run a proxy website. I would like to proxy any target website that the user chooses, similarly to what's done by https://www.proxysite.com/, https://www.croxyproxy.com/ or https://hide.me/en/proxy.
How would one achieve this with node-http-proxy?
Idea #1: use a ?target= query param.
My first naive idea was to add a query param to the proxy, so that the proxy can read it and redirect.
Code-wise, it would more or less look like this (assuming we deploy it to http://myproxy.com):
// Note: the original snippet didn't show its imports; these assume Next.js API
// routes and the next-http-proxy-middleware package.
import type { NextApiRequest, NextApiResponse } from 'next';
import httpProxyMiddleware from 'next-http-proxy-middleware';

const BASE_URL = 'https://myproxy.com';

// handler is the unique handler of all routes.
async function handler(
  req: NextApiRequest,
  res: NextApiResponse
): Promise<void> {
  try {
    const url = new URL(req.url ?? '', BASE_URL); // For example: `https://myproxy.com?target=https://google.com`
    const targetURLStr = url.searchParams.get('target'); // Get the `?target=` query param.
    return httpProxyMiddleware(req, res, {
      changeOrigin: true,
      target: targetURLStr,
    });
  } catch (err) {
    res.status(500).json({ error: (err as Error).message });
  }
}
Problem: If I deploy this code to myproxy.com and load https://myproxy.com?target=https://google.com, then google.com is loaded, but:
if I click a link to Google Images, it loads https://myproxy.com/images instead of https://myproxy.com?target=https://google.com/images (see also: URL as query param in proxy, how to navigate?)
Idea #2: use cookies
The second idea is to read the ?target= query param as above, store its origin in a cookie, and proxy all resources to the origin stored in the cookie.
So, for example, the user wants to access https://google.com/a/b?c=d via the proxy. The flow is:
go to https://myproxy.com?target=${encodeURIComponent('https://google.com/a/b?c=d')}
the proxy reads the ?target= query param and stores the origin (https://google.com) in a cookie
the proxy redirects to https://myproxy.com/a/b?c=d (307 redirect)
the proxy sees a new request and, since the cookie is set, proxies it through node-http-proxy using the cookie's target.
Code-wise, it would look like: https://gist.github.com/throwaway34241/de8a623c1925ce0acd9d75ff10746275
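The linked gist is the actual code; purely to illustrate the flow above, a minimal sketch (same Next.js / next-http-proxy-middleware assumptions as in Idea #1, with a hypothetical proxy-target cookie name) might look like:
import type { NextApiRequest, NextApiResponse } from 'next';
import httpProxyMiddleware from 'next-http-proxy-middleware';

const BASE_URL = 'https://myproxy.com';

async function handler(req: NextApiRequest, res: NextApiResponse): Promise<void> {
  const url = new URL(req.url ?? '', BASE_URL);
  const targetParam = url.searchParams.get('target');

  if (targetParam) {
    // Steps 2-3: remember the target's origin in a cookie, then redirect to
    // the same path/query on the proxy host (307 preserves the method).
    const target = new URL(targetParam);
    res.setHeader('Set-Cookie', `proxy-target=${encodeURIComponent(target.origin)}; Path=/`);
    res.redirect(307, target.pathname + target.search);
    return;
  }

  // Step 4: no ?target= param, so proxy the request to the origin stored in the cookie.
  const cookieTarget = req.cookies['proxy-target'];
  if (!cookieTarget) {
    res.status(400).json({ error: 'No target set' });
    return;
  }
  return httpProxyMiddleware(req, res, {
    changeOrigin: true,
    target: decodeURIComponent(cookieTarget),
  });
}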
Problem: This works very well, but only for one target site at a time. If I open one browser tab with https://myproxy.com?target=https://google.com, and another tab with https://myproxy.com?target=https://facebook.com, then:
first it'll set the cookie to https://google.com, and I can navigate in the 1st tab correctly
then, when I go to the 2nd tab (without closing the 1st one), it'll set the cookie to https://facebook.com, and I can navigate Facebook on the 2nd tab correctly
but then, if I go back to the first tab, it'll proxy Google's resources to facebook.com, because the cookie has been overwritten.
I'm a bit out of ideas, and am wondering how those generic proxy websites do it. Ideally, I would not want to parse the HTML of the target website.
The idea of a proxy is to intercept the client's requests (either by port or by backend API), extract the URLs of the requested resources, modify them, make those requests itself to the origin servers, then modify the responses and send them back to the client.
Your first approach does all of this except modifying the responses before sending them back.
One way to do this is to rewrite all links in the resources returned by the proxy so they point to your own address, and only then send them back to the client.
Another way is to wrap the target site in a frame, as most web proxy sites do, and have a script crawl the page and replace all links.
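As an illustration of the first approach, here is a rough sketch on top of node-http-proxy's selfHandleResponse option: buffer the proxied HTML and rewrite links so they point back at the proxy. The target/proxy hostnames and the naive string replace are placeholders, and compressed responses are side-stepped by requesting an uncompressed body:
import http from 'http';
import httpProxy from 'http-proxy';

const TARGET = 'https://www.example.com'; // placeholder target site
const PROXY_ORIGIN = 'https://myproxy.com'; // placeholder proxy address

const proxy = httpProxy.createProxyServer({
  changeOrigin: true,
  selfHandleResponse: true, // we will write the (rewritten) response ourselves
  headers: { 'accept-encoding': 'identity' }, // avoid having to gunzip before rewriting
});

proxy.on('proxyRes', (proxyRes, req, res) => {
  const chunks = [];
  proxyRes.on('data', (chunk) => chunks.push(chunk));
  proxyRes.on('end', () => {
    // Naive rewrite: point absolute links at the proxy instead of the target.
    const body = Buffer.concat(chunks).toString('utf8').split(TARGET).join(PROXY_ORIGIN);
    const headers = { ...proxyRes.headers };
    delete headers['content-length']; // length changed after rewriting
    res.writeHead(proxyRes.statusCode || 502, headers);
    res.end(body);
  });
});

http.createServer((req, res) => proxy.web(req, res, { target: TARGET })).listen(3000);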
There is a small problem, though: JavaScript-based requests are mostly hardcoded in the scripts, and rewriting them is not an easy job.
Your second approach sounds as if it would work better, but that's just a feeling; I can't say anything concrete. Implement a tab activity checker so you can switch the cookie to the active tab's target; see the how-to-tell-if-browser-tab-is-active discussion for that.
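For the tab activity part, the Page Visibility API is the usual building block; a hypothetical client-side helper (the /set-target endpoint is made up, not part of node-http-proxy) could look like:
// Hypothetical helper injected into proxied pages: whenever this tab becomes
// active again, tell the proxy which target this tab belongs to, so the
// cookie is switched back before further requests are made.
const myTarget = 'https://google.com'; // this tab's target

document.addEventListener('visibilitychange', () => {
  if (document.visibilityState === 'visible') {
    fetch('/set-target?target=' + encodeURIComponent(myTarget), { credentials: 'include' });
  }
});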

Laravel forcing HTTP for assets

This is a little bit strange, because most of the questions here want to force HTTPS.
While learning AWS Elastic Beanstalk, I am hosting a Laravel site there. Everything is fine, except that none of my JavaScript and CSS files are being loaded.
I have referenced them in the Blade view as:
<script src="{{asset('assets/backend/plugins/jquery/jquery.min.js')}}"></script>
The first thing I tried was looking into the file/folder permissions in the root of my project by SSHing into the EC2 instance. That didn't work, even when I set the permissions on the public folder to 777.
Later I found out that the site's main page URL was HTTP while all the asset URLs were HTTPS.
I don't want to get into SSL certificates just yet, if possible.
Is there any way I can force my asset URLs to be HTTP only?
Please forgive my naivety. Any help would be appreciated.
This usually happens if your site is, for example, behind a reverse proxy: the URL facade trusts your local instance behind the proxy, which might not use SSL, and that can be misleading/wrong.
This is probably the case on an EC2 instance, as SSL termination happens at the load balancer / HA proxy.
I usually add the following to my AppServiceProvider.php:
public function boot()
{
    // Requires `use Illuminate\Support\Str;` at the top of the file.
    if (Str::startsWith(config('app.url'), 'https')) {
        \URL::forceScheme('https');
    } else {
        \URL::forceScheme('http');
    }
}
Of course, this requires that you've set app.url / APP_URL. If you are not using that, you can just drop the if statement, but that is a little less elegant and stops you from developing over non-HTTPS.

Swagger page being redirected from https to http

An AWS Elastic Load Balancer listens over HTTPS (443) using SSL and forwards requests to EC2 instances over HTTP (80), with IIS hosting a .NET Web API application that uses Swashbuckle to describe the API methods.
The home page of the API (https://example.com) has a link to the Swagger documentation, which reads as https://example.com/swagger/ui/index.html when you hover over the link.
If I click the link, the browser is redirected to http://example.com/swagger/ui/index.html, which displays a Page Not Found error;
but if I type https://example.com/swagger/ui/index.html directly into the browser's URL bar, the Swagger page loads. Then, when expanding the methods and clicking "Try it out", the Request URL starts with "http" again.
This configuration is only for the Stage and Production environments. Lower environments don't use the load balancer and just use HTTP.
Any ideas on how to stop HTTPS from being redirected to HTTP? And how to make Swagger display Request URLs using HTTPS?
Thank you
EDIT:
I'm using a custom index.html file
This seems to be a known issue for Swashbuckle. Quote:
"By default, the service root url is inferred from the request used to access the docs. However, there may be situations (e.g. proxy and load-balanced environments) where this does not resolve correctly. You can workaround this by providing your own code to determine the root URL."
What I did was provide the root URL and/or scheme to use based on the environment:
GlobalConfiguration.Configuration
    .EnableSwagger(c =>
    {
        ...
        c.RootUrl(req => GetRootUrlFromAppConfig(req));
        ...
        c.Schemes(GetEnvironmentScheme());
        ...
    })
    .EnableSwaggerUi(c =>
    {
        ...
    });
where
public static string[] GetEnvironmentScheme()
{
    ...
}

public static string GetRootUrlFromAppConfig(HttpRequestMessage request)
{
    ...
}
The way I would probably do it is to have one main file and, during the build of your application, generate a different Swagger file based on the environment parameters for schemes and hosts.
That way, you have to manage only one Swagger file across your environments, plus a few extra environment properties, host and schemes (if you don't already have them).
Since I don't know Swashbuckle, I cannot answer your first question (the redirect) for sure.

How to set nginx cache headers to never expire?

Right now I'm using this:
location ~* \.(js|css)$ { # |png|jpg|jpeg|gif|ico
    expires max;
    #log_not_found off; # what's this for?
}
And this is what I see in Firebug:
Did it work? If I'm not mistaken, my browser is asking for the file again, and nginx is answering 'not modified', so my browser uses the cache. But I thought the browser shouldn't even ask for the file; it already knows it will never expire.
Any thoughts?
Do not use F5 to reload the page. Instead, click in the URL bar and press Enter, or click a link. That's how I got only one request.
Clearly, your file is not stale, as its max-age and expiry date are still valid, so the browser will not communicate with the server. The browser doesn't ask for the file unless it is stale, i.e. its Cache-Control max-age has elapsed or its Expires date has passed. In that case it will ask the server whether the cached copy is still valid; if yes, it will serve the same copy, otherwise it will fetch a new one.
Update:
Here is the thing: F5/refresh will always make the browser ask the server whether anything has been modified, sending an If-Modified-Since request header. That is different from normal navigation (coming back to pages, clicking links), where the browser will not ask the server and loads from cache silently (no server call). Also, if you are testing with Firefox's Live HTTP Headers, it will show you exactly what is requested, while Firebug will always show you If-Modified-Since. Safari's developer menu should show a load time of 0. Hope it helps.
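If you want to verify cache hits from the page itself rather than from Firebug, the standard Resource Timing API is one option; a transferSize of 0 generally means the asset came straight from the local cache (rough sketch, same-origin assets only):
// Rough check: list same-origin JS/CSS assets and whether they were served
// from the browser cache (transferSize === 0 means no bytes went over the wire).
performance.getEntriesByType('resource')
  .filter((entry) => /\.(js|css)(\?|$)/.test(entry.name))
  .forEach((entry) => {
    console.log(entry.name, entry.transferSize === 0 ? 'from cache' : 'fetched from server');
  });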

Can XDomainRequest be made to work with SSL?

I have code that uses Microsoft's XDomainRequest object in IE8. The code looks like this:
var url = "http://<host>/api/acquire?<query string>";
var xdr = new XDomainRequest();
xdr.onload = function () {
    $("#identifier").text(xdr.responseText);
};
xdr.open("GET", url);
xdr.send();
When the scheme in "url" is "http://", the command works fine. However, when the scheme is "https://", IE8 gives me an "Access denied" JavaScript error. Both schemes work fine in FF 3.6.3, where I am, of course, using XMLHttpRequest. With both browsers I am complying with W3C Access Control. "http://" works cross-origin for both browsers. So the problem is with IE8, XDomainRequest, and SSL.
The SSL certificate is not the problem. If I type https://<host>/ into the address bar of IE8, where <host> is the same as in "url" above, the page loads fine.
So we have the following:
- hitting https://<host>/ directly from the browser works fine;
- hitting https://<host>/api/acquire?<query string> via XDomainRequest is not allowed.
Can it be done? Am I leaving something out?
Apparently, the answer is here: Link
Point 7 on this page says, "Requests must be targeted to the same scheme as the hosting page."
Here is some of the supporting text for point 7:
"It was definitely our intent to prevent HTTPS pages from making
XDomainRequests for HTTP-based resources, as that scenario presents a
Mixed Content Security Threat which many developers and most users do
not understand.
However, this restriction is overly broad, because it prevents HTTP
pages from issuing XDomainRequests targeted to HTTPS pages. While it’s
true that the HTTP page itself may have been compromised, there’s no
reason that it should be forbidden from receiving public resources
securely."
It would appear at present that the answer to my original question is: YES, if the hosting page can use the "https://" scheme; NO, if it cannot.
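Given point 7, one practical workaround is to build the request URL from the hosting page's own scheme so the two always match; a minimal sketch reusing the snippet from the question (the /api/acquire path and the <host> placeholder are taken from above):
// Build the request URL with the same scheme as the hosting page, since
// XDomainRequest requires the target scheme to match the page's scheme.
var url = window.location.protocol + "//<host>/api/acquire?<query string>";
var xdr = new XDomainRequest();
xdr.onload = function () {
    $("#identifier").text(xdr.responseText);
};
xdr.open("GET", url);
xdr.send();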
