Using Express Router to proxy non-CORS content doubles request path - proxy

There must be something obvious I am missing. I have an external record like https://udspace.udel.edu/rest/handle/19716/28599 that is not set up for CORS, but I would like to access it from my application. In the past, I have set up a server to simply GET the resource and redeliver it for my application. However, I read about the Node http-proxy-middleware package and thought I'd try a more direct proxy.
First, I could not use a fixed target in the createProxyMiddleware() options, because I want to send in the hostname and path of the desired resource as part of the request, e.g. http://example.com/https/udspace.udel.edu/rest/handle/19716/28599 to fetch the resource above.
Relevant Code (index.js)
const express = require('express')
const { createProxyMiddleware } = require('http-proxy-middleware')

const app = express()
const PORT = 3000 // value not shown in the original snippet
const HOST = 'localhost' // value not shown in the original snippet

app.get('/info', (req, res, next) => {
    res.send('Proxy is up.');
})

// Proxy endpoint
app.use('/https', createProxyMiddleware({
    target: "incoming",
    router: req => {
        const protocol = 'https'
        const hostname = req.path.split("/https/")[1].split("/")[0]
        const path = req.path.split(hostname)[1]
        console.log(`returning: ${protocol}, ${hostname}, ${path}`)
        return `${protocol}://${hostname}/${path}`
    },
    changeOrigin: true
}))

app.listen(PORT, HOST, () => {
    console.log(`Starting Proxy at ${HOST}:${PORT}`);
})
I was getting a 404 from the DSpace server without other information, so I knew the request was going through but being transformed incorrectly. I tried again with an item in an S3 bucket, since AWS gives better errors, and saw that my path was being duplicated:
404 Not Found
Code: NoSuchKey
Message: The specified key does not exist.
Key: api/cookbook/recipe/0001-mvm-image/manifest.json/https/iiif.io/api/cookbook/recipe/0001-mvm-image/manifest.json
What dumb thing am I doing wrong? Is this not what this proxy is for, and do I need to do something else?
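For reference: with http-proxy-middleware, the value returned from router replaces the target, but the incoming request path (including the /https/... prefix, which Express keeps in req.originalUrl) is still appended to it, which matches the duplicated key above. One possible fix, sketched here and untested, is to return only the origin from router and strip the prefix with pathRewrite:
app.use('/https', createProxyMiddleware({
    target: 'https://example.com', // placeholder; overridden per request by the router below
    router: (req) => {
        // Return only protocol + host; the (rewritten) request path is appended by the proxy.
        const hostname = req.path.split('/https/')[1].split('/')[0]
        return `https://${hostname}`
    },
    pathRewrite: (path, req) => {
        // Strip the "/https/<hostname>" prefix so only the real resource path remains.
        const hostname = path.split('/https/')[1].split('/')[0]
        return path.split(hostname)[1]
    },
    changeOrigin: true
}))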

Related

How to get the file link after successfully uploading in MinIO

I am using MinIO to manage the files.
const Minio = require('minio'); // import not shown in the original snippet

const getMinioClient = () => {
    const minioClient = new Minio.Client({
        endPoint: '127.0.0.1',
        port: 9000,
        useSSL: false,
        accessKey: 'minioadmin',
        secretKey: 'minioadmin'
    });
    return minioClient;
};
uploadFile(bucketName, newFileName, localFileLocation, metadata = {}) {
    return new Promise((resolve, reject) => {
        const minioClient = getMinioClient();
        // default content type would be 'application/octet-stream'
        minioClient.fPutObject(bucketName, newFileName, localFileLocation, metadata, (err, etag) => {
            if (err) return reject(err);
            return resolve(etag);
        });
    });
}
With the following code I can upload the file. After a successful upload it returns only the etag, but I want to get the download link. How would I get it directly, without searching for the filename again?
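For context, calling the helper looks something like this (bucket and file names here are made up):
uploadFile('my-bucket', 'photo.png', '/tmp/photo.png')
    .then((etag) => console.log('uploaded, etag:', etag))
    .catch((err) => console.error(err));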
You won't be able to get a public URL/link for accessing images unless you manually generate a time-limited download URL, using something like:
https://min.io/docs/minio/linux/reference/minio-mc/mc-share-download.html#generate-a-url-to-download-object-s
One workaround is to let nginx directly access the location you are uploading your files to:
https://gist.github.com/harshavardhana/f05b60fe6f96803743f38bea4b565bbf
After you have successfully written your file with your code above, you can use the presignedUrl method to generate a link to your image.
An example for JavaScript is in the MinIO docs (https://min.io/docs/minio/linux/developers/javascript/API.html#presignedUrl):
// presigned url for 'getObject' method.
// expires in a day.
minioClient.presignedUrl('GET', 'mybucket', 'hello.txt', 24*60*60, function(err, presignedUrl) {
    if (err) return console.log(err)
    console.log(presignedUrl)
})
In any case, you have to set an expiration time: either set a very long one that suits your app, or, if you have a backend, request the images from the frontend through the backend with the getObject method: getObject(bucketName, objectName, getOpts[, callback]).
https://min.io/docs/minio/linux/developers/javascript/API.html
If you have only a small number of static images to show in your app (which are not uploaded by the app), you can also create the links manually with the MinIO client or from the MinIO UI.
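A rough sketch of that backend approach (the Express route, bucket name, and reuse of the question's getMinioClient helper are assumptions about how it would be wired up):
const express = require('express');
const app = express();

// Stream an object from MinIO through the backend instead of exposing a bucket URL.
app.get('/files/:name', (req, res) => {
    const minioClient = getMinioClient(); // helper from the question above
    minioClient.getObject('my-bucket', req.params.name, (err, stream) => {
        if (err) return res.status(500).send(err.message);
        stream.pipe(res); // pipe the object data straight into the HTTP response
    });
});

app.listen(3000);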

Failure to use Netnut.io proxy with Apify Cheerio scraper

I am developing a web scraper and I want to integrate a proxy from Netnut into it.
The Netnut integration instructions given:
Proxy URL: gw.ntnt.io
Proxy Port: 5959
Proxy User: igorsavinkin-cc-any
Proxy Password: xxxxx
Example Rotating IP format (IP:PORT:USERNAME-CC-COUNTRY:PASSWORD):
gw.ntnt.io:5959:igorsavinkin-cc-any:xxxxx
In order to change the country, please change 'any' to your desired
country. (US, UK, IT, DE etc.) Available countries:
https://l.netnut.io/countries
Our IPs are automatically rotated, if you wish to make them Static
Residential, please add a session ID in the username parameter like
the example below:
Username-cc-any-sid-any_number
The code:
const Apify = require('apify'); // import not shown in the original snippet

Apify.main(async () => {
    const proxyConfiguration = await Apify.createProxyConfiguration({
        proxyUrls: [
            'gw.ntnt.io:5959:igorsavinkin-DE:xxxxx'
        ]
    });
    // Add URLs to a RequestList
    const requestQueue = await Apify.openRequestQueue(queue_name);
    await requestQueue.addRequest({ url: 'https://ip.nf/me.txt' });
    // Create an instance of the CheerioCrawler class - a crawler
    // that automatically loads the URLs and parses their HTML using the cheerio library.
    const crawler = new Apify.CheerioCrawler({
        // Let the crawler fetch URLs from our list.
        requestQueue,
        // To use the proxy IP session rotation logic, you must turn the proxy usage on.
        proxyConfiguration,
        // Activates the Session pool.
        minConcurrency: 10,
        maxConcurrency: 50,
        // On error, retry each page at most twice.
        maxRequestRetries: 2,
        // Increase the timeout for processing of each page.
        handlePageTimeoutSecs: 50,
        // Limit to 1000 requests per one crawl
        maxRequestsPerCrawl: 1000,
        handlePageFunction: async ({ request, $ /*, session */ }) => {
            const text = $('body').text();
            log.info(text);
            ...
        },
    });
    await crawler.run();
});
The error: RequestError: getaddrinfo ENOTFOUND 5959 5959:80
It seems the crawler mixes up the proxy port 5959 with the URL port 80...
ERROR CheerioCrawler: handleRequestFunction failed, reclaiming failed request back to the list or queue {"url":"https://ip.nf/me.txt","retryCount":3,"id":"F32s4Txz0fBUmwd"}
RequestError: getaddrinfo ENOTFOUND 5959 5959:80
    at ClientRequest.request.once (C:\Users\User\Documents\RnD\Node.js\mercateo-scraper\node_modules\got\dist\source\core\index.js:953:111)
    at Object.onceWrapper (events.js:285:13)
    at ClientRequest.emit (events.js:202:15)
    at ClientRequest.origin.emit.args (C:\Users\User\Documents\RnD\Node.js\mercateo-scraper\node_modules\@szmarczak\http-timer\dist\source\index.js:39:20)
    at onerror (C:\Users\User\Documents\RnD\Node.js\mercateo-scraper\node_modules\agent-base\dist\src\index.js:115:21)
    at callbackError (C:\Users\User\Documents\RnD\Node.js\mercateo-scraper\node_modules\agent-base\dist\src\index.js:134:17)
    at processTicksAndRejections (internal/process/next_tick.js:81:5)
Any way out?
Try using it in this format:
http://username:password@host:port
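Applied to the credentials in the question, the proxyUrls entry would then look something like this (a sketch; the password stays redacted as in the question):
const proxyConfiguration = await Apify.createProxyConfiguration({
    proxyUrls: [
        // scheme://username:password@host:port, not host:port:username:password
        'http://igorsavinkin-cc-any:xxxxx@gw.ntnt.io:5959'
    ]
});
Per Netnut's instructions above, replace 'any' with the desired country (e.g. igorsavinkin-cc-DE) or append a session ID to the username for a static residential IP.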

Socket IO forbidden 403

I have a simple Socket.IO app. It works just fine locally, and it installed successfully on an AWS server using the Plesk admin dashboard, but when I connect to the app I always get forbidden: {"code":4,"message":"Forbidden"}. The entry point seems to work great (http://messages.entermeme.com). Any idea what could be wrong?
Frontend code
import io from 'socket.io-client'

const socket = io('https://messages.entermeme.com', {
    transports: ['polling'],
})

socket.emit('SUBSCRIBE')
Backend code
const cors = require('cors')
const app = require('express')()
const server = require('http').Server(app)
const io = require('socket.io')(server)

server.listen(9000)

app.use(cors())

io.set('transports', [
    'polling'
])

io.origins([
    'http://localhost:8000',
    'https://entermeme.com',
])

io.on('connection', (socket) => {
    socket.on('SUBSCRIBE', () => {
        //
    })
})
I had a similar issue, but when using nginx. So in case you still need some help: in the end it turned out to be the URL I specified as the socket origins. I didn't specify the port, since the origin for me was also running on port 80 (443 for SSL), like in your example above:
io.origins([
    'http://localhost:8000',
    'https://entermeme.com', // <--- No port specified
])
I updated my config and added the port. So for you it would be:
io.origins([
    'http://localhost:8000',
    'https://entermeme.com:80', // <--- With port (or 443 for SSL)
])

Possible to call http gets with Alexa hosted skill?

I have been trying without success to use the http module in my Node.js endpoint to do a simple HTTP GET.
I have followed the various tutorials to execute the GET within my intent, but it keeps failing with getaddrinfo ENOTFOUND in the CloudWatch log.
It seems like I am preparing the URL correctly; if I just cut and paste the URL output into the browser, I get the expected response, and it's just a plain HTTP GET over port 80.
I suspect that maybe the Alexa-hosted Lambda doesn't have the permissions necessary to make remote calls to non-Amazon web services, but I don't know this for sure.
Can anybody shed any light? FYI, this is the code in my Lambda:
var http = require('http');

function httpGet(address, zip, zillowid) {
    const pathval = 'www.zillow.com/webservice/GetSearchResults.htm' + `?zws-id=${zillowid}` + `&address=${encodeURIComponent(address)}&citystatezip=${zip}`;
    console.log("pathval = " + pathval);
    return new Promise(((resolve, reject) => {
        var options = {
            host: pathval,
            port: 80,
            method: 'GET',
        };
        const request = http.request(options, (response) => {
            response.setEncoding('utf8');
            console.log("options are " + options);
            let returnData = '';
            response.on('data', (chunk) => {
                returnData += chunk;
            });
            response.on('end', () => {
                resolve(JSON.parse(returnData));
            });
            response.on('error', (error) => {
                console.log("I see there was an error, which is " + error);
                reject(error);
            });
        });
        request.end();
    }));
}
host: pathval is incorrect usage of the Node.js http module. You need to provide the hostname and the path + query string as two different options.
An example of correct usage:
host: 'example.com',
path: '/webservice/GetSearchResults.htm?zws-id=...',
(Of course, these can be variables, they don't need to be literals as shown here for clarity.)
The error occurs because you're treating the whole URL as a hostname, and as such it doesn't exist.
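Folding that into the question's helper, a corrected version might look like this (a sketch that keeps the question's variables; note also that DNS-level failures such as ENOTFOUND are emitted on the request object, not the response):
var http = require('http');

function httpGet(address, zip, zillowid) {
    return new Promise((resolve, reject) => {
        const options = {
            host: 'www.zillow.com', // hostname only
            path: `/webservice/GetSearchResults.htm?zws-id=${zillowid}&address=${encodeURIComponent(address)}&citystatezip=${zip}`,
            port: 80,
            method: 'GET',
        };
        const request = http.request(options, (response) => {
            response.setEncoding('utf8');
            let returnData = '';
            response.on('data', (chunk) => { returnData += chunk; });
            response.on('end', () => resolve(returnData)); // parse as needed (the question used JSON.parse)
        });
        request.on('error', reject); // e.g. getaddrinfo ENOTFOUND lands here
        request.end();
    });
}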
I suspect that maybe the Alexa hosted lambda doesn't have permission necessary to make remote calls to non-amazon web services
There is no restriction on what services you can contact from within a Lambda function (other than filters that protect against sending spam email directly to random mail servers).

Nuxt Axios Dynamic url

I am learning Nuxt using the following tutorial:
https://scotch.io/tutorials/implementing-authentication-in-nuxtjs-app
In the tutorial, it shows:
axios: {
    baseURL: 'http://127.0.0.1:3000/api'
},
It points to localhost, which is not a problem for my development, but when it comes to deployment, how do I change the URL based on the browser URL?
If the system is used on the LAN, it will be 192.168.8.1:3000/api.
If the system is used from outside, it will be example.com:3000/api.
On the other hand, I am currently using adonuxt (Adonis + Nuxt), and both listen on the same port (3000).
In the future, I might separate them into a server (3333) and a client (3000).
The API links would therefore be:
localhost:3333/api
192.168.8.1:3333/api
example.com:3333/api
How do I achieve a dynamic API URL based on the browser, and switch the port?
You don't need baseURL in nuxt.config.js.
Create a plugins/axios.js file first and write something like this:
export default function({ $axios }) {
    if (process.client) {
        const protocol = window.location.protocol
        const hostname = window.location.hostname
        const port = 8000
        const url = `${protocol}//${hostname}:${port}`
        $axios.defaults.baseURL = url
    }
}
A late contribution, but this question and its answers were helpful in getting to this more concise approach. I've tested it on localhost and when deploying to a branch URL on Netlify (tested only with Chrome on Windows).
In client mode, window.location.origin contains what we need for the baseURL.
// /plugins/axios-host.js
export default function ({ $axios }) {
    if (process.client) {
        $axios.defaults.baseURL = window.location.origin
    }
}
Add the plugin to nuxt.config.js.
// /nuxt.config.js
...
plugins: [
    ...,
    "~/plugins/axios-host.js",
],
...
This question is a year and a half old now, but I wanted to answer the second part for anyone who finds it helpful: doing this on the server side.
I stored a reference to the server URL I wanted to call as a cookie, so that the server can determine which URL to use as well. I use cookie-universal-nuxt and just do something simple like $cookies.set('api-server', 'some-server'), then pull the cookie value with $cookies.get('api-server'), map that cookie value to a URL, and then do something like the following with an Axios interceptor:
// plugins/axios.js
const axiosPlugin = ({ store, app: { $axios, $cookies } }) => {
    $axios.onRequest((config) => {
        const server = $cookies.get('api-server')
        if (server && server === 'some-server') {
            config.baseURL = 'https://some-server.com'
        }
        return config
    })
}

export default axiosPlugin // the export was missing from the original snippet
Of course you could also store the URL in the cookie itself, but it's probably best to have a whitelist of allowed URLs.
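A minimal whitelist could be a simple map; a sketch extending the interceptor above (server names and URLs here are made-up placeholders):
// plugins/axios.js
const API_SERVERS = {
    'some-server': 'https://some-server.com',
    'other-server': 'https://other-server.example.com',
}

const axiosPlugin = ({ app: { $axios, $cookies } }) => {
    $axios.onRequest((config) => {
        const server = $cookies.get('api-server')
        if (server && API_SERVERS[server]) {
            config.baseURL = API_SERVERS[server] // only whitelisted URLs are ever used
        }
        return config
    })
}

export default axiosPlugin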
Don't forget to enable the plugin as well.
// nuxt.config.js
plugins: [
    '~/plugins/axios',
],
This covers both the client side and the server side, since the cookie is "universal".
