I'm currently trying to download a .zip file from a webpage using NiFi. I can generate a direct download link for the file, but the application needs to log in to the page before the direct link will work. I've tried the InvokeHTTP, ListWebDAV and FetchWebDAV processors and was not able to get this working properly.
I even tried adding the login and the password as attributes, using the same IDs the page uses (logon, temp_password).
I also tried a Python script, but could not get any good results with it either.
Every time I tried one of these methods, the InvokeHTTP processor pointed at the download link received a small file saying that authorization is required; the downloaded file is actually the source code of the login page.
I've looked almost everywhere on the internet without much success :/
I'm now trying to get a processor to actually log in to the page and stay logged in, so that the InvokeHTTP processor can download the zip file using the direct link.
If somebody has another idea on how I can solve this, I would be very grateful.
I can provide more info if needed. At the moment I am using NiFi 1.1.2.
Thanks in advance.
Depending on the authentication mechanism in place on the page, you'll likely need to chain two InvokeHTTP processors together. Assuming the first page has a form you fill out with the username and password, the first InvokeHTTP uses the POST method to submit the form with the provided credentials and receives a response that contains some kind of token (session ID, etc.). You extract this value (from a response header or from the page content) and provide it to the second InvokeHTTP as a request header. Using your browser's Developer Tools, as daggett suggested, to observe the authentication process will let you determine exactly where these values are provided.
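To make the handshake concrete, here is a rough sketch of the two requests those two InvokeHTTP processors would perform, written with PHP's curl bindings purely for illustration. The URLs and the cookie name are assumptions; the form field names (logon, temp_password) come from your question.

<?php
// Request 1: POST the login form and capture the session cookie.
// In NiFi this is the first InvokeHTTP (POST), followed by ExtractText
// to pull the token out of the response into a flowfile attribute.
$ch = curl_init('https://example.com/login');             // hypothetical login URL
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query([
    'logon'         => 'myuser',
    'temp_password' => 'mypassword',
]));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, true);                   // keep headers so we can read Set-Cookie
$response = curl_exec($ch);
curl_close($ch);
preg_match('/^Set-Cookie:\s*([^;\r\n]+)/mi', $response, $m);
$sessionCookie = $m[1];                                   // e.g. "JSESSIONID=abc123" (assumed name)

// Request 2: fetch the direct download link, presenting the cookie.
// In NiFi this is the second InvokeHTTP with a Cookie request header
// (a dynamic property named Cookie set to the extracted value).
$ch = curl_init('https://example.com/files/archive.zip'); // hypothetical direct link
curl_setopt($ch, CURLOPT_HTTPHEADER, ['Cookie: ' . $sessionCookie]);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
file_put_contents('archive.zip', curl_exec($ch));
curl_close($ch);

If the page instead returns the token in the response body, only the extraction step changes (ExtractText with a suitable regex in NiFi); the shape of the flow stays the same.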
I have an Ubuntu Nginx server (set up using Laravel Forge).
I am now getting 403 errors when posting form data that includes certain HTML tags, which I was not getting previously.
The form is posted by a JavaScript button ($('#my-form').submit();), if this is relevant.
Other forms are working fine as long as I remove the tags used for YouTube embedding.
Open up the developer console and look at the details of the POST request in the Network tab or in the console itself. 4XX is the group for client errors, not server or runtime errors, so expect the issue to be in your implementation. Maybe you use some package that is supposed to "automagically" authenticate or check user permissions when accepting this specific request, and it fails because you are not passing some header or custom field? Hard to tell without more details.
Add relevant code (at the very least your form HTML) if you want more specific tips.
In my case, I also had a WordPress blog installed with the Wordfence plugin active. The Wordfence configuration was enforcing security settings which were preventing any of the website's forms from posting such tags.
I am a novice in the area of benchmarking, so I would like to request your guidance.
Problem: I have a test website developed in PHP and MySQL, hosted on localhost.
I need to perform the following set of activities:
Login as a registered user
Download a PDF file
I wish to know how to load test the above activities in order. I need to check: if at a particular instant 'n' users are logged in and each downloads a PDF file, what would be the worst response time and related stats?
Steps I have already taken (please correct me if I did something wrong here):
Used the Apache benchmarking tool (ab) to load test the login authentication script, passing the username and password as parameters
(i.e., ab -n 1000 -c 100 -A username:password url_of_script.php)
I tested with both the Apache and Nginx web servers (and got comparatively better results with Nginx).
But I want to test the load when, after logging in, the user performs some other activities. How can we use ab (or some other tool) to assess that?
Waiting for your responses. Thanks.
Create a PHP script using curl.
Use your browser to log in and download the PDF.
Before you do that, right-click and select Inspect (or Inspect Element) and go to the Network tab. Then start the login and PDF download process.
In the Network tab, look only at the request headers of each HTML page request; filter out all the other requests (JS, CSS, images, media, etc.). You can use these headers as a guideline when setting up curl to perform each request.
In Firefox you can edit a request and resend it: go into the edit mode and copy the request headers.
In your curl requests, use exactly the same requests the browser used.
Curl reports all the stats on the request and response.
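For example, here is a minimal sketch of such a script. The URLs and form field names are placeholders; replace them with what you captured in the Network tab.

<?php
// Log in once, then download the PDF, reusing the session cookie.
$cookieJar = tempnam(sys_get_temp_dir(), 'cookies');

// 1. POST the login form, storing any session cookie in the jar.
$ch = curl_init('http://localhost/login.php');            // placeholder URL
curl_setopt_array($ch, [
    CURLOPT_POST           => true,
    CURLOPT_POSTFIELDS     => http_build_query(['username' => 'user', 'password' => 'pass']),
    CURLOPT_COOKIEJAR      => $cookieJar,                 // write cookies here
    CURLOPT_RETURNTRANSFER => true,
]);
curl_exec($ch);
curl_close($ch);

// 2. Download the PDF with the same cookie jar and collect timing stats.
$ch = curl_init('http://localhost/report.pdf');           // placeholder URL
curl_setopt_array($ch, [
    CURLOPT_COOKIEFILE     => $cookieJar,                 // send the stored cookies back
    CURLOPT_RETURNTRANSFER => true,
]);
$pdf  = curl_exec($ch);
$info = curl_getinfo($ch);   // total_time, starttransfer_time, size_download, ...
curl_close($ch);

printf("HTTP %d, %.3fs total, %d bytes\n",
       $info['http_code'], $info['total_time'], $info['size_download']);

Run n copies of this script in parallel to approximate n concurrent logged-in users. Alternatively, log in once, copy the session cookie out of the jar, and hand it to ab with its -C name=value option, so that every ab request is made as a logged-in user.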
We are using embedded signing with the DocuSign REST API to e-sign files. To sign a file, we upload the required file to our web app and then display it in a viewer in the browser. The file can be signed immediately or later.
What is happening is that when the file is signed and the process is completed, we return to the same file view, but the updated file is not shown. Only after refreshing the page 3-4 times does the signature appear on the file.
This issue occurs only for files that were uploaded and signed later. For a fresh file which is uploaded and signed immediately, we get the updated file view.
It appears that all browsers cache files (not the HTML page, but the embedded files). The recommended solutions suggest either adding a parameter to the request when the file is reloaded after signing, which works only intermittently, or renaming the file so that the browser picks up the updated version. But renaming the file is not an option for us.
Is there some other alternative? Have any other DocuSign API users faced something similar? (I believe this issue would not occur if you used the email request mode for e-signing.)
Thanks.
There have been no similar reports from anyone... I am not necessarily discounting yours, but from a write-up of your web app alone I can think of a few things it could be doing out of sequence to produce this behavior.
The first common mistake with embedded signing that comes to mind is this. In general, embedded signing requires several steps: (1) a login call, (2) creating the envelope, (3) getting the view for the recipients.
Most people put that logic in the controller code behind a web page, so when they come back to it, it goes through the same sequence. I understand that your page may have some logic to guard against this, but ideally, on the "viewing" path you should only call (3), getting the view. If you somehow end up calling (2) again, you will see the signing sequence all over again.
That's the most common mistake. However, I do not want to discount your report. To actually get to the bottom of it, you should post the web service call traces (XML for SOAP / JSON for REST) and show exactly what your app is doing.
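For illustration, on a pure re-view the only DocuSign call would be the recipient view request for the envelope you already created, along these lines (a sketch using PHP's curl bindings against the v2 REST API with its legacy header authentication; the IDs, credentials and URLs are placeholders):

<?php
// Step (3) only: request a fresh recipient view URL for an envelope that
// already exists. Envelope creation (2) is NOT repeated here.
$baseUrl    = 'https://demo.docusign.net/restapi/v2/accounts/12345'; // placeholder
$envelopeId = 'envelope-id-stored-at-creation-time';                 // placeholder

$body = json_encode([
    'returnUrl'            => 'https://yourapp.example.com/signed',  // placeholder
    'authenticationMethod' => 'email',
    'email'                => 'signer@example.com',
    'userName'             => 'Signer Name',
    'clientUserId'         => '1001',  // must match the value set on the envelope
]);

$ch = curl_init("$baseUrl/envelopes/$envelopeId/views/recipient");
curl_setopt_array($ch, [
    CURLOPT_POST           => true,
    CURLOPT_POSTFIELDS     => $body,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_HTTPHEADER     => [
        'Content-Type: application/json',
        'X-DocuSign-Authentication: {"Username":"...","Password":"...","IntegratorKey":"..."}',
    ],
]);
$view = json_decode(curl_exec($ch), true);
curl_close($ch);
// $view['url'] is the one-time URL to embed in your viewer.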
Hope this helps.
-mb // i work for docusign
I want to write a script to log in to and interact with a web page, and I'm a bit at a loss as to where to start. I can probably figure out the HTML parsing, but how do I handle the login part? I was planning on using bash, since that is what I know best, but I'm open to other suggestions. I'm just looking for some reference materials or links to help me get started. I'm not really sure whether the password gets stored in a cookie or what, so how do I assess the situation as well?
Thanks,
Dan
Take a look at cURL, which is generally available in a Linux/Unix environment. It lets you script a call to a web page, including POST parameters (say, a username and password), and it lets you manage the cookie store, so that a subsequent call (to get a different page within the site) can use the same cookie and your login persists across calls.
I did something like that at work some time ago; I had to log in to a page and post the same data over and over...
Take a look here. I used wget because I could not get it working with curl.
Search this site for screen scraping. It can get hairy, since you will need to deal with cookies, JavaScript and hidden fields (viewstate!). Usually you will need to scrape the login page to get the hidden fields and then post them to the login page. Have fun :D
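As a rough sketch of that flow (PHP purely for illustration; the URL and field names are made up), you scrape the hidden inputs from the login page and echo them back in the login POST:

<?php
// 1. Fetch the login page and collect every hidden input (e.g. __VIEWSTATE).
$loginPage = file_get_contents('https://example.com/login');  // hypothetical URL
$doc = new DOMDocument();
@$doc->loadHTML($loginPage);          // suppress warnings from messy real-world HTML
$fields = [];
foreach ($doc->getElementsByTagName('input') as $input) {
    if ($input->getAttribute('type') === 'hidden') {
        $fields[$input->getAttribute('name')] = $input->getAttribute('value');
    }
}

// 2. Post the hidden fields back together with the credentials,
//    keeping the session cookie for the requests that follow.
$fields['username'] = 'user';         // field names are assumptions
$fields['password'] = 'pass';
$ch = curl_init('https://example.com/login');
curl_setopt_array($ch, [
    CURLOPT_POST           => true,
    CURLOPT_POSTFIELDS     => http_build_query($fields),
    CURLOPT_COOKIEJAR      => '/tmp/cookies.txt',
    CURLOPT_RETURNTRANSFER => true,
]);
curl_exec($ch);
curl_close($ch);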
I'm trying to find a way of discovering who is downloading which image from an image gallery. Users can download using a button beside the thumbnail, or by right-clicking and using "Save link as". Is it possible to relate a user session or ID to a "Save link as" action across all browsers, using either PHP or JavaScript?
Yes, my preferred way of doing this would be via PHP. You'd set up a script that loads the file and sends it to the user's browser. This script would also be able to log the download somewhere (e.g. your database).
For example, in very rough pseudo-code:
download.php
<?php
// Which image was requested (see the note below about validating this!)
$file = $_GET['file'];
// Record the download (e.g. in your database)
updateFileCount($file);
// Send the image bytes to the browser
header('Content-Type: image/jpeg');
sendFile($file);
Then you just point your download link at download.php instead of at the actual file. (Note that updateFileCount and sendFile are functions that you would have to provide, of course; this script is just an example of a download script you could use.)
Note: I highly recommend against using $_GET['file'] as the whole filename; malicious users could use it to retrieve sensitive files from your web server. But the safe use of PHP downloads is a topic for another question.
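As a minimal illustration of one common safeguard (a sketch, not a complete treatment): strip any path components from the parameter and confirm the result actually exists in the gallery directory before serving it.

<?php
// Never trust $_GET['file'] as a path. basename() strips directory
// components ("../../etc/passwd" becomes "passwd"), and the existence
// check confines requests to the gallery directory.
$file = basename($_GET['file']);
$path = '/var/www/gallery/' . $file;   // hypothetical image directory
if (!file_exists($path)) {
    http_response_code(404);
    exit;
}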
You need a gateway script, like ImageDownload.php?picture=me.jpg, or something like that.
That page would return the image bytes, as well as logging that the image was downloaded.
Because the images being saved are already local on the users' computers, there would be no way to get that kind of information: they have already retrieved the image from your system. Even with JavaScript, the best you could do, as far as I know, is to log each time a user presses the second mouse button, using some kind of Ajax call.
I don't really like the idea, but if you wanted to log every time someone downloaded an image, you could host the images inside a Flash or Java app that makes clicking a download button a requirement. That way, the only way to get the image without doing so would be to capture packets as they arrive, or to take a screenshot.
Your server access logs should already contain the request for the non-thumbnailed version of the file, so you just need to modify the log format to include the session ID, which I presume you can map back to a user.
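For instance, with Apache you could log PHP's session cookie alongside each request (this assumes PHP's default PHPSESSID cookie name; nginx has an equivalent log_format directive):

# Log the session cookie with every request
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{PHPSESSID}C\"" session_log
CustomLog logs/access_log session_log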
I agree strongly with the suggestion put forward by Phill Sacre. For what you are looking for this is the way to go.
It also has the benefit of potentially keeping the tracked files out of the direct web path, so that they can't be linked to directly.
I use this method on a client site where the images are paid content and so must have restricted access.