Creating a script to automate submitting something on a webpage - bash

I want to create a script that accesses a website behind a login (with 2FA) and presses the submit button every x seconds.
Unfortunately, I am a total shell noob. I already automated the process with the Chrome extension "Kantu Browser Automation", but the extension has limits on looping and a looping timeout.

Use the curl command for this and put it in crontab.
curl:
https://curl.haxx.se/
You have to use the POST method.
crontab:
https://crontab.guru/
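As a rough sketch of what that could look like (the URL, form fields, and cookie file below are placeholders, and the exact POST parameters depend on what the site's form actually sends; curl alone cannot complete an interactive 2FA login, so the cookie file would have to hold an already-authenticated session exported from the browser):

#!/bin/bash
# Repeatedly POST the form using an already-authenticated session cookie.
while true; do
    curl --silent --show-error \
         --cookie cookies.txt \
         --data "field1=value1&field2=value2" \
         "https://example.com/submit"
    sleep 30   # x seconds between submissions
done

Since cron's smallest interval is one minute, a loop with sleep like the one above is one way to get per-second granularity; if once a minute is frequent enough, the loop can be dropped and the script run from a crontab entry such as * * * * * /path/to/submit.sh instead.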

Related

curl 1020 error when trying to scrape page using bash script

I'm trying to write a bash script to access a journal overview page on SSRN.
I'm trying to use curl for this, which works for me on other webpages, but it returns error code 1020 when I try to run the following:
curl https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1925128
I thought it might have to do with the question mark in the URL, but I got it to work with other pages that contained question marks.
It probably has something to do with what the page allows. However, I can also access the page using R's rvest package, so I think it should also work from bash.
Looks like the site has blocked access via curl. Change the user agent and it should work fine, i.e.
curl --user-agent 'Chrome/79' "https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1925128"
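For use inside a script, the same request might be wrapped up roughly like this (the output filename and the longer user-agent string are just placeholder choices; --fail makes curl exit non-zero on an HTTP error so the script can tell when it is still being blocked):

#!/bin/bash
# Fetch one SSRN abstract page while presenting a browser-like user agent.
url="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1925128"
if curl --fail --location --user-agent 'Mozilla/5.0 (X11; Linux x86_64) Chrome/79' \
        --output abstract_1925128.html "$url"; then
    echo "Saved abstract_1925128.html"
else
    echo "Request was blocked or failed" >&2
fi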

Curl or Lynx scripting with Chrome Cookie

Just looking for someone to point me in the right direction. I need to script an interaction with a site that uses a "trust this device" cookie and a login portal. I found the cookie in Chrome, but I'm not sure what to do next. This will be hosted on a CentOS 7 system.
After authenticating to the login portal, I need to access another page using the "trust this device" cookie and the session cookie so I can download files. Manually downloading files every day gets tedious, and the owner of the site does not want to use SFTP.
Update 1:
There was some confusion in my request (I could have made it clearer): I am NOT looking for someone to "write code" for me. This is more a sanity check as I learn how this process works. Please simply point me in the right direction as far as tools and general procedure go.
Update 2:
Using the "Copy as curl" option found in most web browsers, I was able to get the correct header information needed for authenticating.
Instead of
curl -b "xxx=xxx"
I needed
curl -H "Cookie: XXXX="%"2Fwpsnew; xxx=xxx"
When adding the -c switch, I can now save the session cookie. Further testing is needed, but at least there is progress.
EDIT
Using the Chrome feature for copying curl commands from the history (this is found in Firefox as well), I was able to partially reproduce the results. However, in my case I was not able to log in, as the site I was working with uses additional JS that modifies the cookies.
This initial question can be closed, I will open a new post for more specific parts of my project.
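As a sketch of the general flow (the URLs, form field names, and cookie names here are made up; the real values would come from the browser's "Copy as cURL" output), the idea is to send the login POST once with -c to capture the session cookie, then replay the saved jar together with the long-lived "trust this device" cookie via -b on later requests:

#!/bin/bash
# 1. Log in once and store whatever cookies the portal sets in a cookie jar.
curl -c cookies.txt \
     -H "Cookie: TRUSTED_DEVICE=xxxxxxxx" \
     --data "username=me&password=secret" \
     "https://portal.example.com/login"

# 2. Reuse the jar (session cookie + trusted-device cookie) to download files.
curl -b cookies.txt -c cookies.txt \
     -O "https://portal.example.com/reports/latest.csv"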

Chrome.runtime.connect no longer identified?

I have an extension with a background page and a sandbox page where most of the content scripts execute.
Whenever I need to do an Ajax call it has to run in the background environment, as otherwise I get a CORS error. Recently, as of last week I believe, chrome.runtime is no longer available in the sandbox environment for some reason. I can't find any notes about it and I'm trying to figure out how to communicate with the background page now.
I had this in the sandbox environment to initialize a connect port for passing messages from an Ajax request:
var ajaxCall = chrome.runtime.connect({name: "ajaxCall"});
Is there any info out there that I'm missing on why this change occurred and what are some possible workarounds?
Here's the output of logging chrome in each context: the first is the background page and the second is the sandbox. They used to be identical.

Wget download after POST, make it wait?

I am working on a bash script in which I use Wget to supply POST data. Wget is supposed to make a POST request to a specific page, and that page is supposed to return a file for download. The problem is that the page returns the file only after a few seconds, not immediately, so Wget only downloads the HTML page and doesn't wait for the file. Is there any option to make this work, i.e. make the POST request and wait a few seconds for the file to be returned from the remote server?
If your only problem is that you need more time, you can use the sleep command.
You can get more information about it here: http://www.linuxtopia.org/online_books/advanced_bash_scripting_guide/timedate.html
Hope that helped!
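As a rough sketch of that idea (the URL, POST data, and filenames are placeholders, and this assumes the server exposes the generated file at a known location once it is ready):

#!/bin/bash
# Trigger the server-side job with a POST request.
wget --post-data "report=monthly&format=csv" -O response.html \
     "https://example.com/generate"

# Give the server a few seconds to produce the file, then fetch it.
sleep 10
wget -O report.csv "https://example.com/download/report.csv"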

Ajax Post Request blocks website loading

I have a strange problem when using ajax post requests. I use the request to run an ImageMagick process directly on the command line via the php function exec(). The process takes about a minute and then responds with some variables. This is working fine, except for one problem: during the execution time I cannot access other parts of the website hosted on the same webserver (as if the server were unreachable). When the process finishes, everything works fine again.
I first thought this was due to an overloaded server. However, when you access the website via another browser, there are no problems, even during the execution time of the process in the other browser. So it looks like the problem has something to do with the browser blocking other requests during the post request.
Could anyone help me out here? What could be the root problem?
Found the solution! Thanks for the help from kukipei. By adding session_write_close(); to the file handling the ajax request (after it has read the userid and token), the session file is no longer locked, and all pages are accessible again. The problem was that the session was locked during the whole execution time of the process, which was not necessary, since I only needed the session to read the userid and token. So before calling the ImageMagick operation, I now call session_write_close().
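The lock is per PHP session, which is why only the same browser is affected. One way to see it from the shell (the page URLs and PHPSESSID value here are placeholders; the real cookie would be copied from the browser) is to fire two requests that share the same session cookie and watch the second one wait until the first finishes:

#!/bin/bash
# Two requests with the SAME session cookie: the second is held up until the
# first releases the session file lock.
cookie="PHPSESSID=abc123"
curl -s -b "$cookie" -o /dev/null "https://example.com/long-imagemagick-job.php" &
time curl -s -b "$cookie" -o /dev/null "https://example.com/some-other-page.php"
wait

With session_write_close() in place, the second request returns immediately instead of waiting for the ImageMagick job to finish.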
