Getting 403 Forbidden in Cypress for a simple login hosted on Akamai

I am running this script to access a website hosted behind Akamai:
/// <reference types="cypress" />
describe('Store login', () => {
  it('login to Store', () => {
    cy.visit("https://store.qlsit.qantas.com/", {
      headers: {
        "Accept": "application/json, text/plain, */*",
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36",
        "LOYALTY-PARTNER-FORWARD": "D19313AA-5BFF-4586-947A-C3AE8D78CEA4"
      }
    })
  })
})
I am getting a 403 Forbidden response.
I am not sure which header to add to get past the Akamai blocker and access the website.
I have tried different User-Agent values, but it still doesn't work.

Related

Laravel GET request returns "Resource not found" for the NSE India website

I use this program to get the JSON data from https://www.nseindia.com/api/quote-equity?symbol=SBIN
My code:
foreach (StockName::all() as $stockName) {
    $data = [];
    $headers = [
        "Host" => "www.nseindia.com",
        "Referer" => "https://www.nseindia.com/get-quotes/equity?symbol=SBIN",
        "X-Requested-With" => "XMLHttpRequest",
        "pragma" => "no-cache",
        "sec-fetch-dest" => "empty",
        "sec-fetch-mode" => "cors",
        "sec-fetch-site" => "same-origin",
        "User-Agent" => "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36",
        "Accept" => "*/*",
        "Accept-Encoding" => "gzip, deflate, br",
        "Accept-Language" => "en-GB,en-US;q=0.9,en;q=0.8",
        "Cache-Control" => "no-cache",
        "Connection" => "keep-alive",
    ];
    $lastRecord = $stockName->latestMetadata();
    $eachSymbolResponse = Http::withHeaders($headers)->get("https://www.nseindia.com/api/quote-equity?symbol=".$stockName->symbol);
    $symbolData = json_decode($eachSymbolResponse->body());
    dd($eachSymbolResponse->body());
}
Here $eachSymbolResponse->body() returns "Resource not found".
It seems you'll need to visit their frontend first to obtain some cookies. Apparently, these cookies act as a kind of authorization token that the API uses, rather oddly, to block external access.
I cannot pinpoint exactly which cookie is required, but it appears to be one of these: ak_bmsc, bm_sv, bm_mi. bm_sv appears to be set only after a second request to either their frontend or their APIs, and it cannot be used as the sole authentication cookie. So the most important cookies are ak_bmsc and bm_mi.
So, before you call their API, you'll have to obtain the cookies from their main site and pass them along with every subsequent API call. This is the only way I managed to get the API requests working consistently.
Below is the code I used to verify my theory. It still needs a lot of work before it is usable in a production environment. You might also get away with caching the entire CookieJar returned by getAuthorisationCookies for around two hours (according to the expiration date of the aforementioned cookies). The headers declared in prepareRequestWithHeaders seem to be the bare minimum needed to get a working request in the first place.
use GuzzleHttp\Cookie\CookieJar;
use Illuminate\Http\Client\PendingRequest;
use Illuminate\Http\Client\Response;
use Illuminate\Support\Facades\Http;

// Let a pending request reuse an existing Guzzle cookie jar.
PendingRequest::macro('withCookieJar', function (CookieJar $cookieJar) {
    $this->options['cookies'] = $cookieJar;
    return $this;
});

function prepareRequestWithHeaders(): PendingRequest
{
    return Http::withHeaders([
        'Accept' => '*/*',
        'Accept-Encoding' => 'gzip, deflate, br',
        'Connection' => 'keep-alive',
        'User-Agent' => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36',
    ]);
}

// Visit the frontend once to collect the cookies the API expects.
function getAuthorisationCookies(): CookieJar
{
    $response = prepareRequestWithHeaders()
        ->get('https://www.nseindia.com');
    return $response->cookies();
}

// Call the API with the cookies obtained from the frontend.
function getQuoteEquityResponse(string $symbol): Response
{
    $authorisationCookies = getAuthorisationCookies();
    return prepareRequestWithHeaders()
        ->withCookieJar($authorisationCookies)
        ->get("https://www.nseindia.com/api/quote-equity?symbol={$symbol}");
}

getQuoteEquityResponse('SBIN')->json();
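
For readers not on Laravel, the same idea (visit the frontend once, keep the cookies, reuse them for API calls, and refresh them after roughly two hours) can be sketched with Python's standard library. This is only an illustration of the approach described in the answer, not a tested client for this site: the class name PrimedClient and its parameters are made up for the example, and the two-hour lifetime is taken from the answer's estimate, not from any documentation.

```python
import time
import urllib.request
from http.cookiejar import CookieJar

HOME_URL = "https://www.nseindia.com"
API_URL = "https://www.nseindia.com/api/quote-equity?symbol={symbol}"
COOKIE_TTL_SECONDS = 2 * 60 * 60  # "around two hours", per the answer (an estimate)

# Minimal headers, mirroring prepareRequestWithHeaders. Accept-Encoding is left
# out because urllib does not decompress responses, and Connection is left out
# because urllib always sends "Connection: close" itself.
BASE_HEADERS = {
    "Accept": "*/*",
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36"
    ),
}


class PrimedClient:
    """Hypothetical helper: primes a cookie jar on the frontend, then reuses it."""

    def __init__(self, home_url=HOME_URL, ttl=COOKIE_TTL_SECONDS, clock=time.monotonic):
        self.home_url = home_url
        self.ttl = ttl
        self.clock = clock
        self._opener = None
        self._primed_at = None

    def _fresh_opener(self):
        # A new jar per priming; the frontend response fills it via Set-Cookie.
        jar = CookieJar()
        opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
        opener.addheaders = list(BASE_HEADERS.items())
        opener.open(self.home_url, timeout=30).read()
        return opener

    def opener(self):
        # Re-prime when nothing is cached yet or the cached cookies are too old.
        expired = self._primed_at is None or self.clock() - self._primed_at > self.ttl
        if expired:
            self._opener = self._fresh_opener()
            self._primed_at = self.clock()
        return self._opener

    def get(self, url):
        # The opener sends the cookies collected from the frontend automatically.
        with self.opener().open(url, timeout=30) as response:
            return response.read()
```

Usage would look like PrimedClient().get(API_URL.format(symbol="SBIN")). Whether the site accepts requests from this client at all depends on its bot protection, which may check more than cookies.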

Guzzle request against Cloudflare-protected server - works from development env but not production

I've got a bit of a strange one. I had been scraping a website for a while, and then I think they put it behind Cloudflare, which resulted in me getting '...resulted in a 403 Forbidden response'.
So I added a user agent and rotated my proxies, and it works from my development machine. However, when I deploy to my live server I still get a 403 error. Same code, same user agent, same proxy IP.
My code is below:
use GuzzleHttp\Client;

$client = new Client();
$response = $client->request('GET', 'https://www.targeturl.com', [
    'proxy' => '1.2.3.4:5432',
    'headers' => [
        'User-Agent' => 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.37',
        'Accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
        'Accept-Encoding' => 'gzip, deflate, br',
    ],
]);
dd($response->getBody()->getContents());
Any ideas why?

I am getting a 400 response for an HTTP POST request. How do I resolve this issue?

I have been trying to make a simple POST request with Python, and the response is:
The server cannot or will not process the request due to something that is perceived to be a client error (e.g., malformed request syntax, invalid request message framing, or deceptive request routing).
The code below posts a request using the Python requests library. A 400 response is received when it is executed. Could this be due to header syntax or formatting issues?
Code:
import json

import requests

headers = {
    'Host': 'host.url',
    'Content-Length': '1847',
    'Sec-Ch-Ua': '"Chromium";v="95", ";Not A Brand";v="99"',
    'Accept': 'application/json, text/plain, */*',
    'Content-Type': 'application/json',
    'Authorization': 'auth-key',
    'Sec-Ch-Ua-Mobile': '?0',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.54 Safari/537.36',
    'Sec-Ch-Ua-Platform': '"Windows"',
    'Origin': 'origin.url',
    'Sec-Fetch-Site': 'same-origin',
    'Sec-Fetch-Mode': 'cors',
    'Sec-Fetch-Dest': 'empty',
    'Referer': 'referer.url',
    'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'en-US,en;q=0.9',
    'Connection': 'close',
}
data = {}
json_object = json.dumps(data, indent=4)
response = requests.post('url', data=json_object, headers=headers, verify=False)
print(response.text)
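
One thing that stands out in the code above is that Host and Content-Length are hard-coded, apparently copied from a captured browser request, while the body actually sent is the serialized empty dict, which is 2 bytes rather than 1847. A Content-Length that does not match the body is a plausible cause of a 400, though nothing in the question confirms it. The sketch below shows one way to drop such headers and let the HTTP library compute them; the helper names drop_transport_headers and encode_json_body are invented for this example.

```python
import json

# Headers an HTTP client normally derives from the URL and the body. Copying
# them from a captured request risks a mismatch with what is actually sent.
TRANSPORT_HEADERS = {"host", "content-length"}


def drop_transport_headers(headers):
    """Return a copy of headers without the ones the client should compute."""
    return {k: v for k, v in headers.items() if k.lower() not in TRANSPORT_HEADERS}


def encode_json_body(data):
    """Serialize data compactly and return the bytes that would be sent."""
    return json.dumps(data, separators=(",", ":")).encode("utf-8")


# A subset of the headers from the question, for illustration.
captured = {
    "Host": "host.url",
    "Content-Length": "1847",
    "Content-Type": "application/json",
    "Authorization": "auth-key",
}

body = encode_json_body({})
headers = drop_transport_headers(captured)
print(len(body), sorted(headers))
```

With the requests library, passing json=data instead of data=json_object has a similar effect: the library serializes the body and sets Content-Length from the bytes it actually sends.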

Web scraping beginner. AJAX POST request not working

I have just started out with web scraping.
The data I need seems to be returned by an AJAX POST request. POST requests are very rarely covered by scraping tutorials and seem to come with lots of "gotchas" for new users like myself.
I copied the request from Chrome dev tools into Postman as cURL and then generated the Python request code. The request uses a peculiar set of query parameters. However, I have repeated this process, and the only parameter that changes is the session ID.
The problem is that the request stops working after some time has elapsed (Internal Server Error 500). I then have to copy the request from the site again with the new session ID.
Any pointers in the right direction would be appreciated.
import requests
url = "https://online.natis.gov.za/gateway/PreBooking?_flowId=PreBooking&_flowExecutionKey=e1s2&flowName=[object%20Object]&_eventId_next=Next?dtoName=perSummaryDetailDto&viewId=perSummaryDetail&flowExecutionKey=e1s2&flowExecutionUrl=%2Fgateway%2FPreBooking%3F_flowId%3DPreBooking%26_flowExecutionKey%3De1s2&sessionId=IWhelPTLyYDa7JohJV6x8So_qEKdC8wOknArAXkS&surname={SURNAME}&initials=R&firstName1={FIRSTNAME}&emailAddress={EMAIL}&cellN={CELL}&isWithinPriorityDate=false&viewPrioritySlots=false&showPrioritySlotsModal=false&provcdt=4&supportUser=false"
payload = {}
headers = {
    'Connection': 'keep-alive',
    'Content-Length': '0',
    'sec-ch-ua': '" Not;A Brand";v="99", "Google Chrome";v="91", "Chromium";v="91"',
    'Accept': 'application/json, text/plain, */*',
    'sec-ch-ua-mobile': '?0',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36',
    'Origin': 'https://online.natis.gov.za',
    'Sec-Fetch-Site': 'same-origin',
    'Sec-Fetch-Mode': 'cors',
    'Sec-Fetch-Dest': 'empty',
    'Referer': 'https://online.natis.gov.za/',
    'Accept-Language': 'en-US,en;q=0.9',
    'Cookie': 'JSESSIONID=IWhelPTLyYDa7JohJV6x8So_qEKdC8wOknArAXkS.master:gateway_3; Gateway=R35619282; ROUTEID.33f40c02f95309866c572c0def16f016=.node1; JSESSIONID=BadmtwJ7c8YWEz73xe6Wu165Q7gapmm4WTY6at-p.master:gateway_3; Gateway=R35619282',
    'dnt': '1',
    'sec-gpc': '1'
}
response = requests.request(
    "POST", url, headers=headers, data=payload, verify=False)
print(response.text)
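
In the request above, the sessionId query parameter is identical to the part of the first JSESSIONID cookie before the first dot, so the two presumably expire together. The sketch below derives the parameter from the cookie value instead of pasting both by hand. It assumes that this relationship holds for every session, which the question does not confirm, and the helper names session_id_from_cookie and build_query are invented for the example.

```python
from urllib.parse import urlencode


def session_id_from_cookie(jsessionid):
    """Return the part of a JSESSIONID value before the first '.' suffix."""
    return jsessionid.split(".", 1)[0]


def build_query(jsessionid, **fields):
    """Build a query string whose sessionId matches the given cookie value."""
    params = {"sessionId": session_id_from_cookie(jsessionid)}
    params.update(fields)
    return urlencode(params)


# Cookie value taken from the request in the question.
cookie_value = "IWhelPTLyYDa7JohJV6x8So_qEKdC8wOknArAXkS.master:gateway_3"
print(build_query(cookie_value, provcdt=4, supportUser="false"))
```

With the requests library, a requests.Session() object stores the cookies set by an initial GET to the site, so a fresh JSESSIONID could be read from the session rather than copied from the browser. Whether the server issues a usable session on a plain GET is an assumption that would need checking.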

Unable to send data in a header to a Spring Boot application from Angular 6

I am trying to call my Spring Boot APIs from Angular 6. I am sending some data as part of the headers, like this:
const headers = new HttpHeaders({
  'X-TenantID': tenantId,
  'Accept': 'application/json'
});
this.httpClient.get(this.constants.urls.TENANT.VALIDATE + '/' + tenantId, { headers: headers });
At the back end, I am using an interceptor, and there I am trying to read 'X-TenantID' from the request headers like this:
if (request.getHeader("X-TenantID") != null) {
    String tenantName = request.getHeader("X-TenantID");
    ThreadLocaleStorage.setTenant(tenantName);
    return tenantName;
}
Unfortunately, it always returns null for 'X-TenantID'. When I print all the headers, I get the following:
host: localhost:8080
connection: keep-alive
access-control-request-method: GET
origin: http://localhost:4200
user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36
dnt: 1
access-control-request-headers: x-tenantid
accept: */*
accept-encoding: gzip, deflate, br
accept-language: en-US,en;q=0.9
It's clear that x-tenantid is present in the headers, so why am I not getting its value?
Please help me: how can I get this value from the headers?

Resources