How to speed up the requests query? - parallel-processing

I basically want to brute-force multiple links to check whether the file exists at any of them. Is there any way to increase the speed of the requests, maybe by running multiple requests at the same time? It's currently too slow.
import requests

for no in range(100000, 500000):
    part1 = "https://url/"
    part2 = "_[12시.jpg"
    url = part1 + str(no) + part2
    response = requests.get(url)
    if response.status_code == 200:
        print(url)
        break
    elif response.status_code == 404:
        print('Not Found #', no)

Without going into the complexities of sending requests at the same time, here's an approach that will already speed up your calls: use a requests.Session, which reuses the underlying connection instead of opening a new one for every request.
import requests

s = requests.Session()
for no in range(100000, 500000):
    url = f"https://url/{no}_[12시.jpg"
    response = s.get(url)
    if response.status_code == 200:
        print(url)
        break
    elif response.status_code == 404:
        print(f"Not Found #{no}")

Related

Iterate through asyncio loop

I am very new to aiohttp and asyncio, so apologies for my ignorance up front. I am having difficulties with the event loop portion of the documentation and don't think my code below is executing asynchronously. I am trying to take all combinations of two lists via itertools and POST an XML payload for each one. A more full-blown version is listed here using the requests module, however that is not ideal as I may need to POST 1000+ requests at a time. Here is a sample of how it looks now:
import aiohttp
import asyncio
import itertools

skillid = ['7715','7735','7736','7737','7738','7739','7740','7741','7742','7743','7744','7745','7746','7747','7748','7749','7750','7751','7752','7753','7754','7755','7756','7757','7758','7759','7760','7761','7762','7763','7764','7765','7766','7767','7768','7769','7770','7771','7772','7773','7774','7775','7776','7777','7778','7779','7780','7781','7782','7783','7784']
agent = ['5124','5315','5331','5764','6049','6076','6192','6323','6669','7690','7716']

url = 'https://url'
user = 'user'
password = 'pass'
headers = {
    'Content-Type': 'application/xml'
}

async def main():
    async with aiohttp.ClientSession() as session:
        for x in itertools.product(agent, skillid):
            payload = "<operation><operationType>update</operationType><refURLs><refURL>/unifiedconfig/config/agent/" + x[0] + "</refURL></refURLs><changeSet><agent><skillGroupsRemoved><skillGroup><refURL>/unifiedconfig/config/skillgroup/" + x[1] + "</refURL></skillGroup></skillGroupsRemoved></agent></changeSet></operation>"
            async with session.post(url, auth=aiohttp.BasicAuth(user, password), data=payload, headers=headers) as resp:
                print(resp.status)
                print(await resp.text())

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
I see that coroutines can be used, but I'm not sure that applies here as there is only a single task to execute. Any clarification is appreciated.
Because you're making a request and then immediately await-ing on it, you are only making one request at a time. If you want to parallelize everything, you need to separate making the request from waiting for the response, and you need to use something like asyncio.gather to wait for the requests in bulk.
In the following example, I've modified your code to connect to a local httpbin instance for testing; I'm making requests to the /delay/<value> endpoint so that each request takes a random amount of time to complete.
The theory of operation here is:

- Move the request code into the asynchronous one_request function, which we use to build an array of tasks.
- Use asyncio.gather to run all the tasks at once.
- The one_request function returns an (agent, skillid, response) tuple, so that when we iterate over the responses we can tell which combination of parameters resulted in the given response.
import aiohttp
import asyncio
import itertools
import random

skillid = [
    "7715", "7735", "7736", "7737", "7738", "7739", "7740", "7741", "7742",
    "7743", "7744", "7745", "7746", "7747", "7748", "7749", "7750", "7751",
    "7752", "7753", "7754", "7755", "7756", "7757", "7758", "7759", "7760",
    "7761", "7762", "7763", "7764", "7765", "7766", "7767", "7768", "7769",
    "7770", "7771", "7772", "7773", "7774", "7775", "7776", "7777", "7778",
    "7779", "7780", "7781", "7782", "7783", "7784",
]
agent = [
    "5124", "5315", "5331", "5764", "6049", "6076", "6192", "6323", "6669",
    "7690", "7716",
]

user = 'user'
password = 'pass'
headers = {
    'Content-Type': 'application/xml'
}

async def one_request(session, agent, skillid):
    # I'm setting `url` here because I want a random parameter for
    # each request. You would probably just set this once globally.
    delay = random.randint(0, 10)
    url = f'http://localhost:8787/delay/{delay}'

    payload = (
        "<operation>"
        "<operationType>update</operationType>"
        "<refURLs>"
        f"<refURL>/unifiedconfig/config/agent/{agent}</refURL>"
        "</refURLs>"
        "<changeSet>"
        "<agent>"
        "<skillGroupsRemoved><skillGroup>"
        f"<refURL>/unifiedconfig/config/skillgroup/{skillid}</refURL>"
        "</skillGroup></skillGroupsRemoved>"
        "</agent>"
        "</changeSet>"
        "</operation>"
    )

    # This shows when the task actually executes.
    print('req', agent, skillid)

    async with session.post(
            url, auth=aiohttp.BasicAuth(user, password),
            data=payload, headers=headers) as resp:
        return (agent, skillid, await resp.text())

async def main():
    tasks = []
    async with aiohttp.ClientSession() as session:
        # Add tasks to the `tasks` array
        for x in itertools.product(agent, skillid):
            task = asyncio.ensure_future(one_request(session, x[0], x[1]))
            tasks.append(task)

        print(f'making {len(tasks)} requests')

        # Run all the tasks and wait for them to complete. Return
        # values will end up in the `responses` list.
        responses = await asyncio.gather(*tasks)

    # Just print everything out.
    print(responses)

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
The above code makes 561 requests (11 agents × 51 skill IDs) and runs in about 30 seconds with the random delay I've introduced.
This code runs all the requests at once. If you wanted to limit the
maximum number of concurrent requests, you could introduce a
Semaphore to make one_request block if there were too many active requests.
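
As a rough sketch of that idea (the limit of 10 concurrent requests is an arbitrary assumption), the request coroutine can be wrapped in an asyncio.Semaphore:

import asyncio

# Allow at most 10 requests in flight at any time (arbitrary limit).
semaphore = asyncio.Semaphore(10)

async def limited_request(session, agent, skillid):
    # Each task waits here until one of the 10 slots is free, then runs
    # the same one_request coroutine defined above.
    async with semaphore:
        return await one_request(session, agent, skillid)

In main, you would then build the tasks from limited_request instead of one_request.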
If you wanted to process responses as they arrived, rather than
waiting for everything to complete, you could investigate the
asyncio.wait method instead.
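
As a minimal sketch of that variant, the asyncio.gather call in main could be replaced with a loop over asyncio.wait that handles each batch of finished tasks as soon as it completes:

# Inside main(), replacing `responses = await asyncio.gather(*tasks)`:
pending = set(tasks)
while pending:
    done, pending = await asyncio.wait(
        pending, return_when=asyncio.FIRST_COMPLETED)
    for task in done:
        agent_id, skill, body = task.result()
        # Handle each response as soon as it arrives.
        print('done', agent_id, skill, len(body))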

python - Is my code ever reaching the proxy method? - requests

So I am playing around with proxies with requests. The idea is that if I set a proxy on a requests Session, it will then be used for the whole session. I have written the code below, but I haven't been able to check whether the traffic really goes through the proxy, and I don't know whether this is the right place to ask. What I have done looks like this:
import json
import multiprocessing
import random
import sys
import time

import requests
from colorama import Fore  # assuming colorama for the coloured log output

with open('proxies.json') as json_data_file:
    proxies = json.load(json_data_file)

def setProxy(proxy):
    s = requests.Session()
    proxies = {'http': 'http://' + proxy,
               'https': 'http://' + proxy}
    s.proxies.update(proxies)
    return s

def info(thread):
    global prod
    prod = int(thread) + 1
    runit(proxies)

def runit(proxies):
    try:
        if proxies != []:
            s = setProxy(random.choice(proxies))
            sleepy = time.sleep(.5)
        else:
            s = requests.Session()
            sleepy = time.sleep(1)
        r = s.get(url)
    except requests.exceptions.ProxyError:
        log(Fore.RED + "Proxy DEAD - rotating" + Fore.RESET)
        sleepy
        passwd(proxies)
    PostUrl = s.post('www.hellotest.com')
    print("Does it actually use the proxy or not?")

def main():
    i = 0
    jobs = []
    for i in range(10):
        p = multiprocessing.Process(target=info, args=(str(i),))
        jobs.append(p)
        time.sleep(.5)
        p.start()
    for p in jobs:
        p.join()
    sys.exit()
Is there a way to actually see whether it uses the proxy or not? This is also my first time doing this, so please do not judge!
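
One common way to check is to ask an IP-echo service which address it sees, once through the proxied session and once directly; if the proxy is actually used, the two addresses differ. A minimal sketch (httpbin.org/ip is just one convenient echo service, and the proxy address below is hypothetical):

import requests

def whoami(session):
    # httpbin.org/ip returns the caller's public IP as {"origin": "..."}.
    return session.get('https://httpbin.org/ip', timeout=10).json()['origin']

proxied = setProxy('1.2.3.4:8080')   # hypothetical proxy, reusing setProxy() above
direct = requests.Session()

print('via proxy:', whoami(proxied))
print('direct   :', whoami(direct))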

Tweeting images programmatically

I have a business requirement for the project I'm working on to allow users to print, email and share an image on Facebook and Twitter. The first three are simple, whereas I'm finding it impossible to find a succinct example of how to post a tweet with an image using only client-side scripting. I've seen various solutions using the Twitter API and almost all of them are PHP-based. Surely this can't be that difficult.
This example uses the TwitterAPI Python library.
from TwitterAPI import TwitterAPI

TWEET_TEXT = 'some tweet text'
IMAGE_PATH = './some_image.png'

CONSUMER_KEY = ''
CONSUMER_SECRET = ''
ACCESS_TOKEN_KEY = ''
ACCESS_TOKEN_SECRET = ''

api = TwitterAPI(CONSUMER_KEY, CONSUMER_SECRET, ACCESS_TOKEN_KEY, ACCESS_TOKEN_SECRET)

# STEP 1 - upload image
file = open(IMAGE_PATH, 'rb')
data = file.read()
r = api.request('media/upload', None, {'media': data})
print('UPLOAD MEDIA SUCCESS' if r.status_code == 200 else 'UPLOAD MEDIA FAILURE')

# STEP 2 - post tweet with a reference to uploaded image
if r.status_code == 200:
    media_id = r.json()['media_id']
    r = api.request('statuses/update', {'status': TWEET_TEXT, 'media_ids': media_id})
    print('UPDATE STATUS SUCCESS' if r.status_code == 200 else 'UPDATE STATUS FAILURE')

Delete Post on Facebook Ruby Curl:Easy

I am using the Curl::Easy library to make HTTP DELETE requests to the Facebook Graph API to delete a post.
Code snippet is below:
url = "https://graph.facebook.com/#{id}?access_token=#{token}"
curl = Curl::Easy.new(url)
result = nil
retries = 2
while (!result || curl.response_code != 200) && retries >= 0
result = curl.http_delete
retries -= 1
end
return result
Facebook returns "not valid request" each time. I have verified the access_token in the Facebook debugger.
What am I missing here? Can someone please help?
You should add /post to your URL, as per the reference:
url = "https://graph.facebook.com/post/#{id}?access_token=#{token}"

How to parse HTTP response using Ruby

I've written a short snippet which sends a GET request, performs auth, and checks for a 200 OK response (when auth succeeds). One thing I noticed with this specific GET request is that the response is always 200, irrespective of whether auth succeeds or not.
The difference is in what follows. When auth fails, the first response is still 200 OK, just the same as when auth succeeds, but there is then a second step: the page gets redirected back to the login page.
I am just trying to make a quick script which can check my login user and pass against my web application and tell me which auth attempts passed and which didn't.
How should I check this? The sample code is like this:
def funcA(u, p)
  print_A("#{ip} - '#{u}' : '#{p}' - Pass")
end

def try_login(u, p)
  # Double quotes are needed for #{} interpolation to work.
  path = "/index.php?uuser=#{u}&ppass=#{p}"
  r = send_request_raw({
    'URI'    => path,
    'method' => 'GET'
  })
  check = false
  if r and r.code.to_i == 200
    check = true
  end
  if check == true
    funcA(u, p)
  else
    out = "#{ip} - '#{u}' - Fail"
    print_B(out)
  end
  return check, r
end
Update:
I also tried adding a check for matching a 'Success/Fail' keyword in the HTTP response. It didn't work either, but I now noticed that the response coming back is in a different form than I expected. The Content-Type of the response is text/html;charset=utf-8, though, and since I am not doing any parsing the check fails.
A success response is of the form:
{"param1":1,"param2"="Auth Success","menu":0,"userdesc":"My User","user":"uuser","pass":"ppass","check":"success"}
A fail response is of the form:
{"param1":-1,"param2"="Auth Fail","check":"fail"}
So now I need some pointers on how to parse this response.
Many Thanks.
I do this with "net/http":
require 'net/http'

uri = URI(url)
connection = Net::HTTP.start(uri.host, uri.port)
response = Net::HTTP.get_response(uri)
http_status_code = response.code
connection.finish
If there's a redirect after a 200 then it must be a JavaScript or meta-refresh redirect, so just look for that in the response body.
