Python renaming triggers an error: file is being used by another process (Windows)

I'm getting the following error message, and I can't figure out why renaming is triggering it:
Traceback (most recent call last):
File "C:\Apps\UtilitiesByMarc\make hypertext list from directory aaa.py", line 17, in <module>
os.rename(fullname, nufullname)
WindowsError: [Error 32] The process cannot access the file because it is being used by another process
Any insights would be much appreciated.
Here's my code:
import os

dir_path = r'E:\ddelete'
dir_list = os.listdir(dir_path)
html_file = r"E:\ddelete\AAA__ReadMeFirst___dirlist.html"
text_file = open(html_file, "w")
i = 10000
for item in sorted(dir_list):
    fullname = os.path.join(dir_path, item)
    ext = os.path.splitext(fullname)[1]
    nufullname = os.path.join(dir_path, str(i) + ext)
    nufilename = str(i) + ext
    print fullname
    print nufullname
    os.rename(fullname, nufullname)
    temp_item = item + '<br />' + '\r\n'
    text_file.write(temp_item)
    i += 1
text_file.close()

I just realized that I was attempting to rename the html_file to which I was writing text.
I'm sorry for wasting your time.
Marc
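For anyone who hits the same error: a minimal sketch of the fix, using the same paths as above, is to skip the file you are holding open for writing (or keep the HTML file outside dir_path), so os.rename never touches it:

import os

dir_path = r'E:\ddelete'
html_file = r"E:\ddelete\AAA__ReadMeFirst___dirlist.html"
text_file = open(html_file, "w")
i = 10000
for item in sorted(os.listdir(dir_path)):
    fullname = os.path.join(dir_path, item)
    if fullname == html_file:  # skip the file we currently hold open for writing
        continue
    ext = os.path.splitext(fullname)[1]
    os.rename(fullname, os.path.join(dir_path, str(i) + ext))
    text_file.write(item + '<br />' + '\r\n')
    i += 1
text_file.close()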

Related

Python: Opening auto-generated file

As part of my larger program, I want to create a logfile with the current time & date as part of the title. I can create it as follows:
malwareLog = open(datetime.datetime.now().strftime("%Y%m%d - %H.%M " + pcName + " Malware scan log.txt"), "w+")
Now, my app is going to call a number of other functions, so I'll need to open the file, write some output to it, and close it several times. It doesn't seem to work if I simply go:
malwareLog.open(malwareLog, "a+")
or similar. So how should I reopen a dynamically created txt file whose actual filename I don't know in advance?
When you create the malwareLog object, it has a name attribute which contains the file name.
Here's an example (my test is your malwareLog):
import random

test = open(str(random.randint(0, 999999)) + ".txt", "w+")
test.write("hello ")
test.close()

test = open(test.name, "a+")
test.write("world!")
test.close()

with open(test.name, "r") as f:
    print(f.read())
You can also store the file name in a variable before or after creating the file.
Before:
file_name = "123"
malwareLog = open(file_name, "w")
After:
malwareLog = open(str(random.randint(0, 999999)) + ".txt", "w")
file_name = malwareLog.name
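To answer the original question directly: build the name once, keep it in a variable, and reopen the file in append mode from each function that needs it. A minimal sketch, assuming the same datetime-based naming scheme (one detail changed from the question: pcName is concatenated after strftime so a stray % in the machine name can't be misread as a format code):

import datetime

pcName = "TESTPC"  # hypothetical machine name, standing in for the question's pcName
log_name = datetime.datetime.now().strftime("%Y%m%d - %H.%M ") + pcName + " Malware scan log.txt"

def log_section(text):
    # reopen by name in append mode, write, and close again
    with open(log_name, "a+") as log:
        log.write(text + "\n")

log_section("Scan started")
log_section("Scan finished")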

Airflow UI link in SlackAPIPostOperator?

I'm using SlackAPIPostOperator in Airflow to send Slack messages on task failures.
I wondered if there's a smart way to add a link to the Airflow UI log page of the failed task to the Slack message.
The following is an example of what I want to achieve:
http://myserver-uw1.myaws.com:8080/admin/airflow/graph?execution_date=...&arrange=LR&root=&dag_id=MyDAG&_csrf_token=mytoken
The current message is:
def slack_failed_task(context):
    failed_alert = SlackAPIPostOperator(
        task_id='slack_failed',
        channel="#mychannel",
        token="...",
        text=':red_circle: Failure on: ' +
             str(context['dag']) +
             '\nRun ID: ' + str(context['run_id']) +
             '\nTask: ' + str(context['task_instance']))
    return failed_alert.execute(context=context)
You can build the URL to the UI with the config value base_url under the [webserver] section, and then use Slack's message format <http://example.com|stuff> for links.
from airflow import configuration

def slack_failed_task(context):
    link = '<{base_url}/admin/airflow/log?dag_id={dag_id}&task_id={task_id}&execution_date={execution_date}|logs>'.format(
        base_url=configuration.get('webserver', 'BASE_URL'),
        dag_id=context['dag'].dag_id,
        task_id=context['task_instance'].task_id,
        execution_date=context['ts'])  # equal to context['execution_date'].isoformat()
    failed_alert = SlackAPIPostOperator(
        task_id='slack_failed',
        channel="#mychannel",
        token="...",
        text=':red_circle: Failure on: ' +
             str(context['dag']) +
             '\nRun ID: ' + str(context['run_id']) +
             '\nTask: ' + str(context['task_instance']) +
             '\nSee ' + link + ' to debug')
    return failed_alert.execute(context=context)
You can also do this using the log_url attribute on the task instance:
def slack_failed_task(context):
    failed_alert = SlackAPIPostOperator(
        task_id='slack_failed',
        channel="#mychannel",
        token="...",
        text=':red_circle: Failure on: ' +
             str(context['dag']) +
             '\nRun ID: ' + str(context['run_id']) +
             '\nTask: ' + str(context['task_instance']) +
             '\nLogs: <{url}|to Airflow UI>'.format(url=context['task_instance'].log_url)
    )
    return failed_alert.execute(context=context)
This attribute has been available since version 1.10.4 at the very least.
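For completeness, a sketch of how such a callback is typically wired up, via the task-level on_failure_callback argument (the DAG name, dates, and operator here are placeholders):

from datetime import datetime
from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator

dag = DAG('MyDAG', start_date=datetime(2019, 1, 1), schedule_interval='@daily')

task = DummyOperator(
    task_id='my_task',
    dag=dag,
    on_failure_callback=slack_failed_task,  # Airflow passes the context dict in on failure
)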

Download Images from list of urls

I have a list of URLs in a text file. I want the images to be downloaded to a particular folder. How can I do it? Is there any add-on available in Chrome, or any other program, to download images from the URLs?
Create a folder on your machine.
Place your text file of image URLs in the folder.
cd to that folder.
Use wget -i images.txt
You will find all your downloaded files in the folder.
On Windows 10/11 this is fairly trivial using:
for /F "eol=;" %f in (filelist.txt) do curl -O %f
Note that including eol=; lets us mask individual exclusions by adding ; at the start of those lines in filelist.txt that we do not want this time. If using the above in a batch file GetFileList.cmd, double those %'s (i.e. %%f).
Windows 7 has an FTP command, but that can often throw up a firewall dialog requiring a user authorization response.
If you are running Windows 7 and want to download a list of URLs without installing wget.exe or another dependency like curl.exe (which would be simplest as the first command), the shortest compatible way is a PowerShell command (not my favorite for speed, but if needs must).
The file with URLs is filelist.txt, and IWR (Invoke-WebRequest) is the PowerShell near-equivalent of wget.
The SecurityProtocol command that comes first ensures we are using the modern TLS 1.2 protocol.
-OutF ... Split-Path ... means the filenames will be the same as the remote filenames, but in the CWD (current working directory); for scripting you can cd /d folder if necessary.
PS> [Net.ServicePointManager]::SecurityProtocol = "Tls12" ; GC filelist.txt | % {IWR $_ -OutF $(Split-Path $_ -Leaf)}
To run it from CMD, use a slightly different set of quotes around 'Tls12':
PowerShell -C "& {[Net.ServicePointManager]::SecurityProtocol = 'Tls12' ; GC filelist.txt | % {IWR $_ -OutF $(Split-Path $_ -Leaf)}}"
This needs to be made into a function with error handling, but it repeatedly downloads images for image classification projects (a sketch of that refactor follows the snippet below):
import pandas as pd
import requests

urls = pd.read_csv('cat_urls.csv')  # load the URL list as a dataframe
rows = []
for index, i in urls.iterrows():
    rows.append(i[-1])

counter = 0
for i in rows:
    file_name = 'cat' + str(counter) + '.jpg'
    print(file_name)
    response = requests.get(i)
    file = open(file_name, "wb")
    file.write(response.content)
    file.close()
    counter += 1
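Picking up the caveat above, here is a minimal sketch of the same loop wrapped in a function with basic error handling (same CSV and naming scheme as the snippet; the timeout and raise_for_status call are additions):

import pandas as pd
import requests

def download_images(csv_path, prefix='cat'):
    """Download every URL in the last column of csv_path; return the error count."""
    urls = pd.read_csv(csv_path).iloc[:, -1]
    errors = 0
    for counter, url in enumerate(urls):
        file_name = prefix + str(counter) + '.jpg'
        try:
            response = requests.get(url, timeout=10)
            response.raise_for_status()  # turn HTTP 4xx/5xx responses into exceptions
            with open(file_name, 'wb') as f:
                f.write(response.content)
        except (requests.RequestException, IOError) as e:
            print('Failed %s: %s' % (url, e))
            errors += 1
    return errors

print('%d errors' % download_images('cat_urls.csv'))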
import os
import time
import sys
import urllib
import ssl
from progressbar import ProgressBar

def get_raw_html(url):
    version = (3, 0)
    curr_version = sys.version_info
    if curr_version >= version:  # if the current version of Python is 3.0 or above
        import urllib.request  # urllib library for extracting web pages
        try:
            headers = {}
            headers['User-Agent'] = "Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.27 Safari/537.17"
            request = urllib.request.Request(url, headers=headers)
            resp = urllib.request.urlopen(request)
            respData = str(resp.read())
            return respData
        except Exception as e:
            print(str(e))
    else:  # if the current version of Python is 2.x
        import urllib2
        try:
            headers = {}
            headers['User-Agent'] = "Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.27 Safari/537.17"
            request = urllib2.Request(url, headers=headers)
            try:
                response = urllib2.urlopen(request)
            except urllib2.URLError:  # handle SSL certificate failures
                context = ssl._create_unverified_context()
                response = urllib2.urlopen(request, context=context)
            raw_html = response.read()
            return raw_html
        except:
            return "Page Not found"

def next_link(s):
    start_line = s.find('rg_di')
    if start_line == -1:  # if no links are found, return a sentinel
        end_quote = 0
        link = "no_links"
        return link, end_quote
    else:
        start_line = s.find('"class="rg_meta"')
        start_content = s.find('"ou"', start_line + 1)
        end_content = s.find(',"ow"', start_content + 1)
        content_raw = str(s[start_content + 6:end_content - 1])
        return content_raw, end_content

def all_links(page):
    links = []
    while True:
        link, end_content = next_link(page)
        if link == "no_links":
            break
        else:
            links.append(link)  # append all the links to the list named 'links'
            # time.sleep(0.1)  # a timer could be used to slow down the image requests
            page = page[end_content:]
    return links

def download_images(links, search_keyword):
    choice = input("Do you want to save the links? [y]/[n]: ")
    if choice == 'y' or choice == 'Y':
        # write all the links into a text file
        f = open('links.txt', 'a')  # open the text file called links.txt
        for link in links:
            f.write(str(link))
            f.write("\n")
        f.close()  # close the file
    num = input("Enter number of images to download (max 100): ")
    counter = 1
    errors = 0
    search_keyword = search_keyword.replace("%20", "_")
    directory = search_keyword + '/'
    if not os.path.isdir(directory):
        os.makedirs(directory)
    pbar = ProgressBar()
    for link in pbar(links):
        if counter <= int(num):
            file_extension = link.split(".")[-1]
            filename = directory + str(counter) + "." + file_extension
            # print("Downloading image: " + str(counter) + '/' + str(num))
            try:
                urllib.request.urlretrieve(link, filename)
            # HTTPError/URLError are subclasses of IOError in Python 3, so catch them first
            except urllib.error.HTTPError as e:
                errors += 1
                # print("\nHTTPError on image " + str(counter))
            except urllib.error.URLError as e:
                errors += 1
                # print("\nURLError on image " + str(counter))
            except IOError:
                errors += 1
                # print("\nIOError on image " + str(counter))
            counter += 1
    return errors

def search():
    version = (3, 0)
    curr_version = sys.version_info
    if curr_version >= version:  # if the current version of Python is 3.0 or above
        import urllib.request  # urllib library for extracting web pages
    else:
        import urllib2  # if the current version of Python is 2.x
    search_keyword = input("Enter the search query: ")
    # download the image links
    links = []
    search_keyword = search_keyword.replace(" ", "%20")
    url = 'https://www.google.com/search?q=' + search_keyword + '&espv=2&biw=1366&bih=667&site=webhp&source=lnms&tbm=isch&sa=X&ei=XosDVaCXD8TasATItgE&ved=0CAcQ_AUoAg'
    raw_html = get_raw_html(url)
    links = links + all_links(raw_html)
    print("Total Image Links = " + str(len(links)))
    print("\n")
    errors = download_images(links, search_keyword)
    print("Download Complete.\n" + str(errors) + " errors while downloading.")

search()
In this Python project I search unsplash.com, which gives me a list of URLs; I then save a number of them (pre-defined by the user) to a pre-defined folder. Check it out.
On Windows, install wget - https://sourceforge.net/projects/gnuwin32/files/wget/1.11.4-1/
and add C:\Program Files (x86)\GnuWin32\bin to your environment PATH.
Create a folder with a txt file of all the images you want to download.
In the location bar at the top of File Explorer, type cmd.
When the command prompt opens, enter the following:
wget -i images.txt --no-check-certificate

Handling authentication when retrieving an image from an authenticated website via urllib2

I am trying to make a small API that logs into an internal monitoring tool via the web and retrieves images on pages that I specify, using the login credentials I specify. It's not passing any authentication to the last section after it has already built the URL. After the URL is built, it is put into the variable y. I then attempt to open y and save it; that is where the authentication problem is happening. Please scroll down for examples.
import urllib
import urllib2
import lxml, lxml.html

ID = raw_input("Enter ID:")
PorS = raw_input("Enter 1 for Primary 2 for Secondary:")
base_link = 'http://statseeker/cgi/nim-report?rid=42237&command=Graph&mode=ping&list=jc-'
dash = '-'
end_link = '&tfc_fav=range+%3D+start_of_today+-+1d+to+now%3B&year=&month=&day=&hour=&minute=&duration=&wday_from=&wday_to=&time_from=&time_to=&tz=America%2FChicago&tfc=range+%3D+start_of_today+-+1d+to+now%3B&rtype=Delay&graph_type=Filled&db_type=Average&x_step=1&interval=60&y_height=100&y_gridlines=5&y_max=&y_max_power=1&x_gridlines=on&legend=on'
urlauth = base_link + ID + dash + PorS + end_link
print urlauth
realm = 'statseeker'
username = 'admin'
password = '*****'
auth_handler = urllib2.HTTPBasicAuthHandler()
auth_handler.add_password(realm, urlauth, username, password)
opener = urllib2.build_opener(auth_handler)
urllib2.install_opener(opener)
data = opener.open(urlauth).read()
html = lxml.html.fromstring(data)
imgs = html.cssselect('img.graph')
for x in imgs:
    y = 'http://statseeker%s' % (x.attrib['src'])
    g = urllib2.urlopen(y).read()
    open('test.jpg', 'wb').write(g)
    print 'http://statseeker%s' % (x.attrib['src'])
    with open('statseeker.html', 'a') as f:
        f.write(y)
Result:
C:\Users\user\Documents\Scripting>python test.py
Enter ID:4050
Enter 1 for Primary 2 for Secondary:1
http://statseeker/cgi/nim-report?rid=42237&command=Graph&mode=ping&list=jc-4050-1&tfc_fav=range+%3D+start_of_today+-+1d+to+now%3B&year=&month=&day=&hour=&minute=&duration=&wday_from=&wday_to=&time_from=&time_to=&tz=America%2FChicago&tfc=range+%3D+start_of_today+-+1d+to+now%3B&rtype=Delay&graph_type=Filled&db_type=Average&x_step=1&interval=60&y_height=100&y_gridlines=5&y_max=&y_max_power=1&x_gridlines=on&legend=on
Traceback (most recent call last):
File "JacksonShowAndSave.py", line 35, in <module>
g = urllib2.urlopen(y).read()
File "C:\Python27\lib\urllib2.py", line 127, in urlopen
return _opener.open(url, data, timeout)
File "C:\Python27\lib\urllib2.py", line 410, in open
response = meth(req, response)
File "C:\Python27\lib\urllib2.py", line 523, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Python27\lib\urllib2.py", line 448, in error
return self._call_chain(*args)
File "C:\Python27\lib\urllib2.py", line 382, in _call_chain
result = func(*args)
File "C:\Python27\lib\urllib2.py", line 531, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 401: Authorization Required
C:\Users\user\Documents\Scripting>
How do I fix these errors so that authentication is passed to any page opened on that site? I thought any urlopen request would use the same authentication as above.
The section that is failing:
y = 'http://statseeker%s' % (x.attrib['src'])
g = urllib2.urlopen(y).read()
open('test.jpg', 'wb').write(g)
print 'http://statseeker%s' % (x.attrib['src'])
with open('statseeker.html', 'a') as f:
    f.write(y)
I built another auth handler for the request:
auth_handler2 = urllib2.HTTPBasicAuthHandler()
auth_handler2.add_password(realm, y, username, password)
opener2 = urllib2.build_opener(auth_handler2)
urllib2.install_opener(opener2)
link = urllib2.Request(y)
response = urllib2.urlopen(link)
output = open('out2.jpg','wb')
output.write(response.read())
output.close()
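If the goal is for every later urllib2.urlopen call to reuse the credentials without building a second handler each time, one sketch is to install a single opener backed by HTTPPasswordMgrWithDefaultRealm, registered for the site root so it also matches the graph image URLs (the credentials below are the placeholders from the question, and the image URL is hypothetical):

import urllib2

password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
# registering the site root makes the credentials apply to every URL under it
password_mgr.add_password(None, 'http://statseeker/', 'admin', '*****')
opener = urllib2.build_opener(urllib2.HTTPBasicAuthHandler(password_mgr))
urllib2.install_opener(opener)

y = 'http://statseeker/cgi/some-graph.png'  # hypothetical image URL, as built in the loop above
g = urllib2.urlopen(y).read()  # plain urlopen now authenticates too
with open('test.jpg', 'wb') as out:
    out.write(g)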

MongoEngine and GridFS - Write files with ImageGridFsProxy

I'm playing around with MongoEngine (0.8.2) and trying to write some images using the GridFS functionality. This is how I have declared my model:
class Lady(DynamicDocument):
    id = ObjectIdField()
    name = StringField()
    offers = ListField(EmbeddedDocumentField(Offer))
    pictures = ListField(ImageField())
So what I want to do is to append the pictures to the pictures field. What I tried is the following:
fileA = open('/home/evermean/Pictures/a.jpg', 'r')
fileB = open('/home/evermean/Pictures/b.jpg', 'r')
upload = [fileA, fileB]
files = []
for f in upload:
    mf = ImageGridFsProxy()
    mf.put(f, content_type='image/jpeg')
    files.append(mf)
lady.pictures = files
lady.save()
What I get is the following:
Error
Traceback (most recent call last):
File "/home/evermean/Code/django/pourlamour/core/tests.py", line 38, in test_basic_addition
mf.put(f, content_type = 'image/jpeg')
File "/home/evermean/Code/django/env/pourlamour/local/lib/python2.7/site-packages/mongoengine/fields.py", line 1257, in put
field = self.instance._fields[self.key]
AttributeError: 'NoneType' object has no attribute '_fields'
I tried to figure out what goes wrong, and the following line from fields.py seems to be responsible, although I cannot quite see what to do about it:
field = self.instance._fields[self.key]
I also tried to use the GridFSProxy, which works... so why does it fail with ImageGridFsProxy?
Any ideas? Where is my mistake? Thanks for your help.
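Since the question notes that the plain GridFSProxy does work, here is a minimal sketch of that variant for comparison, assuming lady is the document instance from the question and pictures is declared as ListField(FileField()) rather than ListField(ImageField()):

from mongoengine.fields import GridFSProxy

files = []
for path in ['/home/evermean/Pictures/a.jpg', '/home/evermean/Pictures/b.jpg']:
    with open(path, 'rb') as f:
        mf = GridFSProxy()
        mf.put(f, content_type='image/jpeg')  # plain GridFS put, no image-specific processing
        files.append(mf)
lady.pictures = files
lady.save()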
