Heroku problem when trying to download .CSV file into local download folder

I have a requirement to be able to download attendee details from an SQL table into a CSV file.
The code below works perfectly in my local deployment, creating a .csv file in the /app/static directory, which the return statement then downloads to my desktop.
When uploaded to Heroku, the same code fails on the final line (the return) with:
FileNotFoundError: [Errno 2] No such file or directory: '/app/app/static/users1001.csv'
This suggests it has created the file on the server but is looking in the wrong place (note the extra '/app'). However, no file has been created anywhere (I've checked using heroku run bash). The code:
name = 'attendees' + str(name)  # name = 1001 in this example
f = open('%s.csv' % name, 'w')
out = csv.writer(f)
out.writerow(['Date', 'Activity No', 'Activity', 'User No', 'User Name',
              'Cost', 'Received', 'Update Time'])
for item in Attendees.query.filter_by(club=current_user.club)\
        .order_by(Attendees.activitydate.desc(), Attendees.activityname):
    out.writerow([item.activitydate, item.activitynumber,
                  item.activityname, item.usernum,
                  item.username, item.cost, item.received, item.update_time])
# this was necessary for the local deployment: the csv writer sends to
# 'static' by default, but send_file needs the path stated explicitly
path = 'static/' + name + '.csv'
f.close()
return send_file(path, as_attachment=True)
My question therefore is: does anyone know where this file is stored on Heroku, so that I can complete the download?
One further wrinkle that I can't explain: if I run this for club 1002 (not 1001), the code works and downloads a file attendees1002.csv (a file I created locally, which was uploaded to Heroku by Git). You will see below that this file is actually in /app/static/temp, which is really confusing.
From heroku run bash:
~/app/static $ dir
1002.csv clubmanager2020.jpg code39_barcode.svg eds.jpg loading.gif ratings.csv
bwt.jpg clubmanager2020.png dump.csv eetc.jpg out.csv temp
~/app/static $ cd temp
~/app/static/temp $ dir
attendees1002.csv out.csv ratings.csv users1002.csv
Any help greatly appreciated.

I've found some documentation that explains the problem: Heroku does not allow creation of files on the system except under the /tmp directory off the root. The problem was fixed by changing:
name = 'attendees' + str(name)
to
name = '../../tmp/attendees' + str(name)
and
path = 'static/' + name + '.csv'
to
path = name + '.csv'
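For reference, here is a minimal sketch of the whole route writing to the temp directory via Python's tempfile module instead of a hard-coded relative path. The route decorator and function name are hypothetical; Attendees, current_user and the column headings are taken from the question:

import csv
import os
import tempfile

from flask import send_file

@app.route('/attendees/<int:club_id>/download')
def download_attendees(club_id):
    # tempfile.gettempdir() returns /tmp on a Heroku dyno, so no
    # '../..' gymnastics are needed and the resulting path is absolute.
    path = os.path.join(tempfile.gettempdir(), 'attendees%s.csv' % club_id)
    with open(path, 'w', newline='') as f:
        out = csv.writer(f)
        out.writerow(['Date', 'Activity No', 'Activity', 'User No',
                      'User Name', 'Cost', 'Received', 'Update Time'])
        for item in Attendees.query.filter_by(club=current_user.club)\
                .order_by(Attendees.activitydate.desc(), Attendees.activityname):
            out.writerow([item.activitydate, item.activitynumber,
                          item.activityname, item.usernum, item.username,
                          item.cost, item.received, item.update_time])
    # An absolute path leaves send_file no room to resolve against the
    # wrong working directory.
    return send_file(path, as_attachment=True)

Bear in mind that /tmp on Heroku is ephemeral: anything written there disappears when the dyno restarts, which is fine for a file generated and downloaded in the same request.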

Related

Using gitpython to get current hash does not work when using qsub for job submission on a cluster

I use Python to do my data analysis, and lately I came up with the idea of saving the current git hash in a log file so I can later check which code version created my results (in case I find inconsistencies or whatever).
It works fine as long as I do it locally:
import git
import os

rep = git.Repo(os.getcwd(), search_parent_directories=True)
git_hash = rep.head.object.hexsha
with open('logfile.txt', 'w+') as writer:
    writer.write('Code version: {}'.format(git_hash))
However, I have a lot of heavy calculations that I run on a cluster to speed things up (running analyses of subjects in parallel), using qsub, which looks more or less like this:
qsub -l nodes=1:ppn=12 analysis.py -q shared
This always results in a git.exc.InvalidGitRepositoryError.
EDIT
Printing os.getcwd() showed me that on the cluster the current working directory is always my $HOME directory, no matter where I submit the job from.
My next solution was to get the directory where the file is located, using some of the solutions suggested here.
However, these solutions result in the same error because (as I understand it) my file is somehow copied to a directory deep in the root structure of the cluster's head node (/var/spool/torque/mom_priv/jobs).
I could of course hardcode the location of my file, but I would like a general solution for all my scripts.
After I explained my problem to IT in detail, they helped me solve it.
Apparently the $PBS_O_WORKDIR variable stores the directory from which the job was submitted.
So I adjusted my access to the git hash as follows:
try:
    script_file_directory = os.environ["PBS_O_WORKDIR"]
except KeyError:
    script_file_directory = os.getcwd()

try:
    rep = git.Repo(script_file_directory, search_parent_directories=True)
    git_hash = rep.head.object.hexsha
except git.InvalidGitRepositoryError:
    git_hash = 'not-found'

# create a log file that saves some information about the run script
with open('logfile.txt', 'w+') as writer:
    writer.write('Code version: {}\n'.format(git_hash))
I first check whether the PBS_O_WORKDIR variable exists (that is, whether I am running the script as a job on the cluster). If it does, I get the git hash from that directory; if it doesn't, I use the current working directory.
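The same logic packages neatly into a small helper that can be dropped into any script (a sketch; get_git_hash is a hypothetical name, not from the original answer):

import os
import git

def get_git_hash():
    # Prefer the submit directory that PBS/Torque records for the job;
    # fall back to the current working directory for local runs.
    repo_dir = os.environ.get("PBS_O_WORKDIR", os.getcwd())
    try:
        repo = git.Repo(repo_dir, search_parent_directories=True)
        return repo.head.object.hexsha
    except git.InvalidGitRepositoryError:
        return 'not-found'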
Very specific, but maybe one day someone has the same problem...

Issues with os.listdir when script is an executable

I have created a script that takes a file from one folder and produces another file in another folder. This is a project to convert one format into another, to be used by people who don't have a strong background in informatics, so I have created a folder containing the script plus an input folder and an output folder. The user just needs to put the input file in the input folder and take the results from the output folder.
The script works fine when I run it from Visual Studio Code, as well as when I run it from the terminal (python CSVtoVCFv3.py), but when I convert the script into an executable with PyInstaller I get the following error:
File "CSVtoVCFv3.py", line 99, in <module>
FileNotFoundError: [Errno 2] No such file or directory: '/Users/manoldominguez/input/'
[99327] Failed to execute script CSVtoVCFv3
The code used in line 99 is:
 97 actual_path = os.getcwd()
 98 folder_input = '/input/'
 99 input_file_name = os.listdir(actual_path + folder_input)
100 input_file_name = ''.join(input_file_name)
101 CSV_input = actual_path + folder_input + input_file_name
I have also tried this:
actual_path = os.path.dirname(os.path.realpath('CSVtoVCFv3.py'))
So, as far as I can understand, the issue is this: if I run the script directly, these lines give
'/Users/manoldominguez/Desktop/CSVtoVCF/input/'
but if the script runs as the executable, I get
'/Users/manoldominguez/input/'
os.getcwd() gives the Current Working Directory - that is, the folder in which the script was executed, which doesn't have to be the folder in which the script is saved. This way you can run code in one folder while it works with files in a different folder - and that can be useful.
But if you need to work with files in the folder where the script itself lives, then you can get that folder using
SCRIPT_PATH = os.path.dirname(os.path.realpath(__file__))
or
import sys
SCRIPT_PATH = os.path.dirname(os.path.realpath(sys.argv[0]))
not with 'CSVtoVCFv3.py'.
And then you can join it:
SCRIPT_PATH = os.path.dirname(os.path.realpath(sys.argv[0]))
folder_input = 'input'  # no leading '/' - os.path.join discards every
                        # component before an absolute path
full_folder_input = os.path.join(SCRIPT_PATH, folder_input)

all_filenames = os.listdir(full_folder_input)
for input_file_name in all_filenames:
    #CSV_input = os.path.join(full_folder_input, input_file_name)
    CSV_input = os.path.join(SCRIPT_PATH, folder_input, input_file_name)
I only don't like your
input_file_name = os.listdir(actual_path + folder_input)
input_file_name = ''.join(input_file_name)
because listdir() may return more than one name, and then your join may create an incorrect path. Better to take input_file_name[0] for a single file, or use a for-loop to work with all the files in the folder, as shown above.
BTW: maybe you should use sys.argv to get the path as a parameter, so everyone can decide where to put the file.
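One more pattern that can help when a script is bundled: PyInstaller sets sys.frozen on the executable (documented PyInstaller behavior), so the code can pick the right base folder in both modes. A minimal sketch, with the input folder name taken from the question:

import os
import sys

if getattr(sys, 'frozen', False):
    # Running as a PyInstaller bundle: use the executable's folder.
    SCRIPT_PATH = os.path.dirname(os.path.realpath(sys.executable))
else:
    # Running as a plain script: use the .py file's folder.
    SCRIPT_PATH = os.path.dirname(os.path.realpath(__file__))

input_dir = os.path.join(SCRIPT_PATH, 'input')
for input_file_name in os.listdir(input_dir):
    CSV_input = os.path.join(input_dir, input_file_name)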

s3cmd sync is remote copying the wrong files to the wrong locations

I've got the following as part of a shell script that copies site files up to an S3 CDN:
for i in "${S3_ASSET_FOLDERS[@]}"; do
    s3cmd sync -c /path/to/.s3cfg --recursive --acl-public --no-check-md5 --guess-mime-type --verbose --exclude-from=sync_ignore.txt /path/to/local/${i} s3://my.cdn/path/to/remote/${i}
done
Say S3_ASSET_FOLDERS is:
("one/" "two/")
and say both of those folders contain a file called... "script.js"
and say I've made a change to two/script.js - but not touched one/script.js
Running the above command will first copy the file from /one/ to the correct location, although I've no idea why it thinks it needs to:
INFO: Sending file '/path/to/local/one/script.js', please wait...
File '/path/to/local/one/script.js' stored as 's3://my.cdn/path/to/remote/one/script.js'
(13551 bytes in 0.1 seconds, 168.22 kB/s) [1 of 0]
... and then a remote copy operation for the second folder:
remote copy: two/script.js -> script.js
What's it doing? Why? Those files aren't even similar: different modified times, different checksums, no relation.
And I end up with an S3 bucket containing two incorrect files. The file in /two/ that should have been updated hasn't been, and the file in /one/ that shouldn't have changed is now overwritten with the contents of /two/script.js.
Clearly I'm doing something bizarrely stupid, because I don't see anyone else having the same issue. But I've no idea what.
First of all, try running it without the --no-check-md5 option.
Second, I suggest you pay attention to the directory names, specifically the trailing slashes.
The s3cmd documentation says:
With directories there is one thing to watch out for - you can either upload the directory and its contents or just the contents. It all depends on how you specify the source.
To upload a directory and keep its name on the remote side, specify the source without the trailing slash.
On the other hand, to upload just the contents, specify the directory with a trailing slash.
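Concretely, the difference looks like this (hypothetical paths, following the documentation quoted above):
# without a trailing slash the directory itself is uploaded,
# so the file ends up at s3://my.cdn/path/to/remote/one/script.js
s3cmd sync /path/to/local/one s3://my.cdn/path/to/remote/
# with a trailing slash only the contents of one/ are uploaded,
# so the file lands directly at s3://my.cdn/path/to/remote/script.js
s3cmd sync /path/to/local/one/ s3://my.cdn/path/to/remote/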

Where do the files created with File.new actually get stored in Ruby?

I am creating files from within Ruby scripts and adding stuff to them. But where are the files that I am creating actually stored?
I'm very new to this, sorry!
The files are created at whatever location you specify. For instance:
f = File.new("another_test.txt", "w+")
will create the file in the current working directory. You can specify a path along with the file name, but note that File.new does not expand ~ itself, so use File.expand_path for paths relative to your home directory. For example:
f = File.new(File.expand_path("~/Desktop/another_test.txt"), "w+") # creates the file on the desktop
For more details, check the File documentation.
Updated: incorporated mu is too short's correction.

Deleted files not being removed from HDD in Ruby

I'm experiencing a weird situation when deleting files in Ruby: the code seems to report success, but the physical file isn't removed from my hard drive. I can do rm path/to/file on the command line, and that works. I can even open up the Rails console and File.safe_unlink the file, and that also works; it's just within my Rails app that it fails to delete the actual file:
def destroy
  Rails.logger.debug local_path              #=> /Users/ryan/.../public/system/.../file.jpg
  Rails.logger.debug File.exist?(local_path) #=> true
  File.safe_unlink(local_path)
  Rails.logger.debug File.exist?(local_path) #=> false
  # yet the physical file actually STILL exists!
end
The physical file is within a Git repo (the repo is stored within /public/system/). Are there any gotchas with that? I've tried using the ruby-git gem to delete the file with the rm command it provides, but that doesn't delete the file either.
I've opened up all the permissions on the files during testing and still nothing works. I've also confirmed the file is writable with File.writable?(local_path), which returned true.
Any thoughts why it could be preventing the file from being removed?
Have you checked the permissions on the directory? Deletion is a directory write operation, not a file write operation. (rm will also check the file's permissions and ask if you really want to delete a write-protected file, as a courtesy; but if the directory isn't writable, it flat out refuses.)
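That rule is easy to demonstrate. Here is a minimal sketch (in Python, since this is an OS-level property of POSIX filesystems rather than anything Ruby-specific): the file itself stays writable, but removing write permission from its directory makes deletion fail.

import os
import stat
import tempfile

d = tempfile.mkdtemp()
path = os.path.join(d, 'victim.txt')
open(path, 'w').close()

# Make the *directory* read-only; the file itself is untouched.
os.chmod(d, stat.S_IRUSR | stat.S_IXUSR)
try:
    os.remove(path)  # deletion needs write permission on the directory
except PermissionError as e:
    print('delete failed:', e)
finally:
    os.chmod(d, stat.S_IRWXU)  # restore permissions so cleanup works
    os.remove(path)
    os.rmdir(d)

(Run this as a regular user; root bypasses the permission check.)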
