Download large files in Heroku - ruby

I am facing some issues when downloading large files on Heroku. I have to download and parse files greater than 1 GB. What I am trying to do right now is use curl to download them into the /tmp folder (of a Rails application).
The curl command is "curl --retry 999 -o #{destination} #{uri} 2> /dev/null", where destination is Rails.root.join("tmp", "file.example").
The problem is that after a few minutes of downloading, the curl process that is downloading the file finishes, long before the download is complete. Just before it finishes, the logs show lots of "Memory exceeded" messages. This led me to think that saving to the /tmp folder keeps the downloaded content in memory, and when memory hits its limit, the process is killed.
I would like to know if any of you have already experienced a similar issue on Heroku, and whether saving to the /tmp folder really works like this. If so, do you have any suggestions to get this working on Heroku?
thanks,
Elvio

You are probably better off saving the file to an external cloud provider like S3 using the fog gem. In any case, Heroku has a read-only filesystem, so they won't allow you to curl to it, much less write to it.
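The answer doesn't include code, so here is a minimal sketch of what a fog upload might look like once the file is on disk; the bucket name, credentials, and object key below are placeholders, not values from the question.
require 'fog'

# Connect to S3 (credentials via environment variables are an assumption -- adjust for your setup).
storage = Fog::Storage.new(
  provider:              'AWS',
  aws_access_key_id:     ENV['AWS_ACCESS_KEY_ID'],
  aws_secret_access_key: ENV['AWS_SECRET_ACCESS_KEY']
)

# "my-bucket" is a placeholder bucket name.
directory = storage.directories.get('my-bucket')

# Passing an open File as the body lets fog stream the upload instead of loading 1 GB+ into memory.
File.open(Rails.root.join('tmp', 'file.example')) do |file|
  directory.files.create(key: 'file.example', body: file)
end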

Related

Can't save a temporary csv file to /home/app directory on heroku using R shiny app [duplicate]

I published my first simple app on Heroku with a free dyno. This app writes a simple .txt file, which seems to be written correctly, because my API services are working fine.
But if I try to check this file by entering the file system with "heroku run bash -a MYAPP", I can't see it in the folder where I expected it to be. It is as if the file does not exist. Can someone tell me why?
Thanks.
I found this on https://devcenter.heroku.com/articles/active-storage-on-heroku:
In addition, any files stored on disk will not be visible from one-off dynos such as a heroku run bash instance or a scheduler task because these commands use new dynos.
It is still not so clear to me, but at least I know it is a normal (but strange) behaviour of Heroku!

Is there any way to inspect the contents of the RocksDB instance used by NEAR Protocol?

Disclosure: I work with NEAR and am currently on-boarding.
When I start up a local node on a clean machine, I see that a .near folder is created in my home directory with a few configuration files (the exact files seem to depend on which start_ script I run). Another folder called data appears inside the .near folder.
Running strings ~/.near/data/*.sst in that folder spits out a few lines starting with the string "rocksdb", which led me to this reference to RocksDB.
Is there any way to inspect the contents of a node's RocksDB instance?
I found Keylord but it crashes when I try to configure a new connection to the database (by pointing the connection to ~/.near/data). I didn't pursue that thread.
PSA1: sometimes it's useful to back up the ~/.near folder between node restarts if you want to reset the environment or avoid reusing old data while troubleshooting.
mv ~/.near ~/.near_`date +%Y-%m-%d.%s`
PSA2: on MacOS you can watch what happens to the contents of the ~/.near folder while the node boots up and runs. (brew install watch).
watch -d -c -n 0.5 find ~/.near
The content of RocksDB is serialized using our own binary serialization format (http://borsh.io/), so you won't be able to examine the content with general-purpose third-party tools.

Google Cloud Functions and shared libraries

I'm trying to use wkhtmltopdf on GCF for PDF generation.
When my function tries to spawn the child process I get the following error:
Error: ./services/wkhtmltopdf: error while loading shared libraries: libXrender.so.1: cannot open shared object file: No such file or directory
The problem is clearly due to the fact that the wkhtmltopdf binary depends on external shared libraries which are not installed in the GCF environment.
Is there a way to solve this issue, or should I give up and use other solutions (AWS Lambda or GAE)?
Thank you in advance
Indeed, I've found a way to solve this issue by copying all required libraries into the same folder (/bin for me) that contains the wkhtmltopdf binary. To let the binary use the uploaded libraries, I added the following lines to wkhtmltopdf.js:
// Point LD_LIBRARY_PATH at the bundled libraries so the binary can locate them at runtime.
wkhtmltopdf.command = 'LD_LIBRARY_PATH=' + path.resolve(__dirname, 'bin') + ' ./bin/wkhtmltopdf';
// Use bash so the environment-variable prefix on the command line is interpreted.
wkhtmltopdf.shell = '/bin/bash';
module.exports = wkhtmltopdf;
Everything worked fine for a while. Then, all of a sudden, I started receiving many connection errors or timeouts from GCF, but I think this is not related to my implementation but rather to Google.
I've ended up setting up a dedicated server.
I have managed to get it working. There are two things that need to be done, as wkhtmltopdf won't work if:
libXrender.so.1 can't be loaded
you are using stdout to collect the resulting PDF; wkhtmltopdf has to write the result into a file
First you need to obtain the correct version of libXrender.
I found out which Docker image Cloud Functions uses as the base for Node.js functions. I ran it locally, installed libxrender, and copied the library into my function's directory.
docker run -it --rm=true -v /tmp/d:/tmp/d gcr.io/google-appengine/nodejs bash
Then, inside the running container:
apt update
apt install libxrender1
cp /usr/lib/x86_64-linux-gnu/libXrender.so.1 /tmp/d
I put this into my function's project directory, under a lib subdirectory. In my function's source file, I then set up LD_LIBRARY_PATH to include the /user_code/lib directory (/user_code is the directory where your function will ultimately be placed by Google):
process.env['LD_LIBRARY_PATH'] = '/user_code/lib'
This is enough for wkhtmltopdf to be able to execute. It will still fail, though, as it won't be able to write to stdout, and the function will eventually time out and be killed (as Matteo experienced). I think this is because Google runs the containers without a tty (just speculation); I can run my code in their container if I run it with the docker run -it flags. To solve this, I invoke wkhtmltopdf so that it writes the output into a file under /tmp (which is an in-memory tmpfs). I then read the file back and send it as my response body. Note that the tmpfs might be reused between function calls, so you need to use a unique file name every time.
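A rough sketch of that flow, assuming an HTTP-triggered function with the Express-style req/res signature; the handler name, binary path, and input URL are illustrative, not taken from the answer.
const { execFile } = require('child_process');
const fs = require('fs');
const path = require('path');

exports.generatePdf = (req, res) => {
  // Let the bundled binary find the copied shared libraries (inherited by the child process).
  process.env['LD_LIBRARY_PATH'] = '/user_code/lib';

  // Unique file name per invocation, since the /tmp tmpfs may be reused between calls.
  const outFile = path.join('/tmp', `out-${Date.now()}-${Math.random().toString(36).slice(2)}.pdf`);

  // Write the PDF to a file instead of stdout, which hangs on GCF.
  execFile('./bin/wkhtmltopdf', ['http://example.com', outFile], (err) => {
    if (err) {
      res.status(500).send(err.message);
      return;
    }
    const pdf = fs.readFileSync(outFile);
    fs.unlinkSync(outFile); // clean up the tmpfs
    res.set('Content-Type', 'application/pdf').send(pdf);
  });
};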
This seems to do the trick, and I am able to run wkhtmltopdf as a Google Cloud Function.

Phoenix Caches Indefinitely

I'm writing a project with a lot of static files.
Whenever I change a custom.js file, even though it gets changed both in
web/static/assets/custom.js and priv/static/assets/custom.js,
when I try to reach the resource I get the old version, which is also corrupted most of the time.
I tried:
restarting the server
running brunch build
removing the whole _build directory
changing the files further
clearing browsers cache
fetching the URL with curl localhost:4000/assets/js/custom.js -H 'Pragma: no-cache'
Still, the server serves the old file.
Edit
It seems to be an issue with an mtime difference between the Vagrant VM and the host.
So the real question is:
How to eliminate that issue?

Does anyone know how to download a project from nitrous.io?

I made a Ruby web application on nitrous.io. The tool is very nice and it helped a lot, but now I want to download the project to my computer and I didn't find any option to do that...
You can download and upload projects by any of the following options:
Utilize Nitrous Desktop to sync your files locally.
Upload your project to Github, and pull the project from there. Here is a guide on adding the SSH key to Github if needed.
Upload the content via SCP. To do this, you will need to add an SSH Key to your account.
Next, run this command on your local machine, replacing {PORT} with the port number assigned to your Nitrous.IO box, and changing usw1 to the proper region found in the SSH URI on your boxes page.
To Upload:
scp -P{PORT} -r path/to/yourFolder action@usw1-2.nitrousbox.com:~/workspace
To Download:
scp -P{PORT} -r action@usw1-2.nitrousbox.com:~/workspace path/to/yourLocalFolder
I do not know the service, but apparently they offer SSH access, so you can use scp to copy the files to your machine. Anyway, you should probably ask their support...
...post a summary of their answer here and close the question :)
The easiest way is to store your project in a Git repository and then push this repository to an external host. You will then be able to clone your project from the external repository to any machine you want.
Personally, I use Bitbucket, as it is free and very easy to set up. Have a look at the tutorials there.
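If it helps, the basic flow from inside your project directory is something like the following (the remote URL is a placeholder for your own repository):
# Initialise a repository, commit everything, and push it to an external host.
git init
git add .
git commit -m "Initial commit"
git remote add origin git@bitbucket.org:username/myproject.git
git push -u origin master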
OK, replying really late, but I hope this will help anyone still looking for this. Here is how I download stuff from Nitrous: no desktop utility download needed, and no ssh/scp or adding keys.
What you do is simply make an archive of the folder you want to download:
tar -zcvf myarchive.tar.gz mydir/
Now you have a *.gz file, right? Change into whichever folder your .gz file is in and type:
python3.3 -m http.server 8080
You just started a cute little HTTP server ready to serve your download. Now, from the Preview menu, click "Port 8080"; this opens a new browser tab showing your .gz file in the file listing (sample URL http://yourboxes.apse1.nitrousbox.com:8080/). Click your .gz file and it will start downloading. Once the download is done, press Ctrl+C in the terminal to terminate the HTTP server.
This is not limited to Nitrous; you can make this work on many online VMs, like Cloud9, etc.
