Flask CSV streaming randomly gets cut off? - heroku

My CSV is just a simple query pulled from the database (SQLAlchemy). For large files >5000 lines, the CSV stream sometimes quits around 3000 or 4000 lines.
My app is on Heroku, which has 30 second timeout, but I don't think that applies to downloading files as well.

I think the reason could be two parameters!
1) first the timeout that you mentioned it.
2) second it may be the body request size limitation.
so check your web server configuration and try to change the timeout and body request size limitation.

Related

What does the time elapsed between network requests mean?

As you can see the second request starts 784ms after the beginning. I wonder why that request starts at 784ms instead of 741ms (directly after the first one)?
What happens in the 43ms in between the first and second request? There is no indicator showing that there is a blocking request.
The 43ms between the two requests indicate the parsing time. I.e. this is the time needed to process the request to the page URL you requested and find embedded resources like images, JavaScripts, etc., which need to be loaded.
So there will always be a gap between the first request and the second.
The same happens for requests to other resources like CSS files. Their contents first need to be interpreted after their download to know the other resources they include like e.g. images, web fonts or other CSS files.

no response from the host :snmpwalk

I have implemented AgentX using mib2c.create-dataset.conf ( with cache enabled)
In my snmd.conf :: agentXTimeout 15
In testtable.h file I have changed cache value as below...
#define testTABLE_TIMEOUT 60
According to my understanding It loads data every 60 second.
Now my issue is if the data in data table is exceeds some amount it takes some amount of time to load it.
As in between If I fired SNMPWALK it gives me “no response from the host” If I use SNMPWALK for whole table and in between testTABLE_TIMEOUT occurs it stops in between and shows following error (no response from the host).
Please tell me how to solve it ? In my table large amount of data is present and changing frequently.
I read some where:
(when the agent receives a request for something in this table and the cache is older than the defined timeout (12s > 10s), then it does re-load the data. This is the expected behaviour.
However the agent does not automatically release the local cache (i.e. call the 'free' routine) as soon as the timeout has expired.
Instead this is handled by a regular "garbage collection" run (once a minute), which will free any stale caches.
In the meantime, a request that tries to use that cache will spot that it's expired, and reload the data.)
Is there any connection between these two ?? I can’t get this... How to resolve my problem ???
Unfortunately, if your data set is very large and it takes a long time to load then you simply need to suffer the slow load and slow response. You can try and load the data on a regular basis using snmp_alarm or something so it's immediately available when a request comes in, but that doesn't really solve the problem either since the request could still come right after the alarm is triggered and the agent will still take a long time to respond.
So... the best thing to do is optimize your load routine as much as possible, and possibly simply increase the timeout that the manager uses. For snmpwalk, for example, you might add -t 30 to the command line arguments and I bet everything will suddenly work just fine.

how to avoid a request timing out when uploading and resizing large amount of images in Coldfusion?

I'm running Coldfusion8 and have a cfc, that loops through a set of database records.
Each record contains two fields image path and image file. I'm constructing a path for every image, upload it to a temp folder, resize and then store it to S3.
Depending on the number of records, this may take quite some time and I have not been able to successfully finish the upload cycle with larger sets of images (eventually times out).
I'm already settings my timeout threshold to 5000, but it still does not seem enough.
I can pick up where I left, because I'm keeping a media log to check against, before uploading to S3. This way I can finish the task, but I need to trigger this function 5x to upload 400 items.
Question:
Is there way to avoid a timeout without setting (in S3 case) httptimeout to some 50000000? And would it make sense to run this in a CFTHREAD or will this be a problem if the user leaves the import page while the system is still uploading?
Thanks for some insights.
You can use a CFthread to perform the task, but make sure you LOCK THE SCOPE! otherwise you could end up running this memory intensive proccess several times over and kill the server, you only want this proccess running once at a time if its so intensive.
You have other options though, if this is not something that your application users will need to run and its a one-off proccess your doing, you could set a scheduled task with an exceedingly long timeout to run overnight, when the server is not very high use, This allows you to set the timeout independently to the application so the rest of the application is unaffected by global timeout changes.
Another option is, if this is something users will be doing semi-regularly then a thread which pushes a notification via email, log or other means (Ajax or Websockets) letting the user know they're task is complete. This has the upside that timeouts can be changed, calculated on the amount of data to be proccessed dynamically at thread generation. However, if your not careful you can overload your server with many threads proccessing large datasets (plus log file read-write locks will be harder to manage).
I would encourage you though, to take this away and see what solution works for you and post your final solution so others can see what the outcome is.
Hope this helps.

How to run multiple threads at the same time in ruby while working with a file?

I've been messing around with Ruby and threading a little bit today. I have a list of proxies that I want to check. Assuming a timeout of 10 seconds going through a very large list of proxies will take many hours if I write something that goes like:
proxies.each do |proxy|
check_proxy(proxy)
end
My first problem with trying to figure out threads is how to START multiple at the same exact time. I found a neat little snippet of code online:
for page in pages
threads << Thread.new(page) { |myPage|
puts "Fetching: #{myPage}\n"
doc = Hpricot(open(myPage.to_s)).to_s
puts "Got #{myPage}: #{doc.size}"
}
end
Seems to work nicely as far as starting them all at the same time. So now I can... start checking all 7 thousand records at the same time?
How do I go to a file, take out a line for each thread, run a batch of like 20 and repeat the process?
Can I run a while loop that in turn starts 20 threads at the same (which remove lines from a file) and keeps going until the file is blank?
I'm a little weak on the logic of what I'm supposed to do.
Thanks guys!
PS.
Another thought: Will there be file access issues if 20 workers are constantly messing with it randomly? What would be a good way around that if this is so?
The keyword you are after is threadpool. You can either try to find one for Ruby (I am sure there's couple at least on Github), or roll your own.
Here's a simple implementation here on SO.
Re: the file access, IMO you shouldn't let workers alter the file directly, but do it in your main thread. You don't want to allow simultaneous edits there.
Try to use gem DelayJob:
https://github.com/tobi/delayed_job
You don't need to generate that many Threads in order to do this work. In fact generating a lot of Threads can decrease the overall performance of your application. If you handle checking each proxy asynchronously, without blocking, you can get by with far fewer threads.
You'd create a file manager thread to process the file. Each line gets added as a request to an array(request queue). On the other end of the request queue you can use eventmachine to send the requests without blocking. eventmachine would also be used to receive the responses and handle the timeout. The response can then be placed on another array(response queue) which your file manager thread polls. The file manager thread pulls the responses from the response queue and resolves if the proxy exists or not.
This gets you down to just creating two threads. One issue that you will have is limiting the number of requests that have been sent since this model will be able to send out all of the requests in less than a second and flood the nearest router. In my experience you should be able to have around 500 outstanding requests at any one time.
There is more than one way to solve this problem asynchronously but hopefully the above is enough to help get you started with non-blocking I/O.

Why is Cacti showing an empty graph, even though the rrd file is created?

I have developed my own SNMP service, and i want to plot a graph of an OID provided.
So, i have created a graph in Cacti.
-) It is showing device up.
-) It is creating rrd file. (RRDTool says OK).
-) Showing the graph, but it's empty.
But when I check it, say
rrdtool fetch <rrd file> AVERAGE
it shows me nan for all the values. The monitored OID has value 47 and i have set min=0 and max=100.
I am using Cacti appliance by rpath:
http://www.rpath.org/ui/#/appliances?id=http://www.rpath.org/api/products/cacti-appliance
Still, I can't show value on graph..
Where is the problem? Can anyone please tell me?
First of all, use Cacti's "Rebuild Poller Cache" function under the Utilities menu.
If that didn't work ,check if the RRD file is actually updating with new data.
To do this use the command:
rrdtool last [filename.rrd]
This will output the last time (in unix timestamp) that a new value has been inserted into the RRA file which you can compare to the current time that date +%s will output.
If it's not updating with data then you should change the cacti log level to DEBUG via the settings page on Cacti's web UI and look for appropriate messages.
If the poller couldn't get the data then it's usually an issue relating to connectiviy/SNMP.
You can further check issues as such by manually polling the specific OID on that host:
snmpwalk -c[SNMP COMMUNITY] -v2c [HOSTNAME OR IP ADDRESS] 1.3.6.1.2.1
You can use the above command and OID (1.3.6.1.2.1) just to see if you're getting a reply.
If that worked then you should change the command from snmpwalk to snmpget and the OID to the actual OID you're trying to poll and retry.
If the RRD is updating with new data but you're still getting NaN in your graphs then I suggest looking into the heartbeat and step values of the data source (via the data template) in relation to your polling interval and poller cronjob interval.
These values determine how many times the RRD file will miss data before inserting a NaN.
The cronjob calls the cacti poller to start performing it's polling cycle.
The poller interval is the actual time that the poller will wait between two polling cycles if it was indeed invoked in time by the cronjob.
So for 1 minute polling (on the poller and the cronjob) you will have to use a step of 60 (seconds) and a heartbeat of 120.
For 5 minutes polling, the step will be 300 and the heartbeat will be 600.
This is mainly caused by someone changing the poller interval on the settings page.
Gandalf from the Cacti forums wrote a nice Guide that you can use and further help can be found on Cacti forums.
Good luck! :)
Maybe cacti doesn't have the needed permissions to access the rrd file and your test was done with a user who has the required permissions, for example root?
Are you sure you have collected enough data?
If your RRD has a step of 1 minute, and your first RRA has a consolidated count of 1 (1cdp=1pdp), then you should collect data for at least (step x ( count + 1 )) seconds before you expect to see any data in the graph. Make sure you are collecting data at least as often as the step size.
If you collect data for 10 min and nothing shows up, then make sure you are actually collecting the data, make sure the values you get are within range, and that they are being used. Check the last modification time on the RRD file. Print out the values before you update to verify they are what you think they are.
You should double check the range Cacti is plotting in. I moved the values in the graph filter and spotted a little chunk of data in the graphs, then you just have to adjust it.

Resources