Parse.com - Performance problems with 100K users

We have a Parse application with about 100K users.
Our queries on the user table time out.
For example, I'm doing the following query:
var query = new Parse.Query(Parse.User);
query.exists("email");
query.find(...);
This query times out. If I limit the results to a small number, e.g. 10, I can get the first 10 results, but the following pages time out. That is, this will time out:
query.limit(10);
query.skip(500);
query.find(...);
We are currently at a point where we cannot manage our users: whenever we try to get a list of users by some attribute, or change something for a batch of users, we get timeouts.
We have tried running the queries in Cloud Code and with the JavaScript SDK. Both eventually fail with timeouts.
Am I doing something wrong or is it a Parse limitation?

Parse cloud functions have a timeout of 15 seconds, and before/after save triggers have a timeout of 3 seconds.
If you need more time, find a way to do the work in a background job rather than a cloud function. Background jobs get 15 minutes, which is more than enough for anything reasonable; for anything that needs longer, save a checkpoint of where you left off and re-run the job until everything is complete.
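For the user-table case above, a minimal sketch of such a job might look like this (the job name and the needsReview attribute are hypothetical; query.each streams through every matching row, which avoids the skip/limit paging that was timing out):
Parse.Cloud.job("touchUsersWithEmail", function(request, status) {
    Parse.Cloud.useMasterKey(); // querying all users generally needs the master key
    var query = new Parse.Query(Parse.User);
    query.exists("email");
    query.each(function(user) {
        user.set("needsReview", true); // hypothetical per-user update
        return user.save();
    }).then(function() {
        status.success("All users processed.");
    }, function(error) {
        status.error("Stopped at: " + error.message);
    });
});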

Related

Same login/password pair is used when multiple threads sign in

I am trying to simulate a scenario in which multiple users (100) log in using login/password pairs from my CSV file, where I have 10 different combinations of valid credentials. The problem is that JMeter always takes the same login/password pair from my CSV file for all simulated users. The only way I have found to solve it is to set the Ramp-Up period of my Thread Group to 0, but that does not seem like a plausible scenario, does it?
Make sure Sharing Mode is set to All Threads.
Also, it looks like you're trying to have 100 threads split up a set of 10 credentials, and there aren't enough to go around. You will probably get 10 sets of threads all logging in with the same username/password. Try adding more usernames.
I'm not sure why a very small ramp-up would fix this, unless you have no think times: a user might run through your script quite quickly, then log in again with the same credentials. Even then, I'd expect it to pick up a new set.
See: CSV_Data_Set_Config
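For reference, a sketch of the settings and file layout this assumes (the file and variable names are hypothetical); with Sharing mode set to All Threads, every thread draws the next line from one shared pool, so unique logins require at least as many rows as threads:
CSV Data Set Config
    Filename: credentials.csv
    Variable Names: LOGIN,PASSWORD
    Recycle on EOF?: False
    Stop thread on EOF?: True
    Sharing mode: All threads
credentials.csv
    user001,secret001
    user002,secret002
    (one row per concurrent user, at least 100 in total)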

Parse.com httpRequest timeout

I am using Parse.com's httpRequest to retrieve the source code of a website.
Code:
Parse.Cloud.define("extract_website_simple", function(request, response) {
    // Fetch the page; .then() takes a success callback and an error callback
    Parse.Cloud.httpRequest({
        url: 'http://bet.hkjc.com/marksix/index.aspx?lang=en'
    }).then(function(httpResponse) {
        response.success("code=" + httpResponse.text);
    }, function(error) {
        response.error("Error: " + error.code + " " + error.message);
    });
});
Question:
The HTML code cannot be retrieved. Instead, after about 10 seconds of loading, a ParseException appears:
com.parse.ParseException: i/o failure: java.net.SocketTimeoutException: Read timed out
How can I retrieve it properly without a timeout? It seems there is no way to increase the timeout length?
Thanks!
As Parse support has underlined in many places, such as the official Q&A, the timeouts are low and they are not going to be changed, in order to keep performance good. Quote:
Héctor Ramos: I think that only two operations can run at any time in Cloud Code, so when you send three queries in parallel, the third one won't start until at least one of the first two has finished. Cloud Functions are not the best tool for long-running operations, and so they are limited to 15 seconds to keep Cloud Code performant for everybody. A better solution for long-running operations should be available shortly.
Official documentation says:
Resource Limits -> Timeouts
Cloud functions will be killed after 15 seconds of wall clock time. beforeSave, afterSave, beforeDelete, and afterDelete functions will be killed after 3 seconds of run time. If a Cloud function or a beforeSave/afterSave/beforeDelete/afterDelete function is called from another Cloud Code call, it will be further limited by the time left in the calling function. For example, if a beforeSave function is triggered by a cloud function after it has run for 13 seconds, the beforeSave function will only have 2 seconds to run, rather than the normal 3 seconds.
So even if you pay them thousands of dollars every month, they won't allow your function to run for more than 15 seconds. Parse is not a tool for everything; it is very specific. I run into its limitations all the time, such as the lack of support for multipart forms with many attachments.
Parse.Cloud.job
To get up to 15 minutes for a single Parse request, you can work with Background Jobs. They support Promises with .then, which conserves server resources compared with typical anonymous callbacks.
If you are on the free tier you won't love another limit: apps may have one job running concurrently per 20 req/s in their request limit. So you can run only a single background job in your app, and if you try to start another one: "Jobs that are initiated after the maximum concurrent limit has been reached will be terminated immediately." To get 4 background jobs running you would have to pay $700/month at current pricing.
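If the fetch itself fits within the HTTP limits but the surrounding work does not, the same request can be moved into a background job; a minimal sketch (the job name and the ScrapedPage class are hypothetical):
Parse.Cloud.job("extract_website_job", function(request, status) {
    Parse.Cloud.httpRequest({
        url: 'http://bet.hkjc.com/marksix/index.aspx?lang=en'
    }).then(function(httpResponse) {
        // persist the body somewhere a client can read it later
        var page = new Parse.Object("ScrapedPage");
        page.set("html", httpResponse.text);
        return page.save();
    }).then(function() {
        status.success("Page stored.");
    }, function(error) {
        status.error("Fetch failed: " + error.message);
    });
});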
If you need more time, or have less money, to scrape tens of pages at once, you can pick a different technology for web scraping. There are many options; personally, my favorites are:
Node.js
If you like JavaScript on the server side, you could try Node.js. To start with the basics, you could follow the scotch.io tutorial.
PHP
Another alternative, with thousands of examples, is PHP. You could start with a tutorial-like answer on Stack Overflow itself.

No response from the host: snmpwalk

I have implemented an AgentX subagent using mib2c.create-dataset.conf (with caching enabled).
In my snmpd.conf: agentXTimeout 15
In the testtable.h file I set the cache timeout as follows:
#define testTABLE_TIMEOUT 60
As I understand it, this reloads the data every 60 seconds.
My issue is that once the table holds more than a certain amount of data, loading it takes a noticeable amount of time. If I run snmpwalk while the load is in progress, I get "no response from the host". Likewise, if I walk the whole table and testTABLE_TIMEOUT expires partway through, the walk stops with the same error (no response from the host).
How can I solve this? My table holds a large amount of data, and it changes frequently.
I read somewhere:
(When the agent receives a request for something in this table and the cache is older than the defined timeout (12s > 10s), then it does re-load the data. This is the expected behaviour.
However the agent does not automatically release the local cache (i.e. call the 'free' routine) as soon as the timeout has expired.
Instead this is handled by a regular "garbage collection" run (once a minute), which will free any stale caches.
In the meantime, a request that tries to use that cache will spot that it's expired, and reload the data.)
Is there any connection between these two? I can't quite follow it. How do I resolve my problem?
Unfortunately, if your data set is very large and it takes a long time to load then you simply need to suffer the slow load and slow response. You can try and load the data on a regular basis using snmp_alarm or something so it's immediately available when a request comes in, but that doesn't really solve the problem either since the request could still come right after the alarm is triggered and the agent will still take a long time to respond.
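If you do go the preloading route, a minimal Net-SNMP sketch might look like this (the 30-second interval and the function names are hypothetical; snmp_alarm_register with SA_REPEAT fires the callback periodically):
#include <net-snmp/net-snmp-config.h>
#include <net-snmp/net-snmp-includes.h>

/* Called on every alarm tick: re-populate the table's dataset here so a
   fresh copy is already in memory when the next request arrives. */
static void preload_table(unsigned int clientreg, void *clientarg) {
    /* reload logic goes here */
}

void init_preload(void) {
    snmp_alarm_register(30, SA_REPEAT, preload_table, NULL);
}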
So... the best thing to do is optimize your load routine as much as possible, and possibly simply increase the timeout that the manager uses. For snmpwalk, for example, you might add -t 30 to the command line arguments and I bet everything will suddenly work just fine.
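For example (the host name, community string, and table OID here are placeholders):
snmpwalk -v 2c -c public -t 30 -r 1 agent-host TEST-MIB::testTable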

how to avoid a request timing out when uploading and resizing large amount of images in Coldfusion?

I'm running ColdFusion 8 and have a CFC that loops through a set of database records.
Each record contains two fields: an image path and an image file. I construct a path for every image, upload it to a temp folder, resize it, and then store it on S3.
Depending on the number of records this may take quite some time, and I have not been able to finish the upload cycle with larger sets of images (it eventually times out).
I'm already setting my timeout threshold to 5000, but it still does not seem to be enough.
I can pick up where I left off, because I keep a media log to check against before uploading to S3. This way I can finish the task, but I need to trigger the function 5 times to upload 400 items.
Question:
Is there a way to avoid a timeout without setting (in the S3 case) httptimeout to some 50000000? And would it make sense to run this in a CFTHREAD, or will that be a problem if the user leaves the import page while the system is still uploading?
Thanks for some insights.
You can use a CFTHREAD to perform the task, but make sure you LOCK THE SCOPE! Otherwise you could end up running this memory-intensive process several times over and killing the server; you only want one run of the process at a time if it's that intensive.
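A minimal sketch of that pattern, assuming an application-scoped flag and a processImages() method (both hypothetical):
<cfthread name="imageImport" action="run">
    <cflock name="imageImportLock" type="exclusive" timeout="5" throwontimeout="false">
        <cfif NOT structKeyExists(application, "importRunning") OR NOT application.importRunning>
            <cfset application.importRunning = true>
            <cftry>
                <!--- loop the remaining records: upload, resize, push to S3 --->
                <cfset processImages()>
                <cfcatch type="any">
                    <cflog file="imageImport" text="Import failed: #cfcatch.message#">
                </cfcatch>
            </cftry>
            <cfset application.importRunning = false>
        </cfif>
    </cflock>
</cfthread>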
You have other options, though. If this is not something your application users will need to run, and it's a one-off process, you could set up a scheduled task with an exceedingly long timeout to run overnight, when server usage is low. This lets you set the timeout independently of the application, so the rest of the application is unaffected by global timeout changes.
Another option, if this is something users will do semi-regularly, is a thread that pushes a notification via email, a log, or other means (Ajax or WebSockets) letting users know their task is complete. This has the upside that timeouts can be changed, or calculated dynamically from the amount of data to be processed, at thread creation. However, if you're not careful you can overload your server with many threads processing large datasets (plus log-file read/write locks will be harder to manage).
I would encourage you to take this away, see which solution works for you, and post your final solution so others can see the outcome.
Hope this helps.

Can EWS calls be made in parallel without slowing down?

I want to retrieve information from an Exchange Server (2010, via the EWS API). Specifically, I want to build a Windows service that iterates over all Exchange users and indexes their private mailboxes using impersonation.
That works well, but it is very slow when I process one user after another (depending on mailbox volume and the number of users). The indexing speed is currently about 500 items per minute.
The following call takes about 250 milliseconds on my test system:
PropertySet myPropertySet = new PropertySet(BasePropertySet.FirstClassProperties, ItemSchema.ParentFolderId);
myPropertySet.RequestedBodyType = BodyType.Text;
myPropertySet.Add(entryIdExtendedProperty);
Item item = Item.Bind(es, itemKey, myPropertySet);
So my idea was to parallelize. So far I have tried three approaches:
1) Background workers: one worker thread per user.
Result: no effect. This seems to slow down every call; the overall speed stays the same.
2) Separate EXE processes: one EXE per user. I created a "worker" EXE and invoked it with the user as an argument: IndexWorker.exe -user1
Result: the same! The calls of every EXE are slowed down.
3) Separate Windows services: one service per user.
Result: suddenly the requests did not slow down, which means I could raise the overall speed to a multiple of 500 items per minute (I tried up to 3 processes, i.e. 1500 items per minute). Not bad, but it leaves me with the question:
Why are EWS calls slowed down in 1) and 2) but not in 3)?
Threading would be the most elegant way for me; is there an option or setting I could use?
I have read a couple of things about throttling policies and the EWSFindCountLimit. Is this the right direction?
Did you get to the bottom of why the separate services gave you such an increase in performance? Throttling is applied at the service-account level, so it should not matter where the calls are made from.
Your issue is the throttling policy. You need to create a throttling policy for your service account that doesn't restrict EWS or RPC activity.
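On Exchange 2010 this is done from the Exchange Management Shell; a hedged sketch (the policy and account names are hypothetical, and $null removes a given cap):
New-ThrottlingPolicy -Name "EwsIndexing" -EWSMaxConcurrency $null -EWSPercentTimeInAD $null -EWSPercentTimeInCAS $null -EWSPercentTimeInMailboxRPC $null
Set-ThrottlingPolicyAssociation -Identity "svc-indexer" -ThrottlingPolicy "EwsIndexing"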
