How to increase resource allocation to RavenDB (Go)

I'm trying to process a document and store many documents into RavenDB, which I have running locally.
I'm getting the error:
Tried to send *ravendb.BatchCommand request via POST http://127.0.0.1:8080/databases/mydb/bulk_docs to all configured nodes in the topology, all of them seem to be down or not responding. I've tried to access the following nodes: http://127.0.0.1:8080
I was able to fetch mydb topology from http://127.0.0.1:8080.
Fetched topology: ( url: http://127.0.0.1:8080, clusterTag: A, serverRole: Member)
exit status 1
To me, it sounds like maybe my local cluster is running out of compute to process the large amount of data I'm trying to store.
RavenDB says I'm using 3 of 12 available cores, and I'd also like to make sure it's using a reasonable amount of the RAM I have available on the machine (I'd even be happy to give it swap).
But reading around online, I'm not finding much helpful information on making sure RavenDB can use what it needs. I found settings.json, where I can add configuration that should in theory be picked up by the server, but I'm not making much progress.
I also found some settings and changed "reassign cores" to 12, but it still says 3/12 cores and 6/31.1 GB of memory are being used.
If an alternative solution is recommended, I'm all ears. I just need to run things locally, and storing everything as JSON files doesn't give fast enough retrieval for my use case.
Update
I was able to install MongoDB and set up a local database. It hasn't given me any problems yet. RavenDB would look appealing if I understood it better, but I guess I'll stick with the tried and true for this project.

It is highly unlikely that you managed to run out of resources on the server with 3 cores / 6 GB unless you are pushing hundreds of millions of documents and doing very heavy work.
Do you get any error on the server? There should be more details in the error message or in the server log.
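In the meantime, one client-side thing worth trying is saving in smaller batches rather than in one huge SaveChanges call, so that no single bulk_docs request grows large enough to time out. A minimal sketch with the ravendb-go-client (the Doc type and the batch size of 500 are assumptions, not from your code):

```go
package main

import (
	"log"

	ravendb "github.com/ravendb/ravendb-go-client"
)

// Doc is a stand-in for whatever document type you are storing.
type Doc struct {
	ID   string
	Body string
}

// storeInBatches saves docs in chunks so each SaveChanges sends one
// bulk_docs request of at most batchSize documents.
func storeInBatches(store *ravendb.DocumentStore, docs []*Doc, batchSize int) error {
	for start := 0; start < len(docs); start += batchSize {
		end := start + batchSize
		if end > len(docs) {
			end = len(docs)
		}
		session, err := store.OpenSession("")
		if err != nil {
			return err
		}
		for _, d := range docs[start:end] {
			if err := session.Store(d); err != nil {
				session.Close()
				return err
			}
		}
		if err := session.SaveChanges(); err != nil {
			session.Close()
			return err
		}
		session.Close()
	}
	return nil
}

func main() {
	store := ravendb.NewDocumentStore([]string{"http://127.0.0.1:8080"}, "mydb")
	if err := store.Initialize(); err != nil {
		log.Fatal(err)
	}
	defer store.Close()
	// docs := loadDocs() // however you build your documents
	// if err := storeInBatches(store, docs, 500); err != nil { log.Fatal(err) }
}
```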

Related

Process Laravel/Redis jobs from multiple servers

We are building a reporting app on Laravel that needs to fetch user data from a third-party server that allows 1 request per second.
We need to fetch 100K to 1,000K rows per user, and we can fetch at most 250 rows per request.
So the restrictions are:
1. We can send 1 request per second
2. 250 rows per request
So it takes 400-4,000 requests/jobs to fetch one user's data, which makes loading data for multiple users very time-consuming and slows the server down.
So now we are planning to load the data using multiple servers, say 4-10, so that we can send up to 10 requests per second from 10 servers.
How can we design the system and process jobs from multiple servers?
Is it possible to use a dedicated server for hosting Redis and connect to that Redis server from multiple servers and execute jobs? Can any conflict/race-condition happen?
Any hint or prior experience related to this would be really helpful.
The short answer is yes, this is absolutely possible and is something I've implemented in production apps many times before.
Redis is just like any other service and can run anywhere, with clients connecting to it from anywhere. It's all up to your configuration of the server to dictate exactly how that happens (adding passwords, configuring spiped, limiting access via the firewall, etc.). I'd recommend reading the documentation in the Administration section here: https://redis.io/documentation
Also, when you do make the move to a dedicated Redis host, with multiple clients accessing it, you'll likely want to look into having more than just one Redis server running for reliability, high availability, etc. Redis has efficient and easy replication available with a few simple configuration commands, which you can read more about here: https://redis.io/topics/replication
Last thing on Redis: if you do end up implementing a master-slave setup, you may want to look into high availability and auto-failover in case your master instance goes down. Redis has a really great utility built into the application that can monitor your master and slaves, detect when the master is down, and automatically reconfigure your servers to promote one of the slaves to the new master. The utility is called Redis Sentinel, and you can read about it here: https://redis.io/topics/sentinel
For your question about race conditions, it depends on how exactly you write your jobs that are pushed onto the queue. For your use case though, it doesn't sound like this would be too much of an issue, but it really depends on the constraints of the third-party system. Either way, if you are subject to a race condition, you can still implement a solution for it, but would likely need to use something like a Redis Lock (https://redis.io/topics/distlock). Taylor recently added a new feature to the upcoming Laravel version 5.6 that I believe implements a version of the Redis Lock in the scheduler (https://medium.com/@taylorotwell/laravel-5-6-preview-single-server-scheduling-54df8e0e139b). You can look into how that was implemented, and adapt for your use case if you end up needing it.
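To make the race-condition point concrete, here is a minimal sketch of a Redis lock, written in Go for consistency with the other examples on this page (the key name, TTL, and the go-redis client are assumptions; on Laravel 5.6+ you'd reach for the built-in lock support instead):

```go
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/redis/go-redis/v9"
)

// tryLock acquires a lock by setting a key only if it does not already
// exist (SET key value NX EX ttl). Exactly one worker wins the race.
func tryLock(ctx context.Context, rdb *redis.Client, key string, ttl time.Duration) (bool, error) {
	return rdb.SetNX(ctx, key, "locked", ttl).Result()
}

func main() {
	ctx := context.Background()
	rdb := redis.NewClient(&redis.Options{Addr: "redis-host:6379"}) // your dedicated Redis server

	// Each worker server runs this before processing a user's fetch job.
	ok, err := tryLock(ctx, rdb, "lock:fetch-user:42", 30*time.Second)
	if err != nil {
		panic(err)
	}
	if ok {
		fmt.Println("lock acquired, safe to fetch user 42")
		// ... do the rate-limited fetching, then DEL the key or let the TTL expire
	} else {
		fmt.Println("another server is already fetching user 42")
	}
}
```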

MarkLogic latency: document not found

I am working on a clustered MarkLogic environment with 10 nodes. All nodes are shared E-nodes and D-nodes (evaluator and data).
Problem that we are facing:
When a page is written to MarkLogic it takes some time (up to 3 seconds) for all the nodes in the cluster to be updated, and if I do a read during this window to fetch the previously written page, it is not found.
Has anyone experienced this latency issue and looked at eliminating it? If so, please let me know.
Thanks
It's normal for a new document to only appear after the database transaction commits. But it is not normal for a commit to take 3 seconds.
Which version of MarkLogic Server?
Which OS and version?
Can you describe the hardware configuration?
How large are these documents? All other things equal, update time should be proportional to document size.
Can you reproduce this with a standalone host? That should eliminate cluster-related network latency from the transaction, which might tell you something. Possibly your cluster network has problems, or possibly one or more of the hosts has problems.
If you can reproduce the problem with a standalone host, use system monitoring to see what that host is doing at the time. On Linux I favor something like iostat -Mxz 5 and top, but other tools can also help. The problem could be disk I/O, though it would have to be really slow to result in 3-second commits. Or it might be that your servers are low on RAM, so they are paging during the commit phase.
If you can't reproduce it with a standalone host, then I think you'll have to run similar system monitoring on all the hosts in the cluster. That's harder, but for 10 hosts it is just barely manageable.
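If you want to put a number on the read-after-write gap while investigating, a small probe can help. Here is a rough Go sketch against MarkLogic's REST API (the host, port 8000, basic-auth credentials, and probe URI are all assumptions; adjust for your app server's configuration):

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
	"time"
)

func main() {
	base := "http://marklogic-host:8000/v1/documents?uri=/latency-probe.json"
	client := &http.Client{Timeout: 10 * time.Second}

	// Write a tiny document via the REST API.
	put, _ := http.NewRequest(http.MethodPut, base, strings.NewReader(`{"probe":true}`))
	put.Header.Set("Content-Type", "application/json")
	put.SetBasicAuth("admin", "admin") // assumes basic auth is enabled on the app server
	resp, err := client.Do(put)
	if err != nil {
		panic(err)
	}
	resp.Body.Close()
	start := time.Now()

	// Poll until the document is readable, measuring the elapsed time.
	for {
		get, _ := http.NewRequest(http.MethodGet, base, nil)
		get.SetBasicAuth("admin", "admin")
		resp, err := client.Do(get)
		if err != nil {
			panic(err)
		}
		resp.Body.Close()
		if resp.StatusCode == http.StatusOK {
			fmt.Printf("readable after %v\n", time.Since(start))
			return
		}
		time.Sleep(50 * time.Millisecond)
	}
}
```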

How many users should an EC2 Micro Instance be able to handle with only an nginx server?

I have an iOS social app.
The app talks to my server fairly often to do updates and retrieval, mostly small text as JSON. Sometimes users will upload pictures, which my web server then uploads to an S3 bucket. No pictures or any other type of file will be retrieved from the web server.
The EC2 Micro Ubuntu 13.04 instance runs PHP 5.5, PHP-FPM, and nginx. Caching is handled by ElastiCache using Redis, and the app connects to a separate m1.large MongoDB server. The content can be fairly dynamic, as the newsfeed is dynamic.
I am a total newbie with regard to configuring nginx for performance, and I am trying to see whether I've configured my server properly or not.
I am using Siege to test my server load, but I can't find any statistics on how many concurrent users / page loads my system should be able to handle, so I don't know whether I've done something right or wrong.
What amount of concurrent users / page loads should my server be able to handle?
If I can't get statistics from experience, what should count as easy, medium, and extreme for my micro instance?
I am aware that there are several other questions asking similar things, but none provide any sort of estimate for a similar system, which is what I am looking for.
I haven't tried nginx on a micro instance for the reasons Jonathan pointed out: if you consume your CPU burst you will be throttled very hard and your app will become unusable.
If you want to follow that path I would recommend:
Try to cap CPU usage for nginx and php5-fpm to make sure you do not go over the threshold for CPU penalties. I have no idea what that threshold is. I believe the main problem with a micro instance is maintaining consistent CPU availability; if you go over the cap you are screwed.
Try to use fastcgi_cache, if possible. You want to hit php5-fpm only if really needed.
Keep in mind that gzipping on the fly will eat a lot of CPU, and I mean a lot (for an instance that has almost no CPU power). If you can use gzip_static, do it. But I believe you cannot.
As for statistics, you will need to gather them yourself. I have statistics for m1.small but none for micro. Start by making nginx serve a static HTML file of very few KB. Run a siege benchmark with 10 concurrent users for 10 minutes and measure. Make sure you are sieging from a stronger machine.
siege -b -c10 -t600s 'http://<private-ip>/test.html'
You will probably see the effects of CPU throttling just by doing that! What you want to keep an eye on is the transactions per second and how much throughput nginx can serve. Keep in mind that m1.small's max is 35 MB/s, so a micro instance will be even less.
Then move to a JSON response. Try gzipping. See how many concurrent requests per second you can get.
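If you'd rather script the measurement yourself instead of siege, a minimal Go sketch of the same kind of benchmark might look like this (the URL placeholder, concurrency, and duration mirror the siege flags above):

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"sync"
	"sync/atomic"
	"time"
)

func main() {
	const (
		url         = "http://private-ip/test.html" // placeholder: your instance's private IP
		concurrency = 10
		duration    = 10 * time.Minute
	)

	var requests, bytes int64
	deadline := time.Now().Add(duration)
	var wg sync.WaitGroup

	for i := 0; i < concurrency; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			client := &http.Client{Timeout: 10 * time.Second}
			for time.Now().Before(deadline) {
				resp, err := client.Get(url)
				if err != nil {
					continue
				}
				n, _ := io.Copy(io.Discard, resp.Body)
				resp.Body.Close()
				atomic.AddInt64(&requests, 1)
				atomic.AddInt64(&bytes, n)
			}
		}()
	}
	wg.Wait()

	// Transactions per second and throughput are the numbers to watch.
	secs := duration.Seconds()
	fmt.Printf("%.1f transactions/sec, %.2f MB/s\n",
		float64(requests)/secs, float64(bytes)/secs/1024/1024)
}
```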
And don't forget to come back here and report your numbers.
Best regards.
Micro instances are unique in that they use a burstable profile. While you may get up to 2 ECUs of performance for a short period of time, after the burstable allotment is used up you will be limited to around 0.1 or 0.2 ECU. Eventually the allotment resets and you can get 2 ECUs again.
Much of this is going to come down to how CPU/Memory heavy your application is. It sounds like you have it pretty well optimized already.

MongoDB slow after periods of inactivity

I have created a site that uses MongoDB as the database engine. At the moment it is still under construction, so it is not getting much traffic. This means there are periods of no requests and, therefore, no queries to the database.
When I do eventually hit the site pages that use the database, MongoDB seems to take 4 or 5 seconds to come back, but from that request on it is very fast.
I can't find any information about there being a timeout or anything like that. Is it just that the database in memory is being paged out and it takes a few seconds to page it back in? It is running on a Windows Server 2008 VM, and I am running it as a Windows service.
Any help would be appreciated.
MongoDB lets the OS kernel decide what is kept in memory (the current "working set"). Even if nothing is happening, the system may still page objects out of RAM into the page/swap file, even when RAM capacity is not being taxed.
One way around this is to monitor for idleness and send queries in the background, or even have a background process cat the data files on disk. This is especially helpful for pre-warming databases after startup, and likewise if your usage follows cyclical patterns.
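As a rough illustration of the keep-warm idea, here is a sketch using the official Go driver (the URI, database, and collection names are placeholders; any cheap periodic query that touches your hot collections will do):

```go
package main

import (
	"context"
	"log"
	"time"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	ctx := context.Background()
	client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb://localhost:27017"))
	if err != nil {
		log.Fatal(err)
	}
	defer client.Disconnect(ctx)

	coll := client.Database("mysite").Collection("pages")

	// Every minute, run a cheap query so the working set stays paged in.
	ticker := time.NewTicker(time.Minute)
	defer ticker.Stop()
	for range ticker.C {
		var doc bson.M
		if err := coll.FindOne(ctx, bson.D{}).Decode(&doc); err != nil && err != mongo.ErrNoDocuments {
			log.Println("keep-warm query failed:", err)
		}
	}
}
```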
As with most kinds of databases, recent query results can end up cached, and some databases store execution plans, but MongoDB does not appear to keep a query cache.
Also, to improve performance, make sure your indexes are implemented well so you don't accidentally trigger full collection scans, and so queries actually leverage an index. Use the explain command to see your query execution plan (http://docs.mongodb.org/manual/reference/method/cursor.explain/).
http://docs.mongodb.org/manual/faq/fundamentals/
Does MongoDB handle caching?
Yes. MongoDB keeps all of the most recently used data in RAM. If you have created indexes for your queries and your working data set fits in RAM, MongoDB serves all queries from memory.
MongoDB does not implement a query cache: MongoDB serves all queries directly from the indexes and/or data files.
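If you'd rather check a query plan from application code than from the shell, the explain command can wrap a find. A sketch with the Go driver, where the database, collection, and filter are placeholders:

```go
package main

import (
	"context"
	"fmt"
	"log"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	ctx := context.Background()
	client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb://localhost:27017"))
	if err != nil {
		log.Fatal(err)
	}
	defer client.Disconnect(ctx)

	// Wrap a find in the explain command to see which index (if any) it uses.
	var plan bson.M
	err = client.Database("mysite").RunCommand(ctx, bson.D{
		{Key: "explain", Value: bson.D{
			{Key: "find", Value: "pages"},
			{Key: "filter", Value: bson.D{{Key: "slug", Value: "home"}}},
		}},
	}).Decode(&plan)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(plan["queryPlanner"])
}
```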

Strange performance using JPA, am I missing something?

We have a JPA -> Hibernate -> Oracle setup in which we are only able to crank up to 22 transactions per second (two reads and one write per transaction). CPU, disk, and network are not bottlenecking.
Is there something I am missing? I wonder if there could be some sort of Oracle-imposed limit that the DBAs have applied?
Network is not the problem: when I do raw reads on the table, I can do 2,000 reads per second. The problem is clearly the writes.
CPU is not the problem on the app server; the CPU is basically idling.
Disk is not the problem on the app server; the data is completely loaded into memory before processing starts.
Might be worth comparing performance with a different client technology (or even just a simple test using SQL*Plus) to see if you can beat this performance anyway - it may simply be an under-resourced or misconfigured database.
I'd also compare the results for SQL*Plus running directly on the d/b server against running it locally on whatever machine your Java code is running on (where it is communicating over SQL*Net). This would confirm whether the problem is below your Java tier.
To be honest, there are so many layers between your JPA code and the database itself that diagnosing the cause is going to be fun... I recall one mysterious d/b performance problem that resolved itself as a misconfigured network card; the DBAs were rightly insistent that the database wasn't showing any bottlenecks.
It sounds like the application is doing a transaction in a bit less than 0.05 seconds. If you extract the SELECT and UPDATE statements from the app and run them by themselves, using SQL*Plus or some other tool, how long do they take? And if you add up the times for the statements, do they come pretty near to 0.05 seconds? Where does the data come from that is used in the queries, and which eventually gets used in the UPDATE? It's entirely possible that the slowdown is not the database but somewhere else in the app, such as the data acquisition phase. Perhaps something like a profiler could be used to find out where the app is spending its time.
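If it helps, here is a rough Go sketch of that timing exercise using database/sql (the godror Oracle driver, the DSN, and the placeholder statements are all assumptions; run it both on the d/b server and on the app machine to compare):

```go
package main

import (
	"database/sql"
	"fmt"
	"log"
	"time"

	_ "github.com/godror/godror" // Oracle driver, assumed for this sketch
)

func main() {
	// DSN format is driver-specific; this is the godror connect-string style.
	db, err := sql.Open("godror", `user/password@dbhost:1521/service`)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	const iterations = 100
	start := time.Now()
	for i := 0; i < iterations; i++ {
		tx, err := db.Begin()
		if err != nil {
			log.Fatal(err)
		}
		// Placeholder statements: substitute the reads and the write
		// extracted from the JPA transaction.
		var n int
		if err := tx.QueryRow("SELECT COUNT(*) FROM my_table").Scan(&n); err != nil {
			log.Fatal(err)
		}
		if _, err := tx.Exec("UPDATE my_table SET updated_at = SYSDATE WHERE id = :1", 1); err != nil {
			log.Fatal(err)
		}
		if err := tx.Commit(); err != nil {
			log.Fatal(err)
		}
	}
	elapsed := time.Since(start)
	fmt.Printf("%d transactions in %v (%.1f tx/sec)\n",
		iterations, elapsed, float64(iterations)/elapsed.Seconds())
}
```

If the raw statements hit roughly the same 22 tx/sec, the limit is in the database or network; if they go much faster, look at the JPA/Hibernate tier.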
Share and enjoy.
