Usergrid system requirements? - elasticsearch

Question: how many resources do I need to run Apache Usergrid?
I mean hardware resources: RAM and CPU.
I want to deploy Apache Usergrid as the backend for our apps. The apps have low traffic for now; they are custom projects used by small groups of users (<10k).
I want to know the minimum requirements to judge whether it is viable for us. Thanks.

From what I see of Usergrid, I think the most resource-hungry component will be Elasticsearch, so for a production environment that works well, I guess you should start by following ES's requirements:
At least 8 GB of RAM
At least 4 cores (the more cores you give Elasticsearch, the more love you get, as it tends to work with a lot of threading, i.e. give it more cores rather than more CPU processing power)
Fast HDDs should perform fine
See this article on Elasticsearch. One last thing: depending on your system, you can tune several Elasticsearch settings to achieve better throughput (for instance, see https://www.elastic.co/guide/en/elasticsearch/reference/master/tune-for-indexing-speed.html).
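As a rough sanity check, the minimums listed above can be compared against a candidate machine. This is only an illustrative sketch: the name `meets_es_minimum` and the constants are assumptions taken from the figures quoted in this answer, not from any official Usergrid or Elasticsearch tool.

```python
# Minimums quoted in the answer above (assumed figures, not an official spec).
MIN_RAM_GB = 8
MIN_CORES = 4


def meets_es_minimum(ram_gb: float, cores: int) -> bool:
    """Return True if a machine meets both quoted minimums."""
    return ram_gb >= MIN_RAM_GB and cores >= MIN_CORES


if __name__ == "__main__":
    # Hypothetical candidate machines: (RAM in GB, CPU cores).
    for ram_gb, cores in [(8, 4), (4, 8), (16, 2)]:
        print(ram_gb, cores, meets_es_minimum(ram_gb, cores))
```

Both conditions must hold: extra cores do not compensate for too little RAM, and vice versa.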

I have deployed the latest version of Usergrid, i.e. 2.1, which is working smoothly with apache-cassandra-2.1.20, apache-tomcat-7.0.85, and elasticsearch-1.7.6 on a single Cassandra node on Ubuntu 16.04 with 8 GB RAM and a 180 GB SSD. Hope this helps.

Related

Laravel 6.0 local server

I have this project in Laravel 6.0 stored locally on my laptop. I am wondering whether I can run the same project on another local PC and use it as a local-only server. I know it would be better to have a dedicated server for this project, but this is not a big-time system. I just want to run it on my own property. Is it okay if the server has these specs?
Processor- Intel® Core™ i7-4770
MotherBoard- Gigabyte GA-Z97X Gaming 7 ATX
Cooler- Stock Cooler
RAM- 1x8gb ddr3 Team Elite
GPU- Asus RX 570 4gb ddr5
HDD- WD Green 1TB (100% Healthy)
PSU- Seasonic M12II 620watts fully modular
I don't know if I should ask this here or on another site. Thanks if you could help enlighten me.
First of all, beyond the hardware of the machine where you want to host the application, a more thorough study would be needed, covering the type of application you want to host and the data traffic you expect, as well as the type of database manager you are going to use and whether it will be hosted on the same machine or remotely.
As you said, I understand it is for domestic use, and I imagine you will use MySQL hosted on the same machine (correct me if I am wrong), so the hardware you described should not cause you problems.
Hope this helps.
Your hardware should suffice. However, if you can manage it, try setting it up on a command-line-only Linux OS. Most importantly, if possible, replace your HDD with an SSD, even a small one; it will greatly improve the performance of your application.

Why a virtual machine (VM) is used to run and deploy Hadoop cluster and its modules?

I am new to Hadoop and don't know why a virtual machine (VM) is used to run and deploy a Hadoop cluster and its modules.
Can we not use Hadoop directly on a local Linux/Unix system?
the reason why a virtual machine (VM) is used to run and deploy Hadoop cluster and its modules
Because lots of data centers have more virtual space than physical space. Thousands of servers can run on hundreds of machines (approximately). That's what any Hadoop Cluster in the cloud would be - a bunch of virtualized machines.
Because some companies just want a small, cheap proof of concept that Hadoop will work within their ecosystem of existing software.
Because it makes for an easy demo to boot up a VM rather than carry around several machines.
etc...
Anyway, I'd say it's strongly encouraged to use physical hardware, but that costs time and money to maintain: dealing with hardware failures and keeping software patched across Hadoop and the OS. Primarily, you'd want to be able to pick hardware that suits your use cases: lots of storage for a "data lake", or lots of memory for fast processing. Mix in some SSDs for fast caching...
Sure, VMs let you dynamically allocate some of those items, but when a disk or memory stick fails, it affects all the VMs on that machine rather than a single server.

Can Redis 64-bit under Windows be used in a production environment?

I use this version in my dev environment: Redis-64.
I want to know whether this version is suitable for a production environment.
If it is, what do I need to pay attention to compared with running under Linux?
Since version 3.0.3, the Windows port developers abandoned dlmalloc and began using jemalloc as the memory allocator, and the port was actually considered for production use. The 3.0.500 build is approved for production by the MS developers (see here).
The way they worked around the lack of the Unix fork to save data to disk is something of a nightmare. The Microsoft port developers call it a point-in-time heap snapshot, and it is the most controversial part when used in production:
Redis under Windows may need up to 3 times more memory than the Linux version. This behavior is considered normal, because the swap file on Windows can easily be up to 3 times larger than the actual amount of RAM.
I think this is acceptable only if you use Redis as an LRU cache or do not save data to disk at all.
At the very least, Redis under Windows is clearly vulnerable if your Redis node uses a lot of memory. For example, we tried Redis for Windows (v2.8, v3.0.3, v3.0.5) on a server with 512 GB of memory and 2 SSD drives (256 GB each, in RAID 0) used as the system disk, with no limits on the Windows swap file. Our test emulated our production workload: lots of writes and RDB saves at ~60-70% memory utilization. There were lots of hang-ups when the node tried to save snapshots: memory consumption jumped and connections froze during saving. Such behaviour never happens under Linux on the same hardware.
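To see why the test above ran into trouble, the "up to 3 times" figure can be turned into a back-of-the-envelope estimate. This is a minimal sketch, assuming the factor of 3 is a worst-case upper bound as stated in this answer; the function names are hypothetical and nothing here is measured from Redis itself.

```python
# Worst-case multiplier quoted in the answer above (an upper bound, not a measurement).
WINDOWS_SNAPSHOT_FACTOR = 3


def peak_memory_gb(dataset_gb: float, factor: float = WINDOWS_SNAPSHOT_FACTOR) -> float:
    """Estimate worst-case memory demand while a snapshot is being saved."""
    return dataset_gb * factor


def fits_in_ram(ram_gb: float, utilization: float,
                factor: float = WINDOWS_SNAPSHOT_FACTOR) -> bool:
    """Check whether the worst-case snapshot demand still fits in physical RAM."""
    return peak_memory_gb(ram_gb * utilization, factor) <= ram_gb


if __name__ == "__main__":
    # The setup described above: 512 GB of RAM at roughly 60-70% utilization.
    for utilization in (0.60, 0.65, 0.70):
        peak = peak_memory_gb(512 * utilization)
        print(f"{utilization:.0%}: worst case {peak:.1f} GB, "
              f"fits in RAM: {fits_in_ram(512, utilization)}")
```

Under this assumption, anything above about one third utilization can spill into the swap file during a save, which is consistent with the freezes described.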

How to install pyspark & spark for learning purpose on a laptop with limited resources?

I have a Windows 7 laptop with 6 GB RAM. What is the most RAM/resource-efficient way to install pyspark & Spark on this laptop, just for learning purposes? I don't want to work on actual big data; a small dataset is ideal, since this is just for learning pyspark & Spark in general. I would prefer the latest version of Spark.
FYI: I don't have Hadoop installed.
Thanks
You've basically got three options:
Build everything from source
Install Virtualbox and use a pre-built VM like Cloudera Quickstart
Install Docker and find a suitable container
Getting everything up and running when you choose to build from source can be a pain. You've got to install the JDK, build Hadoop and Spark (both of which require you to install additional software to build them), set up a bunch of environment variables, and then pray that you didn't mess anything up.
VMs are nice, particularly the one from Cloudera, but you'll often be stuck with an older version of Spark and it might be tight with the resources you described.
I'd go with Docker.
Once you've got Docker installed, it becomes very easy to try Spark (and lots of other technologies). My favorite containers for playing around use IPython or Jupyter notebooks.
Install Docker:
https://docs.docker.com/installation/windows/
Jupyter Notebook Python, Spark, Mesos Stack
https://github.com/jupyter/docker-stacks/tree/master/pyspark-notebook
One thing to keep in mind is that you are going to have to allocate a certain amount of memory for the VM and the remaining memory still has to operate Windows. Windows 7 requires a minimum of 1 GB for a 32-bit OS or 2 GB for a 64-bit OS. So likely you are only going to wind up with around 4 GB of RAM for running the VM, which is not much.
Assuming you are 64-bit, note that Cloudera requires a minimum of 4 GB RAM to run CDH 5, but if you want to run Cloudera Express, you need 8 GB.
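The memory budget described above can be written out as simple arithmetic. This is a rough sketch using only the figures quoted in this answer (Windows 7 minimums, CDH 5 and Cloudera Express requirements); the constant and function names are made up for illustration.

```python
# Figures quoted in the answer above; names are hypothetical.
WINDOWS7_MIN_GB = {32: 1, 64: 2}   # minimum RAM Windows 7 needs, by OS bitness
CDH5_MIN_GB = 4                    # quoted minimum for CDH 5
CLOUDERA_EXPRESS_MIN_GB = 8        # quoted minimum for Cloudera Express


def ram_left_for_vm(total_gb: float, os_bits: int = 64) -> float:
    """RAM remaining for a VM after reserving the host OS minimum."""
    return total_gb - WINDOWS7_MIN_GB[os_bits]


if __name__ == "__main__":
    available = ram_left_for_vm(6, os_bits=64)   # the 6 GB laptop from the question
    print("Available for VM:", available, "GB")
    print("Enough for CDH 5:", available >= CDH5_MIN_GB)
    print("Enough for Cloudera Express:", available >= CLOUDERA_EXPRESS_MIN_GB)
```

On the 6 GB laptop this leaves exactly the CDH 5 minimum with no headroom, which is why the VM route is a tight fit.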
Running Docker from Windows will require you to use boot2docker, which keeps the entire VM in memory. It uses minimal memory (like around 27 MB) to run, so you should be fine there. A MUCH better solution than running VirtualBox!
Another option to consider would be to spin up a free machine on something like Amazon Web Services (http://aws.amazon.com) or Google Cloud (http://cloud.google.com). Particularly with the latter, you can get a free trial amount of credits, which you could use to spin up a machine with more RAM than you would typically get with AWS.

Decreasing performance of dev machine to match end-user's specs

I have a web application, and my users are complaining about performance. I have been able to narrow it down to JavaScript issues in IE6, which I need to resolve. I have found the excellent dynaTrace AJAX tool, but my problem is that I don't have any issues on my dev machine.
The problem is that my users' computers are ancient, so timings which are barely noticeable on my machine are perhaps 3-5 times longer on theirs, and suddenly the problem is a lot larger. Is it possible somehow to degrade the performance of my dev machine, or preferably of a VM running on my dev machine, to the specs of my customers' computers?
I don't know of any virtualization solutions that can do this, but I do know that the computer/CPU emulator Bochs allows you to specify a limit on the number of emulated instructions per second, which you can use to simulate slower CPUs.
I am not sure if you can limit the CPU, but in VirtualBox or Parallels you can limit the memory usage. I assume that if you only give it about 128 MB, it will be very slow. You can also limit network throughput with a lot of tools. I guess the only thing I am not sure about is the CPU. That's tricky. Curious to know what you find. :)
You could get a copy of VMWare Workstation and choke the CPU of your VM.
With most virtual PC software you can limit the amount of RAM, but you are not able to set the CPU to a slower speed as it does not emulate a CPU, but uses the host CPU.
You could go with some emulation software like Bochs, which will let you set up an x86 processor environment.
You may try Fossil Toys
* PC Speed
PC CPU speed monitor / benchmark. With logging facility.
* Memory Load Test
Test application/operating system behaviour under low memory conditions.
* CPU Load Test
Test application/operating system behaviour under high CPU load conditions.
It doesn't simulate a specific CPU clock speed, though.
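The idea behind a CPU load test can be sketched in a few lines. This is not how Fossil Toys is implemented; it is a minimal, hypothetical `cpu_load` helper that keeps one core busy for a chosen fraction of each short period, so that the application under test competes for CPU time. Like the tools above, it does not emulate a specific clock speed.

```python
import time


def cpu_load(duration_s: float, duty: float = 0.5, period_s: float = 0.05) -> int:
    """Busy-loop for `duty` of each period and sleep for the rest.

    Returns the number of busy/sleep cycles completed. Loads one core only.
    """
    if not 0.0 <= duty <= 1.0:
        raise ValueError("duty must be between 0 and 1")
    busy_s = period_s * duty
    cycles = 0
    end = time.perf_counter() + duration_s
    while time.perf_counter() < end:
        spin_until = time.perf_counter() + busy_s
        while time.perf_counter() < spin_until:
            pass  # burn CPU time
        time.sleep(period_s - busy_s)  # yield the core for the rest of the period
        cycles += 1
    return cycles


if __name__ == "__main__":
    # Keep one core roughly 80% busy for 10 seconds while testing the app.
    print("cycles:", cpu_load(10, duty=0.8))
```

To load several cores, one such process would have to be started per core; the actual slowdown seen by the browser depends on the OS scheduler, so treat the duty value as a rough knob rather than a calibrated setting.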
