I have 10 virtual environments for 10 different projects, but they share many requirements.
For example, let's say they all use pandas.
Question 1: Does this mean I have 10 copies of pandas downloaded by pip, each occupying storage?
Question 2: For these commonly used requirements, wouldn't it be better to install them at the system level ("base")? How do you do this?
Question 3: Is it a crazy idea to create a virtual environment with these libraries, use it as a "base", and then set include-system-site-packages = true in pyvenv.cfg?
What are the good practices?
I'm on macOS and use Homebrew Python.
Thanks in advance for your insights and experience.
How about creating a Docker image for your common packages? Then you just need to build whatever you are doing on top of that image. Yes, you will have 10 images, but only one image with pandas, for example.
So to answer your questions...
Question 1, yes.
Question 2, look into Docker and start using it.
Question 3, in line with question 2 if you use Docker.
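To see how much storage the duplicated copies from Question 1 actually take, a minimal sketch along these lines could help. It assumes the POSIX venv layout (lib/python*/site-packages, as on macOS and Linux); the function names `package_size_bytes` and `report_duplicates` are hypothetical helpers, not part of any tool.

```python
from pathlib import Path


def package_size_bytes(venv_root, package):
    """Sum the size of every file under <venv>/lib/python*/site-packages/<package>."""
    total = 0
    # Assumes the POSIX venv layout; Windows venvs use Lib/site-packages instead.
    for pkg_dir in Path(venv_root).glob(f"lib/python*/site-packages/{package}"):
        for path in pkg_dir.rglob("*"):
            if path.is_file():
                total += path.stat().st_size
    return total


def report_duplicates(venv_roots, package):
    """Return {venv_root: size_in_bytes} for each venv that contains the package."""
    sizes = {}
    for root in venv_roots:
        size = package_size_bytes(root, package)
        if size > 0:
            sizes[str(root)] = size
    return sizes
```

For example, if the environments live in a hypothetical ~/projects/<name>/.venv layout, `report_duplicates(Path.home().glob("projects/*/.venv"), "pandas")` would list each copy and its size.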
Quite new here, hope this isn't too simple a question.
I am trying to get into deep learning, starting with PyTorch.
The thing is, all the tutorials I see download the data in order to build a dataset for training and testing. Unfortunately, my internet connection is restricted, meaning I can't download directly from the web.
What I can do is download files and transfer them to my computer.
So my question is: in order to use a previously downloaded dataset in PyTorch:
Where should I store it?
How do I create a dataset once I have the files on my computer?
If any other information seems important to you, I'll be glad to hear it. I'm a serious newbie.
Thank you very much!
Not sure which dataset you are asking about. Since you mentioned the "tutorials", I am guessing that you just want to work with a dataset that comes with some library in the PyTorch ecosystem (e.g. torchvision).
Dataset classes in the PyTorch ecosystem have a "root" argument in their constructors.
from torchvision.datasets import MNIST
mnist = MNIST(root='/some/path', download=True)
You can simply download it on some machine with internet, and transfer the contents of the folder /some/path to your machine at /my/machine/path. Then simply point to it and turn off the download:
# on your machine without internet
mnist = MNIST(root='/my/machine/path', download=False)
This should work.
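If the files are your own rather than one of the built-in datasets, PyTorch's map-style dataset protocol only needs `__len__` and `__getitem__`. Below is a minimal sketch of that protocol using only the standard library. The class name `FolderDataset` and the one-subfolder-per-class layout are assumptions for illustration; in real use you would typically subclass `torch.utils.data.Dataset` and decode each file into a tensor inside `__getitem__`.

```python
from pathlib import Path


class FolderDataset:
    """Map-style dataset over files laid out as root/<class_name>/<file>."""

    def __init__(self, root):
        self.root = Path(root)
        # One label per subfolder, sorted so indices are stable between runs.
        self.classes = sorted(p.name for p in self.root.iterdir() if p.is_dir())
        self.samples = [
            (path, label)
            for label, name in enumerate(self.classes)
            for path in sorted((self.root / name).iterdir())
            if path.is_file()
        ]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        path, label = self.samples[index]
        # Raw bytes only; decoding into a tensor is left to the caller.
        return path.read_bytes(), label
```

The files can live anywhere on disk; you just pass that folder as `root`, much like the `root` argument above.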
I need your expert advice before building my Docker image. I have a requirement to install multiple programming languages in a Docker image. I have two options:
(a) Install all the software together and build a single image, which may be around 4 GB.
(b) Install each piece of software separately and build a separate image for each, with each image around 1 GB.
Now the question is: if I want to use these images on a single machine to create multiple containers that run in parallel, which option is better: a single larger image or multiple smaller images?
Thanks in advance for your kind suggestions.
Regards
Mohtashim
According to Docker best practices, you should put one service per image. This will allow you to have more fine-grained service control. Look here: https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
Just started going through Hadoop introduction videos.
How do I practice it on my own? Is there a recommended way to install it locally for practice?
I found that downloading and installing Hadoop, playing with it by working examples, making lots of mistakes and being ok with that worked well for practice.
If by "install on local" you mean "how do I install it on my local machine without using HDFS?", there's an excellent guide here.
If you want to learn about Hadoop and Big Data, look into bigdatauniversity.com. It's free, and they give instructions on how to install Hadoop locally on a virtual machine and/or in Amazon Web Services. BigDataUniversity provides labs and instructions to help guide your practice. I have found it helpful so far.
Recently Cloudera launched a new online platform where you can play with Hadoop and its ecosystem as much as you want. Here you go:
cloudera.com/live
I have been training people on Hadoop for 2 years now. Here are my two cents.
For the learning part, I would recommend the following sources (as others have also mentioned above):
Yahoo Blog
Hadoop Definitive Guide
HortonWorks Practice Tutorials
For practicing, people have traditionally used Hadoop virtual machines, but this approach has its downsides:
The VMs are huge; for example, the HortonWorks VM is 9.9 GB.
You might have to upgrade your RAM to 8 GB.
Some BIOSes don't allow virtualization. You might have to change BIOS settings.
Some machines, such as office desktops/laptops, may not allow installations.
My students and I faced these problems too, so we set up a cluster for our students to practice Hadoop, Spark, and related technologies. We named it CloudxLab.com.
...I liked bigdatauniversity.com and also noted that MapR, Hortonworks, and Cloudera all offer a downloadable environment that you can use to gain familiarity with the Hadoop operating paradigm.
In fact, if you are studying this with an eye toward working with Hadoop at an Enterprise scale, it's a good idea to explore the products that are being deployed at that level.
I've had a little chance now to explore MapR's Hadoop environment hands-on and can commend it as a good way to look into the matter.
I would suggest https://developer.yahoo.com/hadoop/tutorial/ for self-paced Hadoop study. It's a very comprehensive, step-by-step guide, from beginner to advanced level.
You can install a VirtualBox image that has Hadoop included, but you may encounter some problems with it. I did so when I first started learning Hadoop, and after several problems (IP, internet, different configs) I decided to learn with a Linux install.
You can find a tutorial here:
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
Closed. This question is off-topic. It is not currently accepting answers.
Closed 9 years ago.
I have an application server running on Windows (IIS 6.0 with Zend Server to execute PHP). I am looking for a lightweight, static-content-only web server on this same machine that will relieve IIS from handling static content and increase performance.
It needs to be a static-content-only web server, as small and as efficient as possible. lighttpd seems too big because it supports FastCGI.
I am looking for: Windows, static content only, fast, and lightweight.
I am using Windows Server 2003.
You can use Python as a quick way to host static content. On Windows, there are many options for running Python; I've personally used Cygwin and ActivePython.
To use Python as a simple HTTP server, just change your working directory to the folder with your static content and type python -m SimpleHTTPServer 8000 (this module name is for Python 2). Everything in the directory will be available at http://localhost:8000/
Python 3
To do this with Python 3.4.1 (and probably other versions of Python 3), use the http.server module:
python -m http.server <PORT>
# or possibly:
python3 -m http.server <PORT>
# example:
python -m http.server 8080
On Windows:
py -m http.server <PORT>
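The same module can also be started from a script when you want to pick the directory and port in code. This is a minimal sketch, assuming Python 3.7 or later (for `ThreadingHTTPServer` and the handler's `directory` argument); `start_static_server` is a hypothetical helper name. Note that the Python documentation describes http.server as not recommended for production, so treat this as a quick-testing option rather than a high-performance server.

```python
import functools
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def start_static_server(directory, port=0, host="127.0.0.1"):
    """Serve `directory` over HTTP in a background thread; return the server."""
    # Bind the handler to the chosen directory instead of the working directory.
    handler = functools.partial(SimpleHTTPRequestHandler, directory=str(directory))
    # port=0 asks the OS for any free port; read it from server.server_address.
    server = ThreadingHTTPServer((host, port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

Call `server.shutdown()` followed by `server.server_close()` to stop it; `server.server_address[1]` gives the port that was actually bound.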
Have a look at mongoose:
single executable
very small memory footprint
allows multiple worker threads
easy to install as service
configurable with a configuration file if required
The smallest one I know is lighttpd.
Security, speed, compliance, and flexibility -- all of these describe lighttpd (pron. lighty) which is rapidly redefining efficiency of a webserver; as it is designed and optimized for high performance environments. With a small memory footprint compared to other web-servers, effective management of the cpu-load, and advanced feature set (FastCGI, SCGI, Auth, Output-Compression, URL-Rewriting and many more) lighttpd is the perfect solution for every server that is suffering load problems. And best of all it's Open Source licensed under the revised BSD license.
Main site: http://www.lighttpd.net/
Edit: removed Windows version link, now a spam/malware plugin site.
Consider thttpd. It can run under Windows.
Quoting Wikipedia:
"it is uniquely suited to service high volume requests for static data"
A version of thttpd-2.25b compiled under Cygwin with Cygwin DLLs is available. It is single-threaded and particularly good for serving images.
Have a look at Cassini. This is basically what Visual Studio uses for its built-in debug web server. I've used it with Umbraco and it seems quite good.
I played a bit with Rupy. It's a pretty neat, open source (GPL) Java application and weighs less than 60KB. Give it a try!
You can try running a simple web server based on Twisted
nginx or G-WAN
http://nbonvin.wordpress.com/2011/03/24/serving-small-static-files-which-server-to-use/
Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
I was watching one of those incredibly retarded tv quiz money scams last night as I was reading reddit and they posed the question:
If you wrote down all the numbers between 32 and 287, how many times would you write down the number 6?
So I did some quick maths in my head (there are 11 sixes in each 100, there are two hundreds in between the two numbers and then there are six more = 22 + 6 = 28). The first caller rings up and says 28.
I am not great at maths in my head but I could think of a pretty easy for loop that would figure it out, but there is no way I was going to go through the hassle of installing an IDE on my home machine just to write five lines of code. My question:
Is there a website where I can write simple algorithms like this and compile them and get results all in-browser without having to install any crap or jump through any hoops?
Code Pad supports a lot of programming languages, is free and doesn't require registration.
There are also web-based interpreters for Python:
http://try-python.mired.org/
And for Ruby:
http://tryruby.hobix.com/
Example for the Python online interpreter:
Python 2.5.2 (r252:60911, May 29 2008, 09:50:36) [C] on sunos5
Type "help", "copyright", "credits", or "license" for more information.
>>> total=0
>>> for a in range(32,288):
...     total = total + str(a).count('6')
...
>>> total
56
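The same count can be written as a small Python 3 function, with a breakdown by digit position to show where the 56 comes from. This is a sketch; the function names are hypothetical.

```python
def count_digit(start, end, digit):
    """Count how often `digit` appears when writing every integer in [start, end]."""
    target = str(digit)
    return sum(str(n).count(target) for n in range(start, end + 1))


def count_digit_by_position(start, end, digit):
    """Break the count down by decimal position (0 = units, 1 = tens, ...)."""
    target = str(digit)
    counts = {}
    for n in range(start, end + 1):
        for position, char in enumerate(reversed(str(n))):
            if char == target:
                counts[position] = counts.get(position, 0) + 1
    return counts


print(count_digit(32, 287, 6))              # 56
print(count_digit_by_position(32, 287, 6))  # {0: 26, 1: 30}
```

The breakdown shows 26 sixes in the units place (36, 46, ..., 286) and 30 in the tens place (60-69, 160-169, 260-269), which add up to 56.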
Nowadays there's also ideone.com. It supports a large variety of programming languages, including esoteric ones.
This answer is going to be language-specific. For the best answer, JavaScript would work well. Since it already runs in the browser, writing an interpreter that runs in the browser is a piece of cake. Just google for "javascript interpreter" and you'll get a bunch of hits.
If you can write the algorithms in JavaScript, use Project Bespin.
Bespin is a Mozilla Labs experiment that proposes an open, extensible web-based framework for code editing that aims to increase developer productivity, enable compelling user experiences, and promote the use of open standards.
http://compilr.com has IDE support for C#, Java, C++, Ruby, PHP, and VB, and compile support for Java.
Codiad or Codiad++ for the cpp version.
There's an online "live demo" for the Lua language here: http://www.lua.org/demo.html
There is an online Ruby interpreter at:
http://tryruby.hobix.com/
It has a pretty good tutorial too to help you learn Ruby as you go.
There's a whole bunch of BASIC emulators!
http://www.vavasour.ca/jeff/level1/simulator.html
Great for some instant
10 PRINT "HELLO"
20 GOTO 10
The best online tool I've found (except for codepad) is http://jsfiddle.net/
You can write the HTML, CSS, and JavaScript for your app, and choose from 10 JavaScript frameworks (I recommend jQuery for simple tests). To test, you only need to press the Run button. It also allows online saving (pastebin-like), which is good.