I want to create a Python environment with the data science libraries NumPy, Pandas, PyTorch, and Hugging Face transformers. I use Miniconda to create the environment and to download and install the libraries. conda install has a --download-only flag to download the required packages without installing them, so that they can be installed afterwards from a local directory. However, even when conda just downloads the packages without installing them, it also extracts them.
Is it possible to download the packages without extracting them and extract them afterwards, before installation?
There is no simple command in the CLI to prevent the extraction step. The extraction is regarded as part of the FETCH operation to populate the package cache before running the LINK operation to transfer the package to the specified environment.
The alternative would be to do something manually. Naively, one could search Anaconda Cloud and download packages by hand; however, it is probably better to go through the solver to ensure package compatibility. All the information about the operations to be run can be viewed by including the --json flag, and this can be filtered down to just the tarball URLs, which can then be downloaded directly. Here's a script along these lines (assuming Linux/Unix):
File: conda-download.sh
#!/bin/bash -l
conda create -dn null --json "$@" |\
grep '"url"' | grep -oE 'https[^"]+' |\
xargs wget -c
which can be used as
./conda-download.sh -c conda-forge -c pytorch numpy pandas pytorch transformers
that is, it accepts all arguments conda create would, and will download all the tarballs locally.
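If you then want to install from the downloaded tarballs, one option (a sketch; the environment name is a placeholder, and it assumes the files were downloaded into the current directory) is to point conda install at the local files:
conda create -yn ds-env                      # hypothetical empty environment
conda install -yn ds-env --offline ./*.tar.bz2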
Ignoring Cached Packages
If you already have some packages cached, the above will not re-download them. If you wish to download all tarballs needed for an environment, you could instead use this alternate version, which overrides the package cache with an empty temporary directory:
File: conda-download-all.sh
#!/bin/bash -l
tmp_dir=$(mktemp -d)
CONDA_PKGS_DIRS=$tmp_dir conda create -dn null --json "$@" |\
grep '"url"' | grep -oE 'https[^"]+' |\
xargs wget -c
rm -r "$tmp_dir"
Do you really want to use conda-pack? That lets you archive a conda environment for reproduction without using the internet or re-solving dependencies. To just prevent re-solving you can also use conda env export --explicit, but that still ties you to the source (internet or a local disk repository).
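For reference, a minimal conda-pack round trip might look like this (a sketch based on the conda-pack docs; environment and file names are placeholders):
conda install -c conda-forge conda-pack       # install the tool
conda pack -n my_env -o my_env.tar.gz         # archive the environment
# on the target machine:
mkdir -p my_env && tar -xzf my_env.tar.gz -C my_env
source my_env/bin/activate
conda-unpack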
If you have a static (read-only) environment and really want to reduce Docker image size, you can volume mount the environment at runtime. You would need to match the file paths (i.e. mount the host's /opt/anaconda at /opt/anaconda in the container).
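A minimal sketch of that (assuming the environment lives under /opt/anaconda on the host; image and script names are placeholders):
docker run --rm -v /opt/anaconda:/opt/anaconda:ro my-image:latest \
  /opt/anaconda/bin/python my_script.py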
I am using Git Bash on Windows, that is, Git for Windows via the integrated bash. Apparently it uses the MinGW/MSYS underpinnings. (Update from @VonC: it now uses MSYS2, since msysgit has been obsolete since Q4 2015.)
So there are already a lot of MSYS tools installed, from awk to zcat. However, I am missing the man command and zip to compress multiple files into a zip file (unzip exists!).
Where can I install them from? I do not want to install another copy of the MinGW system! Is there any way to just add some pre-compiled tools to the Git Bash installation?
Here's another, slightly different set of instructions for installing zip for Git Bash on Windows:
Navigate to this sourceforge page
Download zip-3.0-bin.zip
In the zipped file, in the bin folder, find the file zip.exe.
Extract the file zip.exe to your mingw64 bin folder (for me: C:\Program Files\Git\mingw64\bin)
Navigate to this sourceforge page
Download bzip2-1.0.5-bin.zip
In the zipped file, in the bin folder, find the file bzip2.dll
Extract bzip2.dll to your mingw64\bin folder (same folder as above: C:\Program Files\Git\mingw64\bin)
7-Zip can be added to Git Bash as follows:
Install 7-Zip on Windows.
Add the 7-Zip folder (C:\Program Files\7-Zip) to PATH:
In Git Bash, e.g. export PATH=$PATH:"C:\Program Files\7-Zip" (temporary)
On Windows, add the folder to the PATH environment variable in the system settings (permanent)
Duplicate a copy of 7z.exe named zip.exe
Reopen Git Bash. Done!
This way, it works on my laptop.
If you skip the copy-as-zip.exe step, you can still call the command as 7z instead of zip.
Conclusion: Git Bash runs on top of the Windows PATH, so I think you can run any command that you have added to your Windows PATH.
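If you prefer not to edit the Windows PATH at all, the temporary export above can be made persistent for Git Bash only by adding it to ~/.bashrc, for example:
echo 'export PATH="$PATH:/c/Program Files/7-Zip"' >> ~/.bashrc
source ~/.bashrc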
2016: The zip command can be installed from GoW (Gnu On Windows). man is not provided (too big).
Note, however, that even if you only want the zip command from GoW, the whole GoW system has to be downloaded and installed. You can then delete the other commands from the bin directory, but make sure to keep the needed DLLs in the directory.
Update 2021: tar is installed by default on Windows 10 (and can create zip archives; see below).
7-zip based solutions are available below.
git archive is available without any additional installation and can create a zip archive:
mkdir workrepo
cd workrepo
git init
cp -r [target_file_or_dir] .
git add .
git commit -m commit
git archive -o ../myarchive.zip HEAD
cd ..
rm -rf workrepo
The following script may be usable, invoked as:
zip.sh foo.zip target_file_or_dir
#!/usr/bin/bash
set -eu
unset workdir
onexit() {
if [ -n "${workdir-}" ]; then
rm -rf "$workdir"
fi
}
trap onexit EXIT
workdir=$(mktemp --tmpdir -d gitzip.XXXXXX)
cp -r "$2" "$workdir"
pushd "$workdir"
git init
git config --local user.email "zip@example.com"
git config --local user.name "zip"
git add .
git commit -m "commit for zip"
popd
git archive --format=zip -o "$1" --remote="$workdir" HEAD
I am glad to share my experience with this issue, which I had not been able to solve for two years, since the first day I played with Groovy. My method requires Git for Windows to be installed on the Windows OS.
These steps install the 7z command-line utility, which behaves a bit differently from zip:
Download and install 7-Zip from its official website. By default it is installed under the directory /c/Program Files/7-Zip on Windows 10, as in my case.
Run Git Bash with Administrator privileges, navigate to the directory /c/Program Files/Git/mingw64/bin, and run the command ln -s "/c/Program Files/7-Zip/7z.exe" 7z.exe
I am pretty sure it could help you a lot. Trust me!
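Once the link is in place, 7z can be called directly from Git Bash, for example:
7z a archive.zip file1.txt some_folder/   # add files/folders to archive.zip
7z x archive.zip -otarget_dir             # extract into target_dir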
On Windows, you can use tar instead of zip.
tar -a -c -f output.zip myfile.txt
which is the same as
zip output.zip myfile.txt
no need to install any external zip tool.
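The same built-in tar can also list or extract that archive, e.g.:
tar -tf output.zip   # list the contents
tar -xf output.zip   # extract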
I use Chocolatey (choco) as my Windows package manager.
I install 7zip with choco using PowerShell (you must be an admin to use choco):
PS > choco install 7zip.install
Open another Git Bash terminal and locate the 7z.exe executable:
$ which 7z
/c/ProgramData/chocolatey/bin/7z
Do a straight copy of 7z.exe to zip.exe and voila
$ cp /c/ProgramData/chocolatey/bin/7z.exe /c/ProgramData/chocolatey/bin/zip.exe
You can mimic a small subset of man behavior in the shell by mapping man <command> to <command> --help | less
Unfortunately, on my machine bash aliases won't add flags after positional arguments; bash will try to run the flag as a command and fail (alias man="$1 --help" doesn't work).
And a function called man() is not allowed!
Luckily a combination of bash functions and aliases can achieve this mapping. Put the code below in your ~/.bashrc (create one if it is not there). Don't forget to source ~/.bashrc.
# man command workaround: alias can't pass flags, but can't name function man
m() {
"$1" --help | less
}
alias man="m"
It doesn't get you the full man page, but if all you're looking for is basic info on a command and its flags, this might be all you need.
You can install individual GNU tools, such as zip, from http://gnuwin32.sourceforge.net/packages.html.
Then add "/c/Program Files (x86)/GnuWin32/bin" to PATH in a startup script such as .profile, .bash_profile, or .bashrc.
Here are the steps you can follow.
Go to the following link
https://sourceforge.net/projects/gnuwin32/files/
Find out whatever command you are missing
Here I need zip and bzip2 for the zip command, because zip.exe relies on bzip2.dll to run; otherwise you will get the error "error while loading shared libraries: ?: cannot open shared object file: No such file or directory".
Unzip the downloaded files
Here I am downloading "zip-3.0-bin.zip" for "zip.exe" and "bzip2-1.0.5-bin.zip" for "bzip2.dll", each found in the archive's bin folder.
Copy the command exe file into git-bash folder
Here I am copying “zip.exe” and “bzip2.dll” to \Git\usr\bin.
Reference Link
https://ranxing.wordpress.com/2016/12/13/add-zip-into-git-bash-on-windows/
ln -s /mingw64/bin/ziptool.exe /usr/bin/zip
Steps to install SDKMAN on Windows:
Run Windows Terminal with admin rights and open Git Bash inside it (Ctrl + Shift + 4).
winget install -e --id GnuWin32.Zip
mkdir ~/bin
cp /usr/bin/unzip ~/bin/zip
curl -s "https://beta.sdkman.io" | bash
source "/c/Users/ajink/.sdkman/bin/sdkman-init.sh"
sdk selfupdate force
Afterwards you can install Java like this:
sdk install java 17.0.2-open
Done ! :)
In msys2, I restored the functionality of git help <command> by installing man-db:
|prompt> pacman -Syu man-db
|prompt> git help archive
For zip functionality, I also use git archive (similar to yukihane's answer).
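For example, zipping the current HEAD of a repository:
git archive --format=zip -o ../myrepo.zip HEAD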
Here's yet another 7-Zip option that I didn't notice:
Create a script named zip:
$ vi ~/bin/zip
Reference 7z specifying the add command followed by the args:
#!/bin/bash
/c/Progra~1/7-Zip/7z.exe a "$#"
Finally make it executable
$ chmod ugo+x ~/bin/zip
This helped to make a ytt build script happy.
+ zip ytt-lambda-website.zip main ytt
7-Zip 18.01 (x64) : Copyright (c) 1999-2018 Igor Pavlov : 2018-01-28
Scanning the drive:
2 files, 29035805 bytes (28 MiB)
Creating archive: ytt-lambda-website.zip
Add new data to archive: 2 files, 29035805 bytes (28 MiB)
Though this question has been answered quite thoroughly with regard to man, there is one alternative for zipping that has not been highlighted here yet. @Zartc brought to my attention that there is a zip compression utility built in: ziptool. In trying to use it, however, I found out it is nowhere near a drop-in replacement, and you need to specify each individual file and folder. So I dug into the docs and experimented until I had a bash function that does all the heavy lifting and can be used much like a basic zip -qrf name * call:
zipWithZiptool() {
# Docs: https://libzip.org/documentation/ziptool.html
targetFilePath="$1"
shift
args=() # collect all args in an array so spaces are handled correctly
while IFS=$'\n\r' read -r line; do
if [[ -d "$line" ]]; then
args+=("add_dir" "$line") # Add a single directory by name
else
# add_file <pathInZip> <pathToFile> <startIndex> <length>
args+=("add_file" "$line" "$line" 0 -1)
fi
done <<< "$(find "$#")" # call find with every arg to return a recursive list of files and dirs
ziptool $targetFilePath "${args[#]}" # quotation is important for handling file names with spaces
}
You can then for example zip the contents of the current directory by calling it like this:
zipWithZiptool "my.zip" *
If you are willing to also install Cygwin, you can add the Cygwin path to your Git Bash PATH, and if zip is there, it will work; e.g. add
PATH=$PATH:/c/cygwin/bin
export PATH
to your .bashrc; NOTE: I would put it at the end of the path as shown, not the beginning.
Since Cygwin has a UI-based installer, it's easy to add or remove applications like zip or man.
You can figure out the Windows path of each by running
cygpath -w /bin
in each respective shell.
Regarding zip, you can use the following Perl script to pack files:
#!/usr/bin/perl
use IO::Compress::Zip qw(:all);
$z = shift;
zip [ @ARGV ] => $z or die "Cannot create zip file: $ZipError\n";
If you make it executable, name it zip, and put it in your $PATH, you can run it like this:
zip archive.zip files...
However, it will not work for directories. There is no need to install anything, as Perl and all required modules are already included in the Git for Windows installation.
Regarding man, at least for git there is documentation that can be invoked like this:
git <command> --help
It will open in your default browser.
Here is my experience: I can't run any .exe or .msi files on my laptop, so I went to https://github.com/bmatzelle/gow/wiki, followed the "Download Now" link, downloaded the source code (zip), unzipped it into a folder, and added that folder to the PATH variable.
This worked for me.
If you want to zip files without needing to install any additional tools on Windows, in a way that works both in Git Bash and on other *nix systems, you might be able to use Perl.
Per Josip Medved's blog, the following script creates an .epub (which is a zip file), and includes a filter for stripping src/ from the files added to the zip:
perl -e '
use strict;
use warnings;
use autodie;
use IO::Compress::Zip qw(:all);
zip [
"src/mimetype",
<"src/META-INF/*.*">,
<"src/OEBPS/*.*">,
<"src/OEBPS/chapters/*.*">
] => "bin/book.epub",
FilterName => sub { s[^src/][] },
Zip64 => 0,
or die "Zip failed: $ZipError\n";
'
Install zip from https://gnuwin32.sourceforge.net/packages/zip.htm
Copy zip.exe and bzip2.dll from C:\Program Files (x86)\GnuWin32\bin to C:\Program Files\Git\mingw64\bin
Reopen Git Bash
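After reopening Git Bash you can check that the copied binary is picked up, for example:
which zip                  # should resolve to .../mingw64/bin/zip
zip -r out.zip some_folder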
The solution for me was just to install zip in my terminal (bash):
$ sudo apt-get update
$ sudo apt-get install zip unzip
I've created a custom channel on a Windows box following the steps detailed here.
Now I'd like to access it from a different machine but the channel parameter is a URI and I don't know what form it should take with Windows.
Here's the command I tried to execute:
conda search -c file://machine\C\channel --override-channels scipy
which failed with the following error message:
Fetching package metadata: Error: Invalid index file
I have been trying to do the same thing, and the answer by Paul made me a bit pessimistic.
It turns out that it is possible to use a UNC-path.
After trying a few hundred combinations of slashes and backslashes, I found this combination to work:
conda search -c "file://\\DOMAIN\SERVER\SHARE\conda\channel" --override-channels
Similarly,
conda config --add channels "file://\\DOMAIN\SERVER\SHARE\conda\channel"
adds the channel to your config file.
Let's say that your custom channel is located in the directory N:\conda\channel. Then we would expect to see the following in that directory:
(1) the win-64 subdirectory;
(2) inside it (i.e. in N:\conda\channel\win-64\), the index files repodata.json and repodata.json.bz2;
(3) any packages that you have added to your channel.
A search on this channel for the scipy package, ignoring all other channels, would look like this:
conda search -c file://N:\conda\channel --override-channels scipy
Did you add the scipy package into your custom channel? If you did, then did you run conda index on that directory?
I'm a little confused by your directory structure but, if your channel is machine\C\channel, then what happens when you do dir machine\C\channel?
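For reference, rebuilding the channel index requires conda-build and looks roughly like this (a sketch; the path is a placeholder):
conda install conda-build        # provides the conda index command
conda index C:\path\to\channel   # regenerates repodata.json in each platform subdir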
I had no success with the other answers, though @gDexter42 pointed me in the right direction. Perhaps the API has changed. Testing several different options, I was somewhat surprised that:
you can use / or \ interchangably
you do not need to escape spaces
you do not need to enclose paths with spaces in quotes
After creating a custom channel in a network-accessible directory, you can search for a conda package using the file path, omitting the file:// prefix mentioned in other posts and in the documentation.
For a UNC Path:
$ conda search -c //my_nas/some/path with spaces/channel --override-channels
or
$ conda search -c \\my_nas\some\path with spaces\channel --override-channels
If the folder is local, or you have mounted a network directory to a local path (D:\ in this example), you would use that file path.
$ conda search -c D:/some/path with spaces/channel --override-channels
or
$ conda search -c D:\some\path with spaces\channel --override-channels
I tested these commands using both Git Bash for Windows and Anaconda Prompt (which I think is just cmd.exe with the path modified so base/root is the active environment).
Note that if you then want to add that path to your .condarc file, you can use the same path.
channels:
- \\my_nas\some\path with spaces\channel # UNC
- D:/some/path with spaces/channel # local drive
- defaults # this gives defaults lower priority
ssl_verify: true
If you are trying to search for a conda package in a local directory (not on UNC), the following two approaches worked for me.
Navigate to the drive containing the package and try
conda search -c file://folder_path/channel --override-channels
A better way is to drop the file prefix and use the drive letter instead, which allows you to search from any drive:
conda search -c Drive://folder_path/channel --override-channels
Thus, if you are searching on the D: drive, you would type:
conda search -c D://folder_path/channel --override-channels
If your conda channel is at C:\conda-channel then do:
conda search -c file://\conda-channel --override-channels
There is currently a bug in conda 4.6+ where file://C:\conda-channel will not work, as conda strips the colon when parsing. Downgrading to 4.5 is dangerous and can mess up your installation.
The (slightly) outdated documentation for pyrocksdb says:
"If you do not want to call make install, export the following environment variables:"
$ export CPLUS_INCLUDE_PATH=${CPLUS_INCLUDE_PATH}:`pwd`/include
$ export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:`pwd`
$ export LIBRARY_PATH=${LIBRARY_PATH}:`pwd`
But the installation instructions for RocksDB do not seem to mention any sort of install target!
Is there an accepted procedure for installing RocksDB from source?
My thoughts are to just copy the contents of the include directory from the rocksdb directory into somewhere like /usr/local/include and copy the librocksdb.so and librocksdb.a files into /usr/local/lib. Is this an acceptable method?
Note: The method of exporting environment variables was less preferable to me, as I built rocksdb in a directory inside my home folder; I am hoping for a cleaner solution (interpret that how you want).
RocksDB recently gained a make install target. If you use the latest version, you should be able to run make install.
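A typical sequence with a recent checkout might look like this (a sketch; the install prefix and -j value are assumptions):
git clone https://github.com/facebook/rocksdb.git
cd rocksdb
make -j4 shared_lib
sudo make install-shared INSTALL_PATH=/usr/local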
There is no install target in the current Makefile.
This breaks the long-established conventions for writing Makefiles (and of pretty much every other build system); it should be considered a defect.
Without spending a lot of time analysing it, I can't be sure, but the install target should be something like:
prefix=/usr/local
bindir=$(prefix)/bin
# Normally you'd write a macro for this; 'lib' for 32-bit, 'lib64' for 64...
libdir=$(prefix)/lib64
includedir=$(prefix)/include
# Define this to be the directory(s) the headers are installed into.
# This should not include the 'include' element:
# include/rocksdb/stuff -> rocksdb/stuff
HEADER_DIRS=...
# Define this so all paths are relative to both the $CWD/include directory...
# so include/rocksdb/foo.h -> HEADER_FILES=rocksdb/foo.h
HEADER_FILES=...
.PHONY: install
install: $(TOOLS) $(LIBRARY) $(SHARED) $(MAKEFILES)
mkdir -p $(DESTDIR)$(bindir)
mkdir -p $(DESTDIR)$(libdir)
mkdir -p $(DESTDIR)$(includedir)
for tool in $(TOOLS); do \
install -m 755 $$tool $(DESTDIR)$(bindir); \
done
# No, libraries should NOT be executable on Linux.
install -m 644 $(LIBRARY) $(DESTDIR)$(libdir)
install -m 644 $(SHARED3) $(DESTDIR)$(libdir)
ln -s $(SHARED3) $(DESTDIR)$(libdir)/$(SHARED2)
ln -s $(SHARED2) $(DESTDIR)$(libdir)/$(SHARED1)
for header_dir in $(HEADER_DIRS); do \
mkdir -p $(DESTDIR)$(includedir)/$$header_dir; \
done
for header in $(HEADER_FILES); do \
install -m 644 include/$$header $(DESTDIR)$(includedir)/$$header; \
done
This will then allow you to install the files into /usr/local, by simply doing:
make install
However, the reason it's so heavily parameterised, is so you can change the destination folder, without having to modify the Makefile. For example, to change the destination to /usr, you simply do:
make prefix=/usr install
Alternatively, if you'd like to test the installation process, without messing with your filesystem, you could do:
make DESTDIR=/tmp/rocksdb_install_test prefix=/usr install
This would put the files into /tmp/rocksdb_install_test/usr, which you can then check to see if they're where you want them to be... when you're happy, you can just do rm -Rf /tmp/rocksdb_install_test to clean up.
The variables I've used are essential for packaging with RPM or DEB.
I use Ubuntu 16.04.
DEBUG_LEVEL=0 make shared_lib install-shared
This way, the build is already generated in production (release) mode.
If you want to save time, you can specify the number of processors used by passing -j[n]; in my case, -j4:
DEBUG_LEVEL=0 make -j4 shared_lib install-shared
On Ubuntu this is sufficient, but on Ubuntu inside Docker you should also specify where the lib was installed:
export LD_LIBRARY_PATH=/usr/local/lib
Hope this helps.
Kemper
I'm using the new domain feature of PackageMaker (introduced for Mac OS 10.5) to target the user home directory. I have created a .pmdoc file in PackageMaker.app, and everything works perfectly until I add my post-install script. Then, suddenly, my package wants root authorization when it didn't before. I've tried building from the command-line using packagemaker --doc mypackage.pmdoc --info Dist/PackageInfo supplying a tweaked PackageInfo file that explicitly specifies auth="none", but this doesn't work. When I investigate the output package by extracting it with xar -xf package.pkg, authentication seems to be specified in package.pkg/Distribution, an XML file that packagemaker generates for itself.
Due to frustration with the GUI, I've switched to using only packagemaker on the command line. However, now my packages don't display my user interface files (although they are included in the .pkg archive), and still demand root authentication. The offending line in the generated Distribution file is (notice auth="Root"):
<pkg-ref id="org.myUniqueID.pkg" installKBytes="12032" version="1.0" auth="Root">#grooveshark.pkg</pkg-ref>
This is how I run packagemaker:
packagemaker -r ./Grooveshark -f ./Dist/PackageInfo -s ./Dist/Scripts -e ./Dist/Resources -v --domain user --target 10.5 --no-relocate --discard-forks --no-recommend -o ./out.pkg
This is the layout of Dist:
Dist/Distribution # this isn't used by packagemaker, it generates its own
Dist/PackageInfo
Dist/Resources/en.lproj/background
Dist/Resources/en.lproj/License
Dist/Resources/en.lproj/ReadMe
Dist/Resources/en.lproj/Welcome.rtfd
Dist/Resources/en.lproj/Welcome.rtfd/gsDesktopPreview-mini.png
Dist/Resources/en.lproj/Welcome.rtfd/gsDesktopPreview-searchSmall.png
Dist/Resources/en.lproj/Welcome.rtfd/TXT.rtf
Dist/Scripts/jsuuid # specified as a postinstall in Dist/PackageInfo
Dist/Scripts/postflight
How can I configure my package so it will run a postinstall script without demanding root authentication? Is there some way I'm missing to specify both a PackageInfo file and a Distribution install-script XML file via the command line?
I ended up moving files into place in a distribution layout, then used the following script to first build a traditional flat package, expand it, copy in the settings that allow for per-user installation, and then compact it back into a .pkg in place, without further processing.
#!/usr/bin/bash
# Build Package for local install using witchcraft
PROJECT="some/filesystem/location/with/your/files"
BUILDDIR="$PROJECT/Dist/build"
PKGROOT="$PROJECT/Dist/Package_Root"
INFO="$PROJECT/Dist/PackageInfo"
DIST="$PROJECT/Dist/Distribution"
RESOURCES="$PROJECT/Dist/Resources"
SCRIPTS="$PROJECT/Dist/Scripts"
# Remove .DS_Store files
find "$PKGROOT" -name ".DS_Store" | sed 's/ /\\ /' | xargs rm
# make build dir
mkdir "$BUILDDIR"
# build flat package that needs root to install
packagemaker -r "$PKGROOT" -f "$INFO" -s "$SCRIPTS" $ARGS -o "$BUILDDIR/flat.pkg"
# Build distribution that installs into home dirs by unpacking the flat pkg
echo "Building Distribution"
echo " Copying filesystem"
cp -r "$RESOURCES" "$BUILDDIR/Resources"
cp "$DIST" "$BUILDDIR/Distribution"
echo " extracting flat package"
pkgutil --expand "$BUILDDIR/flat.pkg" "$BUILDDIR/grooveshark.pkg/"
rm "$BUILDDIR/flat.pkg"
echo " flattening distribution"
pkgutil --flatten "$BUILDDIR" "$PROJECT/$1.pkg"
echo "Finished!"