I ran this command to release disk space on anaconda
$ conda clean --all
However, there are still some big files that remain in the pkgs folder of my Anaconda installation.
Is it safe to manually delete all the files in the pkgs folder? Is there any risk of corrupting my Anaconda environment? What are the side effects, if any?
I am using Anaconda 2018 on Windows 10.
Actually, under certain conditions it is an option to remove the pkgs subdirectories. As stated here by Anaconda Community Support, "the pkgs directory is only a cache. You can remove it completely if you want to.
However, when creating new environments, it is more efficient to leave whatever packages are in the cache around."
According to the documentation you can use conda clean --packages to remove unused packages in pkgs (which will move them to pkgs/.trash, from which you can then safely delete them). While this does not check for packages installed using symlinks back to the package cache, that is not an issue if you don't use such environments or if you work under Windows. I guess that's why conda clean --packages is included in conda clean --all.
To more aggressively save space you can use conda clean --force-pkgs-dirs to remove all writable package caches (with the same caveat that there could be environments linked to these dirs). If you don't use environments or use Anaconda under Windows, you're probably safe. Personally, I use this option without issues.
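If in doubt, both options accept a dry run, so you can preview what would be removed before committing (a minimal sketch; the exact output depends on your installation):
conda clean --packages --dry-run
conda clean --packages
conda clean --force-pkgs-dirs --dry-run
conda clean --force-pkgs-dirs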
Edit Commentary
After reviewing the documentation pointed out in @Robert's answer, I must admit my initial response was overly alarmist and, in parts, blatantly incorrect. My apologies for the misleading response.
Nevertheless, I do believe some of what I raised still has some merit for this thread, and so I am deciding to retain the answer with amendments. In particular, I think it worth emphasizing that deleting the pkgs directory may not actually achieve what OP was hoping for (to save space) and that removing the package cache undermines Conda's redundancy minimization strategy going forward by making it impossible to share already installed packages.
Instead, my final recommendation concurs with what @Robert suggested, namely, use conda clean -p to delete unused packages, but keep the cache (the pkgs dir) so that future environments can still leverage hardlinks. One last point to note is that some tools, such as conda-pack, rely on the integrity of the package cache in order to work, so deleting pkgs will prevent their use.
Amended Original Response
No, it is definitely not safe, and in fact the only way you would actually free disk space is if you broke your base env. The issue is that all envs use hardlinks to the pkgs directory, so even if you delete the link located in the pkgs directory, the ones in the envs will still be there, and so you won't delete any physical files on the disk. The only real deletion you might achieve is of something that is only referenced by base, i.e., where the only copy is in pkgs, hence the potential for breaking base.
Correction: The base env still links packages to other locations, so deleting pkgs will not impact base as I originally concluded.
I'd highly recommend looking at this other post on estimating the real disk usage of Conda. You may be overestimating how much space is really being used. For most files in pkgs, there is only one physical copy, so there isn't any additional manual optimization to be done.
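If you want to see the hardlinking for yourself, a quick check on Linux/macOS looks roughly like this (the install path is an assumption; on the asker's Windows setup the idea is the same, but with tools such as fsutil hardlink list instead):
find ~/anaconda3/pkgs -type f -links +1 | wc -l   # cache files that are linked into at least one env
du -sh ~/anaconda3                                # a single du invocation counts each hardlinked file once
du -sh ~/anaconda3/pkgs; du -sh ~/anaconda3/envs  # separate invocations double-count the shared files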
Related
One of the brilliant aspects of Firely.Terminal is its ability to interoperate with the local FHIR package cache (~/.fhir) in a way that is fully compatible with HAPI tools using the cache. Sadly, that no longer seems to be the case.
Today I updated Firely.Terminal to version 2.4.2 and it seems that the new version walks all over the FHIR package cache, changing files without having been asked to.
It used to be that the only thing Firely.Terminal changed in existing packages was the generation of a missing .index.json. For newly installed packages, the only difference to a HAPI-installed package was some additional fields in .index.json (presence of some fields containing null which would normally be suppressed, and the addition of a fhirVersion field).
When the new Firely.Terminal is told to add a package to a scope (fhir install) it automatically 'bakes' it, which seems to involve things like snapshotting all StructureDefinition resources and expanding all ValueSet resources. Even resources whose content remains unscathed get their timestamps trashed. The same fate befalls all packages that are listed as dependencies in the manifest of the package being added to the scope.
There is an 'unbake' command (e.g. fhir unbake --package kbv.ita.for#1.0.1) but this does not operate recursively. What's more, when it says 'Bake successfully removed from KBV.ITA.FOR#1.0.1' (note the erroneous capitalisation) then that is an outright lie - the contents of the package directory are completely unchanged, except for the removal of the file .bake.json.
Hence the only way of restoring the package cache to working order is to identify all trashed packages, delete them all, and then reference them with some HAPI tool in order to get them re-cached.
I wouldn't mind so much if Firely.Terminal trashed its own cache. But what it destroys is the global HAPI package cache for the current user, and that is simply not acceptable.
Is there any way of suppressing the destructive behaviour of Firely.Terminal? Ideally globally (with machine-wide effect), but a secret command switch would do in a pinch. If that is not possible: does anyone know which of the older versions is the newest that still works, and where to get it?
Note: if the cached packages are write-protected then Firely.Terminal doesn't take the hint - it tries to clobber the files anyway and spews out oodles of 'access denied' messages. What's more, it doesn't even stop when an error occurs; instead it continues on its merry way and trashes everything that one might have forgotten to write-protect.
Background: one of the properties of the FHIR package cache that is important for our work is that the files in the cache are exactly the same as those in the (normative) published packages. In particular, we need profiles published without snapshots to not contain snapshots, value sets published without expansions to not contain expansions and so on. For one thing, this makes it possible to verify that the cached files are exactly the same as those contained in the published packages (or fixed versions thereof). For another, we need to control the context in which profiles are snapshotted, value sets expanded and so on because it may be necessary to supply dependencies that are different from those declared in the package manifest. The latter is sometimes necessary because the profile/package version management in the context of electronic prescriptions in Germany is a bit, erm, peculiar and can diverge from FHIR standards. For this to work at all the resources must be snapshotted/expanded dynamically (depending on the use context), not statically on disk. Things are moving in a more standards-compliant direction but we are not quite there yet.
Latest version without bake (on install)
From some quick testing of the latest versions of Firely Terminal, it seems 2.2.0 is the latest without bake functionality (and auto-bake on install). Installation instructions:
> dotnet tool uninstall --global Firely.Terminal
> dotnet tool install --global Firely.Terminal --version 2.2.0
Baking
The bake functionality has been introduced to provide packages with snapshots, because not all downstream tooling (most notably sushi) is able to generate these itself.
Currently bake might be a little too aggressive by default, also recalculating snapshots for packages that already have them. In principle, this should not be a problem, since snapshots are just a cache for the calculation of all the layered differentials. Since snapshot logic still evolves, it might even be desirable now and then to recalculate. But in newer versions we will look to:
Change the default to not recalculate when already provided
Provide a global setting to change that default to never calculate/always (re)calculate snapshots
This should prevent Firely Terminal from touching any files that don't need touching in the package cache. I'm not sure from your question whether anything was actually broken in the state of the shared FHIR cache after 'baking', given your use of 'trashing' and 'destroying'.
Unbaking
The unbake command is intended to remove snapshots from a folder of packages. I see in my testing that it's not doing that, which I'll take as an issue to fix.
I just used Clean My Mac's space lens feature to understand what was eating my disk space and I found this under ~/Library/Caches.
Even with the biggest imagination, I can't think of a reason for that folder being so big. Is it possible to safely (and periodically) delete this folder?
Thank you
Yes, you can delete that directory (or run yarn cache clean -- see How to clear cache in Yarn?).
Yarn, by default, caches the packages it downloads (including different versions). If you delete this cache, the main side effect you'll see is that yarn install may take longer, because it will need to fetch the necessary packages again.
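If you want to confirm what's there before wiping it, Yarn 1.x can tell you where its cache lives:
yarn cache dir     # prints the cache location, e.g. ~/Library/Caches/Yarn on macOS
yarn cache clean   # removes the cached packages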
I installed Miniconda a while ago, and since then I've noticed there seem several copies of the same files (or files with very similar names) in different locations on my computer.
For example, almost exactly the same files in my folder "C:/ProgramData/Miniconda/pkgs" are also in the folder "C:/Users/me/.conda/pkgs". I should note that the only other things in the ".conda" folder are an "environments.txt" file and an "envs" folder with a file called "conda_envs_dir_test".
I've also noticed that the folder "C:/ProgramData/Miniconda/Lib/site-packages" also contains files with very similar names.
Anyway, I wanted to ask if all this is necessary, and why? Sorry if this seems like a weird question. I'm still relatively new to programming.
Conda Package Caching
Conda downloads and unpacks packages into a package cache, and then uses hardlinking to install those packages into environments. One can freely delete the files in the package caches, though this undermines Conda's ability to minimize redundancy across environments going forward. The safest way to clear the package cache is to use the command
conda clean -tp
Multiple Package Caches
It should be noted that you appear to have two package caches, a system-level cache at C:/ProgramData/Miniconda/pkgs and a user-level cache at C:/Users/me/.conda/pkgs. This occurs when users install with the "Install for All Users" option. This is typically not recommended for regular end users, but rather more for System Administrators who are managing a multi-user installation. Conda functions perfectly (and arguably with less hassle) without ever needing elevated privileges.
All that to say, you may need to elevate your privileges for the conda clean command to also clear out the system-level cache. Additionally, if you haven't been using it too long, you may consider uninstalling the system-level install and reinstalling at the user level.
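As a rough sketch of how that looks in practice (run from an Anaconda Prompt; the elevated prompt is only needed to touch the system-level cache):
conda config --show pkgs_dirs   # lists every package cache Conda is using
conda clean -tp --dry-run       # preview what would be removed
conda clean -tp                 # remove unused packages and tarballs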
At work we have a central, read-only, Linux Anaconda installation, and several projects need library packages for their individual project members.
Is there a way to conda install packages in a writable area set aside for each project?
Our Linux servers are also not directly web-connected, but we can transfer data from a Windows machine that is. Is there a way for the Windows conda to download data for our Linux install in such a way that I can transfer the downloaded files to Linux and then finish the install on Linux, with the Linux conda not needing a direct web connection?
Thanks in advance :-)
The best answer to this question is a bit oblique: the Anaconda Distribution is designed for a single user on a single system with unrestricted access to the Internet. Any other use is considered "off label" and YMMV, though there are no license restrictions in place preventing you from trying to use it as you see fit. Anaconda Enterprise is the commercial product that is specifically designed for multi-user, server-deployed Anaconda with firewall restrictions. Security, governance, indemnification, support, collaboration, etc. etc. Check out https://www.continuum.io/ for more details.
But there are "workaround" ways to achieve what you want, albeit complicated ones. For it to be reliable, reproducible, and maintainable you're going to end up reimplementing a lot of what is in Anaconda Enterprise. Here are some tips:
Check out the "conda in multi-user environments" documentation
Check out the "Centralized Anaconda installation" documentation
Regular user alice for project foo can do conda create -p /nfs/project/foo/envs/custompython --offline anaconda; conda activate /nfs/project/foo/envs/custompython; conda install pkg1 pkg2 pkg3
You're going to run into ownership/permission issues. If you have sensible umask values, then when alice's colleague bob tries to update pkg2 in the foo project he'll discover that he can't unlink the files alice wrote there. There is stuff you can do (as the IT admin) with chown, or alice can do with chmod, but it's all a bit of a bother, and there are lots of ways you can paralyze a conda environment, because it expects "writability" to be binary for a particular environment. There is a long history in the conda GitHub issue tracker of people (myself included) shooting themselves in the foot by starting a conda env setup with one account and then making mods with another account that bork out halfway through, leaving everything inconsistent.
Be careful about .condarc files. My advice: avoid them everywhere but in the base Anaconda installation (say, inside /opt/anaconda/.condarc). All sorts of weird stuff can happen when multiple overlaying .condarc files come together (the docs reference above discusses this).
People can create their own environments in an "offline" mode so long as the packages specified in those new environments (and their dependencies) are a subset of the packages available in the base environment (or subsequently added to the package cache), taking into account versions as well, of course.
You can download packages using your online Windows machine by grabbing them from repo.continuum.io and from anaconda.org. Make sure you download them for the right platform. But the challenge: you need to download a set of packages that will satisfy the dependencies of the package you want to install. There isn't a super easy way to get that information when you're offline.
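As a rough sketch of that workflow (the package name, version, and build string below are made up; the real filenames come from browsing the channel's linux-64 index):
curl -O https://repo.continuum.io/pkgs/main/linux-64/somepkg-1.0-py36_0.tar.bz2
Repeat for each dependency, then transfer the tarballs to the Linux server.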
Once you drop new packages into the Linux system's package cache be sure to re-run conda index.
Beware installing packages directly from their tarballs: this will not pick up any dependencies and does what is called a "force" install. So doing conda install /path/to/conda/pkg-ver.tar.bz2 is actually most similar to doing conda install --force --no-deps pkg=ver (though not identical, to be sure). --force means the install will happen NO MATTER WHAT, even if it will break your environment (violate existing package dependencies), and --no-deps means you won't get any of the dependencies of pkg installed.
Can I create a symlink to the local extension from another project folder? I have a common local server and I need to implement the same extension on all local project installations. I tried to put the symlink in place, but sometimes I do not get the expected output. I get it only after clearing the cache of that particular project.
Your scenario is a common one, I guess. But as Omar said, linking to the same code base of the extension from several TYPO3 instances is not good practice.
But we have the same structure as yours, and we realize it through SVN. All of our projects have an SVN repository, and common extensions have their own repositories. Through svn:externals the extensions are linked into the concrete project. This has the advantage that you can change the extension in the concrete project, and after committing, all other projects (which do have to update from SVN, though) benefit from it. I think this would fit your needs, too.
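For reference, linking a shared extension via svn:externals looks roughly like this (the repository URL and extension key are made up):
svn propset svn:externals "my_common_ext https://svn.example.org/extensions/my_common_ext/trunk" typo3conf/ext
svn commit -m "Link shared extension via svn:externals" typo3conf/ext
svn update
After the update, typo3conf/ext/my_common_ext is checked out from the shared repository, and changes committed inside that folder go back to the extension's own repository.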
If I understand your question correctly, you have several TYPO3 sites on the same server and want to share an extension between them using a symlink. I don't think that is a very good idea, because many extensions use tables, and every site normally has its own database, so you would have to do a lot of tinkering to get that to work.
Instead you could make all the modifications to the extension files in the typo3conf/ext/extension_name folder and then export the extension to a t3x file (Ext Manager in the Backend). This t3x file can be installed as an extension (Import extension) on all your other sites.
If your extension does not use a database and you are planning to make frequent changes, then I guess you should be able to make that work (the symlink). Otherwise I recommend you use the first approach I described.
I have not tried this, but you should be able to install extensions globally in Typo3. What this means is that the given extension is placed inside '(typo3_src/)typo3/ext/' instead of 'typo3conf/ext/', presuming both sites use the same Typo3 Core/Source (and thus typo3_src is a symlink to the location of the core).
You can enable installing global extensions via the Install Tool. Once inside the tool, click on 'All Configuration', then search for allowGlobalInstall. Or put the following line into your localconf.php:
$TYPO3_CONF_VARS['EXT']['allowGlobalInstall'] = '1';
Last but not least, you need to make sure the 'typo3/ext/' directory is writable.
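On a typical Linux setup that might look roughly like this (the web server user and group name are an assumption and differ per distribution):
chown -R www-data:www-data typo3/ext/
chmod -R u+rwX typo3/ext/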
Hope this will be to some help. If you have any further questions, let me know :)
As Björn mentioned, I'd suggest installing them globally. Mind you, updating the source will require moving the extensions accordingly.
As for "expected output": be aware that the code in these folders is cached in various ways (mainly page content and config settings), and hence not always run. This is the reason a change done from "outside" the current installation is likely not to propagate to your output without clearing these caches (as you have observed).
When you actually install an extension via the extension manager, the cache should (if correctly configured) be cleared (interested parties may search for clearCacheOnLoad in class.em_index.php to reveal a clear_cacheCmd('all')). There is a small checkbox, which is normally checked, during the installation process to accomplish this.
Omar's first approach is therefore, as I see it, the more easy way to get "expected output" and less jumbling around with global extensions.