What is the current full install size of Cygwin? - windows

Every source I found online says a full installation of Cygwin takes over 1 GB, but mine is only 100 MB. I was pretty sure I downloaded everything from the mirror servers, but the install took less than 5 minutes to complete instead of hours, as I'd expect if it were installing gigabytes of software.
Did Cygwin get a huge clean-up during 2012~2013, or did I do something wrong in the installation?

A full Cygwin installation can range from 23 to 112 GiB, depending on how you define "full."
Your 100 MB number tells me that you just clicked through the defaults presented by Cygwin's setup-*.exe program, selecting no optional packages, because that installs only the Base package set, which currently amounts to 0.1 GiB. Cygwin follows the modern net-connected software distribution model: it assumes you can just run setup-*.exe again and select new packages as you need them.
The Cygwin maintainers try to keep the Base category's package set as small as practical.¹ A Cygwin Base install gives you something much like an old-style Unix installation, covering little more than what POSIX specifies.
So How Do You Get a Full Installation?
The Cygwin installer does not have an obvious way to get a "full" installation, on purpose, because no one needs literally every package in the Cygwin repository.²
There is, however, a sneaky way to install everything. At the Select Packages screen...
...switch to Category view, then click the "Default" text to the right of the "All" group header. It will change to "Install," as will the corresponding text in all of the groups underneath it. This marks everything for installation.
I include this tip for completeness only. You do not want to do this! It will install gigs and gigs of stuff you will never use. Currently, there are 11242 packages in Cygwin,³ and installing every last one of them took 93 GiB of disk space for the installation tree plus 19 GiB for the download tree⁴ the last time I tried it. That gives the 112 GiB upper limit above.
All that unused software carries several costs. Even if disk space, download time, installation time, rebasing time, and local bandwidth use are of no consequence to you, consider the generous free mirror's wasted bandwidth.
I've done the experiment, so now you don't have to.
An Intelligent "Complete" Installation
I've come up with a simple set of package exclusion rules that results in a much smaller installation:
Skip all of the -debuginfo packages. Few people need these, and they take up a lot of space. Savings: about 53 GiB in the installation tree alone; more in the download tree.
It's easy to apply this rule. After selecting all packages for installation with the sneaky trick above but before you move on to the next screen, click the "Install" text next to the "Debug" category header until it switches back to "Default."
If you've already installed the debug packages, click that text until it says "Uninstall" instead.
Do not explicitly install any of the lib* packages. Let Cygwin's setup-*.exe automatically install libraries to satisfy package dependencies. Savings: about 5 GiB ⁵
To apply this rule, switch the "Libs" category to "Default" or "Uninstall" as you did with the "Debug" category. The installer will figure out which libraries you actually need in a later step.
Skip the cross-compilers and associated packages. Again, few people need these.⁶ Savings: About 4 GiB
There are two major sets of cross-development tools in Cygwin: the set for creating Cygwin executables of the other word size (i.e. 64-bit tools and libraries for 32-bit Cygwin, or vice versa) and the set for building MinGW executables of the same word size as your Cygwin installation.
To apply this rule for a 64-bit Cygwin installation, while still on the "Select Packages" screen, type cygwin32- in the package name search box at the top of that screen, then click the action text next to each top-level category until it cycles to Default or Uninstall, as above.
Repeat that for mingw64-.
The idea is the same for 32-bit Cygwin, except that you search for and exclude packages with cygwin64- and mingw32- in their names instead.
By following this rule set, I was able to install nearly everything, taking only about 23 GiB.
Paring That Down
We can get the installation to be even smaller by excluding several other notorious disk hogs:
X11, the desktop environments, and the GUI apps together require about 11 GiB.⁷
A Cygwin Base + Devel installation comes to about 10 GiB.
A Cygwin Base + TeX category installation takes about 5 GiB. If you install only your native language's support package, it comes to about 3.7 GiB instead.
All of the -doc packages combined chew up about 5 GiB of disk space.
Someone who isn't a software developer, who doesn't use Cygwin for GUI stuff, who uses the Web for docs, and who doesn't use TeX for document creation could thus have a "full" Cygwin installation in only about 1 GiB.⁸
If you use Cygwin the way it is intended to be used, installing the base and only the extra packages you need at the moment, you probably won't get your installation size up to even those levels. I use Cygwin this way, and my installations are typically well under 1 GiB, yet they are "complete" by my lights, since they meet my current needs.
For Extra Disk-Filling Fun...
All of this testing was done with the 64-bit version of Cygwin. You can roughly double the above space requirements by installing a parallel 32-bit Cygwin installation.⁹
Doing so is not pointless. It is a viable alternative to using cross-compilers, for one thing. For another, the fundamental incompatibility of the two Cygwins means you may have need of both.
Footnotes
There are occasional threads on the Cygwin mailing list where someone argues that some very common package should be included in the Base category, such as Perl, and the result is usually that the maintainers decide not to add it to Base.
You also occasionally see the opposite: some package accidentally slips into Base, usually via an incorrect dependency. Shortly after the problem is brought to the attention of the Cygwin maintainers, the status quo ante is restored. (example)
Perhaps I am wrong.
There could be a vision-impaired Czech immigrant musician who completes US government software development contracts on the side while not busy brushing up on his technical Hindi vocabulary by translating electrical engineering reports into his adopted Mandarin.
I want to meet him. He sounds like an interesting guy.
Plus, I think I can help him with his plan to create a Tcl/Tk GUI front end for Orpie. Naturally, I will try to talk him into porting it from Ocaml to C++/Qt.
I mean, Tcl/Tk, seriously? In 2018? Let's not be ridiculous.
curl -s https://cygwin.com/packages/package_list.html | grep -c x86_64/
Cygwin's setup-*.exe doesn't delete the downloaded package files after installing them. This is useful at sites where you have multiple Cygwin installations, since you can put the download directory on a shared network drive. Each package then only has to be downloaded once at that site.
My 112 GiB upper limit assumes you will download and install to the same drive, and that you will keep the download tree in case you need to reinstall it later.
Not only does setup-*.exe not delete downloaded package files after installing them, it doesn't auto-purge old versions, so your download tree grows and grows over the years you use Cygwin. (There are scripts floating about the net to solve this problem, such as this one.)
All data storage values given in this answer are apparent disk usage numbers — du -bhs — rather than actual disk usage numbers, which would account for the file system overhead, since that varies between systems. This affects the installation tree to a much greater degree than it does the download tree since the proportion of small files is much greater in the installation tree. Expect something like +1% in the download tree and +5% in the installation tree.
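To make the apparent-versus-actual distinction concrete, here is a minimal Python sketch that totals both figures for a directory tree. The function name tree_sizes is made up for illustration. It counts regular files only, so its apparent total will come out slightly below what du -bs reports (du also counts directory entries), and it relies on st_blocks, which is available on POSIX-style systems such as Cygwin's own Python; where that field is missing, it falls back to the apparent size.

```python
import os


def tree_sizes(root):
    """Return (apparent_bytes, allocated_bytes) for the files under root."""
    apparent = 0
    allocated = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            apparent += st.st_size
            # st_blocks counts 512-byte units on POSIX systems. Where the
            # field is missing (native Windows Python), fall back to st_size.
            blocks = getattr(st, "st_blocks", None)
            allocated += blocks * 512 if blocks is not None else st.st_size
    return apparent, allocated


if __name__ == "__main__":
    import sys

    apparent, allocated = tree_sizes(sys.argv[1] if len(sys.argv) > 1 else ".")
    print(f"apparent:  {apparent / 2**30:.2f} GiB")
    print(f"allocated: {allocated / 2**30:.2f} GiB")
```

The gap between the two numbers is the file system overhead, which is why it grows with the proportion of small files.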
You may wonder why there are libraries in the Cygwin package repository that you don't need even though you've installed "all" packages. There are several reasons:
some libraries are obsolete, but are still present on the mirrors
some libraries come in multiple alternative forms, so that people who know they need something other than the default can choose it
some libraries are there only for people writing their own programs, not to support any existing Cygwin package
Pretty much the only people who need the Cygwin cross-compilers are the people maintaining Cygwin packages, since maintainers are expected to build for both 32-bit and 64-bit Cygwin unless there is a good reason not to.
There are probably more people with a good justification for MinGW cross-development tools, but there's also the option of using MinGW and MSYS instead of Cygwin. Also, I am guessing that the number of people who do dual-stack Cygwin + MinGW development is smaller than the set of people who use one or the other exclusively, or nearly so.
It is not easy to do this exclusion, because GUI packages are scattered throughout the Cygwin packaging system, and their top-level category often contains non-GUI software, so you can't simply exclude the whole category. (e.g. Math.)
The first pass is to exclude the X11, GNOME, KDE, LXDE, MATE, Xfce, and Games categories using the same Categories view technique as above.
Then, using the search box as we did for the cross-compiler exclusion above, remove packages matching gtk, gnome, qt, and kde, optionally excluding those in the Devel, Debug or Libs packages, if you need those.
Finally, you'll have to switch to the Pending view and manually exclude a bunch of packages that weren't caught by either of those two broad exclusions: Abiword, Calligra, Celestia, Dia, Evince, Geany, gEdit, Geomview, Gimp, gLabels, GnuCash, Gnumeric, Kalzium, KMyMoney, KolourPaint, Konqueror, Krita, KStars, LyX, Marble, Pidgin, QupZilla, Scribus, Skrooge, Spectacle, Stellarium, Tellico, and Vinagre. If you don't exclude these, they'll drag back in much of what you excluded above as dependencies!
You may notice that the size of all the excluded package sets adds up to more than the 23 GiB of the "intelligent complete" installation. This is because of shared dependencies. That is to say, these sizes overlap to some extent.
If you put both setup-*.exe programs in the same directory, they will share the download tree so that all of the noarch packages are downloaded only once for both Cygwins. Between that and the fact that 32-bit and 64-bit Intel compilers generate code that differs in size, installing both Cygwins doesn't quite double the disk space requirement.

When installing packages for the first time, setup*.exe does not install every package. Only the minimal base packages from the Cygwin distribution are installed by default.
Clicking on categories and packages in the setup*.exe package installation screen will provide you with the ability to control what is installed or updated.
Clicking on the "Default" field next to the "All" category will provide you with the opportunity to install every Cygwin package.
Installing and Updating Cygwin Packages

I just did a full installation of Cygwin x64 (2018-01-03) on Windows 7 SP1 x64 Ultimate:
Cygwin folder (which by default is C:\cygwin64) has the following properties:
Size: 89.1 GiB (95,671,245,179 bytes)
Size on disk: 92.3 GiB (99,148,398,592 bytes)
Contains: 1,433,494 Files, 94,363 Folders
Cygwin temporary folder where downloaded packages are stored prior to being installed (can be removed after Cygwin is installed):
Size: 18.0 GiB (19,363,639,932 bytes)
Size on disk: 18.0 GiB (19,383,767,040 bytes)
Contains: 9,711 Files, 10,354 Folders

I would like to add to this thread. This approach gives you a leaner, meaner, bare-bones, minimal Cygwin install, with just the tools/items you need. No dependency bloat, no unwanted packages, files etc.
I have been experimenting with Cygwin attempting to get a "bare-bones", minimal install. I do find that installing utilities like grep, gawk, sed and similar tools has dependencies on cygwin, base-Cygwin and sometimes unwanted tools like bash, coreutils etc.
I wanted to get only the tools and their required dlls installed and started examining the Cygwin package. I discovered that not using the setup.exe supplied by Cygwin is an alternative way to accomplish minimal Cygwin installs.
And this is how I got it done:
Download only the packages you want from any of the Cygwin mirror sites using ftp or http. Alternatively, you can use the setup.exe supplied by Cygwin to download all the packages: download only, no install.
Once the download has completed, individual packages like zlib, gawk, grep, and libiconv are found under the x86/release or x86_64/release directory. Each package is tarred, compressed with xz or bzip2, and stored in its own directory.
To install a specific tool like sed or gawk, all that needs to be done is to extract the tool's executable and its dependencies (.dll files).
Before you attempt the following, please ensure you have a tool like 7z.exe, xz.exe, bzip2 or similar that is capable of uncompressing an .xz or .bz2 archive.
Installing gawk example below :
Extract gawk.exe from gawk-4.1.3-1.tar.xz archive using the command 7z.exe e -so gawk-4.1.3-1.tar.xz | tar xvf -
Once that is done, you should find gawk.exe in a subfolder, usually usr/bin, under the release/gawk folder.
Find the dependencies for gawk, you can do this in a couple of ways.
Examine the Cygwin setup.ini file found in x86 or x86_64 folder.
Look for the string '# gawk' and in the lines after this line you should find a "requires:" line that lists the dependencies.
Mine reads like this - "requires: bash cygwin libgmp10 libintl8 libmpfr4 libreadline7"
For gawk to run, bash is not a must since we have the Windows command shell. (bash is included to get a few other dlls required by gawk. However, that causes a lot more unnecessary files to be installed.) The other dependencies contain files that gawk needs to run.
Extract each of the above packages using tools like 7z or xz into individual files.
After all the dependencies are extracted, copy your needed tool(s) (grep/sed/gawk) to a folder and all the dependent .dlls
You should now be able to run your tool with the minimum set of .dlls required in a bare-bones cygwin installation.
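The setup.ini lookup described above can be scripted. The following is only a sketch under stated assumptions: the function name requires_for is invented here, and it treats a line of the form "@ gawk" as the start of a package stanza (the form I believe setup.ini uses) while also accepting the "# gawk" form quoted in the steps. It returns the first "requires:" line it finds in that stanza and does not try to cope with multi-line descriptions that happen to contain look-alike lines.

```python
def requires_for(setup_ini_text, package):
    """Return the package names on the 'requires:' line of one stanza."""
    in_stanza = False
    for line in setup_ini_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(("@ ", "# ")):
            # A stanza header (or a comment): note whether it names our package.
            in_stanza = stripped[2:].strip() == package
        elif in_stanza and stripped.startswith("requires:"):
            return stripped[len("requires:"):].split()
    return []  # package not found, or it lists no dependencies


if __name__ == "__main__":
    import sys

    # Usage: python requires_for.py path/to/setup.ini gawk
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        print(" ".join(requires_for(handle.read(), sys.argv[2])))
```

Remember the caution below: the list you get this way is a starting point, not a guarantee that every needed dll is covered.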
Caution : It may not be sufficient to just extract the dependencies listed in setup.ini for each tool. Sometimes, you may need to execute/run the tool to discover that there are more dlls required.
There are other means of finding out the dlls required by an .exe - you can use dumpbin from MS or dependency walker, ndepends or similar tools to find the list of dependent dlls
Consult - How do I detect the DLLs required by an application?
How do I find out which dlls an executable will load?
I also brute forced this dll info by just running the tool and installing the missing dlls listed one by one by extracting from the required packages.
When you run a tool and it errors out with a missing .dll message, search for the package that contains the dll here - https://cygwin.com/cgi-bin2/package-grep.cgi . Enter the full/partial name of the missing dll to find the name of the package containing the dll.
Eventually, I have ended up with a bare-bones cygwin install with only the tools and dlls that I need.
Example : gawk - gawk.exe and the following dlls - cygwin1.dll, cyggmp-10.dll, cygiconv-2.dll, cygintl-8.dll, cygmpfr-4.dll, cyggcc_s-seh-1.dll, cygncursesw-10.dll, cygreadline7.dll
sed - sed.exe and dlls - cygwin1.dll, cygintl-8.dll
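If you would rather not juggle 7z and tar by hand, the extraction step can be sketched with Python's standard tarfile module, which reads xz and bzip2 archives directly. The helper name extract_binaries and the choice to flatten everything into one folder are my own assumptions, made to match the "copy the tool and its dlls into a folder" approach above. It only picks up regular files, so anything stored in the archive as a link is skipped.

```python
import os
import shutil
import tarfile


def extract_binaries(archive_path, dest_dir, suffixes=(".exe", ".dll")):
    """Copy only the .exe/.dll members of a package archive into dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    written = []
    # Mode "r:*" lets tarfile detect xz, bzip2 or gzip compression itself.
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue  # directories, symlinks and hard links are skipped
            if not member.name.lower().endswith(suffixes):
                continue
            # Flatten usr/bin/gawk.exe to dest_dir/gawk.exe.
            target = os.path.join(dest_dir, os.path.basename(member.name))
            with tar.extractfile(member) as source, open(target, "wb") as out:
                shutil.copyfileobj(source, out)
            written.append(target)
    return written
```

For example, extract_binaries("gawk-4.1.3-1.tar.xz", "mytools") would leave gawk.exe in a mytools folder; repeat for each dependency package to collect the dlls next to it.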
Hope this is found useful. The Cygwin installer also does dll re-basing, which I will not venture into here.

Under the devel you may need only the following:
gcc-core:GNU Compiler collection(C Open MP)
gcc-fortran:GNU Compiler collection(Fortran)
gcc-g++:GNU Compiler collection(C++)
gcc-objc:GNU Compiler collection(Objective-C)
gdb: The GNU Debugger
make: The GNU version of the "make" utility

Related

Does 'make'-ing something from source make it self-contained?

Forgive me before I start, as I'm not a C / C++ etc programmer, a mere PHP one :) but I've been working on projects that use some others sourced from online open source repos, such as svn and git. For some of these projects, I need to install libraries and then run "./configure", "make" and then "make all" (as an example) and I do this on a "build" virtual machine to get the binaries that I need to use within my project.
The ultimate goal of some of my projects is to then take these "compiled" (if that's the correct term) binaries and place them onto a virtual machine which I would then re-distribute (according to licenses etc).
My question is this: when I build these binaries on my build machine, with all the prerequisites that I need in order to build them in the first place ("build-essential", "cmake", "gcc", etc.), once the binaries are on my build machine (in /usr/lib, for example), are they self-contained to the point that I can merely copy those /usr/lib binary files that the build created and place them in the same folder on the virtual machines that I would distribute, without those distributed machines having all the build components installed on them?
With all the dependencies that I would need to build the source in the first place, would that finally built binary contain them all in itself, or would I have to include them on the distributed servers as well?
Would that work? Is the question a little too general and perhaps it would all depend on what I'm building?
Update from original posting after a couple of responses
I will be distributing the VMs myself, inasmuch as I will build them and then install my projects upon them. Therefore, I know the OS and environment completely. I just don't want to "bloat" them with unnecessary software that's been installed that I don't actually need because the compiled executables I will place on the distributed VMs in for example /usr/local/bin ...
That depends on how you link your program to the libraries it depends on. In most cases, the default is to link dynamically, which means that you need to distribute your executable along with its dependencies. You can check which libraries are required to run the file using the ldd command.
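As a rough illustration of the ldd check, the sketch below runs ldd on a binary and turns its output into a name-to-path mapping, so missing libraries stand out. The function names are invented for this example, and the parsing assumes the usual "name => path (address)" line format that glibc's ldd prints; lines without "=>" (the vDSO and the dynamic loader itself) are ignored.

```python
import subprocess


def parse_ldd(output):
    """Map each library name in ldd output to its path (None if not found)."""
    libs = {}
    for line in output.splitlines():
        line = line.strip()
        if "=>" not in line:
            continue  # vDSO, loader, or "statically linked" messages
        name, _, target = line.partition("=>")
        target = target.strip()
        if target.startswith("not found"):
            libs[name.strip()] = None
        else:
            # Drop the trailing load address, e.g. "(0x00007f...)".
            libs[name.strip()] = target.split(" (")[0]
    return libs


def shared_libs(binary_path):
    """Run ldd on binary_path and return the parsed mapping."""
    result = subprocess.run(["ldd", binary_path], capture_output=True, text=True)
    return parse_ldd(result.stdout)
```

Running shared_libs on the build machine and again on the target VM shows which of the listed libraries the target is missing.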
Theoretically, you can link everything statically, which means that the library code is compiled into the executable. The executable would then be truly self-contained, but linking statically is not always possible. This depends on the actual libraries you are using and probably requires playing with ./configure arguments when building them.
Finally, there are some libraries that are always linked dynamically, such as libc. The good thing is that the machine you are distributing to will surely have this library. The bad thing is that versions of these libraries may differ, and you might face an ABI mismatch.
In short, if your project is not huge and it is possible to link everything statically, go that way. If not, read about AppImage and Docker.
The distribution of built libraries and headers (binary distribution) is a possible way and should work. (I do it in my projects always.)
It is not necessary that all of the libraries you built are installed into /usr/lib. To keep your target machine clean you can install them in another folder too, e.g.
/usr/local/MYLIB/lib/libmylib.so
/usr/local/MYLIB/include/mylib.h
/usr/local/MYOTHERLIB/lib/libmyotherlib.so
/usr/local/MYOTHERLIB/include/myotherlib.h
Advantages:
Easy installation, easy remove
All files within one subfolder, no files are missing, no mix with other libs
Disadvantage:
The loader must know the extra search path
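One way to tell the loader about such folders, sketched here in Python for a launcher script, is to prepend them to LD_LIBRARY_PATH before starting the program. The helper name env_with_lib_dirs is hypothetical. Registering the folders system-wide through /etc/ld.so.conf.d and ldconfig, or baking an rpath into the binary at link time, are alternatives that avoid a wrapper.

```python
import os
import subprocess


def env_with_lib_dirs(lib_dirs, base_env=None):
    """Return a copy of the environment with lib_dirs first in LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_LIBRARY_PATH", "")
    parts = list(lib_dirs) + ([existing] if existing else [])
    env["LD_LIBRARY_PATH"] = ":".join(parts)
    return env


def run_with_libs(command, lib_dirs):
    """Start command (a list of arguments) with the extra library folders visible."""
    return subprocess.run(command, env=env_with_lib_dirs(lib_dirs))
```

With the layout above, run_with_libs(["/usr/local/bin/myprog"], ["/usr/local/MYLIB/lib", "/usr/local/MYOTHERLIB/lib"]) would let a program (myprog is a placeholder name) find both libraries.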

Not everyone has the libraries needed for my program to run

So I made this executable program that uses the Windows library and some others (string, ctime, lmcons...) in C++. When it runs on my computer it works great but when I transfer the executable to a computer that does not have some of those libraries on it the program does not run. How do I "add" those libraries in with my code?
1 - You need to identify libraries that need to exist on the system in order to execute your application.
2 - You need to create a package that contains these libraries. It could be an installer or a zip file. Depending on the libraries, sometimes they need to be registered on the system, sometimes just dropped in. If you use install packaging software, you can set up registration (if needed). If you distribute a zip or an ftp folder, you may need to supply a script file. Sometimes libraries are part of some Microsoft package, and this package can be a prerequisite to run your application. You may pack it into your installation and have it installed silently. There are many ways, as you see.
3 - It is up to you how you want to distribute your application and supporting libraries. But it is best when the user doesn't have to jump through hoops trying to install your stuff. The user should click and forget.

PAR Packer Executable Size

I have been using PAR::Packer (pp) to create binaries for Windows for a while. They have always been understandably large (around 6-8 MB), but recently I updated my packages (I use Strawberry Perl on Windows) and now it is producing binaries that are almost 20 MB! I understand it is including an entire Perl environment with all needed modules, but it is getting out of hand, and it is a little embarrassing to hand someone a simple script that is 19 MB! Is there any way to reduce the binary size? Does anyone know why the size has increased in the last couple of releases of PAR::Packer?
When you install new modules in your Perl environment (or even only upgrade some), pp will include a lot of additional files over time even if they are not used by your script. It's because pp is more safe than strict concerning what to include that it picks up a lot of useless dependencies.
The trick is to use two Perl environments. One for development and another one only for building binaries. For the building environment, start from a fresh Perl install, install PAR::Packer and only the modules needed by your application.
With my building environment, I can produce a Tk binary of only 5 MB. With my development environment (which has a lot of junk from CPAN) on the same machine, the same script is 13 MB.
What difference does it make? IMHO, I wouldn't be concerned with size, only with performance
PAR doesn't hide anything from you. If you're curious to know which files get packed, whether they're stripped of POD, and so on, all you have to do is look inside: unzip -d foo foo.exe
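If you only want to see what is taking up the space rather than unpack everything, a short sketch with Python's zipfile module can list the largest entries. This assumes, as the unzip command implies, that the pp-built executable can be read as a zip archive; the function name largest_members is made up for illustration.

```python
import zipfile


def largest_members(archive_path, top=10):
    """Return (uncompressed_size, name) pairs for the biggest archive entries."""
    with zipfile.ZipFile(archive_path) as archive:
        entries = [(info.file_size, info.filename) for info in archive.infolist()]
    return sorted(entries, reverse=True)[:top]


if __name__ == "__main__":
    import sys

    # Usage: python largest_members.py foo.exe
    for size, name in largest_members(sys.argv[1]):
        print(f"{size / 2**20:8.2f} MiB  {name}")
```

Modules you never use in your script showing up near the top of that list are good candidates for leaving out of the build environment.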
FWIW, AFAIK, the size hasn't changed, so you must be using lots and lots of modules.
Typical no-module-print-print-pl yields about 1.6M
Load Moose/Tk and you're at about 5.1M
Load Gtk2/Glib/Pango/Cairo/threads and you're at about 9.7M

How should open source libraries be used on Windows?

There are many open-source libraries that can be compiled with Visual Studio. I'm porting a program from Linux to Windows, but it depends on a number of libraries. I don't know what the best practices regarding libraries are on Windows.
On Linux, these libraries are typically part of the distribution. To use sqlite on Debian, for example, you need only to install libsqlite3-dev and the include files and libraries (both static and dynamic) are automatically installed and available to your program.
If you need a different version than your distribution supplies, you can compile it in your home directory, install it to ~/include and ~/lib, and set the appropriate environment variables so that your compiler includes those directories in its search path.
What is the best way to use libraries that are distributed as source on Windows? If I link dynamically rather than statically, is there an easy way to copy required DLLs into the output directory to ease redistribution (assuming license requirements are met)?
Option 1 - Projects that have binary distributions for windows / do not build in DevStudio.
E.g. OpenSSL.
Projects like OpenSSL are best downloaded to their own folder and built using their own scripts. OpenSSL typically installs itself to C:\OpenSSL on Windows builds, so once that is done, you can add C:\OpenSSL\include and C:\OpenSSL\lib to your project environment to access the OpenSSL headers and libs. The actual DLL files you will need to copy from C:\OpenSSL\bin into your project's staging folder (normally your SolutionDir\Debug or Release).
Once you've gone through the hassle of building OpenSSL once, you don't want to do it again. Or, if you've downloaded the binary distribution, it's best left alone. Just document to others which binary distribution you used so they can set up their Visual Studio build environment appropriately.
Option 2 - Small libraries that are easy to create Visual Studio projects for (or that already have them). Lua and SQLite fall into this category.
For projects that are small enough, it is not inconvenient to simply add them to your solution in a subfolder. This way you can get their outputs built directly to the solution's output folder, and you do not have to bundle pre-built binary files in your solution, making it far easier to share the project with others.
Option 3 - As an alternative, you could create your own standardized folder for the products of open source projects. Create C:\oss\include, C:\oss\lib, C:\oss\bin, etc., add these paths to DevStudio's lib and include paths, and add C:\oss\bin to the system's PATH variable. As you build each OSS project, copy the appropriate files to these locations.
Again, while convenient, this setup makes it difficult to replicate the build environment on a second PC, so you might want to keep the entire C:\oss tree in source control as well.
On Windows, the easiest way is to build your own DLLs and include them in the program directory.
Yes, it uses a bit more space, but hard drives are large these days, and it avoids a lot of headaches from incompatible versions (DLL hell). Windows also suffers a few more wrinkles with versions of libs built with different compilers, so shipping your own builds is safest.

How does an OS X installer package calculate required space?

I'm building an OS X Installer package for a product. When it is run, the 'Select a Destination' pane has an 'Installing this software requires X MB of space' label. But I can run the same package twice on the same machine and see the claimed usage vary from, e.g., 85 to 127 MB, neither of which is the actual ~65 MB usage of the product.
How does Installer calculate required space?
The installer .pkg file contains several components:
the archive of files to install
a bill of materials (metadata listing all the installable files)
resources for the installation itself (images, scripts, etc)
an Info.plist containing version information and defaults
The bill of materials, or "BOM", contains information such as permissions, file sizes, checksum, and so on. When the installer runs for a package the very first time, the total of the file sizes listed in the BOM is used to estimate the required size. (If there are any shared components, this will obviously affect the total.)
After an installation is complete, the BOM is saved in the package receipts folder (/Library/Receipts/boms) as a record of what was installed. The lsbom utility can be used to inspect the contents of these files.
On subsequent installations of the same package (as determined by the package identifier), the BOM receipts are consulted to determine what files are already installed, and their total size. The existing unchanged files are totalled and subtracted from the new files to be installed, while updated files that need to replace older files are taken into account too. The pkgutil tool can be used to display information about installed packages.
So this is why the installation size estimate can vary across installations. New and existing files add to the total, while existing unchanged files subtract from the installation requirements.
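To illustrate the arithmetic described in this answer, here is a toy Python model. It is not Installer's actual algorithm, only a sketch of the description above: each BOM is represented as a hypothetical dictionary mapping a path to a (size, checksum) pair, and files whose receipt entry is identical are treated as already installed and left out of the estimate.

```python
def estimate_required_bytes(new_bom, receipt_bom=None):
    """Toy estimate: bytes of new or changed files, skipping unchanged ones."""
    receipt_bom = receipt_bom or {}
    required = 0
    for path, entry in new_bom.items():
        if receipt_bom.get(path) == entry:
            continue  # already installed and unchanged: costs nothing extra
        size, _checksum = entry
        required += size  # new file, or an updated file replacing an older one
    return required


if __name__ == "__main__":
    package = {"/Applications/App/bin": (100, "aaa"), "/Library/App/data": (50, "bbb")}
    print(estimate_required_bytes(package))                     # first install
    print(estimate_required_bytes(package, {"/Library/App/data": (50, "bbb")}))
```

Even in this simplified form, the estimate depends on what the receipts say is already on disk, which is the reason the number can change between runs of the same package.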
Could the installer be including any other files (Frameworks, StartupItems, Drivers, etc) that your program uses in the file size? If so, then the changes in sizes you are experiencing may be due to you not having those files at one point, and having them at another?
Of course, I could be wrong =]
I may be wrong, but I'm guessing it's an approximation set by you, the developer. You would put something like "Installing this software requires 120 MB of space."
I know that when I install a product on my Mac, I see what it says it will take, and I see what is currently available; however, I NEVER go in and actually check that the software used EXACTLY what it said it would, especially if it's only about 50 MB.
