PAR Packer Executable Size

PAR Packer Executable Size - windows

I have been using PAR::Packer (pp) to create binaries for Windows for a while. They have always been understandably large (around 6-8 MB), but recently I updated my packages (I use Strawberry Perl on Windows) and now it is producing binaries that are almost 20 MB! I understand it is including an entire Perl environment with all needed modules, but it is getting out of hand, and it is a little embarrassing to hand someone a simple script that is 19 MB! Is there any way to reduce the binary size? Does anyone know why the size has increased in the last couple of releases of PAR::Packer?

When you install new modules in your Perl environment (or even only upgrade some), pp will include a lot of additional files over time even if they are not used by your script. pp errs on the side of safety rather than strictness about what to include, so it picks up a lot of useless dependencies.
The trick is to use two Perl environments: one for development and another one only for building binaries. For the building environment, start from a fresh Perl install, then install PAR::Packer and only the modules needed by your application.
With my building environment, I can produce a Tk binary of only 5 MB. With my development environment (which has a lot of junk from CPAN) on the same machine, the same script is 13 MB.

What difference does it make? IMHO, I wouldn't be concerned with size, only with performance
PAR doesn't hide anything from you. If you're curious to know which files get packed, whether they're stripped of POD, etc., all you have to do is look inside: unzip -d foo foo.exe
FWIW, AFAIK, the size hasn't changed, so you must be using lots and lots of modules.
Typical no-module-print-print-pl yields about 1.6M
Load Moose/Tk and you're at about 5.1M
Load Gtk2/Glib/Pango/Cairo/threads and you're at about 9.7M
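
Since the unzip tip above implies that a pp-built executable can be read as a zip archive, a short script can show which packed files take up the most space. This is only a sketch under that assumption; the function name and the file name foo.exe are illustrative.

```python
import zipfile


def largest_entries(archive_path, count=10):
    """Return the `count` largest entries as (uncompressed_size, name) pairs."""
    with zipfile.ZipFile(archive_path) as archive:
        entries = [(info.file_size, info.filename) for info in archive.infolist()]
    # Biggest files first, so unexpected modules stand out immediately.
    return sorted(entries, reverse=True)[:count]


if __name__ == "__main__":
    import sys

    # Usage: python largest_entries.py foo.exe
    for size, name in largest_entries(sys.argv[1]):
        print(f"{size:>12}  {name}")
```

Comparing this listing between a small and a large build of the same script is one way to spot modules that were pulled in without being needed.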

Related

Recommended tool to automate complicated build procedure

I am developing an OS for embedded devices that runs bytecode. Basically, a micro JVM.
In the process of doing so, I am able to compile and run Java applications to bytecode(ish) and flash that on, for instance, an Atmega1284P.
Now I've added support for C applications: I compile and process it using several tools and with some manual editing I eventually get bytecode that runs on my OS.
The process is very cumbersome and heavy and I would like to automate it.
Currently, I am using makefiles for automatic compilation and flashing of the Java applications & OS to devices.
All steps, roughly, for a C application are as follows and consist of consecutive manual steps:
(1) Use Docker to run a Linux container with lljvm that compiles a .c file to a .class file (see also https://github.com/davidar/lljvm/tree/master)
(2) convert this c.class file to a jasmin file (https://github.com/davidar/jasmin) using the ClassFileAnalyzer tool (http://classfileanalyzer.javaseiten.de/)
(3) manually edit this jasmin file in a text editor by replacing/adjusting some strings
(4) convert the modified jasmin file to a .class file again using jasmin
(5) put this .class file in a folder where the rest of my makefiles (the ones that already make and deploy the OS and class files from Java apps) can take over.
Current options seem to be to just keep using makefiles, but this is a bit unwieldy (I already have 5 different makefiles and this would further extend that chain). I've also read a bit about SCons. In essence, I'm wondering which tools or approaches are recommended for complicated builds.

Hopefully this may help a bit, but the question as such could probably be a subject for a heated discussion without much helpful results.
As pointed out in the comments by others, you really need to automate the steps starting with your .c file up to the point where you can integrate it with the rest of your system.
There is generally nothing wrong with make, and you would not gain much by switching to SCons. You'd get more ways to express what you want to do: if you wanted to write that automation directly inside the build system and its rules, you could use Python and not only shell (though if that is a concern, you could just as well call that Python code from make). But the essence of target, prerequisite, and recipe is still there, and so is the need to write the automation for those .c-to-integration steps.
If you really wanted to look into alternative options, Bazel might be of interest to you. The downside is that the initial effort of writing the necessary rules to fit your needs could be costly and, depending on the size of your project, might just be too much. On the other hand, once that is done, it would be very easy to use (applying those rules to a growing code base), and you could also ditch the container and rely on its more lightweight sandboxing and external rules to get the tools and bits you need for your build, all with a single system for build description.
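
As an illustration of calling Python from make, the manual editing in step (3) of the question could be turned into a small script that a makefile recipe runs between the two conversion steps. This is a minimal sketch: the actual strings to replace are not given in the question, so the rule pair below is purely hypothetical.

```python
import sys

# Hypothetical (old, new) pairs; the real replacements depend on the project.
RULES = [
    ("c/", "app/"),
]


def apply_replacements(text, rules):
    """Apply each (old, new) pair in order and return the edited text."""
    for old, new in rules:
        text = text.replace(old, new)
    return text


def rewrite_file(source_path, target_path, rules):
    """Read a jasmin file, apply the rules, and write the result."""
    with open(source_path, encoding="utf-8") as handle:
        text = handle.read()
    with open(target_path, "w", encoding="utf-8") as handle:
        handle.write(apply_replacements(text, rules))


if __name__ == "__main__":
    # Usage from a recipe: python fix_jasmin.py in.j out.j
    rewrite_file(sys.argv[1], sys.argv[2], RULES)
```

Because the script reads one file and writes another, it fits the usual target/prerequisite shape of a make rule without further changes.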

What is the current full install size of Cygwin?

Every source I found online says a full installation of Cygwin takes over 1 GB, but mine is only 100 MB. I was pretty sure I downloaded everything from the mirror servers, but the install took less than 5 minutes to complete instead of hours, as I'd expect if it were installing gigabytes of software.
Did Cygwin get a huge clean-up during 2012~2013, or did I do something wrong in the installation?

A full Cygwin installation can range from 23 to 112 GiB, depending on how you define "full."
Your 100 MB number tells me that you just clicked through the defaults presented by Cygwin's setup-*.exe program, selecting no optional packages, because that installs only the Base package set, which currently amounts to 0.1 GiB. Cygwin follows the modern net-connected software distribution model: it assumes you can just run setup-*.exe again and select new packages as you need them.
The Cygwin maintainers try to keep the Base category's package set as small as practical.¹ A Cygwin Base install gives you something much like an old-style Unix installation, covering little more than what POSIX specifies.
So How Do You Get a Full Installation?
The Cygwin installer does not have an obvious way to get a "full" installation, on purpose, because no one needs literally every package in the Cygwin repository.²
There is, however, a sneaky way to install everything. At the Select Packages screen, switch to Category view, then click the "Default" text to the right of the "All" group header. It will change to "Install," as will the corresponding text in all of the groups underneath it. This marks everything for installation.
I include this tip for completeness only. You do not want to do this! It will install gigs and gigs of stuff you will never use. Currently, there are 11242 packages in Cygwin,³ and installing every last one of them took 93 GiB of disk space for the installation tree plus 19 GiB for the download tree⁴ the last time I tried it. That gives the 112 GiB upper limit above.
All that unused software carries several costs. Even if disk space, download time, installation time, rebasing time, and local bandwidth use are of no consequence to you, consider the generous free mirror's wasted bandwidth.
I've done the experiment, so now you don't have to.
An Intelligent "Complete" Installation
I've come up with a simple set of package exclusion rules that results in a much smaller installation:
Skip all of the -debuginfo packages. Few people need these, and they take up a lot of space. Savings: about 53 GiB in the installation tree alone; more in the download tree.
It's easy to apply this rule. After selecting all packages for installation with the sneaky trick above but before you move on to the next screen, click the "Install" text next to the "Debug" category header until it switches back to "Default."
If you've already installed the debug packages, click that text until it says "Uninstall" instead.
Do not explicitly install any of the lib* packages. Let Cygwin's setup-*.exe automatically install libraries to satisfy package dependencies. Savings: about 5 GiB ⁵
To apply this rule, switch the "Libs" category to "Default" or "Uninstall" as you did with the "Debug" category. The installer will figure out which libraries you actually need in a later step.
Skip the cross-compilers and associated packages. Again, few people need these.⁶ Savings: About 4 GiB
There are two major sets of cross-development tools in Cygwin: the set for creating Cygwin executables of the other word size (i.e. 64-bit tools and libraries for 32-bit Cygwin, or vice versa) and the set for building MinGW executables of the same word size as your Cygwin installation.
To apply this rule for a 64-bit Cygwin installation, while still on the "Select Packages" screen, type cygwin32- in the package name search box at the top of that screen, then click the Default text next to each top-level category until it cycles to Default or Uninstall, as above.
Repeat that for mingw64-.
The idea is the same for 32-bit Cygwin, except that you search for and exclude packages with cygwin64- and mingw32- in their names instead.
By following this rule set, I was able to install nearly everything, taking only about 23 GiB.
Paring That Down
We can get the installation to be even smaller by excluding several other notorious disk hogs:
X11, the desktop environments, and the GUI apps together require about 11 GiB.⁷
A Cygwin Base + Devel installation comes to about 10 GiB.
A Cygwin Base + TeX category installation takes about 5 GiB. If you install only your native language's support package, it comes to about 3.7 GiB instead.
All of the -doc packages combined chew up about 5 GiB of disk space.
Someone who isn't a software developer, who doesn't use Cygwin for GUI stuff, who uses the Web for docs, and who doesn't use TeX for document creation could thus have a "full" Cygwin installation in only about 1 GiB.⁸
If you use Cygwin the way it is intended to be used, installing the base and only the extra packages you need at the moment, you probably won't even get your installation size to even those levels. I use Cygwin this way, and my installations are typically well under 1 GiB, yet they are "complete" by my lights, since they meet my current needs.
For Extra Disk-Filling Fun...
All of this testing was done with the 64-bit version of Cygwin. You can roughly double the above space requirements by installing a parallel 32-bit Cygwin installation.⁹
Doing so is not pointless. It is a viable alternative to using cross-compilers, for one thing. For another, the fundamental incompatibility of the two Cygwins means you may have need of both.
Footnotes
There are occasional threads on the Cygwin mailing list where someone argues that some very common package should be included in the Base category, such as Perl, and the result is usually that the maintainers decide not to add it to Base.
You also occasionally see the opposite: some package accidentally slips into Base, usually via an incorrect dependency. Shortly after the problem is brought to the attention of the Cygwin maintainers, the status quo ante is restored. (example)
Perhaps I am wrong.
There could be a vision-impaired Czech immigrant musician who completes US government software development contracts on the side while not busy brushing up on his technical Hindi vocabulary by translating electrical engineering reports into his adopted Mandarin.
I want to meet him. He sounds like an interesting guy.
Plus, I think I can help him with his plan to create a Tcl/Tk GUI front end for Orpie. Naturally, I will try to talk him into porting it from Ocaml to C++/Qt.
I mean, Tcl/Tk, seriously? In 2018? Let's not be ridiculous.
curl -s https://cygwin.com/packages/package_list.html | grep -c x86_64/
Cygwin's setup-*.exe doesn't delete the downloaded package files after installing them. This is useful at sites where you have multiple Cygwin installations, since you can put the download directory on a shared network drive. Each package then only has to be downloaded once at that site.
My 112 GiB upper limit assumes you will download and install to the same drive, and that you will keep the download tree in case you need to reinstall it later.
Not only does setup-*.exe not delete downloaded package files after installing them, it doesn't auto-purge old versions, so your download tree grows and grows over the years you use Cygwin. (There are scripts floating about the net to solve this problem, such as this one.)
All data storage values given in this answer are apparent disk usage numbers — du -bhs — rather than actual disk usage numbers, which would account for the file system overhead, since that varies between systems. This affects the installation tree to a much greater degree than it does the download tree since the proportion of small files is much greater in the installation tree. Expect something like +1% in the download tree and +5% in the installation tree.
You may wonder why there are libraries in the Cygwin package repository that you don't need even though you've installed "all" packages. There are several reasons:
some libraries are obsolete, but are still present on the mirrors
some libraries come in multiple alternative forms, so that people who know they need something other than the default can choose it
some libraries are there only for people writing their own programs, not to support any existing Cygwin package
Pretty much the only people who need the Cygwin cross-compilers are the people maintaining Cygwin packages, since maintainers are expected to build for both 32-bit and 64-bit Cygwin unless there is a good reason not to.
There are probably more people with a good justification for MinGW cross-development tools, but there's also the option of using MinGW and MSYS instead of Cygwin. Also, I am guessing that the number of people who do dual-stack Cygwin + MinGW development is smaller than the set of people who use one or the other exclusively, or nearly so.
It is not easy to do this exclusion, because GUI packages are scattered throughout the Cygwin packaging system, and their top-level category often contains non-GUI software, so you can't simply exclude the whole category. (e.g. Math.)
The first pass is to exclude the X11, GNOME, KDE, LXDE, MATE, Xfce, and Games categories using the same Categories view technique as above.
Then, using the search box as we did for the cross-compiler exclusion above, remove packages matching gtk, gnome, qt, and kde, optionally excluding those in the Devel, Debug or Libs packages, if you need those.
Finally, you'll have to switch to the Pending view and manually exclude a bunch of packages that weren't caught by either of those two broad exclusions: Abiword, Calligra, Celestia, Dia, Evince, Geany, gEdit, Geomview, Gimp, gLabels, GnuCash, Gnumeric, Kalzium, KMyMoney, KolourPaint, Konqueror, Krita, KStars, LyX, Marble, Pidgin, QupZilla, Scribus, Skrooge, Spectacle, Stellarium, Tellico, and Vinagre. If you don't exclude these, they'll drag back in much of what you excluded above as dependencies!
You may notice that the size of all the excluded package sets adds up to more than the 23 GiB of the "intelligent complete" installation. This is because of shared dependencies. That is to say, these sizes overlap to some extent.
If you put both setup-*.exe programs in the same directory, they will share the download tree so that all of the noarch packages are downloaded only once for both Cygwins. Between that and the fact that 32-bit and 64-bit Intel compilers generate code that differs in size, installing both Cygwins doesn't quite double the disk space requirement.
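
For readers without du at hand, the apparent-size figure described in the footnotes can be approximated by summing file sizes over a directory tree. This is a rough sketch, not a du replacement: it counts only regular file bytes, so its totals will differ slightly from du -b, which also counts directory entries.

```python
import os


def apparent_size(root):
    """Sum the byte sizes of all regular files under `root`, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.lstat(path).st_size
    return total


if __name__ == "__main__":
    import sys

    # Report in GiB (2**30 bytes), the unit used throughout the answer above.
    print(f"{apparent_size(sys.argv[1]) / 2**30:.1f} GiB")
```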

When installing packages for the first time, setup*.exe does not install every package. Only the minimal base packages from the Cygwin distribution are installed by default.
Clicking on categories and packages in the setup*.exe package installation screen will provide you with the ability to control what is installed or updated.
Clicking on the "Default" field next to the "All" category will provide you with the opportunity to install every Cygwin package.
Installing and Updating Cygwin Packages

I just did a full installation of Cygwin x64 (2018-01-03) on Windows 7 SP1 x64 Ultimate:
Cygwin folder (which by default is C:\cygwin64) has the following properties:
Size: 89.1 GiB (95,671,245,179 bytes)
Size on disk: 92.3 GiB (99,148,398,592 bytes)
Contains: 1,433,494 Files, 94,363 Folders
Cygwin temporary folder where downloaded packages are stored prior to being installed (can be removed after Cygwin is installed):
Size: 18.0 GiB (19,363,639,932 bytes)
Size on disk: 18.0 GiB (19,383,767,040 bytes)
Contains: 9,711 Files, 10,354 Folders

I would like to add to this thread. This approach gives you a leaner, meaner, bare-bones, minimal Cygwin install, with just the tools/items you need. No dependency bloat, no unwanted packages, files etc.
I have been experimenting with Cygwin attempting to get a "bare-bones", minimal install. I do find that installing utilities like grep, gawk, sed and similar tools has dependencies on cygwin, base-Cygwin and sometimes unwanted tools like bash, coreutils etc.
I wanted to get only the tools and their required dlls installed and started examining the Cygwin package. I discovered that not using the setup.exe supplied by Cygwin is an alternative way to accomplish minimal Cygwin installs.
And this is how I got it done:
Download only the packages you want from any of the Cygwin mirror sites using ftp or http. Alternatively, you can use the setup.exe supplied by Cygwin to download all the packages (download only, no install).
Once the download has completed successfully, individual packages like zlib, gawk, grep, and libiconv are found under the x86/release or x86_64/release directory. Each package is tarred and compressed using xz or bzip2 and stored in its respective directory.
To install a specific tool like sed or gawk, all that needs to be done is to extract the tool executable and its dependencies (.dll).
Before you attempt the following, please ensure you have a tool like 7z.exe, xz.exe, bzip2 or other that is capable of uncompressing an .xz or bzip archive
Installing gawk example below :
Extract gawk.exe from gawk-4.1.3-1.tar.xz archive using the command 7z.exe e -so gawk-4.1.3-1.tar.xz | tar xvf -
Once that is done, you should find gawk.exe in a subfolder usually, usr/bin under the release/gawk folder
Find the dependencies for gawk, you can do this in a couple of ways.
Examine the Cygwin setup.ini file found in x86 or x86_64 folder.
Look for the string '# gawk' and in the lines after this line you should find a "requires:" line that lists the dependencies.
Mine reads like this - "requires: bash cygwin libgmp10 libintl8 libmpfr4 libreadline7"
For gawk to run, bash is not a must since we have the Windows command shell. (bash is included to get a few other dlls required by gawk. However, that causes a lot more unnecessary files to be installed.) The other dependencies contain files that gawk needs to run.
Extract each of the above packages using tools like 7z or xz into individual files.
After all the dependencies are extracted, copy your needed tool(s) (grep/sed/gawk) to a folder and all the dependent .dlls
You should now be able to run your tool with the minimum set of .dlls required in a bare-bones cygwin installation.
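
The extract-and-copy steps above can also be scripted. The sketch below pulls only the .exe and .dll files out of a downloaded package archive into one flat folder. It assumes the archive is a tar file compressed with xz or bzip2, as described above; the function name and the choice to flatten paths are illustrative.

```python
import os
import tarfile


def extract_binaries(archive_path, dest_dir, suffixes=(".exe", ".dll")):
    """Copy files ending in one of `suffixes` from the archive into dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    extracted = []
    # "r:*" lets tarfile detect the compression (xz, bzip2, gzip) by itself.
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar:
            name = os.path.basename(member.name)
            if member.isfile() and name.lower().endswith(suffixes):
                source = tar.extractfile(member)
                with source, open(os.path.join(dest_dir, name), "wb") as out:
                    out.write(source.read())
                extracted.append(name)
    return sorted(extracted)
```

Calling it once per package (gawk, then each dependency) with the same destination folder collects the tool and its dlls in one place.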
Caution : It may not be sufficient to just extract the dependencies listed in setup.ini for each tool. Sometimes, you may need to execute/run the tool to discover that there are more dlls required.
There are other means of finding out the dlls required by an .exe - you can use dumpbin from MS or dependency walker, ndepends or similar tools to find the list of dependent dlls
Consult - How do I detect the DLLs required by an application?
How do I find out which dlls an executable will load?
I also brute forced this dll info by just running the tool and installing the missing dlls listed one by one by extracting from the required packages.
When you run a tool and it errors out with a missing .dll message, search for the package that contains the dll here - https://cygwin.com/cgi-bin2/package-grep.cgi . Enter the full/partial name of the missing dll to find the name of the package containing the dll.
Eventually, I have ended up with a bare-bones cygwin install with only the tools and dlls that I need.
Example : gawk - gawk.exe and the following dlls - cygwin1.dll, cyggmp-10.dll, cygiconv-2.dll, cygintl-8.dll, cygmpfr-4.dll, cyggcc_s-seh-1.dll, cygncursesw-10.dll, cygreadline7.dll
sed - sed.exe and dlls - cygwin1.dll, cygintl-8.dll
Hope this is found useful. The Cygwin installer also does dll re-basing, which I will not venture into here.
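
Looking up the "requires:" line in setup.ini, as described earlier in this answer, can be automated as well. The sketch below assumes only what is described above: a header line naming the package, followed later by a "requires:" line. Because the exact header marker is an assumption, it accepts either "@" or "#" in front of the package name.

```python
def requires_for(ini_text, package):
    """Return the packages listed on the 'requires:' line of `package`."""
    in_section = False
    for line in ini_text.splitlines():
        stripped = line.strip()
        # A header line is a marker character followed by a package name.
        if stripped[:1] in ("@", "#") and stripped[1:].strip():
            in_section = stripped[1:].strip() == package
        elif in_section and stripped.startswith("requires:"):
            return stripped[len("requires:"):].split()
    return []


if __name__ == "__main__":
    import sys

    # Usage: python requires_for.py setup.ini gawk
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        print(" ".join(requires_for(handle.read(), sys.argv[2])))
```

As the caution above notes, the listed packages may still not cover every dll a tool loads at run time.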

Under the devel you may need only the following:
gcc-core: GNU Compiler Collection (C, OpenMP)
gcc-fortran: GNU Compiler Collection (Fortran)
gcc-g++: GNU Compiler Collection (C++)
gcc-objc: GNU Compiler Collection (Objective-C)
gdb: The GNU Debugger
make: The GNU version of the "make" utility

Make, install, executing a program

I have been a CS student for a while and it seems like I (and many of my friends) never understood what's happening behind the scenes when it comes to make, install, etc.
Correct me if I'm wrong, but is make a way to compile a set of files?
What does it mean to "install a program on a computer", like on Windows? When I am coding in different languages such as Java or Perl, we don't install what we wrote. We compile it (or, for an interpreted language, skip that) and just run it. So why do programs such as Skype need to be "installed"?
Can anyone clarify this? I feel like this is something I need to know as a programmer.

Make is a build system
Make is a build system which is simply a way to script the steps needed to compile a program. Make specifically can be used with anything, but is usually used to compile C or C++ programs. It simplifies and creates a standard way for programmers to script the preparation of their program, so that it can be built and installed with ease
Why a build system
You see, if your program is a simple one-source-file program, then using make might be overkill, as compiling the simplest C program is as simple as
gcc simpleprogram.c -o simpleprogram.out
However, as the size of the software grows, the complexity of it grows, and the complexity of how it needs to be built grows. For example, you may want to determine which version of each library is installed in the computer which you are compiling in, you may want to run some tests after compiling your program to determine it is working correctly, or you may want to automatically download some dependencies your program has.
Most software builds need a mixture of these tasks eventually. So, instead of reinventing the wheel, they use a build system which allows scripting this. If you are familiar with Java (which you mentioned), a build system comparable to make but used in the Java world is Apache Ant.
Why install
Well, let's assume that you used the "make" command but not "make install". The "make" command is usually used just to prepare the program for compilation and then compile it. However, once your program is compiled, all you have is an executable in the directory in which you compiled the program. The program, its documentation, and its configuration files haven't been put in the appropriate directories needed for all users to use it. That's what "make install" is for. It takes all the files associated with the program you just compiled and puts them in the appropriate directories, so that the program becomes available to everyone and each component is in the directory your operating system expects.

make is a bit of software that reduces the amount of code that needs to be compiled - it compares modification times of the source code with the target. If the code has changed a compile is done to construct the target otherwise you can skip that step.
Installing software is placing the executables/configuration files into the right places - perhaps constructing some files along the way. E.g. usernames in your skype example
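
The modification-time comparison described above can be written down in a few lines. This is a simplified sketch of the idea, not how make itself is implemented; the function name is illustrative.

```python
import os


def needs_rebuild(target, sources):
    """Return True if `target` is missing or older than any of its sources."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    # A single newer source is enough to make the target out of date.
    return any(os.path.getmtime(src) > target_mtime for src in sources)
```

For example, needs_rebuild("simpleprogram.out", ["simpleprogram.c"]) would be true right after editing the source file and false right after compiling it.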

Stop XCode from embedding build time in executables

XCode4 is putting the build time in executables it creates. When I build the same code twice, the binaries will differ by a few bytes belonging to a Unix timestamp.
Is there a way to prevent this from happening?
(I'm running expensive tests and benchmarks after each build and cache results based on hash of executables, but ever-changing executables broke my cache and pollute benchmark results with duplicates).
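
To make the caching problem concrete, here is a minimal sketch of a cache key derived from an executable's contents. The choice of SHA-256 and the function name are assumptions; the question does not say which hash is used. Any embedded timestamp changes the digest, so two otherwise identical builds get different keys.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large binaries need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```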

I've worked around this by switching to building project myself "old skool" way with Makefiles & gcc.

What are the other uses of the "make" command?

A sysadmin teacher told me one day that I should learn to use "make" because I could use it for a lot of other things than just triggering compilations.
I never got the chance to talk longer about it. Do you have any good example ?
As a bonus, isn't it this tool deprecated, and what are modern alternatives (for the compilation purpose and others) ?

One excellent thing make can be used for besides compilation is LaTeX. If you're doing any serious work with LaTeX, you'll find make very handy because of the need to re-process .tex files several times when using BibTeX or tables of contents.
Make is definitely not deprecated. Although there are different ways of doing the same thing (batch files on Windows, shell scripts on Linux) make works the best, IMHO.

Make can be used to execute any commands you want to execute. It is best used for activities that require dependency checking, but there is no reason you couldn't use make to check your e-mail, reboot your servers, make backups, or anything else.
Ant, NAnt, and msbuild are supposedly the modern alternatives, but plain-old-make is still used extensively in environments that don't use Java or .NET.

isn't it this tool deprecated
What?! No, not even slightly. I'm on Linux so I realise I'm not an average person, but I use it almost daily. I'm sure there are thousands of Linux devs who do use it daily.

I remember seeing an article on Slashdot a few years ago describing a technique for optimising Linux boot sequence by using make.
edit:
Here's an article from IBM explaining the principle.

Make performs a topological sort, which is to say that given a bunch of things, and a set of requirements that one thing be before another thing, it finds a way to order all of the things so that all of the requirements are met. Building things (programs, documents, distribution tarballs, etc.) is one common use for topological sorting, but there are others. You can create a Makefile with one entry for every server in your data center, including dependencies between servers (NFS, NIS, DNS, etc.) and make can tell you what order in which to turn on your computers after a power outage, or what order to turn them off in before a power outage. You can use it to figure out what order in which to start services on a single server. You can use it to figure out what order to put your clothes on in the morning. Any problem where you need to find an order of a bunch of things or tasks that satisfies a bunch of specific requirements of the form A goes before B is a potential candidate for being solved with make.
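
To show the same idea outside of make, here is a small sketch using the topological sorter from Python's standard library (graphlib, available from Python 3.9). The server names and their dependencies are made up for illustration.

```python
from graphlib import TopologicalSorter


def start_order(dependencies):
    """Order items so that each one comes after everything it depends on.

    `dependencies` maps an item to the set of items that must come first.
    """
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    # Hypothetical data center: web needs NFS and NIS, which both need DNS.
    servers = {
        "nfs": {"dns"},
        "nis": {"dns"},
        "web": {"nfs", "nis"},
    }
    print(start_order(servers))
```

Reversing the resulting list gives a valid shutdown order, which matches the power-outage example above.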

The most random use I've ever seen is make being used in place of bash for init scripts on BCCD. It actually worked decently, once you got over the wtf moment....
Think of make as shell scripts with added oomph.

Well, I'm sure that the UNIX tool "make" is still being used a lot, even if it's waning in the .NET world. And while more people may be using MSBuild, Ant, NAnt, and other tools these days, they are essentially just "make" with a different file syntax. The basic concept is the same.
Make tools are handy for anything where there's an input file which is processed into an output file. Write your reports in MSWord, but distribute them as PDFs? -- use make to generate the PDFs.

Configuration file changes through crontab, if needed.
I have examples for postfix maps, and for squid external tables.
Example for /etc/postfix/Makefile:
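# Paths to the Postfix tools
POSTMAP=/usr/sbin/postmap
POSTFIX=/usr/sbin/postfix
# Source map files, grouped by the kind of database built from them
HASHES=transport access virtual canonical relocated annoying_senders
BTREES=clients_welcome
# Append .db to each name to get the target files
HASHES_DB=${HASHES:=.db}
BTREES_DB=${BTREES:=.db}

all: ${BTREES_DB} ${HASHES_DB} aliases.db
	echo \= Done

# Static pattern rules: a .db file is rebuilt only when its source map is newer
${HASHES_DB}: %.db: %
	echo . Rebuilding $< hash...
	${POSTMAP} $<

${BTREES_DB}: %.db: %
	echo . Rebuilding $< btree...
	${POSTMAP} $<

aliases.db: aliases
	echo . Rebuilding aliases...
	/usr/bin/newaliases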
etc
