7z: get total size of uncompressed contents? - 7zip

How might I get the size of the contents of a zip/rar/7z file after full extraction, under both Windows and Linux? I thought about using the 7z l filename command, but I don't like the idea of a filename interfering with the code counting the size of each file.

The command-line version of 7-Zip, 7z.exe, can print a list of files and their sizes, both compressed and uncompressed. This is done using the l flag.
Use this command:
7z.exe l path\folder.zip
You can also include a wildcard, which gives the overall total for all archives in a folder:
7z.exe l path\*.zip

This doesn't solve your filename problem, but it may be good enough for others and myself (in Cygwin):
/cygdrive/c/Program\ Files/7-Zip/7z.exe l filename.zip | grep -e "files,.*folders" | awk '{printf $1}'
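If you want to keep filenames out of the parsing entirely, a sketch using 7z's technical listing (-slt), which prints one "Size = " line per archive entry and works the same with p7zip on Linux:
7z l -slt filename.zip | awk -F' = ' '/^Size = /{sum+=$2} END{print sum}'
The result is the total uncompressed size in bytes.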

Using PowerShell:
& 'C:\Program Files\7-Zip\7z.exe' l '.\folder-with-zips\*.zip' | select-string -pattern "1 files" | Foreach {$size=[long]"$(($_ -split '\s+',4)[2])";$total+=$size};"$([math]::Round($total/1024/1024/1024,2))" + " GB"
Or, if you want to watch it total things up in real time:
& 'C:\Program Files\7-Zip\7z.exe' l '.\folder-with-zips\*.zip' | select-string -pattern "1 files" | Foreach {$size=[long]"$(($_ -split '\s+',4)[2])";$total+=$size;"$([math]::Round($total/1024/1024/1024,2))" + " GB"}
If you run it multiple times, don't forget to reset your "total" variable between runs, otherwise it will just keep adding up:
Remove-Variable total

Related

Terminal: find file with latest patch number

I have a folder with a lot of patch files following the pattern
1.1.hotfix1
1.2.hotfix2
2.1.hotfix1
2.1.hotfix2 ...etc
and I have to find the latest patch (2.1.hotfix2 should be the result for this example) with a bash script.
How can I achieve it?
Sort all files by time, newest first, and print the first line.
In case you have some other files as well, you can restrict it to the files containing the hotfix text:
ls -t1 *hotfix* | head -n 1
You can use find with regex, and take the last line from sort:
find * -type f -regex "[0-9]+\.[0-9]+\.hotfix[0-9]+" | sort | tail -1
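Note that a plain sort is lexicographic, so multi-digit hotfix numbers (a hypothetical 2.1.hotfix10, say) would sort before 2.1.hotfix2; if those can occur, GNU sort's -V (version sort) orders them correctly:
printf '%s\n' 2.1.hotfix2 2.1.hotfix10 | sort      # puts 2.1.hotfix10 first (wrong)
printf '%s\n' 2.1.hotfix2 2.1.hotfix10 | sort -V   # puts 2.1.hotfix10 last (right)
find * -type f -regex "[0-9]+\.[0-9]+\.hotfix[0-9]+" | sort -V | tail -1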

Count and remove extraneous files (bash)

I am getting stuck on finding a succinct solution to the following.
In a given directory, I have the following files:
10_MIDAP.nii.gz
12_MIDAP.nii.gz
14_MIDAP.nii.gz
16_restAP.nii.gz
18_restAP.nii.gz
I am only supposed to have two "MIDAP" files and one "restAP" file. The additional files may not contain the full data, so I need to remove them. These are likely going to be smaller in size and/or have an earlier sequence number (e.g., 10).
I know how to count / echo the number of files:
MIDAP=$(find "$DATADIR" -name "*MIDAP.nii.gz" | wc -l)
RestAP=$(find "$DATADIR" -name "*restAP.nii.gz" | wc -l)
echo "MIDAP files = $MIDAP"
echo "RestAP files = $RestAP"
Any suggestions on how to succinctly remove the unneeded files, so that I end up with two "MIDAP" files and one "restAP" file (in cases where there are extraneous files)? As of now, I imagine it would be something like this...
if (( $MIDAP > 2 )); then
...magic happens
fi
Thanks for any advice!
Here is an approach.
Create some test files:
$ for i in {1..10}; do touch ${i}_restAP; touch ${i}_MIDAP; done
Sort based on the numbers and remove all but the last file (or the last two):
$ find . -name '*restAP*' | sort -V | head -n -1 | xargs rm
$ find . -name '*MIDAP*' | sort -V | head -n -2 | xargs rm
$ ls -1
10_MIDAP
10_restAP
9_MIDAP
You may want to change the sort if you base it on file size instead, as sketched below.
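For instance, a sketch of the size-based variant using GNU find's -printf to put the size first (same keep-counts as above; the smaller files are the ones removed):
$ find . -name '*restAP*' -printf '%s\t%p\n' | sort -n | head -n -1 | cut -f2- | xargs rm
$ find . -name '*MIDAP*' -printf '%s\t%p\n' | sort -n | head -n -2 | cut -f2- | xargs rm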

Remove first n characters from a bunch of file names with cut

I am using
ls | cut -c 5-
This returns a list of the file names in the format I want, but it doesn't actually perform the renaming. Please advise.
rename -n 's/.{5}(.*)/$1/' *
The -n is for simulating; remove it to get the actual result.
You can use the following command when you are in the folder where you want to do the renaming:
rename -n -v 's/^(.{5})//' *
-n is for no action and -v shows what the changes would be. If you are satisfied with the results, you can remove both of them:
rename 's/^(.{5})//' *
Something like this should work:
for x in *; do
echo mv "$x" "$(echo "$x" | cut -c 5-)"
done
Note that this could be destructive, so run it this way first, and then remove the leading echo once you're confident that it does what you want.
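Once the output looks right, the actual run is the same loop without the leading echo:
for x in *; do
mv "$x" "$(echo "$x" | cut -c 5-)"
done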
Kebman's little snippet is nice if you want to cut off the leading dot of hidden files and folders in the current dir before 7-zipping or zipping them. I put this in a bash script, but this is what I mean:
for f in .[!.]*; do mv -v "$f" "${f:1}"; done # cut off the leading dot of hidden files and dirs (.[!.]* avoids matching . and ..)
7z a -pPASSWORD -mx=0 -mhe -t7z ${DESTINATION}.7z ${SOURCE} -x!7z_Compress_not_this_script_itself_*.sh # compress all files and dirs of current dir to one 7z-file, excluding the script itself.
zip and 7z can have trouble with hidden files at top level in the current dir.
Hidden files in the subdirs are accepted.
mydir/myfile = ok
mydir/.myfile = ok
.mydir/myfile = nok
.mydir/.myfile = nok
If you get an error message saying
rename is not recognized as the name of a cmdlet
this might work for you:
get-childitem * | rename-item -newname { [string]($_.name).substring(5) }

Get total size of a list of files in UNIX

I want to run a find command that will find a certain list of files and then iterate through that list of files to run some operations. I also want to find the total size of all the files in that list.
I'd like to make the list of files FIRST, then do the other operations. Is there an easy way I can report just the total size of all the files in the list?
In essence, I am trying to find a one-liner for the 'total_size' variable in the code snippet below:
#!/bin/bash
loc_to_look='/foo/bar/location'
file_list=$(find $loc_to_look -type f -name "*.dat" -size +100M)
total_size=???
echo 'total size of all files is: '$total_size
for file in $file_list; do
# do a bunch of operations
done
You should simply be able to pass $file_list to du:
du -ch $file_list | tail -1 | cut -f 1
du options:
-c display a grand total
-h human-readable sizes (e.g. 17M)
du will print an entry for each file, followed by the total (with -c), so we use tail -1 to trim to only the last line and cut -f 1 to trim that line to only the first column.
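If you want the total as plain bytes for later arithmetic, a sketch using GNU du's -b switch (apparent size in bytes) to fill the original total_size variable:
total_size=$(du -cb $file_list | tail -1 | cut -f 1)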
The methods explained here have a hidden bug: when the file list is long, it exceeds the shell's command-size limit. It is better to use du like this:
find <some_directories> <filters> -print0 | du <options> --files0-from=- --total -s | tail -1
find produces a null-terminated file list, and du takes it from stdin and counts;
this is independent of the shell command-size limit.
Of course, you can add some switches to du to get the logical file size, because by default du tells you how much physical space the files will take.
But I think that is a question for Unix admins rather than programmers, so it is off topic here. :)
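For reference, on GNU du the switch for logical size is --apparent-size, so a sketch of that variant is:
find <some_directories> <filters> -print0 | du --apparent-size --files0-from=- --total -s | tail -1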
This code adds up all the bytes, using trusty ls, for all files (it excludes all directories... apparently those are 8 KB per folder/directory):
cd /; find . -type f -exec ls -s \; | awk '{sum+=$1;} END {print sum/1000;}'
Note: execute as root. The result is in megabytes.
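A sketch of the same idea using GNU find's -printf instead, which avoids forking ls once per file (note that %s counts apparent bytes rather than allocated blocks, hence the division by 1000000):
cd /; find . -type f -printf '%s\n' | awk '{sum+=$1;} END {print sum/1000000;}'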
The problem with du is that it adds up the size of the directory nodes as well. This is an issue when you want to sum up only the file sizes. (By the way, I find it strange that du has no option for ignoring directories.)
In order to add the size of files under the current directory (recursively), I use the following command:
ls -laUR | grep -e "^\-" | tr -s " " | cut -d " " -f5 | awk '{sum+=$1} END {print sum}'
How it works: it lists all the files recursively ("R"), including hidden files ("a"), showing their file size ("l") and without sorting them ("U"). (This can matter when you have many files in the directories.) Then we keep only the lines that start with "-" (these are the regular files, so we ignore directories and other stuff). Then we squeeze the runs of spaces into one, so that the aligned, tabular output of ls becomes a single-space-separated list of fields on each line. Then we cut the 5th field of each line, which holds the file size. The awk script sums these values into the sum variable and prints the result.
ls -l | tr -s ' ' | cut -d ' ' -f <field number> is something I use a lot.
The 5th field is the size. Put that command in a for loop, add the size to an accumulator, and you'll get the total size of all the files in a directory. It's easier than learning AWK. Plus, in the command-substitution part, you can grep to limit what you're looking for (^- for regular files, and so on), as sketched after the loop below.
total=0
for size in $(ls -l | tr -s ' ' | cut -d ' ' -f 5) ; do
total=$(( ${total} + ${size} ))
done
echo ${total}
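For instance, a sketch of the same accumulator restricted to regular files with the grep mentioned above (ls -la so hidden files are counted too):
total=0
for size in $(ls -la | grep '^-' | tr -s ' ' | cut -d ' ' -f 5) ; do
total=$(( ${total} + ${size} ))
done
echo ${total}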
The method provided by @Znik helps with the bug encountered when the file list is too long.
However, on Solaris (which is a Unix), du does not have the -c or --total option, so there is a need for a counter to accumulate the file sizes.
In addition, if your file names contain special characters, this will not go too well through the pipe (see: Properly escaping output from pipe in xargs).
Based on the initial question, the following works on Solaris (with a small amendment to the way the variable is created):
file_list=($(find $loc_to_look -type f -name "*.dat" -size +100M))
printf '%s\0' "${file_list[@]}" | xargs -0 du -k | awk '{total=total+$1} END {print total}'
The output is in KiB.

How do I prevent this infinite loop in PowerShell?

Say I have several text files that contain the word 'not', and I want to find them and create a file containing the matches. In Linux, I might do
grep -r not *.txt > found_nots.txt
This works fine. In PowerShell, the following echos what I want to the screen
get-childitem *.txt -recurse | select-string not
However, if I pipe this to a file:
get-childitem *.txt -recurse | select-string not > found_nots.txt
It runs for ages. I eventually press Ctrl-C to exit and take a look at the found_nots.txt file, which is truly huge. It looks as though PowerShell includes the output file as one of the files to search: every time it adds more content, it finds more to add.
How can I stop this behavior and make it behave more like the Unix version?
Use the -Exclude option.
get-childitem *.txt -Exclude 'found_nots.txt' -recurse | select-string not > found_nots.txt
Another easy solution is to rename the output file's extension to something other than .txt (found_nots.log, say), so the wildcard never matches it.
