How to update one file in a zip archive - bash

Is it possible to replace a file in a zip file without unzipping?
The file to update is an XML file that resides in a huge zip archive. To update this XML file, I have to unzip the archive, delete the old XML file, add the new one and then rezip. This takes a considerable amount of time, so I want to be able to replace that one XML file through a script. I already have a script that checks for updates to the XML file.
using zip command
Sorry, I left this out: I would normally use the zip command for this, but the script is for an Android phone, where zip is not available. I only have unzip for Android, plus tar from busybox, and tar doesn't do what I need.

Try the following:
zip [zipfile] [file to update]
An example:
$ zip test.zip test/test.txt
updating: test/test.txt (stored 0%)

I've found the Linux zip utility to be cumbersome for replacing a single file in a zip. The jar utility from the Java Development Kit may be easier. Consider the common task of updating WEB-INF/web.xml in a JAR file (which is just a zip file):
jar -uf path/to/myapp.jar -C path/to/dir WEB-INF/web.xml
Here, path/to/dir is the path to a directory containing the WEB-INF directory (which in turn contains web.xml).

From zip(1):
When given the name of an existing zip archive, zip will replace identically named entries in the zip archive or add entries for new names.
So just use the zip command as you normally would to create a new .zip file containing only that one file, except the .zip filename you specify will be the existing archive.

Use the update flag: -u
Example:
zip -ur existing.zip myFolder
This command will compress and add myFolder (and its contents) to existing.zip.
Advanced Usage:
The update flag actually compares the incoming files against the existing ones and will either add new files, or update existing ones.
Therefore, if you want to add/update a specific subdirectory within the zip file, just update the source as desired, and then re-zip the entire source with the -u flag. Only the changed files will be zipped.
If you don't have access to the source files, you can unzip the zip file, then update the desired files, and then re-zip with the -u flag. Again, only the changed files will be zipped.
Example:
Original Source Structure
ParentDir
├── file1.txt
├── file2.txt
├── ChildDir
│ ├── file3.txt
│ ├── Logs
│ │ ├── logs1.txt
│ │ ├── logs2.txt
│ │ ├── logs3.txt
Updated Source Structure
ParentDir
├── file1.txt
├── file2.txt
├── ChildDir
│ ├── file3.txt
│ ├── Logs
│ │ ├── logs1.txt
│ │ ├── logs2.txt
│ │ ├── logs3.txt
│ │ ├── logs4.txt <-- NEW FILE
Usage
$ zip -ur existing.zip ParentDir
> updating: ParentDir/ChildDir/Logs (stored 0%)
> adding: ParentDir/ChildDir/Logs/logs4.txt (stored 96%)
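The comparison that -u performs can be approximated in a few lines of Python. This is only a sketch of the idea, not how zip itself is implemented: the function name plan_update is made up for illustration, and it assumes the archive was created from the directory passed as root, so that archive names are paths relative to it.

```python
import os
import time
import zipfile


def plan_update(zip_path, root):
    """Return (to_add, to_update): archive names of files under `root` that
    are missing from the archive, or newer on disk than the stored entry."""
    with zipfile.ZipFile(zip_path) as archive:
        stored = {info.filename: info.date_time for info in archive.infolist()}

    to_add, to_update = [], []
    for current, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(current, name)
            arcname = os.path.relpath(path, root).replace(os.sep, "/")
            if arcname not in stored:
                to_add.append(arcname)
                continue
            # ZIP timestamps are local time with 2-second resolution,
            # so allow 2 seconds of slack before calling a file "newer".
            entry_time = time.mktime(stored[arcname] + (0, 0, -1))
            if os.path.getmtime(path) >= entry_time + 2:
                to_update.append(arcname)
    return sorted(to_add), sorted(to_update)
```

Calling plan_update("existing.zip", ".") from the directory that contains ParentDir would list the candidates without modifying the archive.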

I know this is an old question, but I wanted to do the same thing: update a file in a zip archive. None of the above answers really helped me.
Here is what I did. I created a temp directory abc, copied file.zip into abc and extracted it there. Then I edited the file I wanted to change.
Then, from inside abc, I ran the following command:
user@host ~/temp/abc $ zip -u file.zip
updating: content/js/ (stored 0%)
updating: content/js/moduleConfig.js (deflated 69%)
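The -u switch looks for files that are newer than their entries in the archive and updates those entries. Called without file arguments, as here, it behaves like -f (freshen); new files are only added when they are named on the command line.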

You can use:
zip -u file.zip path/file_to_update

There is also the -f option that will freshen the zip file. It can be used to update ALL files which have been updated since the zip was generated (assuming they are in the same place within the tree structure within the zip file).
If your file is named /myfiles/myzip.zip all you have to do is
zip -f /myfiles/myzip.zip

From the point of view of the ZIP archive structure, you can update a zip file without recompressing it: copy all entries that come before the file you need to replace, write your updated file, and then copy the rest of the entries. Finally, you need to write the updated central directory structure.
However, I doubt that most common tools would allow you to make this.
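Where no zip binary is available but Python is, the same "copy everything, swap one entry" idea can be sketched with the standard zipfile module. Note the difference from the description above: zipfile has no public way to copy compressed bytes verbatim, so this sketch decompresses and recompresses every entry. The function name replace_entry is hypothetical, and each entry is read fully into memory, which is an assumption that may not hold for very large members.

```python
import os
import tempfile
import time
import zipfile


def replace_entry(zip_path, entry_name, new_data):
    """Rewrite zip_path so that entry_name holds new_data (bytes)."""
    with zipfile.ZipFile(zip_path) as src:
        if entry_name not in src.namelist():
            raise KeyError(entry_name)
        # Build the new archive next to the old one so the final rename
        # stays on the same filesystem.
        directory = os.path.dirname(os.path.abspath(zip_path))
        fd, tmp_path = tempfile.mkstemp(suffix=".zip", dir=directory)
        os.close(fd)
        try:
            with zipfile.ZipFile(tmp_path, "w") as dst:
                for info in src.infolist():
                    replaced = info.filename == entry_name
                    data = new_data if replaced else src.read(info)
                    stamp = time.localtime()[:6] if replaced else info.date_time
                    # Use a fresh ZipInfo so the source metadata is not mutated.
                    fresh = zipfile.ZipInfo(info.filename, date_time=stamp)
                    fresh.compress_type = info.compress_type
                    fresh.external_attr = info.external_attr
                    dst.writestr(fresh, data)
        except BaseException:
            os.remove(tmp_path)
            raise
    # Swap the rebuilt archive into place once the source is closed.
    os.replace(tmp_path, zip_path)
```

The entry keeps its position in the archive, and the original file is only replaced after the new archive has been written completely.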

7zip (7za) can be used for adding/updating files/directories nicely:
Example:
Replacing (regardless of file date) the MANIFEST.MF file in a JAR file. The /source/META-INF directory contains the MANIFEST.MF file that you want to put into the jar (zip):
7za a /tmp/file.jar /source/META-INF/
Only update (does not replace the target if the source is older)
7za u /tmp/file.jar /source/META-INF/

Yes, it's possible.
On Linux-based systems, just install zip and call it from the command line. Have a look at the man page: http://linux.die.net/man/1/zip
In my personal experience, though, if compression is not important, this works better with plain tar files and tar.


using ../ at go:embed annotation

I want to embed a file placed one level above the golang file code.
for example:
dir1
file.go
dir2
file.txt
How to embed file.txt inside file.go using go:embed?
The documentation states:
Patterns may not contain ‘.’ or ‘..’ or empty path elements, nor may they begin or end with a slash.
So what you are trying to do is not supported directly. Further information is available in the comments on this issue.
One thing you can do is to put a go file in dir2, embed file.txt in that and then import/use that in dir1/file.go (assuming the folders are in the same package).
This is not supported in the embed package, as stated by @Brits (https://pkg.go.dev/embed).
A pattern I like to use is to create a resources.go file in my project's internal package and put all my embedded resources in there, e.g.:
├── cmd\
│   └── cool.go
└── internal\
    └── resources\
        ├── resources.go
        ├── fonts\
        │   └── coolfont.ttf
        └── icons\
            └── coolicon.ico
resources.go
package resources
import "embed" // imported by name because embed.FS is used below
//go:embed fonts/coolfont.ttf
var fonts []byte // embed single file
//go:embed icons/*
var icons embed.FS // embed whole directory
There are libraries that can help with this as well such as those listed here https://github.com/avelino/awesome-go#resource-embedding
But I've not run into a use case where plain old embed wasn't enough for my needs.

Bash - Copy files iteratively

I want to copy a list of files, called dti_fin_fa, from one folder to another.
These files are scattered in different folders.
Controls
└───C01
│ └───difusion
│ └───Deterministic
│ └───dti_fin_fa.nii
└───C02
│ └───difusion
│ └───Deterministic
│ └───dti_fin_fa.nii
└───C03
│ └───difusion
│ └───Deterministic
│ └───dti_fin_fa.nii
I want to select and copy all the dti_fin_fa files, keeping the folder structure, so that my new directory has the folder layout shown above. The problem is that these folders (Deterministic, difusion, etc.) contain lots of other files I don't want to copy, so I can't just copy the main folders (C01, etc.).
This is what I have:
#!/bin/bash
DIR="/media/Batty/Analysis"; cd "$DIR" || exit
for group in Controls; do
for folder in $group/*; do
for dti in $folder/difusion/Deterministic/dti_fin_fa.nii; do
echo $dti
cp $dti /media/Roy/Analysis/Controls/ --verbose
done;
done;
done;
The problem is that this code copies each of the dti_fin_fa.nii images into /media/Roy/Analysis/Controls/, hence keeping just the last one, instead of creating all the other subfolders.
Could something like this work?
cp $folder/difusion/Deterministic/$dti /media/Roy/Analysis/Patients/ --verbose
Iterate through each control, make the equivalent directory and copy only the file you want:
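for group in "Controls"/* "Patients"/*; do
    # Relative path of the image folder, e.g. Controls/C01/difusion/Deterministic
    orig="$group/difusion/Deterministic"
    # Same relative path under the destination root
    dest="/media/Roy/Analysis/$orig"
    mkdir -p "$dest"
    cp "$orig/dti_fin_fa.nii" "$dest/"
done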
I suppose you execute this from the DIR directory of your example. So here orig is the relative path to the image folder, dest is the absolute path to your destination directory, including the preserved original folder structure. mkdir -p ensures that such end directory exists and finally the image is copied.
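If Python is available, the same "recreate the relative path, then copy one file" approach can be sketched as below. The function name copy_matching and its parameters are made up for illustration; it searches the whole source tree for the file name rather than assuming the difusion/Deterministic layout.

```python
import shutil
from pathlib import Path


def copy_matching(src_root, dest_root, filename="dti_fin_fa.nii"):
    """Copy every file called `filename` under src_root into dest_root,
    recreating each file's relative directory path."""
    src_root, dest_root = Path(src_root), Path(dest_root)
    copied = []
    # sorted() collects all matches first, before anything is copied.
    for source in sorted(src_root.rglob(filename)):
        if not source.is_file():
            continue
        target = dest_root / source.relative_to(src_root)
        target.parent.mkdir(parents=True, exist_ok=True)  # like mkdir -p
        shutil.copy2(source, target)
        copied.append(target)
    return copied
```

For the layout in the question, a call such as copy_matching("/media/Batty/Analysis", "/media/Roy/Analysis") would be the equivalent of the loop above.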

linux - batch move files into a directory and rename those files according to sequential syntax in that directory

I have two directories - A and B - that contain a bunch of photo files. Directory A is where I keep photos long-term, and the files inside are sequentially named "Photo-1.jpg, Photo-2.jpg, etc.".
Directory B is where I upload new photos to from my camera, and the naming convention is whatever the camera names the file. I figured out how to run some operations on Directory B to ensure everything is in .jpg format as needed (ImageMagick convert), remove duplicate files (fdupes), etc.
My goal now is to move the files from B to A, and end up with the newly-added files in A sequentially named according to A's naming convention described above.
I know how to move the files into A, and then to batch rename everything in A after the new files have been added (which would theoretically occur every night), but I'm guessing there must be a more efficient way of moving the files from B to A without re-naming all 20,000+ photos every night, just because a few new files were added.
I guess my question has two parts: 1) I found a solution that works (use mv to rename all photos every night); is there any downside to this? and 2) If there is a downside and a more elegant method exists, can anyone help with a script that would look at the highest number that exists in A, then rename the files in B, continuing from that number, as they are moved over to A?
Thank you!
This bash script moves and renames only the new files from DirectoryB into your DirectoryA path. It also handles file names in DirectoryB that contain spaces or other odd characters.
#!/bin/bash
aPath="./photos-A"
bPath="./photos-B"
aPattern="Photo-"
lNum=$(find $aPath -type f -name "*.jpg" -printf "%f\n" | \
awk -F'[-.]' '{if($2>m)m=$2}END{print m}')
while IFS= read -r -d $'\0' photo; do
mv "$photo" "$aPath/$aPattern$((++lNum)).jpg"
done < <(find $bPath -type f -name "*.jpg" -print0)
Note
The command to find the last numbered photo, aka $lNum will run over all 20K+ files, but it should be fairly quick. If it's not, you can always run this once and store the latest number into a file and read from that file.
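For comparison, here is a rough Python sketch of the same two steps: find the highest existing number, then move each new file to the next number. The function names are made up for illustration, and unlike the find-based script it only looks at the top level of each directory.

```python
import re
import shutil
from pathlib import Path

# Matches names such as Photo-12.jpg and captures the number.
PHOTO_RE = re.compile(r"^Photo-(\d+)\.jpg$")


def highest_number(archive_dir):
    """Return the largest N among Photo-N.jpg files, or 0 if there are none."""
    numbers = []
    for path in Path(archive_dir).iterdir():
        match = PHOTO_RE.match(path.name)
        if match:
            numbers.append(int(match.group(1)))
    return max(numbers, default=0)


def move_new_photos(incoming_dir, archive_dir):
    """Move every .jpg from incoming_dir into archive_dir as Photo-<next>.jpg."""
    archive_dir = Path(archive_dir)
    number = highest_number(archive_dir)
    moved = []
    for photo in sorted(Path(incoming_dir).glob("*.jpg")):
        number += 1
        target = archive_dir / f"Photo-{number}.jpg"
        shutil.move(str(photo), str(target))
        moved.append(target.name)
    return moved
```

Existing files in the archive directory are never renamed; gaps in the numbering (such as the missing Photo-4.jpg in the listing below) are left as they are.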
Proof of Concept
$ tree photos-A/
photos-A/
├── Photo-1.jpg
├── Photo-2.jpg
├── Photo-3.jpg
├── Photo-5.jpg
├── Photo-6.jpg
├── Photo-7.jpg
└── Photo-8.jpg
0 directories, 7 files
$ tree photos-B/
photos-B/
├── bar.jpg
├── baz\ with\ spaces.jpg
└── foo.jpg
0 directories, 3 files
$ ./mvphoto.sh
$ tree photos-A/
photos-A/
├── Photo-10.jpg
├── Photo-11.jpg
├── Photo-1.jpg
├── Photo-2.jpg
├── Photo-3.jpg
├── Photo-5.jpg
├── Photo-6.jpg
├── Photo-7.jpg
├── Photo-8.jpg
└── Photo-9.jpg
0 directories, 10 files

How to find all empty folders and untracked files when using git?

Based on this post and this post,
git ls-files --others --exclude-standard can list all untracked files.
But when I tested it, it could not list empty folders (whether tracked or untracked).
For example, it cannot list the empty folder named archiver folder below:
.
├── admin.php
├── api
│   ├── index.htm
│   └── remote
│   └── mod
│   ├── index.htm
│   ├── mod_cron.php
│   └── mod_index.php
└── archiver folder
Then my question is: how to list all untracked files and empty folders?
TL;DR: just look for empty directories. You can safely remove them—well, "safe" depends on your own software, but as far as Git is concerned, it's safe. (Watch out for missing files—see the definition of "missing" below—which may remove a directory that Git might want later, but that's sort of OK, because Git will just create it again.)
On a Unix / Linux system (edited to correct lost word in transcription):
find . -name .git -prune -o -type d -empty -print
(at the top level of the work-tree) will find the empty directories.
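On systems without find, the same search can be sketched in Python with os.walk. This is an illustration under the assumption that pruning .git is all that is needed; the function name empty_directories is hypothetical.

```python
import os


def empty_directories(top="."):
    """Return the empty directories under `top`, never descending into .git."""
    found = []
    for current, dirnames, filenames in os.walk(top):
        # Decide emptiness before pruning, so that a directory holding
        # only .git is not reported as empty.
        is_empty = not dirnames and not filenames
        if ".git" in dirnames:
            dirnames.remove(".git")  # same effect as `-name .git -prune`
        if is_empty:
            found.append(current)
    return sorted(found)
```

Run from the top level of the work-tree, empty_directories(".") would report the archiver folder from the question.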
Long(ish)
Git is not interested in folders / directories. There's no such thing as an untracked folder in the same way that there's no such thing as a tracked folder: Git only cares about files. Specifically, a file is either in the index, or not in the index, and if it's not in the index, it's untracked.
When you use the various options to list untracked files (which tend to skip over ones that are untracked-and-ignored since you normally want that), Git will, sometimes, aggregate together all the files that are in some folder, notice that there are no tracked files in that folder, and report them using the aggregated notation. You can stop this with, e.g., git status --untracked-files=all; then you'll get the individual file names.
Note that it's possible to have some file that is tracked, yet missing. For instance, suppose sub/README.txt is a tracked file, and actually exists. Then we run rm sub/README.txt. The file sub/README.txt remains in Git's index, and will be in the next commit, but it's missing. If that was the only file in sub in your work-tree, sub is now empty, and you can remove it with rmdir sub. Even though sub/README.txt remains missing (and sub is missing too!), that does not affect the next commit: it will still contain sub/README.txt, because that file is in the index. (Using git rm --cached sub/README.txt, you can remove it from the index too, if that's what you wanted.)
If and when Git goes to copy sub/README.txt back out of the index into the work-tree, Git will, at this point, discover that there is no sub. Git will merely shrug its metaphorical shoulders and create the directory sub, and then put sub/README.txt into it. So this is why Git is not interested in folders / directories: they're just boring and dull, required only when needed to hold files, created on demand.
If you want Git to create a directory, you need to store a file in it. Since programs managed by Git need to be able to ignore the file named .gitignore, this is a very good file name to stick into such a directory. You can write * into that file, and add it to your commits, so that Git will create the directory and write a .gitignore file there containing *, and will thus ignore all additional untracked files within that directory automatically.
Side note: In general, when Git pulls the last file out of some directory, it will remove the directory too, but occasionally I've seen it leave some behind. (Of course, it has to leave the directory behind if it still contains some untracked files. Note that git clean -fd will remove the empty directories, though it also removes the untracked files.)
git ls-files --others --exclude-standard > not_tracked
find . -depth -empty -type d \( ! -regex '.*/\..*' \) >> not_tracked
Please check my answer; I spent two days on it.
The command git clean does exactly what you want.

Extract specific file extensions from multiple 7-zip files

I have a RAR file and a ZIP file. Within these two there is a folder. Inside the folder there are several 7-zip (.7z) files. Inside every 7z there are multiple files with the same extension, but whose names vary.
RAR or ZIP file
|___folder
|_____Multiple 7z
|_____Multiple files with same extension and different name
I want to extract just the ones I need from thousands of files...
I need those files whose names include a certain substring. For example, if the name of a compressed file includes '[!]' in the name or '(U)' or '(J)' that's the criteria to determine the file to be extracted.
I can extract the folder without problem so I have this structure:
folder
|_____Multiple 7z
|_____Multiple files with same extension and different name
I'm in a Windows environment but I have Cygwin installed. How can I extract the files I need painlessly, ideally with a single command line?
Update
There are some improvements to the question:
The inner 7z files and their respective files inside them can have spaces in their names.
There are 7z files with just one file inside of them that doesn't meet the given criteria. Thus, being the only possible file, they have to be extracted too.
Solution
Thanks to everyone. The bash solution was the one that helped me out. I wasn't able to test Python3 solutions because I had problems trying to install libraries using pip. I don't use Python so I'll have to study and overcome the errors I face with these solutions. For now, I've found a suitable answer. Thanks to everyone.
This solution is based on bash, grep and awk, it works on Cygwin and on Ubuntu.
Since you have the requirement to search for (X) [!].ext files first and if there are no such files then look for (X).ext files, I don't think it is possible to write some single expression to handle this logic.
The solution should have some if/else conditional logic to test the list of files inside the archive and decide which files to extract.
Here is the initial structure inside the zip/rar archive I tested my script on (I made a script to prepare this structure):
folder
├── 7z_1.7z
│   ├── (E).txt
│   ├── (J) [!].txt
│   ├── (J).txt
│   ├── (U) [!].txt
│   └── (U).txt
├── 7z_2.7z
│   ├── (J) [b1].txt
│   ├── (J) [b2].txt
│   ├── (J) [o1].txt
│   └── (J).txt
├── 7z_3.7z
│ ├── (E) [!].txt
│ ├── (J).txt
│ └── (U).txt
└── 7z 4.7z
    └── test.txt
The output is this:
output
├── 7z_1.7z # This is a folder, not an archive
│   ├── (J) [!].txt # Here we extracted only files with [!]
│   └── (U) [!].txt
├── 7z_2.7z
│   └── (J).txt # Here there are no [!] files, so we extracted (J)
├── 7z_3.7z
│   └── (E) [!].txt # We had here both [!] and (J), extracted only file with [!]
└── 7z 4.7z
    └── test.txt # We had only one file here, extracted it
And this is the script to do the extraction:
#!/bin/bash
# Remove the output (if it's left from previous runs).
rm -rf output
mkdir -p output
# Unzip the zip archive.
unzip data.zip -d output
# For rar use
# unrar x data.rar output
# OR
# 7z x -ooutput data.rar
for archive in output/folder/*.7z
do
    # See https://stackoverflow.com/questions/7148604
    # Get the list of file names, remove the extra output of "7z l"
    list=$(7z l "$archive" | awk '
        /----/ {p = ++p % 2; next}
        $NF == "Name" {pos = index($0,"Name")}
        p {print substr($0,pos)}
    ')
    # Get the list of files with [!] (-F makes grep match the literal string).
    extract_list=$(echo "$list" | grep -F "[!]")
    if [[ -z $extract_list ]]; then
        # If we don't have files with [!], then look for ([A-Z]) pattern
        # to get files with single letter in brackets.
        extract_list=$(echo "$list" | grep "([A-Z])\.")
    fi
    if [[ -z $extract_list ]]; then
        # If we only have one file - extract it.
        # $list is a newline-separated string, so count its lines.
        if [[ $(wc -l <<< "$list") -eq 1 ]]; then
            extract_list=$list
        fi
    fi
    if [[ -n $extract_list ]]; then
        # If we have files to extract, then do the extraction.
        # Output path is output/7zip_archive_name/
        out_path=output/$(basename "$archive")
        mkdir -p "$out_path"
        echo "$extract_list" | xargs -I {} 7z x -o"$out_path" "$archive" {}
    fi
done
The basic idea here is to go over 7zip archives and get the list of files for each of them using 7z l command (list of files).
The output of the command is quite verbose, so we use awk to clean it up and get the list of file names.
After that we filter this list using grep to get either a list of [!] files or a list of (X) files.
Then we just pass this list to 7zip to extract the files we need.
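The selection rule itself (prefer [!] files, fall back to (X) files, and finally take a lone file) can be written down as a small Python function. This is only a sketch of the decision logic described above, with a made-up function name; it does not call 7z.

```python
import re

# A single capital letter in parentheses directly before the extension, e.g. "(J).txt".
SINGLE_LETTER = re.compile(r"\([A-Z]\)\.")


def select_files(names):
    """Pick the file names to extract from one archive's listing."""
    verified = [name for name in names if "[!]" in name]
    if verified:
        return verified
    plain = [name for name in names if SINGLE_LETTER.search(name)]
    if plain:
        return plain
    if len(names) == 1:
        return list(names)  # the only file in the archive
    return []
```

Applied to the listings of the four test archives above, it should select the same files that appear in the output tree.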
What about using this command line :
7z e c:\myDir\*.7z -oc:\outDir "*(U)*.ext" "*(J)*.ext" "*[!]*.ext" -y
Where :
myDir is your unzip folder
outDir is your output directory
ext is your file extension
The -y option is for forcing overwriting in case you have the same filename in different archives.
You state in the question's bounty footer that it is OK to use Linux, and I don't use Windows, sorry about that. I am using Python 3, and you have to be in a Linux environment (I will try to test this on Windows as soon as I can).
Archive structure
datadir.rar
|
datadir/
|
zip1.7z
zip2.7z
zip3.7z
zip4.7z
zip5.7z
Extracted structure
extracted/
├── zip1
│   ├── (E) [!].txt
│   ├── (J) [!].txt
│   └── (U) [!].txt
├── zip2
│   ├── (E) [!].txt
│   ├── (J) [!].txt
│   └── (U) [!].txt
├── zip3
│   ├── (J) [!].txt
│   └── (U) [!].txt
└── zip5
    ├── (J).txt
    └── (U).txt
Here is how I did it.
import libarchive.public
import os, os.path
from os.path import basename
import errno
import rarfile

#========== FILE UTILS =================
#Make directories
def mkdir_p(path):
    try:
        os.makedirs(path)
    except OSError as exc: # Python >2.5
        if exc.errno == errno.EEXIST and os.path.isdir(path):
            pass
        else: raise

#Open "path" for writing, creating any parent directories as needed.
def safe_open_w(path):
    mkdir_p(os.path.dirname(path))
    return open(path, 'wb')

#========== RAR TOOLS ==================
# List
def rar_list(rar_archive):
    with rarfile.RarFile(rar_archive) as rf:
        return rf.namelist()

# extract
def rar_extract(rar_archive, filename, path):
    with rarfile.RarFile(rar_archive) as rf:
        rf.extract(filename,path)

# extract-all
def rar_extract_all(rar_archive, path):
    with rarfile.RarFile(rar_archive) as rf:
        rf.extractall(path)

#========= 7ZIP TOOLS ==================
# List
def zip7_list(zip7file):
    filelist = []
    with open(zip7file, 'rb') as f:
        for entry in libarchive.public.memory_pour(f.read()):
            filelist.append(entry.pathname.decode("utf-8"))
    return filelist

# extract
def zip7_extract(zip7file, filename, path):
    with open(zip7file, 'rb') as f:
        for entry in libarchive.public.memory_pour(f.read()):
            if entry.pathname.decode("utf-8") == filename:
                with safe_open_w(os.path.join(path, filename)) as q:
                    for block in entry.get_blocks():
                        q.write(block)
                break

# extract-all
def zip7_extract_all(zip7file, path):
    with open(zip7file, 'rb') as f:
        for entry in libarchive.public.memory_pour(f.read()):
            if os.path.isdir(entry.pathname.decode("utf-8")):
                continue
            with safe_open_w(os.path.join(path, entry.pathname.decode("utf-8"))) as q:
                for block in entry.get_blocks():
                    q.write(block)

#============ FILE FILTER =================
def exclamation_filter(filename):
    return ("[!]" in filename)

def optional_code_filter(filename):
    return not ("[" in filename)

def has_exclamation_files(filelist):
    for singlefile in filelist:
        if(exclamation_filter(singlefile)):
            return True
    return False

#============ MAIN PROGRAM ================
print("-------------------------")
print("Program Started")
print("-------------------------")
BIG_RAR = 'datadir.rar'
TEMP_DIR = 'temp'
EXTRACT_DIR = 'extracted'
newzip7filelist = []

#Extract big rar and get new file list
for zipfilepath in rar_list(BIG_RAR):
    rar_extract(BIG_RAR, zipfilepath, TEMP_DIR)
    newzip7filelist.append(os.path.join(TEMP_DIR, zipfilepath))
print("7z Files Extracted")
print("-------------------------")

for newzip7file in newzip7filelist:
    innerFiles = zip7_list(newzip7file)
    for singleFile in innerFiles:
        fileSelected = False
        if(has_exclamation_files(innerFiles)):
            if exclamation_filter(singleFile): fileSelected = True
        else:
            if optional_code_filter(singleFile): fileSelected = True
        if(fileSelected):
            print(singleFile)
            outputFile = os.path.join(EXTRACT_DIR, os.path.splitext(basename(newzip7file))[0])
            zip7_extract(newzip7file, singleFile, outputFile)

print("-------------------------")
print("Extraction Complete")
print("-------------------------")
Above the main program, I've got all the required functions ready. I didn't use all of them, but I kept them in case you need them.
I used several python libraries with python3, but you only have to install libarchive and rarfile using pip, others are built-in libraries.
And here is a copy of my source tree
Console output
This is the console output when you run this python file,
-------------------------
Program Started
-------------------------
7z Files Extracted
-------------------------
(J) [!].txt
(U) [!].txt
(E) [!].txt
(J) [!].txt
(U) [!].txt
(E) [!].txt
(J) [!].txt
(U) [!].txt
(J).txt
(U).txt
-------------------------
Extraction Complete
-------------------------
Issues
The only issue I have faced so far is that some temporary files are generated at the program root. It doesn't affect the program in any way, but I'll try to fix that.
edit
You have to run
sudo apt-get install libarchive-dev
to install the actual libarchive program. The Python library is just a wrapper around it. Take a look at the official documentation.
This is the more or less final version after several attempts. The previous one was not useful, so I'm removing it instead of appending. Read to the end, since not everything may be needed for the final solution.
On to the topic. I would use Python. For a one-time task it can be overkill, but in any other case it lets you log all steps for future investigation, use regular expressions, and orchestrate commands by providing their input and processing their output each time. All of that is quite easy in Python, provided you have it.
Now I'll describe how to get the environment configured. Not all of it is mandatory, but I went through these steps while trying to install things, and a description of the process may be useful in itself.
I have MinGW, the 32-bit version. It is not mandatory for extracting 7-zip archives, however. Once it is installed, go to C:\MinGW\bin and run mingw-get.exe:
Under Basic Setup, I have msys-base installed (right click, mark for installation, then Apply Changes from the Installation menu). That way I have bash, sed, grep, and many more.
Under All Packages there is mingw32-libarchive with class dll. Since the Python libarchive package is just a wrapper, you need this dll to actually have a binary to wrap.
The examples are for Python 3. I'm using the 32-bit version, which you can fetch from the Python home page. I installed it in the default directory, which is an odd location, so my advice is to install it in the root of your disk, like MinGW.
Other things: ConEmu is much better than the default console.
Installing packages in Python is done with pip. From your console go to the Python home directory, which has a Scripts subdirectory. For me it is: c:\Users\<<username>>\AppData\Local\Programs\Python\Python36-32\Scripts. You can search with, for instance, pip search archive, and install with pip install libarchive-c:
> pip.exe install libarchive-c
Collecting libarchive-c
Downloading libarchive_c-2.7-py2.py3-none-any.whl
Installing collected packages: libarchive-c
Successfully installed libarchive-c-2.7
After cd .., start python, and the new library can be imported:
>>> import libarchive
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "c:\Users\<<username>>\AppData\Local\Programs\Python\Python36-32\lib\site-packages\libarchive\__init__.py", line 1, in <module>
from .entry import ArchiveEntry
File "c:\Users\<<username>>\AppData\Local\Programs\Python\Python36-32\lib\site-packages\libarchive\entry.py", line 6, in <module>
from . import ffi
File "c:\Users\<<username>>\AppData\Local\Programs\Python\Python36-32\lib\site-packages\libarchive\ffi.py", line 27, in <module>
libarchive = ctypes.cdll.LoadLibrary(libarchive_path)
File "c:\Users\<<username>>\AppData\Local\Programs\Python\Python36-32\lib\ctypes\__init__.py", line 426, in LoadLibrary
return self._dlltype(name)
File "c:\Users\<<username>>\AppData\Local\Programs\Python\Python36-32\lib\ctypes\__init__.py", line 348, in __init__
self._handle = _dlopen(self._name, mode)
TypeError: LoadLibrary() argument 1 must be str, not None
So it fails. I tried to fix that, but it then failed like this:
>>> import libarchive
read format "cab" is not supported
read format "7zip" is not supported
read format "rar" is not supported
read format "lha" is not supported
read filter "uu" is not supported
read filter "lzop" is not supported
read filter "grzip" is not supported
read filter "bzip2" is not supported
read filter "rpm" is not supported
read filter "xz" is not supported
read filter "none" is not supported
read filter "compress" is not supported
read filter "all" is not supported
read filter "lzma" is not supported
read filter "lzip" is not supported
read filter "lrzip" is not supported
read filter "gzip" is not supported
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "c:\Users\<<username>>\AppData\Local\Programs\Python\Python36-32\lib\site-packages\libarchive\__init__.py", line 1, in <module>
from .entry import ArchiveEntry
File "c:\Users\<<username>>\AppData\Local\Programs\Python\Python36-32\lib\site-packages\libarchive\entry.py", line 6, in <module>
from . import ffi
File "c:\Users\<<username>>\AppData\Local\Programs\Python\Python36-32\lib\site-packages\libarchive\ffi.py", line 167, in <module>
c_int, check_int)
File "c:\Users\<<username>>\AppData\Local\Programs\Python\Python36-32\lib\site-packages\libarchive\ffi.py", line 92, in ffi
f = getattr(libarchive, 'archive_'+name)
File "c:\Users\<<username>>\AppData\Local\Programs\Python\Python36-32\lib\ctypes\__init__.py", line 361, in __getattr__
func = self.__getitem__(name)
File "c:\Users\<<username>>\AppData\Local\Programs\Python\Python36-32\lib\ctypes\__init__.py", line 366, in __getitem__
func = self._FuncPtr((name_or_ordinal, self))
AttributeError: function 'archive_read_open_filename_w' not found
I tried using the set command to provide the information directly, but that failed too. So I moved to pylzma, for which MinGW is not needed. pip install failed:
> pip.exe install pylzma
Collecting pylzma
Downloading pylzma-0.4.9.tar.gz (115kB)
100% |--------------------------------| 122kB 1.3MB/s
Installing collected packages: pylzma
Running setup.py install for pylzma ... error
Complete output from command c:\users\texxas\appdata\local\programs\python\python36-32\python.exe -u -c "import setuptools, tokenize;__file__='C:\\Users\\texxas\\AppData\\Local\\Temp\\pip-build-99t_zgmz\\pylzma\\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record C:\Users\texxas\AppData\Local\Temp\pip-ffe3nbwk-record\install-record.txt --single-version-externally-managed --compile:
running install
running build
running build_py
creating build
creating build\lib.win32-3.6
copying py7zlib.py -> build\lib.win32-3.6
running build_ext
adding support for multithreaded compression
building 'pylzma' extension
error: Microsoft Visual C++ 14.0 is required. Get it with "Microsoft Visual C++ Build Tools": http://landinghub.visualstudio.com/visual-cpp-build-tools
It failed again, but this one is easy: I installed Visual Studio Build Tools 2015, and that worked. I have 7-Zip installed, so I created a sample archive. Finally I can start python and do:
from py7zlib import Archive7z
f = open(r"C:\Users\texxas\Desktop\try.7z", 'rb')
a = Archive7z(f)
a.filenames
And I got an empty list. Looking closer gives a better understanding: empty files are not considered by pylzma, just to make you aware of that. After putting one character into each of my sample files, the last line gives:
>>> a.filenames
['try/a/test.txt', 'try/a/test1.txt', 'try/a/test2.txt', 'try/a/test3.txt', 'try/a/test4.txt', 'try/a/test5.txt', 'try/a/test6.txt', 'try/a/test7.txt', 'try/b/test.txt', 'try/b/test1.txt', 'try/b/test2.txt', 'try/b/test3.txt', 'try/b/test4.txt', 'try/b/test5.txt', 'try/b/test6.txt', 'try/b/test7.txt', 'try/c/test.txt', 'try/c/test1.txt', 'try/c/test11.txt', 'try/c/test2.txt', 'try/c/test3.txt', 'try/c/test4.txt', 'try/c/test5.txt', 'try/c/test6.txt', 'try/c/test7.txt']
So the rest is a piece of cake. This is actually part of the original post:
import os
import py7zlib
for folder, subfolders, files in os.walk('.'):
    for file in files:
        if file.endswith('.7z'):
            # A 7z archive: extract what is needed.
            path = os.path.join(folder, file)
            try:
                with open(path, 'rb') as f:
                    z = py7zlib.Archive7z(f)
                    for name in z.getnames():
                        if name.endswith('.py'):
                            # Write the member's bytes under ./dest
                            target = os.path.join('./dest', name)
                            os.makedirs(os.path.dirname(target), exist_ok=True)
                            with open(target, 'wb') as out:
                                out.write(z.getmember(name).read())
            except py7zlib.FormatError as e:
                print('file ' + path)
                print(str(e))
As a side note - Anaconda is great tool, but full install takes 500+MB, so that is way too much.
Also let me share wmctrl.py tool, from my github:
cmd = 'wmctrl -ir ' + str(active.window) + \
      ' -e 0,' + str(stored.left) + ',' + str(stored.top) + ',' + str(stored.width) + ',' + str(stored.height)
print(cmd)
res = getoutput(cmd)
That way you can orchestrate different commands (here it is wmctrl) and process their results as data.
