How to add a file to a .gz archive and delete the original file? - bash

My file's name is <09/12/2020>_master. How can I add this file to a .gz archive and then remove the original file?

GZip isn't an archive format, it's a compression format. A .gz file can only contain one compressed file; if you need to put more than one file in at a time, you'll need to pair it with an archive format (such as tar).
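For the single-file case in the question, here is a minimal Python sketch of "compress, then delete the original". The helper name gzip_and_remove and the example file name are made up for illustration. Note that the gzip command-line tool already behaves this way by default: running gzip on a file replaces it with a .gz version. A file name also cannot literally contain slashes, so the example assumes a name like 2020-09-12_master.

```python
import gzip
import os
import shutil


def gzip_and_remove(path):
    """Compress `path` to `path + '.gz'`, then delete the original file."""
    gz_path = path + ".gz"
    with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        # Copy in chunks so large files are never loaded into memory at once.
        shutil.copyfileobj(src, dst)
    # Only reached if the compression above finished without an exception.
    os.remove(path)
    return gz_path
```

Calling gzip_and_remove("2020-09-12_master") would leave only 2020-09-12_master.gz behind, assuming that file exists in the current directory.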

Related

Terminal - Unzip all .gz files in a folder without combining resulting files

I have a folder, TestFolder, that contains several .gz files. Each .gz file is a folder containing several sub-directories, with the deepest level of each .gz file containing 5 text files. For example, extracting one of the .gz files ultimately has 5 files at the deepest level of the directory, like:
Users/me/Desktop/TestFolderParent/TestFolder/folder1/subfolder1/subfolder2/subfolder3/subfolder4/subfolder5/subfolder6/TextFile1.txt
Users/me/Desktop/TestFolderParent/TestFolder/folder1/subfolder1/subfolder2/subfolder3/subfolder4/subfolder5/subfolder6/TextFile2.txt
Users/me/Desktop/TestFolderParent/TestFolder/folder1/subfolder1/subfolder2/subfolder3/subfolder4/subfolder5/subfolder6/TextFile3.txt
Users/me/Desktop/TestFolderParent/TestFolder/folder1/subfolder1/subfolder2/subfolder3/subfolder4/subfolder5/subfolder6/TextFile4.txt
Users/me/Desktop/TestFolderParent/TestFolder/folder1/subfolder1/subfolder2/subfolder3/subfolder4/subfolder5/subfolder6/TextFile5.txt
When I run gunzip -r /Users/myuser/Desktop/TestFolderParent/TestFolder in Terminal, it extracts all of the .gz files, each as a single text file containing all 5 constituent text files concatenated together. Is there any way to instead run a command that extracts each .gz file and returns each of the 5 constituent text files as a separate file?
.gz files themselves do not and cannot contain "several sub-directories". The gzip format compresses a single file, and that's it. gunzip will extract exactly one file from one .gz file.
That single file can itself be an uncompressed archive of files. That is often done using the tar archiver, so you end up with a .tar.gz file. Is that what you have? Then you need to use tar, not gunzip to extract the files.
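If the files do turn out to be gzip-compressed tar archives, unpacking one with its directory tree intact can be sketched in Python with the standard tarfile module. The function name extract_tar_gz is hypothetical, and the sketch assumes the archive really is tar inside gzip; it raises an error otherwise.

```python
import tarfile


def extract_tar_gz(archive_path, dest_dir):
    """Unpack a gzip-compressed tar archive, keeping its directory tree."""
    with tarfile.open(archive_path, "r:gz") as tar:
        # The "data" extraction filter rejects unsafe members such as absolute
        # paths; Python versions without extraction filters lack the option.
        extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(dest_dir, **extra)
```

Each text file then lands in its own place under dest_dir instead of being concatenated into one output file.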

Chilkat unzip files only from root directory

zip.UnzipMatching("qa_output","*.xml",true)
With this syntax I can unzip every XML file in every directory of my zip file and recreate the same directory structure.
But how can I unzip only the XML files in the root directory?
I cannot work out how to write the filter.
I tried "/*.xml", but nothing is extracted.
If I write "*/*.xml", I only extract XML files from subdirectories (and I skip the XML files in the root directory!).
Can anyone help me?
Example of a zip file's contents:
a1.xml
b1.xml
c1.xml
dir1\a2.xml
dir1\c2.xml
dir2\dir3\c3.xml
With UnzipMatching("qa_output","*.xml", true) I extract all these files with the original directory structure, but I want to extract only a1.xml, b1.xml and c1.xml.
Is there a way to write a filter to achieve this result, or a different command, or a different approach?
I think what you want is to call UnzipMatchingInto: All files (matching the pattern) in the Zip are unzipped into the specified dirPath regardless of the directory path information contained in the Zip. This has the effect of collapsing all files into a single directory. If several files in the Zip have the same name, the files unzipped last will overwrite the files already unzipped.
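As a different approach, the root-only selection can be sketched without Chilkat by walking the entry list and skipping anything that has a directory component. This uses Python's standard zipfile module, not the Chilkat API, and the function name extract_root_matching is made up for illustration.

```python
import fnmatch
import zipfile


def extract_root_matching(zip_path, dest_dir, pattern="*.xml"):
    """Extract only top-level zip entries whose names match `pattern`."""
    extracted = []
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            # Some tools store Windows-style separators; normalise before checking.
            name = info.filename.replace("\\", "/")
            if info.is_dir() or "/" in name:
                continue  # a directory entry, or a file inside a subdirectory
            if fnmatch.fnmatch(name, pattern):
                zf.extract(info, dest_dir)
                extracted.append(name)
    return extracted
```

With the example contents listed in the question, this would pick a1.xml, b1.xml and c1.xml and skip everything under dir1 and dir2.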

How to gzip compress a directory in hdfs without changing the name of the files

I need to gzip compress a directory that contains many files. As I can't modify the names of the files inside the directory, I can't use MapReduce. Is there any way, using the Java interface, to compress a directory without changing the names of the files inside it?

Extracting specific files with file extension from a .tar.xz archive using MacOS terminal

I have a number of compressed archives with the extension .tar.xz. I am advised that, when decompressed, the total size required is around 2TB.
Within the archives are a number of images, which are all I am after.
Is there a way to extract only files with, for example, the extensions .jpeg and .gif from the compressed archives without having to extract every file?
Thanks
It's trivial to extract just one of the file types; for example:
tar -xJf archive.tar.xz '*.jpeg'
will extract all files with the .jpeg extension. (The -J flag selects xz; -j is for bzip2.) It's important to quote the *, as otherwise the shell would attempt to expand it, and tar would only be asked to match files that already exist in the current directory (or, in some shells, the command would fail because no files match).
You can similarly use other patterns like '*.gif', or both together:
tar -xJf archive.tar.xz '*.jpeg' '*.gif'
Because you've tagged the question as macOS, I've left out the --wildcards option, which GNU tar on Linux needs for this kind of pattern matching.
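The same selective extraction can be sketched in Python with the standard tarfile module, which may be handy when the filter is more involved than a glob. The function name extract_by_extension and its default extensions are assumptions for illustration. Keep in mind that a tar archive has no index, so the whole stream is still decompressed while scanning; only the matching files are written to disk.

```python
import tarfile


def extract_by_extension(archive_path, dest_dir, extensions=(".jpeg", ".gif")):
    """Extract only regular files whose names end in one of `extensions`."""
    wanted = tuple(ext.lower() for ext in extensions)
    # Use the safer "data" extraction filter where this Python supports it.
    extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    extracted = []
    with tarfile.open(archive_path, "r:xz") as tar:
        for member in tar:
            if member.isfile() and member.name.lower().endswith(wanted):
                tar.extract(member, dest_dir, **extra)
                extracted.append(member.name)
    return extracted
```

The returned list of member names makes it easy to check what was actually pulled out of each archive.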

Listing the contents of a LZMA compressed file?

Is it possible to list the contents of a LZMA file (.7zip) without uncompressing the whole file? Also, can I extract a single file from the LZMA file?
My problem: I have a 30GB .7z file that uncompresses to >5TB. I would like to manipulate the original .7z file without needing to do a full uncompress.
Yes. Start with XZ Utils. There are Perl and Python APIs.
You can find the file you want from the headers. Each file is compressed separately, so you can extract just the one you want.
Download lzma922.tar.bz2 from the LZMA SDK files page on Sourceforge, then extract the files and open up C/Util/7z/7zMain.c. There, you will find routines to extract a specific archive file from a .7z archive. You don't need to extract all the data from all the entries; the example code shows how to extract just the one you are interested in. The same code has logic to list the entries without extracting all the compressed data.
I solved this problem by installing 7zip (https://www.7-zip.org/) and using the parameter l. For example:
7z l file.7z
The output has some descriptive information and the list of files in the archive. I then call this from Python using the subprocess module:
import subprocess
# Run 7-Zip's "l" (list) command and capture its text output; raises if 7z fails.
result = subprocess.run(["7z", "l", "file.7z"], capture_output=True, check=True)
output = result.stdout.decode("utf-8")
Don't forget to make sure the program 7z is accessible in your PATH variable. I had to do this manually in Windows.
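For the second half of the question (extracting a single file), a similar sketch can call 7z with its x command, naming the wanted member after the archive. The helper names and the example member path are hypothetical, and the sketch assumes the 7z executable is on PATH as described above.

```python
import subprocess


def build_extract_command(archive, member, out_dir):
    """Return the 7z command line that extracts one member with its full path."""
    # "x" extracts with full paths, "-o<dir>" sets the output directory
    # (no space after -o), and "-y" answers yes to any overwrite prompts.
    return ["7z", "x", archive, "-o" + out_dir, member, "-y"]


def extract_member(archive, member, out_dir):
    """Run 7z to extract a single member; raises CalledProcessError on failure."""
    subprocess.run(build_extract_command(archive, member, out_dir), check=True)
```

For example, extract_member("file.7z", "docs/report.txt", "out") would try to write out/docs/report.txt, where docs/report.txt stands in for a path taken from the listing.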
