Find and copy multiple pictures using PowerShell

I have a list in Excel of picture names that I need to find. Is it in any way possible to feed the list to PowerShell, find the pictures, and copy them all into one folder?
The list contains 1310 pictures, and there is a total of about 44k pictures spread across a huge number of folders; I think it was roughly 500k folders.
[Screenshot: how the image software has laid out the folder structure]
[Screenshot: exact number of files and folders; the last 14k pictures are in another main folder and not relevant for the list]

Your question is very broad, and I can only give a very general answer. This is clearly scriptable, but it might take a lot of learning and a lot of effort.
First, you might want to consider what the relationship is between the way pictures are named in the Excel sheet and the way the picture files are named in the folders.
If they follow the same naming rules, that gets one big problem out of the way.
Next, you need to learn how to export an Excel table to a CSV file.
Then you need to learn how to use Import-Csv and feed the results into a pipeline.
Then you need to feed the output of the pipeline into a foreach loop that contains a Copy-Item cmdlet.
If there is a single master folder that contains all the other folders that contain pictures, then you are in luck. Learn the -Path, -Recurse, and -Include parameters.
Perhaps someone who has already dealt with the same problem can provide you with code. But it may not do what you really want.
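To make those steps a little more concrete, a rough sketch could look like the following. Every specific in it is an assumption: the CSV path, the column name Name, the folder paths, and the extensions would all need to be adjusted to your setup.

$names = (Import-Csv -Path 'C:\temp\pictures.csv').Name   # column "Name" is assumed

# Make sure the destination exists, then walk the master folder once
New-Item -ItemType Directory -Path 'D:\Collected' -Force | Out-Null
$allFiles = Get-ChildItem -Path 'D:\Pictures' -Recurse -File -Include *.jpg, *.png

foreach ($file in $allFiles) {
    # Use $file.Name instead of $file.BaseName if the Excel list includes the extension
    if ($names -contains $file.BaseName) {
        Copy-Item -Path $file.FullName -Destination 'D:\Collected'
    }
}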

Related

for loop command with multiple variables

I am very new to the command line and scripting languages, and not so familiar with the terminology. I will try my best to explain my issue.
I am trying to edit/compress my ebooks in a bulk process, and this is where Calibre comes in.
Calibre is the ebook editing software I use to make changes to my epub files.
It's quite feature-rich and in most cases easy to use, but its one big downside is that whenever I edit/convert/read books in the GUI I first have to add the books to the library, which involves making a copy of every file into Calibre's predefined directory. And in the editor GUI I can only edit one file at a time.
I have thousands of ebooks to edit; it would take days that way, and it's also bad for my storage space. Big NO NO.
Thankfully, it is possible to compress images en masse from the command-line interface, with the help of a useful plugin that assists with handling bulk files.
for /r "C:\Users\foldername" %v in (*.epub) do calibre-debug -r "Editor Chains" "Compress Images" "%v" "%v.epub"
It works just as expected: it scans all epub files in that folder and its subfolders, and calls the plugin to do its job (compress images) and save the output files in the same folder, adding a ".epub" extension to the original file name to avoid a duplicate (e.g. this is a book.epub --> this is a book.epub.epub).
The only problem is that I have yet to find a way to save the output files in a different folder, with the same file name. Maybe there's something I am missing, or it's just not possible.
If anyone knows how, please let me know.
I've tried:
for /r "C:\Users\foldername" %v & "C:\Users\newfoldername" %x in (*.epub) do calibre-debug -r "Editor Chains" "Compress Images" "%v" "%x"
... and it obviously failed, haha.
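For what it's worth, a rough PowerShell equivalent of that loop is sketched below. It assumes the second path handed to calibre-debug can point into another folder, which I have not verified; the folder names are the same placeholders as above.

$source = 'C:\Users\foldername'
$dest   = 'C:\Users\newfoldername'
Get-ChildItem -Path $source -Recurse -Filter *.epub | ForEach-Object {
    $out = Join-Path $dest $_.Name   # same file name, different folder
    calibre-debug -r "Editor Chains" "Compress Images" $_.FullName $out
}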

How can I find duplicately named files in Windows?

I am organizing a large Windows folder with many subfolders (with subfolders, etc...), in which files have been saved multiple times in different locations. Can anyone figure out how to identify all files with duplicate names across multiple directories? Some ways I am thinking about include:
A command, or series of commands, that could be run from the command line (cmd). Perhaps DIR could be a start...
Possibly a tool that comes with Windows
Possibly a way to specify in search to find duplicate filenames
NOT a separate downloadable tool (those could carry unwanted security risks).
I would like to be able to know the directory paths and filename to the duplicate file(s).
Not yet a full solution, but I think I am on the right track, further comments would be appreciated:
From CMD (start, type cmd):
DIR "C:\mypath" /S > filemap.txt
This should generate a recursive list of files within the directories.
TODO: Find a way to have filenames on the left side of the list
From outside cmd:
Open filemap.txt
Copy and paste the results into Excel
From Excel:
Sort the data
Add logic in the next column to compare whether the current text equals the previous text (for the filename)
Filter on that column to identify all duplicates
To see where the duplicates are located:
Search filemap.txt for the duplicate filenames identified above and note their directory location.
Note: I plan to update this as I get further along, or if a better solution is found.
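If PowerShell counts as a tool that comes with Windows (it does, so nothing to download), a rough sketch of the same idea is to group every file name and keep only the groups that occur more than once; 'C:\mypath' is the placeholder from above:

Get-ChildItem -Path 'C:\mypath' -Recurse -File |
    Group-Object -Property Name |
    Where-Object { $_.Count -gt 1 } |
    ForEach-Object { $_.Group | Select-Object Name, DirectoryName }

Each line of output pairs a duplicated filename with the directory it lives in, which covers the "directory paths and filename" requirement.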

How to create a PowerShell 5 module with multiple .psm1 files

Let's say I have three classes named Dog, Cat and Tree in matching files Dog_file.psm1, Cat_file.psm1 and Tree_file.psm1. I need to create one PowerShell module from these three files. Reading through https://msdn.microsoft.com/en-us/library/dd878337(v=vs.85).aspx, I feel it can be achieved.
From reading various posts and articles I think the NestedModules property in the .psd1 is the way to go, but I'm not sure exactly what the folder structure should be, where the files need to be placed, or what the syntax of the .psd1 file is. So far I couldn't find any sample.
Has someone done this before? I'm using PowerShell 5.
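I haven't verified this exact class scenario, but based on the manifest documentation the usual layout is a folder named after the module (say, Animals, a made-up name here) containing the three .psm1 files next to a manifest Animals.psd1 along these lines:

@{
    ModuleVersion     = '1.0.0'
    GUID              = '5a2f0a10-1111-4222-8333-444455556666'   # placeholder; generate your own with New-Guid
    # Each nested module is loaded when the manifest is imported
    NestedModules     = @('Dog_file.psm1', 'Cat_file.psm1', 'Tree_file.psm1')
    FunctionsToExport = @()
}

Put the Animals folder somewhere on $env:PSModulePath and Import-Module Animals will load it. One caveat with classes specifically: classes defined in nested modules are not surfaced to the caller the way exported functions are, so you may end up needing a using module statement against the files; treat the manifest above as a starting point rather than a verified recipe.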

process 100K of image files with bash

Here is the script to optimize jpg images: https://github.com/kormoc/imgopt/blob/master/imgopt
There is a CMS with image files (not mine).
I assume there is a complicated structure of subdirectories, and the script just recursively finds all image files in a given folder.
The question is: how do I mark already-processed files so that on the next run the script won't touch them and will just skip them?
I don't know when the guys will want to add new files to it and process them. Also, I think renaming is not a good choice either.
I was thinking about a hash table or associative array that would be filled from a txt file at startup. But is it OK to have a 100K-item array in bash? That seems complicated for a script.
Any other ideas about optimization are also welcome.
I think the easiest thing to do is just output a file with a similar name per processed image file.
For example image1.jpg after being processed would have an empty file with a similar name e.g. .image1.jpg.processed.
Then when your script runs, it just checks, for the current image NAME.EXT, whether a file .NAME.EXT.processed exists. If the file doesn't exist, then you know it needs to be processed. No memory issues and no hash table needed, granted you will have 100K empty extra files.
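To make the idea concrete, here is a sketch of that check-then-mark pattern (written in PowerShell purely for illustration; the same exists-test and touch translate directly to the bash script). The folder path is a placeholder, and the actual optimizer call is only indicated by a comment.

Get-ChildItem -Path 'C:\cms\images' -Recurse -File -Include *.jpg, *.jpeg, *.png | ForEach-Object {
    $marker = Join-Path $_.DirectoryName (".{0}.processed" -f $_.Name)
    if (-not (Test-Path $marker)) {
        # ... run the optimizer against $_.FullName here ...
        New-Item -ItemType File -Path $marker | Out-Null   # mark the image as done
    }
}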

Bash - Identify files not referenced by other files

I have a website that runs off an OpenWRT router. I'd like to optimize the site by removing any files that aren't being used. Here is my directory structure...
/www/images
/www/js
/www/styles
/www/otherSubDirectories <--- not really named that
I'm mostly concerned about identifying images that are not used, because those take the most space. But it would also be nice to identify style sheets and JavaScript files that are not being used. So, is there a way I can search /www and all its subdirectories and files, and print a list of files in /www/images, /www/js, and /www/styles that are not referenced by any other files?
When I'm looking for files that contain a specific string I use this:
find . | xargs grep -Hn 'myImage.jpg'
That would tell me all files that reference the image. Maybe some variation of that?
Any help would be appreciated!
EV
Swiss File Knife is a very nice tool. It can find out which files are used (referenced) by other files through fuzzy content analysis.
Consider using a cross-reference program (for example, lxr) for this problem. (I haven't verified whether lxr can do the job, but I believe it can.) If an off-the-shelf cross-reference program doesn't work, look for an open-source cross-reference program in a language you know, and adapt it.
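Building on the find | xargs grep idea from the question, the brute-force approach is just a loop: take each asset's bare file name and see whether it appears in any other file under /www. A sketch of that loop is below (shown in PowerShell for illustration; the logic maps one-to-one onto find and grep, and only the /www paths come from the question).

$webRoot = '/www'
$assets  = Get-ChildItem -Path "$webRoot/images", "$webRoot/js", "$webRoot/styles" -Recurse -File
foreach ($asset in $assets) {
    # Search every other file under the web root for the asset's file name
    $hits = Get-ChildItem -Path $webRoot -Recurse -File |
        Where-Object { $_.FullName -ne $asset.FullName } |
        Select-String -SimpleMatch -Pattern $asset.Name -List
    if (-not $hits) { $asset.FullName }   # candidate for removal: nothing references it
}

It rescans the tree once per asset, so it is slow on large sites, but for a site small enough to live on a router that should be tolerable.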
