Bash - Identify files not referenced by other files

I have a website that runs off an OpenWRT router. I'd like to optimize the site by removing any files that aren't being used. Here is my directory structure...
/www/images
/www/js
/www/styles
/www/otherSubDirectories <--- not really named that
I'm mostly concerned about identifying images that are not used, because those take the most space. But it would also be nice to identify style sheets and JavaScript files that are not being used. So, is there a way I can search /www and all subdirectories and files and print a list of files in /www/images, /www/js, and /www/styles that are not referenced by any other files?
When I'm looking for files that contain a specific string I use this:
find . | xargs grep -Hn 'myImage.jpg'
That would tell me all files that reference the image. Maybe some variation of that?
Any help would be appreciated!
EV

Swiss File Knife is a very nice tool.
It can find out which files are used (referenced) by other files through fuzzy content analysis.

Consider using a cross-reference program (for example, lxr) for this problem. (I haven't verified that lxr can do the job, but I believe it can.) If an off-the-shelf cross-reference program doesn't work, look for an open source cross-reference program in a language you know, and adapt it.
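One rough way to script it, as a variation of the grep command in the question (a sketch only: it assumes GNU grep with --include support, which the router's BusyBox grep may not have, and it only catches files referenced by their literal file names, so anything referenced via a dynamically built name will be reported as unreferenced):
#!/bin/sh
# For each asset, check whether its bare file name appears anywhere in the
# site's text files; print the assets that are never mentioned.
cd /www || exit 1
for f in images/* js/* styles/*; do
    name=$(basename "$f")
    if ! grep -rqF --include='*.html' --include='*.htm' --include='*.php' \
                   --include='*.css' --include='*.js' -- "$name" .; then
        echo "unreferenced: $f"
    fi
done
It may be easier to run this against a copy of /www on a PC than on the router itself.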

Related

Find and copy multiple pictures using PowerShell

I have a list in Excel of picture names that I have to find. Is it in any way possible to feed the list to PowerShell, have it find the pictures, and copy them out into one folder?
The list is 1310 pictures, and there is a total of 44k pictures spread across a huge number of folders; I think it was around 500k folders.
(Screenshots omitted: the folder structure created by the image software, plus the exact number of files and folders. The last 14k pictures are in another main folder and are not relevant for the list.)
Your question is very broad, and I can only give a very general answer. This is clearly scriptable, but it might take a lot of learning and a lot of effort.
First, you might want to consider what the relationship is between the way pictures are named in the Excel sheet and the way the picture files are named in the folders.
If they follow the same naming rules, that gets one big problem out of the way.
Next, you need to learn how to export an Excel table to a CSV file.
Then you need to learn how to use Import-Csv and feed the stream into a pipeline.
Then you need to process the output of the pipeline with a foreach loop that contains a Copy-Item cmdlet.
If there is a single master folder that contains all the other folders that contain pictures, then you are in luck. Learn the -Path, -Recurse, and -Include parameters.
Perhaps someone who has already dealt with the same problem can provide you with code, but it may not do what you really want. (A rough sketch along those lines follows below.)
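To make those steps concrete, here is a rough, untested sketch of the pipeline. The CSV path, the column name Name, the master folder C:\Pictures and the destination C:\Found are all assumptions you would need to adjust:
# Read the list exported from Excel; assumes a CSV column called "Name".
$wanted = Import-Csv -Path 'C:\temp\pictures.csv' | Select-Object -ExpandProperty Name

# Index every picture under the master folder once, instead of searching per name.
$all = Get-ChildItem -Path 'C:\Pictures' -Recurse -File -Include *.jpg, *.png

foreach ($name in $wanted) {
    # Match on the file name without extension; adjust if the list includes extensions.
    $hits = $all | Where-Object { $_.BaseName -eq $name }
    if ($hits) {
        foreach ($hit in $hits) {
            # C:\Found must already exist.
            Copy-Item -LiteralPath $hit.FullName -Destination 'C:\Found'
        }
    } else {
        Write-Warning "Not found: $name"
    }
}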

Getting data from .dat files

I'm hoping somebody out there can help me with this. I'm attempting to extract some barcode data from some .dat files. It's a B-Tree file system with groups of three files: .dat, .ix, and .dia. The company that wrote the software (a long time ago) says that the program is written in Pascal. I have no experience in reverse engineering, but from what I've read it's most likely the only way to extract the data, as the structure of the database is contained in the code of the program. I'm looking for advice on where to start.
I suppose the first thing you need to do is to see if the exe you've got was written with Delphi. You can check with this: http://cc.embarcadero.com/Item/15250
Then, to see if the exe that creates those .dat files was made with 'TurboPower B-Tree Filer', I'd suggest you download and take a look at this: http://sourceforge.net/projects/tpbtreefiler/
At this step, you need to look at those sources to familiarize yourself with the class names used in 'TurboPower B-Tree Filer', to help determine whether any of those classes were used in your exe.
Then, using 'XN Resource Editor' [search the Internet for this] or, probably better, 'MiTeC Portable Executable Reader' [ http://www.mitec.cz/pe.html ], see if any class names are relevant.
If they are, then you're in luck, sort of. All you will need to do is write an app using 'TurboPower B-Tree Filer' to import the data in your .dat files and then export or manipulate it as you wish.
At that point, you might find this link useful.
TurboPower B-Tree Filer and Delphi XE2 - Anyone done it?
If, on the other hand, none of the above applies, I fear the only option is to reverse engineer the exe you have.

Handle single files while extracting tar.gz

I have a huge .tgz file which is structured inside like this:
./RandomFoldername1/file1
./RandomFoldername1/file2
./RandomFoldername2/file1
./RandomFoldername2/file2
etc
What I want to do is have each individual file extracted to standard output so that I can pipe it afterwards to another command. While doing this, I also need to get the RandomFoldername and the file name so that I can deal with them properly from within the second command.
Up to now the options I have are:
either extract the whole tarball and deal with the resulting files, which is not an option since the extracted contents don't fit on the hard drive, or
make a loop that pattern-matches each file and extracts one file at a time. This option solves the problem, but it is too slow because the whole tarball is swept each time for only one file.
While searching on how to solve this, I've started to fear that there is no better alternative to this.
Using the tar tool itself, I don't believe you have any other options.
Using a tar library for a language of your choice should allow you to do what you want, though, as it should let you iterate over the entries in the tarball one by one and extract/pipe each file as necessary.
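For example, here is a minimal sketch with Python's tarfile module (one possible library choice; the archive name and the final print are placeholders for your real processing):
import tarfile

# Open in streaming mode ("r|gz") so the archive is read in a single pass
# and is never fully extracted to disk.
with tarfile.open("archive.tgz", mode="r|gz") as archive:
    for member in archive:                      # one TarInfo entry at a time
        if not member.isfile():
            continue
        parts = member.name.split("/")          # e.g. "./RandomFoldername1/file1"
        foldername, filename = parts[-2], parts[-1]
        data = archive.extractfile(member).read()
        # hand data, foldername and filename to the second command here
        print(foldername, filename, len(data))
In streaming mode each member must be read before moving on to the next, which is exactly the single-pass behaviour you are after.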

Operating system independent image addressing

Since I use both Windows and Ubuntu on my computer, I'd like to be able to create documents independently of the operating system. I have one directory for logos and I want to use them in documents everywhere.
So far I have solved the problem of the different file addressing with these commands:
\newcommand{\winlogo}{D:/logo/}
\newcommand{\linlogo}{/media/DATA/logo/}
\includegraphics{\winlogo logo_bw}
How can I get behaviour like this:
if(parameter==windows){address:=D:/logo/}
elseif(parameter==linux){address:=/media/DATA/logo/}
else{error}
I've run into this problem as well, and I found that hard-coding the paths is an absolutely terrible idea. Also, keeping these directories in sync will eventually be a problem once your projects begin to grow.
The way I solved this was to put everything in version control (I like git, your mileage may vary).
Then I created an images folder, so my folder hierarchy looks like this:
Working-Dir
|-- images/
|-- myfile.tex
|-- nextfile.tex
Then in the preamble of my documents I use \usepackage{graphicx} and \graphicspath{{images/}}, which tells LaTeX to look for a folder called images and then look for the graphics inside that folder.
Then I do my work on one computer, push my finished work back to the repo, and when I switch computers I just pull from the repo. This way everything stays in sync, no matter which computer I'm working on.
Treating TeX source like source code has greatly improved my workflow and efficiency. I'd suggest similar measures for anyone dealing with a lot of LaTeX source.
EDIT:
From: http://en.wikibooks.org/wiki/LaTeX/Importing_Graphics
Graphics storage
There is a way to tell LaTeX where to look for images: for example, it can be useful if you store images centrally for use in many different documents. The answer is in the command \graphicspath, which you supply with an argument giving the name of an additional directory path you want searched when a file uses the \includegraphics command. Here are some examples:
\graphicspath{{c:\mypict~1\camera}}
\graphicspath{{/var/lib/images/}}
\graphicspath{{./images/}}
\graphicspath{{images_folder/}{other_folder/}{third_folder/}}
Please see http://www.ctan.org/tex-archive/macros/latex/required/graphics/grfguide.pdf
As you may have noticed, in the first example I've used the "safe" (MS-DOS) form of the Windows MyPictures folder, because it's a bad idea to use directory names containing spaces. Using absolute paths, \graphicspath does make your file less portable, while with relative paths (like the last example) you shouldn't have any problem with portability, but remember not to use spaces in file names. Alternatively, if you are using PDFLaTeX, you can use the package grffile, which will then allow you to use spaces in file names.
The last example should do you well: just specify multiple paths in \graphicspath. I wonder whether LaTeX will fail gracefully if you just include all of your paths in there (one for images, one for your logos on Linux, one for your logos on Windows)?
Mica, thank you once more, your advice works properly!
I've tested this code in the preamble; in a .sty file it doesn't work:
\usepackage{graphicx}
\graphicspath{{/media/DATA/logo/}{d:/logo/}{img/}}
where
/media/DATA/logo/ is the path to the directory with logos on a mounted partition in Linux,
d:/logo/ is the path to the same directory in Windows, and
img/ is the path to the images for the current document in the actual working directory,
and this code in the document:
\includegraphics{logo_zcu_c} (from the logo dir)
\includegraphics{hvof} (from the img/ dir)

Utility to hash and list files with identical contents?

UltraEdit saves temporary (i.e. unsaved/untitled) files as (regex) "Edit.\d+".
When UltraEdit is killed (I do this when some software nags me to reboot), I've noticed that it doesn't always save files in the same directory, so I end up with a bunch of "Edit.\d+" files scattered across my two hard disks, with a lot of identical contents.
So I'd like a free utility for Windows that can:
search my hard disks for all files whose filename matches "Edit.\d+",
generate a hash of each file so it has a signature, and
output a list of all identical files, so that I don't waste time checking files that exist in multiple copies on my hard disk and can just take care of the unique files.
Anyone knows of such a thing?
Thank you.
I found this: http://www.atory.com/Dupe_Checker/
I can't give you a review, but it looks legit.
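If you'd rather script it than install a utility, here is a rough PowerShell sketch along the same lines (it assumes PowerShell 4.0 or later for Get-FileHash, and the drive list is a placeholder to adjust):
# Find UltraEdit temp files on the given drives, hash them,
# and list only the groups whose contents are identical.
$drives = 'C:\', 'D:\'                       # adjust to your disks

Get-ChildItem -Path $drives -Recurse -File -ErrorAction SilentlyContinue |
    Where-Object { $_.Name -match '^Edit\.\d+$' } |
    Get-FileHash -Algorithm SHA256 |
    Group-Object -Property Hash |
    Where-Object { $_.Count -gt 1 } |
    ForEach-Object { $_.Group | Select-Object Hash, Path }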
