".JPG" or ".jpg"? [closed] - performance

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 6 years ago.
Does anyone know the real difference between these two types of files?
I know that I can't link .JPG as .jpg and vice versa, but there is no proper answer for why something like this exists. I also noticed that .JPG files are a little bit smaller than .jpg files. Is it a bug, or is saving "photo.JPG" better than "photo.jpg"?
// I swapped the file names to confirm that the naming is not the problem here. The only difference between these pictures is the enhancement.
Two forums which also had this question, but no acceptable answers:
1. https://in.answers.yahoo.com/question/index?qid=20070109235224AAQpacy
2. https://ubuntuforums.org/showthread.php?t=1372695

There is no difference. Filename extensions do not define the contents of a file or how they "perform" - the file contents do that. Or sometimes the file's mime-type, if that metadata is stored with the file and the OS/applications know to look for it.
The size difference is merely because of a slightly different algorithm, configuration settings, or decisions that the JPEG encoder made at the time the file was saved. Because JPEG is a lossy file format, it's possible to get different file sizes each time the file is saved.
Because Windows uses case-insensitive (by default) but case-preserving filesystems, the capitalization of the filename means nothing. All it says is that the user who saved the file, or the program that they used to save it, chose to use JPG in one case and jpg in another.
As Bakuriu correctly points out, the extension is 3 characters for historical/legacy reasons; there's no reason on modern systems that it needs to be JPG or jpg. It could be JPEG, jpeg, or even JPEG-2000 and be equally valid, as long as the system/application looking at the file knows how to read the file header and properly identify it as a JPEG image. Using file extensions to indicate file types is kind of an antiquated notion.

The file extension is meaningless BUT usually:
It gives a hint of the expected mime-type of the file
It determines how the file will be handled by the operating system
It has no effect on the actual MIME type of the file and is just a naming convention. The difference, however, is that NTFS is a case-sensitive file system (although this is turned off by default, and it only acts in a case-preserving way, as pointed out by alroc), so accessing the file can be problematic in a few circumstances.
To actually compare the files on a binary level and determine whether there is any difference, I suggest using something like Duplicate Files Finder, or any other software that computes a cryptographic hash you can compare.

Related

How to compress file on HFS programmatically?

macOS HFS+ supports transparent filesystem-level compression. How can I enable this compression for certain files via a programmatic API? (e.g. Cocoa or C interface)
I'd like to achieve the effect of ditto --hfsCompression src dst, but without shelling out.
To clarify: I'm asking how to make uncompressed file compressed. I'm not interested in reading or preserving existing HFS compression state.
There's afsctool, which is an open-source implementation of HFS+ compression. It was originally by the hacker brkirch (macrumors forum link, as he still visits there), but has since been expanded and improved a great deal by @rjvb, who is doing amazing things with it.
The copyfile.c file discloses some of the implementation details.
There's also a compression tool based on that: afsctool.
I think you're asking two different questions (and might not know it).
If you're asking "How can I make arbitrary file 'A' an HFS compressed file?", the answer is: you can't. HFS compressed files are created by the installer, and (technically[1]) only Apple can create them.
If you are asking "How can I emulate the --hfsCompression logic in ditto such that I can copy an HFS compressed file from one HFS+ volume to another HFS+ volume and preserve its compression?" the answer to that is pretty straight forward, albeit not well documented.
HFS compressed files have a special UF_COMPRESSED file flag. If you see that, the data fork of the file is actually an uncompressed image of a hidden resource. The compressed version of the file is stored in a special extended attribute. It's special because it normally doesn't appear in the list of attributes when you request them (so if you just ls -l@ the file, for example, you won't see it). To list and read this special attribute you must pass the XATTR_SHOWCOMPRESSION flag to both the listxattr() and getxattr() functions.
To restore a compressed file, you reverse the process: Write an empty file, then restore all of its extended attributes, specifically the special one. When you're done, set the file's UF_COMPRESSED flag and the uncompressed data will magically appear in its data fork.
[1] Note: It's rumored that the compressed resource of a file is just a ZIPed version of the data, possibly with some wrapper around it. I've never taken the time to experiment, but if you're intent on creating your own compressed files you could take a stab at reverse-engineering the compressed extended attribute.

Go: read block of lines in a zip file [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I need to read a block of n lines in a zip file as quickly as possible.
I'm a beginner in Go. For bash lovers, I want to do the same as the following (to get a block of 500 lines between lines 199500 and 200000):
time query=$(zcat fake_contacts_200k.zip | sed '199500,200000!d')
real 0m0.106s
user 0m0.119s
sys 0m0.013s
Any idea is welcome.
1. Import archive/zip.
2. Open and read the archive file as shown in the example right there in the docs. Note that in order to mimic the behaviour of zcat, you first have to check the length of the File field of the zip.ReadCloser instance returned by a call to zip.OpenReader, and fail if it is not equal to 1 (that is, if there are no files in the archive, or two or more files in it¹). Note also that you have to check whether the error value returned by zip.OpenReader equals zip.ErrFormat; if it does, you have to close the returned zip.ReadCloser and try to reinterpret the file as being gzip-formatted (step 4).
3. Take the first (and sole) File member and call Open on it. You can then read the file's contents from the returned io.ReadCloser. After reading, call Close() on that instance and then close the zip file as well. That's all. ∎
4. If step (2) failed because the file did not have the zip format, test whether it is gzip-formatted. To do this, you follow basically the same steps using the compress/gzip package. Note that, contrary to the zip format, gzip does not provide file archival: it is merely a compressor, so there is no meta information about any files in a gzip stream, just the compressed data. (This fact is underlined by the difference in the names of the packages.) If an attempt to open the same file as a gzip stream returns the gzip.ErrHeader error, you bail out; otherwise you read the data, after which you close the reader. That's all. ∎
To process just the specific lines from the decompressed file, you need to:
1. Skip the lines before the first one to process.
2. Process the lines up to, and including, the last one to process.
3. Stop processing.
To interpret the data read from an io.Reader or io.ReadCloser, it's best to use bufio.Scanner; see the "Example (Lines)" there.
P.S.
Please read this essay thoroughly to try to make your next question better than this one.
¹ You might as well read all the files and interpret their contents as one contiguous stream; that would deviate from the behaviour of zcat, but it might be better. It really depends on your data.

Textstream to read non-text files

Is using the Microsoft Scripting FileSystemObject's OpenTextFile method (to set a TextStream-typed or untyped variable) with open type = 8 (for appending), and seeing if that line of code can execute without error, a reasonably reliable way to ascertain whether the file is locked in any of the typical ways? That is, another user or program has it open or locked in use, or it actually has a Read Only file attribute (but that last thing is not my primary goal; yes, I already know about reading attributes).
I've heard of doing this, but I'm just wanting to get some input. Obviously, the documentation on opentextfile generally focuses on the apparent assumption that you are actually working with TEXT files.
But my question then is two-fold:
Is the simple test of seeing whether OpenTextFile(path, 8) executes successfully pretty much a green light to assume the file is not locked for some reason?
Will this work for other file types, like .docx, .pdf, etc.? I know the line of code seems to work, but is it equally applicable to the question of whether the file is locked for some reason?

Why I can't rename a file that is in use [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question appears to be off-topic because it lacks sufficient information to diagnose the problem. Describe your problem in more detail or include a minimal example in the question itself.
Closed 8 years ago.
I just wonder why I can't rename a file that is opened, or in use by another program.
What is the purpose of that?
The question is based on a false premise: you most certainly can rename a file that is in use on the common file systems used on Windows. There is very little a process can do to prevent this, short of changing the ACL on the file to deny access, and that is exceedingly rare.
Locking a file protects the file data, not the file metadata.
This feature has many uses, most notably the ReplaceFile() winapi function depends on it. Which is the way a program can save a file even if another process has it locked.
The one thing you cannot do is rename the file to move it to a different drive, because that requires much more work than simply altering or moving the directory entry of the file: it also requires copying the file data from one drive to another, and that, of course, is going to fail when the file data is locked.
Because the file is currently in use, you cannot change its name.
When a file is opened, a process is created for it; you cannot change the name of a process at runtime.
I hope this answers the question.
It's a design decision resulting in less complicated behavior. When a file F is opened by process A, you must assume A works with the name of F as useful information, e.g. displays it to the user, passes it around to some other process, stores it in configuration, an MRU list, or whatever. Hence, if process B renamed F, process A would now be working with invalid information. It is therefore generally safer to disallow such manipulations.

Deleted file recovery program using C C++ [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
I want to write a program that can recover deleted files from a hard drive (FAT32/NTFS partition, Windows). I don't know where to start. What should be the starting point for this? What should I read to pursue it? Which system-level structs should I study?
It's entirely a matter of the filesystem layout, how a "file" actually looks on disk, and what remains when a file is deleted. As such, pretty much all you need to understand is the filesystem spec (for each and every filesystem you want to support), and how to get direct block-level access to the HD data. It might be possible to reuse some code from existing filesystem drivers, but it will need to be modified to process structures that, from the point of view of the filesystem, are gone.
NTFS technical reference
NTFS.com
FAT32 spec
You should first know how file deletion is done in FAT32/NTFS, and how other undelete software works.
Undelete software understands the internals of the system used to store files on a disk (the file system) and uses this knowledge to locate the disk space that was occupied by a deleted file. Because another file may have used some or all of this disk space there is no guarantee that a deleted file can be recovered or if it is, that it won't have suffered some corruption. But because the space isn't re-used straight away there is a very good chance that you will recover the deleted file 100% intact. People who use deleted file recovery software are often amazed to find that it finds files that were deleted months or even years ago. The best undelete programs give you an indication of the chances of recovering a file intact and even provide file viewers so you can check the contents before recovery.
Here's a good read (but not so technical): http://www.tech-pro.net/how-to-recover-deleted-files.html
This is not as difficult as you think. You need to understand how files are stored in FAT32 and NTFS. I recommend you use WinHex, an application used for digital forensics, to check that your address calculations are correct.
For example, NTFS uses master file table records to store a file's data in clusters. unlink deletes a file in C, but if you look at the source code, all it does is remove the entry from the table and update the records. Use an app like WinHex to read the information in the master file table record. Here is some useful info:
The master boot record is at sector 0.
Hex 0x55AA marks the end of the MBR. Next will be the MFT.
The file name is in the MFT header.
There is a flag to denote folder or file (not sure where).
The "file located" flag tells whether a file is marked deleted. You will need to change this flag if you want to recover a deleted file.
You need the cluster size and the number of clusters, as well as the cluster number of where your data starts, to calculate the start address if you want to access data from the master file table.
Not sure about FAT32, but just use the same approach. There is a useful 21-minute YouTube video which explains how to use WinHex to access deleted file data on NTFS. I'm not sure which video exactly; just type in "winhex digital forensics recover deleted file". Once you watch it, things will become much clearer.
Good luck.
I just watched the 21-minute YouTube video on recovering files deleted on NTFS using WinHex. Don't forget the resident flag, which denotes whether the file is resident or not. This gives you some idea of how the file is stored: either in clusters, or just in the MFT data section if it is small. This may be required if you want to access the deleted data. The video is perfect to start with, as it contains all the offset byte positions needed to access most of the required information relative to the beginning of the file record. It even shows you how to do the address calculation for the start of the cluster. You will need to access the table in binary format using a pointer, adding offsets to the pointer to reach the required information. The only way to find a file is to go through the whole table and do a binary comparison of the filename, byte for byte. Some fields are little-endian, so make sure you use WinHex to check your address calculations.
