Working with hardlinks in *nix - hardlink

I have done the following operations:
echo "test" >> t1
echo "test2" >> t2
ln t1 l1
cp t2 t1
cat l1
To my surprise, after overwriting t1 with t2 the hard link still worked. My understanding was that when you create a new version of a file, the hard link no longer points to the new version.
Why, after overwriting t1, does running cat on the hard link still show the content of the new t1?

According to the manpage of cp on OS X, this is intended behaviour:
-f      If the destination file cannot be opened, remove it and create a
        new file, without prompting for confirmation regardless of its
        permissions. (The -f option overrides any previous -n option.)
        The target file is not unlinked before the copy. Thus, any
        existing access rights will be retained.
-n      Do not overwrite an existing file. (The -n option overrides any
        previous -f or -i options.)
Meaning that cp does not remove the target file before copying by default: it opens the existing file and overwrites its contents in place, so the inode (and therefore the hard link) is preserved. The hard link will only break if one of the names is removed.
In order to consistently get the behaviour you are trying to achieve, you must remove the target file before copying.
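You can watch this happen by comparing inode numbers with ls -i (the inode numbers below are illustrative):
$ ls -i t1 l1       # both names share one inode
131090 l1  131090 t1
$ cp t2 t1          # cp rewrites t1's contents in place
$ ls -i t1 l1       # same inode afterwards: the link is intact
131090 l1  131090 t1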

Related

Taking back up of files while using Copy & Paste via command line

I am using the command cp -a <source>/* <destination> to copy and paste files into one particular destination. In the destination, the above command only replaces the files inside folders that are also present in the source; any other files present in the destination are left as they are. Before pasting, I want to take a backup of the files that are about to be replaced by the copy. Is there an option in the cp command that does this?
There is no such option in the cp command; you need to write a shell script. First run ls in your destination directory and store the output in a file such as history.txt. Then, just before the cp command, grep for the file you want to copy in the history file to check whether it is already present in the destination directory. If it is, back up the existing file with today's datestamp first, and then copy the same file name from source to destination. A minimal sketch of that approach follows (it tests the destination directly with -e instead of going through a history.txt, which amounts to the same check; src and dest are placeholder paths):
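#!/bin/sh
src=/path/to/source
dest=/path/to/destination
stamp=$(date +%Y%m%d)

for f in "$src"/*; do
    name=$(basename "$f")
    if [ -e "$dest/$name" ]; then
        # back up the existing file with today's datestamp
        cp -p "$dest/$name" "$dest/$name.$stamp"
    fi
    cp -a "$f" "$dest/"
done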
If you want to back up the destination files that would be overwritten by the copy, use the -b option, available in GNU cp:
cp -ab <source>/* <destination>
There are two caveats you should know about.
To my knowledge, this option is not available on non-GNU systems (like BSD systems).
It will ask for confirmation for each existing file in the target. You can reduce the problem with the -u option, but that is unusable in a script.
Since it appears you are trying to make a backup (copy files to another location without erasing them or overwriting those already there), you probably want to take a look at the rsync command. The same operation would be written:
rsync -ab --suffix=".bak" <source>/ <destination>
and rsync is much more flexible for handling this sort of thing.
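To see what the backups look like, suppose dest/ already holds an older report.txt (the file names here are made up):
$ rsync -ab --suffix=".bak" src/ dest/
$ ls dest/
report.txt  report.txt.bak
The overwritten version is kept as report.txt.bak instead of being lost.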

Using functions as an argument in Bash

I want to move a couple of files from point a to point b,
but I have to specify them manually:
mv /full/path/from/a /full/path/to/b
Sometimes there are 20 files which I have to move by hand. Instead of /full/path/from/a, can't I just use a function that returns all the files I want to move? In my case,
/full/path/to/b is a directory, the target directory to which all files with the extensions mp3, exe and mp4 must go:
mv ls *.{mp3,exe,mp4} /full/path/to/b
If I have to move a couple of files and I don't want to do it one by one, how can I streamline this?
The command mv ls *.{mp3,exe,mp4} /full/path/to/b in your question is not correct.
As pointed out in comments by @janos, the correct command is
mv *.{mp3,exe,mp4} /full/path/to/b
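If you want a reusable shortcut, as the question title suggests, you can wrap this in a shell function (movemedia is a made-up name; adjust the target path to yours):
movemedia() {
    # move matching media files from the current directory to the target
    mv -- *.mp3 *.exe *.mp4 /full/path/to/b/
}
Then cd into the source directory and run movemedia.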
mv can complain about a missing file if the file really is missing and/or the path is not accessible or not valid.
As I understand from your description, if you go to the source path manually you can move the file to the desired directory.
So it seems the path is valid and the file exists.
For mv to keep complaining that *.mp3 is not found (with a valid path and an existing file), the only reason that comes to mind is Bash's pathname expansion feature (enabled by default on my Debian).
Maybe for some reason pathname expansion is disabled on your machine.
Try enabling it with the command below, give mv the corrected command, and you should be fine.
$ set +f
PS: Check man bash about pathname expansion.
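To see the effect of pathname expansion for yourself (assuming the current directory holds a.mp3 and b.mp3):
$ set -f          # disable pathname expansion
$ echo *.mp3
*.mp3
$ set +f          # re-enable it
$ echo *.mp3
a.mp3 b.mp3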

Error copying directories at command line using cp

Mac OS X Yosemite v.10.10.5.
I am trying to use the cp command to copy one Git directory to another.
This command-line statement:
cp -r /path/to/dir/from/ /path/to/dir/to/
Returns this error:
cp: /path/to/dir/to/.git/objects/00/00ad2afeb304e18870d4509efc89fedcb3f128: Permission denied
This error is returned once for (what I believe, but haven't verified, to be) every file in the directory.
The first time I ran the command it worked properly, as expected, without error. But, without making any changes to any files, the second (and subsequent) times I ran the command, I got the error.
What's going on? And how can I fix this?
Edit:
In response to a question in the comment:
What does ls -l /path/to/dir/to/.git/objects/00/00ad2afeb304e18870d4509efc89fedcb3f128 show?
The answer is it shows:
-r--r--r-- 1 myusername staff 6151 May 6 00:45 /path/to/dir/to/.git/objects/00/00ad2afeb304e18870d4509efc89fedcb3f128
The reason you are getting Permission denied is that you are trying to overwrite a file that already exists in the destination directory and has read-only permissions set on it. Since it appears you want to overwrite it, you could simply remove the destination directory, if it exists, before the copy operation. Also, you should use -R, not -r; from the manpage:
Historic versions of the cp utility had a -r option. This implementation supports that option; however, its use is strongly discouraged, as it does not correctly copy special files, symbolic links, or fifo's.
Using a command such as this should resolve your issue:
[[ ! -d dest ]] || rm -rf dest ; cp -R src dest
The above checks whether dest exists; if it does, it removes it recursively, then copies the source to dest.
You may want cp -rp for this operation. -p preserves the file's attributes, including the user and group IDs, permissions, and timestamps. Try starting over using -p and see if that solves the issue.
Another reason you might be seeing this issue is that the permission really is denied; that is, you're trying to copy into a folder owned by another user without superuser privileges.
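You can reproduce the read-only overwrite failure in isolation (all names here are made up):
$ mkdir -p src dest
$ echo old > dest/f && chmod 444 dest/f    # read-only file at the destination
$ echo new > src/f
$ cp src/f dest/f
cp: dest/f: Permission denied
The first copy into an empty dest succeeds; every subsequent copy hits the read-only file, which is exactly the pattern with .git/objects, since Git creates its object files read-only.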

Getting directory of a file in Unix

I have a requirement where I need to copy some files from one location to another (where the file may already exist). While doing so,
1. I need to take a backup if the file already exists.
2. Copy the new file to the same location.
I am facing a problem with point 2. While trying to get the destination path for copying the files, I am unable to extract the directory of the file. I tried various options of the find command, but was unable to crack it.
I need to trim the file name from the full file path so that the directory can be used in the cp command. I am new to shell scripting; any pointers are appreciated.
You can use
cp --backup
`-b'
`--backup[=METHOD]'
     *Note Backup options::. Make a backup of each file that would
     otherwise be overwritten or removed. As a special case, `cp'
     makes a backup of SOURCE when the force and backup options are
     given and SOURCE and DEST are the same name for an existing,
     regular file. One useful application of this combination of
     options is this tiny Bourne shell script:

          #!/bin/sh
          # Usage: backup FILE...
          # Create a GNU-style backup of each listed FILE.
          for i; do
            cp --backup --force -- "$i" "$i"
          done
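For a quick illustration of the default behaviour, GNU cp renames the file that would be overwritten with a ~ suffix (the file names here are made up):
$ cp --backup new.txt existing.txt
$ ls
existing.txt  existing.txt~  new.txt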
If you need only the filename, why not run
basename /root/wkdir/index.txt
and assign it to a variable? It returns just the filename.
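Since you actually need the directory part rather than the file name, note that dirname is the complement of basename:
$ fullpath=/root/wkdir/index.txt
$ dirname "$fullpath"
/root/wkdir
$ basename "$fullpath"
index.txt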

How can I find a directory-diff of millions of files to script maintenance?

I have been working on how to verify that millions of files that were on file system A have in fact been moved to file system B. While working on a system migration, it became evident that all the files needed to be audited to prove that they had been moved. The files were initially moved via rsync, which does provide logs, although not in a format that is helpful for an audit. So, I wrote this script to index all the files on system A:
#!/bin/bash
# Get directories and file list to be used to verify proper file moves have worked successfully.
LOGDATE=`/usr/bin/date +%Y-%m-%d`
FILE_LIST_OUT=/mounts/A_files_$LOGDATE.txt
MOUNT_POINTS="/mounts/AA /mounts/AB"
touch $FILE_LIST_OUT
echo TYPE,USER,GROUP,BYTES,OCTAL,OCTETS,FILE_NAME > $FILE_LIST_OUT
for directory in $MOUNT_POINTS; do
    # format: type,user,group,bytes,octal,octets,file_name
    gfind $directory -mount -printf "%y","%u","%g","%s","%m","%p\n" >> $FILE_LIST_OUT
done
The file indexing works fine and takes about two hours to index ~30 million files.
Side B is where we run into issues. I have written a very simple shell script that reads the index file, tests whether each file is there, and then counts how many files exist, but it runs out of memory while looping through the 30 million lines of indexed file names. Effectively, it runs the little bit of code below in a while loop, incrementing counters for files found and not found.
if [ -f "$FILENAME" ]; then
    echo "file found"
    found=$((found + 1))
else
    echo "file not found"
    missing=$((missing + 1))
fi
My questions are:
Can a shell script do this type of reporting from such a large list? A 64-bit Unix system ran out of memory while trying to execute this script. I have already considered breaking the input up into smaller chunks to make it faster. Currently it can
If a shell script is inappropriate, what would you suggest?
You just used rsync; use it again...
--ignore-existing
This tells rsync to skip updating files that already exist on the destination (this does not ignore existing directories, or nothing would get done). See also --existing.
This option is a transfer rule, not an exclude, so it doesn’t affect the data that goes into the file-lists, and thus it doesn’t affect deletions. It just limits the files that the receiver requests to be transferred.
This option can be useful for those doing backups using the --link-dest option when they need to continue a backup run that got interrupted. Since a --link-dest run is copied into a new directory hierarchy (when it is used properly), using --ignore-existing will ensure that the already-handled files don’t get tweaked (which avoids a change in permissions on the hard-linked files). This does mean that this option is only looking at the existing files in the destination hierarchy itself.
That will actually fix any problems, at least in the same sense that any diff list based on file-exist tests could fix them: using --ignore-existing means rsync only does the file-exist tests (it will construct the diff list as you request and use it internally). If you just want information on the differences, look at --dry-run and --itemize-changes.
Let's say you have two directories, foo and bar. bar has three files, 1, 2, and 3, plus a directory quz, which has a file 1. The directory foo is empty:
Now, here is the result,
$ rsync -ri --dry-run --ignore-existing ./bar/ ./foo/
>f+++++++++ 1
>f+++++++++ 2
>f+++++++++ 3
cd+++++++++ quz/
>f+++++++++ quz/1
Note that you're not interested in the cd+++++++++ line -- that is just rsync reporting that it would create the directory. Now, let's add a file called 1 in foo, and use grep to remove the directory lines,
$ rsync -ri --dry-run --ignore-existing ./bar/ ./foo/ | grep -v '^cd'
>f+++++++++ 2
>f+++++++++ 3
>f+++++++++ quz/1
f is for file. The +++++++++ means the file doesn't exist in the DEST dir.
Here is the bonus: remove --dry-run and it will go ahead and make the changes for you.
Have you considered a solution such as kdiff3, which will diff directories of files ?
Note the feature for version 0.9.84
Directory-Comparison: Option "Full Analysis" allows to show the number
of solved vs. unsolved conflicts or deltas vs. whitespace-changes in
the directory tree.
There is absolutely no problem reading a 30-million-line file in a shell script. The reason your process failed was most likely that you tried to read the file entirely into memory, e.g. by doing something wrong like for i in $(cat file).
The correct way of reading a file is:
while IFS= read -r line
do
    echo "Something with $line"
done < someFile
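Putting the pieces together, here is a sketch of the counting pass (it assumes the CSV layout produced by the indexing script above, with the path in the last field, and that no path contains a comma; the index file name is a placeholder):
#!/bin/bash
found=0
missing=0
{
    IFS= read -r _header    # skip the CSV header line
    while IFS=, read -r type user group bytes octal path; do
        if [ -e "$path" ]; then
            found=$((found + 1))
        else
            missing=$((missing + 1))
        fi
    done
} < /mounts/A_files_index.txt
echo "found=$found missing=$missing"
This reads one line at a time, so memory use stays constant no matter how many lines the index has.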
A shell script is inappropriate, yes. You should be using a diff tool:
diff -rNq /original /new
If you're not particular about the solution being a script, you could also look into meld, which would let you diff directory trees quite easily and you can also set ignore patterns if you have any.
