find / -iregex ".*large_files.*" runs endlessly - bash

I ran the command to find files matching ".*large_files.*":
[root@iz2ze9wve43n2nyuvmsfx5z ~]# find / -iregex ".*large_files.*"
/root/search_large_files.py
It found the file, but the cursor keeps blinking endlessly, even if I leave it alone for over half an hour.
What's the bug in my command that causes this problem?

Well, it may be that you just have massive file systems :-)
But, if you think it shouldn't be taking that long, you may well have mount points that are slower than normal, such as NFS-mounts where you have to go out over the network to get file information.
You could probably see a slow-down in that case if you just run find / on its own. If it goes out to an external location (like, I don't know, a ZX80 running in Antarctica), the output rate may show that, and you'll be able to identify where in the hierarchical structure it happens.
Another possibility is to restrict it to the actual file system you're on to minimise the chance it will go external. That would be by using the -xdev flag to prevent it crossing file systems. On my VM with one root file system but mounts for my C and D host drives, I cut the time down from two minutes to seventeen seconds.
Of course, that won't go to other local file systems, but you could, if necessary, write a script to run find (with -xdev) on all file systems marked ext4 (and whatever other ones you deem to be local).
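A minimal sketch of that idea, assuming findmnt is available and ext4 is the only file-system type you consider local:
findmnt -rn -t ext4 -o TARGET | while read -r mountpoint; do
    # search each local mount separately, never crossing into other file systems
    find "$mountpoint" -xdev -iregex ".*large_files.*"
done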

Related

How to debug potential CPU/RAM errors in Bash script on Linux

I have a relatively simple bash script that reads from a set of static input files, stores the input in bash variables and then does a bunch of processing over said input by calling out to external scripts (e.g. written in Python, Go, other bash scripts etc.) and using the intermediate results.
Lately I have been experiencing an intermittent problem where a single character seems to be getting altered somewhere during the processing which then causes subsequent errors. Specifically, a lot of the processing I'm doing involves slicing up a list of comma-separated records, and one of the values on each line is a unix timestamp, e.g. 1354245000.
What seems to be happening is that occasionally one of these values will get altered slightly, so I end up with a timestamp like 13542458=2 or 13542458>2 or 13542458;2 coming out of one of the intermediate scripts. This then subsequently gets fed into another script, which throws an exception when it tries to parse the value to an integer.
In the title of this question, I've suggested that this might be a potential CPU/RAM error. I know the general folly in thinking errors are caused by low level things like hardware/compilers etcetera, but the nature of this particular error makes me think it may be possible, for the following reasons:
The input files are the same on each invocation of the script, and the script only fails on some invocations.
I cannot think of any sources of randomness in the source code prior to where the script is breaking. It's basically just slicing and dicing csv input.
I cannot think of any sources of concurrency in the source code -- even the Go scripts aren't actually written to run anything concurrently.
This problem has only arisen in the last week or so. Prior to this time, this error would never occur.
While I haven't documented every erroneous character, they seem to often be quite close in the ASCII table to numeric values (=, >, ; etc). That said, I guess the Hamming distance between two characters quite far apart can be small also with changes to a high order bit.
The script often breaks at a different stage on different runs. i.e. I have a number of separate Python scripts, and sometimes it'll make it past one script and then the error will be induced in another. Other times it'll be induced on an earlier script.
What I'd like to know is, is there any methodical way to either confirm or rule out a hardware error for this problem? Or if it is a hardware problem, is it possibly undetectable by the operating system?
A bit of further info on the machine:
Linux 64-bit, Ubuntu 12.04
Intel i7 processor
16GB DDR3 RAM
I'm hoping someone can either point me to a reliable way to verify whether the hardware is to blame or otherwise a sound reason as to what else might be the cause.
Try booting into Memtest to check your memory.
While it is highly unlikely that it will be hardware, if you have exhausted your standard software debugging as suggested by @OliCharlesworth, here is an outline of hardware error investigation:
(1) check your log area for any `MCE` logs (machine check exceptions). If you find any, either in your log area (syslog) or sometimes in the present working dir or the / dir -- you have a hardware failure. (A concrete example follows after this list.)
(2) check your log area for disk errors, e.g.:
smartd[3963]: Device: /dev/sda [SAT], 34 Currently unreadable (pending) sectors
(3) check your drive integrity, e.g. (as root): `smartctl -a /dev/sda`. If you see any abnormality, run:
smartctl -t short /dev/sda (change the drive as required)
(4) download/install/boot to [memtest86](http://www.memtest86.com/download.htm)
(run the complete test)
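As a concrete starting point for step (1), something like the following may help (the exact log file names vary by distro, so treat the paths as examples):
dmesg | grep -i -e mce -e "machine check"
grep -i mce /var/log/syslog /var/log/kern.log 2>/dev/null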
If your CPU/motherboard has thrown no MCEs, you have no disk errors, your drive tests OK with smartctl, and you have no memory errors with memtest86, then recheck the software debugging. While additional hardware errors can still be present (bad capacitors, etc.), the likelihood at this point is software. Good luck.

Transferring (stopping, resuming) file using rsync

I have an external hard-drive that I suspect is on its way out. At the minute, I can transfer files from it, but only for a while. Unfortunately, I have one single file that's >50GB in size. My solution to this is to use rsync to transfer this one particular file a bit at a time, leave the drive to rest (switch it off), and resume a little while later.
I'm using rsync --partial --progress --inplace --append -a /Volumes/Backup\ Drive/chris/Desktop/Recording\ Sessions/S1/Session\ 1/untitled ~/Desktop/temp to transfer it. (The file is in the untitled folder, which I'm moving into the temp folder) However, after having stopped it and resumed it, it seems to be over-writing the previous attempt at the file, meaning I don't really get any further.
Is there something I'm missing? :X
Thank you ^_^
EDIT: Still don't know :\
Well, since this is a programming site, here's a program to do it. I tested it on OS X, but you should definitely test it on some small files first to make sure it does what you want:
#!/usr/bin/env python
import os
import sys

source = sys.argv[1]        # file to copy from
target = sys.argv[2]        # file to copy to
begin = int(sys.argv[3])    # byte offset to start at
end = int(sys.argv[4])      # byte offset to stop at (exclusive)

# Reopen an existing target for update, otherwise create it.
mode = 'r+b' if os.path.exists(target) else 'w+b'
with open(source, 'rb') as source_file, open(target, mode) as target_file:
    source_file.seek(begin)
    target_file.seek(begin)
    buffer = source_file.read(end - begin)
    target_file.write(buffer)
You run this with four arguments: the source file, the destination, and two numbers. The first number is the byte offset to start copying from (so on the first run you'd use 0). The second number is the byte offset to copy up to (not including). So on subsequent runs you'd always use the previous fourth argument as the new third argument (the new begin equals the old end). Just go on like that until it's done, using whatever sizes you like along the way.
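For instance, assuming the script were saved as copy_chunk.py (the script and file names here are just placeholders), copying the first gigabyte and then the next gigabyte on a later run might look like:
python copy_chunk.py big_session.aif ~/Desktop/temp/big_session.aif 0 1073741824
python copy_chunk.py big_session.aif ~/Desktop/temp/big_session.aif 1073741824 2147483648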
I know this is related to macOS, but the best way to get all the files off a dying drive is with GNU ddrescue. I have no idea if this runs nicely on macOS, but you can always use a Linux live-usb to do this. You'll want to open a terminal and be either root (preferred) or use sudo.
Firstly, find the disk that you want to backup. This can be done by running the following. Make note of the partition name or disk name that you want to back up. Hard drives/flash drives will typically use the format sdX, where X is the drive letter. Partitions will be listed under sdX1, sdX2... etc. NVMe drives/partitions follow a similar naming convention.
lsblk -o name,size,label,fstype,model
Mount and change directory (cd) to a writable location that is bigger than the drive/partition you want to back up.
Now we are going to do a first pass over the drive/partition, without stopping on problematic sections. This ensures that ddrescue does not cause any more damage by repeatedly trying to access a bad section. Think of it like a hole in a sweater -- you wouldn't want to keep picking at the hole or it would get bigger. Run the following, with sdX replaced with the drive/partition name from earlier:
ddrescue -d /dev/sdX backup.img backup.logfile
The -d flag uses direct disk access and ignores the kernel cache; the logfile is important in case the drive gets disconnected or the process stops somehow.
Then run ddrescue again with the -r flag, which retries bad sections three times. Feel free to run this a few times, but note that ddrescue cannot restore everything. In my experience it usually restores into the high 90% range, and many of the files it cannot restore are system files (i.e. not your personal files).
ddrescue -d -r3 /dev/sdX backup.img backup.logfile
Finally, you can use the image however you want. You can either mount it to copy the files off or use it in a virtual machine/burn it to a working drive with dd. Do note that the latter options will not always work if system critical files were damaged.
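For instance, to copy files off the image you could loop-mount it read-only (the mount point is arbitrary, and an image of a whole disk, rather than of a single partition, would also need an offset option pointing at the partition):
mkdir -p /mnt/rescued
mount -o loop,ro backup.img /mnt/rescued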
Good luck and remember to make backups!

bash protect HD from excessive use

How do I avoid breaking the HD? I have a bash script running on an ubuntu machine, with this meta code:
bash1.sh
while(true)
run bash2.sh
sleep 60 seconds
done
bash2.sh:
if(directory is empty): exit
process file
delete file
The directory is network shared, and the computer is not doing anything else. Once per day a new file arrives and is processed. (I do know that bash1.sh can be replaced by watch.) My concerns: bash1.sh is reading bash2.sh every time - that can presumably be avoided by only having one script!? And bash2.sh is reading the same directory every time. Is the directory really read from the HD, or does Ubuntu somehow cache the dir in RAM, so it is only read when something changes? Is it a problem that it is the same place on the HD that is read every time, or does it not matter because the HD is already spinning? If the HD never sleeps, does it matter if I set the loop time down to only one second?
Maybe the directory could be a pure RAM dir - how do I do that? Or is there some simple way to check if something has arrived over the network without reading the directory?
Reading a file or directory once every sixty seconds is not excessive use.
Seriously, don't worry about it.
If it's really worrying you, you can rethink your strategy for detecting the file.
For example, do you really need to know, within sixty seconds, that the file has arrived? Can it arrive any time during the day? Can some parts of the day be considered unlikely?
Using information like that, you can adjust the timing of checks to suit. If the file is supposed to be delivered after 4pm, don't check for it at all before then.
Check for it every sixty seconds between 4pm and 5pm, then every ten minutes after that.
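A rough sketch of that schedule in bash (the times are placeholders for whatever delivery window applies to you, and bash2.sh is the checking script from the question):
#!/bin/bash
while true; do
    hour=$((10#$(date +%H)))        # current hour, leading zero stripped
    if [ "$hour" -lt 16 ]; then
        sleep 600                   # before 4pm: don't check at all, just nap
        continue
    fi
    ./bash2.sh                      # bash2.sh already exits if the directory is empty
    if [ "$hour" -eq 16 ]; then
        sleep 60                    # 4pm-5pm: check every minute
    else
        sleep 600                   # after 5pm: every ten minutes
    fi
done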
These are all business-related decisions that can be made but I would still suggest that it's unnecessary. Provided you regularly back up your disks (and have standby hardware if you need to be back up in a hurry), you shouldn't lose anything.
In fact, if you were really paranoid, you could dedicate an entire machine for this, whose sole purpose is to receive the file via FTP and, when it arrives, send it across to your real processing box.
Put nothing else on that machine and have a warm standby (exactly the same software, IP address and so on, but powered down) so that, if it fails, the standby can be activated in minutes.
The real processing machine is then only written to once a day - that's unlikely to affect the disk lifetime.
That's probably too paranoid for my liking but it shows that there are ways to mitigate almost any problem.

command line wisdom for 2 panel file manager user

I want to upgrade my file-management productivity by replacing a 2-panel file manager with the command line (bash or Cygwin). Can the command line give the same speed? Please advise on a guru way of doing, e.g., a copy of some file in directory A to directory B. Is it heavy use of pushd/popd? Or creation of links to the most often used directories? What are the best practices and the day-to-day routine to manage files like a command-line master?
Can the command line give the same speed?
My experience is that commandline copying is significantly faster (especially in the Windows environment). Of course the basic laws of physics still apply, a file that is 1000 times bigger than a file that copies in 1 second will still take 1000 seconds to copy.
...(how to) copy some file in directory A to directory B.
Because I often have 5-10 projects that use similar directory structures, I set up variables for each subdir using a naming convention :
project=NewMatch
NM_scripts=${project}/scripts
NM_data=${project}/data
NM_logs=${project}/logs
NM_cfg=${project}/cfg
proj2=AlternateMatch
altM_scripts=${proj2}/scripts
altM_data=${proj2}/data
altM_logs=${proj2}/logs
altM_cfg=${proj2}/cfg
You can make this sort of thing as spartan or baroque as needed to match your theory of living/programming.
Then you can easily copy the cfg from 1 project to another
cp -p $NM_cfg/*.cfg ${altM_cfg}
Is it heavy use of pushd/popd?
Some people seem to really like that. You can try it and see what you think.
Or creation of links to most often used directories?
Links to dirs are, in my experience, used more for software development, where source code expects a certain set of dir names and your installation has different names. Then making links to supply the dir paths expected is helpful. For production data, it is just one more thing that can get messed up, or blow up. That's not always true; maybe you'll have a really good reason to have links, but I wouldn't start out that way, just because it is possible to do.
What are the best practices and a day-to-day routine to manage files of a command line master?
Per above, use a standardized directory structure for all projects, and have scripts save any small files to a directory your dept keeps in /tmp, i.e. /tmp/MyDeptsTmpFile (named to fit your local conventions).
It depends. If you're talking about data and logfiles, dated filenames can save you a lot of time. I recommend date formats like YYYYMMDD (or YYYYMMDD_HHMMSS if you need the extra resolution).
Dated logfiles are very handy: when a current process seems like it is taking a long time, you can look at the logfile from a week, a month, or six months ago (up to however much space you can afford) and quantify exactly how long this process took. Logfiles should also capture all STDERR messages, so you never have to re-run a bombed program just to see what the error message was.
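For example (the script name and log directory here are placeholders):
log=/data/logs/nightly_$(date +%Y%m%d_%H%M%S).log
./process_data.sh > "$log" 2>&1     # STDOUT and STDERR both land in the dated logfile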
This is Linux/Unix you're using, right? Read the man page for the cp command installed on your machine. I recommend using an alias like alias CP='/bin/cp -pi' so you always copy a file with the same permissions and with the original file's time stamp. Then it is easy to use /bin/ls -ltr to see a sorted list of files with the most recent files showing up at the bottom of the list (no need to scroll back to the top when you sort by time, reversed). Also, the '-i' option will warn you that you are going to overwrite a file, and this has saved me more than a couple of times.
I hope this helps.
P.S. as you appear to be a new user, if you get an answer that helps you please remember to mark it as accepted, and/or give it a + (or -) as a useful answer.

How to programmatically find the difference between two directories

First off; I am not necessarily looking for Delphi code, spit it out any way you want.
I've been searching around (especially here) and found a bit about people looking for ways to compare two directories (including subdirs), though they were using byte-by-byte methods. Second off, I am not looking for a difftool; I am "just" looking for a way to find files which do not match and, just as important, files which are in one directory but not the other and vice versa.
To be more specific: I have one directory (the backup folder) which I constantly update using FindFirstChangeNotification. The first time, though, I need to copy all files, and I also need to check the backup directory against the original when the application starts (in case something happened while the application wasn't running or FindFirstChangeNotification didn't catch a file change). To solve this I am thinking of creating a CRC list for the backed-up files, then running through the original directory computing the CRC for every file, and finally comparing the two lists. Then somehow look for files which are in one directory and not the other (again, vice versa).
Here's the question: Is this the fastest way? If so, how would one (roughly) get the job done?
You don't necessarily need CRCs for each file, you can just compare the "last modified" date for every file for most normal purposes. It's WAY faster. If you need additional safety, you can also compare the lengths. You get both of these metrics for free with the find functions.
And in your change notification, you should probably add the files to a queue and use a timer object to copy the new queued files every ~30sec or something, so you don't bog down the system with frequent updates/checks.
For additional speed, use the Win32 functions wherever possible, avoid any Delphi find/copy/getfileinfo functions. I'm not familiar with the Delphi framework but for example the C# stuff is WAY WAY WAY slower than the Win32 functions.
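To illustrate the same idea outside Delphi, here is what a name/size/mtime comparison looks like with GNU find on the command line (directory names are placeholders); in Win32 you would read the size and last-write time from the data each FindFirstFile/FindNextFile call returns:
# list "relative-path size mtime" for every file in each tree, then diff the listings;
# any line that appears on only one side is a missing or changed file
( cd original_dir && find . -type f -printf '%P %s %T@\n' | sort ) > original.lst
( cd backup_dir   && find . -type f -printf '%P %s %T@\n' | sort ) > backup.lst
diff original.lst backup.lst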
Regardless of your "not looking for a difftool", are you opposed to using Cygwin with its "diff" command in the shell? If you are open to this, it's quite easy, particularly using diff with the -r "recursive" option.
The following generates the differences between 2 Rails installs on my machine, and greps out not only information about differences between files but also, specifically by grepping for 'Only', files that are in one directory but not the other:
$ diff -r pgnindex pgnonrails | egrep '^Only|diff'
Only in pgnindex/app/controllers: openings_controller.rb
Only in pgnindex/app/helpers: openings_helper.rb
Only in pgnindex/app/views: openings
diff -r pgnindex/config/environment.rb pgnonrails/config/environment.rb
diff -r pgnindex/config/initializers/session_store.rb pgnonrails/config/initializers/session_store.rb
diff -r pgnindex/log/development.log pgnonrails/log/development.log
Only in pgnindex/test/functional: openings_controller_test.rb
Only in pgnindex/test/unit: helpers
The fastest way to compare one directory on the local machine to a directory on another machine thousands of miles away is exactly as you propose:
generate a CRC/checksum for every file
send the name, path, and CRC/checksum for each file over the internet to the other machine
compare
Perhaps the easiest way to do that is to use rsync with the "--dry-run" or "--list-only" option (or use one of the many applications that use the rsync algorithm, or compile the rsync algorithm into your application).
cd some_backup_directory
rsync --dry-run myname@remote_host:latest_version_directory .
For speed, the default rsync assumes, as Blindy suggested, that two files with the same name and the same path and the same length and the same modification time are the same.
For extra safety, you can give rsync the "--checksum" option to ignore the length and modification time and force it to compare (the checksum of) the actual contents of the file.
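For example, a checksum-based dry run of the earlier command might look like this (-r added so the comparison recurses into the directory; the host and path are placeholders as before):
rsync --dry-run --checksum -r myname@remote_host:latest_version_directory .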
