I'm writing a script that reads from a log file and sends each new line through netcat:
tail -f /tmp/archivo.txt | grep --line-buffered 'status=bounced' | while read -r LINE0
do
    echo "${LINE0}"
    echo "${LINE0}" > /tmp/mail-line.log
    netcat localhost 5699 < /tmp/mail-line.log
    sleep 1s
done
When I first launch this script it sends the data properly, but when a new line is added it doesn't work unless I relaunch the script. Any ideas?
Thank you.
Edit:
As @Kamil Cuk suggested, I tried doing only the echos; it didn't work.
What's happening?:
Well, I was adding the new data using gedit, and this didn't work with the -f flag, but it did with the -F flag, which reports that archivo.txt has been replaced. I tried echo "New line with status=bounced" >> archivo.txt and it worked. So I'm assuming gedit somehow replaces the file when it saves, and with -f tail doesn't show anything; that's why it wasn't working.
Under the hood, files in the filesystem are identified by numbers called "inodes". A "file name" is just a text label that points to an inode. When you run tail -f filename, you start tailing the file behind the inode that filename currently points to; this is called "opening a file handle" because the tail program creates an open reference to the file through a "file handle".
What happened in your case is that you started tailing one file. Then your logging application (gedit :-) ) created a new file on a different inode and changed which inode filename refers to. The file at the inode you are tailing still exists, because tail still has an open file handle; it won't be removed until all programs close their file handles. But any new program that opens the file by name will get the new inode.
To get around this you need to open the file in "append mode" (there are other modes as well). "Append mode" means you are pushing data to the end of the file without creating a new inode. All real logging applications will do this since it is much faster than rewriting the entire file every time.
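The difference is easy to see with ls -i, which prints a file's inode number. A throwaway sketch in a temp directory (the file name just mirrors the question's archivo.txt):

```shell
cd "$(mktemp -d)"                # scratch directory so nothing real is touched
echo "first line" > archivo.txt
inode_before=$(ls -i archivo.txt | awk '{print $1}')

echo "appended line" >> archivo.txt            # append mode: same inode
inode_append=$(ls -i archivo.txt | awk '{print $1}')

# an editor-style save: write a new file, then rename it over the old one
echo "rewritten" > archivo.txt.new
mv archivo.txt.new archivo.txt                 # replace: new inode
inode_replace=$(ls -i archivo.txt | awk '{print $1}')

echo "$inode_before $inode_append $inode_replace"
```

The first two numbers match, the third does not. This is also why tail -F works where -f fails: -F follows the name and reopens the file when the inode behind it changes.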
Related
I have a script to automate adding movie data to my website and downloading the correct subtitles, using inotify to launch other scripts. So that it only runs on the new file, I need the complete file path, like "/var/www/html/movies/my movie (2020)/", and the file name, "my movie (2020).mp4".
I've tried different methods to get it working such as:
inotifywait -m -r --format "%e %w%f" /var/www/html/uploads/Videos/ -e create -e moved_to |
while read event fullpath; do
Above doesn't pick up any files.
inotifywait -mr --format "%w%f" /var/www/html/uploads/Videos/ -e create -e moved_to |
while read event fullpath; do
Works for movies like "Freaks (2018)" but not "Zombieland Double Tap (2019)", which becomes "ZombielandTap (2019)".
Without --format at all, it just completely messes up some directories:
2 fast 2 furious (2003) becomes:
/dir/22 Furious (2003)/ CREATE 2 Fast 2 Furious (2003).mp4
My exact need is to grab the path and the file name (I use both separately). Upon download completion in a non-monitored folder, the movie is moved and renamed from whatever the torrent name was to the actual name in the monitored directory. The final file is all I care about having the link for.
What can I do to get the path and file name either separately so I can add them together when I need it (preferred) or full path I can break up with expressions?
You have a couple of issues. In your first attempt, you monitor the directory, but you pipe the output of inotifywait to read. The pipe is not within any loop so it is a one-shot deal. You are left monitoring the directory, but your output is no longer connected to your read loop.
Your second attempt has the same issue, compounded by using --format "%w%f", which does not output the event, while still attempting to read event fullpath. Since your filename contains whitespace, the first word of the name is read into event and the rest into fullpath.
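You can reproduce that word-splitting without inotifywait at all. This sketch feeds a made-up path containing spaces to the same read statement:

```shell
# One line of output, as --format '%w%f' would produce it (path is made up)
line='/dir/Zombieland Double Tap (2019).mp4'

# read splits on whitespace: the first word fills $event,
# everything remaining lands in $fullpath
read -r event fullpath <<< "$line"

echo "event=$event"           # event=/dir/Zombieland
echo "fullpath=$fullpath"     # fullpath=Double Tap (2019).mp4
```

With no event field in the format string, the first chunk of the filename is silently consumed as if it were the event name.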
Since you are only concerned with files added to the directory, create should be the only event you need to monitor. Your use of %w%f is fine: since you are monitoring a directory, %w will contain the watched directory (watched_filename) and %f will contain the filename (event_filename).
You are using bash, so you can use process substitution in bash to establish recursive monitoring of the directory and feed filenames to a while loop to process the changes. You must quote the resulting variables throughout the rest of your script to avoid word-splitting depending on how you use them.
You can do something similar to:
while read -r fullpath
do
echo "got '$fullpath'"
done < <(inotifywait -m -r --format '%w%f' -e create "/var/www/html/uploads/Videos/")
This will continually read the absolute path and filename into the fullpath variable. As written, the watch simply echoes, e.g. "got '/var/www/html/uploads/Videos/name of new file.xxx'".
Example Use/Output
With the watch set on a test directory "my dir" (with spaces), creating files in that directory (also with spaces) results in, e.g.:
Setting up watches. Beware: since -r was given, this may take a while!
Watches established.
got '/home/david/tmpd/my dir/my movie1 (2020).mp4'
got '/home/david/tmpd/my dir/my movie2 (2020).mp4'
got '/home/david/tmpd/my dir/my movie3 (2020).mp4'
You can use fullpath in whatever further processing you need. Give it a try and let me know if you have further questions.
I am new to bash scripting and I have to create a script that will run on all computers within my group at work (so it's not just checking one computer). We have a spreadsheet that keeps certain file information, and I am working to automate the updating of that spreadsheet. I already have an existing python script that gathers the information needed and writes to the spreadsheet.
What I need is a bash script (cron job, maybe?) that is activated anytime a user deletes a file that matches a certain extension within the specified file path. The script should hold on to the file name before it is completely deleted. I don't need any other information besides the name.
Does anyone have any suggestions for where I should begin with this? I've searched a bit but not found anything useful yet.
It would be something like:
for folders and files in path:
if file ends in .txt and is being deleted:
save file name
To save the name of every file .txt deleted in some directory path or any of its subdirectories, run:
inotifywait -m -e delete --format "%w%f" -r "path" 2>stderr.log | grep '\.txt$' >>logfile
Explanation:
-m tells inotifywait to keep running. The default is to exit after the first event
-e delete tells inotifywait to only report on file delete events.
--format "%w%f" tells inotifywait to print only the name of the deleted file
path is the target directory to watch.
-r tells inotifywait to monitor subdirectories of path recursively.
2>stderr.log tells the shell to save stderr output to a file named stderr.log. As long as things are working properly, you may ignore this file.
>>logfile tells the shell to append all output to the file logfile. If you leave this part off, output is written to stdout and you can watch in real time as files are deleted.
grep '\.txt$' limits the output to files with .txt extensions.
Mac OSX
Similar programs are available for OSX. See "Is there a command like “watch” or “inotifywait” on the Mac?".
#!/bin/bash
python /home/sites/myapp/Main.py &> /home/sites/myapp/logs/init.log &
This script produces a log of approximately 1G/week.
When I manually delete init.log during runtime without restarting the script, it still writes the data to the now-missing init.log. init.log only becomes visible again when the script is restarted.
Is restarting the script the only way to see the log?
On a Unix system, when init.log is created, an inode is created. Every inode has a counter that counts all references to that file. A reference means a hard link, or the file being held open by a process. The file is only deleted when this counter drops back to zero.
So when stdout is redirected to init.log, its inode has the counter value 2 (referenced by the directory entry, and counted once because it is open). When rm (which uses the unlink function) deletes the file, this counter becomes 1: the file is no longer referenced by any directory entry, but the inode still exists. When the script finishes, the counter becomes 0 and the inode is deleted.
There is no easy way to read an inode that is not referenced by any directory entry.
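You can watch that reference counter with stat (GNU coreutils; the file names here are made up):

```shell
cd "$(mktemp -d)"                     # scratch directory
echo "data" > original
ln original hardlink                  # second directory entry, same inode

links_two=$(stat -c '%h' original)    # link count is now 2
rm hardlink
links_one=$(stat -c '%h' original)    # back to 1

echo "$links_two $links_one"
```

Open file descriptors are not shown in this on-disk link count, but the kernel tracks them the same way: the inode's data survives until both the links and the open descriptors are gone.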
A file is not actually deleted until nothing references it. In this case, you have deleted all directory entries, but your program still has an open file descriptor so the data is not completely removed until the program exits. Note that it will also continue to hog disk space.
In Linux, you can still view the file's contents in /proc/PID/fd/FD, where PID is the process's id and FD is the file descriptor that you are interested in. Once the program exits, the data is toast and the disk space can be reclaimed... so get your data while you can ;)
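Here is a minimal Linux-only sketch of that recovery, with the shell itself standing in for the logging process (the fd number 3 is arbitrary):

```shell
cd "$(mktemp -d)"
exec 3> deleted.log               # the shell now plays the role of the logger
echo "precious log data" >&3

rm deleted.log                    # unlink the name; fd 3 keeps the inode alive

# read the contents back through procfs while the fd is still open
recovered=$(cat /proc/$$/fd/3)
echo "$recovered"

exec 3>&-                         # closing the last reference frees the space
```

With a real process you would use its PID (e.g. from pgrep) instead of $$, and ls -l /proc/PID/fd to find which descriptor points at the deleted log.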
You shouldn't remove a log file if you know that a program still has open file descriptors on it. Instead, truncate the file with cat /dev/null > log.file or in bash just use > log.file.
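Truncating works because the inode, and every open file descriptor pointing at it, stays intact. A sketch, assuming the writer holds the log open in append mode:

```shell
cd "$(mktemp -d)"
exec 3>> app.log                  # append-mode writer, like a typical logger
echo "old noise" >&3

: > app.log                       # truncate in place instead of rm
echo "fresh entry" >&3            # O_APPEND seeks to the (new) end first

contents=$(cat app.log)
exec 3>&-
echo "$contents"
```

Note that a writer which opened the file without O_APPEND (plain > redirection, as in the question's script) keeps its old offset after truncation, leaving a sparse gap in the file, so >> redirection is the safer habit for long-running logs.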
I have a very large CSV file, over 2.5GB, that, when importing into SQL Server 2005, gives an error message "Column delimiter not found" on a specific line (82,449).
The issue is with double quotes within the text for that column, in this instance, it's a note field that someone wrote "Transferred money to ""MIKE"", Thnks".
Because the file is so large, I can't open it up in Notepad++ and make the change, which brought me to find VIM.
I am very new to VIM. I reviewed the tutorial document, which taught me how to change the file: 82449G to jump to the line, l to move over to the spot, x to delete the double quote.
When I save the file using :saveas c:\Test VIM\Test.csv, only a portion of the file is saved. The original file is 2.6 GB and the newly saved one is 1.1 GB. The original file has 9,389,222 rows and the saved one has 3,751,878. I tried using the G command to jump to the bottom of the file before saving, which increased the saved size quite a bit but still didn't save the whole file; before using G, the saved file was only 230 MB.
Any ideas as to why I'm not saving the entire file?
You really need a "stream editor", something similar to sed on Linux, that lets you pipe your text through it without trying to keep the entire file in memory. In sed I'd do something like:
sed 's/""MIKE""/"MIKE"/' < source_file_to_read > cleaned_file_to_write
There is a sed for Windows.
As a second choice, you could use a programming language like Perl, Python or Ruby to process the text line by line, writing each line out as it scans for the doubled quotes, changing the lines in question, and continuing until the file has been completely processed.
Vim might be able to load the file if your machine has enough free RAM, but it'll be slow. If it does load, you can search from normal mode using:
:/""MIKE""/
and manually remove a doubled-quote, or have VIM make the change automatically using:
:%s/""MIKE""/"MIKE"/g
In either case, write, then close, the file using:
:wq
In Vim, normal mode is the default state of the editor; you can get back to it with the ESC key.
You can also split the file into smaller more manageable chunks, and then combine it back. Here's a script in bash that can split the file into equal parts:
#!/bin/bash
fspec=the_big_file.csv
num_files=10                      # how many mini-files you want
total_lines=$(wc -l < "${fspec}")
((lines_per_file = (total_lines + num_files - 1) / num_files))   # ceiling division
split --lines="${lines_per_file}" "${fspec}" part.
echo "Total Lines = ${total_lines}"
echo "Lines per file = ${lines_per_file}"
wc -l part.*
I just tested it on a 1GB file with 61151570 lines, and each resulting file was almost 100 MB
Edit:
I just realized you are on Windows, so the above may not apply. You can use a utility like Simple Text Splitter, a Windows program that does the same thing.
When you're able to open the file without errors like E342: Out of memory!, you should be able to save the complete file, too. There should at least be an error on :w; a partial save without an error would be a severe loss of data and should be reported as a bug, either on the vim_dev mailing list or at http://code.google.com/p/vim/issues/list
Which exact version of Vim are you using? Using GVIM 7.3.600 (32-bit) on Windows 7/x64, I wasn't able to open a 1.9 GB file without running out of memory. I was able to successfully open, edit, and save (fully!) a 3.9 GB file with the 64-bit version 7.3.000 from here. If you're not using that native 64-bit version yet, give it a try.
I did the following:
nohup find / &
rm nohup.out
Oddly, the nohup command continued to run. I waited for a new file to be created. To my surprise, there was no such file. Where did the stdout of the command go?
Removing a file in UNIX does two things:
it removes the directory entry for it.
if no processes have it open and no other directory entries point to it (hard links), it releases the space.
Your nohupped process will gladly continue to write to the file that used to be called nohup.out, but is now known as nothing but a file descriptor within that process.
You can even have another process create a nohup.out, it won't interfere with the first.
When all hard links are gone, and all processes have closed it, the disk space will be recovered.
If you delete the nohup.out file, the directory entry is lost and the process keeps writing only to its file descriptor. But if what you want is to clean out nohup.out, just run this:
true > nohup.out
It will delete the contents of the file but not the file itself.
That's standard behaviour on Unix.
When you removed the file, the last link in the file system was removed, but the file was still open and therefore the output of find (in this case) was written to disk blocks in the kernel buffer pool, and possibly even to disk. But the file had no name. When find exited, no process or file (inode) referenced the file, so the space was released. This is one way that temporary files that will vanish when a program exits are created - by opening the file and then removing it. (This presumes you do not need a name for the file; clearly, if you need a name for the temporary, this technique won't work.)
cat /dev/null > nohup.out
from here