I'm having trouble monitoring a file for changes. I need to be able to know when a file changes, and when it does, I need the new line that was added. I intend to parse each line and find ones that match certain criteria, and act on information in those lines. I know the expected number of matching lines ahead of time, but I do not know how many lines in total will be added to the file, or where the matching lines will be.
I've tried two packages so far, to no avail.
fsnotify/fsnotify
As far as I can tell, fsnotify can only tell me when a file is modified, not what the details of the modification were. Since I need to know exactly what was added to the file, this is no good for me.
(As a side-question, can this be run in a loop? The example that I tried exited after just one modification. I need to monitor for multiple modifications.)
hpcloud/tail
This package tries to mimic the Unix tail command, but it seems to have its own issues. The output that I get includes timestamps and other data - I just want the added line, nothing else. Also, it seems to think a file has been modified multiple times, even when it's just one edit. Further, the deal breaker here is that it does not output the last line if the line was not followed by a newline character.
Delegating to tail
I came across this answer, which suggests to delegate this work to the tail command itself, but I need this to work cross-platform (specifically, macOS, Linux and Windows). I don't believe that an equivalent command exists on Windows.
How do I go about tackling this?
#user2515526,
Reporting what changed (a diff) is usually out of scope for file watchers. Consider that the changed file could be an image, in which case the watcher would need to keep several MB of diff in memory, and that there could be thousands of watched files.
However, as bad as it sounds, this may be exactly what you need to implement yourself (it depends on your app, of course, and could be fine for text files): keep a map of diffs, one per file, since the last modification. I can't say I like it, but it sounds like fsnotify has no support for the change details that you need.
Also, regarding your question about running in a loop, maybe you can get some hints here: https://github.com/kataras/iris/blob/8370d76910cdd8de043753ed81ae080eae8dc798/utils/file.go
It's a framework that lets you build a server that watches for TypeScript file changes, so it sounds similar to your case.
Cheers,
-D
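For what it's worth, here is a minimal sketch of the "keep state per file" idea using only the Go standard library, so it should behave the same on macOS, Linux and Windows. It is not how fsnotify or hpcloud/tail work internally. It polls the file, reads whatever was appended past a remembered byte offset, and prints only complete lines. The names splitLines and readFrom and the 500 ms interval are my own choices for illustration. It assumes the file only grows (truncation is not handled), and it flushes an unterminated last line once a poll finds no new data, which can split a line that the writer finishes later.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
	"time"
)

// splitLines separates buf into complete lines (those ending in '\n') and the
// unterminated remainder, which the caller keeps until more data arrives.
func splitLines(buf []byte) (lines []string, rest []byte) {
	for {
		i := bytes.IndexByte(buf, '\n')
		if i < 0 {
			return lines, buf
		}
		// Drop a trailing '\r' so Windows line endings are tolerated.
		line := bytes.TrimSuffix(buf[:i], []byte("\r"))
		lines = append(lines, string(line))
		buf = buf[i+1:]
	}
}

// readFrom returns everything in the file at path from byte offset onward.
func readFrom(path string, offset int64) ([]byte, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	if _, err := f.Seek(offset, io.SeekStart); err != nil {
		return nil, err
	}
	return io.ReadAll(f)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: follow <file>")
		return
	}
	path := os.Args[1]
	var offset int64   // bytes of the file consumed so far
	var pending []byte // partial last line seen so far

	// Loop forever, so any number of modifications is picked up.
	for {
		data, err := readFrom(path, offset)
		switch {
		case err != nil:
			fmt.Fprintln(os.Stderr, "read:", err)
		case len(data) > 0:
			offset += int64(len(data))
			var lines []string
			lines, pending = splitLines(append(pending, data...))
			for _, l := range lines {
				fmt.Println(l)
			}
		case len(pending) > 0:
			// Nothing new since the last poll: emit the unterminated line.
			fmt.Println(string(pending))
			pending = nil
		}
		time.Sleep(500 * time.Millisecond)
	}
}
```

The same read-from-offset step could be triggered from a watcher's write event instead of a timer; the polling loop is only there to keep the sketch dependency-free.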
Related
I'm writing a program which needs to look at a very large number of files, some of which are very large in size. I'd like to visit a file only once, unless it changes. If it changes I need to revisit it again.
The way I know of to do this is with datestamps. One can look at the modified date to see if it is newer than the last time you looked at the file. Obviously those can be changed programmatically, so I'm wondering if there is a way to determine if a file has changed other than that. (I'm thinking along the lines of a UUID for the file which is changed every time it is modified or an epoch counter, but I'm open to more exotic solutions)
You can monitor changes for these files, assuming you continue to run the whole time. Check the FindFirstChangeNotification API. You can take a look at this project as an example. Sysinternals also has a similar tool, I believe it's implemented in a similar way.
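As a point of comparison, the datestamp approach from the question can be sketched in a few lines of Go. This is not the FindFirstChangeNotification route suggested above, just the plain bookkeeping version, and the names fileState and needsVisit are made up for illustration. It remembers size and modification time per path and treats any difference (not only "newer") as a change. It assumes the map would be persisted between runs, and it will still miss an edit that preserves both size and timestamp.

```go
package main

import (
	"fmt"
	"os"
)

// fileState is what we remember about a file from the last visit.
type fileState struct {
	Size    int64
	ModNano int64 // modification time in Unix nanoseconds
}

// needsVisit reports whether path is new or differs from the remembered
// state, and records the current state for the next call.
func needsVisit(seen map[string]fileState, path string, cur fileState) bool {
	prev, ok := seen[path]
	seen[path] = cur
	return !ok || prev != cur
}

func main() {
	// In a real program this map would be loaded from and saved to disk.
	seen := map[string]fileState{}
	for _, path := range os.Args[1:] {
		info, err := os.Stat(path)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			continue
		}
		cur := fileState{Size: info.Size(), ModNano: info.ModTime().UnixNano()}
		if needsVisit(seen, path, cur) {
			fmt.Println("visit:", path)
		}
	}
}
```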
I got a partial answer to my problem before, and want to solve this problem fully now. The final line of my /Program Files/GNU/vim/_vimrc is
source /homedir/vimsession_file
The filenames that I edit do not change; only their content changes. But once in a while I create a new session file before I exit vim, using
:mks! /homedir/vimsession_file
Every time I start gVim, I get a message box listing all the files (which I load into my multiple tabs), with a line number and character count for each. More detail can be found in my original post here.
Currently, I am not using the solution proposed in the link given above. The solution I got there was to replace the final line of /Program Files/GNU/vim/_vimrc with the following line:
autocmd VimEnter * source /homedir/vimsession_file
The reason I stopped using the above solution is that my buffers were all getting wiped out (as described in the original post). So I was forced to rebuild my buffers every once in a while, whenever I restarted gVim.
I did search and read in order to solve this on my own, but the closest solution I saw was here on Stack Overflow. That solution did not work for me either, despite playing with the shortmess variable as suggested there. How can I stop this annoying message box, which pops up with an OK button before gVim starts? I want to suppress it, because the only info I get from it is the line and character count for each file. NOTE: I looked into /homedir/vimsession_file and it is about 3500 lines long. I noticed that each file name occurs with badd and later with the edit command. For example, lines 96 and 164 are as below:
Line 96 : badd +16 \Program\ Files\GNU\vim\_vimrc
........
Line 164: edit \Program\ Files\GNU\vim\_vimrc
This pattern repeats for all the other files that get loaded into multiple windows/tabs.
Wanted to post the answer here because there seem to be very few VIM experts who regularly look at stackoverflow. I didn't have the patience to wait many days expecting an answer. I found the answer to my own question, after reading the help file in vim called "starting.txt" which explains the startup/init process of vim and the different initialization files used. The following steps removed the annoying pop-up message box for me, and also made my VIM process start much faster than before due to simplification.
Following what is suggested in the starting.txt help file, I separated my numerous tabs/windows/files into different sessions. I recommend reading through this help file (or at least browsing it) if you are a regular user of VIM.
Previously I was lumping numerous files (from different projects) into one single vim session. This is messy and is not the recommended way to use a VIM session file. Before creating the different session files (one per project), I first saved my old vim session file so that I could reuse it while creating them. You will understand the process clearly if you look at the help file.
I cleaned up my startup procedure further by setting up a new viminfo file, by adding the "-i" parameter to vim.exe (icon) for starting. This was pointed to a new directory and hence gave a fresh start.
The main init files used by vim are viminfo, vimrc and the session file. Each is meant for a different purpose. My problem was caused by source /homedir/vimsession_file as the final line of my vimrc. So I removed that line, and instead issue that command manually now, after vim starts up (with an empty window). I can source different vimsession_files in order to load different sets of files (which belong together). On my machine this command takes about 1 second to load the many tabs/windows/files that belong to a single project/sub-project.
As pointed out in the original post linked in my question, there may be another way to resolve this by creating and looking at the vimlog file. But I didn't want to bother with that tedious process. The way I am set up now makes more sense to me, because I have various subprojects which properly belong in different sessions of VIM.
Looked around with numerous search strings but can't find anything quite like this:
I'm writing a custom log parser (ala analog or webalizer except not for webserver) and I want to be able to skip the hard work for the lines that have already been parsed. I have thought about using a history file like webalizer but have no idea how it actually works internally and my C is pretty poor.
I've considered hashing each line and writing the hashes out, then parsing the history file for their presence but I think this will perform poorly.
The only other method I can think of is storing the line number of the last parse and skipping until that number is reached the next time round. I am not sure what happens when the log is rotated.
Any other ideas would be appreciated. I will be writing the parser in ruby but tips in a similar language will help as well.
The solutions I can think of right now are bound to be brittle.
Even if you store the line number and later realize it would be past the length of the current file, what happens if old lines have been trimmed? You would start reading (well) after the last position.
If, on the other hand, you are sure your log files won't be tampered with and they will only be rotated, I only see two ways of doing what you want, and I'm not sure the second is applicable to you.
Anyway, here goes.
First solution
You store the last line you parsed along with a timestamp. At the next run, you consider all the rotated log files sorting them by their last modified date, figure out which one you read last time, and start reading from there.
I didn't think this through, there might be funny corner cases you will need to handle.
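Roughly, the bookkeeping for this first solution could look like the Go sketch below (the question mentions Ruby, but the logic translates directly). Everything here is hypothetical: the logFile type, the function names, and the use of Unix seconds for timestamps. It assumes the file read last time is the oldest one whose modification time is not before the saved timestamp, and that the saved last line can be matched exactly, which breaks down if the log contains identical lines.

```go
package main

import (
	"fmt"
	"sort"
)

// logFile describes one current or rotated log file.
type logFile struct {
	Name    string
	ModUnix int64 // last-modified time, Unix seconds
}

// pendingLogs returns the files modified at or after lastRun, oldest first.
// The first entry is assumed to be the file that was being read last time.
func pendingLogs(files []logFile, lastRun int64) []logFile {
	var out []logFile
	for _, f := range files {
		if f.ModUnix >= lastRun {
			out = append(out, f)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].ModUnix < out[j].ModUnix })
	return out
}

// linesAfter returns the lines following the last occurrence of lastLine,
// or all lines when lastLine is not present (nothing was parsed from here).
func linesAfter(lines []string, lastLine string) []string {
	for i := len(lines) - 1; i >= 0; i-- {
		if lines[i] == lastLine {
			return lines[i+1:]
		}
	}
	return lines
}

func main() {
	// Made-up data standing in for a directory listing and file contents.
	files := []logFile{{"app.log", 300}, {"app.log.2", 100}, {"app.log.1", 200}}
	for _, f := range pendingLogs(files, 150) {
		fmt.Println("read:", f.Name)
	}
	fmt.Println(linesAfter([]string{"old 1", "old 2", "new 1"}, "old 2"))
}
```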
Second solution
You create a background script that continuously watches the log file. A quick search on Google turned up this gem, but I'm not sure if that's even an option for you. Even then, you might want to combine this solution with the previous one in case your daemon gets interrupted (because that's clearly bound to happen at some point).
As you read the file and parse the lines keep track of the byte count. Save that. On next read, try to seek to that byte offset in the file. If the file is smaller than the byte count, it's a new file so start at the beginning.
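A minimal sketch of this byte-offset approach, written in Go rather than Ruby (the same steps map onto File#seek and File#size). The state file format (a single decimal number) and the names resumeOffset, loadOffset and parseNew are assumptions for illustration. Note the limitation already implied above: a rotated file that has grown past the saved offset will not be detected as new.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"strconv"
	"strings"
)

// resumeOffset picks where to start reading: the saved offset if the file is
// at least that long, otherwise 0 because the file must be a new one.
func resumeOffset(saved, size int64) int64 {
	if saved < 0 || size < saved {
		return 0
	}
	return saved
}

// loadOffset reads the saved offset; a missing or malformed file means 0.
func loadOffset(stateFile string) int64 {
	b, err := os.ReadFile(stateFile)
	if err != nil {
		return 0
	}
	n, err := strconv.ParseInt(strings.TrimSpace(string(b)), 10, 64)
	if err != nil {
		return 0
	}
	return n
}

// parseNew passes every complete line added since the last run to handle,
// then saves the new byte offset to stateFile.
func parseNew(logPath, stateFile string, handle func(line string)) error {
	f, err := os.Open(logPath)
	if err != nil {
		return err
	}
	defer f.Close()
	info, err := f.Stat()
	if err != nil {
		return err
	}
	offset := resumeOffset(loadOffset(stateFile), info.Size())
	if _, err := f.Seek(offset, io.SeekStart); err != nil {
		return err
	}
	r := bufio.NewReader(f)
	for {
		line, err := r.ReadString('\n')
		if err == io.EOF {
			break // an unterminated last line is left for the next run
		}
		if err != nil {
			return err
		}
		offset += int64(len(line))
		handle(strings.TrimRight(line, "\r\n"))
	}
	return os.WriteFile(stateFile, []byte(strconv.FormatInt(offset, 10)), 0o644)
}

func main() {
	if len(os.Args) < 3 {
		fmt.Fprintln(os.Stderr, "usage: parse <logfile> <statefile>")
		return
	}
	err := parseNew(os.Args[1], os.Args[2], func(l string) { fmt.Println("parsed:", l) })
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```

Counting bytes rather than lines means the next run can seek straight to the right place instead of re-reading and skipping everything before it.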
I want to upgrade my file management productivity by replacing a 2-panel file manager with the command line (bash or cygwin). Can the command line give the same speed? Please advise on a guru way to, e.g., copy some file in directory A to directory B. Is it heavy use of pushd/popd? Or creation of links to the most often used directories? What are the best practices and the day-to-day routine of a command line master for managing files?
Can commandline give same speed?
My experience is that commandline copying is significantly faster (especially in the Windows environment). Of course the basic laws of physics still apply, a file that is 1000 times bigger than a file that copies in 1 second will still take 1000 seconds to copy.
..(howto) copy of some file in directory A to the directory B.
Because I often have 5-10 projects that use similar directory structures, I set up variables for each subdir using a naming convention:
project=NewMatch
NM_scripts=${project}/scripts
NM_data=${project}/data
NM_logs=${project}/logs
NM_cfg=${project}/cfg
proj2=AlternateMatch
altM_scripts=${proj2}/scripts
altM_data=${proj2}/data
altM_logs=${proj2}/logs
altM_cfg=${proj2}/cfg
You can make this sort of thing as spartan or baroque as needed to match your theory of living/programming.
Then you can easily copy the cfg files from one project to another:
cp -p "${NM_cfg}"/*.cfg "${altM_cfg}"
Is it heavy use of pushd/popd?
Some people seem to really like that. You can try it and see what you think.
Or creation of links to most often used directories?
Links to dirs are, in my experience, used more for software development, where source code expects a certain set of dir names and your installation has different names. Then making links to supply the expected dir paths is helpful. For production data, it is just one more thing that can get messed up or blow up. That's not always true, and maybe you'll have a really good reason to have links, but I wouldn't start out that way just because it is possible to do.
What are the best practices and a day-to-day routine to manage files of a command line master?
Per the above, use a standardized directory structure for all projects.
Have scripts save any small files to a directory your dept keeps in the /tmp dir,
i.e. /tmp/MyDeptsTmpFile (named to fit your local conventions).
It depends. If you're talking about data and logfiles, dated fileNames can save you a lot of time. I recommend dateFmts like YYYYMMDD(_HHMMSS) if you need the extra resolution.
Dated logfiles are very handy, when a current process seems like it is taking a long time, you can look at the log file from a week ago and quantify exactly how long this process took, a week, month, 6 months (up to how much space you can afford). LogFiles should also capture all STDERR messages, so you never have to re-run a bombed program just to see what the error message was.
This is Linux/Unix you're using, right? Read the man page for the cp cmd installed on your machine. I recommend using an alias like alias CP='/bin/cp -pi' so you always copy a file with the same permissions and with the original file's timestamp. Then it is easy to use /bin/ls -ltr to see a sorted list of files with the most recent files showing up at the bottom of the list. (No need to scroll back to the top when you sort by time, reversed.) Also, the '-i' option will warn you that you are about to overwrite a file, and this has saved me more than a couple of times.
I hope this helps.
P.S. as you appear to be a new user, if you get an answer that helps you please remember to mark it as accepted, and/or give it a + (or -) as a useful answer.
I am working on a script that has become fairly convoluted. I suspect there are several sections that have nearly identical code. Can I (and how can I) open the file in vim, with two (or more) windows on the buffer, and diff the contents of the windows on the same file? vimdiff seems to work only on two files. If I make a copy of the file and try to vimdiff the two versions, the diff origin remains locked on the beginning of the file. Although I can unscroll-lock the windows and move them to the parts of the file I want to compare, the diffs do not show up. Any hints or tips? I could cut and paste the sections I want to compare into different files and then apply vimdiff, but then I risk getting lost in what section came from where when I try to patch the separate files together, and I feel sure there must be a more straightforward, easier way.
What I usually do is diff to a copy
:%w %.alt
:vert diffsplit %.alt
And then happily rearrange the 'alt' version so that the pseudo-matching bits get aligned.
Note that (presumably) git contains spiffy merge/diff cow-powers that should be able to detect sub-file moved block changes.
Although I haven't (yet) actually put this into practice, I have a hunch that the very nice git plugin fugitive for vim might be able to leverage some of this horsepower to make this easier. Note: I fully expect this to require scripting before being usable, but I still thought it would be nice to share this idea (perhaps you can share a script if you get to it first!)
An alternative solution that I've been using occasionally, and which works very nicely in my opinion, is linediff.vim.
It allows you to use visual mode to select two bodies of text from arbitrary buffers (or the same buffer, for that matter) and run vimdiff on them. The beauty of it is that when you edit and save the temporary diff buffers, you update the original buffers with the changes, without saving them.
One of my use-cases is when I'm resolving merge issues related to script refactoring and reordering, where a function has been moved and perhaps also modified. In order to make sure you do not lose any of the modifications coming in from either ancestor, you diff the two versions of the function alone by visually selecting them and running the linediff command.