How to handle this kind of edge case in a shell script? - shell

I have a program on Linux that writes logs to files at a regular interval (for
example, one file every 10 minutes), so I have a directory hierarchy like
20151001/00/00.00.
Here, 20151001 stands for the log dir for the day
2015(year)/10(month)/01(day), 00 stands for the first hour of the day,
and 00.00 stands for the log file for the first 10 minutes of the first hour
of the day.
In my shell script I need to detect whether any previous log file is empty.
For example, if the time is now 2015/10/09 09:32, the newest log file should be
20151009/09/09.03. The file 09.03 could be empty, but the
older files should not be. To simplify the problem, I only check
whether the previous file, 20151009/09/09.02,
is empty. However, I have to handle some edge cases, like:
If the time is now:
2015/10/09 09:01
the previous file is in another directory:
20151009/08/08.05
If the time is now:
2015/10/09 00:01
the previous file is:
20151008/23/23.05
Are there any powerful algorithms or tools to handle my problem, especially the
edge cases?
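One possible way to avoid special-casing the hour and day boundaries is to let date do the arithmetic: convert the time to epoch seconds, subtract 600, and format the result. The sketch below assumes GNU date (for -d) and the 20151009/09/09.03 naming described above. prev_log_path and prev_log_is_empty are hypothetical helper names, not part of any existing tool.

```shell
# Print the relative path of the log file for the 10-minute slot before
# the slot that contains the given time (default: now).
# Assumes GNU date; the time is interpreted in the local time zone.
prev_log_path() {
    now=${1:-now}
    epoch=$(date -d "$now" +%s) || return 1
    # Going back 600 seconds always lands in the previous slot, even when
    # that crosses an hour, day, month, or year boundary.
    stamp=$(date -d "@$((epoch - 600))" +'%Y%m%d/%H/%H.0%M')
    # %M is two digits; dropping the last one leaves the tens digit (0-5).
    printf '%s\n' "${stamp%?}"
}

# Succeed if the previous log file under log root $1 is missing or empty.
# $2 is an optional time, as for prev_log_path.
prev_log_is_empty() {
    path=$(prev_log_path "${2:-now}") || return 2
    [ ! -s "$1/$path" ]
}
```

For example, prev_log_path '2015-10-09 00:01' prints 20151008/23/23.05, and prev_log_path '2015-10-09 09:01' prints 20151009/08/08.05.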

Related

How can I cut several sound files using a script?

I am new to Praat and wondering if someone can help me find out how I can cut all my sound files with a script or in some other way.
I have about 100 sound files that I need for my research. They all have different lengths; some are 1 min and others are 3 min long.
I would like to keep only the first 22 sec of each sound file.
Thanks in advance!
Kind regards
Olga
The first step is to construct a script that extracts the initial 22 seconds of some specific sound object that is already open. In general, the easiest way to at least start a script is to do the thing manually once and then, in a Praat script window, paste the command history (with ctrl-h) to see what the underlying commands are. The manual approach is to look for "Extract part" under "Convert", which corresponds to the command
Extract part: 0, 22, "rectangular", 1, "no"
There is also a command to save a file as a wav file, so you would add that to the core of the script.
Then you need to add a loop that does this a number of times, to different files. You will (probably) need a file with wav file names, and some system for naming the output files, for example if you have "input1.wav", you might want to call the cut-down version "output1.wav". This implies some computation of the output file name based on the input file name, so you need to get familiar with how string manipulation works in Praat.
If you have that much sorted out, then the basic logic is
get next input file name
compute output name
open the input file
extract from that file
save the extracted file
remove the extract
remove the original
loop until no more files
I would plan on spending a lot of time trying to understand simple things like string variables, or object selection. I left out explicitly selecting objects since it is not necessarily required, but every command works on "the selected object" and it's easy to lose track of what is selected.
Another common approach is to beg a colleague to write it for you.

How to append current date to property file value every day in Unix?

I've got a property file which is read several times per day by an external application in order to process some files. One of the properties tells the app where to store the processed files. Application runs on Linux.
success_path=/u02/oapp/success
The problem is that every day several files are thrown into that path, and after several months I would have thousands of files in this flat folder.
Question: How can I append the current date to this property file so it would look like:
success_path=/u02/oapp/success/dd-MMM-yyyy
This would be updated every day at 12:00AM so for example today it would be
success_path=/u02/oapp/success/28-JAN-2017
The file is /u02/oapp/configuration/oapp.properties
Thanks in advance
Instead of appending current date to the property, add additional logic to the code that stores the processed files so that:
it takes the base directory from the property file (success_path in your case)
it creates a year/month/day directory to store the files
Something like:
/u02/oapp/success/year/month/day (as in `/u02/oapp/success/2017/01/01`)
or
/u02/oapp/success/yearmonth/day (as in `/u02/oapp/success/201701/01`)
or
/u02/oapp/success/yearmonthday (as in `/u02/oapp/success/20170101`)
If you don't have the capability to change the app's behavior, you might need to write a cron job that periodically moves the files external to the app.
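As a rough sketch of such a cron job, the function below moves every regular file that sits directly in the base directory into a yearmonthday subdirectory derived from the file's modification time. It assumes GNU date (for -r FILE); archive_by_day is a hypothetical name, and the choice of modification time as the grouping key is an assumption too.

```shell
# Move each regular file directly inside base directory $1 into a
# subdirectory named after the file's modification date (YYYYMMDD).
# Assumes GNU date, where -r FILE prints that file's modification time.
archive_by_day() {
    base=$1
    for f in "$base"/*; do
        # Skip subdirectories (including the dated ones created earlier).
        [ -f "$f" ] || continue
        day=$(date -r "$f" +%Y%m%d) || return 1
        mkdir -p "$base/$day" && mv -- "$f" "$base/$day/" || return 1
    done
}
```

Called as archive_by_day /u02/oapp/success, a file last modified on 28 January 2017 would end up in /u02/oapp/success/20170128.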
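Alternatively, rewrite the property in place with jq (a version that provides strflocaltime) and sponge (from moreutils), for example from a daily cron job:
jq -Rr 'select(startswith("success_path="))="success_path=/u02/oapp/success/"+(now|strflocaltime("%d-%b-%Y")|ascii_upcase)' /u02/oapp/configuration/oapp.properties | sponge /u02/oapp/configuration/oapp.properties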
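If jq and sponge are not available, a similar daily rewrite can be sketched with date, tr, and sed. This assumes GNU sed (for -i) and a base path without '|', '&', or backslash characters; set_dated_success_path is a hypothetical name. The base path is passed in explicitly so that each run replaces the previous date rather than appending to it.

```shell
# Rewrite the success_path line of properties file $1 so that it becomes
# "$2/<today>", with today formatted as dd-MMM-yyyy (e.g. 28-JAN-2017).
# Assumes GNU sed for in-place editing with -i.
set_dated_success_path() {
    props=$1
    base=$2
    # LC_ALL=C forces English month abbreviations; tr upper-cases them.
    today=$(LC_ALL=C date +%d-%b-%Y | tr '[:lower:]' '[:upper:]')
    # All other lines of the file are left untouched.
    sed -i "s|^success_path=.*|success_path=$base/$today|" "$props"
}
```

It could be called from a cron entry scheduled at midnight, for instance as set_dated_success_path /u02/oapp/configuration/oapp.properties /u02/oapp/success.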

Why doesn't replacing a file renew its creation time?

I use Windows. I have two files in two directories. One has creation time 2015/5/15 15:35, the other has creation time 2015/5/15 9:48. After I replace the second one with the first one, the modified times of the two files are the same, but why doesn't the creation time of the second file change? It is still 2015/5/15 9:48.

How to detect a file change in VBScript

I want to make a VBScript console app that prints "Edited" if a file has been modified. For example, if I have a text file with some notes in it and I add it to a folder, then when it is edited the program checks the folder and the files in it, and prints the name of each file along with whether it was modified or not.
How would I go about doing this? I am relatively new to VBScript; I have maybe 4 months of basic experience.
console.writeline("what do i do?")
console.writeline("and how do i do it")
I'm trying to do it as a console app, so the output I would like to see is:
File Checker
test.txt - Edited
test2.pptx - Un-edited
etc etc etc
If you need an immediate notification, WMI is probably the best route. But WMI will also require your process to be running (in a blocked state) all of the time. Alternatively, you could schedule a VBScript to be launched at some interval and it could check each file's last-modified date against a text file or database that you use to store the modification date the last time the script was run.
An even easier solution would be to just check if the modification time changed since the last run. For example, if your script runs every 10 minutes and you discover a file that was changed within the last 10 minutes, report it.
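With CreateObject("Scripting.FileSystemObject")
    For Each File In .GetFolder("c:\folder").Files
        ' DateDiff("n", ...) returns the difference in minutes.
        If DateDiff("n", File.DateLastModified, Now) < 10 Then
            ' File has been modified in the past 10 minutes.
            WScript.Echo File.Name & " - Edited"
        Else
            WScript.Echo File.Name & " - Un-edited"
        End If
    Next
End With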

How can I speed up Perl's readdir for a directory with 250,000 files?

I am using Perl's readdir to get a file listing. However, the directory contains more than 250,000 files, so readdir takes a long time (longer than 4 minutes) and uses over 80MB of RAM. As this was intended to be a recurring job every 5 minutes, this lag is not acceptable.
More info:
Another job will fill the directory (once per day) being scanned.
This Perl script is responsible for processing the files. A file count is specified for each script iteration, currently 1000 per run.
The Perl script is to run every 5 min and process (if applicable) up to 1000 files.
The file count limit is intended to allow downstream processing to keep up, as Perl pushes data into a database, which triggers a complex workflow.
Is there another way to obtain filenames from a directory, ideally limited to 1000 (set by a variable), which would greatly increase the speed of this script?
What exactly do you mean when you say readdir is taking minutes and 80 MB? Can you show that specific line of code? Are you using readdir in scalar or list context?
Are you doing something like this:
foreach my $file ( readdir($dir) ) {
    # do stuff here
}
If that's the case, you are reading the entire directory listing into memory. No wonder it takes a long time and a lot of memory.
The rest of this post assumes that this is the problem; if you are not using readdir in list context, ignore the rest of the post.
The fix for this is to use a while loop and call readdir in scalar context.
while ( defined( my $file = readdir $dir ) ) {
    # do stuff.
}
Now you only read one item at a time. You can add a counter to keep track of how many files you process, too.
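The same stop-after-N idea can be tried from the shell as a quick comparison, without loading the full listing into memory. This is only a sketch: it assumes a find that supports -maxdepth and file names without newlines, and first_n_files is a hypothetical name.

```shell
# Print at most $2 regular-file names (default 1000) from directory $1.
# head exits after $2 lines, which normally makes find stop early as well
# instead of walking all 250,000 entries.
first_n_files() {
    find "$1" -maxdepth 1 -type f -print | head -n "${2:-1000}"
}
```

The order is whatever the filesystem returns, as with readdir, so the batch is "some 1000 files", not the oldest or alphabetically first ones.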
The solution may lie at the other end: in the script that fills the directory.
Why not create a directory tree to store all those files, so that you have lots of directories, each with a manageable number of files?
Instead of creating "mynicefile.txt", why not "m/my/mynicefile", or something like that?
Your file system would thank you for that (especially if you remove the empty directories when you have finished with them).
This is not exactly an answer to your query, but I think having that many files in the same directory is not a very good thing for overall speed (including the speed at which your filesystem handles add and delete operations, not just listing, as you have seen).
A solution to that design problem is to have sub-directories for each possible first letter of the file names, and have all files beginning with that letter inside that directory. Recurse to the second, third, etc. letter if need be.
You will probably see a definite speed improvement on many operations.
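A small sketch of how such a prefix layout might be computed, using a two-level scheme like the "m/my/..." example above. shard_path is a hypothetical name, and cut -c is assumed to see single-byte characters at the start of the name.

```shell
# Print the sharded relative path for file name $1: a directory for the
# first character, then one for the first two characters, then the name.
shard_path() {
    name=$1
    first=$(printf '%s' "$name" | cut -c1)
    two=$(printf '%s' "$name" | cut -c1-2)
    printf '%s/%s/%s\n' "$first" "$two" "$name"
}
```

For example, shard_path mynicefile.txt prints m/my/mynicefile.txt. The script that fills the directory would create the parent directories with mkdir -p before writing each file.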
You're saying that the content gets there by unpacking zip file(s). Why don't you just work on the zip files instead of creating and using 250k files in one directory?
Basically, to speed it up you don't need anything specific in Perl, but rather at the filesystem level. If you are 100% sure that you have to work with 250k files in one directory (and I can't imagine a situation where that would be required), you're much better off finding a better filesystem to handle it than finding some "magical" Perl module that would scan it faster.
Probably not. I would guess most of the time is in reading the directory entry.
However, you could preprocess the entire directory listing, creating one listing file per 1000 entries. Then your process could work through one of those listing files each time and not incur the expense of reading the entire directory.
Have you tried just readdir() through the directory without any other processing at all to get a baseline?
You aren't going to be able to speed up readdir, but you can speed up the task of monitoring a directory. You can ask the OS for updates -- Linux has inotify, for example. Here's an article about using it:
http://www.ibm.com/developerworks/linux/library/l-ubuntu-inotify/index.html?ca=drs-
You can use Inotify from Perl:
http://metacpan.org/pod/Linux::Inotify2
The difference is that you will have one long-running app instead of a script that is started by cron. In the app, you'll keep a queue of files that are new (as provided by inotify). Then, you set a timer to go off every 5 minutes, and process 1000 items. After that, control returns to the event loop, and you either wake up in 5 minutes and process 1000 more items, or inotify sends you some more files to add to the queue.
(BTW, you will need an event loop to handle the timers; I recommend EV.)
