Alternative to using ant <modified> selector? - bash

I'm trying to find an alternative to a helpful piece of functionality provided by ant - the <modified> selector.
When specifying a set of files in ant, you can use the <modified> selector to only include files whose content has changed since the last time it was run.
The selector computes a value for a file, compares that to the value stored in a cache and selects the file if these two values differ.
Is there an existing way of doing this in bash? I don't want to use a full-blown build tool or similar just to return a list of modified file paths.

Related

Magic Chunks wildcard path

I want to replace the value property of multiple web.config file settings. For this we already use Magic Chunks so rather than write out multiple transformations, I was hoping to target every instance of my particular use.
The short question being: Is it possible to target multiple transformations with Magic Chunks using a wildcard expression?

Expressions in a build rule "Output Files"?

Can you include expressions in the "Output Files" section of a build rule in Xcode? Eg:
$(DERIVED_FILE_DIR)$(echo "/dynamic/dir")/$(INPUT_FILE_BASE).m
Specifically, when translating Java files with j2objc, the resulting files are saved in subfolders, based on the java packages (eg. $(DERIVED_FILE_DIR)/com/google/Class.[hm]). This is without using --no-package-directories, which I can't use because of duplicate file names in different packages.
The issue is in Output Files, because Xcode doesn't know how to search for the output file at the correct location. The default location is $(DERIVED_FILE_DIR)/$(INPUT_FILE_BASE).m, but I need to perform a string substitution to insert the correct path. However, any expression added as $(expression) gets ignored, as if it were never there.
I also tried to export a variable from the custom script and use it in Output Files, but that doesn't work either because the Output Files are transformed into SCRIPT_OUTPUT_FILE_X before the custom script is run.
Unfortunately, Xcode's build support is pretty primitive (compared to, say, make, which is thirty-odd years older :-). One option to try is splitting the Java source, so that the two classes with the same names are in different sub-projects. If you then use different prefixes for each sub-project, the names will be disambiguated.
A more fragile, but maybe simpler approach is to define a separate rule for one of the two classes, so that it can have a unique prefix assigned. Then add an early build phase to translate it before any other Java classes, so the rules don't overlap.
For me, the second alternative does work (Xcode 7.3.x) - to a point.
My rule is not for Java, but rather for Google Protobuf, and I tried to maintain the same hierarchy (like your Java package hierarchy) in the generated code as in the source .proto files. Indeed files (.pb.cc and .pb.h) were created as expected, with their hierarchies, inside the Build/Intermediates/myProject.build/Debug/DerivedSources directory.
However, Xcode usually knows to continue and compile the generated output into the current target - but that breaks as it only looks for files in the actual ${DERIVED_FILE} - not within sub-directories underneath.
Could you please explain in more detail what you mean by "Output Files are transformed into SCRIPT_OUTPUT_FILE_X"? I do not understand.

What is the most efficient way to make sure hadoop skips certain input files?

I have a hadoop application that, depending on a parameter, only needs certain (few!) input files from the input directory. My question is now: where is the best place (read: as early as possible) to skip those files? Right now I customized a RecordReader to take care of that, but I was wondering whether I could skip those files sooner? In my current implementation hadoop still has a huge overhead due to irrelevant files.
Maybe I should add that it is very easy to see whether I need a certain input file. If the filename starts with a parameter, it is needed. Structuring my input directory hierarchically might be a solution, but one that is not very likely for my project since every file would end up alone in its own directory.
I'd suggest filtering the input files by applying an appropriate pattern to the input Paths, as mentioned here: https://stackoverflow.com/a/13454344/1050422
Note that this solution doesn't consider subdirectories. Alter it to recursively visit all subdirectories within the base path.
I've had success with using the setInputPaths() method on TextInputFormat to specify a single String containing comma-separated file names.

How do you filter Ruby Find.find() results?

Find.find("d") {|path| puts path}
I want to exclude certain types of files, say *.gif, and directories.
PS: I can always add code inside my block to check for the file name and directory type, but I want find itself to filter files for me.
I don't think you can tell find to do that. You could try using Dir#[], which accepts file globs. If you are looking for particular types of files, or files that can be filtered with the file glob pattern language, it may be a better fit.
e.g.
Dir["d/**/*.{xml,png,css,html}"]
would find all the xml, png, css, and html files under the directory d.
Check out the docs for more info.
You can't make find do it, but Find may help: in the block, you need to check whether the current path is one of those you'd like to exclude or not; if so, then call Find#prune. This seems to be the standard idiom when using Find.
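The prune idiom described above could be sketched roughly as follows. The helper name `filtered_paths` and its `skip_dirs` and `skip_ext` parameters are hypothetical, chosen only for illustration; the only library calls used are `Find.find` and `Find.prune` from Ruby's standard `find` library.

```ruby
require "find"

# Hypothetical helper (not part of the Find library): walk +root+ and
# return regular paths only, leaving out files with the +skip_ext+ extension.
# Directories named in +skip_dirs+ are pruned, so Find never descends into them.
def filtered_paths(root, skip_dirs: [], skip_ext: ".gif")
  results = []
  Find.find(root) do |path|
    if File.directory?(path)
      # Find.prune skips this directory and everything beneath it.
      Find.prune if skip_dirs.include?(File.basename(path))
      next # directories themselves are never collected
    end
    next if File.extname(path).downcase == skip_ext
    results << path
  end
  results
end
```

Calling `filtered_paths("d", skip_dirs: ["tmp"])` would then list every non-gif file under d without reading the contents of any directory named tmp (the directory name here is just an example).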
If you decide to use Dir#[] instead, you may call reject on its result, passing a block to exclude certain types of files. However, note that, as far as I understand, Dir#[] reads all the contents of your d directory before you can filter, while Find#prune guarantees not to read the contents of pruned subdirectories if you call it within the block passed to Find#find.
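For comparison, the glob-then-reject variant might look like the sketch below. The helper name `glob_without_gifs` is hypothetical. Note that, as described above, the whole tree is globbed first and filtered afterwards, and that a plain `**/*` glob does not match dotfiles by default.

```ruby
# Hypothetical helper: glob everything under +root+, then reject
# directories and *.gif files from the result.
def glob_without_gifs(root)
  Dir.glob(File.join(root, "**", "*")).reject do |path|
    File.directory?(path) || File.extname(path).downcase == ".gif"
  end
end
```

This is shorter than the Find version, but it cannot skip whole subtrees early, so it is probably the better fit only when the directory is small or nothing needs pruning.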

Ruby library for manipulating XML with minimal diffs?

I have an XML file (actually a Visual C# project file) that I want to manipulate using a Ruby script. I want to read the XML into memory, do some work on it that includes changing some attributes and some text (fixing up some path references), and then write the XML file back out. This isn't so hard.
The hard part is, I want the file I write to look the same as the file I read in, except where I made changes. If the input file used double quotes, I want the output to use double quotes. If the input had a space before />, I want the output to do the same. Basically, I want the output to be the same as the input, except where I explicitly made changes (which, in my case, will only be to attribute values, or to the text content of an element).
I want minimal diffs because this project file is checked into version control -- and because the next time I make a change in Visual Studio, it's going to rewrite it in its preferred format anyway. I want to avoid checking in a bunch of meaningless diffs that will then be changed back again in the near future. I also want to avoid having to open the project in Visual Studio, make a change, and save, before I can commit my Ruby script's changes. I want my Ruby script to just make its changes, nothing more.
I originally just parsed the file with regexes, but ran into cases where I really needed an XML library because I needed to know more about child elements. So I switched to REXML. But it makes the following undesirable changes to my formatting:
It changes all the attributes from double quotes to single quotes.
It escapes all the apostrophes inside attribute values (changing them to &apos;).
It removes the space before />.
It sorts each element's attributes alphabetically, rather than preserving the original order.
I'm working around this by doing a bunch of gsub calls on REXML's output, but is there a Ruby XML-manipulation library that's a better fit for "minimal diff" scenarios?
You can build your own SAX parser (using Nokogiri, for example; it's very easy and I recommend it) to parse your XML file, change some data in it, and write out the processed XML with your own custom XML generator, built from scratch. The bad news is that you have to build a tiny XML library and generator routine in this case, so it is not a trivial task.
Another way: don't build the SAX parser, but write an XML generator. Parse the XML with your favourite library, change what you need to change, and generate whatever output you want. You just need to recursively walk through all the nodes in your document and output them according to your conventions.
