Sort a list of paths in LINQ? - linq

Let's say I have the following folders:
New Folder
- New Folder
- New Folder (2)
- New Folder (3)
- New Folder (4)
New Folder (2)
New Folder (3)
New Folder (4)
And a query
from s in Directory.GetDirectories(@"D:\Project\uploads", "*.*", SearchOption.AllDirectories)
select s
The results:
D:\Project\uploads\New Folder
D:\Project\uploads\New Folder (2)
D:\Project\uploads\New Folder (3)
D:\Project\uploads\New Folder (4)
D:\Project\uploads\New Folder\New Folder
D:\Project\uploads\New Folder\New Folder (2)
D:\Project\uploads\New Folder\New Folder (3)
D:\Project\uploads\New Folder\New Folder (4)
Is there any way to sort the list into the right order? I expected it to be:
D:\Project\uploads\New Folder
D:\Project\uploads\New Folder\New Folder
D:\Project\uploads\New Folder\New Folder (2)
D:\Project\uploads\New Folder\New Folder (3)
D:\Project\uploads\New Folder\New Folder (4)
D:\Project\uploads\New Folder (2)
D:\Project\uploads\New Folder (3)
D:\Project\uploads\New Folder (4)
Any help would be appreciated!

This wasn't as trivial as I thought. Probably the most sane solution (aside from building the list recursively) is to implement a comparer for this to do the sorting.
class DirectorySorter : IComparer<string>
{
    public int Compare(string x, string y)
    {
        // Working version: map the separator to '\0' so that it sorts before
        // every other character, then compare ordinally.
        return StringComparer.Ordinal.Compare(x.Replace(Path.DirectorySeparatorChar, '\0'),
                                              y.Replace(Path.DirectorySeparatorChar, '\0'));

        // Earlier segment-by-segment attempt. It is unreachable after the
        // return above and is kept only for reference.
        var xPaths = x.Split(Path.DirectorySeparatorChar);
        var yPaths = y.Split(Path.DirectorySeparatorChar);
        var minLength = Math.Min(xPaths.Length, yPaths.Length);
        for (int i = 0; i < minLength; i++)
        {
            var ires = xPaths[i].CompareTo(yPaths[i]);
            if (ires != 0) return ires;
        }
        var lres = xPaths.Length.CompareTo(yPaths.Length);
        if (lres == 0)
        {
            return lres;
        }
        else if (lres < 0)
        {
            var i = y.LastIndexOf(Path.DirectorySeparatorChar);
            return x.Length == i ? lres : -lres;
        }
        else //if (lres > 0)
        {
            var i = x.LastIndexOf(Path.DirectorySeparatorChar);
            return y.Length == i ? lres : -lres;
        }
    }
}
(Seeing Steck's answer shows that I was nearly there with what I originally had. Just that I needed to use the Ordinal string comparer. So it turns out it works using that change.)
On the other hand, we could use some properties of the directory structure to simplify this task and not implement a comparer.
var query = Directory
    .EnumerateDirectories(@"D:\Project\uploads", "*", SearchOption.AllDirectories)
    .OrderBy(name => name.Replace(Path.DirectorySeparatorChar, '\0'), StringComparer.Ordinal);

private class Comparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        return StringComparer.Ordinal.Compare(x.Replace(Path.DirectorySeparatorChar, '\0'),
                                              y.Replace(Path.DirectorySeparatorChar, '\0'));
    }
}
and then
var source = Directory.GetDirectories(@"D:\Project\uploads", "*.*", SearchOption.AllDirectories);
var target = source.OrderBy(x => x, new Comparer()).ToArray();

The only thing you need to change about the default ordering is to make sure that the \ character is always treated as the first letter in your alphabet. I don't have an exact implementation, but there are two options:
You can use order by clause if you find a way to replace \ in the string with a character that is smaller than all other characters and use this replaced string as the key.
You can use Array.Sort and implement your string comparer that re-implements string comparison, but encodes this additional rule about the \ character.
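The first option can be sketched in a few lines. The following is a minimal Python illustration, not the C# answer itself: it assumes Windows-style paths and uses the NUL character as the "smaller than everything" replacement. The names `path_sort_key` and `sort_paths` are hypothetical.

```python
def path_sort_key(path: str, sep: str = "\\") -> str:
    """Replace the separator with NUL so that it sorts before any other character."""
    return path.replace(sep, "\0")


def sort_paths(paths, sep: str = "\\"):
    # Python compares strings by code point, which behaves like an ordinal
    # comparison for typical folder names.
    return sorted(paths, key=lambda p: path_sort_key(p, sep))


if __name__ == "__main__":
    sample = [
        r"D:\Project\uploads\New Folder (2)",
        r"D:\Project\uploads\New Folder\New Folder",
        r"D:\Project\uploads\New Folder",
    ]
    for p in sort_paths(sample):
        print(p)
```

Because the key is only used for comparison, the original path strings are returned unchanged.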

With .NET 4.0 try
Directory.EnumerateDirectories(@"D:\Project\uploads", "*.*", SearchOption.AllDirectories)
it might do what you expect. If it doesn't, you can do it explicitly (using DirectoryInfo, since Name and Parent are not available on plain path strings):
new DirectoryInfo(@"D:\Project\uploads").GetDirectories()
    .SelectMany(dir => dir.GetDirectories().OrderBy(sub => sub.Name))
Lastly you might do something like:
from s in new DirectoryInfo(@"D:\Project\uploads").GetDirectories("*.*", SearchOption.AllDirectories)
orderby s.Parent.Name, s.Name
select s
or
from s in new DirectoryInfo(@"D:\Project\uploads").GetDirectories("*.*", SearchOption.AllDirectories)
let members = s.FullName.Split(new [] {Path.DirectorySeparatorChar})
orderby members[2], s.Name
select s
to get even more control/flexibility. Choose the simplest approach depending on your needs.

Thanks for your comments and answers, guys.
I think life will be much easier with recursion:
void Main()
{
    string rootFolder = @"D:\Project\uploads";
    string[] f = Directory.GetDirectories(rootFolder, "*.*", SearchOption.AllDirectories);
    Func<string, string[]> build = null;
    build = (p) => {
        return (from x in f where Path.GetDirectoryName(x) == p
                from y in new string[]{ x }.Union(build(x)) select y).ToArray();
    };
    f = build(rootFolder).Dump(); // Dump() is LINQPad's output helper
}
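The recursive idea above (emit each folder, then immediately emit everything beneath it) can also be sketched in Python. This is only an illustration under assumptions: the flat list is taken as given, `build_tree_order` is a hypothetical name, and the children are indexed by parent up front rather than rescanning the whole list on every call as the LINQ version does.

```python
import ntpath  # Windows path rules, usable on any platform


def build_tree_order(root, flat_paths):
    """Return flat_paths reordered so each folder is followed by its descendants."""
    children = {}
    for path in flat_paths:
        # Group every path under its immediate parent, keeping the input order.
        children.setdefault(ntpath.dirname(path), []).append(path)

    def visit(parent):
        ordered = []
        for child in children.get(parent, []):
            ordered.append(child)
            ordered.extend(visit(child))  # descend before moving to the next sibling
        return ordered

    return visit(root)
```

Siblings keep whatever relative order they had in the input, so the result is only as sorted as the flat listing was at each level.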

Related

Interview: Renaming all files in a directory using a data structure

This is a problem I have encountered in a tech interview. You have 500,000 files in a directory, which is configured so that they are always in alphabetical order. They have names as such:
Afile
Bfile
File00000001
File00000002
...
You want to rename all the files while preserving their order as such:
File00000001
File00000002
File00000003
...
You can probably see the obvious issue here. If you rename Afile into File00000001, it will collide with the existing file with the same name and also the order will be altered, which is not what we want.
The question here is, how can you devise an algorithm with the most optimal run-time to do the renaming task efficiently?
You cannot go through the files in ascending order, nor in descending order; both could lead to a conflict. Renaming the files to something else first could also lead to a conflict. The goal seems to be to rename each file only once, so you can do something like the following:
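import java.io.File;
import java.util.HashMap;
import java.util.Map;

private static File dir;

public static void renameFiles(String path) {
    dir = new File(path);
    // Assumes listFiles() returns the files in alphabetical order, as the
    // question states; the Java API itself does not guarantee any order.
    File[] files = dir.listFiles();
    Map<String, String> map = new HashMap<>();
    int number = 1;
    for (int i = 0; i < files.length; i++)
        if (files[i].isFile())
            map.put(files[i].getName(), "File" + pad(number++));
    // so we created a map with original file names and the name it should get
    for (int i = 0; i < files.length; i++)
        if (!files[i].getName().equals(map.get(files[i].getName()))) // not same name
            renameFile(files[i].getName(), map);
}

private static void renameFile(String file, Map<String, String> map) {
    String newName = map.get(file);
    if (newName != null) {
        if (map.containsKey(newName))
            renameFile(newName, map); // free the target name first
        File f = new File(dir, file);
        f.renameTo(new File(dir, newName));
        map.remove(file);
    }
}

// pad() was not shown in the original answer; it is assumed here to
// zero-pad the number to 8 digits, matching names like File00000001.
private static String pad(int number) {
    return String.format("%08d", number);
}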
Time complexity O(n). We recursively go ahead until we don't have a renaming conflict any more and then start renaming from the tail. There won't be a conflict because it is possible that File004 becomes File007 or that File007 becomes File004 but not both, so no circular renaming. If there are too many files then recursion depth might not be sufficient and we have to implement it with a stack, but it is the same principle.
private static void renameFile(String file, Map<String, String> map) {
    String newName = map.get(file);
    if (newName != null) {
        Stack<String> stack = new Stack<>();
        do {
            stack.push(file);
            file = newName;
            newName = map.get(file);
        } while (newName != null);
        while (!stack.empty()) {
            file = stack.pop();
            File f = new File(dir, file);
            f.renameTo(new File(dir, map.get(file)));
            map.remove(file);
        }
    }
}
This will work on Linux, but for Windows you could still have problems, because the file names are not case sensitive. You could store all the keys in the map as lower case and always call toLowerCase() when accessing the map.
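To check the chain-following idea without touching a disk, the renames can be planned and simulated in memory. The sketch below is a Python illustration under assumptions, not the answer's code: `plan_renames` and `apply_plan` are hypothetical names, names are compared case-sensitively, and Python's default string ordering stands in for the directory's "alphabetical order".

```python
def plan_renames(names, prefix="File", width=8):
    """Return (old, new) rename steps in an order that never hits an existing name."""
    ordered = sorted(names)
    target = {old: f"{prefix}{i:0{width}d}" for i, old in enumerate(ordered, start=1)}
    # Only files whose name actually changes need a step.
    pending = {old: new for old, new in target.items() if old != new}

    steps = []
    for start in list(pending):
        chain = []
        current = start
        # Follow the chain: the target of one rename may itself be a file
        # that still has to be renamed.
        while current in pending:
            new = pending.pop(current)
            chain.append((current, new))
            current = new
        # Rename from the tail of the chain backwards, so every target is free.
        steps.extend(reversed(chain))
    return steps


def apply_plan(names, steps):
    """Simulate the steps on a set of names; raise if a step would overwrite a file."""
    present = set(names)
    for old, new in steps:
        if new in present:
            raise ValueError(f"collision: {old} -> {new}")
        present.remove(old)
        present.add(new)
    return sorted(present)
```

Each name is popped from `pending` exactly once, so the planning pass is linear apart from the initial sort.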
for i in {100..1..-1} ; do o=$(printf "File%04d" $i); n=$(printf "File%04d" $((i + 2))); echo mv $o $n; done;
or, more readably:
for i in {100..1..-1}
do
o=$(printf "File%04d" $i)
n=$(printf "File%04d" $((i + 2)))
echo mv $o $n
done
Afile and Bfile can be renamed by hand.
You will have to adapt the range to the real file count; for testing, a smaller number of files seemed more appropriate to me.
Note that this is bash syntax, and that it doesn't move any files yet; it only echoes the mv commands.
Don't try to run it in parallel. :)
Alternatively, you could move the files in normal ascending order into a new directory and then move them all back into the old one, which prevents overwriting and would allow the renames to run in parallel.
The for-statement is equivalent to what is elsewhere known as
for (i = 100; i >= 1; --i)
and
printf "File%04d" $i
prints a 4 digit i with leading zeros.

C# Find All SubFolders That Contain String

EDIT - Revised the title of my post to make it more relevant to the problem.
I have folders that may or may not contain subfolders that start with "REV". If there are subfolders that start with "REV", they are followed by an integer value padded with leading zeros (e.g., REV010 or REV003).
My goal here is to:
find the test folder (C:\temp\TEST\REV003)
Read the string name of the folder and parse its integer value
Add the integer to a list of integers. Find the max integer value
Increment the max integer value
Create a new string folder name starting with "REV" and padded with new int value
When I debug the code (below), it cannot seem to find the REV003 folder (The folder definitely exists in the path).
Is something wrong with my LINQ statement in finding the folder?
Also, if there is an easier procedure to achieve the same thing - I'm definitely open for it! Thanks!
int nextRev = 0;
List<int> listOfRevs = new List<int>();
IEnumerable<string> revFolders = Directory.GetDirectories(destDirName, "*REV*", SearchOption.AllDirectories).Where(f => f.StartsWith("REV"));
foreach (var rev in revFolders)
{
Console.WriteLine(int.Parse(rev.Replace("REV", "")));
listOfRevs.Add(int.Parse(rev.Replace("REV", "")));
}
if (listOfRevs.Count > 0)
{
nextRev = listOfRevs.Max();
Console.WriteLine(nextRev);
nextRev++;
}
revFolder = "REV" + nextRev.ToString("000");
Console.WriteLine("New Folder: " + revFolder);
** UPDATE **
Thanks to NetMage the problem was fixed, however I still had a few bugs. Here's the working code:
string revf = "";
int nextRev = 0;
List<int> listOfRevs = new List<int>();
IEnumerable<string> revFolders = Directory.GetDirectories(destDirName, "REV*", SearchOption.AllDirectories);
foreach (var rev in revFolders)
{
if (rev.Contains("REV"))
{
revf = rev.Split('\\').Last();
listOfRevs.Add(int.Parse(revf.Replace("REV", "")));
}
}
if (listOfRevs.Count > 0)
{
nextRev = listOfRevs.Max();
nextRev++;
}
revFolder = "REV" + nextRev.ToString("000");
Change your directory search to
IEnumerable<string> revFolders = Directory.GetDirectories(destDirName, "REV*", SearchOption.AllDirectories);
Based on the results from this, you will need to change the code that finds the maximum. I would use LINQ:
var maxREV = Directory.GetDirectories(destDirName, $"REV*", SearchOption.AllDirectories)
.Select(d => Int32.TryParse(Path.GetFileName(d).Substring(3), out int num) ? num : (int?)null)
.Where(n => n.HasValue)
.Select(n => n.Value)
.Max();
var revFolder = "REV" + (maxREV+1).ToString("000");
Console.WriteLine("New Folder: " + revFolder);
I put in some error handling to skip folders that don't have an integer after "REV".
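The same steps (take the last path segment, parse the number after "REV", skip anything that does not parse, increment the maximum) can be sketched in Python for comparison. This is an illustration under assumptions: `next_rev_name` is a hypothetical helper, the match on "REV" is case-sensitive, and an empty result falls back to REV000 as in the question's code.

```python
import re
from pathlib import PureWindowsPath

# Folder name must be exactly "REV" followed by digits.
REV_PATTERN = re.compile(r"^REV(\d+)$")


def next_rev_name(folder_paths, width=3):
    """Return the next REV folder name for the given full folder paths."""
    revs = []
    for path in folder_paths:
        # Compare against the folder name only, not the full path.
        match = REV_PATTERN.match(PureWindowsPath(path).name)
        if match:
            revs.append(int(match.group(1)))
    next_rev = max(revs) + 1 if revs else 0
    return f"REV{next_rev:0{width}d}"
```

Matching on the folder name rather than the full path is the detail the question's `StartsWith("REV")` check missed.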

Check if folder exists

My structure is as follows:
MyRootFolder
└──subfolder1
└──subfolder2
.
.
.
└──subfolder n
I have a requirement where I need to check whether a sub-folder exists within a root folder and, if it does not, create it. I can't find a direct API to check for the sub-folder's existence. Instead, I see an API like
folder.get_SubFolders();
which would give me a list of all the sub-folders and then iterate to check if sub-folder exists or not. The problem here is that I might end up having to iterate many folders which I don't want to do. Is there a different way to achieve this? I'm using Filenet 5.2.1
OK, this is the closest option that I could find.
Search FileNet for the subfolder using the query below.
SELECT FolderName FROM Folder WHERE FolderName='subfolder1' and Parent=OBJECT({parent-folder-guid})
If the search returns a result, the subfolder is present; if not, create one.
Another approach: try to create the folder you want; if it already exists, the creation is skipped, otherwise it is created. This way you don't have to iterate through the existing folders. I had the same requirement in one of my projects, where I had to move content from a shared drive to FileNet P8 and assign properties to it while maintaining the same folder paths. The following snippet might help someone:
public static Folder createFolderStructure() {
Connection conn = Factory.Connection.getConnection(ConfigInfo.CE_URI);
Domain dom = Factory.Domain.getInstance(conn, null);
ObjectStore os = Factory.ObjectStore.getInstance(dom,
ConfigInfo.OBJECT_STORE_NAME);
Folder folder = null;
// pass your desired folder string here, excluding the starting root "/"
String folderpath = "APITest/some/folder/2014/12/2";
System.out.println("\nGiven input folder path is :: " + folderpath + "\n");
String[] myarray = folderpath.split("/");
for (int x = 0; x < myarray.length; x++) {
try {
if (x == 0) {
folder = Factory.Folder.createInstance(os, null);
Folder rootFolder = Factory.Folder.getInstance(os, null, "/");
folder.set_Parent(rootFolder);
folder.set_FolderName(myarray[x]);
System.out.println("Creating main (first) folder.. \t" + myarray[x]);
folder.save(RefreshMode.NO_REFRESH);
} else {
String currentfolder = myarray[x];
String parentfolder = "";
for (int i = 0; i < x; i++) {
folder = Factory.Folder.createInstance(os, null);
parentfolder = parentfolder + "/" + myarray[i];
Folder nxtrootFolder = Factory.Folder.getInstance(os, null, parentfolder);
folder.set_Parent(nxtrootFolder);
folder.set_FolderName(currentfolder);
}
System.out
.println("Trying to create " + currentfolder + " in " + parentfolder);
folder.save(RefreshMode.NO_REFRESH);
}
} catch (EngineRuntimeException ere) {
ExceptionCode code = ere.getExceptionCode();
if (code != ExceptionCode.E_NOT_UNIQUE) {
throw ere;
}
System.out.println("Above folder already exists...skipping...");
}
}
return folder;
}
This is copied from my own blog.

How Do I Remove Extra Spaces in Directories & Filenames / Filepaths in Windows

I want to remove extra spaces in a number of filepaths as the filepaths under scrutiny are rather long.
For example, I have this filepath:
C:\TEST Filepath\TEST Filepath\TEST Filepath\..\File.doc
and would like it to become:
C:\TEST Filepath\TEST Filepath\..\File.doc
I have hundreds of filepaths which are like this and would like to know if there is a quick and efficient way to remove the extra space from them?
Many thanks.
Tried with a small set on a spare disk. Please be careful.
void RemoveExtraSpace(string sourceDir)
{
    var filePaths = Directory.GetDirectories(sourceDir, "*.*", SearchOption.AllDirectories);
    Regex rx = new Regex(@"\s\s+");
    // Walk the array backwards; this relies on GetDirectories listing
    // parents before their subfolders, so subfolders are renamed first.
    for (int x = filePaths.Length - 1; x >= 0; x--)
    {
        string cur = filePaths[x];
        DirectoryInfo di = new DirectoryInfo(cur);
        if (rx.IsMatch(di.Name))
        {
            string result = rx.Replace(di.Name, " ");
            result = Path.Combine(di.Parent.FullName, result);
            Directory.Move(di.FullName, result);
        }
    }
}
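Since renaming folders in bulk is risky, it can help to compute the rename plan first and inspect it before moving anything. The following Python sketch only plans the renames; `plan_space_cleanup` is a hypothetical name, and ordering by path depth is an assumption used here in place of the backwards array walk.

```python
import re
from pathlib import PureWindowsPath


def plan_space_cleanup(dir_paths):
    """Return (old, new) pairs for folders whose name contains runs of whitespace."""
    plan = []
    # Deepest folders first, so a parent is renamed only after its children.
    by_depth = sorted(dir_paths, key=lambda p: len(PureWindowsPath(p).parts), reverse=True)
    for raw in by_depth:
        path = PureWindowsPath(raw)
        # Collapse two or more whitespace characters into a single space.
        cleaned = re.sub(r"\s{2,}", " ", path.name)
        if cleaned != path.name:
            plan.append((str(path), str(path.with_name(cleaned))))
    return plan
```

Each pair keeps the parent portion of the path as it currently exists on disk, which is why children have to be processed before their parents.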

Script to rename files

I have about 2200 different files in a few different folders, and I need to rename about 1/3 of them, which are in their own subfolder. Those 700 are spread across various folders as well.
For example, there might be
The top-most folder is Employees, which has a few files in it, then the folder 2002 has a few, 2003 has more files, 2004 etc.
I just need to attach the word "Agreement" before the existing name of each file. So instead of it just being "Joe Schmoe.doc" It would be "Agreement Joe Schmoe.doc" instead.
I've tried googling such scripts, and I can find stuff similar to what I want but it all looks completely foreign to me so I can't understand how I'd modify it to suit my needs.
Oh, and this is for Windows Server 2003.
I would need about 2 minutes to write such a script for *NIX systems (maybe less), but for Windows it is a longer story ... ))
I've written a simple VBS script for WSH; try it (save it as {script-name}.vbs, change the Path value on the first line of the script, and execute it). I recommend testing the script on a small amount of data first, just to be sure it works correctly.
Path = "C:\Users\rootDirectory"
Set FSO = CreateObject("Scripting.FileSystemObject")
Sub visitFolder(folderVar)
    For Each fileToRename In folderVar.Files
        fileToRename.Name = "Agreement " & fileToRename.Name
    Next
    For Each folderToVisit In folderVar.SubFolders
        visitFolder(folderToVisit)
    Next
End Sub
If FSO.FolderExists(Path) Then
    visitFolder(FSO.getFolder(Path))
End If
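For readers who would rather test the idea somewhere other than a production server, the same recursive walk can be sketched in Python. This is an illustration under assumptions, not a port of the VBS script: `add_prefix` and `demo` are hypothetical names, and the check that skips already-prefixed files is an addition so the function can be re-run safely.

```python
import os
import tempfile


def add_prefix(root, prefix="Agreement "):
    """Prefix every file under root (recursively) and return the new paths."""
    renamed = []
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            if name.startswith(prefix):
                continue  # already prefixed (an addition, not in the VBS script)
            src = os.path.join(folder, name)
            dst = os.path.join(folder, prefix + name)
            os.rename(src, dst)
            renamed.append(dst)
    return renamed


def demo():
    """Run add_prefix on a throwaway tree and return the resulting relative paths."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "2002"))
        for rel in ("Joe Schmoe.doc", os.path.join("2002", "Jane Doe.doc")):
            open(os.path.join(root, rel), "w").close()
        add_prefix(root)
        return sorted(
            os.path.relpath(os.path.join(folder, name), root).replace(os.sep, "/")
            for folder, _dirs, files in os.walk(root)
            for name in files
        )
```

`demo()` works entirely inside a temporary directory, so it can be run without any risk to real files.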
I used to do bulk renaming with batch scripts under Windows. I know it's a snap on *nix (find . -maxdepth N -type f -name "$pattern" | sed -e 'p' -e "s/$str1/$str2/g" | xargs -n2 mv). But after struggling in vain, I found that achieving that effect with batch scripts is almost impossible, so I turned to JavaScript.
With this script, you can add prefix to file names by 'rename.js "s/^/Agreement /" -r *.doc'. A caret(^) means to match the beginning. The '-r' options means 'recursively', i.e. including sub-folders. You can specify a max depth with the '-d N' option. If neither '-r' or '-d N' is given, the script does not recurse.
If you know the *nix 'find' utility, you would notice that 'find' will match the full path (not just the file name part) to specified regular expression. This behavior can be achieved by supplying the '-f' option. By default, this script will match the file name part with the given regular expression.
If you are familiar with regular expressions, complicated renaming is possible. For example, 'rename.js "s/(\d+)/[$1]/" *' which uses grouping to add brackets to number sequences in filenames.
// rename.js --- bulk file renaming utility (like *nix rename.pl)
// (c) Copyright 2012, Ji Han (hanji <at> outlook <dot> com)
// you are free to distribute it under the BSD license.
// oops... jscript doesn't have array.map
Array.prototype.map = function(f, t){
var o = Object(this);
var a = new Array(o.length >>> 0);
for (var i = 0; i < a.length; ++i){ if (i in o) a[i] = f.call(t, o[i], i, o) }
return a;
};
/// main
(function(){
if (WScript.Arguments.Length == 0){
WScript.Echo('rename "<operator>/<pattern>/<string>/[<modifiers>]" [-f] [-r] [-d <maxdepth>] [<files>]');
WScript.Quit(1);
}
var fso = new ActiveXObject('Scripting.FileSystemObject');
// folder is a Folder object [e.g. from fso.GetFolder()]
// fn is a function which operates on File/Folder object
var recurseFolder = function(folder, fn, depth, maxdepth){
if (folder.Files){
for (var e = new Enumerator(folder.Files); !e.atEnd(); e.moveNext()){
fn(e.item())
}
}
if (folder.Subfolders){
for (var e = new Enumerator(folder.SubFolders); !e.atEnd(); e.moveNext()){
fn(e.item());
if (depth < maxdepth){ arguments.callee(e.item(), fn, depth + 1, maxdepth) }
}
}
}
// expand wildcards (asterisk [*] and question mark [?]) recursively
// given path may be relative, and may contain environment variables.
// but wildcards only work for the filename part of a path.
// return an array of full paths of matched files.
// {{{
var expandWildcardsRecursively = function(n, md){
var pattern = fso.GetFileName(n);
// escape regex metacharacters (except \, /, * and ?)
// \ and / wouldn't appear in filename
// * and ? are treated as wildcards
pattern = pattern.replace(/([\[\](){}^$.+|-])/g, '\\$1');
pattern = pattern.replace(/\*/g, '.*'); // * matches zero or more characters
pattern = pattern.replace(/\?/g, '.'); // ? matches one character
pattern = pattern.replace(/^(.*)$/, '\^$1\$'); // matches the whole filename
var re = new RegExp(pattern, 'i'); // case insensitive
var folder = fso.GetFolder(fso.GetParentFolderName(fso.GetAbsolutePathName(n)));
var l = [];
recurseFolder(folder, function(i){ if (i.Name.match(re)) l.push(i.Path) }, 0, md);
return l;
}
// }}}
// parse "<operator>/<pattern>/<string>/[<modifiers>]"
// return an array splitted at unescaped forward slashes
// {{{
var parseExpr = function(s){
// javascript regex doesn't have lookbehind...
// reverse the string and lookahead to parse unescaped forward slashes.
var z = s.split('').reverse().join('');
// match unescaped forward slashes and get their positions.
var re = /\/(\\\\)*(?!\\)/g;
var l = [];
while (m = re.exec(z)){ l.push(m.index) }
// split s at unescaped forward slashes.
var b = [0].concat(l.map(function(x){ return s.length - x }).reverse());
var e = (l.map(function(x){ return s.length - x - 1 }).reverse()).concat([s.length]);
return b.map(function(_, i){ return s.substring(b[i], e[i]) });
}
// }}}
var expr = WScript.Arguments(0);
var args = [];
var options = {};
for (var i = 1; i < WScript.Arguments.Length; ++i){
if (WScript.Arguments(i).substring(0, 1) != '-'){
args.push(WScript.Arguments(i));
} else if (WScript.Arguments(i) == '-f'){
options['fullpath'] = true;
} else if (WScript.Arguments(i) == '-r'){
options['recursive'] = true;
} else if (WScript.Arguments(i) == '-d'){
options['maxdepth'] = WScript.Arguments(++i);
} else if (WScript.Arguments(i) == '--'){
continue;
} else {
WScript.Echo('invalid option \'' + WScript.Arguments(i) +'\'');
WScript.Quit(1);
}
}
if (options['maxdepth']){
var md = options['maxdepth'];
} else if (options['recursive']){
var md = 1<<31>>>0;
} else {
var md = 0;
}
var tokens = parseExpr(expr);
if (tokens.length != 4){
WScript.Echo('error parsing expression \'' + expr + '\'.');
WScript.Quit(1);
}
if (tokens[0] != 's'){
WScript.Echo('<operator> must be s.');
WScript.Quit(1);
}
var pattern = tokens[1];
var substr = tokens[2];
var modifiers = tokens[3];
var re = new RegExp(pattern, modifiers);
for (var i = 0; i < args.length; ++i){
var l = expandWildcardsRecursively(args[i], md);
for (var j = 0; j < l.length; ++j){
var original = l[j];
if (options['fullpath']){
var nouveau = original.replace(re, substr);
} else {
var nouveau = fso.GetParentFolderName(original) + '\\' + fso.GetFileName(original).replace(re, substr);
}
if (nouveau != original){
(fso.FileExists(original) && fso.GetFile(original) || fso.GetFolder(original)).Move(nouveau)
}
}
}
})();
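To see what a given "s/pattern/string/modifiers" expression would do before running the script on real files, the parsing and substitution can be previewed on plain name strings. The Python sketch below is an illustration under assumptions rather than a port of rename.js: `parse_expr` and `rename_preview` are hypothetical names, only the `i` and `g` modifiers are handled, and `$1`-style group references are translated to Python's `\1` form.

```python
import re


def parse_expr(expr):
    """Split 's/pattern/string/modifiers' at forward slashes that are not escaped."""
    parts, current, i = [], [], 0
    while i < len(expr):
        ch = expr[i]
        if ch == "\\" and i + 1 < len(expr):
            current.append(expr[i:i + 2])  # keep the escape pair together
            i += 2
        elif ch == "/":
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    if len(parts) != 4 or parts[0] != "s":
        raise ValueError(f"error parsing expression {expr!r}")
    return parts[1], parts[2], parts[3]


def rename_preview(expr, names):
    """Map each name to the name it would get; nothing on disk is touched."""
    pattern, replacement, modifiers = parse_expr(expr)
    flags = re.IGNORECASE if "i" in modifiers else 0
    count = 0 if "g" in modifiers else 1  # 0 means "replace all" in re.sub
    # Translate JavaScript-style $1 references into Python-style \1.
    replacement = re.sub(r"\$(\d)", r"\\\1", replacement)
    regex = re.compile(pattern, flags)
    return {name: regex.sub(replacement, name, count=count) for name in names}
```

For example, `rename_preview("s/^/Agreement /", ["Joe Schmoe.doc"])` maps the name to "Agreement Joe Schmoe.doc", matching the prefix example given for the script.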
