Replacing a string in a file in Windows

I am looking for a way to replace all occurrences of string A with B in a file.
I tried the GnuWin32 sed utility, but the resulting file is truncated. This probably happens because the file contains non-Unicode characters. The same command worked on a Mac with the same file only after adding LC_ALL=C before the command.
What other tools can I use, and how? Can I pass some flag to GnuWin32 sed that makes it work with non-Unicode characters?

In PowerShell something like this should work:
$f = 'C:\path\to\your.txt'
(Get-Content $f) -replace 'A','B' | Out-File $f

Another option is to use variable syntax:
${C:\path\to\your.txt} -replace 'A','B' | Out-File C:\path\to\your.txt

Without seeing the file's encoding it's impossible to tell whether this will work, but this uses a helper batch file called repl.bat from http://www.dostips.com/forum/viewtopic.php?f=3&t=3855
type "file.txt" |repl "A" "B" >"newfile.txt"
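If Python happens to be available, a byte-level replacement avoids the encoding question because the file is never decoded. This is only a minimal sketch: the function name replace_bytes_in_file is made up for illustration, and it assumes the search string appears in the file as the same bytes you pass in (true for ASCII text in single-byte or UTF-8 files, not for UTF-16 files).

```python
from pathlib import Path


def replace_bytes_in_file(path, old, new):
    """Replace every occurrence of old with new, treating the file as raw bytes.

    Hypothetical helper for illustration; old and new must be bytes objects.
    """
    p = Path(path)
    data = p.read_bytes()                  # no decoding, so odd bytes survive untouched
    p.write_bytes(data.replace(old, new))  # the whole file is rewritten in one go
```

It would be called as replace_bytes_in_file("file.txt", b"A", b"B"), where file.txt stands in for the real file.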

Related

How can I convert a sed command to its PowerShell equivalent?

Editor's note:
The macOS sed command below performs an in-place (-i '') string-substitution (string-replacement) operation on the given file, i.e. it transforms the file's existing content. The specific substitution shown, s/././g, replaces all non-newline characters (regex metacharacter .) with verbatim . characters, so be careful when trying the command yourself.
While the intended question may ultimately be a different one, as written the question is well-defined, and can be answered to show the full PowerShell equivalent of the sed command (a partial translation is in the question itself), notably including the in-place updating of the file.
I have a Mac command and I need it to run on Windows. I have no experience with Mac whatsoever.
sed -i '' 's/././g' dist/index.html
After some research I found that I should use
get-content path | %{$_ -replace 'expression','replace'}
but can't get it to work yet.
Note:
The assumption is that s/././g in your sed command is just an example string substitution that you've chosen as a placeholder for real-world ones. What this example substitution does is replace all characters other than newlines (regex .) with a verbatim . character. Therefore, do not run the commands below as-is on your files, unless you're prepared to have their characters turn into .
The direct translation of your sed command, which performs in-place updating of the input file, is (ForEach-Object is the name of the cmdlet that the built-in % alias refers to):
(Get-Content dist/index.html) |
ForEach-Object { $_ -replace '.', '.' } |
Set-Content dist/index.html -WhatIf
Note: The -WhatIf common parameter in the command above previews the operation. Remove -WhatIf once you're sure the operation will do what you want.
Or, more efficiently:
(Get-Content -ReadCount 0 dist/index.html) -replace '.', '.' | Set-Content dist/index.html -WhatIf
-ReadCount 0 reads the lines into a single array before outputting the result, instead of the default behavior of emitting each line one by one to the pipeline.
Or, even more efficiently, if line-by-line processing isn't required and the -replace operation can be applied to the entire file content, using the -Raw switch:
(Get-Content -Raw dist/index.html) -replace '.', '.' | Set-Content -NoNewLine dist/index.html -WhatIf
Note:
-replace, the regular-expression-based string replacement operator, uses the syntax <input> -replace <regex>, <replacement> and invariably performs global replacements (as requested by the g option in your sed command), i.e. it replaces all matches it finds.
Unlike sed's regular expressions, however, PowerShell's are case-insensitive by default; to make them case-sensitive, use the -creplace operator variant.
Note the required (...) around the Get-Content call, which ensures that the file is read into memory in full and closed again first, which is the prerequisite for being able to rewrite the file with Set-Content in the same pipeline.
Caveat: While unlikely, this approach can result in data loss, namely if the write operation that saves back to the input file gets interrupted.
You may need -Encoding with Set-Content to ensure that the rewritten file uses the same character encoding as the original content - Get-Content reads text files into .NET strings recognizing a variety of encodings, and no information is retained as to what encoding was encountered.
Except with the Get-Content -Raw / Set-Content -NoNewLine solution, which preserves the original newline format, the output file will use the platform-native newline format - CRLF (\r\n) on Windows, LF (\n) on Unix-like platforms - irrespective of which format the input file originally used.
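For comparison only, the same in-place idea can be sketched in Python in a way that also addresses the data-loss caveat: the new content goes to a temporary file first, and the original is swapped out only after the write has finished. The function name sub_in_place and the utf-8 default are assumptions for illustration, not part of the PowerShell answer above.

```python
import os
import re
import tempfile


def sub_in_place(path, pattern, replacement, encoding="utf-8"):
    """Regex-replace in a file, writing through a temp file in the same folder."""
    # newline="" turns off newline translation, so CRLF and LF are kept as found.
    with open(path, encoding=encoding, newline="") as src:
        text = src.read()
    new_text = re.sub(pattern, replacement, text)  # case-sensitive, all matches
    folder = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=folder)
    try:
        with os.fdopen(fd, "w", encoding=encoding, newline="") as dst:
            dst.write(new_text)
        os.replace(tmp_path, path)  # swap the finished file in with a single rename
    except BaseException:
        os.unlink(tmp_path)         # clean up the temp file if anything went wrong
        raise
```

As with Set-Content, the encoding has to be supplied by the caller if the file is not UTF-8. The temporary file is created with restrictive default permissions, which may differ from those of the original file.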

Avoid non standard ASCII characters in rename command

I am using this command to find and rename files that have non capitalised filenames in a directory (I have left the -n flag for safety in case anyone copies and pastes from here):
rename -n 's/(?<![.'\''])\b\w*/\u$&/g' *
The problem is that it finds files that have non standard ASCII characters such as Noël and regards them as a problem that would need to be fixed.
Is there any way to avoid that happening?
Edit (20180701-1635):
I just realised that the command also 'fails' (tries to rename) if a filename contains a dash or an apostrophe too (it changes the character following to uppercase). Examples of wrong renames currently:
Alan's Filename.txt > Alan'S Filename.txt
File-name.txt > File-Name.txt
Your question is a bit vague, but I think you mean something like:
for i in $(echo * | sed 's, YOUR_REGEXP ,,g'); do
# your rename commands on $i
done
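As an alternative to the Perl rename one-liner, the capitalisation rule can be sketched in Python 3, where strings are Unicode and \w therefore treats ë as a letter. This is only an illustration under that assumption; the function name capitalise_words is made up, and the code only computes the new name without renaming anything.

```python
import re

# A word is capitalised only if it does not follow a dot, an apostrophe,
# a hyphen, or another word character (the last case prevents mid-word matches).
_WORD_START = re.compile(r"(?<![.'\-\w])\w+")


def capitalise_words(name):
    """Upper-case the first character of each matched word; leave the rest as-is."""
    return _WORD_START.sub(lambda m: m.group(0)[0].upper() + m.group(0)[1:], name)
```

With this rule, names such as Noël, Alan's Filename.txt and File-name.txt come back unchanged, so they would not be flagged for renaming.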

How to select files in a directory that begin with explicit names in bash?

I have a shell script as below
dcacheDirIn="/mypath/"
for files in `ls $dcacheDirIn | grep txt`
do
.....
done
I have some .txt files in this directory; some of them match Data2012*.txt and some match Data2011*.txt. How can I choose only the "Data2012" files?
EDIT: my bad I mixed up with my python file. This is shell script for sure.
You can try this
dcacheDirIn="/mypath/"
for files in `ls $dcacheDirIn | grep Data2012`
do
echo $files
done
To avoid directories with that name, try
ls $dcacheDirIn -p | grep -v / | grep Data2012
In Python you can use the glob library as follows:
import glob
for file2012 in glob.glob("/mypath/Data2012*.txt"):
    print(file2012)
Tested using Python 2.7
You can use grep to achieve this directly:
dcacheDirIn="/mypath/"
for files in `ls $dcacheDirIn | grep -E 'Data2012.*\.txt'`
do
.....
done
grep uses a regex to filter the output from ls. The regex I provided keeps only the files whose names match the format Data2012*.txt, like you wanted.
The Python glob library has that capability; it matches shell-style wildcard patterns (not full regular expressions). So, for instance, you would do:
import glob
for file in glob.glob('*2012.txt'):
    print(file)
and that would print the files matching that expression (assuming you're running it from the same directory). It has a heap-load more functionality though, you should dive deeper.
In bash the wildcards will do the work of ls for you.
Just use
dcacheDirIn="/mypath"
for file in $dcacheDirIn/Data2012*txt
do
echo "File $file"
done
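The same wildcard selection, with directories excluded as one of the answers above suggests, might look like this in Python using pathlib. It is a small sketch; the function name data2012_files is made up for illustration.

```python
from pathlib import Path


def data2012_files(directory):
    """Return regular files named Data2012*.txt in directory, sorted by name."""
    # glob() does the wildcard matching; is_file() drops directories of that name.
    return sorted(p for p in Path(directory).glob("Data2012*.txt") if p.is_file())
```

Calling data2012_files("/mypath") returns a list of Path objects that can be looped over like the shell variable in the examples above.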

Windows Batch Script to remove multiple special characters from a text file

I have a text file which has a lot of special characters. From that text file, I would like to remove three special characters (~ œ <). Can someone please provide me with a script for this? I tried some scripts, but they don't seem to work for the character ~.
You can use sed for this (a Windows build has to be downloaded separately). Note that this removes every character other than ASCII letters and digits, not just the three listed:
sed "s/[^a-zA-Z0-9]//g" file.txt
If you have Windows 7 or later, you can do something like this in PowerShell:
Get-Content file.txt | foreach { $_ -replace '[^\w\d]' } | Out-File -Encoding UTF8 file.new.txt
Or download Ruby for Windows:
C:\>ruby -ne 'print $_.gsub(/[~)œ\[\]<]/,"")' file
Thanks!
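If only the three characters from the question should go and everything else should stay, a Python sketch could use str.translate. The names strip_special and strip_special_in_file are made up for illustration, and the utf-8 default is an assumption: the file has to be opened with its real encoding for œ to be recognised.

```python
# Translation table that deletes exactly the three characters from the question.
_REMOVE = str.maketrans("", "", "~œ<")


def strip_special(text):
    """Return text with every ~, œ and < removed; all other characters are kept."""
    return text.translate(_REMOVE)


def strip_special_in_file(src, dst, encoding="utf-8"):
    """Copy src to dst line by line, removing the three characters on the way."""
    # newline="" keeps the original line endings instead of translating them.
    with open(src, encoding=encoding, newline="") as fin, \
         open(dst, "w", encoding=encoding, newline="") as fout:
        for line in fin:
            fout.write(strip_special(line))
```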

Windows PATH to posix path conversion in bash

How can I convert a Windows dir path (say c:/libs/Qt-static) to the correct POSIX dir path (/c/libs/Qt-static) by means of standard msys features? And vice versa?
Cygwin, Git Bash, and MSYS2 have a readymade utility called cygpath.exe just for doing that.
Output type options:
  -d, --dos         print DOS (short) form of NAMEs (C:\PROGRA~1\)
  -m, --mixed       like --windows, but with regular slashes (C:/WINNT)
  -M, --mode        report on mode of file (binmode or textmode)
  -u, --unix        (default) print Unix form of NAMEs (/cygdrive/c/winnt)
  -w, --windows     print Windows form of NAMEs (C:\WINNT)
  -t, --type TYPE   print TYPE form: 'dos', 'mixed', 'unix', or 'windows'
I don't know msys, but a quick Google search showed me that it includes the sed utility. So, assuming it works in msys the same way it does on native Linux, here's one way to do it:
From Windows to POSIX
You'll have to replace all backslashes with slashes, remove the first colon after the drive letter, and add a slash at the beginning:
echo "/$pth" | sed 's/\\/\//g' | sed 's/://'
or, as noted by xaizek,
echo "/$pth" | sed -e 's/\\/\//g' -e 's/://'
From POSIX to Windows
You'll have to remove the first slash, replace all slashes with backslashes, and add a colon after the drive letter:
echo "$pth" | sed 's/^\///' | sed 's/\//\\/g' | sed 's/^./\0:/'
or more efficiently,
echo "$pth" | sed -e 's/^\///' -e 's/\//\\/g' -e 's/^./\0:/'
where $pth is a variable storing the Windows or POSIX path, respectively.
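For readers who prefer a script over sed, the two conversions can be sketched in Python as below. This is an illustration, not an msys feature: the function names are made up, a single-letter drive is assumed, and the drive letter is lower-cased on the way to POSIX form and upper-cased on the way back, which the sed commands above do not do.

```python
import re


# C:\libs\Qt-static or c:/libs/Qt-static  ->  /c/libs/Qt-static
def windows_to_posix(path):
    path = path.replace("\\", "/")
    m = re.match(r"^([A-Za-z]):(/.*)?$", path)
    if not m:
        return path  # no drive letter: only the slashes were normalised
    return "/" + m.group(1).lower() + (m.group(2) or "")


# /c/libs/Qt-static  ->  C:\libs\Qt-static
def posix_to_windows(path):
    m = re.match(r"^/([A-Za-z])(/.*)?$", path)
    if m:
        # A bare /c becomes C:/ so that the result names the drive root.
        path = m.group(1).upper() + ":" + (m.group(2) or "/")
    return path.replace("/", "\\")
```

Relative paths pass through with only their slashes changed. Environments that mount drives under a prefix such as /cygdrive would need a different pattern.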
Just use cygpath:
$ cygpath -w "/c/foo/bar"
-> C:\foo\bar
$ cygpath -u "C:\foo\bar"
-> /c/foo/bar
You may wonder: "Do I have cygpath installed?" Well,
If you're using the git-bash shell, then yes.
If you're in cygwin or MSYS2, then yes.
If you're in another shell, but you have installed git-bash before, then cygpath can be found at git-bash-install-folder\usr\bin\cygpath.exe.
Else: maybe not, but I'm sure you can find a way to install it.
The "correct" way in MSYS is:
$ MSYS_NO_PATHCONV=1 taskkill /F /T /IM ssh-agent.exe
This avoids having to manually translate slashes. It simply de-activates the path conversion.
Here is my implementation (tested on git bash).
From POSIX to Windows
sed '
\,/$, !s,$,/,
\,^/, s,/,:/,2
s,^/,,
s,/,\\,g
' <<< "$@"
Works for:
/c/git
relative/dir
c:/git
~
.
..
/c
/c/
./relative/dir
/sd0/some/dir/
except
/
<path with space>
Explanation:
\,^/, s,/,:/,2 (converts /drive/dir/ to /drive:/dir/) is the heart of it and inserts : before the 2nd /. I use , for delim instead of / for readability. If starting with / (\,^/,), then replace / with :/ for the 2nd occurrence. I do not want to assume drive letter length of 1 so this works for /sd0/some/dir.
s,^/,, removes the leading / and s,/,\\,g converts all / to \.
\,/$, !s,$,/, is to handle the corner case of /c and ensure 2nd / (/c/) for the next command to work.
Note:
If the here-string <<< does not work in your shell, you can echo and pipe instead:
echo "$@" | sed ...
Errata
Here e script
just FYI - at least for my git version 2.26.2.windows.1
e.g. if I have a path like C:\dev\work_setup\msk, I can go directly to Git Bash and type
cd "C:\dev\work_setup\msk"
this will result in current folder being changed to /c/dev/work_setup/msk - so this type of conversion seems to be done automatically, as long as I put the Windows path inside double quotes. Unfortunately I don't have references to original documentation that would back that up.
My solution works with a list of folders/files and it's done in 2 steps.
Suppose you would like to replace the path D:\example with /example in a list of files where this Windows path is repeated.
The first step changes the backslashes into slashes:
grep -lr "D:\\\\example" /parent-folder | xargs -d'\n' sed -i 's+\\+\/+g'
Note that parent-folder could be root (/) or whatever you like, and the -d'\n' parameter is necessary if you have file names or folder names with white space.
The second step substitutes /example for D:/example:
grep -lr "D:/example" /parent-folder | xargs -d'\n' sed -i 's+D:/example+/example+g'
I wanted to share this solution since it took me some time to come up with these 2 lines, and they have been really helpful (I'm migrating a Windows app to a Linux server with tons of Windows paths inside).
The answer of @hello_earth is misleading, because the backslashes in a Windows path must be doubled, like:
cd "e:\\dir\\subdir\\path"
otherwise the shell will interpret them as escape sequences.

Resources