Recursive find and replace on Command Line with Special Characters - bash

I'm trying to recursively go through a folder structure and update a bunch of pom.xml files. I only want to update my version number, so I'm trying to be as exact as possible. What I want to change is:
<version>5.1.1</version>
to
<version>5.2.0</version>
I'm including the version tags to be sure I don't replace any comments or dependencies where this same version number may appear.
I think characters like <, > or / are causing issues.
I don't have much experience with escaping characters like this on the command line so any help is appreciated.
I am on a Windows 7 machine but have Git Bash and Cygwin installed.

For this I use either a tool called "fart.exe", if the replacement is simple: https://sourceforge.net/projects/fart-it/
If I need regex, I use PowerShell.
Here is an example (a mix of batch file and PowerShell) which replaces a version string in all XML files. Note that the pattern as written matches a quoted 'version': '1.2.3' pair, so adapt it to your <version> tags:
[replace.bat]:
SET version=1.2.3
for /r %%x in (*.xml) do (
powershell -Command "& {(Get-Content '%%x') | Foreach-Object { $_ -replace '(''version''\s?\:\s?'')(\d*\.\d*\.\d*)('')', '${1}%version%${3}' } | Set-Content '%%x'}"
)
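Since the replacement asked for in the question is a fixed string, a plain literal replace avoids the escaping problem entirely: <, > and / are only special to a regex engine or a shell, not to a string comparison. Below is a minimal sketch in Python, assuming the files are UTF-8 and are all named pom.xml; the function names replace_version and update_poms are made up for this illustration. Keep in mind that a dependency declared with exactly the same <version>5.1.1</version> tag would be changed too.

```python
import os

OLD = "<version>5.1.1</version>"
NEW = "<version>5.2.0</version>"


def replace_version(text, old=OLD, new=NEW):
    """Literal (non-regex) replacement, so no characters need escaping."""
    return text.replace(old, new)


def update_poms(root, old=OLD, new=NEW):
    """Walk root recursively, rewrite every pom.xml that contains old.

    Returns the list of files that were actually changed.
    Assumption: the files are UTF-8 encoded.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if "pom.xml" not in filenames:
            continue
        path = os.path.join(dirpath, "pom.xml")
        # newline="" keeps the existing line endings untouched
        with open(path, "r", encoding="utf-8", newline="") as fh:
            original = fh.read()
        updated = replace_version(original, old, new)
        if updated != original:
            with open(path, "w", encoding="utf-8", newline="") as fh:
                fh.write(updated)
            changed.append(path)
    return changed
```

Calling update_poms(".") from the top of the folder structure would process every pom.xml below it; running it on a copy first is a cheap way to check the result.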

Related

Batch renaming files to remove specific characters

How can I create a batch file that removes specific characters from the filenames only? Is it possible to do in a script? I'm looking to remove these symbols:
({[}])+-
Let me introduce you to Ruby, a scripting language far more powerful and easier to learn than the M$ alternative, PowerShell. You can find the Windows interpreter at http://rubyinstaller.org/
I could give you the code in one line, but for clarity here is a short program that does what you need.
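Dir['c:/temp/*.package'].each do |file|
  # clean only the file name, then put it back into its original directory
  clean = File.basename(file).gsub(/[({\[}\])+-]/, '')
  new_name = File.join(File.dirname(file), clean)
  File.rename(file, new_name) unless new_name == file
end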
Let me explain. The Dir command enumerates the path with wildcards, in Unix notation with slashes instead of backslashes. For each file it generates a new_name by taking the basename (filename only) of the path and using a regular expression to replace (gsub) the characters inside the /[]/ with the second parameter, '' (empty string). Regular expressions are the way to go for such things; there is plenty of information to google if you want to know more.
Finally I use the utility class File (yes, Ruby is totally object oriented) to rename the file to the new name.
You could do this with any language, but I bet not as concisely and readably as in Ruby. Install the Ruby interpreter, save these lines in e.g. cleanup.rb, change the path to yours and fire it up with ruby cleanup.rb, or just cleanup.rb if your extension is correctly associated.
This renames
c:\temp\na({[}])+-me.package
to
c:\temp\name.package
And as a bonus, here is the one-line version that does the same in the current folder.
Dir['*'].each {|f| File.rename(f, File.basename(f.gsub(/[({\[}\])+-]/, '')))}
The Windows cmd.exe command shell's rename command lacks the power to do what you need. It can be done with the PowerShell Rename-Item command using the -Replace operator with a regular expression.
However running PowerShell scripts is restricted by security policies, so it is not quite as straightforward as a cmd batch script. A scripting language such as Python may be less problematic.
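Picking up the Python suggestion above, here is a minimal sketch of the same renaming, assuming Python 3 is installed. The names clean_name and rename_all are made up for this example, and skipping a file when the cleaned name already exists is my own choice, not something the question asks for.

```python
import os

# The characters the question wants removed from file names.
STRIP_CHARS = "({[}])+-"
_TABLE = str.maketrans("", "", STRIP_CHARS)


def clean_name(name):
    """Return name with every character from STRIP_CHARS removed."""
    return name.translate(_TABLE)


def rename_all(folder):
    """Rename the entries of folder in place; returns (old, new) pairs."""
    renamed = []
    for entry in sorted(os.listdir(folder)):
        cleaned = clean_name(entry)
        if not cleaned or cleaned == entry:
            continue  # nothing to strip, or nothing would be left
        src = os.path.join(folder, entry)
        dst = os.path.join(folder, cleaned)
        if os.path.exists(dst):
            continue  # assumption: never overwrite an existing file
        os.rename(src, dst)
        renamed.append((entry, cleaned))
    return renamed
```

Saved as, say, cleanup.py with a call such as rename_all(r"c:\temp") added at the bottom, it runs with python cleanup.py.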
The Output
C:\Users\User>(
Set FName=test()[]{}+-
Set FName=!Fname:(=!
Set FName=!Fname:)=!
Set FName=!Fname:{=!
Set FName=!Fname:}=!
Set FName=!Fname:[=!
Set FName=!Fname:]=!
Set FName=!Fname:+=!
Set FName=!Fname:-=!
Echo Ren test()[]{}+-.asc !Fname!.*
)
Ren test()[]{}+-.asc test.*
The batchfile
SetLocal EnableDelayedExpansion
for %%A in (*.asc) Do (
Set FName=%%~nA
Set FName=!Fname:^(=!
Set FName=!Fname:^)=!
Set FName=!Fname:{=!
Set FName=!Fname:}=!
Set FName=!Fname:[=!
Set FName=!Fname:]=!
Set FName=!Fname:+=!
Set FName=!Fname:-=!
Echo Ren %%A !Fname!.*
)

How to do a "sed -" like operation on Windows?

I have a large 55 GB file in which there is a sentence on every line.
I want to check if there are any lines that have a dot "." at the end, and then if there is, I want to insert a space before the dot in that line.
Ex: I like that car.
Replace with: I like that car .
A space before the trailing dot on every line if there is a dot.
I don't have Cygwin or Unix and I use a Windows OS. Is there a sed-like command that I can run on this 55 GB file?
I tried GetGNUWin32 but I am unable to determine the actual command there.
Install Perl. Strawberry Perl is probably the best distribution for Windows. http://strawberryperl.com/
To do what you're talking about in Perl, it would be this (with double quotes, since cmd.exe does not treat single quotes as quoting characters):
perl -p -i -e "s/\.$/ ./" filename
You can install Cygwin and use sed from there. I also found Sed for Windows.
Edit:
There are very good answers to your question here:
Is there any sed like utility for cmd.exe
(I always add "stackoverflow" when I search on Google. I did the same for you: "sed on windows stackoverflow", but that is a different matter.)
For your use case:
From PowerShell.exe (comes with Windows)
(Get-Content file.txt) -Replace '\.$', ' .' | Set-Content file.txt
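With a 55 GB file, anything that reads the whole file into memory before writing is likely to be a problem, so a line-by-line pass into a second file is the safer shape. A minimal sketch of that idea in Python follows; the function names are hypothetical, UTF-8 is an assumed encoding, and you need enough free disk space for the output file.

```python
def add_space_before_trailing_dot(line):
    """Insert a space before a dot that ends the line, keeping the line ending."""
    body = line.rstrip("\r\n")
    ending = line[len(body):]
    if body.endswith("."):
        body = body[:-1] + " ."
    return body + ending


def fix_file(src, dst, encoding="utf-8"):
    """Stream src into dst one line at a time, so memory use stays small.

    Assumption: the file is text in the given encoding (UTF-8 by default).
    newline="" keeps CR+LF or LF line endings exactly as they were.
    """
    with open(src, "r", encoding=encoding, newline="") as fin, \
            open(dst, "w", encoding=encoding, newline="") as fout:
        for line in fin:
            fout.write(add_space_before_trailing_dot(line))
```

For example, fix_file("file.txt", "file.fixed.txt") writes the corrected copy next to the original, which can then be replaced once the result looks right.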
I searched for hours and hours and had so much trouble trying to find a solution to my use case, so I hope adding this answer helps someone else in the same situation.
For those who got here to figure out git filter clean/smudge like I did, here's how I finally managed it:
In file: .gitconfig (global)
[filter "replacePassword"]
required = true
clean = "PowerShell -Command \"(Get-Content " %f ") -Replace 'this is a password', 'this is NOT a password'\""
smudge = "PowerShell -Command \"(Get-Content " %f ") -Replace 'this is NOT a password', 'this is a password'\""
Please note that this snippet doesn't change the original file (this is intended for my use case).
Additional search terms to help those looking: interpolation, interpolate

Convert file from Windows to UNIX through Powershell or Batch

I have a batch script that prompts a user for some input then outputs a couple of files I'm using in an AIX environment. These files need to be in UNIX format (which I believe is UTF8), but I'm looking for some direction on the SIMPLEST way of doing this.
I don't like to have to download extra software packages; Cygwin or GnuWin32. I don't mind coding this if it is possible, my coding options are Batch, Powershell and VBS. Does anyone know of a way to do this?
Alternatively could I create the files with Batch and call a Powershell script to reform these?
The idea here is that a user is prompted for some information, then I output a standard file which is basically the prompt answers in AIX for a job. I'm using Batch initially, because I didn't know that I would run into this problem, but I'm leaning towards redoing this in PowerShell, because I found some code on another forum that can do the conversion (below).
% foreach($i in ls -name DIR/*.txt) { \
get-content DIR/$i | \
out-file -encoding utf8 -filepath DIR2/$i \
}
Looking for some direction or some input on this.
You can't do this without external tools in batch files.
If all you need is the file encoding, then the snippet you gave should work. If you want to convert the files inline (instead of writing them to another place) you can do
Get-ChildItem *.txt | ForEach-Object { (Get-Content $_) | Out-File -Encoding UTF8 $_ }
(the parentheses around Get-Content are important). However, this will write the files in UTF-8 with a signature at the start (U+FEFF), which some Unix tools don't accept (it is technically legal, but its use is discouraged).
Then there is the problem that line breaks are different between Windows and Unix. Unix uses only U+000A (LF) while Windows uses two characters for that: U+000D U+000A (CR+LF). So ideally you'd convert the line breaks, too. But that gets a little more complex:
Get-ChildItem *.txt | ForEach-Object {
  # get the contents and replace line breaks (CR+LF or a lone CR) by U+000A
  $contents = [IO.File]::ReadAllText($_.FullName) -replace "`r`n?", "`n"
  # create UTF-8 encoding without signature
  $utf8 = New-Object System.Text.UTF8Encoding $false
  # write the text back; FullName keeps .NET from resolving a relative name against its own working directory
  [IO.File]::WriteAllText($_.FullName, $contents, $utf8)
}
Try the overloaded version ReadAllText(String, Encoding) if you are using ANSI characters and not only ASCII ones.
$contents = [IO.File]::ReadAllText($_.FullName, [Text.Encoding]::Default) -replace "`r`n", "`n"
https://msdn.microsoft.com/en-us/library/system.io.file.readalltext(v=vs.110).aspx
https://msdn.microsoft.com/en-us/library/system.text.encoding(v=vs.110).aspx
ASCII - Gets an encoding for the ASCII (7-bit) character set.
Default - Gets an encoding for the operating system's current ANSI code page.
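Python is not one of the options listed in the question, but the two steps described above (decode from the ANSI code page, normalise line breaks, write UTF-8 without a signature) are easy to see in a short sketch. The function names are made up, and cp1252 is only an assumed stand-in for "the current ANSI code page"; use whatever your system actually writes.

```python
def to_unix_utf8(data, source_encoding="cp1252"):
    """Convert raw file bytes to UTF-8 (no BOM) with LF line endings.

    Assumption: the input was written in source_encoding (cp1252 here).
    """
    text = data.decode(source_encoding)
    # CR+LF first, then any remaining lone CR
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # the plain "utf-8" codec does not emit a signature (U+FEFF)
    return text.encode("utf-8")


def convert_file(path, source_encoding="cp1252"):
    """Rewrite path in place using to_unix_utf8."""
    with open(path, "rb") as fh:
        data = fh.read()
    with open(path, "wb") as fh:
        fh.write(to_unix_utf8(data, source_encoding))
```

Working on bytes and opening the file in binary mode keeps Python from doing any line-ending translation of its own on Windows.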

an app or a batch file script to remove special characters from text

I love this online tool http://textmechanic.co/ but it lacks another important feature, which is to delete special characters such as %, [, ), *, ?, ', etc., except for _, -, and ., from a large quantity of text.
I am looking for an online tool or a small windows utility or a batch script that can do this.
I think sed is the easiest choice here. You can download it for Windows here. Furthermore, nearly every text editor should allow that (but most won't cope well with files in the multi-GiB range).
With sed you'd probably want something like this:
sed "s/[^a-zA-Z0-9_.-]//g" file.txt
Likewise, if you have a semi-recent Windows (e.g. Windows 7), then PowerShell comes preinstalled with it. The following one-liner will do that for you:
Get-Content file.txt | foreach { $_ -replace '[^\w\d_.-]' } | Out-File -Encoding UTF8 file.new.txt
This can easily be adapted to multiple files as well. To write back into the original file, wrap Get-Content in parentheses so the whole file is read before anything is written; otherwise the pipeline would still be reading the file while Out-File tries to overwrite it. That also means very large files have to fit in memory.
You can do regex with any tool/language that supports it. Here's a Ruby command for Windows (double quotes outside, because cmd.exe does not treat single quotes as quoting characters):
C:\work>ruby -ne "print $_.gsub(/[%)?\[\]*]/,'')" file
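The same whitelist idea as the sed command above can be written as a small Python function. This is only a sketch: strip_special is a made-up name, and keeping whitespace (spaces, tabs, line breaks) is my assumption, since the question only names _, - and . as exceptions.

```python
import re

# Everything that is NOT a letter, digit, underscore, dot, hyphen or
# whitespace gets removed. Keeping whitespace is an assumption.
_NOT_ALLOWED = re.compile(r"[^A-Za-z0-9_.\-\s]")


def strip_special(text):
    """Remove special characters, keeping letters, digits, _ . - and whitespace."""
    return _NOT_ALLOWED.sub("", text)


def strip_special_in_file(src, dst, encoding="utf-8"):
    """Filter src into dst line by line (the encoding is an assumption)."""
    with open(src, "r", encoding=encoding) as fin, \
            open(dst, "w", encoding=encoding) as fout:
        for line in fin:
            fout.write(strip_special(line))
```

Unlike \w in the PowerShell pattern, the explicit A-Za-z0-9 class also drops accented letters, which matches the sed version.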

Extracting a 7-Zip file "silently" - command line option

I want to extract a 7-Zip archive in a Python script. It works fine except that it spits out the extraction details (which is huge in my case).
Is there a way to avoid this verbose information while extracting? I did not find any "silent" command line option to 7z.exe.
My command is
7z.exe -o some_dir x some_archive.7z
I just came across this when searching for the same, but I solved it myself! Assuming the command is processed with Windows / DOS, a simpler solution is to change your command to:
7z.exe -o some_dir x some_archive.7z > nul
That is, direct the output to a null file rather than the screen.
Or you could pipe the output to the DOS "find" command to only output specific data, that is,
7z.exe -o some_dir x some_archive.7z | FIND "ing archive"
This would just result in the following output.
Creating archive some_archive.7z
or
Updating archive some_archive.7z
My final solution was to change the command to
... some_archive.7z | FIND /V "ing  "
Note the double space after 'ing'. This resulted in the following output.
7-Zip 9.20 Copyright (c) 1999-2010 Igor Pavlov 2010-11-18
Scanning
Updating some_archive.7z
Everything is Ok
This removes the individual file processing, but produces a summary of the overall operation, regardless of the operation type.
Like they said, to hide most of the screen-filling messages you could use ... some_archive.7z | FIND /V "Compressing" but that "FIND" would also remove the error messages that had that word. You would not be warned. That "FIND" also may have to be changed because of a newer 7-zip version.
7-zip has a forced verbose output, no silence mode, mixes stderr and stdout(*), doesn't save Unix permissions, etc. Those anti-standards behaviors together put "7-zip" in a bad place when being compared to "tar+bzip2" or "zip", for example.
(*) "Upstream (Igor Pavlov) does not want to make different outputs for messages, even though he's been asked several times to do so :(" http://us.generation-nt.com/answer/bug-346463-p7zip-stdout-stderr-help-166693561.html - "Igor Pavlov does not want to change this behaviour" http://sourceforge.net/tracker/?func=detail&aid=1075294&group_id=111810&atid=660493
7zip does not have an explicit "quiet" or "silent" mode for command line extraction.
One possibility would be to spawn the child process with popen, so its output will come back to the parent to be processed/displayed (if desired) or else completely ignored (create your popen object with stdout=PIPE and stderr=PIPE to be able to retrieve the output from the child).
Otherwise, try this:
%COMSPEC% /c "%ProgramFiles%\7-Zip\7z.exe" ...
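Since the question is about calling 7z from a Python script, the popen idea above can be sketched with the standard subprocess module. The helper names build_extract_command and run_quietly are made up for this example, and it assumes 7z is on the PATH. Note that the output directory is attached directly to -o with no space, which is how 7-Zip expects that switch.

```python
import subprocess
import sys


def build_extract_command(archive, out_dir, exe="7z"):
    """Build the argument list for extracting archive into out_dir.

    -o takes the directory with no space in between; -y answers yes to
    all prompts. The exe name "7z" is an assumption about your PATH.
    """
    return [exe, "x", archive, "-o" + out_dir, "-y"]


def run_quietly(command):
    """Run command, throw away stdout, return (exit_code, stderr_text)."""
    result = subprocess.run(
        command,
        stdout=subprocess.DEVNULL,  # discard the per-file listing
        stderr=subprocess.PIPE,     # keep whatever goes to stderr
        text=True,
    )
    return result.returncode, result.stderr


# Usage (commented out so the snippet does not need 7z to be importable):
# code, errors = run_quietly(build_extract_command("some_archive.7z", "some_dir"))
# if code != 0:
#     print("7z failed with exit code", code, file=sys.stderr)
#     print(errors, file=sys.stderr)
```

Because some messages may end up on stdout rather than stderr (see the complaints elsewhere in this thread), the exit code is the more reliable signal that something went wrong.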
Expanding on @Matthew's answer and this answer: https://superuser.com/questions/194659/how-to-disable-the-output-of-7-zip
I'm using FINDSTR instead of FIND so I can chain multiple patterns to exclude, blank lines included:
7za.exe a test1.zip .\foldertozip | FINDSTR /V /R /C:"^Compressing  " /C:"Igor Pavlov" /C:"^Scanning$" /C:"^$" /C:"^Everything is Ok$"
/V: exclude
/R: regex
/C:"^Compressing  " : beginning of line, Compressing, 2 spaces
/C:"^Scanning$" : the word Scanning on its own on a line (beginning/end)
/C:"^$" : a beginning and end without anything in between, i.e. a blank line
I'm using /C so that a space is a space; otherwise it's a separator between multiple words to exclude, as in this simpler version:
FINDSTR /V "Compressing Pavlov Scanning Everything"
(the same caveats apply: if the wording changes in a new version, or if a useful line starts with the word "Compressing  ", it will not work as expected).
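If the 7z output is already captured inside a Python script, the same exclusion list can be applied there instead of piping through FINDSTR. This is a rough sketch: filter_output is a made-up name, and the patterns simply mirror the /C: strings above, so they carry the same caveat about wording changes between 7-Zip versions.

```python
import re

# Same patterns as the FINDSTR /C: arguments above.
NOISE_PATTERNS = [
    r"^Compressing  ",       # per-file progress lines (two spaces)
    r"Igor Pavlov",          # banner line
    r"^Scanning$",
    r"^$",                   # blank lines
    r"^Everything is Ok$",
]


def filter_output(text, patterns=NOISE_PATTERNS):
    """Return text without the lines that match any of the patterns."""
    compiled = [re.compile(p) for p in patterns]
    kept = [
        line
        for line in text.splitlines()
        if not any(c.search(line) for c in compiled)
    ]
    return "\n".join(kept)
```

Anything that does not match, such as an error line, passes through untouched, so an empty result means 7z printed nothing unexpected.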
If you're running 7-zip.exe from Powershell, and you only want to see errors, then you could try something like this:
7-zip.exe u <Target> <Source> | Select-String "Error" -Context 10
This will only display the "Error" message line and the surrounding 10 lines (or whatever number) to capture the error specific output.
The | FIND is a good alternative to show what happened without displaying insignificant text.
Examining the 7-Zip source, I found a hidden -ba switch that seems to do the trick. Unfortunately it is not finished. I managed to make it work with several modifications to the sources, but it's just a hack. If someone's interested, the option variable is called options.EnableHeaders and the changes are required in the CPP/7zip/UI/Console/Main.cpp file.
Alternatively you can poke 7Zip's author to finish the feature in tracker. There are several requests on this and one of them is here.
7-Zip has no such option. Plus, the lines printed for each compressed file are supposed to display at the same spot without a newline, each erasing the previous one, which has a cool effect. Unfortunately, in some contexts (Jenkins...) it produces several lines ☹️ flooding the console.
NUL (windows) is maybe one solution.
7-zip.exe -o some_dir x some_archive.7z>NUL
To show just the last 4 lines...
7z x -y some_archive.7z | tail -4
gives me:
Everything is Ok
Size: 917519
Compressed: 171589
The switch -y is to answer yes to everything (in my case to override existing files).
On Unix-like operating systems (Linux, BSD, etc.) the shell command 7z ... >/dev/null will discard all text written by 7z to standard output. That should cover all the status/informational messages written by 7z.
It seems that 7z writes error messages to standard error so if you do >/dev/null, error messages will still be shown.
As Fr0sT said above, the -ba switch outputs only the useful lines (at least with the list command, which is what I tried).
7z.exe l archive_name.zip
7z.exe l -ba archive_name.zip
made a great difference, especially for parsing the output in scripts.
There is no need to modify anything; just use the -ba switch in version 19. Someone above said this too. I'm posting it as an answer because I can't comment.
You can stop 7-Zip from displaying prompts by using the -y switch. This will answer yes to all prompts. Use this only when you are confident.

Resources