VBScript Regular Expression to remove all chars but digits and hyphen ("-")

Can someone help me write a RegExp to remove all characters except numbers and the hyphen (minus sign, "-") between them?
The string looks like:
C:\Documents and Settings\User\Desktop\New Folder 2\New\Folder\1586-10 bougsfiugUYG(jygf) hoihd.xls
it needs to be:
1586-10
only.
The number of digits before -10 is unspecified (it can be 4-6); -10 itself can be any two-digit number.
To make it easier, here is the function I found:
Public Function strClean (strtoclean)
Dim objRegExp, outputStr
Set objRegExp = New Regexp
objRegExp.IgnoreCase = True
objRegExp.Global = True
objRegExp.Pattern = "((?![a-zA-Z0-9]).)+"
outputStr = objRegExp.Replace(strtoclean, "-")
objRegExp.Pattern = "\-+"
outputStr = objRegExp.Replace(outputStr, "-")
strClean = outputStr
End Function
The pattern currently turns file names into this:
C-Documents-and-Settings-Lena-Desktop-New-Folder-2-New-Folder-2-1588-11-sfiuhsgu-(fgRG75476)-skghgsiu.xls

\d?\d?\d\d\d\d-\d\d
\d is a digit.
? means zero or one of the preceding character.
So \d? means that there can be 0 or 1 digit in that position.
Edit: Added a sample of how to use it after the comments
Dim myRegExp
Set myRegExp = New RegExp
myRegExp.Pattern = "\d?\d?\d\d\d\d-\d\d"
Dim test
test = "C:\Documents and Settings\User\Desktop\New Folder 2\New\Folder\1586-10 bougsfiugUYG(jygf) hoihd.xls"
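' Search only the file name, i.e. everything after the last backslash.
' (Right() takes a length, not a position, so Mid() is used here.)
Set myMatches = myRegExp.Execute(Mid(test, InStrRev(test, "\") + 1))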
WScript.Echo myMatches(0)
Edit2: Code snippet used to call your code
Dim test
test = "C:\Documents and Settings\User\Desktop\New Folder 2\New\Folder\1586-10 bougsfiugUYG(jygf) hoihd.xls"
test = StrClean(test)
WScript.Echo test
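The extraction approach from the answer above can also be wrapped in a function that copes with paths containing no match. This is only a sketch: ExtractNumber is a hypothetical name, and it assumes the number always has 4 to 6 digits, a hyphen and exactly two digits, as described in the question.

```vbscript
Option Explicit

' Hypothetical helper: returns the first "dddd-dd" style number found in the
' file-name part of a path, or "" when there is none.
Function ExtractNumber(ByVal path)
    Dim re, matches, fileName
    ' Keep only the text after the last backslash (the whole string if none).
    fileName = Mid(path, InStrRev(path, "\") + 1)
    Set re = New RegExp
    re.Pattern = "\d{4,6}-\d{2}"   ' 4 to 6 digits, a hyphen, then 2 digits
    Set matches = re.Execute(fileName)
    If matches.Count > 0 Then
        ExtractNumber = matches(0).Value
    Else
        ExtractNumber = ""
    End If
End Function

WScript.Echo ExtractNumber("C:\Documents and Settings\User\Desktop\New Folder 2\New\Folder\1586-10 bougsfiugUYG(jygf) hoihd.xls")
```

Checking matches.Count first avoids a runtime error on matches(0) when a file name has no number in it.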

Related

Remove unnecessary data/Spaces in CSV

I need help removing fields that contain only spaces (empty fields) without affecting the spaces in other fields. Here's my sample data.
12345," ","abcde fgh",2017-06-06,09:00,AM," ", US
expected output:
12345,,"abcde fgh",2017-06-06,09:00,AM,, US
since " " should be considered as null.
I tried the Trim() function, but it did not work. I also tried a Regex pattern, but that didn't help either.
Here's my sample function.
Private Sub Transform(delimiter As String)
Dim sFullPath As String
Dim strBuff As String
Dim re As RegExp
Dim matches As Object
Dim m As Variant
If delimiter <> "," Then
strBuff = Replace(strBuff, delimiter, ",")
Else
With re
.Pattern = "(?!\B""[^""]*)" & delimiter & "(?![^""]*""\B)"
.IgnoreCase = False
.Global = True
End With
Set matches = re.Execute(strBuff)
For Each m In matches
strBuff = re.Replace(strBuff, ",")
Next
Set re = Nothing
Set matches = Nothing
End If
End Sub
I think you're on the right track. Try using this for your regular expression. The two double quotes in a row are how a single double quote is included in a string literal. Some people prefer to use Chr(34) to include double quotes inside a string.
\B(\s)(?!(?:[^""]*""[^""]*"")*[^""]*$)
Using that expression on your example string
12345," ","abcde fgh",2017-06-06,09:00,AM," ", US
yields
12345,"","abcde fgh",2017-06-06,09:00,AM,"", US
Example function
Private Function Transform(ByVal strLine As String) As String
Dim objRegEx As RegExp
On Error GoTo ErrTransForm
Set objRegEx = New RegExp
With objRegEx
.Pattern = "\B(\s)(?!(?:[^""]*""[^""]*"")*[^""]*$)"
.IgnoreCase = False
.Global = True
Transform = .Replace(strLine, "")
End With
ExitTransForm:
If Not objRegEx Is Nothing Then
Set objRegEx = Nothing
End If
Exit Function
ErrTransForm:
'error handling code here
GoTo ExitTransForm
End Function
And credit where credit is due: I used the answer to "Regex Replace Whitespaces between single quotes" as the basis for the expression here.
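If the goal is the exact output from the question (,, rather than ,"",), one possible variation is to match the whole whitespace-only quoted field and drop it. This is a sketch under stated assumptions: BlankToEmpty is a hypothetical name, the delimiter is a comma, each call receives a single record, and fields do not contain embedded quotes.

```vbscript
Option Explicit

' Hypothetical helper: drops quoted fields that contain only whitespace,
' so ," ", becomes ,, while "abcde fgh" is left untouched.
Function BlankToEmpty(ByVal strLine)
    Dim re : Set re = New RegExp
    re.Global = True
    ' (^|,)     start of the line or the preceding delimiter, kept via $1
    ' "\s*"     a quoted field holding nothing but whitespace
    ' (?=,|$)   must be followed by a delimiter or the end of the line
    re.Pattern = "(^|,)""\s*""(?=,|$)"
    BlankToEmpty = re.Replace(strLine, "$1")
End Function

WScript.Echo BlankToEmpty("12345,"" "",""abcde fgh"",2017-06-06,09:00,AM,"" "", US")
```

For the sample line this should print 12345,,"abcde fgh",2017-06-06,09:00,AM,, US. The lookahead leaves the following comma unconsumed, so two blank fields in a row are both handled.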
I would add an output string variable and have a conditional statement saying if the input is not empty, add it on to the output string. For example (VB console app format with the user being prompted to enter many inputs):
Dim input As String
Dim output As String
Do
input = console.ReadLine()
If Not input = " " Then
output += input
End If
Loop Until (end condition)
Console.WriteLine(output)
You can throw any inputs you don't want into the conditional to remove them from the output.
Your CSV file isn't correctly formatted.
Double quotes shouldn't exist in it, so open your CSV with Notepad and replace them with an empty string.
After this, you have a real CSV file that you can import without problems.

What do these lines of code do in VBScript?

I am converting them to a Jython script, and I think all they do is remove spaces at the ends.
function test (strField)
Dim re
Set re = New RegExp
re.Pattern = "^\s*"
re.MultiLine = False
strField = re.replace(strField,"")
End Function
It uses the RegExp object in VBScript to check for whitespace \s at the start of the variable passed into the Sub / Function called strField. Once it identifies the whitespace it uses the Replace() method to remove any matched characters from the start of the string.
As @ansgar-wiechers mentioned in the comments, it is just an all-whitespace implementation of the LTrim() function.
I'm assuming this is meant to be a Function, though (I haven't tested it, but maybe VBScript accepts Fun as shorthand for Function; that's not something I'm familiar with personally). With that in mind, it should return the modified strField value as the result of the function. I would also recommend using ByVal so that the change to strField doesn't leak out of the function.
Function test(ByVal strField)
Dim re
Set re = New RegExp
re.Pattern = "^\s*"
re.MultiLine = False
strField = re.replace(strField,"")
test = strField
End Function
Usage in code:
Dim testin: testin = " some whitespace here"
Dim testout: testout = test(testin)
WScript.Echo """" & testin & """"
WScript.Echo """" & testout & """"
Output:
" some whitespace here"
"some whitespace here"

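Since the question assumed the code strips spaces at both ends, while the pattern ^\s* only touches the start, here is a minimal sketch of a two-sided version. TrimAll is a hypothetical name, and the pattern is an assumption about what the port actually needs.

```vbscript
Option Explicit

' Hypothetical helper: strips whitespace from both ends of a string.
Function TrimAll(ByVal strField)
    Dim re : Set re = New RegExp
    re.Global = True              ' replace both the leading and the trailing run
    re.Pattern = "^\s+|\s+$"      ' whitespace at the start OR at the end
    TrimAll = re.Replace(strField, "")
End Function

WScript.Echo """" & TrimAll(vbTab & "  some whitespace here  " & vbCrLf) & """"
```

Global must be True here; otherwise only the first alternative that matches (the leading whitespace) would be replaced.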
Search and divide numeric strings

I have an output.txt file which has following content:
Windows 6543765432
Linux 4534653463
MacOS 3564325
Ubuntu 8235646255
I want to create a VBScript that searches for all numeric values in output.txt and divides them by 1024, so that memory in KB is converted to MB.
I have tried this in batch, but because of the 2 GB limitation it doesn't work in the above case.
Your top-level task is to modify a small file of structured text. The 'design pattern' for such a task is:
Get a FileSystemObject
Specify the full file path
Read the content
Modify the content
Write the modified content back
Your sub-task of modification involves computations on non-constant (varying) parts, so a 'RegExp.Replace with Function' strategy should be used:
Define a RegExp (global, identify the parts to change)
.Replace(input, GetRef("function to do the computations on the parts"))
In your case, that function should convert the (string) parts to numbers, divide them, and return the results converted back to strings.
In code:
Option Explicit
Const FSPEC = "..\testdata\txt\19556079.txt"
Dim oFS : Set oFS = CreateObject( "Scripting.FileSystemObject" )
Dim sAll : sAll = Modify(oFS.OpenTextFile(FSPEC).ReadAll())
oFS.CreateTextFile(FSPEC).Write sAll
WScript.Echo oFS.OpenTextFile(FSPEC).ReadAll()
Function Modify(s)
Dim re : Set re = New RegExp
re.Global = True
re.Pattern = "\d+"
Modify = re.Replace(s, GetRef("FiMoReFunc"))
End Function
Function FiMoReFunc(sM, sP, sS)
FiMoReFunc = CStr(CDbl(sM) / 1024)
End Function
For a more fancy output:
FiMoReFunc = Right(Space(20) & FormatNumber(CDbl(sM) / 1024, 1, True) & " Unit", 20)
output:
Windows 6,390,395.9 Unit
Linux 4,428,372.5 Unit
MacOS 3,480.8 Unit
Ubuntu 8,042,623.3 Unit
Try this
Option Explicit
Const FILE = "output.txt"
Dim fso
Dim ts,line
Dim match,matches
Dim os,mem
Set fso = CreateObject("Scripting.FilesystemObject")
Set ts = fso.OpenTextFile(FILE)
With New RegExp
.IgnoreCase = True
While Not ts.AtEndOfStream
line = ts.ReadLine
.Pattern = "[a-z]+"
Set matches = .Execute(line)
For Each match In matches
os = match.Value
Next
.Pattern = "\d+"
Set matches = .Execute(line)
For Each match In matches
mem = (CDbl(match.Value) / 1024)
Next
WScript.Echo os & vbTab & mem
Wend
End With
Set ts = Nothing
Set fso = Nothing
WScript.Quit

How to Remove Specific Special Characters

I have a string like X5BC8373XXX, where X is a special character that displays as a square.
I also have some special characters like \n, which I can remove, but I can't remove the squares...
I'd like to know how to remove them.
I found this method:
Dim Test As String
Test = Replace(Mscomm1.Input, Chr(160), Chr(64)) 'Here I remove some of the special characters like \n
Test = Left$(Test, Len(Test) -2)
Test = Right$(Test, Len(Test) -2)
This method DOES remove those special characters, but it also removes my first character, 5.
I realize that this method just removes 2 characters from the left and the right,
but how could I work around this to remove only these special characters?
I also saw something about vbLf, vbCrLf or similar, but I couldn't get it to work ;\
You can use regular expressions. If you want to remove everything that's not a number or letter, you can use the code below. If there are other characters you want to keep, regular expressions are highly customizable, but can get a little confusing.
This also has the benefit of doing the whole string at once, instead of character by character.
You'll need to reference Microsoft VBScript Regular Expressions in your project.
Function AlphaNum(OldString As String)
Dim RE As New RegExp
RE.Pattern = "[^A-Za-z0-9]"
RE.Global = True
AlphaNum = RE.Replace(OldString, "")
End Function
Cleaning out non-printable characters is easy enough. One brute-force but easily customizable method might be:
Private Function Printable(ByVal Text As String) As String
Dim I As Long
Dim Char As String
Dim Count As Long
Printable = Text 'Allocate space, same width as original.
For I = 1 To Len(Text)
Char = Mid$(Text, I, 1)
If Char Like "[ -~]" Then
'Char was in the range " " through "~" so keep it.
Count = Count + 1
Mid$(Printable, Count, 1) = Char
End If
Next
Printable = Left$(Printable, Count)
End Function
Private Sub Test()
Dim S As String
S = vbVerticalTab & "ABC" & vbFormFeed & vbBack
Text1.Text = S 'Shows "boxes" or "?" depending on the font.
Text2.Text = Printable(S)
End Sub
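The same " " through "~" range can also be written as a regular expression, which avoids the character-by-character loop. This is a sketch rather than a drop-in replacement: PrintableRe is a hypothetical name, and the code is written as VBScript (in VB6 it would need the Microsoft VBScript Regular Expressions reference mentioned above).

```vbscript
Option Explicit

' Hypothetical helper: removes every character outside the printable ASCII
' range (space, &H20, through tilde, &H7E).
Function PrintableRe(ByVal text)
    Dim re : Set re = New RegExp
    re.Global = True
    re.Pattern = "[^\x20-\x7E]"   ' anything that is NOT in " " through "~"
    PrintableRe = re.Replace(text, "")
End Function

' Chr(11) = vertical tab, Chr(12) = form feed, Chr(8) = backspace
WScript.Echo PrintableRe(Chr(11) & "ABC" & Chr(12) & Chr(8))
```

Note that this also strips accented letters and other characters above "~"; widen the range in the pattern if those need to survive.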
This will replace control characters (below Chr(32)) with a space:
Function CleanString(strBefore As String) As String
Dim strAfter As String
Dim strTest As String
Dim lngX As Long
Dim lngLen As Long
'The original assigned the length to an undeclared intLen, so the loop never ran.
lngLen = Len(strBefore)
For lngX = 1 To lngLen
strTest = Mid(strBefore, lngX, 1)
If Asc(strTest) < 32 Then
strTest = " "
End If
strAfter = strAfter & strTest
Next lngX
CleanString = strAfter
End Function

What is the best vbscript code to add decimal places to all numbers in a string?

Example
G76 I0.4779 J270 K7 C90
X20 Y30
If a number is preceded by I, J, K, C, X or Y and it doesn't have a decimal point, then add one.
Above example should look like:
G76 I0.4779 J270 K7. C90.
X20. Y30.
The purpose of this code is to convert CNC code for an older Fanuc OPC controller.
Set RegEx = New RegExp
RegEx.Global = True
RegEx.Pattern = "([IJKCXY]\d+)([^\.]|$)"
newVar = RegEx.Replace (oldString, "$1.$2")
Where oldString is the original string, and newVar is the string with the decimals added.
function convert(str)
Set RegEx = New RegExp
RegEx.Global = True
RegEx.Pattern = "([IJKCXY]\d*\.?\d*)"
Set Matches = regEx.Execute(str)
For Each Match in Matches
if instr(Match.value, ".") = 0 then
str = Replace(str, Match.value, Match.value & ".")
end if
Next
convert = str
end function
tloach's answer still doesn't work.
Wayne's works, but it also puts a . after every bare occurrence of I, J, K, C, X or Y.
I changed
if instr(Match.value, ".") = 0 then
to
if instr(Match.value, ".") = 0 and len(Match.value) > 1 then
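Another option is a single Replace with a negative lookahead, which avoids both the bare-letter problem and the backtracking problem of consuming a following character. This is only a sketch: AddDecimals is a hypothetical name, and it assumes unsigned values and upper-case address letters.

```vbscript
Option Explicit

' Hypothetical helper: appends "." to I/J/K/C/X/Y words that have digits
' but no decimal point.
Function AddDecimals(ByVal s)
    Dim re : Set re = New RegExp
    re.Global = True
    ' [IJKCXY]\d+   an address letter followed by at least one digit
    ' (?![\d.])     not followed by another digit or a dot, so "I0.4779"
    '               is left alone and the digits cannot be split by backtracking
    re.Pattern = "([IJKCXY]\d+)(?![\d.])"
    AddDecimals = re.Replace(s, "$1.")
End Function

WScript.Echo AddDecimals("G76 I0.4779 J270 K7 C90" & vbCrLf & "X20 Y30")
```

Following the stated rule, this also turns J270 into J270. (the expected output in the question leaves it unchanged). Signed values such as X-20 are not matched and would need the pattern extended.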
