How to remove characters from a string in VBscript - vbscript

I'm new to vbscript and Stack Overflow, and could really use some help.
Currently I'm trying to format a phone number that is read from an image and stored in a variable. Because the images are "dirty" extra characters find their way in, such as periods or parenthesis. I've already restricted the field as much as possible to help prevent picking up extra characters, but alas!
For example I want to turn ",123.4567890" into "123 456 7890" (not including the double quotes). Trouble is I won't know what extra characters could potentially be picked up, ruling out a simple replace.
My logic is remove any non numeric characters, start from the left, insert a space after the third number, insert a space after the sixth number.
Any help would be great, and please feel free to ask for more information if needed.

Welcome to Stack Overflow. You can remove non-digits using Regex, and concatenate the parts using Mid function.
For example:
Dim sTest
sTest = ",123.4567890"
With (New RegExp)
.Global = True
.Pattern = "\D" 'matches all non-digits
sTest = .Replace(sTest, "") 'all non-digits removed
End With
WScript.Echo Mid(sTest, 1, 3) & " "& Mid(sTest, 4, 3) & " "& Mid(sTest, 7, 4)
Or fully using Regex (via a second grouping pattern):
Dim sTest
sTest = ",123.4567890"
With (New RegExp)
.Global = True
.Pattern = "\D" 'matches all non-digits
sTest = .Replace(sTest, "") 'all non-digits removed
.Pattern = "(.{3})(.{3})(.{4})" 'grouping
sTest = .Replace(sTest, "$1 $2 $3") 'formatted
End With
WScript.Echo sTest

Use a first RegExp to clean non-digits from the input and second one using groups for the layout:
Function cleanPhoneNumber( sSrc )
Dim reDigit : Set reDigit = New RegExp
reDigit.Global = True
reDigit.Pattern = "\D"
Dim reStruct : Set reStruct = New RegExp
reStruct.Pattern = "(.{3})(.{3})(.+)"
cleanPhoneNumber = reStruct.Replace( reDigit.Replace( sSrc, "" ), "$1 $2 $3" )
End Function ' cleanPhoneNumber

Related

Finding String and Reformatting Accordingly

I've been working on a VBScript that finds all files with a specific extension, .dat, searches for a specific pattern, "Date Due:\d{8}", and shifts the string around in a specific format.
I am having two problems with the below code:
It is not reading the first line. Whenever I run the script it seems to jump immediately to the second line.
It is only using the first pattern it finds and replaces the following patterns with the first pattern in the newly formatted manner.
I hope this makes sense, it's a very specific script, but I am hoping for some help understanding the problem here.
Below is my code:
Set fso = CreateObject("Scripting.FileSystemObject")
'newtext = vbLf & "Date Due:" & sub_month & sub_day & sub_year 'the text replacing Date Due:
'the purpose of this script is to format the date after Date Due:, which is currently formatted as YYYYMMDD, to MM/DD/YYYY
'example: Date Date:20180605 should be Date Due:06/05/2018
Set re = New RegExp
re.Pattern = "(\nDate Due:\d{8})" 'Looking for line, Date Due: followed by 8 digits
Dim sub_str 'substring of date due, returns the 8 digits aka the date 12345678
Dim sub_month
Dim sub_day
Dim sub_year
Dim concat_full
re.Global = False
re.IgnoreCase = True
For Each f In fso.GetFolder("C:\Users\tgl\Desktop\TestFolder\").Files
If LCase(fso.GetExtensionName(f.Name)) = "dat" Then
text = f.OpenAsTextStream.ReadAll
sub_str = Mid(text, 10, 8) 'substring of the full line, outputs the 8 digit date
sub_month = Mid(sub_str, 5, 2) 'substring of the date, outputs the 2 digit month
sub_day = Mid(sub_str, 7, 2) 'substring of the date, outputs the 2 digit day
sub_year = Mid(sub_str, 1, 4) 'substring of the date, outputs the four digit year
newtext = vbLf & "Date Due:" & sub_month & "/" & sub_day & "/" & sub_year 'replaces the text pattern defined above and concatenates the substrings with slashes
'concat_full = (sub_month & sub_day & sub_year)
f.OpenAsTextStream(2).Write re.Replace(text, newtext)
End If
Next
EDIT: When changing re.Global to True it replaces each line with the one found pattern. It should be using each found pattern as it's own and not the first one it finds.
Make your regular expression more specific and use capturing groups for extracting the relevant submatches:
re.Pattern = "(\nDate Due:)(\d{4})(\d{2})(\d{2})"
then replace the matches like this:
re.Replace(text, "$1$4/$3/$2")
$1 through $4 in the replacement string are backreferences to the capturing groups in the pattern (i.e. they're replaced with the respective captured substring).

Is there a way to make special characters work when using InStr in VBScript?

A VBScript is in use to shorten the system path by replacing entries with the 8.3 versions since it gets cluttered with how much software is installed on our builds. I'm currently adding the ability to remove duplicates, but it's not working correctly.
Here is the relevant portion of code:
original = "apple;orange;apple;lemon\banana;lemon\banana"
shortArray=Split(original, ";")
shortened = shortArray(1) & ";"
For n=2 to Ubound(shortArray)
'If the shortArray element is not in in the shortened string, add it
If NOT (InStr(1, shortened, shortArray(n), 1)) THEN
shortened = shortened & ";" & shortArray(n)
ELSE
'If it already exists in the string, ignore the element
shortened=shortened
End If
Next
(Normally "original" is the system path, I'm just using fruit names to test...)
The output should be something like
apple;orange;lemon\banana
The issue is entries with punctuation, such as lemon\banana, seem to be skipped(?). I've tested it with other punctuation marks, still skips over it. Which is an issue, seeing how the system path has punctuation in every entry.
I know the basic structure works, since there are only one of each entry without punctuation. However, the real output is something like
apple;orange;lemon\banana;lemon\banana
I thought maybe it was just a character escape issue. But no. It still will not do anything with entries containing punctuation.
Is there something I am doing wrong, or is this just a "feature" of VBScript?
Thanks in advance.
This code:
original = "apple;orange;apple;lemon\banana;lemon\banana"
shortArray = Split(original, ";")
shortened = shortArray(0) ' array indices start with 0; & ";" not here
For n=1 to Ubound(shortArray)
'If the shortArray element is not in in the shortened string, add it
'i.e. If InStr() returns *number* 0; Not applied to a number will negate bitwise
' If 0 = InStr(1, shortened, shortArray(n), 1) THEN
If Not CBool(InStr(1, shortened, shortArray(n), 1)) THEN ' if you insist on Not
WScript.Echo "A", shortArray(n), shortened, InStr(1, shortened, shortArray(n), vbTextCompare)
shortened = shortened & ";" & shortArray(n)
End If
Next
WScript.Echo 0, original
WScript.Echo 1, shortened
WScript.Echo 2, Join(unique(shortArray), ";")
Function unique(a)
Dim d : Set d = CreateObject("Scripting.Dictionary")
Dim e
For Each e In a
d(e) = Empty
Next
unique = d.Keys()
End Function
output:
0 apple;orange;apple;lemon\banana;lemon\banana
1 apple;orange;lemon\banana
2 apple;orange;lemon\banana
demonstrates/explains your errors (indices, Not) and shows how to use the proper tool for uniqueness (dictionary).

Replace characters suing VB script

I want to replace a particular value followed by TRN* to some other value.How to do the coding for the same?..Please provide an example.
For example :
TRN*12345*34444~
This is a segment in my file(like this I have many TRN* segments in my file).I want to replace the segment after TRN* and before next * (ie.12345)with some other value.
Is there any way to do this by using vbscript?
Thanks in advance
The regular expression way of safetyOtter is the way to go, you only have to fiddle with the pattern and replacement string:
originalString = "TRN*12345*34444~"
replaceValue = "78910"
Set re = new RegExp
re.Pattern = "(.*TRN\*)([^*]+)(.*)"
re.IgnoreCase = False
' This keeps the first and last part between parenthesis, but replaces the middle part
newString = re.Replace(originalString, "$1" & replaceValue & "$3")
msgbox newString
' result: TRN*78910*34444~
Something like this should work, may have to tweak, can't test it at the moment
Set regEx = New RegExp
regEx.Pattern = "TRN\*.*?\*"
regEx.IgnoreCase = True
replStr = "TRN*" & SomeOtherValue& "*"
ReplaceTest = regEx.Replace(YourStringHere, replStr)
Edit:
if you are looking to replace a specific number with another, a simple:
YourStringHere = Replace(YourStringHere,"TRN*12345*","TRN*67890*")
Is enough

vbscript - Replace all spaces

I have 6400+ records which I am looping through. For each of these: I check that the address is valid by testing it against something similar to what the Post Office uses (find address). I need to double check that the postcode I have pulled back matches.
The only problem is that the postcode may have been inputted in a number of different formats for example:
OP6 6YH
OP66YH
OP6 6YH.
If Replace(strPostcode," ","") = Replace(xmlAddress.selectSingleNode("//postcode").text," ","") Then
I want to remove all spaces from the string. If I do the Replace above, it removes the space for the first example but leave one for the third.
I know that I can remove these using a loop statement, but believe this will make the script run really slow as it will have to loop through 6400+ records to remove the spaces.
Is there another way?
I didn't realise you had to add -1 to remove all spaces
Replace(strPostcode," ","",1,-1)
Personally I've just done a loop like this:
Dim sLast
Do
sLast = strPostcode
strPostcode = Replace(strPostcode, " ", "")
If sLast = strPostcode Then Exit Do
Loop
However you may want to use a regular expression replace instead:
Dim re : Set re = New RegExp
re.Global = True
re.Pattern = " +" ' Match one or more spaces
WScript.Echo re.Replace("OP6 6YH.", "")
WScript.Echo re.Replace("OP6 6YH.", "")
WScript.Echo re.Replace("O P 6 6 Y H.", "")
Set re = Nothing
The output of the latter is:
D:\Development>cscript replace.vbs
OP66YH.
OP66YH.
OP66YH.
D:\Development>
This is the syntax Replace(expression, find, replacewith[, start[, count[, compare]]])
it will default to -1 for count and 1 for start. May be some dll is corrupt changing the defaults of Replace function.
String.Join("", YourString.Split({" "}, StringSplitOptions.RemoveEmptyEntries))
Because you get all strings without spaces and you join them with separator "".

VB6 getting ride of large spaces

Hey all, i am trying to replace large spaces between text with just one. My output looks like this right now:
5964215">
This is just the first example of the spaces
5964478">
This would be the 2nd example of showing how many spaces this thing has in each sentence.
5964494">
That comes from a textbox that has multi-line to true. Here is what it looks like when it doesn't have multi-line to true.
http://www.june3rdsoftware.com/forums/vb6.jpg
I can not seem to get the spaces to go away! BTW, this text is from a webpage if that makes any difference.
David
According to the suggestion of MvanGeest, here is some VB code to replace blocks of white spaces:
Sub test()
Dim x As String, y As String
x = "abcd defg 1233"
Dim re As New RegExp
re.Pattern = "\s+"
re.Global = True
y = re.Replace(x, " ")
Debug.Print y
End Sub
To make this work, you will have to add a reference to "Microsoft VBScript Regular Expresssions" to your project.
Assuming no regex support, why not set up a simple state machine that will set the state=1 when a space is found and set state=0 once a non-space is encountered. You can move char by char when state=0 (thus copying over only 1 space per series of spaces).
Also assuming no regex, you could try something like
str = "long text with spaces "
i = LenB(str)
str = Replace(str, " ", " ")
Do While LenB(str) <> i
i = LenB(str)
str = Replace(str, " ", " ")
Loop
Of course this code could be optimized for long space sequences but it might be all you need as well

Resources