Finding String and Reformatting Accordingly - vbscript

I've been working on a VBScript that finds all files with a specific extension, .dat, searches for a specific pattern, "Date Due:\d{8}", and shifts the string around in a specific format.
I am having two problems with the below code:
It is not reading the first line. Whenever I run the script it seems to jump immediately to the second line.
It is only using the first pattern it finds and replaces the following patterns with the first pattern in the newly formatted manner.
I hope this makes sense, it's a very specific script, but I am hoping for some help understanding the problem here.
Below is my code:
Set fso = CreateObject("Scripting.FileSystemObject")
'newtext = vbLf & "Date Due:" & sub_month & sub_day & sub_year 'the text replacing Date Due:
'the purpose of this script is to format the date after Date Due:, which is currently formatted as YYYYMMDD, to MM/DD/YYYY
'example: Date Due:20180605 should be Date Due:06/05/2018
Set re = New RegExp
re.Pattern = "(\nDate Due:\d{8})" 'Looking for line, Date Due: followed by 8 digits
Dim sub_str 'substring of date due, returns the 8 digits aka the date 12345678
Dim sub_month
Dim sub_day
Dim sub_year
Dim concat_full
re.Global = False
re.IgnoreCase = True
For Each f In fso.GetFolder("C:\Users\tgl\Desktop\TestFolder\").Files
If LCase(fso.GetExtensionName(f.Name)) = "dat" Then
text = f.OpenAsTextStream.ReadAll
sub_str = Mid(text, 10, 8) 'substring of the full line, outputs the 8 digit date
sub_month = Mid(sub_str, 5, 2) 'substring of the date, outputs the 2 digit month
sub_day = Mid(sub_str, 7, 2) 'substring of the date, outputs the 2 digit day
sub_year = Mid(sub_str, 1, 4) 'substring of the date, outputs the four digit year
newtext = vbLf & "Date Due:" & sub_month & "/" & sub_day & "/" & sub_year 'replaces the text pattern defined above and concatenates the substrings with slashes
'concat_full = (sub_month & sub_day & sub_year)
f.OpenAsTextStream(2).Write re.Replace(text, newtext)
End If
Next
EDIT: When I change re.Global to True, it replaces every matching line with the first pattern found. Each match should be reformatted from its own date, not from the first one found.

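Your code builds newtext once, from a fixed position in the file (Mid(text, 10, 8)), so every match is replaced with that same date. Make your regular expression more specific and use capturing groups for extracting the relevant submatches:
re.Pattern = "(\nDate Due:)(\d{4})(\d{2})(\d{2})"
then set re.Global = True and replace the matches like this:
re.Replace(text, "$1$3/$4/$2")
$1 through $4 in the replacement string are backreferences to the capturing groups in the pattern (i.e. they're replaced with the respective captured substring). Here $1 is the "Date Due:" prefix, $2 the year, $3 the month and $4 the day, so "$1$3/$4/$2" yields MM/DD/YYYY.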
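The capture-group approach can be tried on an in-memory string before touching any files. The following is a minimal sketch, not the answerer's exact code: the sample text is hypothetical, and the switch from a leading \n to ^ with re.MultiLine = True is an assumption made so that a "Date Due:" entry on the very first line is matched too (a pattern starting with \n needs a preceding line break, which the first line does not have).

```vbscript
Option Explicit

Dim re : Set re = New RegExp
re.Global = True       ' rewrite every match, not only the first
re.IgnoreCase = True
re.MultiLine = True    ' let ^ match at the start of each line
re.Pattern = "^(Date Due:)(\d{4})(\d{2})(\d{2})"

' Hypothetical sample text standing in for the contents of one .dat file
Dim sample
sample = "Date Due:20180605" & vbLf & "Item: A" & vbLf & "Date Due:20191231"

' $1 = prefix, $2 = year, $3 = month, $4 = day
Dim result : result = re.Replace(sample, "$1$3/$4/$2")
WScript.Echo result
```

Each date is rewritten from its own digits, so the two sample entries become 06/05/2018 and 12/31/2019 respectively. In the original loop, the same re.Replace call would take the place of the fixed newtext string.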

Related

VBScript - Conditional Replacement Regex

In a text file, I want to remove all Carriage Returns (vbCrLf) except those vbCrLf that are preceded by a period (.)
Honestly, no regex is needed.
Open the text file, split the entire text contents on what you want to retain (to create an array), then loop through the array and replace what you don't want during the file rewrite, making sure to append what you wanted to retain.
For example:
Dim retain : retain = "." & vbCrLf
Dim toss : toss = vbCrLf
With CreateObject("Scripting.FileSystemObject")
Dim txtFile
With .OpenTextFile("C:\TestFile.txt", 1, False)
txtFile = Split(.ReadAll(), retain)
.Close
End With
With .OpenTextFile("C:\NewTestFile.txt", 2, True)
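Dim i
For i = 0 To UBound(txtFile)
.Write Replace(txtFile(i), toss, " ")
'Re-insert the retained text only between pieces, not after the last one
If i < UBound(txtFile) Then .Write retain
Next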
.Close
End With
End With
Truth be told, VBScript's RegExp Pattern syntax may not be robust enough, since it has no look-behind patterns, to handle your conditional criteria cleanly with a single pass.

Converting Time-Stamp String to Another Date Format in VBScript

I have a datetime stamp (MMDDYYYYHHMMSS) extracted from a file name:
Filedate = "10212015140108"
How can I convert it into the datetime format mm/dd/yyyy hh:mm:ss?
Can someone help to get it resolved?
From what I can gather from the question, the Filedate value is just a string representation of a date (mmddyyyyhhnnss). With that in mind, I did a quick test to see if I could parse it into the required format.
There are other ways of approaching this, such as using a RegExp object to build up a list of matches and then using those to build the output.
Note that this example only works if the structured string always has the same order and the same number of characters.
Function ParseTimeStamp(ts)
Dim df(5), bk, ds, i, pos
'Holds the map to show how the string breaks down, each element
'is the length of the given part of the timestamp.
bk = Array(2,2,4,2,2,2)
pos = 1
For i = 0 To UBound(bk)
df(i) = Mid(ts, pos, bk(i))
pos = pos + bk(i)
Next
'Once we have all the parts stitch them back together.
'Use the mm/dd/yyyy hh:nn:ss format
ds = df(0) & "/" & df(1) & "/" & df(2) & " " & df(3) & ":" & df(4) & ":" & df(5)
ParseTimeStamp = ds
End Function
Dim Filedate, parsedDate
Filedate = "10212015140108"
parsedDate = ParseTimeStamp(FileDate)
WScript.Echo parsedDate
Output:
10/21/2015 14:01:08
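The RegExp alternative mentioned above could look roughly like this. It is a sketch under the same assumption as the Mid-based version (exactly 14 digits in MMDDYYYYHHMMSS order); the function name ParseTimeStampRe and the choice to return an empty string for malformed input are illustrative, not part of the original answer.

```vbscript
Option Explicit

' Alternative to the Mid() loop: one anchored pattern with six groups.
Function ParseTimeStampRe(ts)
    Dim re : Set re = New RegExp
    re.Pattern = "^(\d{2})(\d{2})(\d{4})(\d{2})(\d{2})(\d{2})$"
    If re.Test(ts) Then
        ' $1=month $2=day $3=year $4=hour $5=minute $6=second
        ParseTimeStampRe = re.Replace(ts, "$1/$2/$3 $4:$5:$6")
    Else
        ParseTimeStampRe = ""   ' assumption: empty string signals bad input
    End If
End Function

WScript.Echo ParseTimeStampRe("10212015140108")
```

For the sample value this prints the same 10/21/2015 14:01:08 as the loop version. The anchors ^ and $ make the function reject strings of the wrong length instead of silently formatting part of them.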

vbscript - Replace all spaces

I have 6400+ records which I am looping through. For each of these: I check that the address is valid by testing it against something similar to what the Post Office uses (find address). I need to double check that the postcode I have pulled back matches.
The only problem is that the postcode may have been inputted in a number of different formats for example:
OP6 6YH
OP66YH
OP6  6YH
If Replace(strPostcode," ","") = Replace(xmlAddress.selectSingleNode("//postcode").text," ","") Then
I want to remove all spaces from the string. If I do the Replace above, it removes the space for the first example but leaves one for the third.
I know that I can remove these using a loop statement, but believe this will make the script run really slow as it will have to loop through 6400+ records to remove the spaces.
Is there another way?
I didn't realise you had to add -1 to remove all spaces
Replace(strPostcode," ","",1,-1)
Personally I've just done a loop like this:
Dim sLast
Do
sLast = strPostcode
strPostcode = Replace(strPostcode, " ", "")
If sLast = strPostcode Then Exit Do
Loop
However you may want to use a regular expression replace instead:
Dim re : Set re = New RegExp
re.Global = True
re.Pattern = " +" ' Match one or more spaces
WScript.Echo re.Replace("OP6 6YH.", "")
WScript.Echo re.Replace("OP6  6YH.", "")
WScript.Echo re.Replace("O P 6 6 Y H.", "")
Set re = Nothing
The output of the latter is:
D:\Development>cscript replace.vbs
OP66YH.
OP66YH.
OP66YH.
D:\Development>
This is the syntax: Replace(expression, find, replacewith[, start[, count[, compare]]])
count defaults to -1 (replace all occurrences) and start defaults to 1, so a plain Replace(strPostcode, " ", "") should already remove every space. Maybe a corrupt DLL is changing the defaults of the Replace function.
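A quick check of the default behaviour, plus one possible explanation for a "space" that survives: it may not be an ordinary space at all. The sketch below is an assumption about the data, not something the question confirms; the helper name StripSpaces is hypothetical, and it also removes tabs and non-breaking spaces (ChrW(160)).

```vbscript
Option Explicit

' Hypothetical helper: remove ordinary spaces, tabs and non-breaking spaces.
Function StripSpaces(s)
    Dim t
    t = Replace(s, " ", "")         ' count defaults to -1: all occurrences
    t = Replace(t, vbTab, "")
    t = Replace(t, ChrW(160), "")   ' non-breaking space
    StripSpaces = t
End Function

Dim samples, item
samples = Array("OP6 6YH", "OP66YH", "OP6   6YH", "OP6" & ChrW(160) & "6YH")
For Each item In samples
    WScript.Echo StripSpaces(item)
Next
```

All four samples should come out as OP66YH. If the plain Replace call leaves a gap in real data, comparing AscW of the leftover character against 32 would show whether it is a different whitespace character.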
String.Join("", YourString.Split({" "}, StringSplitOptions.RemoveEmptyEntries))
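This works because Split returns all the pieces without spaces, and Join glues them back together with an empty separator. Note that this is VB.NET syntax; VBScript has no String.Join or StringSplitOptions.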

How to remove characters from a string in VBscript

I'm new to vbscript and Stack Overflow, and could really use some help.
Currently I'm trying to format a phone number that is read from an image and stored in a variable. Because the images are "dirty", extra characters find their way in, such as periods or parentheses. I've already restricted the field as much as possible to help prevent picking up extra characters, but alas!
For example, I want to turn ",123.4567890" into "123 456 7890" (not including the double quotes). The trouble is that I won't know what extra characters could potentially be picked up, which rules out a simple replace.
My logic is: remove any non-numeric characters, then, starting from the left, insert a space after the third digit and another after the sixth digit.
Any help would be great, and please feel free to ask for more information if needed.
Welcome to Stack Overflow. You can remove non-digits using Regex, and concatenate the parts using Mid function.
For example:
Dim sTest
sTest = ",123.4567890"
With (New RegExp)
.Global = True
.Pattern = "\D" 'matches all non-digits
sTest = .Replace(sTest, "") 'all non-digits removed
End With
WScript.Echo Mid(sTest, 1, 3) & " " & Mid(sTest, 4, 3) & " " & Mid(sTest, 7, 4)
Or fully using Regex (via a second grouping pattern):
Dim sTest
sTest = ",123.4567890"
With (New RegExp)
.Global = True
.Pattern = "\D" 'matches all non-digits
sTest = .Replace(sTest, "") 'all non-digits removed
.Pattern = "(.{3})(.{3})(.{4})" 'grouping
sTest = .Replace(sTest, "$1 $2 $3") 'formatted
End With
WScript.Echo sTest
Use one RegExp to strip non-digits from the input and a second one, using groups, for the layout:
Function cleanPhoneNumber( sSrc )
Dim reDigit : Set reDigit = New RegExp
reDigit.Global = True
reDigit.Pattern = "\D"
Dim reStruct : Set reStruct = New RegExp
reStruct.Pattern = "(.{3})(.{3})(.+)"
cleanPhoneNumber = reStruct.Replace( reDigit.Replace( sSrc, "" ), "$1 $2 $3" )
End Function ' cleanPhoneNumber
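Since the input comes from noisy image recognition, it may be worth checking how many digits survive the cleanup before formatting. The variant below is a sketch that assumes a valid number has exactly 10 digits, which the question implies but does not state; the name FormatPhoneStrict and the empty-string return for bad input are illustrative choices.

```vbscript
Option Explicit

' Hypothetical stricter variant: format only when exactly 10 digits remain.
Function FormatPhoneStrict(raw)
    Dim re : Set re = New RegExp
    re.Global = True
    re.Pattern = "\D"                      ' any non-digit
    Dim digits : digits = re.Replace(raw, "")
    If Len(digits) <> 10 Then
        FormatPhoneStrict = ""             ' assumption: empty string means "unusable"
        Exit Function
    End If
    FormatPhoneStrict = Mid(digits, 1, 3) & " " & Mid(digits, 4, 3) & " " & Mid(digits, 7, 4)
End Function

WScript.Echo FormatPhoneStrict(",123.4567890")
WScript.Echo "[" & FormatPhoneStrict("12-34") & "]"
```

The first call prints 123 456 7890; the second prints empty brackets because only four digits remain, which lets the caller flag the record for review instead of storing a malformed number.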

batch file renaming in vbscript

I have a group of files that are named as such (word can be any word or numbers):
Word word-word word word Floor B2342 Word Word-word.pdf
Word word-word word Floor: B-2342 Word Word-word.pdf
Word word- Floor C43 Word Word.pdf
Word word word- Floor- E2AD342 Word Word.pdf
I want to rename every file in the folder so that only the group following Floor remains. You can count on Floor always being in the file name, and on the part I want to keep immediately following it. The results should be:
B2342.pdf
B-2342.pdf
C43.pdf
E2AD342.pdf
Pass the path of the folder you want to process as the first argument to this script. You might have to tweak the regular expression for your input.
Set expr = New RegExp
Set fs = CreateObject("Scripting.FileSystemObject")
Set fpath = fs.GetFolder(WScript.Arguments(0))
expr.Pattern = "Floor\S*\s+([^\s.]*)"
For Each fspec In fpath.Files
Set matches = expr.Execute(fspec.Name)
If matches.Count = 0 Then
WScript.StdErr.WriteLine "Invalid file name " & fspec.Name
Else
fspec.Move fspec.ParentFolder & "\" & matches(0).Submatches(0) & ".pdf"
End If
Next
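Before letting the script move real files, the pattern can be checked against the sample names from the question. This dry run is a sketch: it only echoes the name each file would receive and touches nothing on disk. The variable names are illustrative.

```vbscript
Option Explicit

Dim expr : Set expr = New RegExp
expr.Pattern = "Floor\S*\s+([^\s.]*)"

' Sample names copied from the question
Dim names, n, m
names = Array( _
    "Word word-word word word Floor B2342 Word Word-word.pdf", _
    "Word word-word word Floor: B-2342 Word Word-word.pdf", _
    "Word word- Floor C43 Word Word.pdf", _
    "Word word word- Floor- E2AD342 Word Word.pdf")

For Each n In names
    Set m = expr.Execute(n)
    If m.Count > 0 Then
        ' SubMatches(0) is the token right after "Floor" and its punctuation
        WScript.Echo n & " -> " & m(0).SubMatches(0) & ".pdf"
    Else
        WScript.Echo "No match: " & n
    End If
Next
```

For the four samples this should list B2342.pdf, B-2342.pdf, C43.pdf and E2AD342.pdf as the targets. Note that two source files mapping to the same target name would make the real Move call fail, so a dry run like this is also a cheap way to spot collisions.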
