I have a text file that contains a single line of text, which is one of the following strings:
--Result=PASS:Passed
--Result=FAIL:Failed
I am trying to fetch the value PASS or FAIL using pattern matching in VBScript, but so far I have only been able to match the string and retrieve the entire line. Below is the code I am using:
Dim oRE, oMatches
Set oRE = New RegExp
oRE.Pattern = "--Result=(PASS|FAIL).*"
Set objFileToRead = CreateObject("Scripting.FileSystemObject").OpenTextFile("C:\tmp\resultfile.txt",1)
Dim strline
Do While Not objFileToRead.AtEndOfStream
    strline = objFileToRead.ReadLine()
    Set oMatches = oRE.Execute(strline)
    For Each oMatch In oMatches
        ' .Value holds the whole matched text, not just the captured group
        result = oMatch.Value
    Next
Loop
objFileToRead.Close
The result that I get now is the entire matching line. Is it possible to fetch just the PASS or FAIL substring from the text file instead of the entire line?
The Match is always the entire string that matched the pattern. What you are looking for are the capture groups, which VBScript exposes through SubMatches.
if oMatches.Count > 0 then
result = oMatches(0).SubMatches(0)
end if
If you use more than one pair of parentheses (capture groups) in the pattern, you can read the additional groups through .SubMatches(1) and so on.
By the way, your pattern does not have to match the entire input string (you don't use the anchors ^ and $ anyway). You could just use (PASS|FAIL) as the pattern, or maybe =(PASS|FAIL):.
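Putting the pieces above together, a minimal sketch could look like this. The function name ExtractResult is hypothetical, and the pattern is the shortened =(PASS|FAIL): variant suggested above:

```vbscript
' Hypothetical helper: returns "PASS" or "FAIL", or "" if the line
' contains no result marker.
Function ExtractResult(sLine)
    Dim re, oMatches
    Set re = New RegExp
    re.Pattern = "=(PASS|FAIL):"
    Set oMatches = re.Execute(sLine)
    If oMatches.Count > 0 Then
        ' SubMatches(0) is the first capture group, not the whole match
        ExtractResult = oMatches(0).SubMatches(0)
    Else
        ExtractResult = ""
    End If
End Function

WScript.Echo ExtractResult("--Result=PASS:Passed")   ' PASS
WScript.Echo ExtractResult("--Result=FAIL:Failed")   ' FAIL
```

Returning an empty string for a non-matching line is just one possible convention; the caller could equally test oMatches.Count itself.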
The RegExp object in VBScript has three methods: Test, Execute and Replace. I can only seem to turn capture groups from Execute into variables.
However, what I want to do is use capture groups from Replace as variables. I can get RegExp.Replace working with no problems using $1, $2, etc. for capture groups, but I want to multiply one of the capture groups.
In an XML file, I want to extract a value, multiply it by 15, and insert it back in. In this example, that is the value of the <desc> tag.
e.g.
strText = "
<rte>
<name>gpx.studio church 2 reduced.gpx</name>
<rtept lat='-33.482652' lon='150.159134'>
<ele>938.4</ele>
<desc>076</desc>
</rtept>
<rtept lat='-33.4825698175265' lon='150.159515440464'>
<ele>942.3</ele>
<desc>162</desc>
</rtept>
<rtept lat='-33.4828785376496' lon='150.159633457661'>
<ele>943.4</ele>
<desc>098</desc>
</rtept>
</rte>
</gpx>"
Dim oRegExp
Set oRegExp = New RegExp
oRegExp.Global=True
oRegExp.Multiline = True
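' Assign the pattern string before handing it to the RegExp object
strPattern = "(<rtept(?:(?:.|\n|\r)*?))<desc>(.*?)<\/desc>((?:(?:.|\n|\r)*?)<\/rtept>)"
oRegExp.Pattern = strPattern
' No backslash in the replacement string: it would be inserted literally
strReplace = "$1<desc>$2</desc>$3"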
' so on this line above, I want to turn the $2 into an integer and multiply it by 15 before putting back into replace.
' I have not done it here because I know it doesnt work as "$2"x1000
strNewText = oRegExp.Replace(strText, strReplace)
I want to turn the $2 into an integer and multiply it by 15 before putting back into replace.
I have tried to get the capture groups as SubMatches(1), which works with the Execute method, but it doesn't seem to work with the Replace method, unless I am missing something...
help appreciated
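One possible approach, sketched under some assumptions: a replacement string such as "$2" is only text substitution, so a computed value has to be built from the Match objects that Execute returns. The sketch below walks the matches and rebuilds the string piece by piece. The function name MultiplyDesc is hypothetical, the simplified pattern assumes that <desc> contains only digits, and leading zeros are lost in the conversion (076 becomes 1140, not 01140).

```vbscript
' Hypothetical helper: multiplies every <desc>digits</desc> value by nFactor.
Function MultiplyDesc(sXml, nFactor)
    Dim re, oMatch, sOut, iPos
    Set re = New RegExp
    re.Global = True
    re.Pattern = "<desc>(\d+)</desc>"
    sOut = ""
    iPos = 1   ' next position in sXml still to be copied (Mid is 1-based)
    For Each oMatch In re.Execute(sXml)
        ' Copy the untouched text before this match (FirstIndex is 0-based)
        sOut = sOut & Mid(sXml, iPos, oMatch.FirstIndex + 1 - iPos)
        ' Append the tag with the multiplied value
        sOut = sOut & "<desc>" & (CLng(oMatch.SubMatches(0)) * nFactor) & "</desc>"
        iPos = oMatch.FirstIndex + oMatch.Length + 1
    Next
    ' Copy whatever follows the last match
    MultiplyDesc = sOut & Mid(sXml, iPos)
End Function

Dim sSample
sSample = "<rtept><ele>938.4</ele><desc>076</desc></rtept>"
WScript.Echo MultiplyDesc(sSample, 15)
' <rtept><ele>938.4</ele><desc>1140</desc></rtept>
```

Because only the <desc> element itself is matched, the surrounding <rtept> content never needs to be captured and re-emitted.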
I am trying to come up with a method for extracting information from a file heading. The overall naming convention of the file heading will remain the same but portions of the heading will vary in character length. Below are two possible examples of such file headings:
012345678-012345-xxxx-yyyyy.txt
012345678-012345-xxx-yyyyyy.txt
Is there a way to extract values from these file headings such that it returns whatever appears between the second and third hyphen? Using the examples above it would return:
xxxx
xxx
Furthermore, is it possible to extract the values between the final hyphen and the period? Using the example above it would return:
yyyyy
yyyyyy
Extracting values is trivial when the character lengths are fixed, but I don't know if it's possible to do a similar extraction when the character lengths vary. I would normally use something like this to extract the information from a fixed-length naming convention, but I don't know how to adapt it to something where the character lengths change. For example, the snippet below is a function which extracts the first three characters of a file heading (in this case it would extract '012').
Function getthething(foo)
getthething = Mid(foo,1,3)
End Function
Any guidance would be very appreciated. Thank you.
You can do all of this using the Split function. Here's a wrapper function that simplifies things:
Function GetField(p_sText, p_sDelimiter, p_iIndex)
Dim arrFields
arrFields = Split(p_sText, p_sDelimiter)
If UBound(arrFields) >= (p_iIndex - 1) Then
GetField = arrFields(p_iIndex - 1)
Else
GetField = ""
End If
End Function
You can use this function like this:
Dim sFileName
Dim sYs
sFileName = GetField("012345678-012345-xxxx-yyyyy.txt", ".", 1)
sYs = GetField(sFileName, "-", 4)
MsgBox sYs
or simply:
MsgBox GetField(GetField("012345678-012345-xxxx-yyyyy.txt", ".", 1), "-", 4)
I have a file "D:\test.log" that has either one of two styles. This will appear if the user is offline when the user received the message:
[02:19:47] Brother Aimbot (adama900): (Saved Thu Mar 31 05:15:09 2016)This is a test line
It will be like this if the user is online when the user received the message:
[02:19:47] Brother Aimbot (adama900): This is a test line
What I would like this to do is cut out the excess parts so it would look like this if it's either the first or second style:
Brother Aimbot (adama900) This is a test line
then place it into a message box.
Here is my code:
Sub main()
filename = "D:\Test.txt"
Set fso = CreateObject("Scripting.FileSystemObject")
Set f = fso.OpenTextFile(filename)
LNEVAL = f.ReadLine
LNENUM = 0
Do Until f.AtEndOfStream
For i = 1 To LNENUMs
f.ReadLine
Next
If InStr(LNEVAL, "(S") Then
LNEVAL = Left(LNEVAL, (Len("(S")+4))
MsgBox = LNEVAL
End If
Loop
f.Close
End Sub
This is what I have so far.
It's fairly simple to do what you want with a regular expression replacement. Basically what you want to do is remove three things from each line:
a substring between square brackets from the beginning of the string,
the colon separating the name from the message, and
an optional substring between parentheses after that colon.
A regular expression ^\[.*?\] matches an opening square bracket at the beginning of a string and the shortest number of characters up to a closing square bracket.
A regular expression \(Saved.*?\) matches an opening parenthesis followed by the word Saved and the shortest number of characters up to a closing parenthesis. However, since this part is optional you need to indicate that the expression can occur zero or one time by putting it in a non-capturing group and appending the ? modifier to it ((?:...)?).
Put the submatches that you do want to preserve in parentheses to create capturing groups
^\[.*?\] (.*?): (?:\(Saved.*?\))?(.*)
and replace each matching line with just the captured groups:
Set re = New RegExp
re.Pattern = ...
Set f = fso.OpenTextFile(filename)
Do Until f.AtEndOfStream
MsgBox re.Replace(f.ReadLine, "$1 $2")
Loop
f.Close
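As a self-contained illustration of the replacement described above, here is a minimal sketch that applies the pattern to the two sample lines from the question. The function name CleanLogLine is hypothetical:

```vbscript
' Hypothetical helper: strips the timestamp, the colon and the optional
' "(Saved ...)" part from a single log line.
Function CleanLogLine(sLine)
    Dim re
    Set re = New RegExp
    re.Pattern = "^\[.*?\] (.*?): (?:\(Saved.*?\))?(.*)"
    ' $1 is the sender, $2 is the message text
    CleanLogLine = re.Replace(sLine, "$1 $2")
End Function

WScript.Echo CleanLogLine("[02:19:47] Brother Aimbot (adama900): (Saved Thu Mar 31 05:15:09 2016)This is a test line")
WScript.Echo CleanLogLine("[02:19:47] Brother Aimbot (adama900): This is a test line")
' Both calls print: Brother Aimbot (adama900) This is a test line
```

A line that does not match the pattern is returned unchanged, because Replace only touches matching text.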
Some comments on your existing code:
For i = 1 To LNENUMs: this loop is always skipped, because LNENUMs is never assigned (you set LNENUM to 0), so the upper bound evaluates to 0. Since f.ReadLine is only called inside that For loop, your outer Do loop becomes an infinite loop: the file is never read to the end.
Len("(S")+4 always evaluates to 6, because the length of the string (S is not going to change, so you could just replace the expression with the numeric value.
MsgBox = LNEVAL: The MsgBox function doesn't work that way. Remove the = between function name and message.
I was googling around but didn't find the right answer; perhaps people here are willing and able to help me.
I'm very new to VBS and WSH, and I would like a solution for this problem:
I'm searching for text strings within a file that has no line breaks (only one line). The text strings I'm looking for always start with the same content, "jpgline", and end with the three letters "qbm". How can we extract each string (the strings are always 64 chars long) matching "jpgline....qbm" into a separate file?
I'm looking for a solution in Visual Basic Script as I use Windows 7.
Thanks in advance
M i k e
Use a regular expression:
Set re = New RegExp
re.Pattern = "^jpgline.*qbm$"
re.IgnoreCase = True
Set fso = CreateObject("Scripting.FileSystemObject")
Set inFile = fso.OpenTextFile("C:\path\to\input.txt")
Set outFile = fso.OpenTextFile("C:\path\to\output.txt", 2, True)
Do Until inFile.AtEndOfStream
line = inFile.ReadLine
If re.Test(line) Then outFile.WriteLine line
Loop
inFile.Close
outFile.Close
As your input file has no line breaks, use .ReadAll() to load its entire content into a string variable. Apply a RegExp to get all parts (Matches) defined by the pattern "jpgline.{N}qbm", where N is either 64 or 64 minus the combined length of the prefix and suffix, depending on whether the 64 characters include "jpgline" and "qbm". Ansgar has shown how to open and write to the output file.
Use the RegExp Docs to learn about .Execute and how to loop over the resulting match collection. The docs will tell you about .Test too.
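A minimal sketch of that approach follows. It assumes that the 64 characters include the prefix and suffix, so the middle part is 64 - 7 - 3 = 54 characters long. The function name ExtractChunks is hypothetical, and writing one chunk per line into a single output file is an assumption about what "a separate file" means. The paths are the placeholders used in the answer above.

```vbscript
' Hypothetical helper: returns every "jpgline...qbm" chunk found in sText,
' one per line. nMiddle is the number of characters between prefix and suffix.
Function ExtractChunks(sText, nMiddle)
    Dim re, oMatch, sOut
    Set re = New RegExp
    re.Global = True        ' find all matches, not just the first
    re.IgnoreCase = True
    re.Pattern = "jpgline.{" & nMiddle & "}qbm"
    sOut = ""
    For Each oMatch In re.Execute(sText)
        sOut = sOut & oMatch.Value & vbCrLf
    Next
    ExtractChunks = sOut
End Function

Dim fso, sText, outFile
Set fso = CreateObject("Scripting.FileSystemObject")
sText = fso.OpenTextFile("C:\path\to\input.txt").ReadAll
Set outFile = fso.OpenTextFile("C:\path\to\output.txt", 2, True)
outFile.Write ExtractChunks(sText, 54)
outFile.Close
```

If the 64 characters turn out not to include the prefix and suffix, pass 64 instead of 54.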
I have 6400+ records which I am looping through. For each of these: I check that the address is valid by testing it against something similar to what the Post Office uses (find address). I need to double check that the postcode I have pulled back matches.
The only problem is that the postcode may have been inputted in a number of different formats for example:
OP6 6YH
OP66YH
OP6  6YH.
If Replace(strPostcode," ","") = Replace(xmlAddress.selectSingleNode("//postcode").text," ","") Then
I want to remove all spaces from the string. If I do the Replace above, it removes the space in the first example but leaves one in the third.
I know that I can remove these using a loop statement, but believe this will make the script run really slow as it will have to loop through 6400+ records to remove the spaces.
Is there another way?
I didn't realise you had to add -1 to remove all spaces
Replace(strPostcode," ","",1,-1)
Personally I've just done a loop like this:
Dim sLast
Do
sLast = strPostcode
strPostcode = Replace(strPostcode, " ", "")
If sLast = strPostcode Then Exit Do
Loop
However you may want to use a regular expression replace instead:
Dim re : Set re = New RegExp
re.Global = True
re.Pattern = " +" ' Match one or more spaces
WScript.Echo re.Replace("OP6 6YH.", "")
WScript.Echo re.Replace("OP6  6YH.", "")
WScript.Echo re.Replace("O P 6 6 Y H.", "")
Set re = Nothing
The output of the latter is:
D:\Development>cscript replace.vbs
OP66YH.
OP66YH.
OP66YH.
D:\Development>
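For the comparison in the question, the regular expression can be wrapped in a small helper and applied to both sides. This is only a sketch: the names StripSpaces and SamePostcode are hypothetical, \s+ is used so that tabs and other whitespace are removed as well, and the case-insensitive comparison is an assumption that the question does not ask for.

```vbscript
' Hypothetical helper: removes all whitespace from sText.
Function StripSpaces(sText)
    Dim re
    Set re = New RegExp
    re.Global = True
    re.Pattern = "\s+"
    StripSpaces = re.Replace(sText, "")
End Function

' Hypothetical helper: compares two postcodes, ignoring whitespace and case.
Function SamePostcode(sA, sB)
    SamePostcode = (StrComp(StripSpaces(sA), StripSpaces(sB), vbTextCompare) = 0)
End Function

If SamePostcode("OP6   6YH", "op66yh") Then
    WScript.Echo "Postcodes match"
End If
```

Running this once per record should not be a concern for 6400 records; each call only processes one short string.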
The syntax is Replace(expression, find, replacewith[, start[, count[, compare]]]).
start defaults to 1 and count defaults to -1 (replace all occurrences), so a plain Replace call should already remove every space. Maybe some DLL is corrupt and is changing the defaults of the Replace function.
String.Join("", YourString.Split({" "}, StringSplitOptions.RemoveEmptyEntries))
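This works because Split returns all the substrings without the spaces, and Join then concatenates them with the empty separator "". Note that this is VB.NET syntax; it is not available in VBScript.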