String between a line - vbscript

Using some resources I found on Google, I wrote a program to extract the string between two specific strings. However, I cannot print the value or store it in a variable using regular expressions. Here is my code:
prgName = InStr(vText, "Program")
sub_prgName = Mid(vText, prgName, 100)
MsgBox sub_prgName, vbInformation
Dim RegEx: Set RegEx = New RegExp
RegEx.IgnoreCase = True
RegEx.Pattern = "Program(.*)?Variant"
Set RegEx = RegEx.Execute(prgName)
MsgBox RegEx.Value, vbInformation
I want to get the string between Program and Variant. When I try to see the output, it says:
Run time error 438, object doesn't support this property.
This is the value that I want to parse using RegEx:
Program
sqlplus $SOPS_MIPO_USER/$SOPS_MIPO_PASSWORD#$DB_SID #mipo_pruning.sql
Variant

When in doubt, read the documentation. The Execute method returns a Matches collection, so you need to iterate over that collection to get the desired result (in your case the first submatch).
For Each m In RegEx.Execute(prgName)
MsgBox m.SubMatches(0), vbInformation
Next
Demonstration:
>>> s = "Program sqlplus $SOPS_MIPO_USER/$SOPS_MIPO_PASSWORD#$DB_SID #mipo_pruning.sql Variant N/A"
>>> Set re = New RegExp
>>> re.Pattern = "Program(.*)?Variant"
>>> re.IgnoreCase = True
>>> For Each m In re.Execute(s) : WScript.Echo m.SubMatches(0) : Next
sqlplus $SOPS_MIPO_USER/$SOPS_MIPO_PASSWORD#$DB_SID #mipo_pruning.sql
Theoretically you could also do this without the loop:
Set m = RegEx.Execute(prgName)(0)
MsgBox m.SubMatches(0), vbInformation
However, RegEx.Execute(prgName)(0) would raise an error if the regular expression didn't find a match, so evaluating the results in a loop is the safer approach.
I'd remove the ? in your regular expression, though, because it'll make the group optional, so you're not guaranteed to have a SubMatches(0) item. Simply use Program(.*)Variant instead. If there's no text between "Program" and "Variant" you'll get a zero-length string as the first submatch. Or you could put it inside the parentheses right after the asterisk (Program(.*?)Variant) to make the match non-greedy (shortest match instead of longest match).
If your input string contains newlines you need to use [\s\S] instead of ., because the dot in regular expressions matches any character except newlines.
Demonstration:
>>> s = "Program" & vbNewLine & vbNewLine _
& "sqlplus $SOPS_MIPO_USER/$SOPS_MIPO_PASSWORD#$DB_SID #mipo_pruning.sql" _
& vbNewLine & vbNewLine & "Variant"
>>> WScript.Echo s
Program
sqlplus $SOPS_MIPO_USER/$SOPS_MIPO_PASSWORD#$DB_SID #mipo_pruning.sql
Variant
>>> Set re = New RegExp
>>> re.Pattern = "Program(.*)?Variant"
>>> re.IgnoreCase = True
>>> For Each m In re.Execute(s) : WScript.Echo m.SubMatches(0) : Next
>>> re.Pattern = "Program([\s\S]*)?Variant"
>>> For Each m In re.Execute(s) : WScript.Echo m.SubMatches(0) : Next
sqlplus $SOPS_MIPO_USER/$SOPS_MIPO_PASSWORD#$DB_SID #mipo_pruning.sql
As a side note, you shouldn't replace your regular expression object with the result of the Execute method.
Set RegEx = RegEx.Execute(prgName) '<-- NEVER do this!
Re-using variables is a no-no, so don't do it.
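For reference, the points above (a non-greedy group, [\s\S] instead of the dot, and checking that a match exists before reading it) could be combined into one small function. This is only a sketch: the name ExtractProgram is made up for illustration, and the \s* on either side of the group is an extra assumption that you want the surrounding line breaks dropped.

```vbscript
Option Explicit

' Hypothetical helper: returns the text between "Program" and "Variant",
' or an empty string when the pattern does not match.
Function ExtractProgram(ByVal text)
    Dim re, matches
    Set re = New RegExp
    re.IgnoreCase = True
    ' [\s\S] also matches newlines; *? keeps the match as short as possible;
    ' \s* outside the group swallows surrounding whitespace and line breaks
    re.Pattern = "Program\s*([\s\S]*?)\s*Variant"
    Set matches = re.Execute(text)
    If matches.Count > 0 Then
        ExtractProgram = matches(0).SubMatches(0)
    Else
        ExtractProgram = ""
    End If
End Function

Dim sample
sample = "Program" & vbNewLine & vbNewLine _
       & "sqlplus $SOPS_MIPO_USER/$SOPS_MIPO_PASSWORD#$DB_SID #mipo_pruning.sql" _
       & vbNewLine & vbNewLine & "Variant"
WScript.Echo ExtractProgram(sample)
```

Keeping the RegExp object and the Matches collection in separate variables also avoids the variable re-use problem mentioned above.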

.Execute returns a collection of Match objects. This collection has a .Count but no .Value property. You can use MsgBox RegEx(0).Value to get the .Value of the first match.
Evidence:
>> sInp = "XXXProgramYYYVariantZZZ"
>> Set r = New RegExp
>> r.Pattern = "Program(.*?)Variant"
>> WScript.Echo r.Execute(sInp)(0).Value
>> WScript.Echo r.Execute(sInp)(0).SubMatches(0)
>>
ProgramYYYVariant
YYY
You should publish your input(s) and expected result(s).

Related

Remove unnecessary data/Spaces in CSV

I need help removing blank (whitespace-only) values from my data without affecting the spaces inside other values. Here's my sample data:
12345," ","abcde fgh",2017-06-06,09:00,AM," ", US
expected output:
12345,,"abcde fgh",2017-06-06,09:00,AM,, US
since " " should be considered as null.
I tried the Trim() function, but it did not work. I also tried a Regex pattern, but that did not work either.
Here's my sample function.
Private Sub Transform(delimiter As String)
Dim sFullPath As String
Dim strBuff As String
Dim re As RegExp
Dim matches As Object
Dim m As Variant
If delimiter <> "," Then
strBuff = Replace(strBuff, delimiter, ",")
Else
With re
.Pattern = "(?!\B""[^""]*)" & delimiter & "(?![^""]*""\B)"
.IgnoreCase = False
.Global = True
End With
Set matches = re.Execute(strBuff)
For Each m In matches
strBuff = re.Replace(strBuff, ",")
Next
Set re = Nothing
Set matches = Nothing
End If
End Sub
I think you're on the right track. Try using this for your regular expression. The two double quotes in a row are how a single double quote is included in a string literal. Some people prefer to use Chr(34) to include double quotes inside a string.
\B(\s)(?!(?:[^""]*""[^""]*"")*[^""]*$)
Using that expression on your example string
12345," ","abcde fgh",2017-06-06,09:00,AM," ", US
yields
12345,"","abcde fgh",2017-06-06,09:00,AM,"", US
Example function
Private Function Transform(ByVal strLine As String) As String
Dim objRegEx As RegExp
On Error GoTo ErrTransForm
Set objRegEx = New RegExp
With objRegEx
.Pattern = "\B(\s)(?!(?:[^""]*""[^""]*"")*[^""]*$)"
.IgnoreCase = False
.Global = True
Transform = .Replace(strLine, "")
End With
ExitTransForm:
If Not objRegEx Is Nothing Then
Set objRegEx = Nothing
End If
Exit Function
ErrTransForm:
'error handling code here
GoTo ExitTransForm
End Function
And credit where credit is due. I used this answer, Regex Replace Whitespaces between single quotes as the basis for the expression here.
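The expression above leaves an empty pair of quotes ("") in place, while the question asks for the field to be emptied completely. A possible variation is to drop any quoted field that contains nothing but whitespace. The sketch below is written VBScript-style (untyped variables); the function name BlankEmptyFields is hypothetical, and it assumes the fields never contain escaped ("") quotes.

```vbscript
Option Explicit

' Hypothetical helper: empties quoted fields that hold only whitespace.
' Assumption: fields do not contain escaped ("") double quotes.
Function BlankEmptyFields(ByVal line)
    Dim re
    Set re = New RegExp
    re.Global = True
    ' (^|,)    start of the line or a field separator, kept via $1
    ' "\s*"    a quoted field containing nothing but whitespace
    ' (?=,|$)  must be followed by a separator or the end of the line
    re.Pattern = "(^|,)""\s*""(?=,|$)"
    BlankEmptyFields = re.Replace(line, "$1")
End Function

WScript.Echo BlankEmptyFields("12345,"" "",""abcde fgh"",2017-06-06,09:00,AM,"" "", US")
' prints: 12345,,"abcde fgh",2017-06-06,09:00,AM,, US
```

Because the pattern only touches whole fields, the space inside "abcde fgh" and the space before US are left alone.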
I would add an output string variable and have a conditional statement saying if the input is not empty, add it on to the output string. For example (VB console app format with the user being prompted to enter many inputs):
Dim input As String
Dim output As String
Do
input = console.ReadLine()
If Not input = " " Then
output += input
End If
Loop Until (end condition)
Console.WriteLine(output)
You can throw any inputs you don't want into the conditional to remove them from the output.
Your CSV file isn't correctly formatted.
The double quotes shouldn't be there, so open your CSV in Notepad and replace them with an empty string.
After this, you have a real CSV file that you can import without problems.

What do these lines of code do in VBScript?

I am converting them to a Jython script, and as far as I can tell all they do is remove whitespace at the ends:
function test (strField)
Dim re
Set re = New RegExp
re.Pattern = "^\s*"
re.MultiLine = False
strField = re.replace(strField,"")
End Function
It uses the RegExp object in VBScript to check for whitespace \s at the start of the variable passed into the Sub / Function called strField. Once it identifies the whitespace it uses the Replace() method to remove any matched characters from the start of the string.
As #ansgar-wiechers has mentioned in the comments it is just an all whitespace implementation of the LTrim() function.
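To illustrate the difference: LTrim() strips only leading spaces, while ^\s* also strips tabs and line breaks. A small sketch, using a made-up sample string:

```vbscript
Option Explicit

Dim re : Set re = New RegExp
re.Pattern = "^\s*"

' Sample input (invented for this example): a tab, two spaces, then the text
Dim s : s = vbTab & "  text"

WScript.Echo Len(LTrim(s))            ' 7: the leading tab stops LTrim
WScript.Echo Len(re.Replace(s, ""))   ' 4: the regex removes the tab and the spaces
```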
I'm assuming this is meant to be a Function (I haven't tested whether VBScript accepts Fun as shorthand for Function; it's not something I'm familiar with). With that in mind, it should return the modified strField value as the result of the function. I would also recommend using ByVal so that the changes made to strField inside the function don't leak back into the caller's variable.
Function test(ByVal strField)
Dim re
Set re = New RegExp
re.Pattern = "^\s*"
re.MultiLine = False
strField = re.replace(strField,"")
test = strField
End Function
Usage in code:
Dim testin: testin = " some whitespace here"
Dim testout: testout = test(testin)
WScript.Echo """" & testin & """"
WScript.Echo """" & testout & """"
Output:
" some whitespace here"
"some whitespace here"

search Whole word only in VBscript

I am trying to implement a whole-word-only search in VBScript. I tried appending characters such as space, /, ], ) etc. to the search string, since these characters mark the end of a word. That means I need one comparison per delimiter character, combined with the Or operator. Is there an easier way to do this in VBScript?
Currently I am doing:
w_seachString =
searchString & " " or
searchString & "/" or
searchString & "]" or
searchString & ")" or
searchString & "}" or
searchString & "," or
searchString & "."
So I end up comparing against lots of combinations, and I am looking for an effective way to make my variable w_seachString match whole words only.
Use a regular expression with a word boundary anchor. Demo:
Option Explicit
Function qq(s) : qq = """" & s & """" : End Function
Dim r : Set r = New RegExp
r.Pattern = "\bx\b"
Dim s
For Each s In Split("axb| x |ax|x|\x/|", "|")
WScript.Echo qq(s), CStr(r.Test(s))
Next
output:
cscript 36443611.vbs
"axb" False
" x " True
"ax" False
"x" True
"\x/" True
"" False

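For the whole-word search above, the search term usually sits in a variable rather than in the pattern itself. In that case any regex metacharacters in the term need escaping before \b is added on both sides. The sketch below shows one way this might look; EscapeRegex and ContainsWholeWord are hypothetical names, and \b only behaves as expected when the term starts and ends with a letter, digit or underscore.

```vbscript
Option Explicit

' Hypothetical helper: puts a backslash in front of regex metacharacters
' so the string is matched literally.
Function EscapeRegex(ByVal s)
    Dim esc : Set esc = New RegExp
    esc.Global = True
    esc.Pattern = "([\\\^\$\.\|\?\*\+\(\)\[\]\{\}])"
    EscapeRegex = esc.Replace(s, "\$1")
End Function

' Hypothetical helper: True if word occurs in text as a whole word.
' Assumption: word starts and ends with a word character.
Function ContainsWholeWord(ByVal text, ByVal word)
    Dim re : Set re = New RegExp
    re.IgnoreCase = True
    re.Pattern = "\b" & EscapeRegex(word) & "\b"
    ContainsWholeWord = re.Test(text)
End Function

' "item" is delimited by [ and ], so this is a whole-word match
WScript.Echo CStr(ContainsWholeWord("list [item], next", "item"))
' "items" merely contains "item", so this is not a match
WScript.Echo CStr(ContainsWholeWord("list [items], next", "item"))
```
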
Remove parts of a string and copy the rest back to a file with vbscript

I would like to remove the unwanted text from each string in a file.
the input string looks like this
username^time stamp^don't need this printed on printer name more useless info pages printed:some number
I want to remove everything except the username, time stamp, printer name and the number, then write each line to a file so that the output looks like this:
username timestamp printername some number
This is the code I'm working with
Set fs = CreateObject("Scripting.FileSystemObject")
sf = "C:\test.txt"
Set f = fs.OpenTextFile(sf, 1) ''1=for reading
s = f.ReadAll
segments = Split(s,"^",-1)
s= segments(1,)
f.Close
Set f = fs.OpenTextFile(sf, 2) ''2=ForWriting
f.Write s
f.Close
There's always a moment that somebody asks "Why not use a regular expression?". This is that moment.
Try this:
Dim re, s, match, matches
s = "Chuck Norris^12-12-2012^don't need this printed on HAL9000 more useless info pages printed:42 "
Set re = new regexp
re.pattern = "(.*)\^(.*)\^.*printed on (\w+).*pages printed:(\d+).*"
re.Global = True
Set matches = re.Execute(s)
Set match = matches(0)
msgbox "username=" & match.submatches(0)
msgbox "time stamp=" & match.submatches(1)
msgbox "printer=" & match.submatches(2)
msgbox "pages printed=" & match.submatches(3)
Neat huh? And I bet you'll figure out how to implement it in your existing code.
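As a rough idea of how that might be wrapped up for use on every line of the file, here is a sketch of a function that turns one input line into the space-separated output format from the question. The name CleanLine is made up, and the pattern assumes the printer name contains no spaces (it is matched with \w+).

```vbscript
Option Explicit

' Hypothetical helper: reduces one line to "username timestamp printer pages".
' Returns an empty string if the line does not have the expected shape.
Function CleanLine(ByVal line)
    Dim re, matches, m
    Set re = New RegExp
    re.Pattern = "^(.*?)\^(.*?)\^.*printed on (\w+).*pages printed:(\d+)"
    Set matches = re.Execute(line)
    If matches.Count = 0 Then
        CleanLine = ""
    Else
        Set m = matches(0)
        CleanLine = Join(Array(m.SubMatches(0), m.SubMatches(1), _
                               m.SubMatches(2), m.SubMatches(3)), " ")
    End If
End Function

WScript.Echo CleanLine("Chuck Norris^12-12-2012^don't need this printed on HAL9000 more useless info pages printed:42 ")
' prints: Chuck Norris 12-12-2012 HAL9000 42
```

The function could then be called for each line read with ReadLine, writing the non-empty results to a separate output file.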
Code:
Const csSep = "^"
'username^time^(other arbitrary junk)^printer name^(other arbitrary junk)^page count
Dim sJunk : sJunk = "kurt^01:02:03^some junk^nec p7^nix^123"
WScript.Echo sJunk
Dim aParts : aParts = Split(sJunk, csSep)
Dim sNetto : sNetto = Join(Array(aParts(0),aParts(1),aParts(3),aParts(5)), csSep)
WScript.Echo sNetto
output:
kurt^01:02:03^some junk^nec p7^nix^123
kurt^01:02:03^nec p7^123

VBScript Regular Expression to remove all chars but digits and hyphen ("-")

Can someone help me write a RegExp that removes all characters except the numbers and the hyphen (minus sign, "-") between them?
The string looks like:
C:\Documents and Settings\User\Desktop\New Folder 2\New\Folder\1586-10 bougsfiugUYG(jygf) hoihd.xls
it needs to be:
1586-10
only.
The number of digits before -10 is unspecified (it can be 4-6); -10 itself can be any two-digit number.
To make it easier, here is the function I found:
Public Function strClean (strtoclean)
Dim objRegExp, outputStr
Set objRegExp = New Regexp
objRegExp.IgnoreCase = True
objRegExp.Global = True
objRegExp.Pattern = "((?![a-zA-Z0-9]).)+"
outputStr = objRegExp.Replace(strtoclean, "-")
objRegExp.Pattern = "\-+"
outputStr = objRegExp.Replace(outputStr, "-")
strClean = outputStr
End Function
The pattern currently turns file names into this:
C-Documents-and-Settings-Lena-Desktop-New-Folder-2-New-Folder-2-1588-11-sfiuhsgu-(fgRG75476)-skghgsiu.xls
\d?\d?\d\d\d\d-\d\d
\d is a digit
? means zero or one of the preceding character
So \d? means that it can be 0 or 1 digit.
Edit: Added a sample of how to use it after the comments
Dim myRegExp, myMatches
Set myRegExp = New RegExp
myRegExp.Pattern = "\d?\d?\d\d\d\d-\d\d"
Dim test
test = "C:\Documents and Settings\User\Desktop\New Folder 2\New\Folder\1586-10 bougsfiugUYG(jygf) hoihd.xls"
' Search only the file name, i.e. the part after the last backslash
Set myMatches = myRegExp.Execute(Mid(test, InStrRev(test, "\") + 1))
WScript.Echo myMatches(0).Value
Edit2: Code snippet used to call your code
Dim test
test = "C:\Documents and Settings\User\Desktop\New Folder 2\New\Folder\1586-10 bougsfiugUYG(jygf) hoihd.xls"
test = StrClean(test)
WScript.Echo test
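Since the question states that the first number has 4 to 6 digits and the second exactly 2, the same idea can also be written with counted quantifiers and a check for the no-match case. This is a sketch only; the function name ExtractCode is hypothetical, and the \b anchors assume the number is not glued to other letters or digits in the file name.

```vbscript
Option Explicit

' Hypothetical helper: returns the "digits-digits" code from the file name
' part of a path, or an empty string if there is none.
Function ExtractCode(ByVal path)
    Dim re, matches, fileName
    ' Keep only the part after the last backslash
    fileName = Mid(path, InStrRev(path, "\") + 1)
    Set re = New RegExp
    ' 4 to 6 digits, a hyphen, exactly 2 digits, delimited by word boundaries
    re.Pattern = "\b\d{4,6}-\d{2}\b"
    Set matches = re.Execute(fileName)
    If matches.Count > 0 Then
        ExtractCode = matches(0).Value
    Else
        ExtractCode = ""
    End If
End Function

WScript.Echo ExtractCode("C:\Documents and Settings\User\Desktop\New Folder 2\New\Folder\1586-10 bougsfiugUYG(jygf) hoihd.xls")
' prints: 1586-10
```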
