Read text file specific string through vbscript - vbscript

I have a text file with multiple lines, one per job, each giving the job name, last run date, last result and so on, in the format below:
Jobname=FC;lastdate=12032015;lastresult=0
I need to write out the jobname and lastresult status with "success" for 0 and "fail" for other cases.

As you iterate through the lines of the file (where readLine is the variable holding the line being read at the time):
jobname = Split(Split(readLine, ";")(0), "=")(1)
' Compare as strings so that a non-numeric result cannot trigger a numeric conversion
If Split(Split(readLine, ";")(2), "=")(1) = "0" Then
    lastresult = "Success"
Else
    lastresult = "Failure"
End If
Something like this should capture your jobname and lastresult. We are just using SPLIT() to split the string by a delimiter ";" and grabbing the token we need (and then splitting that as well).
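For comparison, the same split-on-";" then split-on-"=" idea can be sketched in Python rather than VBScript. This is only an illustration: the function name parse_job_line is made up here, and it assumes every line follows the key=value;key=value layout shown in the question.

```python
def parse_job_line(line):
    """Return (jobname, status) for a line such as
    'Jobname=FC;lastdate=12032015;lastresult=0'."""
    fields = {}
    for part in line.strip().split(";"):
        # partition splits on the first "=" only, so values may contain "="
        key, _, value = part.partition("=")
        fields[key.strip().lower()] = value.strip()
    # "0" means success; anything else (including a missing field) is a failure
    status = "success" if fields.get("lastresult") == "0" else "fail"
    return fields.get("jobname"), status


if __name__ == "__main__":
    for sample in ("Jobname=FC;lastdate=12032015;lastresult=0",
                   "Jobname=Other;lastdate=12032015;lastresult=Else"):
        print(*parse_job_line(sample))
```

Collecting the pairs into a dictionary means the code does not depend on the order of the fields within a line.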

Use a Regexp looking for = followed by a sequence of non-; and an array indexed by the comparison between "=0" and the last part of the input line - as in:
>> Set r = New RegExp
>> r.Global = True
>> r.Pattern = "=[^;]+"
>> a = Split("success fail")
>> s = "Jobname=FC;lastdate=12032015;lastresult=0|Jobname=Other;lastdate=12032015;lastresult=Else"
>> For Each s In Split(s, "|")
>> Set ms = r.Execute(s)
>> WScript.Echo Mid(ms(0).Value,2), a(1 + ("=0" = ms(2)))
>> Next
>>
FC success
Other fail

Related

Read a file's data within a specified range using VB script. Is it possible?

This is the middle of the code I'm trying to work with. Is there a way to make it open the file and read only lines 2 through 97? The places where I need the correction are starred (*****). What I'm trying to do is get the data from lines 2 through 97 and compare it with the same lines of another file that I'll also have to open. The beginning and end of each file differ, but the middle should match, which is why I need these specific lines.
' Build Aliquot file name
strFile = aBarcodeExportDir & "A-" & yearStr & "-" & splitStr2(0) & ".csv"
'msgbox("open file: " & strFile)
If (objFS.FileExists(strFile)) Then
' Open A file
Set objFile = objFS.OpenTextFile(strFile)
' Build string with file name minus extension - used later to determine EOF
strFileNameNoExtension = "A-" & yearStr & "-" & splitStr2(0)
' Create dictionary to hold key/value pairs - key = position; value = barcode
Set dictA = CreateObject("Scripting.Dictionary")
' Begin processing A file
Do Until objFile.AtEndOfStream(*****)
' Read a line
strLine = objFile.ReadLine(*****)
' Split on semi-colons
splitStr = Split(strLine, ";")
' If splitStr array contains more than 1 element then continue
If(UBound(splitStr) > 0) Then
' If barcode field is equal to file name then EOF
If(splitStr(6) = strFileNameNoExtension) Then
' End of file - exit loop
Exit Do
Else
' Add to dictionary
' To calculate position
' A = element(2) = position in row (1-16)
compA = splitStr(2)
' B = element(4) = row
compB = splitStr(4)
' C = element(5.1) = number of max positions in row
splitElement5 = Split(splitStr(5), "/")
compC = splitElement5(0)
' position = C * (B - 1) + A
position = compC * (compB - 1) + compA
barcode = splitStr(6) & ";" & splitStr(0) & ";" & splitStr(1) & ";" & splitStr(2)
'msgbox(position & ":" & barcode)
' Add to dictionary
dictA.Add CStr(position), barcode
End If
End If
Loop
' Close A file
objFile.Close
To give an exact answer, we would need to see your text files (given all the Split calls you are using). But if you just want to compare lines 2-97 of two text files, the following code may give you a hint:
strPath1 = "C:\Users\gr.singh\Desktop\abc\file1.txt" 'Replace with your File1 Path
strPath2 = "C:\Users\gr.singh\Desktop\abc\file2.txt" 'Replace with your File2 Path
Set objFso = CreateObject("Scripting.FileSystemObject")
Set objFile1 = objFso.OpenTextFile(strPath1,1)
Set objFile2 = objFso.OpenTextFile(strPath2,1)
blnMatchFailed = False
Do Until objFile1.AtEndOfStream
    If objFile1.Line=1 Then
        objFile1.SkipLine() 'Skips the 1st line of both the files
        objFile2.SkipLine()
    ElseIf objFile1.Line>=2 And objFile1.Line<=97 Then
        intCurrentLine = objFile1.Line 'Remember the line number before ReadLine advances it
        strFile1 = objFile1.ReadLine()
        strFile2 = objFile2.ReadLine()
        If StrComp(strFile1,strFile2,1)<>0 Then 'textual comparison. Change 1 to 0, if you want binary comparison of both lines
            blnMatchFailed = True
            intFailedLine = intCurrentLine
            Exit Do 'As soon as match fails, exit the Do loop
        Else
            blnMatchFailed = False
        End If
    Else
        Exit Do
    End If
Loop
If blnMatchFailed Then
    MsgBox "Comparison Failed at line "&intFailedLine
Else
    MsgBox "Comparison Passed"
End If
objFile1.Close
objFile2.Close
Set objFile1 = Nothing
Set objFile2 = Nothing
Set objFso = Nothing
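The same "skip the header, compare a fixed window of lines" idea can also be written compactly in Python. This is a minimal sketch, not a port of the VBScript above: the function name first_mismatch and its parameters are hypothetical, and the comparison is case-sensitive, unlike the textual StrComp mode used above.

```python
from itertools import islice, zip_longest


def first_mismatch(path_a, path_b, first=2, last=97):
    """Return the 1-based number of the first line in [first, last] that
    differs between the two files, or None if the whole window matches."""
    with open(path_a) as fa, open(path_b) as fb:
        # islice skips the lines before `first` and stops after `last`
        window_a = islice(fa, first - 1, last)
        window_b = islice(fb, first - 1, last)
        for offset, (a, b) in enumerate(zip_longest(window_a, window_b)):
            # None means one file ran out of lines inside the window
            if a is None or b is None or a.rstrip("\r\n") != b.rstrip("\r\n"):
                return first + offset
    return None
```

Calling first_mismatch("file1.txt", "file2.txt") returns a line number on failure and None when lines 2 through 97 agree; a file that ends inside the window is reported as a mismatch at the first missing line.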

each_line sanitize newlines between quotations marks [duplicate]

I'm processing data from government sources (FEC, state voter databases, etc). It's inconsistently malformed, which breaks my CSV parser in all sorts of delightful ways.
It's externally sourced and authoritative. I must parse it, and I cannot have it re-input, validated on input, or the like. It is what it is; I don't control the input.
Properties:
Fields contain malformed UTF-8 (e.g. Foo \xAB bar)
The first field of a line specifies the record type from a known set. Knowing the record type, you know how many fields there are and their respective data types, but not until you do.
Any given line within a file might use quoted strings ("foo",123,"bar") or unquoted (foo,123,bar). I haven't yet encountered any where it's mixed within a given line (i.e. "foo",123,bar) but it's probably in there.
Strings may include internal newline, quote, and/or comma character(s).
Strings may include comma separated numbers.
Data files can be very large (millions of rows), so this needs to still be reasonably fast.
I'm using Ruby FasterCSV (known as just CSV in 1.9), but the question should be language-agnostic.
My guess is that a solution will require preprocessing substitution with unambiguous record separator / quote characters (eg ASCII RS, STX). I've started a bit here but it doesn't work for everything I get.
How can I process this kind of dirty data robustly?
ETA: Here's a simplified example of what may be in single file:
"this","is",123,"a","normal","line"
"line","with "an" internal","quote"
"short line","with
an
"internal quote", 1 comma and
linebreaks"
un "quot" ed,text,with,1,2,3,numbers
"quoted","number","series","1,2,3"
"invalid \xAB utf-8"
It is possible to subclass Ruby's File to process each line of the CSV file before it is passed to Ruby's CSV parser. For example, here's how I used this trick to replace non-standard backslash-escaped quotes \" with standard doubled double-quotes ""
class MyFile < File
  def gets(*args)
    line = super
    if line != nil
      line.gsub!('\\"','""') # fix the \" that would otherwise cause a parse error
    end
    line
  end
end
infile = MyFile.open(filename)
incsv = CSV.new(infile)
while row = incsv.shift
  # process each row here
end
You could in principle do all sorts of additional processing, e.g. UTF-8 cleanups. The nice thing about this approach is you handle the file on a line by line basis, so you don't need to load it all into memory or create an intermediate file.
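The same line-by-line preprocessing idea, including a UTF-8 cleanup, can be sketched in Python, whose csv.reader accepts any iterator of strings. The names clean_lines and read_rows are hypothetical, and replacing invalid bytes with U+FFFD is an assumption made here for illustration; the data may call for a different policy.

```python
import csv
import io


def clean_lines(binary_stream):
    """Yield one repaired text line per raw line, without loading the file."""
    for raw in binary_stream:
        # Invalid UTF-8 bytes become U+FFFD instead of raising an error
        text = raw.decode("utf-8", errors="replace")
        # Turn backslash-escaped quotes \" into standard doubled quotes ""
        yield text.replace('\\"', '""')


def read_rows(binary_stream):
    """Parse the cleaned lines with the standard csv module."""
    return list(csv.reader(clean_lines(binary_stream)))


if __name__ == "__main__":
    sample = io.BytesIO(b'"a \\"b\\" c",1\n"invalid \xab utf-8",2\n')
    for row in read_rows(sample):
        print(row)
```

Because clean_lines is a generator, only one line is held in memory at a time, which matches the motivation given above for the Ruby version.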
First, here is a rather naive attempt: http://rubular.com/r/gvh3BJaNTc
/"(.*?)"(?=[\r\n,]|$)|([^,"\s].*?)(?=[\r\n,]|$)/m
The assumptions here are:
A field may start with quotes. In which case, it should end with a quote that is either:
before a comma
before a new line (if it is last field on its line)
before the end of the file (if it is last field on the last line)
Or, its first character is not a quote, so it contains characters until the same condition as before is met.
This almost does what you want, but fails on these fields:
1 comma and
linebreaks"
As TC had pointed out in the comments, your text is ambiguous. I'm sure you already know it, but for completeness:
"a" - is that a or "a"? How do you represent a value that you want to be wrapped in quotes?
"1","2" - might be parsed as 1,2, or as 1","2 - both are legal.
,1 \n 2, - End of line, or newline in the value? You cannot tell, especially if this is supposed to be the last value of its line.
1 \n 2 \n 3 - One value with newlines? Two values (1\n2,3 or 1,2\n3)? Three values?
You may be able to get some clues if you examine the first value on each row, which, as you have said, should tell you the number of columns and their types. This can give you the additional information you are missing to parse the file (for example, if you know there should be another field in this line, then all newlines belong in the current value). Even then, though, it looks like there are serious problems here...
I made an app to reformat CSV files, doubling any double-quote characters inside fields and replacing the newlines inside them with a string like '\n'.
Once the data is inside the database, we can convert the '\n' back to newlines.
I needed to do this because the apps I had for processing CSV do not deal correctly with newlines.
Feel free to use and change.
In python:
import sys

def ProcessCSV(filename):
    file1 = open(filename, 'r')
    filename2 = filename + '.out'
    file2 = open(filename2, 'w')
    print('Reformatting {0} to {1}...'.format(filename, filename2))
    line1 = file1.readline()
    while (len(line1) > 0):
        line1 = line1.rstrip('\r\n')
        line2 = ''
        count = 0
        lastField = ( len(line1) == 0 )
        while not lastField:
            # Peel off one field at a time, using "," as the separator
            lastField = (line1.find('","') == -1)
            res = line1.partition('","')
            field = res[0]
            line1 = res[2]
            count = count + 1
            hasStart = False
            hasEnd = False
            if ( count == 1 ) and ( field[:1] == '"' ) :
                field = field[1:]
                hasStart = True
            elif count > 1:
                hasStart = True
            while (True):
                if ( lastField == True ) and ( field[-1:] == '"' ) :
                    field = field[:-1]
                    hasEnd = True
                elif not lastField:
                    hasEnd = True
                if lastField and not hasEnd:
                    # The field continues on the next physical line
                    line1 = file1.readline()
                    if (len(line1) == 0): break
                    line1 = line1.rstrip('\r\n')
                    lastField = (line1.find('","') == -1)
                    res = line1.partition('","')
                    field = field + '\\n' + res[0]
                    line1 = res[2]
                else:
                    break
            field = field.replace('"', '""')
            line2 = line2 + iif(count > 1, ',', '') + iif(hasStart, '"', '') + field + iif(hasEnd, '"', '')
        if len(line2) > 0:
            file2.write(line2)
            file2.write('\n')
        line1 = file1.readline()
    file1.close()
    file2.close()
    print('Done')

def iif(st, v1, v2):
    if st:
        return v1
    else:
        return v2

if len(sys.argv) < 2:
    print('You must specify the input file')
else:
    ProcessCSV(sys.argv[1])
In VB.net:
Module Module1
Sub Main()
Dim FileName As String
FileName = Command()
If FileName.Length = 0 Then
Console.WriteLine("You must specify the input file")
Else
ProcessCSV(FileName)
End If
End Sub
Sub ProcessCSV(ByVal FileName As String)
Dim File1 As Integer, File2 As Integer
Dim Line1 As String, Line2 As String
Dim Field As String, Count As Long
Dim HasStart As Boolean, HasEnd As Boolean
Dim FileName2 As String, LastField As Boolean
On Error GoTo locError
File1 = FreeFile()
FileOpen(File1, FileName, OpenMode.Input, OpenAccess.Read)
FileName2 = FileName & ".out"
File2 = FreeFile()
FileOpen(File2, FileName2, OpenMode.Output)
Console.WriteLine("Reformatting {0} to {1}...", FileName, FileName2)
Do Until EOF(File1)
Line1 = LineInput(File1)
'
Line2 = ""
Count = 0
LastField = (Len(Line1) = 0)
Do Until LastField
LastField = (InStr(Line1, """,""") = 0)
Field = Strip(Line1, """,""")
Count = Count + 1
HasStart = False
HasEnd = False
'
If (Count = 1) And (Left$(Field, 1) = """") Then
Field = Mid$(Field, 2)
HasStart = True
ElseIf Count > 1 Then
HasStart = True
End If
'
locFinal:
If (LastField) And (Right$(Field, 1) = """") Then
Field = Left$(Field, Len(Field) - 1)
HasEnd = True
ElseIf Not LastField Then
HasEnd = True
End If
'
If LastField And Not HasEnd And Not EOF(File1) Then
Line1 = LineInput(File1)
LastField = (InStr(Line1, """,""") = 0)
Field = Field & "\n" & Strip(Line1, """,""")
GoTo locFinal
End If
'
Field = Replace(Field, """", """""")
'
Line2 = Line2 & IIf(Count > 1, ",", "") & IIf(HasStart, """", "") & Field & IIf(HasEnd, """", "")
Loop
'
If Len(Line2) > 0 Then
PrintLine(File2, Line2)
End If
Loop
FileClose(File1, File2)
Console.WriteLine("Done")
Exit Sub
locError:
Console.WriteLine("Error: " & Err.Description)
End Sub
Function Strip(ByRef Text As String, ByRef Separator As String) As String
Dim nPos As Long
nPos = InStr(Text, Separator)
If nPos > 0 Then
Strip = Left$(Text, nPos - 1)
Text = Mid$(Text, nPos + Len(Separator))
Else
Strip = Text
Text = ""
End If
End Function
End Module

This topic is on VBscript used for reading result from a file

I understand that the FSO cannot read a file's lines starting from the end.
My scenario is to validate the last-but-one line and get the result from it.
Say I need to get the result, PASS or FAIL, from the last-but-one line. Since I read from the first line, I may get the wrong result, because PASS or FAIL can also appear earlier in the file.
The last two lines of the file are either
Failed
Done!!!!
OR
Passed
Done!!!!
To get the actual result I am using nested If validation. Below is the snippet:
str1 = "Passed"
str2 = "Failed"
str3="Done!!!!"
Do Until objFile.AtEndOfStream
str=objFile.ReadLine
if StrComp(str, str1) = 0 Then
str=objFile.ReadLine
if StrComp(str,str3) = 0 Then
result="PASS"
End if
elseif StrComp(str, str2) = 0 Then
str = objFile.ReadLine
if StrComp(str,str3) = 0 Then
result="FAIL"
End if
End if
Loop
This hurts performance. Is there a better way to implement this?
Here is a function which takes a file name and returns the second to last line read:
Function PenultimateLine(fname)
    Dim fso, ts, line1, line2
    Set fso = CreateObject("Scripting.FileSystemObject")
    Set ts = fso.OpenTextFile(fname)
    Do Until ts.AtEndOfStream
        ' line1 always trails one line behind line2
        line1 = line2
        line2 = ts.ReadLine
    Loop
    ts.Close
    PenultimateLine = line1
End Function
You can use this function to extract the line and then test it against "Passed" or "Failed" (which, by the way, can be done simply with = rather than StrComp).
A = Split(objfile.readall, vbcrlf)
B = A(ubound(A)-2)
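This reads the whole file into memory, so it is unsuitable for very large files. Note that the index UBound(A)-2 assumes the file ends with a trailing line break; without one, the last-but-one line is A(UBound(A)-1).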
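A middle ground between the two answers, sketched here in Python rather than VBScript, is to stream the file while keeping only the last two lines in a bounded buffer. The names penultimate_line and result_of are hypothetical, and the mapping from "Passed"/"Failed" to "PASS"/"FAIL" is taken from the question.

```python
from collections import deque


def penultimate_line(path):
    """Return the second-to-last line of the file, or None if it has
    fewer than two lines. Only two lines are held in memory at once."""
    with open(path) as handle:
        tail = deque((line.rstrip("\r\n") for line in handle), maxlen=2)
    return tail[0] if len(tail) == 2 else None


def result_of(path):
    """Map the penultimate line to PASS/FAIL, or None if it is neither."""
    return {"Passed": "PASS", "Failed": "FAIL"}.get(penultimate_line(path))
```

Earlier occurrences of "Passed" or "Failed" in the file are ignored because only the final two lines survive in the deque.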

VB Script to extract text from file using delimiters

I tried to extract the required code with a batch file, but that doesn't work, especially for big files. I'm wondering whether this is possible with VBScript.
I need to extract the text between two delimiters in a file and copy it to a TXT file. The text looks like XML, but instead of delimiters such as <string> text... </string>, I have :::SOURCE text .... ::::SOURCE. Note that the first delimiter has three ':' characters and the second has four.
Most importantly, there are multiple lines between these two delimiters.
Example of text:
text&compiled unreadable characters
text&compiled unreadable characters
:::SOURCE
just this code
just this code
...
just this code
::::SOURCE text&compiled unreadable characters
text&compiled unreadable characters
Desired output:
just this code
just this code
...
just this code
Maybe you can try something like this:
filePath = "D:\Temp\test.txt"
Set fso = CreateObject("Scripting.FileSystemObject")
Set f = fso.OpenTextFile(filePath)
startTag = ":::SOURCE"
endTag = "::::SOURCE"
startTagFound = false
endTagFound = false
outputStr = ""
Do Until f.AtEndOfStream
lineStr = f.ReadLine
startTagPosition = InStr(lineStr, startTag)
endTagPosition = InStr(lineStr, endTag)
If (startTagFound) Then
If (endTagPosition >= 1) Then
outputStr = outputStr + Mid(lineStr, 1, endTagPosition - 1)
Exit Do
Else
outputStr = outputStr + lineStr + vbCrlf
End If
ElseIf (startTagPosition >= 1) Then
If (endTagPosition >= 1) Then
outputStr = Mid(lineStr, startTagPosition + Len(startTag), endTagPosition - startTagPosition - Len(startTag)) ' length is the gap between the end of the start tag and the end tag
Exit Do
Else
startTagFound = true
outputStr = Mid(lineStr, startTagPosition + Len(startTag)) + vbCrlf
End If
End If
Loop
WScript.Echo outputStr
f.Close
I've made the assumption that start and end tag can be anywhere inside the file, not only at start of lines. Maybe you can simplify the code if you have more information on the "encoding".
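The position arithmetic is easier to check in a language with string slicing. The following Python sketch works on text that has already been read into a string, so it trades the line-by-line streaming above for simplicity. The function name extract_between is hypothetical, and it assumes the start tag appears before the end tag (the three-colon tag is a substring of the four-colon one, so a stray end tag earlier in the text would be mistaken for a start tag).

```python
def extract_between(text, start_tag=":::SOURCE", end_tag="::::SOURCE"):
    """Return the text between the first start tag and the next end tag,
    or None if either tag is missing."""
    start = text.find(start_tag)
    if start == -1:
        return None
    start += len(start_tag)          # first character after the start tag
    end = text.find(end_tag, start)  # search only after the start tag
    if end == -1:
        return None
    # Drop the line breaks that sit right next to the tags
    return text[start:end].strip("\r\n")


if __name__ == "__main__":
    sample = ("text&compiled\n:::SOURCE\njust this code\njust this code\n"
              "::::SOURCE text&compiled\n")
    print(extract_between(sample))
```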

Script won't split line at "=" Delimiter

The script below works in finding duplicates.
But most of the files i'm reading follow this format:
ServerName(1) = "Example1"
ServerName(2) = "Example1"
ServerName(3) = "Example3"
ServerName(4) = "Example4"
ServerName(5) = "Example5"
The 'cut' variable in the code below is supposed to cut the string at the "=" delimiter and return the value that comes after it.
It should write "Example1" to the duplicates file but instead writes nothing. How can I make the script below read a file and find duplicates only among the values after the "=" delimiter?
Const ForReading = 1
Const ForWriting = 2
Set objFSO = CreateObject("Scripting.FileSystemObject")
FileName = "Test.txt"
PathToSave = "C:"
Path = (PathToSave & FileName)
Set objFile = objFSO.OpenTextFile(Path, ForReading)
Set objOutputFile = objFSO.OpenTextFile(PathToSave & "Noduplicates.txt", 2, True)
Set objOutputFile2 = objFSO.OpenTextFile(PathToSave & "Duplicates.txt", 2, True)
objOutputFile.WriteLine ("This document contains the " & path & " file without duplicates" & vbcrlf)
objOutputFile2.WriteLine ("This document contains the duplicates found. Each line listed below had a duplicate in " & Path & vbcrlf)
Dim DuplicateCount
DuplicateCount = 0
Set Dict = CreateObject("Scripting.Dictionary")
Do until objFile.atEndOfStream
strCurrentLine = LCase(Trim(objFile.ReadLine))
Cut = Split(strCurrentline,"=")
If not Dict.Exists(LCase(Trim(cut(strCurrentLine)))) then
objOutputFile.WriteLine strCurrentLine
Dict.Add strCurrentLine,strCurrentLine
Else Dict.Exists(LCase(Trim(cut(strCurrentLine))))
objOutputFile2.WriteLine strCurrentLine
DuplicateCount = DuplicateCount + 1
End if
Loop
If DuplicateCount > 0 then
wscript.echo ("Number of Duplicates Found: " & DuplicateCount)
Else
wscript.echo "No Duplicates found"
End if
Cut is your array, so Cut(1) is the portion after the =. So that's what you should test for in your dictionary.
If InStr(strCurrentline, "=") > 0 Then
Cut = Split(strCurrentline,"=")
If Not Dict.Exists(Cut(1)) then
objOutputFile.WriteLine strCurrentLine
Dict.Add Cut(1), Cut(1)
Else
objOutputFile2.WriteLine strCurrentLine
DuplicateCount = DuplicateCount + 1
End if
End If
It makes no sense at all to ask Split to return an array with one element by setting the 3rd parameter to 1, as in
Cut = Split(strCurrentline,"=",1)
Evidence:
>> WScript.Echo Join(Split("a=b", "=", 1), "*")
>>
a=b
>> WScript.Echo Join(Split("a=b", "="), "*")
>>
a*b
BTW: ServerName(5) = "Example5" should be split on " = "; further thought about the quotes may be advisable.
Update wrt comments (and downvotes):
The semantics of the count parameter according to the docs:
count
Optional. Number of substrings to be returned; -1 indicates that all substrings are returned. If omitted, all substrings are returned.
Asking for one element (not an UBound!) results in one element containing the input.
Evidence wrt the type mismatch error:
>> cut = Split("a=b", "=", 1)
>> WScript.Echo cut
>>
Error Number: 13
Error Description: Type mismatch
>>
So please think twice.
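Putting the two answers together (key on the part after the "=", and deal with the surrounding spaces and quotes), the duplicate check can be sketched in Python as follows. The function name split_duplicates is hypothetical, and the choice to ignore case and skip lines without "=" is an assumption based on the LCase call and the InStr guard in the code above.

```python
def split_duplicates(lines):
    """Split lines into (unique, duplicates), keyed on the value after '='."""
    seen = set()
    unique, duplicates = [], []
    for line in lines:
        _, sep, value = line.partition("=")
        if not sep:
            continue  # no "=" on this line, nothing to compare
        # Normalise the value: trim spaces, drop the quotes, ignore case
        key = value.strip().strip('"').lower()
        if key in seen:
            duplicates.append(line)
        else:
            seen.add(key)
            unique.append(line)
    return unique, duplicates


if __name__ == "__main__":
    sample = ['ServerName(1) = "Example1"', 'ServerName(2) = "Example1"',
              'ServerName(3) = "Example3"']
    print(split_duplicates(sample)[1])
```

A set is enough here because only membership matters; the VBScript Dictionary in the thread is being used the same way.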
