I need help removing empty (space-only) fields from my data without affecting the spaces inside other fields. Here's my sample data.
12345," ","abcde fgh",2017-06-06,09:00,AM," ", US
expected output:
12345,,"abcde fgh",2017-06-06,09:00,AM,, US
since " " should be considered as null.
I tried the Trim() function, but it did not work. I also tried a Regex pattern, but still no luck.
Here's my sample function.
Private Sub Transform(delimiter As String)
Dim sFullPath As String
Dim strBuff As String
Dim re As RegExp
Dim matches As Object
Dim m As Variant
If delimiter <> "," Then
strBuff = Replace(strBuff, delimiter, ",")
Else
With re
.Pattern = "(?!\B""[^""]*)" & delimiter & "(?![^""]*""\B)"
.IgnoreCase = False
.Global = True
End With
Set matches = re.Execute(strBuff)
For Each m In matches
strBuff = re.Replace(strBuff, ",")
Next
Set re = Nothing
Set matches = Nothing
End If
End Sub
I think you're on the right track. Try using this for your regular expression. The two double quotes in a row are how a single double quote is included in a string literal. Some people prefer to use Chr(34) to include double quotes inside a string.
\B(\s)(?!(?:[^""]*""[^""]*"")*[^""]*$)
Using that expression on your example string
12345," ","abcde fgh",2017-06-06,09:00,AM," ", US
yields
12345,"","abcde fgh",2017-06-06,09:00,AM,"", US
Example function
Private Function Transform(ByVal strLine As String) As String
    Dim objRegEx As RegExp
    On Error GoTo ErrTransForm
    Set objRegEx = New RegExp
    With objRegEx
        .Pattern = "\B(\s)(?!(?:[^""]*""[^""]*"")*[^""]*$)"
        .IgnoreCase = False
        .Global = True
        Transform = .Replace(strLine, "")
    End With
ExitTransForm:
    If Not objRegEx Is Nothing Then
        Set objRegEx = Nothing
    End If
    Exit Function
ErrTransForm:
    'error handling code here
    GoTo ExitTransForm
End Function
And credit where credit is due: I used the answer to "Regex Replace Whitespaces between single quotes" as the basis for the expression here.
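If the goal is the exact output from the question (blank quoted fields dropped entirely, leaving ,, rather than ,"",), one option is to match a whole quoted field that contains only whitespace. This is only a minimal sketch: the name BlankQuotedToEmpty is made up for illustration, it uses late binding through CreateObject so that no project reference is needed, and it assumes one record per string and that a longer quoted value never contains a comma followed by a whitespace-only quoted run.

```vb
' Hypothetical helper: turns quoted fields that hold only whitespace
' into truly empty fields.
Function BlankQuotedToEmpty(strLine)
    Dim re
    Set re = CreateObject("VBScript.RegExp")
    ' (^|,)    start of the line or the preceding comma (kept via $1)
    ' ""\s+""  a quoted field made of whitespace only
    ' (?=,|$)  must be followed by a comma or the end of the line
    re.Pattern = "(^|,)""\s+""(?=,|$)"
    re.Global = True
    BlankQuotedToEmpty = re.Replace(strLine, "$1")
End Function
```

Because the trailing comma is only looked at and not consumed, two blank fields in a row are both replaced. Quoted fields with real content, such as "abcde fgh", do not match and are left untouched.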
I would add an output string variable and have a conditional statement saying if the input is not empty, add it on to the output string. For example (VB console app format with the user being prompted to enter many inputs):
Dim input As String
Dim output As String
Do
input = console.ReadLine()
If Not input = " " Then
output += input
End If
Loop Until (end condition)
Console.WriteLine(output)
You can throw any inputs you don't want into the conditional to remove them from the output.
Your CSV file isn't correctly formatted.
The double quotes shouldn't exist, so open your CSV with Notepad and replace them with an empty string.
After this, you have a real CSV file that you can import without problems.
I am converting this to a Jython script, and I feel that all it does is remove spaces at the ends:
function test (strField)
Dim re
Set re = New RegExp
re.Pattern = "^\s*"
re.MultiLine = False
strField = re.replace(strField,"")
End Function
It uses the RegExp object in VBScript to check for whitespace \s at the start of the variable passed into the Sub / Function called strField. Once it identifies the whitespace it uses the Replace() method to remove any matched characters from the start of the string.
As Ansgar Wiechers mentioned in the comments, it is just an all-whitespace implementation of the LTrim() function.
I'm assuming this is meant to be a Function (I haven't tested it, but maybe VBScript accepts Fun as shorthand for Function; it's not something I'm familiar with personally). With that in mind, it should return the modified strField value as the result of the function. I would also recommend using ByVal so that the modified strField value does not bleed out of the function.
Function test(ByVal strField)
Dim re
Set re = New RegExp
re.Pattern = "^\s*"
re.MultiLine = False
strField = re.replace(strField,"")
test = strField
End Function
Usage in code:
Dim testin: testin = " some whitespace here"
Dim testout: testout = test(testin)
WScript.Echo """" & testin & """"
WScript.Echo """" & testout & """"
Output:
" some whitespace here"
"some whitespace here"
I have a script that reads in a comma delimited text file, however whenever I use Trim(str) on one of the values I have extracted in the file, it won't work...
My Text File:
some string, anotherstring, onelaststring
some string, anotherstring, onelaststring
some string, anotherstring, onelaststring
some string, anotherstring, onelaststring
My Script:
Dim fso, myTxtFile
Set fso = CreateObject("Scripting.FileSystemObject")
Set myTxtFile = fso.OpenTextFile("mytxt.txt")
Dim str, myTxtArr
txtContents = myTxtFile.ReadAll
myTxtFile.close
myTxtArr = Split(txtContents, vbNewLine)
For each line in myTxtArr
tLine = Split(tLine, ",")
Trim(tLine(1))
If tLine(1) = "anotherstring" Then
MsgBox "match"
End If
Next
My script never reaches "match" and I'm not sure why.
Trim() is a function that returns the trimmed string; it does not modify its argument. Your code discards the result. You need to use the returned value:
tLine(1) = Trim(tLine(1))
or use another variable to store the value, and use that separate variable in the comparison,
trimmedStr = Trim(tLine(1))
If trimmedStr = "anotherstring" Then
or you can just use the function return value directly in the comparison,
If Trim(tLine(1)) = "anotherstring" Then
Here's a corrected version of that portion of your code (note that Split must be passed line, not tLine):
For each line in myTxtArr
tLine = Split(line, ",")
tLine(1) = Trim(tLine(1))
If tLine(1) = "anotherstring" Then
MsgBox "match"
End If
Next
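Since every field after the first carries a leading space in this file, it can be simpler to trim all fields right after splitting. A small sketch of that idea follows; SplitAndTrim is a made-up helper name, not a built-in.

```vb
' Hypothetical helper: split a line and trim every resulting field.
Function SplitAndTrim(line, delimiter)
    Dim parts, i
    parts = Split(line, delimiter)
    For i = LBound(parts) To UBound(parts)
        parts(i) = Trim(parts(i))   ' Trim() returns a copy, so assign it back
    Next
    SplitAndTrim = parts
End Function
```

Inside the loop it would be called as tLine = SplitAndTrim(line, ","). Keep in mind that Trim() only strips spaces, so a stray tab or carriage return would still defeat the comparison, and that an empty line yields an empty array, so UBound(tLine) should be checked before reading tLine(1).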
What are the rules that VB6 uses to find the apostrophe that marks the beginning of a commented-out portion of a line (or lines)?
I don't feel confident about my ability to define such rules because of:
apostrophes inside string literals
apostrophes inside nested double quotes in string operations
the fact that double quotes can occur in comments
the fact that a line can be continued over multiple lines
colons breaking up a single line
possible other rules that I'm unaware of
If you really want to know, the VBA language specification is online on MSDN with a BNF grammar. VBA is 99% equivalent to VB6 - certainly the rules about identifying comments must be identical.
And if you are trying to eliminate dead code - just get MZ-Tools! It's free. Use the code review tool. "MZ-Tools can review your source code at project-group, project or file level (through context menus) detecting unused variables, constants, parameters, private procedures, and so on." To eliminate unused subs and functions, use the MZ-Tools feature that lists all callers.
EDIT: There is discussion in the comments about how to eliminate items which have been unnecessarily declared as Public; MZ-Tools does not help with this.
Apostrophes in string literals are ignored. Apostrophes in comments are also ignored (because they are already in a comment). You cannot put an apostrophe in a multi-line statement unless it is on the last line of the statement, in which case it just follows the normal rules.
Why are you so concerned?
This seems to work on all the examples I tried. The only rule it uses is to ignore apostrophes inside string literals, and it doesn't account for the use of the keyword Rem to start a comment.
To my shame I never realised until yesterday that """" (4 double quotes) was a string literal containing a single double quote.
Public Function CommentStripOut(ByVal strLine As String) As String
    Dim InLiteral As Boolean
    Dim strReturn As String
    Dim LenLine As Long
    Dim counter As Long
    Dim s1 As String
    Dim s2 As String
    strReturn = strLine
    LenLine = Len(strLine)
    InLiteral = False
    For counter = 1 To LenLine
        s1 = Mid$(strLine, counter, 1)
        If counter < LenLine Then
            s2 = Mid$(strLine, counter + 1, 1)
        Else
            s2 = ""
        End If
        If s1 = """" Then
            If Not InLiteral Then
                InLiteral = True
            Else
                If s2 = """" Then
                    counter = counter + 1
                    'skip on by 1 because
                    'want to treat escaped
                    'double quote as a single
                    'character
                Else
                    InLiteral = False
                End If
            End If
        Else
            If Not InLiteral Then
                If s1 = "'" Then
                    strReturn = Left$(strLine, counter - 1)
                    Exit For
                End If
            End If
        End If
    Next counter
    CommentStripOut = strReturn
End Function
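The function above does not handle Rem. As a rough sketch of one way to cover the simplest case, a line whose first token is Rem, a check like the following could be run before calling CommentStripOut. The name IsRemLine is hypothetical, and the check deliberately ignores Rem after a colon or after a line label, so it is not a complete rule.

```vb
' Hypothetical helper: True when the line starts with the Rem keyword.
Function IsRemLine(strLine)
    Dim re
    Set re = CreateObject("VBScript.RegExp")
    re.IgnoreCase = True            ' Rem, REM and rem are all accepted
    ' The keyword must be followed by whitespace or the end of the line,
    ' so identifiers such as Remainder are not mistaken for comments.
    re.Pattern = "^\s*Rem(\s|$)"
    IsRemLine = re.Test(strLine)
End Function
```

With this, a line such as Rem old code is reported as a comment, while Remainder = 5 is not.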
I have a string like X5BC8373XXX, where X stands for a special character that is displayed as a square.
I also have some special characters like \n, which I can remove, but I can't remove the squares.
I'd like to know how to remove them.
I found this method:
Dim Test As String
Test = Replace(Mscomm1.Input, Chr(160), Chr(64)) 'Here I remove some of the special characters like \n
Test = Left$(Test, Len(Test) -2)
Test = Right$(Test, Len(Test) -2)
This method DOES remove those special characters, but it also removes my first character, 5.
I realize that this method just removes 2 characters from the left and the right,
but how can I work around this to remove only the special characters?
I also saw something about vbLf and vbCrLf, but I couldn't get it to work.
You can use regular expressions. If you want to remove everything that's not a number or letter, you can use the code below. If there are other characters you want to keep, regular expressions are highly customizable, but can get a little confusing.
This also has the benefit of doing the whole string at once, instead of character by character.
You'll need to reference Microsoft VBScript Regular Expressions 5.5 in your project.
Function AlphaNum(OldString As String)
Dim RE As New RegExp
RE.Pattern = "[^A-Za-z0-9]"
RE.Global = True
AlphaNum = RE.Replace(OldString, "")
End Function
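As an example of the customization mentioned above, here is a sketch that keeps spaces and hyphens in addition to letters and digits. The name AlphaNumPlus and the choice of extra characters are assumptions made for illustration; the object is created with CreateObject, so this variant works without the project reference.

```vb
' Hypothetical variant of AlphaNum: also keeps spaces and hyphens.
Function AlphaNumPlus(OldString)
    Dim re
    Set re = CreateObject("VBScript.RegExp")
    ' Inside [^...] list every character that should survive;
    ' the hyphen is escaped so it is not read as a range.
    re.Pattern = "[^A-Za-z0-9 \-]"
    re.Global = True
    AlphaNumPlus = re.Replace(OldString, "")
End Function
```

Adding another character to the whitelist only means adding it inside the brackets.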
Cleaning out non-printable characters is easy enough. One brute-force but easily customizable method might be:
Private Function Printable(ByVal Text As String) As String
Dim I As Long
Dim Char As String
Dim Count As Long
Printable = Text 'Allocate space, same width as original.
For I = 1 To Len(Text)
Char = Mid$(Text, I, 1)
If Char Like "[ -~]" Then
'Char was in the range " " through "~" so keep it.
Count = Count + 1
Mid$(Printable, Count, 1) = Char
End If
Next
Printable = Left$(Printable, Count)
End Function
Private Sub Test()
Dim S As String
S = vbVerticalTab & "ABC" & vbFormFeed & vbBack
Text1.Text = S 'Shows "boxes" or "?" depending on the font.
Text2.Text = Printable(S)
End Sub
This will replace control characters (below Chr(32)) with spaces:
Function CleanString(strBefore As String) As String
    Dim strAfter As String
    Dim strTest As String
    Dim lngX As Long
    Dim lngLen As Long
    lngLen = Len(strBefore)
    For lngX = 1 To lngLen
        strTest = Mid$(strBefore, lngX, 1)
        'Swap any control character for a space
        If Asc(strTest) < 32 Then
            strTest = " "
        End If
        strAfter = strAfter & strTest
    Next lngX
    CleanString = strAfter
End Function
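If the control characters should be dropped rather than replaced by spaces, the RegExp approach can do it in one pass. A sketch, assuming the unwanted characters all have codes below 32 (the "squares" in the question may be something else, in which case the range needs adjusting); StripControlChars is a made-up name.

```vb
' Hypothetical helper: deletes every character with a code below 32.
Function StripControlChars(strBefore)
    Dim re
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "[\x00-\x1F]"   ' \xNN is a two-digit hexadecimal character code
    re.Global = True
    StripControlChars = re.Replace(strBefore, "")
End Function
```

For instance, StripControlChars(vbVerticalTab & "ABC" & vbFormFeed & vbBack) is expected to leave just the three letters.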
Can someone help me write a RegExp to remove all characters except the numbers and the hyphen (minus sign, "-") between them?
The string looks like:
C:\Documents and Settings\User\Desktop\New Folder 2\New\Folder\1586-10 bougsfiugUYG(jygf) hoihd.xls
It needs to be:
1586-10
only.
The number of digits before -10 is unspecified (it can be 4-6); -10 itself can be any two-digit number.
To make it easier, here is the function I found:
Public Function strClean (strtoclean)
Dim objRegExp, outputStr
Set objRegExp = New Regexp
objRegExp.IgnoreCase = True
objRegExp.Global = True
objRegExp.Pattern = "((?![a-zA-Z0-9]).)+"
outputStr = objRegExp.Replace(strtoclean, "-")
objRegExp.Pattern = "\-+"
outputStr = objRegExp.Replace(outputStr, "-")
strClean = outputStr
End Function
The pattern currently does this to file names:
C-Documents-and-Settings-Lena-Desktop-New-Folder-2-New-Folder-2-1588-11-sfiuhsgu-(fgRG75476)-skghgsiu.xls
\d?\d?\d\d\d\d-\d\d
\d is a digit
? means zero or one of the preceding character
So \d? means that it can be 0 or 1 digit.
Edit: Added a sample of how to use it after the comments
Dim myRegExp, myMatches
Set myRegExp = New RegExp
myRegExp.Pattern = "\d?\d?\d\d\d\d-\d\d"
Dim test
test = "C:\Documents and Settings\User\Desktop\New Folder 2\New\Folder\1586-10 bougsfiugUYG(jygf) hoihd.xls"
' Search only the file name, i.e. everything after the last backslash
Set myMatches = myRegExp.Execute(Mid(test, InStrRev(test, "\") + 1))
WScript.Echo myMatches(0)
Edit2: Code snippet used to call your code
Dim test
test = "C:\Documents and Settings\User\Desktop\New Folder 2\New\Folder\1586-10 bougsfiugUYG(jygf) hoihd.xls"
test = StrClean(test)
WScript.Echo test
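The same pattern can be written more compactly with a counted quantifier, and wrapped so that a missing match does not raise an error. This is a sketch under the question's stated assumption of four to six digits, a hyphen, and two digits; ExtractCode is a hypothetical name.

```vb
' Hypothetical helper: returns the first "dddd-dd" style code in the file name.
Function ExtractCode(fullPath)
    Dim re, matches, fileName
    ' Look only at the part after the last backslash
    ' (the whole string is used when there is no backslash)
    fileName = Mid(fullPath, InStrRev(fullPath, "\") + 1)
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "\d{4,6}-\d{2}"
    Set matches = re.Execute(fileName)
    If matches.Count > 0 Then
        ExtractCode = matches(0).Value
    Else
        ExtractCode = ""   ' no code found in the file name
    End If
End Function
```

For the path in the question this returns 1586-10. Note that the pattern is not anchored, so a longer run of digits would still produce a match on its last six digits.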