Why is awk printing Chinese-looking characters - Windows

I've never had anything like this in all the years I've used AWK.
I've tried both gawk and mawk.
I've cut my awk script down to
{ print }
Should just echo every line. But every other line is printed as if it is on a different code page.
Files are created by exporting from Access, thusly:
Dim oApplication
Set oApplication = CreateObject("Access.Application")
oApplication.OpenAccessProject sFileName
For Each myObj In oApplication.CurrentProject.AllForms
WScript.Echo "Exporting FORM " & myObj.FullName
oApplication.SaveAsText acForm, myObj.FullName, sExportpath & "\" & myObj.FullName & ".form"
oApplication.DoCmd.Close acForm, myObj.FullName
dctDelete.Add "FO" & myObj.FullName, acForm
Next
Resulting source files look like
Operation =1
Option =0
Begin InputTables
Name ="Fee Types"
End
Begin OutputColumns
Expression ="[Fee Types].ID"
Expression ="[Fee Types].Type"
Expression ="[Fee Types].Category"
End
Output looks like
Operation =1
਍伀瀀琀椀漀渀 㴀　ഀഀ
Begin InputTables
਍    一愀洀攀 㴀∀䘀攀攀 吀礀瀀攀猀∀ഀഀ
End
਍䈀攀最椀渀 伀甀琀瀀甀琀䌀漀氀甀洀渀猀ഀഀ
Expression ="[Fee Types].ID"
਍    䔀砀瀀爀攀猀猀椀漀渀 㴀∀嬀䘀攀攀 吀礀瀀攀猀崀⸀吀礀瀀攀∀ഀഀ
Expression ="[Fee Types].Category"
਍䔀渀搀ഀഀ
਍
Executed with
gawk.exe -f "FilterBinary.awk" input.txt > output.txt

Hi, I followed your steps and got output identical to the input:
Operation =1
Option =0
Begin InputTables
Name ="Fee Types"
End
Begin OutputColumns
Expression ="[Fee Types].ID"
Expression ="[Fee Types].Type"
Expression ="[Fee Types].Category"
End

This is still the weirdest thing I've ever seen.
Printing the ordinal value of every character showed that there was a literal null between every visible character and a couple with values of 254 and 255.
No idea why that only played havoc with every other line.
But it does explain why none of my matching was working.
Obviously, the solution was to filter the input and only print characters with Ord values of 13, or above 31 and below 128.
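The byte values described above (255 and 254, then a null after every visible character) are consistent with a UTF-16 little-endian file that starts with a byte order mark, although the thread never confirms the encoding. If that is what the export contains, one plausible reason for only every other line being affected is that a tool writing in Windows text mode expands each byte 10 into 13 10, which adds a single byte per line and flips the two-byte alignment on alternate lines. Under that assumption, decoding the data is an alternative to filtering by ordinal value. A minimal Python sketch, using hypothetical sample bytes built in memory rather than the real export:

```python
import codecs

# Hypothetical sample: two lines encoded the way the question describes,
# i.e. a byte order mark (255, 254) followed by UTF-16 little-endian text.
raw = codecs.BOM_UTF16_LE + 'Operation =1\r\nOption =0\r\n'.encode('utf-16-le')


def decode_export(data: bytes) -> str:
    """Decode UTF-16 data that starts with a BOM; the codec consumes the BOM."""
    return data.decode('utf-16')


def filter_ascii(data: bytes) -> str:
    """The workaround from the thread: keep byte 13 and bytes 32..127 only.

    Byte 10 (line feed) is dropped as well, which matches awk consuming it
    as the record separator and print adding it back.
    """
    return ''.join(chr(b) for b in data if b == 13 or 31 < b < 128)


text = decode_export(raw)
print(text.splitlines())
print(filter_ascii(raw).split('\r'))
```

Decoding keeps any non-ASCII characters intact, whereas the ordinal filter silently discards them.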

Related

Clean up lines from log file

I have a file "D:\test.log" that has either one of two styles. This will appear if the user is offline when the user received the message:
[02:19:47] Brother Aimbot (adama900): (Saved Thu Mar 31 05:15:09 2016)This is a test line
It will be like this if the user is online when the user received the message:
[02:19:47] Brother Aimbot (adama900): This is a test line
What I would like this to do is cut out the excess parts so it would look like this if it's either the first or second style:
Brother Aimbot (adama900) This is a test line
then place it into a message box.
Here is my code:
Sub main()
filename = "D:\Test.txt"
Set fso = CreateObject("Scripting.FileSystemObject")
Set f = fso.OpenTextFile(filename)
LNEVAL = f.ReadLine
LNENUM = 0
Do Until f.AtEndOfStream
For i = 1 To LNENUMs
f.ReadLine
Next
If InStr(LNEVAL, "(S") Then
LNEVAL = Left(LNEVAL, (Len("(S")+4))
MsgBox = LNEVAL
End If
Loop
f.Close
End Sub
This is what I have so far.

It's fairly simple to do what you want with a regular expression replacement. Basically what you want to do is remove three things from each line:
a substring between square brackets from the beginning of the string,
the colon separating the name from the message, and
an optional substring between parentheses after that colon.
The regular expression ^\[.*?\] matches an opening square bracket at the beginning of a string and the smallest number of characters up to a closing square bracket.
The regular expression \(Saved.*?\) matches an opening parenthesis followed by the word Saved and the smallest number of characters up to a closing parenthesis. However, since this part is optional you need to indicate that the expression can occur zero or one time by putting it in a non-capturing group and appending the ? modifier to it ((?:...)?).
Put the submatches that you do want to preserve in parentheses to create capturing groups:
^\[.*?\] (.*?): (?:\(Saved.*?\))?(.*)
and replace each matching line with just the captured groups:
Set re = New RegExp
re.Pattern = ...
Set f = fso.OpenTextFile(filename)
Do Until f.AtEndOfStream
MsgBox re.Replace(f.ReadLine, "$1 $2")
Loop
f.Close
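To check the pattern against both log styles from the question, here is a minimal sketch in Python rather than VBScript. It assumes only that Python's re module accepts the same lazy quantifiers and non-capturing group, which it does for this pattern; the clean function name is made up for illustration.

```python
import re

# Same pattern as in the answer above.
PATTERN = re.compile(r'^\[.*?\] (.*?): (?:\(Saved.*?\))?(.*)')


def clean(line: str) -> str:
    """Return 'name message' with the timestamp, colon and Saved note removed."""
    return PATTERN.sub(r'\1 \2', line)


offline = ('[02:19:47] Brother Aimbot (adama900): '
           '(Saved Thu Mar 31 05:15:09 2016)This is a test line')
online = '[02:19:47] Brother Aimbot (adama900): This is a test line'

print(clean(offline))
print(clean(online))
```

Both calls should produce the same cleaned line, because the optional group simply matches nothing when the Saved note is absent.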
Some comments on your existing code:
For i = 1 To LNENUMs: this loop is always skipped over. You set LNENUM to 0, but the loop bound is LNENUMs (with a trailing s), which is never assigned and therefore also evaluates to 0. Since you only do f.ReadLine inside that For loop, your outer Do loop becomes an infinite loop, because you never read the file to the end.
Len("(S")+4 always evaluates to 6, because the length of the string "(S" is not going to change, so you could just replace the expression with the numeric value.
MsgBox = LNEVAL: The MsgBox function doesn't work that way. Remove the = between function name and message.

Write file into txt in VB6

Here is how I write a file to txt:
Dim FileName As String
FileName = "C:\Users\Coda\Desktop\Calendar\file\" & clickDate & ".txt"
Dim Str1 As String, Val1 As Long
Open FileName For Output As #1
Str1 = Text1.Text
MsgBox ("Save")
Write #1, Str1
Close #1
But it automatically adds quotes at the beginning and end, like this:
"Test1
Test2
Test3"
Is there any way I can get rid of these quotes?

The short story - use Print instead of Write.
The Write # statement is used to write records with comma-delimited fields to a sequential file. The Write # statement will automatically enclose string fields in quotes and date fields in pound signs.
The Print # statement is used to write formatted strings of data to a sequential file.
To create records in a fixed-width format, you can use code like the following:
Print #intFooBar, strEmpName; Tab(21); Format$(intDeptNbr, "####"); _
Tab(30); strJobTitle; _
Tab(51); Format$(dtmHireDate, "m/d/yyyy"); _
Tab(61); Format$(Format$(sngHrlyRate, "#0.00"), "#####")

Eliminate hyphen from AppleScript's text item delimiters

I am having problems separating words from each other when it comes to equations, because I can't separate the equation into two parts if there is a negative variable involved.
set function to "-3x"
return word 1 of function
That would return "3x", because a hyphen is a text item delimiter, but I want it to return "-3x". Is there any way to remove the hyphen from the text item delimiters, or any other way to include the hyphen in the string?

To give an idea, here's a simple tokenizer for a very simple Lisp-like language:
-- token types
property StartList : "START"
property EndList : "END"
property ANumber : "NUMBER"
property AWord : "WORD"
-- recognized token chars
property _startlist : "("
property _endlist : ")"
property _number : "+-.1234567890"
property _word : "abcdefghijklmnopqrstuvwxyz"
property _whitespace : space & tab & linefeed & return
to tokenizeCode(theCode)
considering diacriticals, hyphens, punctuation and white space but ignoring case and numeric strings
set i to 1
set l to theCode's length
set tokensList to {}
repeat while i ≤ l
set c to character i of theCode
if c is _startlist then
set end of tokensList to {tokenType:StartList, tokenText:c}
set i to i + 1
else if c is _endlist then
set end of tokensList to {tokenType:EndList, tokenText:c}
set i to i + 1
else if c is in _number then
set tokenText to ""
repeat while i ≤ l and character i of theCode is in _number -- check the bound first so input ending in a number doesn't index past the end
set tokenText to tokenText & character i of theCode
set i to i + 1
end repeat
set end of tokensList to {tokenType:ANumber, tokenText:tokenText}
else if c is in _word then
set tokenText to ""
repeat while i ≤ l and character i of theCode is in _word -- check the bound first so input ending in a word doesn't index past the end
set tokenText to tokenText & character i of theCode
set i to i + 1
end repeat
set end of tokensList to {tokenType:AWord, tokenText:tokenText}
else if c is in _whitespace then -- skip over white space
repeat while i ≤ l and character i of theCode is in _whitespace -- check the bound first so trailing white space doesn't index past the end
set i to i + 1
end repeat
else
error "Unknown character: '" & c & "'"
end if
end repeat
return tokensList
end considering
end tokenizeCode
The syntax rules for this language are as follows:
A number expression contains one or more digits, "+" or "-" signs, and/or decimal point. (The above code currently doesn't check that the token is a valid number, e.g. it'll happily accept nonsensical input like "0.1.2-3+", but that's easy enough to add.)
A word expression contains one or more characters (a-z).
A list expression begins with a "(" and ends with a ")". The first token in a list expression must be the name of the operator to apply; this may be followed by zero or more additional expressions representing its operands.
Any unrecognized characters are treated as an error.
For example, let's use it to tokenize the mathematical expression "3 + (2.5 * -2)", which in prefix notation is written like this:
set programText to "(add 3 (multiply 2.5 -2))"
set programTokens to tokenizeCode(programText)
--> {{tokenType:"START", tokenText:"("},
{tokenType:"WORD", tokenText:"add"},
{tokenType:"NUMBER", tokenText:"3"},
{tokenType:"START", tokenText:"("},
{tokenType:"WORD", tokenText:"multiply"},
{tokenType:"NUMBER", tokenText:"2.5"},
{tokenType:"NUMBER", tokenText:"-2"},
{tokenType:"END", tokenText:")"},
{tokenType:"END", tokenText:")"}}
Once the text is split up into a list of tokens, the next step is to feed that list into a parser which assembles it into an abstract syntax tree which fully describes the structure of the program.
Like I say, there's a bit of a learning curve to this stuff, but you can write it in your sleep once you've grasped the basic principles. Ask and I'll add an example of how to parse these tokens into usable form later.

Following on from before, here's a parser that turns the tokenizer's output into a tree-based data structure that describes the program's logic.
-- token types
property StartList : "START"
property EndList : "END"
property ANumber : "NUMBER"
property AWord : "WORD"
-------
-- handlers called by Parser to construct Abstract Syntax Tree nodes,
-- simplified here for demonstration purposes
to makeOperation(operatorName, operandsList)
return {operatorName:operatorName, operandsList:operandsList}
end makeOperation
to makeWord(wordText)
return wordText
end makeWord
to makeNumber(numberText)
return numberText as number
end makeNumber
-------
-- Parser
to makeParser(programTokens)
script ProgramParser
property currentToken : missing value
to advanceToNextToken()
if programTokens is {} then error "Found unexpected end of program after '" & currentToken & "'."
set currentToken to first item of programTokens
set programTokens to rest of programTokens
return
end advanceToNextToken
--
to parseOperation() -- parses an '(OPERATOR [OPERANDS ...])' list expression
advanceToNextToken()
if currentToken's tokenType is AWord then -- parse 'OPERATOR'
set operatorName to currentToken's tokenText
set operandsList to {}
advanceToNextToken()
repeat while currentToken's tokenType is not EndList -- parse 'OPERAND(S)'
if currentToken's tokenType is StartList then
set end of operandsList to parseOperation()
else if currentToken's tokenType is AWord then
set end of operandsList to makeWord(currentToken's tokenText)
else if currentToken's tokenType is ANumber then
set end of operandsList to makeNumber(currentToken's tokenText)
else
error "Expected word, number, or list expression but found '" & currentToken's tokenText & "' instead."
end if
advanceToNextToken()
end repeat
return makeOperation(operatorName, operandsList)
else
error "Expected operator name but found '" & currentToken's tokenText & "' instead."
end if
end parseOperation
to parseProgram() -- parses the entire program
advanceToNextToken()
if currentToken's tokenType is StartList then
return parseOperation()
else
error "Found unexpected '" & currentToken's tokenText & "' at start of program."
end if
end parseProgram
end script
end makeParser
-------
-- parse the tokens list produced by the tokenizer into an Abstract Syntax Tree
set programTokens to {{tokenType:"START", tokenText:"("}, ¬
{tokenType:"WORD", tokenText:"add"}, ¬
{tokenType:"NUMBER", tokenText:"3"}, ¬
{tokenType:"START", tokenText:"("}, ¬
{tokenType:"WORD", tokenText:"multiply"}, ¬
{tokenType:"NUMBER", tokenText:"2.5"}, ¬
{tokenType:"NUMBER", tokenText:"-2"}, ¬
{tokenType:"END", tokenText:")"}, ¬
{tokenType:"END", tokenText:")"}}
set parserObject to makeParser(programTokens)
set abstractSyntaxTree to parserObject's parseProgram()
--> {operatorName:"add", operandsList:{3, {operatorName:"multiply", operandsList:{2.5, -2}}}}
The ProgramParser object is a very, very simple recursive descent parser, a collection of handlers, each of which knows how to turn a sequence of tokens into a specific data structure. In fact, the Lisp-y syntax used here is so simple it really only requires two handlers: parseProgram, which gets everything underway, and parseOperation, which knows how to read the tokens that make up a (OPERATOR_NAME [OPERAND1 OPERAND2 ...]) list and turn it into a record that describes a single operation (add, multiply, etc) to be performed.
The nice thing about an AST, especially a very simple regular one like this, is you can manipulate it as data in its own right. For instance, given the program (multiply x y) and a definition of y = (add x 1), you could walk the AST and replace any mention of y with its definition, in this case giving (multiply x (add x 1)). In other words, you can do not only arithmetic calculations (algorithmic programming) but algebraic manipulations (symbolic programming) too. That's a bit heady for here, but I'll see about knocking together a simple arithmetical evaluator for later.
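The substitution described above can be sketched in a few lines. This is an illustration in Python, not a translation of the AppleScript parser: it assumes a hypothetical tree shape where an operation is a tuple (operator_name, operand, ...), a word is a string, and a number is an int or float.

```python
# Hypothetical AST shape: an operation is a tuple (operator_name, operand, ...),
# a word is a str, and a number is an int or float.


def substitute(node, definitions):
    """Return a copy of the tree with each defined word replaced by its definition.

    Definitions are inserted as-is; they are not expanded recursively.
    """
    if isinstance(node, tuple):
        operator_name, *operands = node
        # The operator name itself is never substituted, only the operands.
        return (operator_name, *(substitute(o, definitions) for o in operands))
    if isinstance(node, str) and node in definitions:
        return definitions[node]
    return node


program = ('multiply', 'x', 'y')
definitions = {'y': ('add', 'x', 1)}
print(substitute(program, definitions))
```

The original tree is left untouched, so the same program can be rewritten with different definitions.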

To finish off, here's a simple evaluator for the parser's output:
to makeOperation(operatorName, operandsList)
if operatorName is "add" then
script AddOperationNode
to eval(env)
if operandsList's length ≠ 2 then error "Wrong number of operands."
return ((operandsList's item 1)'s eval(env)) + ((operandsList's item 2)'s eval(env))
end eval
end script
else if operatorName is "multiply" then
script MultiplyOperationNode
to eval(env)
if operandsList's length ≠ 2 then error "Wrong number of operands."
return ((operandsList's item 1)'s eval(env)) * ((operandsList's item 2)'s eval(env))
end eval
end script
-- define more operations here as needed...
else
error "Unknown operator: '" & operatorName & "'"
end if
end makeOperation
to makeWord(wordText)
script WordNode
to eval(env)
return env's getValue(wordText)'s eval(env)
end eval
end script
end makeWord
to makeNumber(numberText)
script NumberNode
to eval(env)
return numberText as number
end eval
end script
end makeNumber
to makeEnvironment()
script EnvironmentObject
property _storedValues : {}
--
to setValue(theKey, theValue)
-- theKey : text
-- theValue : script
repeat with aRef in _storedValues
if aRef's k is theKey then
set aRef's v to theValue
return
end if
end repeat
set end of _storedValues to {k:theKey, v:theValue}
return
end setValue
--
to getValue(theKey)
repeat with aRef in _storedValues
if aRef's k is theKey then return aRef's v
end repeat
error "'" & theKey & "' is undefined." number -1728
end getValue
--
end script
end makeEnvironment
to runProgram(programText, theEnvironment)
set programTokens to tokenizeCode(programText)
set abstractSyntaxTree to makeParser(programTokens)'s parseProgram()
return abstractSyntaxTree's eval(theEnvironment)
end runProgram
This replaces the make... handlers used to test the parser with new handlers that construct full-blown objects representing each type of structure that can make up an Abstract Syntax Tree: numbers, words, and operations. Each object defines an eval handler that knows how to evaluate that particular structure: in a NumberNode it simply returns the number, in a WordNode it retrieves and evaluates the structure stored under that name, in an AddOperationNode it evaluates each operand then sums them, and so on.
For example, to evaluate our original 3 + 2.5 * -2 program:
set theEnvironment to makeEnvironment()
runProgram("(add 3 (multiply 2.5 -2))", theEnvironment)
--> -2.0
In addition, an EnvironmentObject is used to store named values. For example, to store a value named "x" for use by a program:
set theEnvironment to makeEnvironment()
theEnvironment's setValue("x", makeNumber(5))
runProgram("(add 3 x)", theEnvironment)
--> 8
Obviously this will need a bit more work to make it into a proper calculator: a full set of operator definitions, better error reporting, and so on. Plus you'll probably want to replace the parenthesized prefix syntax with a more familiar infix syntax, for which you'll need something like a Pratt parser that can handle precedence, association, etc. But once you've got the basics working it's just a matter of reading up on the various techniques and making changes and improvements one by one until you arrive at the desired solution. HTH.
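To give a flavour of the Pratt parsing mentioned above, here is a rough sketch in Python that turns infix text into the same kind of prefix structure. The binding powers, the operator names other than add and multiply, and the tuple-based tree are all assumptions made for this example, and characters the tokenizer does not recognize are silently skipped.

```python
import re

# Hypothetical binding powers: a higher number binds more tightly.
BINDING_POWER = {'+': 10, '-': 10, '*': 20, '/': 20}
PREFIX_POWER = 30  # unary minus binds tighter than any infix operator here
OPERATOR_NAMES = {'+': 'add', '-': 'subtract', '*': 'multiply', '/': 'divide'}


def tokenize(text):
    """Split infix text into number, operator and parenthesis tokens."""
    return re.findall(r'\d+(?:\.\d+)?|[-+*/()]', text)


def parse(tokens):
    """Parse a token list into nested (operator_name, left, right) tuples."""
    position = 0

    def peek():
        return tokens[position] if position < len(tokens) else None

    def advance():
        nonlocal position
        if position >= len(tokens):
            raise SyntaxError('unexpected end of input')
        token = tokens[position]
        position += 1
        return token

    def expression(min_power):
        token = advance()
        if token == '(':
            left = expression(0)
            if advance() != ')':
                raise SyntaxError("expected ')'")
        elif token == '-':
            left = ('negate', expression(PREFIX_POWER))
        elif token[0].isdigit():
            left = float(token)
        else:
            raise SyntaxError(f'unexpected token {token!r}')
        # Keep absorbing infix operators that bind tighter than the caller's.
        while peek() in BINDING_POWER and BINDING_POWER[peek()] > min_power:
            operator = advance()
            right = expression(BINDING_POWER[operator])
            left = (OPERATOR_NAMES[operator], left, right)
        return left

    tree = expression(0)
    if peek() is not None:
        raise SyntaxError(f'unexpected token {peek()!r}')
    return tree


print(parse(tokenize('3 + 2.5 * -2')))
```

Passing the operator's own binding power into the recursive call makes operators of equal precedence associate to the left, so 8 - 3 - 2 groups as (8 - 3) - 2.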

You can write a calculator in AppleScript if you wish to, but you need to do it as you would in any other language: 1. using a tokenizer to split the input text into a list of tokens, 2. feeding those tokens to a parser which assembles them into an abstract syntax tree, and 3. evaluating that tree to produce a result.
For what you're doing, you could probably write your tokenizer as a regular expression (assuming you don't mind dipping down to NSRegularExpression via the AppleScript-ObjC bridge). For parsing, I recommend reading up on Pratt parsers, which are easy to implement yet powerful enough to support prefix, infix, and postfix operators and operator precedence. For evaluation, a simple recursive AST walking algorithm may well be sufficient, but one step at a time.
These are all well-solved problems, so you won't have any trouble finding tutorials and other online information on how to do it. (Lots of crap, of course, so be prepared to spend some time figuring out how to tell the good from the bad.)
Your one problem is that none will be written specifically for AppleScript, so be prepared to spelunk material written around other languages (Python, Java, etc, etc) and translate from that to AS yourself. That'll require some effort and patience wading through all the programmer-speak, but is eminently doable (I originally cut my teeth on AppleScript and now write my own automation scripting languages) and a great learning exercise for developing your skills.
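As an illustration of the regular-expression tokenizer idea, here is a small sketch in Python (standing in for NSRegularExpression, which is not used here). The token rule is an assumption made for this example: an optional minus sign directly followed by digits and letters is kept as one term, so "-3x" stays whole. Telling a subtraction apart from a negative sign in input such as 2x-3y is left to the parser.

```python
import re

# Hypothetical token rule: a term is an optional minus sign, then digits with
# optional letters after them, or letters alone; any other non-space
# character becomes a token of its own.
TOKEN = re.compile(r'-?\d+(?:\.\d+)?[a-z]*|-?[a-z]+|\S', re.IGNORECASE)


def tokenize(text):
    """Return the list of tokens found in an equation string."""
    return TOKEN.findall(text)


print(tokenize('-3x'))
print(tokenize('2x = -3y'))
```

White space is skipped because no alternative in the pattern matches it.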

reading between two values

I have to read the text between two delimiters, after asking the user whether he wants the text between the '' or the text not between them.
For example, if the user selects 1 for the text
'Hi' My name is 'Kev'in and i'm happ'y' to be 'there'
he will have
'Hi' 'Kev' 'y' 'there'
in a text file. If he chooses 2, he will have
My name is in and i'm happ to be
Right now I'm using
Do While objScriptFile.AtEndOfStream <> True
strCurrentLine = objScriptFile.ReadLine
intIsComment = InStr(1,strCurrentLine,"'")
If intIsComment > 0 Then
objCommentFile.Write strCurrentLine & vbCrLf
End If
Loop
Else
For now it reads both kinds of text (between '' and not between), and I have no idea how to change that.

That's pretty simple, provided the delimiter is unique. Split the line at ' and output either the even or the odd array elements, depending on whether 1 or 2 was chosen.
...
strCurrentLine = "'Hi' My name is 'Kev'in and i`m happ'y' to be 'there'"
arr = Split(strCurrentLine, "'")
For i = choice To UBound(arr) Step 2
objCommentFile.Write arr(i)
Next
...
The value of choice is your users' selection (either 1 or 2).
Note that for this to work the strings must not contain apostrophes anywhere else. As @Ekkehard.Horner pointed out in his comment, you can't use the delimiter character elsewhere in the text (i'm), because otherwise it would be impossible to distinguish where it was intended to be a delimiter and where not.
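The even/odd idea is easy to try outside VBScript as well. A minimal sketch in Python, with a made-up split_quoted helper: unlike the loop above, it starts the unquoted pieces at index 0, so any text before the first delimiter is kept too.

```python
def split_quoted(line, delimiter="'"):
    """Return (quoted, unquoted) pieces of a line split at the delimiter."""
    parts = line.split(delimiter)
    quoted = parts[1::2]    # odd indexes lie between a pair of delimiters
    unquoted = parts[0::2]  # even indexes lie outside them
    return quoted, unquoted


line = "'Hi' My name is 'Kev'in and i`m happ'y' to be 'there'"
quoted, unquoted = split_quoted(line)
print(quoted)
print(''.join(unquoted))
```

As with the VBScript version, this only works when the delimiter never appears as an ordinary character in the text.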

vbscript - Replace all spaces

I have 6400+ records which I am looping through. For each of these: I check that the address is valid by testing it against something similar to what the Post Office uses (find address). I need to double check that the postcode I have pulled back matches.
The only problem is that the postcode may have been inputted in a number of different formats for example:
OP6 6YH
OP66YH
OP6 6YH.
If Replace(strPostcode," ","") = Replace(xmlAddress.selectSingleNode("//postcode").text," ","") Then
I want to remove all spaces from the string. If I do the Replace above, it removes the space for the first example but leaves one for the third.
I know that I can remove these using a loop statement, but I believe this will make the script run really slowly, as it will have to loop through 6400+ records to remove the spaces.
Is there another way?

I didn't realise you had to add -1 to remove all spaces
Replace(strPostcode," ","",1,-1)

Personally I've just done a loop like this:
Dim sLast
Do
sLast = strPostcode
strPostcode = Replace(strPostcode, " ", "")
If sLast = strPostcode Then Exit Do
Loop
However you may want to use a regular expression replace instead:
Dim re : Set re = New RegExp
re.Global = True
re.Pattern = " +" ' Match one or more spaces
WScript.Echo re.Replace("OP6 6YH.", "")
WScript.Echo re.Replace("OP6 6YH.", "")
WScript.Echo re.Replace("O P 6 6 Y H.", "")
Set re = Nothing
The output of the latter is:
D:\Development>cscript replace.vbs
OP66YH.
OP66YH.
OP66YH.
D:\Development>
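For comparison, the same global replacement can be sketched in Python. The sample postcodes and the strip_spaces name are made up for this example, with one sample containing a run of several spaces.

```python
import re


def strip_spaces(postcode: str) -> str:
    """Remove every run of spaces, like the global " +" pattern above."""
    return re.sub(r' +', '', postcode)


samples = ['OP6 6YH', 'OP66YH', 'OP6   6YH', 'O P 6 6 Y H']
print([strip_spaces(s) for s in samples])

# A plain (non-regex) replace of every single space gives the same result.
print(strip_spaces('OP6   6YH') == 'OP6   6YH'.replace(' ', ''))
```

Because both approaches remove every space in one pass, neither needs a loop over the string.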

The syntax is Replace(expression, find, replacewith[, start[, count[, compare]]]).
count defaults to -1 (replace all occurrences) and start defaults to 1. Maybe some DLL is corrupt and is changing the defaults of the Replace function.

String.Join("", YourString.Split({" "}, StringSplitOptions.RemoveEmptyEntries))
This works because Split returns all the substrings between the spaces, and Join then concatenates them with the empty separator "".
