Calculating checksum or XOR operations - ASCII

I'm using HyperTerminal and trying to send strings to a 6-digit scoreboard. The manufacturer sent me a sample string to test with and it worked, but to be able to change the displayed message I was told to calculate a new checksum value.
The sample string is: &AHELLO N-12345\71
Characters A and N are addresses for the scoreboards (allowing two displays to be used through one RS232 connection). HELLO and -12345 are the characters to be shown on the display. The "71" is where I am getting stuck.
How can you obtain 71 from "AHELLO N-12345"?
In the literature supplied with the scoreboard, the "71" from the sample string is described as a character-by-character logical XOR operation on the characters "AHELLO N-12345". The manufacturer, however, called it a checksum. I'm not trained in this kind of terminology, and although I did try to research it I can't put it together on my own.
The text below is copied from the supplied literature and describes the "71" (ckck) in question...
- ckck = 2 ASCII control characters: corresponds to the two hexadecimal digits obtained by
performing the character by character logical XOR operation on characters
"AxxxxxxByyyyyy". If there is an error in these characters, the string is ignored
Example: if the byte by byte logical XOR operation carried out on the ASCII codes of the
characters of the "AxxxxxxByyyyyy" string returns the hexadecimal value 0x2A,
the control characters ckck are "2" and "A".

You don't specify a language, but here's the algorithm in C#. Basically you XOR the character values of the string together and you end up with the value 113, which is 71 in hex. Hence the 71 at the end of the sample string.
string input = "AHELLO N-12345";
UInt16 chk = 0;
foreach (char ch in input) {
    chk ^= ch;
}
MessageBox.Show("value is " + chk);
Outputs "value is 113"
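The same calculation is easy to check in another language; a minimal Python equivalent (the payload string and expected value are taken from the question) is:

```python
from functools import reduce
from operator import xor

payload = "AHELLO N-12345"
# XOR the ASCII codes of all characters together
chk = reduce(xor, (ord(c) for c in payload), 0)
print(chk, format(chk, "02X"))  # 113 71
```

The two hex digits of the result ("7" and "1") are what gets appended to the string as ckck.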


How to iterate over ASCII characters with the LC-3 language?

Please write machine code following this outline:
Initialize index to zero
Iterate over the ASCII characters until the null terminator (zero) is found
Add the index to the current character being processed
Output the result of the previous statement as an ASCII character
Increment the index
Halt
This is a question on my homework, and it has to be done in LC-3 alone: traverse the string until 0 is found. I want to know how to load ASCII characters in the program using only basic instructions such as AND/ADD/LD/LEA/LDI/LDR...
Put a string into memory like so:
STR1 .stringz "Hello"
then you can load the string and set R1 to point to it:
LEA R1, STR1 ;R1 points to STR1
Now R1 holds the address of the "H"; loading through it (e.g. LDR R2, R1, #0) gives you the ASCII code for "H", and R1 + 1 addresses the "e". Then you just need some looping and comparisons until the loaded character is 0.
Remember that .stringz puts a null-terminated string into memory.
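Not LC-3, but as a sanity check of the outline's logic, here is the same loop modeled in Python (LC-3 stops at the null terminator; Python strings carry their own length, so the explicit sentinel isn't needed):

```python
def walk(s):
    # Add the index to each character's ASCII code, per the outline,
    # and "output" the shifted character.
    out = []
    index = 0
    for ch in s:
        out.append(chr(ord(ch) + index))
        index += 1
    return "".join(out)

print(walk("Hello"))  # Hfnos
```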

Are byte slices of UTF-8 also UTF-8?

Given a slice of bytes that is valid UTF-8, is it true that any sub-slice of it is also valid UTF-8?
In other words, given b1: [u8] that is valid UTF-8, can I assume that
b2 = b1[i..j] is valid UTF-8 for any i, j with i < j?
If not, what would be the counter-example?
what would be the counter-example?
Any code point that encodes as more than one byte. For example, π encodes in UTF-8 as the two bytes 0xCF 0x80, and slicing between them produces two (separate) invalid UTF-8 byte strings.
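A quick way to see this (shown in Python for brevity; Rust's str::from_utf8 rejects the same sub-slice):

```python
b1 = "aπb".encode("utf-8")     # b'a\xcf\x80b' - valid UTF-8 as a whole
print(b1)
try:
    b1[0:2].decode("utf-8")    # this slice ends midway through π's two bytes
except UnicodeDecodeError as e:
    print("invalid sub-slice:", e.reason)
```

So sub-slices are only guaranteed valid if both i and j fall on code-point boundaries.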

How can I convert ASCII codes to characters in the Verilog language?

I've been looking into this but searching seems to lead to nothing.
It might be too simple to be described, but here I am, scratching my head...
Any help would be appreciated.
Verilog knows about "strings".
A single ASCII character requires 8 bits. Thus to store 8 characters you need 64 bits:
wire [63:0] string8;
assign string8 = "12345678";
There are some gotchas:
There is no end-of-string character (like the C null character).
The rightmost character is stored in bits 7:0.
Thus string8[7:0] will hold 8'h38 ("8").
To walk through a string you have to use an indexed part-select, e.g.: string8[index +: 8];
As with all Verilog vector assignments: unused bits are set to zero thus
assign string8 = "ABCD"; // MS bit63:32 are zero
You cannot assign a string to a two-dimensional array:
wire [7:0] string5 [0:4]; assign string5 = "Wrong"; // illegal
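The bit layout described above can be modeled in Python: pack the string into an integer the way Verilog lays out the 64-bit vector (leftmost character in the most significant byte), then slice 8 bits at a time the way string8[index +: 8] does:

```python
s = "12345678"
value = int.from_bytes(s.encode("ascii"), "big")  # models wire [63:0] string8

print(hex(value & 0xFF))           # 0x38 - bits 7:0 hold "8"
for index in range(0, 64, 8):      # walk the vector like string8[index +: 8]
    print(chr((value >> index) & 0xFF), end="")
print()                            # walking from bit 0 upward yields 87654321
```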
You are probably misled by a misconception about characters. There is no such thing as a character in hardware; there are only sets of bits, or codes. The only thing that converts binary codes to characters is your terminal: it interprets the codes in a certain way, forming letters for you to see. So all the printfs in C and $display calls in Verilog only send codes to the terminal (or to a file).
The thing that converts characters to codes is your keyboard, which you also use to type in your program. The compiler then interprets the program. The Verilog (as well as the C) compiler represents the double-quoted strings you typed in directly as a set of bytes. Both use ASCII encoding for such character strings, meaning that the code for 'a' is decimal 97, 'b' is 98, and so on. Every character is 8 bits wide, and a quoted string forms a concatenation of the bytes of its ASCII codes.
So, answering your question: you can convert ASCII codes to characters by sending them to the terminal via the $display (or similar) system task, using the %s format specifier.
So, an example:
module A;
  reg [8*5-1:0] hello;
  reg [8*3-1:0] bye;
  initial begin
    hello = "hello";                // 5 bytes of characters
    bye = {8'd98, 8'd121, 8'd101};  // 3 bytes: 'b' 'y' 'e'
    $display("hello=%s bye=%s", hello, bye);
  end
endmodule

New line character in serialized messages

Some protobuf messages, when serialized to a string, contain the new-line character \n. Usually, when the first field of the message is a string, the new-line character is prepended to the message, but we also found messages with a new-line character somewhere in the middle.
The problem with the new-line character arises when you want to save the messages into one file line by line: the new-line character breaks the line and makes the message invalid.
example.proto
syntax = "proto3";
package data_sources;
message StringFirst {
  string key = 1;
  bool valid = 2;
}
message StringSecond {
  bool valid = 1;
  string key = 2;
}
example.py
from protocol_buffers.data_sources.example_pb2 import StringFirst, StringSecond
print(StringFirst(key='some key').SerializeToString())
print(StringSecond(key='some key').SerializeToString())
output
b'\n\x08some key'
b'\x12\x08some key'
Is this expected / desired behaviour? How can one prevent the new line character?
protobuf is a binary protocol (unless you're using the optional JSON representation). So: any time you're treating it as text in any way, you're using it wrong and the behaviour will be undefined. This includes worrying about whether there are CR/LF characters, but it also includes things like the NUL character (0x00), which is often interpreted as end-of-string by text-based APIs in many frameworks (in particular, C strings).
Specifically:
LF (0x0A) is identical to the field header for "field 1, length-prefixed"
CR (0x0D) is identical to the field header for "field 1, fixed 32-bit"
any of 0x00, 0x0A or 0x0D could occur as a length prefix (to signify a length of 0, 10, or 13)
any of 0x00, 0x0A or 0x0D could occur naturally in binary data (bytes)
any of 0x00, 0x0A or 0x0D could occur naturally in any numeric type
0x0A or 0x0D could occur naturally in text data (as could 0x00 if your originating framework allows nul-characters arbitrarily in strings, so... not C-strings)
and probably a range of other things
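The first two coincidences are easy to verify: a protobuf field header (tag) byte is (field_number << 3) | wire_type, so:

```python
def tag(field_number, wire_type):
    # protobuf field header: field number in the high bits, wire type in the low 3
    return (field_number << 3) | wire_type

print(hex(tag(1, 2)))  # 0xa  == LF: field 1, length-prefixed
print(hex(tag(1, 5)))  # 0xd  == CR: field 1, fixed 32-bit
print(hex(tag(2, 2)))  # 0x12 - the first byte of StringSecond's output above
```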
So: again - if the inclusion of "special" text characters is problematic: you're using it wrong.
The most common way to handle binary data as text is to use a base-N encode; base-16 (hex) is convenient to display and read, but base-64 is more efficient in terms of the number of characters required to convey the same number of bytes. So if possible: convert to/from base-64 as required. Base-64 never includes any of the non-printable characters, so you will never encounter CR/LF/nul.
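A minimal sketch of that approach in Python, with a hard-coded byte string standing in for the SerializeToString() output from the question:

```python
import base64

raw = b"\n\x08some key"                  # serialized message starting with 0x0A
line = base64.b64encode(raw).decode("ascii")
print(line)                              # contains no CR, LF or NUL
restored = base64.b64decode(line)        # round-trips exactly
assert restored == raw
```

Each encoded message can then be written as one line of the file and decoded back to the original bytes when read.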

Random byte being added when using string()

I am attempting to XOR two values. The XOR itself gives the right result; however, using string() on that result adds a seemingly random byte!
Can anyone explain this?
Here's a playground: http://play.golang.org/p/tIOOjqo_Fe
So, you have:
z := 175 // 0xaf
That is the unicode code point for the character: ¯
The following line of code will then take the value and treat it as a unicode code point (rune) and turn it into a utf-8 encoded string:
out := string(z)
In UTF-8 encoding, that character is represented by two bytes: []byte{0xc2, 0xaf}
So, the bytes you see are the Go string's utf-8 encoding.
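The same step can be modeled in Python: chr(175) plays the role of Go's string(rune(175)), and encoding it shows the bytes the Go string actually holds:

```python
z = 175
s = chr(z)                 # the code point U+00AF, "¯"
b = s.encode("utf-8")
print(s, b)                # ¯ b'\xc2\xaf'
```

The "extra" 0xc2 byte is simply the first byte of the two-byte UTF-8 sequence, not random data.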
