Checking the integrity of an ITF or Code 128 barcode in a final string

There is a piece of software that can either receive barcodes scanned by a barcode scanner app or accept them entered manually.
With EAN everything is pretty clear: it contains a check digit, and using a well-known algorithm I can verify the code's integrity when it is entered manually.
With ITF or Code 128 this does not seem to be the case. The integrity check appears to exist only at the barcode level, and once the final code string has been produced there is no way to verify its integrity. Did I understand that right?

You have three different cases with these barcode types:
With EAN/UPC, the check digit is integral to the symbol (meaning the reader will verify the scan against it), is printed in human-readable form, and is part of the value returned by the reader.
With Code 128, the check digit is integral to the symbol, but it is not normally printed in human-readable form, nor is it typically returned by the reader. When the code is entered manually, the check digit is therefore not part of the data, and there is nothing left in the final string to verify.
With ITF-14, the check digit is not integral to the symbol, so the reader may not verify it. ITF-14 is just a 2-of-5 symbol unless the reader is configured to accept only ITF-14, in which case it should verify the check digit. The check digit is normally printed in human-readable form and is returned by the reader, so when an ITF-14 is entered manually, the check digit is typically part of the value entered.
With ITF-14, a lot of the behavior depends on the configuration of the reader.
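For the EAN and ITF-14 cases, the check can therefore be done on the final string itself. Here is a minimal sketch in Rust of the GS1 mod-10 check-digit validation that EAN-13 and ITF-14 share (the function name and test values are my own, not from the answer above):

// Validate the GS1 mod-10 check digit shared by EAN-13 and ITF-14.
// Weights of 3 and 1 alternate starting from the right; including the
// check digit itself, a valid code sums to a multiple of 10.
fn gs1_check_digit_is_valid(code: &str) -> bool {
    let Some(digits) = code.chars()
        .map(|c| c.to_digit(10))
        .collect::<Option<Vec<u32>>>() else { return false };
    if digits.is_empty() {
        return false;
    }
    let sum: u32 = digits.iter().rev().enumerate()
        .map(|(i, d)| if i % 2 == 1 { d * 3 } else { *d })
        .sum();
    sum % 10 == 0
}

fn main() {
    assert!(gs1_check_digit_is_valid("4006381333931"));  // a known-valid EAN-13
    assert!(!gs1_check_digit_is_valid("4006381333932")); // corrupted check digit
}

For Code 128, by contrast, there is nothing comparable to compute: the symbol check character is consumed by the reader and never appears in the final string.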

Related

Rust println! prints weird characters under certain circumstances

I'm trying to write a short program (short enough that it has a simple main function). First, I list the dependency in the Cargo.toml file:
[dependencies]
passwords = {version = "3.1.3", features = ["crypto"]}
Then when I use the crate in main.rs:
extern crate passwords;
use passwords::hasher;

fn main() {
    let args: Vec<String> = std::env::args().collect();
    if args.len() < 2 {
        println!("Error! Needed second argument to demonstrate BCrypt Hash!");
        return;
    }
    let password = args.get(1).expect("Expected second argument to exist!").trim();
    let hash_res = hasher::bcrypt(10, "This_is_salt", password);
    match hash_res {
        Err(_) => {
            println!("Failed to generate a hash!");
        }
        Ok(hash) => {
            let str_hash = String::from_utf8_lossy(&hash);
            println!("Hash generated from password {} is {}", password, str_hash);
        }
    }
}
The issue arises when I run the following command:
$ target/debug/extern_crate.exe trooper1
And this becomes the output:
?sC�M����k��ed from password trooper1 is ���Ka .+:�
However, this input:
$ target/debug/extern_crate.exe trooper3
produces this:
Hash generated from password trooper3 is ��;��l�ʙ�Y1�>R��G�Ѡd
I'm pretty content with the second output, but is there something within UTF-8 that could cause the "Hash generat" portion of the output statement to be overwritten? And is there code I could use to prevent this?
Note: The code was developed in Visual Studio Code on Windows 10, and was compiled and run using an embedded Git Bash terminal.
P.S.: I looked at similar questions, such as Rust println! problem - weird behavior inside the println macro and Why does my string not match when reading user input from stdin?, but those seem to be newline issues, and I don't think that's the problem here.
To complement the previous answer: the answer to your question, "is there something within UTF-8 that could cause the 'Hash generat' portion of the output statement to be overwritten?", is this line:
let str_hash = String::from_utf8_lossy(&hash);
The reason is in the name: from_utf8_lossy is lossy. UTF-8 is a pretty prescriptive format. You can use this function to "decode" data which isn't actually UTF-8 (for whatever reason), but the way it does this decoding is:
replace any invalid UTF-8 sequences with U+FFFD REPLACEMENT CHARACTER, which looks like this: �
And that is what the odd replacement you get is: byte sequences which cannot be decoded as UTF-8 are replaced by the replacement character.
This happens because hash functions generally return random-looking binary data, meaning bytes across the full range (0 to 255) with no structure. UTF-8 is structured and absolutely does not allow such arbitrary data, so while it is possible for a hash to happen to be valid UTF-8 (not that that would be very useful), the odds are very low.
That's why hashes (and binary data in general) are usually displayed in alternative representations, e.g. hex, base32, or base64.
You could convert the hash to hex before printing it to prevent this.
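A minimal sketch of that suggestion, using only the standard library (the helper name to_hex is mine; the hex crate's hex::encode would do the same at the cost of a dependency):

// Render arbitrary bytes as lowercase hex so the terminal never sees
// raw control bytes.
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

With that helper, the Ok arm from the question becomes:

Ok(hash) => {
    println!("Hash generated from password {} is {}", password, to_hex(&hash));
}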
Neither of the other answers so far has covered what caused the Hash generated part of the output to get overwritten.
Presumably you were running your program in a terminal. Terminals support various "terminal control codes" that give the terminal information such as which formatting they should use to output the text they're showing, and where the text should be output on the screen. These codes are made out of characters, just like strings are, and Unicode and UTF-8 are capable of representing the characters in question – the only difference from "regular" text is that the codes start with a "control character" rather than a more normal sort of character, but control characters have UTF-8 encodings of their own. So if you try to print some randomly generated UTF-8, there's a chance that you'll print something that causes the terminal to do something weird.
There's more than one terminal control code that could produce this particular output, but the most likely possibility is that the hash contained the byte b'\x0D', which UTF-8 decodes as the Unicode character U+000D. This is the terminal control code "CR", which means "print subsequent output at the start of the current line, overwriting anything currently there". (I use this one fairly frequently for printing progress bars, getting the new version of the progress bar to overwrite the old version of the progress bar.) The output that you posted is consistent with accidentally outputting CR, because some random Unicode full of replacement characters ended up overwriting the start of the line you were outputting – and because the code in question is only one byte long (most terminal control codes are much longer), the odds that it might appear in randomly generated UTF-8 are fairly high.
The easiest way to prevent this sort of thing happening when outputting arbitrary UTF-8 in Rust is to use the Debug implementation for str/String rather than the Display implementation – it will output control codes in escaped form rather than outputting them literally. (As the other answers say, though, in the case of hashes, it's usual to print them as hex rather than trying to interpret them as UTF-8, as they're likely to contain many byte sequences that aren't valid UTF-8.)
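As a small sketch of that approach (the string literal is my own stand-in for a decoded hash containing a CR):

fn main() {
    let s = "Hash generat\rED!";
    println!("{}", s);   // Display: the CR returns the cursor to column 0, so "ED!" overwrites "Has"
    println!("{:?}", s); // Debug: prints "Hash generat\rED!" with the CR escaped and visible
}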

GS1-128 barcode with ZPL does not put the AI in ()

I was expecting this command
^FO15,240^BY3,2:1^BCN,100,Y,N,Y,^FD>:>842011118888^FS
to generate a
(420) 11118888
interpretation line; instead it generates
~n42011118888
Does anyone have an idea how to generate the expected output?
If the firmware is up to date, D mode can be used.
^BCo,h,f,g,e,m
^XA
^FO15,240
^BY3,2:1
^BCN,100,Y,N,Y,D
^FD(420)11118888^FS
^XZ
D = UCC/EAN Mode (x.11.x and newer firmware)
This allows dealing with UCC/EAN with and without chained application identifiers. The code starts in the appropriate subset followed by FNC1 to indicate a UCC/EAN-128 bar code. The printer automatically strips out parentheses and spaces for encoding, but prints them in the human-readable section. The printer automatically determines if a check digit is required, calculates it, and prints it. It automatically sizes the human-readable text.
The ^BC command's "interpretation line" feature does not support auto-insertion of the parentheses. (I think it's safe to assume this is partly because it has no way of determining what your data identifier is by just looking at the data provided - it could be 420, could be 4, could be any other portion of the data starting from the first character.)
My recommendation is that you create a separate text field which handles the logic for the parentheses, and place it just above or below the barcode itself. This is the way I've always approached these in the past - I prefer this method because I have direct control over the font, font size, and formatting of the interpretation line.

Decoded barcode extra digits

I am trying to come to terms with how a barcode is decoded and generated by a scanner.
A note from the client says the generated barcode below contains extra characters:
Generated Code: |2389299920014}
Extra Characters: Apparently the first two and last three characters are not part of the bar code.
Question
Are the extra characters attached by the bar code reader (therefore dependent on the scanner) or are they an intrinsic part of the barcode?
Here is a sample image of a barcode:
http://imageshack.us/a/img824/1862/dm6x.jpg
[SOLVED] My apologies. This was just another one of those cases of 'shooting your mouth off' without doing proper research.
Solution: The code is EAN-13. The prefix and suffix are probably scanner-dependent. The 13 digits in between break down as follows, reading from the left: the first 3 digits are the GS1 prefix, the next 9 digits are the company ID plus the item ID, and the last digit is the check digit.
It's hard to answer without understanding what format you are trying to encode, what the intended contents are, and what the purported contents are.
Some formats add extra information as part of the encoding process, but it does not become part of the content. When correctly encoded and decoded, the output should match the input exactly.
A barcode encodes what it encodes; there is no data that is somehow part of the barcode without being encoded in it.
EAN-13 has no scanner-dependent considerations, no. The encoding and decoding of a given number is the same everywhere. EAN-13 encodes 13 digits, so I am not sure what the 13 digits "in between" mean.
You mention GS1, which is something else: a family of barcodes, in fact. You'd have to say what specifically you are using. The GS1 encodings are likewise not ambiguous or scanner-dependent: you know what you want to encode, you encode it exactly, and it is read exactly.

Why does gtk+ say "invalid utf-8" when debugging on eclipse?

I have been creating a gtk+ application in Eclipse. At a point in the code, an alert dialogue is displayed using code similar to the gtk+ hello world. When I run this program, the dialogue ends up displaying the content of 'words' as expected, but the program crashes when I close the dialogue. I am new to C, so I ran the program with debug expecting to find some simple mistake. However, when I ran with debug, the dialogue displayed 'words' preceded by many null characters and logged this message:
Pango-WARNING **: Invalid UTF-8 string passed to pango_layout_set_text()
This new problem is confusing, and to add to the confusion, the program also did not crash when the dialogue was closed.
In summary, when I run the code, the text is fine, and the program crashes. When I debug the code, the text is invalid, and the program does not crash.
The text in the dialogue is generated with the following code:
char* answerBuffer = (char*)malloc(strlen(s)+strlen(words)+1);
strcat(answerBuffer,words);
char* answer = (char*)malloc(strlen(answerBuffer)+1);
g_strlcpy(answer,answerBuffer,strlen(answerBuffer)+1);
return answer;
As the code executes, the length of answerBuffer is 320 and words is a char* argument set to "a,b,c,d". I am running this on Windows XP through Eclipse with the MinGW compiler, using GTK+ 2.24. Can anyone tell me how to debug/fix this?
P.S. 's' contains text from a file followed by either one or twelve null characters (one if I run, twelve if I debug).
Given the code you've supplied, this line is the problem:
strcat(answerBuffer,words);
Why? Because you don't know what is in answerBuffer: malloc() doesn't necessarily zero the memory it returns, so answerBuffer contains essentially random bytes. You need to zero at least the first byte (so the buffer looks like a zero-length string) before calling strcat(), or use calloc() to allocate the buffer, which gives you zeroed memory.
Well, the odds are that the content of 's' isn't a valid UTF-8 sequence.
Look up what UTF-8 is about in case that confuses you, or make sure your text file contains only ASCII characters for simplicity.
If that doesn't help, then you're probably messing up somewhere with the file read or a possible encoding conversion.

PHP - detecting the character set of a user-supplied string

Is it possible to detect the character set of a user-supplied string?
If not, how about the next question...
Are there reliable built-in PHP functions that can accurately tell whether a user-supplied string (be it supplied through GET/POST/cookie, etc.) is in UTF-8 or not? In other words, can I do something like
is_utf8($_GET['first_name'])
Is there any way this function could return TRUE when in reality first_name was not in UTF-8?
Regarding 1:
You can give mb_detect_encoding a try, but it's pretty much a shot in the dark. An "encoded" string is just a bunch of bytes. Such byte sequences are often equally valid in any number of different encodings. It's therefore by definition not possible to detect an unknown encoding reliably, you can only guess. For this reason there exist meta information such as HTTP headers which should communicate the encoding of the transferred content. Check those if available.
Regarding 2:
mb_check_encoding($var, 'UTF-8') will tell you whether the string is a valid UTF-8 string. As far as I've seen, in recent versions of PHP it does what it says on the tin. That still doesn't mean the string really is a UTF-8 string; it just means the byte sequence happens to be valid UTF-8.
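To illustrate why a validity check can still return TRUE for text that was never meant to be UTF-8, here is a small sketch (in Rust rather than PHP, purely as an illustration; the byte values are my own example):

fn main() {
    // The bytes 0xC3 0xA9 are the Latin-1 text "Ã©", and also the valid
    // UTF-8 encoding of "é". A validity check cannot tell which encoding
    // the sender actually meant.
    let bytes = [0xC3u8, 0xA9];
    assert!(std::str::from_utf8(&bytes).is_ok()); // passes, yet may be misread Latin-1
}

This is also why mb_detect_encoding from the first part can only ever guess.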
