bwip.js: How to use the Group Separator character with GS1-128


There is a service hosted on metafloor.com for generating barcodes using bwip.js.
I want to generate a barcode for the following data (the GS character is represented by {GS}):
(01)10875066000333(10)1212{GS}(17)121212(30)8{GS}
According to the documentation, I'm able to generate a barcode for the data without the GS character:
https://bwipjs-api.metafloor.com/?bcid=gs1-128&text=(01)10875066000333(10)1212(17)121212(30)8
But the scanner requires GS characters.
The documentation is clear:
Special characters must be encoded in the format ^NNN.
Parsing has to be enabled by using the parsefnc parameter.
The parameter has to be URL-encoded.
So for my string it's:
https://bwipjs-api.metafloor.com/?bcid=gs1-128&text=(01)10875066000333(10)1212%5E029(17)121212(30)8%5E029&parsefnc
But this gives me Error: bwipp.GS1badCSET82character: AI 10: Invalid CSET 82 character.
I also tried:
Sending the GS character directly as %1D
Sending the GS character as %5EGS
Sending the GS character as ^029
Sending the GS character directly
Setting parsefnc=true
Combinations of all of the above
But I still get the same error.
Is there something I'm doing wrong or is the problem on the other side?

For GS1 Application Identifier based data, trust the library to encode the data correctly by selecting the GS1-specific encoder for the symbology (gs1-128 in this case) and then provide the input in bracketed AI notation, i.e. without FNC1 / GS separators.
The encoder will automatically add all of the necessary FNC1 non-data characters (which are transmitted as ASCII GS characters when read by a scanner) and it will also validate the contents of the AI data that you supply.
Users that select a generic symbology and then attempt to perform the AI encoding themselves are prone to making several mistakes:
Omitting the required FNC1 in first position.
Omitting the required FNC1 separators at the end of AIs with no pre-determined width.
Terminating pre-defined length AIs with unnecessary FNC1 characters.
Terminating the message with an unnecessary FNC1 character.
Encoding ASCII GS data characters instead of the canonical FNC1 non-data characters.
Including illegal, literal parentheses to denote the AIs.
Providing improperly formatted or invalid AI values.
Omitting requisite AI attributes.
Including mutually-exclusive AI pairings.
Many of these mistakes will result in failure to decode and interpret the GS1 AI data (even if the barcode appears to read successfully) which may result in charge-backs and necessitate relabelling or disposal.
The data that you are providing falls afoul of at least some of these pitfalls.
See this article for a thorough description of the checks that BWIPP (and hence BWIP-JS) implements to prevent such data quality issues.
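As a minimal sketch of the advice above, the request can be built from the bracketed AI string with the {GS} placeholders simply dropped. The helper name buildGs1Url is hypothetical; the endpoint and the bcid and text parameters are the ones quoted in the question. URLSearchParams takes care of the URL encoding.

```javascript
// Hypothetical helper: builds a request URL for the public bwip-js API
// endpoint quoted in the question.
function buildGs1Url(bcid, bracketedData) {
  // Drop any {GS} placeholders; the GS1-specific encoder adds FNC1 itself.
  const text = bracketedData.split("{GS}").join("");
  const params = new URLSearchParams({ bcid, text });
  return "https://bwipjs-api.metafloor.com/?" + params.toString();
}

const url = buildGs1Url(
  "gs1-128",
  "(01)10875066000333(10)1212{GS}(17)121212(30)8{GS}"
);
console.log(url);
```

No parsefnc parameter is needed in this form, because the input contains no ^NNN or ^FNC1 escapes.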

Related

Find and replace non utf8 character

I have a process that inserts data into PDFs that eventually loads into a system that gets searched based on that inserted data. The inserted data looks something like:
<<
/IBM-ODIndexes
<< /Private
<<
/DOB (05031983)
/FULL_NAME (TEST USER)
/YEAR (2020)
>>
/LastModified(D:20210112201530)
>>
However, there are instances where the data in the FULL_NAME field contains non-UTF-8 characters, and then users are unable to search the data. Specifically, apostrophes come over from Microsoft Word and then get interpreted like this:
/FULL_NAME (JERRY OÃ<83>Â¢Ã¢â<80><9a>Â¬Ã¢â<80><9e>Â¢CONNELL)
In this case I am looking to strip out the apostrophe that is represented as Ã<83>Â¢Ã¢â<80><9a>Â¬Ã¢â<80><9e>Â¢ and replace it with a white space.

There are several complexities here, but in general I would say that the only reliable way to deal with it is to figure out the text encoding of the incoming document and convert it to the target encoding.
Ã<83>Â¢Ã¢â<80><9a>Â¬Ã¢â<80><9e>Â¢ is 34 characters (that is, at least 34 bytes), and no single encoding ever used that much space for a single character. What’s probably happening is multiple levels of encoding, such as HTML entities, base64, UTF-8/16/32 or escape characters like %% to represent % in SQL or \\ to represent \ in Bash. Reversing all these levels of encoding manually is going to involve quite a lot of reading the huge docx standard. The simpler alternative is to use a library which can just convert the entire text into a known character encoding for you, at which point you have to do at most a single conversion into UTF-8.
Another argument for this is that the “apostrophe string” also contains otherwise harmless characters like “a” and “e”. Without at least some understanding of the encodings you’re unlikely to be able to separate encoded characters from non-encoded ones, which would leave the resulting text full of invalid characters.
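To illustrate how stacked encoding layers inflate a single character, here is a small sketch that repeatedly misreads UTF-8 bytes as Windows-1252. It is one plausible reconstruction, assuming the original character was the typographic apostrophe U+2019 and that each layer was exactly this kind of misdecoding; the function name misdecodeOnce is made up for the example, and TextDecoder needs a runtime with Windows-1252 support (browsers, or Node.js with full ICU).

```javascript
// Simulate one round of "UTF-8 bytes misread as Windows-1252".
function misdecodeOnce(text) {
  const utf8Bytes = new TextEncoder().encode(text);
  return new TextDecoder("windows-1252").decode(utf8Bytes);
}

const apostrophe = "\u2019"; // typographic apostrophe, as typed by Word
const once = misdecodeOnce(apostrophe);  // 3 characters
const twice = misdecodeOnce(once);       // 8 characters
const byteCount = new TextEncoder().encode(twice).length;

console.log(once.length, twice.length, byteCount); // 3 8 18
```

Two rounds already turn one character into 18 bytes, and a viewer that prints unprintable bytes as hex escapes such as <83> makes the result look longer still. That is why undoing the damage means reversing each layer in order rather than deleting a fixed string.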

Parsing FNC1 character with bwip-js gs1datamatrix

What I want is to generate a GS1 datamatrix using the bwip-js API with a FNC1 passed in.
I have tried the example provided on their website (Online Barcode API documentation) through Postman and it returns the correct value (i.e. without the FNC1 character in the scanned result).
Their example request (parses FNC1 correctly)
http://bwipjs-api.metafloor.com/?bcid=code128&text=%5EFNC1011234567890&parsefnc&alttext=%2801%291234567890
However when I use my example for the GS1 data matrix, with the FNC1 value, I get the FNC1 in the scanned result. So it is not parsing the FNC1 value correctly.
My request (does not parse FNC1 correctly):
http://bwipjs-api.metafloor.com/?bcid=gs1datamatrix&text=%5EFNC1(01)03453120000011(17)120508(10)ABCD1234(410)9501101020917&parsefnc&alttext=%2801%291234567890
I have read all the documentation and articles I can find about their generator and the FNC1 character, but they didn't give me any clues.
Am I doing anything wrong here?
UPDATE:
The input to BWIP-JS:
(01)99312650999998(91)111JD507496002000960300(420)2164(8008)181102113732
Image generated:

The code in bwip-js is PostScript and I'm no expert in that language. But try taking the 'FNC1' out of your request and see if that works.
I think it's trying to automatically add FNC1 to any GS1 Datamatrix (see the section starting at line 23903) when it sees an AI, whereas for Data Matrix it has to be explicitly requested.

The FNC1 separator arrives in the scanned data as the GS character (ASCII 29), which is invisible in the console, so it can be tricky to see, but I've managed to parse it out of raw strings using the following:
var decoded = decodedString.split(decodeURI("%1D"));
If you're getting the FNC codes in parentheses, you could probably use a REGEX to remove them.

What is the actual HEX / binary value of the GS1 FNC1 character?

I have searched many a page on Wikipedia and the official GS1 specifications, but have yet to find a definite answer to the question:
What is the actual HEX / binary value of the GS1 FNC1 character?
There is much information about how to use the GS1 identifiers, how to print the barcodes with ZPL and how to encode the FNC1, but I want to know the actual HEX value of that character.

The special function characters such as FNC1 through FNC4 belong to the class of "non-data characters" that can be encoded within various barcode symbologies but which do not have any direct ASCII representation in the decoded data stream. Each symbology that supports such characters has a different scheme for encoding them in its internal representation, quite distinct from any byte-orientated character data.
The FNC characters serve both as flag characters (indicating something special to the reader) and as formatting characters (modifying the meaning of the encoded data). As such they are not intended to be transmitted directly in the data received by the host system from a basic barcode reader, although in both cases they may have an "effect" on the transmitted message.
The usual purposes of the FNC characters are as follows:
FNC1 - Structured Data flag character indicating GS1 and AIM formatting AND group separator formatting character, amongst other uses.
FNC2 - Message Append flag character for buffering the data in groups of symbols for a single read.
FNC3 - Reader Programming flag character for device configuration purposes.
FNC4 - Extended ASCII formatting character for encoding characters with ordinals 128-255.
Be aware that they may not all be available in certain barcode symbologies and may even be specified in different, non-typical or overloaded ways.
Encoding an FNC character in a symbol's internal data is accomplished via an "escape mechanism" that is specific to the encoding software. Each library has a different way of accepting these non-data characters within their input. For example, to use FNC1 in its typical GS1 structured data role for the data "(01)00312345678906(21)123456789012(30)0144" you might see the FNC1 characters escaped as {FNC1} so that the input looks like {FNC1}010031234567890621123456789012{FNC1}300144.
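The placement of the FNC1 characters in that example can be sketched as follows. This is an illustration only, not how any particular library implements it: the helper name toEscapedFnc1 and the {FNC1} escape are taken from the example above, the list of pre-defined length AI prefixes is transcribed from the GS1 General Specifications and should be checked against the current edition, and values containing a literal "(" are not handled.

```javascript
// First two digits of AIs with a pre-defined length; no separator follows them.
// Assumption: transcribed from the GS1 General Specifications table.
const FIXED_LENGTH_PREFIXES = new Set([
  "00", "01", "02", "03", "04", "11", "12", "13", "14", "15", "16",
  "17", "18", "19", "20", "31", "32", "33", "34", "35", "36", "41",
]);

// Hypothetical helper: "(01)...(21)..." -> "{FNC1}01...21...{FNC1}..."
function toEscapedFnc1(bracketed) {
  const elements = [...bracketed.matchAll(/\((\d{2,4})\)([^(]+)/g)];
  let out = "{FNC1}"; // FNC1 in first position flags GS1 structured data
  elements.forEach(([, ai, value], index) => {
    out += ai + value;
    const isLast = index === elements.length - 1;
    if (!isLast && !FIXED_LENGTH_PREFIXES.has(ai.slice(0, 2))) {
      out += "{FNC1}"; // separator after a variable-length field
    }
  });
  return out;
}

console.log(toEscapedFnc1("(01)00312345678906(21)123456789012(30)0144"));
// {FNC1}010031234567890621123456789012{FNC1}300144
```

Note that no separator follows AI (01), which has a pre-defined length, and none follows the final field.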
Some libraries will even use a set of regular or extended ASCII characters as placeholders for the FNC characters, but these are arbitrary representations and it is a mistake to consider them to be actual ASCII values for these non-data characters.
Upon scanning a barcode, the symbol's internal data is typically decoded and then transmitted to the host over a basic channel (e.g. keyboard wedge) as a sequence of bytes to be interpreted according to the Latin-1 character encoding. The FNC characters cannot be represented in such a manner and are excluded from the data stream; however, their formatting effect on the data remains.
For instance, the standards for most symbologies specify that when an FNC1 character is being used in its role as a field separator in data conforming to GS1 Application Identifier Standard Format it should be decoded and transmitted as GS (ASCII 29). Explicitly stated, the formatting effect of a FNC1 character used as a GS1 Application Identifier separator is to place a GS character at the end of the variable-length field. But in other roles (such as when FNC1 is used in "first/second position" as a flag character and with non-GS1 formatted data) there is no formatting effect on the carried data and therefore no ASCII representation during decoding.
Another instance of the special function characters having a formatting effect on the data is with symbologies that use FNC4 to extend their reach from 7-bit ASCII into extended ASCII as described in this answer.
A subtle technical point is that the data transferred to the host is often prefixed with a short symbol indicator header known as a "symbology identifier" which denotes the type and usage of the symbol from which the data is being read. This is often modified by the presence of otherwise invisible flag characters within the symbol data, for example to indicate the presence of GS1 formatted data with "FNC1 in first" or to indicate reader programming mode when FNC3 appears anywhere in the symbol. The details are symbology specific.
Aside: In addition to FNC non-data characters, there are other non-data characters commonly supported by barcode symbologies that have no direct ASCII representation but affect the overall message. These include macro characters (that wrap the message data in an "envelope"), and ECI indicators that require the use of a transmission protocol beyond the typical "basic channel" mode but which enable the use of extended character sets amongst other enhancements.

It is important to know (and to set up the scanner accordingly) that an FNC1 character in the first position is translated to a symbology identifier according to ISO/IEC 15424. The modifier m of the symbology identifier shows whether there was an FNC1 or not. If this is not done, the application can no longer tell whether a GS1 structure was intended. Other structures are identified by, e.g., Macro 06 in a Data Matrix code (ISO/IEC 16022, ISO/IEC 15434). The application needs to tell these apart in order to process the data correctly.
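On the host side, handling the symbology identifier and the GS separators could look roughly like the sketch below. It assumes the scanner is configured to transmit the identifier prefix; the helper name parseScan is hypothetical, and the list of identifiers treated as GS1 (Code 128, Data Matrix, QR Code and DataBar) is an assumption to verify against the scanner documentation.

```javascript
const GS = "\x1D"; // ASCII 29, transmitted for an FNC1 used as a separator

// Hypothetical helper: split raw scanner output into identifier and fields.
function parseScan(raw) {
  // Symbology identifier: "]" + code character + modifier character.
  const match = /^\](.)(.)/.exec(raw);
  const symbologyId = match ? match[0] : null;
  const payload = match ? raw.slice(3) : raw;
  // Assumption: these identifiers indicate GS1 formatted data.
  const isGs1 = ["]C1", "]d2", "]Q3", "]e0"].includes(symbologyId);
  return { symbologyId, isGs1, fields: payload.split(GS) };
}

const result = parseScan("]C10110875066000333101212\x1D17121212308");
console.log(result.symbologyId, result.isGs1, result.fields);
```

Each entry in fields still has to be split into individual AIs using the AI length rules; the GS character only marks the end of a variable-length field.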

Decoded barcode extra digits

I am trying to come to terms with how a barcode is decoded and generated by a scanner.
A note from the client says the following generated bar code consists of extra characters:
Generated Code: |2389299920014}
Extra Characters: Apparently the first two and last three characters are not part of the bar code.
Question
Are the extra characters attached by the bar code reader (therefore dependent on the scanner) or are they an intrinsic part of the barcode?
Here is a sample image of a barcode:
http://imageshack.us/a/img824/1862/dm6x.jpg
Thanks
[SOLVED] My apologies. This was just another one of those cases of 'shooting your mouth off' without doing proper research.
Solution: The code is EAN13. The prefix and suffix are probably scanner dependent. The 13 digits in between are as follows (first digit from the left): check sum, (next 9 digits) company ID + item ID, (last 3 digits) GS1 prefix.

It's hard to answer without understanding what format you are trying to encode, what the intended contents are, and what the purported contents are.
Some formats add extra information as part of the encoding process, but it does not become part of the content. When correctly encoded and decoded, the output should match the input exactly.
Barcodes encode what they encode and there is no data that is somehow part of the barcode but not somehow encoded in it.
EAN-13 has no scanner-dependent considerations, no. The encoding and decoding of a given number is the same everywhere. EAN-13 encodes 13 digits, so I am not sure what the 13 digits "in between" mean.
You mention GS1, which is something else. A family of barcodes in fact. You'd have to say what specifically you are using. The GS1 encodings are likewise not ambiguous or scanner-dependent. You know what you want to encode, you encode it exactly, it's read exactly.
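One quick way to check whether the digits in the scanned string form a complete EAN-13 on their own is to verify the check digit, which in EAN-13 is the 13th (rightmost) digit. The sketch below assumes the non-digit characters in the sample are wrapper characters added outside the barcode data; the function name isValidEan13 is made up for the example.

```javascript
// Returns true if the 13-digit string carries a valid EAN-13 check digit.
function isValidEan13(digits) {
  if (!/^\d{13}$/.test(digits)) return false;
  let sum = 0;
  for (let i = 0; i < 12; i++) {
    // Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    sum += Number(digits[i]) * (i % 2 === 0 ? 1 : 3);
  }
  return (10 - (sum % 10)) % 10 === Number(digits[12]);
}

const scanned = "|2389299920014}";
const digitsOnly = scanned.replace(/\D/g, ""); // drop the wrapper characters
console.log(digitsOnly, isValidEan13(digitsOnly)); // 2389299920014 true
```

Since the 13 digits validate as they stand, the surrounding characters are not needed to make up the EAN-13 content.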

Error with utf8 encoding

When I get data from some websites, sometimes the data is encoded in UTF-8 but looks like this:
Thỏ , Nạt
The accent mark is separated from the character, when in fact these strings should be:
Thỏ, Nạt
I don't know what the problem is here or how to correct it. Can someone help me with this?

The first sample string contains two Vietnamese characters in decomposed form. The first one of them is “ỏ”, consisting of simple letter “o” followed by U+0309 COMBINING HOOK ABOVE.
The second sample string has those characters in precomposed form. The first one of them is “ỏ” U+1ECF LATIN SMALL LETTER O WITH HOOK ABOVE.
The decomposed and precomposed form are defined to be “canonical equivalent” and are normally expected to result in the same rendering (though this does not always happen). They are not identical, however; in programmatic comparison of characters and strings, they are very much different.
Mostly Latin letters with diacritics, such as “é” and “ä”, are used in precomposed form only, since that’s what keyboard drivers, online keyboards, character picking utilities, etc., normally produce. However, Vietnamese keyboard drivers often work so that some diacritic marks are entered after entering a base character, and the diacritic is thus produced as a combining character, i.e. the letter (like “ỏ”) is then in decomposed form.
One way of dealing with this issue, recommended in many contexts, is to convert your strings to Normalization Form C (NFC). This would put these characters into precomposed form. Note, however, that conversion to NFC removes some other distinctions, too (but this is not relevant if the text is in Vietnamese only and does not contain special symbols).
It remains a mystery why the first sample string has a space character before the comma.
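The conversion to NFC mentioned above is available in JavaScript as the built-in String.prototype.normalize. A minimal sketch, with the sample strings written as escapes so that the decomposed and precomposed forms are unambiguous:

```javascript
const decomposed = "Tho\u0309";  // "o" followed by U+0309 COMBINING HOOK ABOVE
const precomposed = "Th\u1ECF";  // U+1ECF LATIN SMALL LETTER O WITH HOOK ABOVE

console.log(decomposed === precomposed);                  // false
console.log(decomposed.normalize("NFC") === precomposed); // true
console.log(decomposed.length, precomposed.length);       // 4 3
```

Normalizing both sides before comparing or storing avoids mismatches between text that looks identical on screen.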
