How to unpack COMP digits using Java?

I found this useful link for unpacking COMP-3 digits, but this time I need to unpack COMP digits. Does anyone know how to do that? Thanks a lot!

In most COBOL compilers, COMP is a big-endian binary integer. On the mainframe, only 2-, 4-, and 8-byte sizes are supported. So for signed values
03 Signed-Num pic s9(4) comp.
if you have the value in an array of bytes, you can do
BigInteger value = new BigInteger(byteArray);
Alternatively, you could use the readShort(), readInt(), and readLong() methods of DataInputStream.
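Both approaches can be sketched together. A minimal example, assuming a 2-byte PIC S9(4) COMP field holding -123 (0xFF85 in big-endian two's complement):

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.math.BigInteger;

public class CompUnpack {
    public static void main(String[] args) throws IOException {
        // PIC S9(4) COMP is a 2-byte big-endian signed binary integer;
        // -123 is 0xFF85 in two's complement.
        byte[] byteArray = { (byte) 0xFF, (byte) 0x85 };

        // Option 1: BigInteger interprets the whole array as a
        // big-endian two's-complement value.
        BigInteger value = new BigInteger(byteArray);
        System.out.println(value); // -123

        // Option 2: DataInputStream reads big-endian primitives directly;
        // use readShort/readInt/readLong to match the field size.
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(byteArray));
        short s = in.readShort();
        System.out.println(s); // -123
    }
}
```

The BigInteger route works for any field length; the DataInputStream route is convenient when you are streaming through a whole record.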
Finally, JRecord will let you read COBOL files with a COBOL copybook.

IBM provides a library of Java methods to simplify interaction with z/OS services and data formats. An overview can be found here: jZOS Toolkit.
Here is a link to PackedDecimal Operations
Here is another, for managing binary data: ByteArrayUnmarshaller

Related

PYSPARK - Reading, Converting and splitting an EBCDIC Mainframe file into DataFrame

We have an EBCDIC mainframe-format file that is already loaded into the Hadoop HDFS system. The file has a corresponding COBOL structure as well. We have to read this file from HDFS, convert the data to ASCII format, and split it into a DataFrame based on its COBOL structure. I've tried some options which didn't seem to work. Could anyone please suggest some proven or working approaches?
For Python, take a look at the Copybook package (https://github.com/zalmane/copybook). It supports most copybook features, including REDEFINES and OCCURS, as well as a wide variety of PIC formats.
pip install copybook
root = copybook.parse_file('sample.cbl')
For parsing into a PySpark dataframe, you can use a flattened list of fields and use a UDF to parse based on the offsets:
offset_list = root.to_flat_list()
disclaimer: I am the maintainer of https://github.com/zalmane/copybook
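The offset-based parsing step can be sketched independently of the library. A minimal example, assuming each flattened field carries a name, byte offset, and length (the field names and offsets below are invented for illustration, and the real `to_flat_list()` output structure may differ):

```python
# Hypothetical sketch: split a fixed-width record using (name, offset, length)
# tuples such as a flattened copybook field list might yield. In a real job
# this function would be wrapped in a PySpark UDF and applied per record.
fields = [("CUST-ID", 0, 6), ("PHONE", 6, 10), ("AMOUNT", 16, 8)]

def parse_record(line):
    """Slice one fixed-width record into a dict keyed by field name."""
    return {name: line[off:off + length].strip() for name, off, length in fields}

record = parse_record("000123555012345600000042")
print(record["CUST-ID"])  # 000123
print(record["PHONE"])    # 5550123456
```

For real EBCDIC data you would decode each slice with the `cp037` codec (or whichever EBCDIC code page applies) rather than treating it as ASCII.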
Find the COBOL Language Reference manual and research the functions DISPLAY-OF and NATIONAL-OF. The link: https://www.ibm.com/support/pages/how-convert-ebcdic-ascii-or-ascii-ebcdic-cobol-program.

Ruby equivalent of ReadString?

I'm working on a project with a "customer-made" database. He developed a C++/CLI application that stores and retrieves his data from a binary file using the BinaryWriter.Write(String) and BinaryReader.ReadString() methods.
I'm no C++/CLI expert, but from what I understand these methods use a 7-bit encoding in the first bytes to store the string's length.
I need to access his data from a Rails application. Does anyone have an idea of how to do the same thing in Ruby?
If you're dealing with raw binary data, you'll probably need to spend some time familiarizing yourself with the pack and unpack methods and their various options. Maybe what you're describing is a "Pascal string" where the length is encoded up front, or a variation on that.
For example:
length = data.unpack("C")[0]
string = data.unpack("Ca#{length}")[1]
The double-unpack is required because you don't know the length of the string to unpack until you do the first step. You could probably do this using a substring as well, like data[1,length] if you're reasonably certain you're not dealing with UTF-8 data.

image data representation

I wanted to know if I can decompress a PNG with png++, get access to the pixels via a file pointer, store them in a 2D or 3D array, and represent them in hex format as the final result, the way a hex editor would. If not, could anybody suggest a way to do the same?
Intended language: C++
Platform: Linux
Thanks in advance.
Use fread to get the values, but you need to know how the header is stored: how many bytes of header there are, and where the data part starts and ends. I recommend starting with this wiki page and trying to read the values using fread: http://en.wikipedia.org/wiki/Portable_Network_Graphics

Modify ASN.1 BER encoded CDR file

I am processing some CDRs (call detail records). I don't know exactly what format the files are in, but I assume they are ASN.1 BER-encoded. My problem is that I want to modify some data in these files, but I don't know which editor or decoder to use. I searched a lot and found many ASN.1 decoders as well as ASN.1 BER viewers/editors, but none allows what I want to do.
The CDRs are supposed to contain customer details, phone numbers, telecom services (telephony, SMS, MMS), etc.
One of the CDR file names is GGSN01_20120105000102_56641-09-12-01-09%3A30
and the file type is just "File".
No other information is available. When I open this file in a text editor, it shows some rectangles and some text data.
Any telecom person could definitely help me here. I am new to the telecom domain.
Please ask if you need more information. Thanks
You would need to know something about ASN.1 and BER to be able to correctly edit your file. BER is a binary format, not ASCII text, thus what you see in your text editor. Even modifying any embedded plain text is only safe if you are not changing the length of the string; BER uses nested structures that encode lengths and so a change in the length of a string value requires adjustments to the encoded lengths of the enclosing structures. Additionally, in order to really know what your data is, you would need to know the ASN.1 that describes it (defines the types that describe your encoded data).
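The nested-lengths point is easiest to see in code. A minimal sketch of BER TLV parsing (Python for brevity; definite-length forms only, with the tag assumed to fit in one byte):

```python
# Minimal sketch of BER TLV (tag-length-value) parsing. It shows why
# resizing a nested string forces every enclosing length to be rewritten:
# the outer SEQUENCE length counts the bytes of everything it wraps.
def parse_tlv(data, pos=0):
    tag = data[pos]
    pos += 1
    length = data[pos]
    pos += 1
    if length & 0x80:  # long form: low 7 bits give the count of length bytes
        n = length & 0x7F
        length = int.from_bytes(data[pos:pos + n], "big")
        pos += n
    value = data[pos:pos + length]
    return tag, value, pos + length

# A SEQUENCE (tag 0x30, length 5) wrapping an OCTET STRING (tag 0x04,
# length 3, value b"abc"). Growing "abc" by one byte would require
# changing both the inner length (3 -> 4) and the outer length (5 -> 6).
encoded = bytes([0x30, 0x05, 0x04, 0x03]) + b"abc"
tag, value, _ = parse_tlv(encoded)
inner_tag, inner_value, _ = parse_tlv(value)
print(inner_value)  # b'abc'
```

Real BER also allows multi-byte tags and indefinite lengths, which this sketch ignores; an ASN.1 toolkit handles those cases for you.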
You could use a tool such as ASN.1 editor, but without the requisite background knowledge, I think it will not be very helpful to you. You can follow various links on this resources page to get more information about ASN.1. (full disclosure: I am currently an Obj-Sys employee).
Look for tools like enber and unber; they come as debugging tools with Lev Walkin's free ASN.1 compiler (asn1c). At least you get text format out of them.
The systematic solution is, of course, to write a program that reads the BER file, applies the changes, and then writes out the altered BER file. To do so you need the ASN.1 specification file for your CDR format (usually found in the specifications of the standard you are using, e.g. IMS), an ASN.1 compiler such as Lev's, and some programming skills.

how to check whether a ruby string is an actual string or blob data such as an image

In Ruby, how can I check whether a string is an actual string or blob data such as an image? From the data type's point of view both are Ruby strings, but their contents are very different: one is a literal string, the other is blob data such as an image.
Could anyone provide some clue for me? Thank you in advance.
Bytes are bytes. There is no way to declare that something isn't file data. It'd be fairly easy to construct a valid file in many formats consisting only of printable ASCII. Especially when dealing with Unicode, you're in very murky territory. If possible, I'd suggest modifying the method so that it takes two parameters... use one for passing text and the other for binary data.
One thing you might do is look at the length of the string. Most image formats are at least 500-600 bytes even for a tiny image, and while this is by no means an accurate test, if you get passed, say, a 20k string, it's probably an image. If it were text, it would be quite a lot (like a quarter of a typical novel, or thereabouts).
Files like images or sound files have defined blocks that can be "sniffed". Wotsit.org has a lot of info about the key bytes and ways to determine what the files are. By looking at those byte offsets in your data you could figure it out.
Another way is to use some "magic": code that sniffs key bytes or byte patterns in a file to try to figure out its type. *nix systems have this built in via the file command. Do a man file or man magic for more info, or check Wikipedia's article on magic numbers in files.
Ruby Filemagic uses the same technique but is based on GNU's libmagic.
What would constitute a string? Are you expecting simple ASCII? UTF-8? Or text encoded some other way?
If you know you're going to get ASCII text or a blob, then you can just spin through the first n bytes and see if anything has the eighth bit set; that would tell you that you have binary. OTOH, not finding anything wouldn't guarantee that you had text.
If you're going to get UTF-8 Unicode then you'd do the same thing but look for invalid UTF-8 sequences. Of course, the same caveats apply.
You could scan the first n bytes for anything between 0x00 and 0x20. If you find any bytes that low then you probably have a binary blob of some sort. But maybe not.
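The heuristics above (low control bytes, invalid UTF-8) can be combined into a rough sketch. As the surrounding answers stress, a clean result does not prove the data is text; this only catches obvious binary:

```ruby
# Rough heuristic: flag a string as binary if its first chunk contains
# control bytes below 0x20 (other than tab, LF, CR) or is not valid UTF-8.
# A false result does NOT guarantee the data is text.
def looks_binary?(data, sample_size = 512)
  sample = data.byteslice(0, sample_size)
  return true if sample.bytes.any? { |b| b < 0x20 && ![9, 10, 13].include?(b) }
  !sample.dup.force_encoding(Encoding::UTF_8).valid_encoding?
end

puts looks_binary?("plain text\n")           # false
puts looks_binary?("\x89PNG\r\n\x1a\n".b)    # true (PNG magic bytes)
```

Note that `byteslice` can cut a multi-byte UTF-8 character in half at the sample boundary, producing a false positive on long UTF-8 text; a magic-number library is the more robust choice.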
As Tyler Eaves said: bytes are bytes. You're starting with a bunch of bytes and trying to find an interpretation of them that makes sense.
Your best bet is to make the caller supply the expected interpretation or take Greg's advice and use a magic number library.
