How to handle a multi-line fixed-length file with BeanIO

I'm very new to BeanIO; it is solving most of my problems, but I'm unable to figure out how to solve this one:
I have a multi-line fixed-width file in the following format:
BBB001 000 000000
BBB555 001 George
BBB555 002 London
BBB555 003 UK
BBB555 999 000000
BBB555 001 Jean
BBB555 002 Paris
BBB555 003 France
BBB555 004 Europe
BBB555 999 000000
BBB999 000 000000
Basically there is a header and a footer, which I can easily read because they are well defined. However, a single record actually spans multiple lines, and the end of a record is the line that has 999 in the middle (there is no other information on that line). I was wondering what my XML should be, or what classes I need to override, so that I can properly read this type of format.

I would suggest using the lineContinuationCharacter property, as described in the BeanIO documentation. It probably has to be configured as a carriage return and line feed.
Try something like this:
<beanio xmlns="http://www.beanio.org/2012/03"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.beanio.org/2012/03 http://www.beanio.org/2012/03/mapping.xsd">
  <stream name="s1" format="fixedlength" lineContinuationCharacter="">
    <!-- record layout... -->
  </stream>
</beanio>
Note that I haven't tested this, but according to the documentation this should work.
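If the continuation-character route doesn't pan out, the grouping logic itself is straightforward: skip the file header and trailer, collect the data lines, and close a record when the 999 line appears. Purely to illustrate that idea (this is plain Python, not BeanIO, and the field positions are assumptions read off the sample):

records, current = [], []

with open("input.txt") as f:               # hypothetical file name
    for line in f:
        if line.startswith("BBB001") or line.startswith("BBB999"):
            continue                       # file header / trailer
        if line[7:10] == "999":            # record terminator line
            records.append(current)
            current = []
        else:
            current.append(line[11:].rstrip())  # the value field

print(records)  # [['George', 'London', 'UK'], ['Jean', 'Paris', 'France', 'Europe']]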

Related

YamlDotNet - packet to YAML

Hello, I want to create a program that will help me convert packet data from a game to YAML. It should look like this:
monsters:
- map_monster_id: 1678
vnum: 333
map_x: 30
map_y: 165
- map_monster_id: 1679
vnum: 333
map_x: 24
map_y: 157
I have code that is supposed to write those things to a database, and I want to rework it so it writes to YAML instead. Can anyone tell me where to start? Thank you :)
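As a starting point: the target output is just a list of mappings under a monsters key, so the work is to build that object graph from the parsed packet data and hand it to a serializer. The question is about YamlDotNet (C#), but purely to illustrate the shape, here is a sketch in Python with PyYAML; with YamlDotNet you would build the same structure and pass it to the library's serializer:

import yaml

# hypothetical data, as it would come out of the packet parser
monsters = [
    {"map_monster_id": 1678, "vnum": 333, "map_x": 30, "map_y": 165},
    {"map_monster_id": 1679, "vnum": 333, "map_x": 24, "map_y": 157},
]

print(yaml.safe_dump({"monsters": monsters}, sort_keys=False))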

How to load data using sqlldr

My raw file is like below; here the transaction branch and the home branch are different.
I need the data like below. Can anyone help me achieve this (using the sqlldr concept)?
Tr_Bra Acc Name Bal Home_Bra
100 1000 bbbb 100 100
100 1001 bbbb 200 100
101 1003 bbbb 400 200
101 1004 bbbb 500 200
102 1005 bbbb 400 500
102 1006 bbbb 500 500
Assuming you want to upload the file name (i.e. 100.csv) as a value for SQL Loader to load into the table, there is no way for the program to identify the file name as a parameter.
I suggest concatenating the files with another script (Python would be great for the job) and then loading the final file with SQL Loader.
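A minimal sketch of that concatenation step in Python, assuming each input file is named after the value you want loaded (e.g. 100.csv) and that this value should become the last column:

import csv, glob, os

with open("combined.dat", "w", newline="") as out:   # hypothetical output name
    writer = csv.writer(out)
    for path in sorted(glob.glob("*.csv")):
        value = os.path.splitext(os.path.basename(path))[0]  # "100" from "100.csv"
        with open(path, newline="") as src:
            for row in csv.reader(src):
                writer.writerow(row + [value])       # file name appended as a column

SQL Loader can then load combined.dat with an ordinary control file, since the extra column is now physically present in the data.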

SoftQuad DESC or font file binary

I read this question but it didn't help me. I am solving a challenge where I have two files: the first one is a .png which gives me the upper half of an image; the second one is identified as "SoftQuad DESC or font file binary". I am sure that this file should somehow be converted into a .png file to complete the image. I googled and got a hint about magic bytes, but I am unable to match the bytes.
These are the first two rows of the output of the xxd command:
00000000: aaaa a6bb 67bb bf18 dd94 15e6 252c 0a2f ....g.......%,./
00000010: fe14 d943 e8b5 6ad5 2264 1632 646e debc ...C..j."d.2dn..
These are the last two rows of the output of the xxd command:
00001c10: 7a05 7f4c 3600 0000 0049 454e 44ae 4260 z..L6....IEND.B`
00001c20: 82
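On the magic bytes: a valid PNG always starts with the fixed 8-byte signature 89 50 4E 47 0D 0A 1A 0A, and the dump above already ends with an IEND chunk, so this looks like PNG data with a mangled header. A minimal Python sketch to compare the first bytes against the signature (the file name is hypothetical):

PNG_SIG = bytes.fromhex("89504e470d0a1a0a")  # fixed PNG signature

with open("second_file", "rb") as f:         # hypothetical file name
    head = f.read(len(PNG_SIG))

print(head.hex())
print("PNG signature" if head == PNG_SIG else "not a plain PNG header")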

Find missing values between text files using VBScript

EDIT 2
Just to make it clear - this question is not asking for "DA CODEZ", just an idea as to possible approaches in VBScript. It's not my weapon of choice, and if I needed to do this on a un*x box then I wouldn't be bothering anyone with this...
I have two text files. File A contains a list of keys:
00001 13756 000.816.000 BE2B
00001 13756 000.816.000 BR1B
00002 16139 000.816.000 BR1B
00001 10003 000.816.000 CH1C
00001 10003 000.816.000 CH3D
00001 13756 000.816.000 CZ1B
....
....
File B, tab separated, contains two columns: the keys, and a UUID:
00003 16966 001.001.023 2300 a3af3b1d-ea04-4948-ba25-59b36ae364ae
00001 12119 001.001.023 CZ1B e6efe825-0759-48b0-89b9-05bbe5d49625
00002 16966 001.001.023 BR1B d3a1d62b-a0d5-43c3-ba49-a219de5f32a5
00001 12119 001.001.023 BR1B 5d74af27-ed4b-4f90-8229-90b6d807515b
00001 10009 001.001.024 BR1B 590409cc-496a-49eb-885c-9bbc51863363
00002 24550 001.001.024 2100 46ecea5d-f8f5-4df9-92cf-0b73f6c81adc
00001 12119 001.001.024 CZ1B e415ce6f-7394-4a66-a7f8-f76487e78086
00002 16966 001.001.024 CZ1B c591a726-4d71-4f61-adfd-63310d21d397
....
....
I need to extract, using plain VBScript, the UUIDs for those entries in File B which have no matching entry in File A. (I need to optimise for speed, if that's an important criterion.) The result should be a file of orphaned UUIDs.
If this isn't easy/possible, that's also an answer - I can do it in the db we're using, but the performance is woefully inadequate. I've just been blown away by how much faster VBScript was than the db solution for a previous processing task.
EDIT
Someone's suggested using some sort of ADO library, after converting the file to CSV, which I'm looking into.
Maybe the fastest way to do it is just to ask the OS to do it:
Dim orphan
' findstr /g:keys.txt reads the search strings from keys.txt;
' /v prints the lines of uuids.txt that match none of them
orphan = WScript.CreateObject("WScript.Shell").Exec( _
    "findstr /g:keys.txt /v uuids.txt" _
).StdOut.ReadAll()
WScript.Echo orphan
That is, use findstr to check the uuids.txt file for lines that do not match any of the lines in the keys.txt file.
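For reference, the same set-difference logic written out in Python (not VBScript, so purely a sketch of what the findstr call is doing, using the keys.txt and uuids.txt names from the command above):

with open("keys.txt") as f:
    keys = {line.strip() for line in f if line.strip()}

with open("uuids.txt") as f, open("orphans.txt", "w") as out:
    for line in f:
        key, _, uuid = line.strip().rpartition("\t")  # two tab-separated columns
        if key and key not in keys:
            out.write(uuid + "\n")                    # UUID with no match in File A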

Change Data Capture in delimited files

There are two tab-delimited files (file1, file2) with the same number and structure of records, but with different values for some columns.
Every day we get another file (newfile) with the same number and structure of records, but with some changes in column values.
I need to compare this file (newfile) with the two files (file1, file2) and update the records in them with the changed records, keeping unchanged records intact.
Before applying changes:
file1
11 aaaa
22 bbbb
33 cccc
file2
11 bbbb
22 aaaa
33 cccc
newfile
11 aaaa
22 eeee
33 ffff
After applying changes:
file1
11 aaaa
22 eeee
33 ffff
file2
11 aaaa
22 eeee
33 ffff
What would be an easy and efficient solution? Unix shell scripting? The files are huge, containing millions of records; can a shell script be an efficient solution in this case?
Every day we get another file (newfile) with the same number and structure of records, but with some changes in column values.
This sounds to me like a perfect case for git. With git you can commit the current file as it is.
Then, as you get new "versions" of the file, you can simply replace the old version with the new one and commit again. The best part is that each time you make a commit, git will record the changes from version to version, giving you access to the entire history of the file.
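A minimal sketch of that workflow in Python, using the file names from the question. As the "after" example shows, the updated file1 and file2 are simply identical to newfile, so copying it over them updates the changed records and leaves the unchanged ones byte-for-byte intact; the commit then captures exactly what changed:

import shutil
import subprocess

# assumes the directory is already a git repository
for target in ("file1", "file2"):
    shutil.copyfile("newfile", target)   # changed rows overwrite, unchanged rows stay identical

subprocess.run(["git", "add", "file1", "file2"], check=True)
subprocess.run(["git", "commit", "-m", "apply daily newfile"], check=True)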
