COBOL logic for loading a de-normalized file into a normalized table

How do I load a de-normalized file into a normalized table? I'm new to COBOL; any suggestions on the requirement below would be appreciated. Thanks.
Inbound file: FileA.DAT
ABC01
ABC2014/01/01
FDE987
FDE2012/01/06
DEE6759
DEE2014/12/12
QQQ444
QQQ2004/10/12
RRR678
RRR2001/09/01
Table : TypeDB
TY_CD Varchar(03)
SEQ_NUM CHAR(10)
END_DT DATE
I have to write a COBOL program to load the table : TypeDB
Output of the result should be,
TY_CD SEQ_NUM END_DT
ABC 01 2014/01/01
FDE 987 2012/01/06
DEE 6759 2014/12/12
QQQ 444 2004/10/12
RRR 678 2001/09/01
Below is my pseudo-code:
Perform Until F1 IS EOF
Read F1
MOVE F1-REC to WH1-REC
Read F1
MOVE F1-REC to WH2-REC
IF WH1-TY-CD = WH2-TY-CD
move WH1-TY-CD to TY-CD
move WH1-CD to SEQ_NUM
move WH2-DT to END-DT
END-IF
END-PERFORM
This is not working. Is there a better approach than doing two reads inside the PERFORM?

I'd definitely go with reading in pairs, like you have. It is clearer, to me, than having "flags" to say what is going on.
I suspect you've overwritten your first record with the second without realising it.
A simple way around that, for a beginner, is to use READ ... INTO ... to get your two different layouts. As you become more experienced, you'll perhaps save the data you need from the first record, and just use the second record from the FD area.
Here's some pseudo-code. It is the same as yours, but uses a "Priming Read". This time the Priming Read reads two records. No problem.
By testing the FILE STATUS field as indicated, the paired structure of the file is verified. Checking the key ensures that the pairs are always for the same "key" as well. All built-in and hidden away from your actual logic (which in this case is not much anyway).
PrimingRead
FileLoop Until EOF
    ProcessPair
    ReadData
EndFileLoop
ProcessPair
    Do the processing from Layout1 and Layout2
PrimingRead
    ReadData
    Crash with non-zero file-status
ReadData
    ReadRec1
    ReadRec2
    If Rec2-key not equal to Rec1-key, crash
ReadRec1
    Read Into Layout1
    Crash with non-zero file-status
ReadRec2
    Read Into Layout2
    Crash with file-status other than zero or 10
While we are at it, we can apply this solution from Valdis Grinbergs as well (see https://stackoverflow.com/a/28744236/1927206).
PrimingRead
FileLoop Until EOF
    ProcessPair
    ReadPairStructure
EndFileLoop
ProcessPair
    Do the processing from Layout1 and Layout2
PrimingRead
    ReadPairStructure
    Crash with non-zero file-status
ReadPairStructure
    ReadRec1
    ReadSecondOfPair
ReadSecondOfPair
    ReadRec2
    If Rec2-key not equal to Rec1-key, crash
ReadRec1
    Read Into Layout1
    Crash with non-zero file-status
ReadRec2
    Read Into Layout2
    Crash with file-status other than zero or 10
Because the structure of the file is very simple either can do. With fixed-number-groups of records, I'd go for the read-a-group-at-a-time. With a more complex structure, the second, "sideways".
Either method clearly reflects the structure of the file, and when you do that in your program, you aid the understanding of the program for human readers (which may be you some time in the future).
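To make the read-a-pair-at-a-time structure concrete, here is a minimal sketch in Python rather than COBOL. It is only an illustration: the names read_pairs, to_row and KEY_LEN are made up for this example, the key is assumed to be the first three characters of every record, and end of file is assumed to be legal only where the first record of a pair would start.

```python
import io
from typing import Iterator, TextIO, Tuple

KEY_LEN = 3  # assumption: TY_CD occupies the first 3 characters of each record


def read_pairs(f: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (first, second) records, verifying the paired structure."""
    while True:
        rec1 = f.readline()
        if rec1 == "":               # end of file is only legal between pairs
            return
        rec2 = f.readline()
        if rec2 == "":               # a first record without its partner
            raise ValueError("file ends in the middle of a pair")
        rec1, rec2 = rec1.rstrip("\n"), rec2.rstrip("\n")
        if rec1[:KEY_LEN] != rec2[:KEY_LEN]:
            raise ValueError(f"key mismatch: {rec1!r} / {rec2!r}")
        yield rec1, rec2


def to_row(rec1: str, rec2: str) -> Tuple[str, str, str]:
    """Combine one pair into (TY_CD, SEQ_NUM, END_DT)."""
    return rec1[:KEY_LEN], rec1[KEY_LEN:], rec2[KEY_LEN:]


sample = io.StringIO("ABC01\nABC2014/01/01\nFDE987\nFDE2012/01/06\n")
rows = [to_row(first, second) for first, second in read_pairs(sample)]
print(rows)  # [('ABC', '01', '2014/01/01'), ('FDE', '987', '2012/01/06')]
```

The structural checks live in read_pairs, so the code that consumes the pairs never has to think about file status or mismatched keys.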

Related

Print lines around position in the file

I'm importing a big CSV file (5 GB) into BigQuery, and the error report gives the position of an error in the file as a byte offset from the start of the file (for example, 134683757). I'd like to look at the lines around this error position.
Some example lines of the file:
field1, field2, field3
abc, bcd, efg
...
dge, hfr, kdf,
dgj, "a""a", fbd # in this line is an invalid csv element and I get error, let's say on the position 134683757
skd, frd, lqw
...
asd, fij, fle
I need a command that shows the lines around the error, like:
dge, hfr, kdf,
dgj, "a""a", fbd
skd, frd, lqw
I tried sed and awk but I didn't find any simple solution.
It was definitely not clear from the original version of the question that you only got a byte offset from the start of the file.
You need to get a better position from the software generating the error; the developer was lazy in reporting an unusable number. It is reasonable to request a line number (and preferably offset within the line), rather than (or as well as) the byte offset from the start.
Assuming that the number is a byte position in the file, that gets tricky. Most Unix utilities work with lines (of variable length). I'd be tempted to write some C code to do the job, but that might be beyond you (and no shame in that).
Failing that, your best bet is likely the dd command. If the number reported is 134683757, then I'd guess that your lines are probably not more than 1 KiB each (adjust numbers if they're bigger, or smaller), and then use:
dd if=big.csv of=extract.csv bs=1 skip=$((134683757 - 3 * 1024)) count=6144
echo >> extract.csv
You'd then look at extract.csv. The raw dd output probably won't have a newline at the end of the last line (the echo >>extract.csv fixes that). The output will probably start part way through a record and end part way through another record. However, you're likely to have the relevant information, as well as some irrelevant information. As I said, adjust the numbers to suit your exact situation.
The trickiest part is identifying exactly where the byte offset is in the file you get. With custom C code, that can be provided easily (more easily). With the output from dd, you have to do the calculation yourself.
awk -v offset=$((134683757 - 3 * 1024)) '
{ printf "%9d: %s\n", offset, $0; offset += length($0) + 1 }
' extract.csv
That takes the starting offset from the dd command, and prefixes the (remnants of) the first line with that number and the data; it then adds the length to the offset plus one for the newline that wasn't counted, and continues to the end of the file. That gives you the start offset for each line in the extracted data. You can see where your actual start was by looking at the offsets — you should be able to identify which record that was.
You could use a variant of this Awk script that reads the whole file line by line, and tracks the offset (as well as the line numbers) and prints the data when it gets to the vicinity of where you have the problem.
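The whole-file variant described above can also be sketched in Python instead of Awk. This is an illustration under assumptions, not a tested tool for a 5 GB file: the function name lines_around and its context parameter are invented for this example, and lines are assumed to end in a newline byte.

```python
from collections import deque
from typing import List, Tuple


def lines_around(path: str, target: int, context: int = 1) -> List[Tuple[int, int, bytes]]:
    """Return (line_number, start_offset, line) for the line containing byte
    offset `target`, plus `context` lines before and after it."""
    before = deque(maxlen=context)   # the most recent lines before the hit
    result: List[Tuple[int, int, bytes]] = []
    remaining_after = None           # stays None until the target line is found
    offset = 0
    with open(path, "rb") as f:      # binary mode, so offsets are counted in bytes
        for number, line in enumerate(f, start=1):
            entry = (number, offset, line.rstrip(b"\r\n"))
            end = offset + len(line)
            if remaining_after is None:
                if offset <= target < end:
                    result.extend(before)
                    result.append(entry)
                    remaining_after = context
                else:
                    before.append(entry)
            elif remaining_after > 0:
                result.append(entry)
                remaining_after -= 1
            else:
                break                # enough trailing context collected
            offset = end
    return result
```

Calling lines_around("big.csv", 134683757) would return the offending line with one line of context on each side, each tagged with its line number and starting byte offset. The file is read sequentially, so memory use stays small even though every byte before the target has to be scanned.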
In times long past, I had to deal with data from 1/2 inch mag tapes (those big circular tapes you see in old movies) where the files generated on a mainframe seemed sanely formatted for the first few tens of megabytes, but then the format changed to some alternative format for a few megabytes, and then reverted to the original format once more. I never did find out why; I just learned how to deal with it. Trial and error!

Why does Trackpy give me an error when I try to compute the overall drift speed?

I'm going through the Trackpy walkthrough (http://soft-matter.github.io/trackpy/v0.3.0/tutorial/walkthrough.html) but using my own pictures. When I get to calculating the overall drift velocity, I get this error and I don't know what it means: [screenshot: drift error]
I don't have a ton of coding experience so I'm not even sure how to look at the source code to figure out what's happening.
Your screenshot shows the traceback of the error: you called a function, tp.compute_drift(), which called another function, pandas_sort(), which called another function, and so on until raise ValueError(msg) was reached, which interrupts the chain. The last line is the actual error message:
ValueError: 'frame' is both an index level and a column label, which is ambiguous.
To understand it, you have to know that Trackpy stores data in DataFrame objects from the pandas library. The tracking data you want to extract drift motion from is stored in such an object, t2. If you print t2 it will probably look like this:
                 y            x      mass  ...        ep  frame  particle
frame                                      ...
0        46.695711  3043.562648  3.881068  ...  0.007859      0         0
3979   3041.628299  1460.402493  1.787834  ...  0.037744      0         1
3978   3041.344043  4041.002275  4.609833  ...  0.010825      0         2
The name "frame" labels both the index (the leftmost column) and a regular column, which confuses the sorting code. As the error message says, it is ambiguous to sort the table by frame.
Solution
The index (leftmost) column does not need a name here, so remove it with
t2.index.name = None
and try again. Check if you have the newest Trackpy and Pandas versions.
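The ambiguity and the fix can be reproduced on a tiny, made-up DataFrame without Trackpy. This is only a sketch: the values are invented, and whether the first sort raises the error or merely warns depends on the installed pandas version, so the example catches the exception instead of relying on it.

```python
import pandas as pd

# Invented stand-in for t2: set_index(..., drop=False) keeps 'frame' as a
# column and also uses it as the index name, like the printout above.
t2 = pd.DataFrame(
    {"x": [3.0, 1.0, 2.0], "frame": [2, 0, 1], "particle": [0, 0, 0]}
).set_index("frame", drop=False)

try:
    t2.sort_values("frame")
    message = None
except ValueError as err:        # raised by recent pandas versions
    message = str(err)
print(message)

t2.index.name = None             # the fix: remove the duplicate label
fixed = t2.sort_values("frame")
print(fixed["frame"].tolist())   # [0, 1, 2]
```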

ASCII control of VFD

All,
I am a new user here, and thought I would see if the experts could help me with something I am new to.
I have been given the following statement to try and solve:
The Variable Frequency Drive (VFD) is connected to the PLC by RS485 communication. The speed of the motor (M2) can be adjusted by sending the following command:
STX N DATA ETX, with each separate value having the <> symbols around it.
Data : Length of data is 1 byte, in which the value of S (Slow), M (Medium) or F (Fast) can be sent.
N : Node number of the VFD, with a data length of two byte ASCII.
My question is, how would I type to send this data? It doesn't say whether to use a specific data type to represent, so surely I could just type the data as it is, e.g. STX 1 S ETX?
Otherwise, I'm not sure how to combine the byte representations of the data, representing them in hex, binary or decimal. I'm not sure what is meant by two byte ASCII, is this not UNICODE-16? Also, I'm not sure if I need to send the values of STX or ETX with the data string or not.
I hope someone can shed some light on this.
Thanks in advance.
Since the frequency goes from 0-50 Hz, I think we should send data in this range.
So if we want the frequency to be half maximal, we will send 25.
To send this to VFD, we first need to split that number into 2 and 5
The message should read STX 2 5 ETX?
Now we look at the ASCII code table and find 2 and 5.
0x32 = 2
0x35 = 5
We convey everything in a message that reads
STX 0x32 0x35 ETX
The S7-300 is recommended for operation. You can also solve this through its TIA Portal.
All,
I managed to figure this out with a bit of digging. I simulated it using Siemens S7-300 on TIA portal, and set up communications on a module. I sent the values I wanted using a "move" block, to a value set in the Data Block.
I repeated this for the Node value, making sure the correct data type was chosen, and sent the data through a Send_ptp command block.
Must have got a bit flustered and tired the other night when I was trying it. Hopefully it might help someone in the future.
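As a byte-level illustration of the frame described in the question, the message can be assembled as below. This is a sketch under assumptions rather than the drive's documented protocol: "two byte ASCII" is taken to mean two decimal digit characters with zero padding (node 1 becomes "01"), no checksum is added, and the name build_frame is made up. STX (0x02) and ETX (0x03) are the standard ASCII control characters.

```python
STX = b"\x02"  # ASCII start-of-text control character
ETX = b"\x03"  # ASCII end-of-text control character


def build_frame(node: int, speed: str) -> bytes:
    """Assemble STX + N + DATA + ETX as raw bytes."""
    if speed not in ("S", "M", "F"):
        raise ValueError("speed must be 'S', 'M' or 'F'")
    if not 0 <= node <= 99:
        raise ValueError("node must fit in two ASCII digits")
    node_field = f"{node:02d}".encode("ascii")   # assumption: zero-padded digits
    return STX + node_field + speed.encode("ascii") + ETX


frame = build_frame(1, "S")
print(frame.hex(" "))  # 02 30 31 53 03
```

Each ASCII character occupies one byte, so a two byte ASCII node number is simply two characters, not a 16-bit Unicode value.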

ORA-29285 error While exporting data from oracle database to CSV file using procedure

I have a problem exporting data from an Oracle database to a CSV file using a procedure. Sometimes the CSV file gets truncated, and the error shown is "ORA-29285 - File write error". The file is not truncated every time; it happens randomly.
Edit : Below is the chunk of code I used in my procedure
conn := utl_file.fopen('sample_directory','output.csv','W',4096);
for i in (select * from per_data)
loop
utl_file.put_line(conn,i.name||','||i.sub||','||to_char(i.start_date,'dd-mon-yy')||','||to_char(i.expire_date,'dd-mon-yy')||','||i.loc||CHR(13));
end loop;
utl_file.fclose(conn);
I am pulling my hair out trying to find the reason. Can someone help me out?
One way to get this error is to open the file with a certain maximum line size - possibly the default 1024 - and then try to write a single line to that file which is longer than that.
In your case you don't seem to be doing that, as you open it with 4096 and your lines are all (apparently) shorter than that.
So you may be hitting the 32k limitation:
The maximum size of the buffer parameter is 32767 bytes unless you specify a smaller size in FOPEN. If unspecified, Oracle supplies a default value of 1024. The sum of all sequential PUT calls cannot exceed 32767 without intermediate buffer flushes.
You don't seem to be doing any flushing. You could change your put_line call to auto-flush:
utl_file.put_line(conn,
i.name||','||i.sub||','||to_char(i.start_date,'dd-mon-yy')
||','||to_char(i.expire_date,'dd-mon-yy')||','||i.loc||CHR(13),
true);
or keep a counter in your loop and manually flush every 100 lines (or whatever number works and is efficient for you).
As noted in the documentation:
If there is buffered data yet to be written when FCLOSE runs, you may receive WRITE_ERROR when closing a file.
You aren't flushing before you close, so adding an explicit flush - even if you have autoflush set to true - might also help avoid this, at least if the exception is being thrown by the fclose() call rather than by put_line():
...
end loop;
utl_file.fflush(conn);
utl_file.fclose(conn);

How can I debug a Fortran READ/WRITE statement with an implicit DO loop?

The Fortran program I am working on is encountering a runtime error when processing an input file.
At line 182 of file ../SOURCE_FILE.f90 (unit = 1, file = 'INPUT_FILE.1')
Fortran runtime error: Bad value during integer read
Looking to line 182 I see a READ statement with an implicit/implied DO loop:
182: READ(IT4, 310 )((IPPRM2(IP,I),IP=1,NP),I=1,16) ! read 6 integers
183: READ(IT4, 320 )((PPARM2(IP,I),IP=1,NP),I=1,14) ! read 5 reals
Format statement:
310 FORMAT(1X,6I12)
When I reach this code in the debugger NP has a value of 2. I has a value of 6, and IP has a value of 67. I think I and IP should be reinitialized in the loop.
My problem is that when I try to step through in the debugger once I get to the READ statement it seems to execute and then throw the error. I'm not sure how to follow it as it reads. I tried stepping into the function, but it seems like that may be a difficult route to take since I am unfamiliar with the gfortran library. The input file looks OK, I think it should be read just fine. This makes me think this READ statement isn't looping as intended.
I am completely new to Fortran and implicit DO loops like this, but from what I can gather line 182 should read in 6 integers according to the format string #310. However, when I arrive NP has a value of 2 which makes me think it will only try to read 2 integers 16 times.
How can I debug this read statement to examine the values read into IPPARM as they are read from the file? Will I have to step through the Fortran library?
Any tips that can clear up my confusion regarding these implicit loops would be appreciated!
Thanks!
NOTE: I'm using gfortran/gcc and gdb on Linux.
Is there any reason you need specific formatting on the read? I would use READ(IT4, *) where feasible...
Later versions of gfortran support the Fortran 2008 unlimited format item (see http://fortranwiki.org/fortran/show/Fortran+2008+status)
Then it may be helpful to specify
310 FORMAT(*(1X,6I12))
Or for older compilers
310 FORMAT(1000(1X,6I12))
The variables IP and I are loop indices, so they are reinitialized by the loop. With NP=2 the first statement is going to read a total of 32 integers: the implied DO loop determines the list of items to read, and the format determines how they are read. With "1X,6I12" they will be read as 6 integers per line of the input file. Once the first 6 of the requested 32 integers have been read from a line/record, Fortran will consider that line/record completed and advance to the next record.
With a format of "1X,6I12" the integers must be precisely arranged in the file. There should be a single blank, then the integers should each be right-justified in fields of 12 columns. If they get out of alignment you could get the wrong value read or a runtime error.
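To see both halves of this outside the Fortran runtime, here is a rough Python sketch. It is an approximation for illustration only: the helper names implied_do_order and parse_record are invented, and int() is stricter than Fortran's I12 editing (for example, Fortran treats an all-blank field as zero, while int() rejects it).

```python
from typing import List, Tuple


def implied_do_order(np_: int, n_i: int) -> List[Tuple[int, int]]:
    """Order in which ((A(IP,I), IP=1,NP), I=1,n_i) visits the elements:
    the inner index IP varies fastest."""
    return [(ip, i) for i in range(1, n_i + 1) for ip in range(1, np_ + 1)]


def parse_record(line: str, count: int = 6, width: int = 12) -> List[int]:
    """Read one record laid out as 1X,6I12: skip one column, then take
    `count` fields of exactly `width` columns each."""
    body = line[1:]                                  # 1X skips the first column
    fields = [body[k * width:(k + 1) * width] for k in range(count)]
    return [int(field) for field in fields]          # int() ignores padding blanks


order = implied_do_order(2, 16)
print(len(order))   # 32
print(order[:4])    # [(1, 1), (2, 1), (1, 2), (2, 2)]

good = " " + "".join(f"{value:12d}" for value in (1, 2, 3, 4, 5, 6))
print(parse_record(good))  # [1, 2, 3, 4, 5, 6]
```

Printing a suspect line from the input file sliced the same way (one skipped column, then 12-column fields) is a quick way to spot a value that has drifted out of its field without stepping through the gfortran runtime library.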
