So I have a couple of compressed files, and the uncompressed versions as well. I do not have the software that created these files originally. I'm trying to figure out what the underlying algorithm is -- can you figure it out? Initially, I thought it might be some LZW variant, but I'm not sure. The data seems to make more sense when broken up into 6-bit words -- I see lots of repeating patterns then.
The two files are very similar, and the uncompressed versions differ by only a few bytes -- that could help establish where those differing bytes are in the compressed files. I've highlighted the differences.
Compressed file #1:
02 02 01 17 0E 11 92 14 C0 55 52 44 FF BC AE 47 DB E1 05 42 F8 70 DE 57 23 FF
54 1A 55 3D BF 54 10 E3 38 0C B2 FB C4 92 1C 20 DE 57 23 FF 54 1A 55 3D BE 5E
4C 96 B2 0E 32 80 CB 2F BC 48 70 83 79 5C 8F FD 50 69 54 F6 F9 96 48 A9 07 19
C2 30 F0 E1 BC AE 47 FE A8 34 AA 7B 7E 32 BF E5 1F EE A8 48 CA 11 87 87 0D E5
72 3F F5 41 A5 53 DB E5 24 5D F8 CA FF 4C B1 13 8C 71 18 7B C3 86 F2 B9 1F FA
A0 D2 A9 ED FD 55 97 BA 22 32 C0 CB 2F BC
Compressed file #2:
02 02 01 17 0E 11 92 14 C0 55 52 44 FF BC AE 47 DB E1 05 42 F8 70 DE 57 23 FF
54 1A 55 3D BF 54 10 E3 38 0D 36 D4 04 92 1C 20 DE 57 23 FF 54 1A 55 3D BE 5E
4C 96 B2 0E 32 80 D3 6D 40 48 70 83 79 5C 8F FD 50 69 54 F6 F9 96 48 A9 07 19
C2 30 F0 E1 BC AE 47 FE A8 34 AA 7B 7E 32 BF E5 1F EE A8 48 CA 11 87 87 0D E5
72 3F F5 41 A5 53 DB E5 24 5D F8 CA FF 4C B1 13 8C 71 18 7B C3 86 F2 B9 1F FA
A0 D2 A9 ED FD 55 97 BA 22 32 C0 D3 6D 40
Uncompressed file #1:
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 2A 2A 2A 2A 2A 20 47 52 41 4E 44 20 54 4F 54 41 4C 53 20
2A 2A 2A 2A 2A 0D 0A 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 20 20 20 20 20 20 20 54 4F 54 41 4C 20 52 45 43 4F 52 44
53 20 52 45 41 44 20 20 20 20 20 20 20 20 20 20 20 20 20 20 32 32 38 37 0D 0A
0D 0A 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 20 20 54 4F 54 41 4C 20 52 45 43 4F 52 44 53 20 42 59 50
41 53 53 45 44 20 20 20 20 20 20 20 20 20 20 32 32 38 37 0D 0A 20 20 20 20 20
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 54 4F 54 41 4C 20 52 45 43 4F 52 44 53 20 43 48 41 4E 47 45 44 20 20 20
20 20 20 20 20 20 20 20 20 20 20 30 0D 0A 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 54 4F 54 41 4C
20 52 45 43 4F 52 44 53 20 4E 4F 54 20 4F 4E 20 58 52 45 46 20 20 20 20 20 20
20 20 20 20 30 0D 0A 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 20 20 20 20 20 20 20 54 4F 54 41 4C 20 52 45 43 4F 52 44
53 20 42 41 4E 4B 20 4E 4F 54 20 46 4F 55 4E 44 20 20 20 20 20 20 20 30 0D 0A
0D 0A 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 20 20 54 4F 54 41 4C 20 52 45 43 4F 52 44 53 20 57 52 49
54 54 45 4E 20 20 20 20 20 20 20 20 20 20 20 32 32 38 37
Uncompressed file #2:
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 2A 2A 2A 2A 2A 20 47 52 41 4E 44 20 54 4F 54 41 4C 53 20
2A 2A 2A 2A 2A 0D 0A 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 20 20 20 20 20 20 20 54 4F 54 41 4C 20 52 45 43 4F 52 44
53 20 52 45 41 44 20 20 20 20 20 20 20 20 20 20 20 20 20 20 33 34 33 39 0D 0A
0D 0A 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 20 20 54 4F 54 41 4C 20 52 45 43 4F 52 44 53 20 42 59 50
41 53 53 45 44 20 20 20 20 20 20 20 20 20 20 33 34 33 39 0D 0A 20 20 20 20 20
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 54 4F 54 41 4C 20 52 45 43 4F 52 44 53 20 43 48 41 4E 47 45 44 20 20 20
20 20 20 20 20 20 20 20 20 20 20 30 0D 0A 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 54 4F 54 41 4C
20 52 45 43 4F 52 44 53 20 4E 4F 54 20 4F 4E 20 58 52 45 46 20 20 20 20 20 20
20 20 20 20 30 0D 0A 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 20 20 20 20 20 20 20 54 4F 54 41 4C 20 52 45 43 4F 52 44
53 20 42 41 4E 4B 20 4E 4F 54 20 46 4F 55 4E 44 20 20 20 20 20 20 20 30 0D 0A
0D 0A 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 20 20 54 4F 54 41 4C 20 52 45 43 4F 52 44 53 20 57 52 49
54 54 45 4E 20 20 20 20 20 20 20 20 20 20 20 33 34 33 39
As you can see, the output files are just plain ASCII text files. Any ideas?
This seems to be a proprietary encoding format, designed to shave a few bits off specific types of messages.
It operates on 8-bit (ASCII) input and outputs a bit stream using a mixture of 5-bit and 6-bit token sets, including some control tokens.
The following tokens can be identified:
// 5 bit tokens:
00000 switch to 6 bit mode
00011 take the following 6 bits as N, and output N spaces
00100 A
00101 B
.....
11101 Z
11110 <crlf>
11111 space
// 6 bit tokens:
000001 switch to 5 bit mode
000011 take the following 6 bits as N, and output N spaces
001001 <crlf>
011000 1
011001 2
......
100000 9
// pure speculation:
010111 0
010010 *
000110 repeat the next 6 bit char N times
001100 space
00001 skip 3 bits, take the next 8 bits as ascii, and output N times
Without more examples it is hard to determine what happens at the beginning of the stream. It might be some magic value, or could contain some control values.
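As a rough check of the 6-bit digit tokens above, here is a minimal Python sketch. It assumes the digit table from this answer (including the speculative code for 0). The 7-bit starting offset into the last four bytes of each compressed file is an assumption found by lining the bits up by hand; it is not something the format documents. The helper names are made up for illustration.

```python
def bits_from_hex(hex_text):
    """Turn a string like 'C0 CB' into a string of '0'/'1' characters."""
    return "".join(f"{int(byte, 16):08b}" for byte in hex_text.split())

# 6-bit digit tokens from the table above: 011000 -> '1' ... 100000 -> '9'.
DIGITS_6BIT = {0b011000 + i: str(i + 1) for i in range(9)}
DIGITS_6BIT[0b010111] = "0"  # speculative, as noted in the answer

def read_tokens(bits, offset, width, count):
    """Read `count` fixed-width tokens starting at bit position `offset`."""
    return [
        int(bits[offset + i * width: offset + (i + 1) * width], 2)
        for i in range(count)
    ]

def decode_tail(hex_tail, offset=7):
    """Decode four 6-bit digit tokens; offset=7 is a hand-found assumption."""
    tokens = read_tokens(bits_from_hex(hex_tail), offset, 6, 4)
    return "".join(DIGITS_6BIT[t] for t in tokens)

print(decode_tail("C0 CB 2F BC"))  # last four bytes of compressed file #1
print(decode_tail("C0 D3 6D 40"))  # last four bytes of compressed file #2
```

Both tails decode to the final record counts in the uncompressed files (2287 and 3439), which supports the digit table for those positions.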
I am sending a text message with the TCP Sampler in JMeter for POS testing (ISO 8583), as below:
02441200.0..... .......*19000608032XXXXXX663900100000000000000900000000007340322018053017210620180530200067000020000007340320000000042056636SREESVAGENCIE 28SREESVAGENCIESPEDDAPURAMAPININR005CPYBK3101 140 915555577860003POS005NFNET002NP
But when it is received at the server it is supposed to come as:
30 32 34 34 31 32 30 30 f0 30 81 01 08 e0 80 20
00 00 00 00 04 00 00 2a 31 39 30 30 30 36 30 38
30 33 32 58 58 58 58 58 58 36 36 33 39 30 30 31
30 30 30 30 30 30 30 30 30 30 30 30 30 30 39 30
30 30 30 30 30 30 30 30 30 37 33 34 30 33 32 32
30 31 38 30 35 33 30 31 37 32 31 30 36 32 30 31
38 30 35 33 30 32 30 30 30 36 37 30 30 30 30 32
30 30 30 30 30 30 37 33 34 30 33 32 30 30 30 30
30 30 30 30 34 32 30 35 36 36 33 36 53 52 45 45
53 56 41 47 45 4e 43 49 45 20 20 32 38 53 52 45
45 53 56 41 47 45 4e 43 49 45 53 50 45 44 44 41
50 55 52 41 4d 41 50 49 4e 49 4e 52 30 30 35 43
50 59 42 4b 33 31 30 31 20 20 20 20 20 20 20 20
20 31 34 30 20 20 20 20 20 39 31 35 35 35 35 35
37 37 38 36 30 30 30 33 50 4f 53 30 30 35 4e 46
4e 45 54 30 30 32 4e 50
But it is coming as:
30 32 34 34 31 32 30 30 2e 30 2e 2e 2e 2e 2e 20
2e 2e 2e 2e 2e 2e 2e 2a 31 39 30 30 30 36 30 38
30 33 32 58 58 58 58 58 58 36 36 33 39 30 30 31
30 30 30 30 30 30 30 30 30 30 30 30 30 30 39 30
30 30 30 30 30 30 30 30 30 37 33 34 30 33 32 32
30 31 38 30 35 33 30 31 37 32 31 30 36 32 30 31
38 30 35 33 30 32 30 30 30 36 37 30 30 30 30 32
30 30 30 30 30 30 37 33 34 30 33 32 30 30 30 30
30 30 30 30 34 32 30 35 36 36 33 36 53 52 45 45
53 56 41 47 45 4e 43 49 45 20 20 32 38 53 52 45
45 53 56 41 47 45 4e 43 49 45 53 50 45 44 44 41
50 55 52 41 4d 41 50 49 4e 49 4e 52 30 30 35 43
50 59 42 4b 33 31 30 31 20 20 20 20 20 20 20 20
20 31 34 30 20 20 20 20 20 39 31 35 35 35 35 35
37 37 38 36 30 30 30 33 50 4f 53 30 30 35 4e 46
4e 45 54 30 30 32 4e 50
Kindly let me know if there is any specific setting needed in JMeter to send the ISO request.
Some fields in ISO 8583 are fixed length; if you supply less data than a field expects, the remainder is padded. In your example, it seems the message fields are padded with 0x2E. This could also be caused by a difference between the ISO 8583 protocol versions used by the server and the terminal: some field lengths changed in the 1993 version compared to the 1987 version.
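To see exactly which bytes differ, the two dumps can be compared position by position. This is a minimal Python sketch using only the first 24 bytes of each dump from the question (the remaining bytes are identical in both); the variable names are illustrative.

```python
# First 24 bytes the server is supposed to receive (from the question).
expected = bytes.fromhex(
    "30 32 34 34 31 32 30 30 f0 30 81 01 08 e0 80 20 "
    "00 00 00 00 04 00 00 2a"
)
# First 24 bytes the server actually receives (from the question).
received = bytes.fromhex(
    "30 32 34 34 31 32 30 30 2e 30 2e 2e 2e 2e 2e 20 "
    "2e 2e 2e 2e 2e 2e 2e 2a"
)

# Offsets at which the two byte strings disagree.
diff_positions = [
    i for i, (e, r) in enumerate(zip(expected, received)) if e != r
]

for i in diff_positions:
    print(f"offset {i}: expected {expected[i]:02x}, received {received[i]:02x}")
```

In this data every differing byte arrives as 0x2e, and every expected value at those offsets lies outside the printable ASCII range (0x20 to 0x7e).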
I have a csv file /tmp/test.csv with the following content.
VM,Datacenter,Cluster,Host,Folder,OS,VM ID,VM UUID,vCenter UUID
A0F0US014XVM022,"/AMMWDC04_DC/A0F0 <96> AA DR","Red Hat Enterprise Linux 6 (64-bit)",vm-2910,421f2eba-8b60-6166-3b56-f22e3f71eecf,94694731-df3a-4ee6-9962-49df97a6f08d
I want to replace <96> (surrounded by spaces) with - in the CSV file. I tried sed -i -e 's/<96>/-/g' /tmp/test.csv, but this did not work, maybe because of the special symbols involved.
sed version 4.2.1
[root#fmsprdchef001 ~]# grep vm-2910 /tmp/test.csv | hexdump -C
00000000 41 30 46 30 55 53 30 31 34 58 56 4d 30 32 32 2c |A0F0US014XVM022,|
000000e0 41 4d 4d 57 44 43 30 34 5f 44 43 2f 41 30 46 30 |AMMWDC04_DC/A0F0|
000000f0 20 96 20 41 41 20 44 52 22 2c 22 52 65 64 20 48 | . AA DR","Red H|
00000100 61 74 20 45 6e 74 65 72 70 72 69 73 65 20 4c 69 |at Enterprise Li|
00000110 6e 75 78 20 36 20 28 36 34 2d 62 69 74 29 22 2c |nux 6 (64-bit)",|
00000120 76 6d 2d 32 39 31 30 2c 34 32 31 66 32 65 62 61 |vm-2910,421f2eba|
00000130 2d 38 62 36 30 2d 36 31 36 36 2d 33 62 35 36 2d |-8b60-6166-3b56-|
00000140 66 32 32 65 33 66 37 31 65 65 63 66 2c 39 34 36 |f22e3f71eecf,946|
00000150 39 34 37 33 31 2d 64 66 33 61 2d 34 65 65 36 2d |94731-df3a-4ee6-|
00000160 39 34 36 32 2d 34 39 64 66 39 37 61 36 66 30 38 |9462-49df97a6f08|
00000170 64 0a 41 30 46 30 55 53 30 31 34 58 56 4d 30 32 |d.A0F0US014XVM02|
00000180 32 2c 70 6f 77 65 72 65 64 4f 6e 2c 46 61 6c 73 |2,poweredOn,Fals|
00000190 65 2c 56 6d 78 6e 65 74 33 2c 2c 2c 54 72 75 65 |e,Vmxnet3,,,True|
000001a0 2c 54 72 75 65 2c 30 30 3a 35 30 3a 35 36 3a 39 |,True,00:50:56:9|
000001b0 66 3a 30 31 3a 62 38 2c 61 73 73 69 67 6e 65 64 |f:01:b8,assigned|
000001c0 2c 22 31 30 2e 31 30 30 2e 31 2e 31 32 2c 20 66 |,"10.100.1.12, f|
000001d0 65 38 30 3a 3a 32 35 30 3a 35 36 66 66 3a 66 65 |e80::250:56ff:fe|
000001e0 39 66 3a 31 62 38 22 2c 22 41 6d 65 72 69 63 61 |9f:1b8","America|
000001f0 6e 20 41 69 72 6c 69 6e 65 73 3b 20 50 52 44 3b |n Airlines; PRD;|
00000200 20 61 70 70 20 26 20 44 42 32 3b 44 52 22 2c 41 | app & DB2;DR",A|
00000210 4d 4d 57 44 43 30 34 5f 44 43 2c 41 4d 4d 57 44 |MMWDC04_DC,AMMWD|
00000220 43 30 34 43 41 2c 61 6d 6d 77 64 63 30 34 63 75 |C04CA,ammwdc04cu|
00000230 73 74 65 73 78 30 31 2e 69 6d 7a 63 6c 6f 75 64 |stesx01.imzcloud|
00000240 2e 69 62 6d 61 6d 6d 73 61 70 2e 6c 6f 63 61 6c |.ibmammsap.local|
00000250 2c 22 2f 41 4d 4d 57 44 43 30 34 5f 44 43 2f 41 |,"/AMMWDC04_DC/A|
00000260 30 46 30 20 96 20 41 41 20 44 52 22 2c 22 52 65 |0F0 . AA DR","Re|
00000270 64 20 48 61 74 20 45 6e 74 65 72 70 72 69 73 65 |d Hat Enterprise|
00000280 20 4c 69 6e 75 78 20 36 20 28 36 34 2d 62 69 74 | Linux 6 (64-bit|
00000290 29 22 2c 76 6d 2d 32 39 31 30 2c 34 32 31 66 32 |)",vm-2910,421f2|
000002a0 65 62 61 2d 38 62 36 30 2d 36 31 36 36 2d 33 62 |eba-8b60-6166-3b|
000002b0 35 36 2d 66 32 32 65 33 66 37 31 65 65 63 66 2c |56-f22e3f71eecf,|
000002c0 39 34 36 39 34 37 33 31 2d 64 66 33 61 2d 34 65 |94694731-df3a-4e|
000002d0 65 36 2d 39 34 36 32 2d 34 39 64 66 39 37 61 36 |e6-9462-49df97a6|
000002e0 66 30 38 64 0a |f08d.|
000002e5
This is the important part from your hexdump:
000000f0 20 96 20 41 41 20 44 52 22 2c 22 52 65 64 20 48 | . AA DR","Red H|
I suggest:
sed -i 's/ \x96 /-/' file
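The same byte-level replacement can be sketched in Python, which sidesteps locale and encoding questions by working on raw bytes. This is a minimal sketch mirroring the sed command above, so the spaces around the byte are consumed too. Unlike that sed command, which has no g flag, it replaces every occurrence. The function name is hypothetical.

```python
def replace_byte_96(data: bytes, replacement: bytes = b"-") -> bytes:
    """Replace every (space, byte 0x96, space) sequence with `replacement`."""
    return data.replace(b" \x96 ", replacement)

# Sample field taken from the hexdump in the question.
line = b'"/AMMWDC04_DC/A0F0 \x96 AA DR"'
print(replace_byte_96(line))
```

To keep the surrounding spaces, pass replacement=b" - " instead of relying on the default.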
The following sed command may help:
sed 's/\([^<]*\)\(<.*>\)\(.*\)/\1-\3/' Input_file
If you want to save the output back into Input_file, add the -i option to the sed command above.
EDIT: If your Input_file contains control characters such as carriage returns, use the following instead:
sed 's/\r//g;s/\([^<]*\)\(<.*>\)\(.*\)/\1-\3/' Input_file
I was trying to assist with a (deleted) question, here at SO, about how to define a Hive external table over data generated by teragen.
According to the comments in the teragen code, each 100 bytes of data (= one row) should end with \r \n; however, it seems that each row ends with 4 characters with hex values cc dd ee ff.
The full demo is down below.
Any thoughts?
Thanks
/**
 * Generate the official terasort input data set.
 * The user specifies the number of rows and the output directory and this
 * class runs a map/reduce program to generate the data.
 * The format of the data is:
 *
 * (10 bytes key) (10 bytes rowid) (78 bytes filler) \r \n
 * The keys are random characters from the set ' ' .. '~'.
 * The rowid is the right justified row id as a int.
 * The filler consists of 7 runs of 10 characters from 'A' to 'Z'.
 *
 */
https://github.com/facebookarchive/hadoop-20/blob/master/src/examples/org/apache/hadoop/examples/terasort/TeraGen.java
Using teragen to generate 7 records
hadoop jar /usr/jars/hadoop-examples.jar teragen 7 /user/hive/warehouse/teragen
As expected, we get files with total data volume of 700 bytes
hdfs dfs -ls /user/hive/warehouse/teragen
Found 3 items
-rw-r--r-- 1 cloudera supergroup 0 2017-03-03 22:38 /user/hive/warehouse/teragen/_SUCCESS
-rw-r--r-- 1 cloudera supergroup 400 2017-03-03 22:38 /user/hive/warehouse/teragen/part-m-00000
-rw-r--r-- 1 cloudera supergroup 300 2017-03-03 22:38 /user/hive/warehouse/teragen/part-m-00001
Moving the files to local directory and checking the HEX values.
hdfs dfs -get /user/hive/warehouse/teragen/part-m-00001
od -v -Anone -w20 -tx1 part-m-00001
At this point I was expecting to see 0d 0a (\r\n) as the last 2 characters of each 100 bytes, but instead I see ee ff.
There is no newline at the end of the "rows".
5c 90 ab 38 ae 52 89 62 15 d7 00 11 30 30 30 30 30 30 30 30
30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30
30 30 30 34 88 99 aa bb 41 41 41 41 42 42 42 42 42 42 42 42
32 32 32 32 34 34 34 34 34 34 34 34 39 39 39 39 35 35 35 35
42 42 42 42 31 31 31 31 38 38 38 38 44 44 44 44 cc dd ee ff <--
72 dc 0c a5 1e 33 3f 32 4b 7a 00 11 30 30 30 30 30 30 30 30
30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30
30 30 30 35 88 99 aa bb 38 38 38 38 33 33 33 33 42 42 42 42
38 38 38 38 38 38 38 38 34 34 34 34 37 37 37 37 32 32 32 32
37 37 37 37 39 39 39 39 30 30 30 30 32 32 32 32 cc dd ee ff <--
10 43 1a f6 a0 d8 47 b8 c5 5f 00 11 30 30 30 30 30 30 30 30
30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30
30 30 30 36 88 99 aa bb 39 39 39 39 37 37 37 37 34 34 34 34
41 41 41 41 37 37 37 37 45 45 45 45 44 44 44 44 41 41 41 41
41 41 41 41 39 39 39 39 38 38 38 38 42 42 42 42 cc dd ee ff <--
I'm not sure that the output of your teragen run corresponds to the TeraGen source you reference in your link. If you open the TeraGen code from another source, you will see:
Generate the official GraySort input data set. The user specifies the number of rows and the output directory and this class runs a map/reduce program to generate the data. The format of the data is:
(10 bytes key) (constant 2 bytes) (32 bytes rowid) (constant 4 bytes) (48 bytes filler) (constant 4 bytes)
The rowid is the right justified row id as a hex number.
Following this description, I compared it with your first record:
5c 90 ab 38 ae 52 89 62 15 d7 - 10 bytes key
00 11 - constant 2 bytes
30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 34 - 32 bytes rowid
88 99 aa bb - constant 4 bytes
41 41 41 41 42 42 42 42 42 42 42 42 32 32 32 32 34 34 34 34 34 34 34 34 39 39 39 39 35 35 35 35 42 42 42 42 31 31 31 31 38 38 38 38 44 44 44 44 - 48 bytes filler
cc dd ee ff - constant 4 bytes
So it is not a newline, just a constant 4 bytes that the generator produces at the end of every record.
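The field breakdown above can be checked mechanically. This is a minimal Python sketch that slices the first 100-byte record from the question according to the GraySort layout quoted in this answer; the field names in LAYOUT are made up for illustration, and it assumes Python 3.7+ so that bytes.fromhex skips the newlines.

```python
# First record from the od output in the question (100 bytes).
RECORD_HEX = """
5c 90 ab 38 ae 52 89 62 15 d7 00 11 30 30 30 30 30 30 30 30
30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30
30 30 30 34 88 99 aa bb 41 41 41 41 42 42 42 42 42 42 42 42
32 32 32 32 34 34 34 34 34 34 34 34 39 39 39 39 35 35 35 35
42 42 42 42 31 31 31 31 38 38 38 38 44 44 44 44 cc dd ee ff
"""

# (field name, length in bytes), following the layout quoted above.
LAYOUT = [
    ("key", 10), ("const2", 2), ("rowid", 32),
    ("const4a", 4), ("filler", 48), ("const4b", 4),
]

def split_record(record: bytes) -> dict:
    """Slice one record into named fields according to LAYOUT."""
    if len(record) != sum(length for _, length in LAYOUT):
        raise ValueError("record length does not match the layout")
    fields, pos = {}, 0
    for name, length in LAYOUT:
        fields[name] = record[pos:pos + length]
        pos += length
    return fields

fields = split_record(bytes.fromhex(RECORD_HEX))
print(int(fields["rowid"].decode("ascii"), 16))  # row id, stored as hex text
print(fields["filler"].decode("ascii"))          # 48 filler characters
print(fields["const4b"].hex())                   # trailing constant
```

The field lengths add up to 100 bytes, and the last field is the cc dd ee ff constant rather than a line ending.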
Consider the following script.
#!/bin/bash
echo {00..99}
n=99
echo {00..$n}
The output of this script is:
00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
{00..99}
The desired output is:
00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
One solution which produces the desired output is
eval echo {00..$n}
Unfortunately, this solution uses eval which I'd prefer to avoid if possible.
Does anyone know of a way to obtain the desired result using brace expansion but not eval?
From the bash manual:
The order of expansions is: brace expansion; tilde expansion, parameter and variable expansion, arithmetic expansion, and command substitution (done in a left-to-right fashion); word splitting; and pathname expansion.
Given that variable expansion comes after brace expansion, and that there is no way to induce a different order of operations without using eval, I would have to conclude that no, there is no way to avoid using eval.
Does anyone know of a way to obtain the desired result without eval?
You can use the seq command:
seq -w -s ' ' 0 $n
Test:
sat $ seq -w -s " " 0 $n
00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
Not sure if this meets your requirements, as it doesn't use braces, but (with GNU seq at least) the following produces the desired output:
$ n=99
$ seq -f%02.0f -s' ' 00 "$n"
00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
The "-f" option produces the zero-padding, and the "-s" option sets the separator to a space rather than a newline.
I am trying to solve Project Euler #11, but I am running into an error when I try to create a function that calculates the product of every 4 numbers in a column. I am getting this error:
Project11.rb:59:in `sumvertical': undefined method `[]' for nil:NilClass (NoMethodError)
I feel like there is something I am easily overlooking here. I appreciate the help!
#project #11 http://projecteuler.net/problem=11
grid="08 02 22 97 38 15 00 40 00 75 04 05 07 78 52 12 50 77 91 08
49 49 99 40 17 81 18 57 60 87 17 40 98 43 69 48 04 56 62 00
81 49 31 73 55 79 14 29 93 71 40 67 53 88 30 03 49 13 36 65
52 70 95 23 04 60 11 42 69 24 68 56 01 32 56 71 37 02 36 91
22 31 16 71 51 67 63 89 41 92 36 54 22 40 40 28 66 33 13 80
24 47 32 60 99 03 45 02 44 75 33 53 78 36 84 20 35 17 12 50
32 98 81 28 64 23 67 10 26 38 40 67 59 54 70 66 18 38 64 70
67 26 20 68 02 62 12 20 95 63 94 39 63 08 40 91 66 49 94 21
24 55 58 05 66 73 99 26 97 17 78 78 96 83 14 88 34 89 63 72
21 36 23 09 75 00 76 44 20 45 35 14 00 61 33 97 34 31 33 95
78 17 53 28 22 75 31 67 15 94 03 80 04 62 16 14 09 53 56 92
16 39 05 42 96 35 31 47 55 58 88 24 00 17 54 24 36 29 85 57
86 56 00 48 35 71 89 07 05 44 44 37 44 60 21 58 51 54 17 58
19 80 81 68 05 94 47 69 28 73 92 13 86 52 17 77 04 89 55 40
04 52 08 83 97 35 99 16 07 97 57 32 16 26 26 79 33 27 98 66
88 36 68 87 57 62 20 72 03 46 33 67 46 55 12 32 63 93 53 69
04 42 16 73 38 25 39 11 24 94 72 18 08 46 29 32 40 62 76 36
20 69 36 41 72 30 23 88 34 62 99 69 82 67 59 85 74 04 36 16
20 73 35 29 78 31 90 01 74 31 49 71 48 86 81 16 23 57 05 54
01 70 54 71 83 51 54 69 16 92 33 48 61 43 52 01 89 19 67 48"
grid=grid.split()
grid=grid.collect {|s| s.to_i}
multiarray=[]
i = 0
e = 19
until e > 400
  multiarray << grid[i..e]
  i+= 20
  e+= 20
end
def sumhorizontal(x) #checks sum of all horizontal 4 elements
  sum = 0
  x.each {|a|
    i=0
    e=3
    while e < a.length
      if a[i..e].inject(:*) > sum
        sum = a[i..e].inject(:*)
        i += 1
        e += 1
      else
        i += 1
        e += 1
      end
    end
  }
  return sum
end
def sumvertical(x)
  sum = 0
  i=0
  e=0
  while e < x.length #Will break once the end point is longer than the length of an array
    until i > 20 #Checks the first column
      if x[i][e]*x[i+1][e]*x[i+2][e]*x[i+3][e] > sum #Error is here
        sum = x[i][e]*x[i+1][e]*x[i+2][e]*x[i+3][e]
        i += 1
      else
        i += 1
      end
    end
    e += 1 #once you are out of the until statement, it increases e by 1 to check the next column
    i = 0 #resets i so it can go back to the zero
  end
  return sum
end
print sumvertical(multiarray)
The grid has 20 rows, but your loop tries to reach all the way to a 24th row. It goes through 21 iterations (i starts at 0 and the loop runs until i reaches 21), and each iteration reaches 3 beyond the current value of i (when you call x[i+3]). Your code breaks when i is 17, because x[i+3][e] tries to index into the 21st row of x: i+3 is 20, but the highest available index is 19. So x[20] returns nil, and then the [] method is called on nil, which generates your error.
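The bounds issue is easier to see with the loop limits written out. Here is a minimal sketch of the idea in Python (not Ruby, and the function name is hypothetical): the starting row stops at len(grid) - 4, so the largest index ever touched is len(grid) - 1.

```python
def max_vertical_product(grid, run=4):
    """Largest product of `run` vertically adjacent numbers in a rectangular grid."""
    best = 0
    rows, cols = len(grid), len(grid[0])
    for col in range(cols):
        # The last valid starting row is rows - run, so that
        # row + run - 1 never exceeds the last index, rows - 1.
        for row in range(rows - run + 1):
            product = 1
            for k in range(run):
                product *= grid[row + k][col]
            best = max(best, product)
    return best

# Small 5x2 grid for illustration; the real puzzle grid is 20x20.
small = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
print(max_vertical_product(small))
```

With a 20-row grid the inner loop runs row from 0 to 16, which is the range the original until loop should have been limited to.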
Also, Ruby's Array class has a transpose method that you can call on your array. If you use it, you just need one method (sumhorizontal). You can get the column products with sumhorizontal(multiarray.transpose).
One more thing... it looks like you're coming from a procedural language. Ruby has an extensive standard library and coding constructs that can save you a lot of time and keystrokes. There is typically no need to iterate with while loops and index variables in Ruby. sumhorizontal, for instance, can be written like this (though it should really be called producthorizontal if you're trying to solve Project Euler #11):
def sumhorizontal(x)
  # each_cons(4) yields every overlapping run of 4 adjacent elements
  x.map { |r| r.each_cons(4).map { |s| s.reduce(:*) }.max }.max
end
Good luck with the rest of your Ruby learning journey!