I'm studying the binary file structure of JVM class files.
My current toolbox consists of $ xxd <classfile> and $ javap -v <classfile>.
Sample outputs of these two tools are as follows:
$ xxd com/example/mycode/MyTest.class
00000000: cafe babe 0000 003d 001d 0a00 0200 0307 .......=........
00000010: 0004 0c00 0500 0601 0010 6a61 7661 2f6c ..........java/l
00000020: 616e 672f 4f62 6a65 6374 0100 063c 696e ang/Object...<in
00000030: 6974 3e01 0003 2829 5609 0008 0009 0700 it>...()V.......
...
000001a0: 0000 000a 0002 0000 0005 0008 0006 0001 ................
000001b0: 001b 0000 0002 001c ........
and
$ javap.exe -v com/example/mycode/MyTest.class
Classfile /<PathTo>/MyTest.class
Last modified 2022/11/01; size 440 bytes
...
interfaces: 0, fields: 0, methods: 2, attributes: 1
Constant pool:
#1 = Methodref #2.#3 // java/lang/Object."<init>":()V
#2 = Class #4 // java/lang/Object
#3 = NameAndType #5:#6 // "<init>":()V
#4 = Utf8 java/lang/Object
#5 = Utf8 <init>
#6 = Utf8 ()V
...
#27 = Utf8 SourceFile
#28 = Utf8 MyTest.java
{
public com.example.mycode.MyTest();
descriptor: ()V
flags: (0x0001) ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 3: 0
public static void main(java.lang.String[]);
...
}
SourceFile: "MyTest.java"
But from these two outputs, it is difficult to see which part of one output
corresponds to which part of the other.
It is hard to analyze the hex dump by comparing it with the disassembled output.
In this particular case I could manually assign tags by referring to the
specification,
but it was hard work even though the sample file is a trivial hello world.
For large files in general, this manual approach is impractical.
Edit: made the question proper
So what I want to do is the following:
Syntax-highlight the xxd dump output according to the classfile structure,
so that I can easily see which part is, for example, the constant pool,
or the method info and attributes, and easily compare it
with the javap output.
More ambitiously, it would be useful to view the javap and xxd outputs
side by side, so that selecting text on one side
highlights the corresponding text on the other side.
So, my question:
Is there any way, or any other tool, to understand
the xxd hex dump output in terms of the javap output?
In particular, I'd like to see how each group of hex bytes corresponds to
each entry in the javap output, one-to-one.
My current idea is to color-highlight the hex dump,
possibly like the following image.
Is there any software that does this?
Maybe I need to do some coding,
something like writing a parser for .class files.
If so, what is the most efficient, lowest-effort way
to obtain a highlighted hex dump with format tag annotations
according to the .class file specification, as shown in the image below?
Thank you for reading.
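For the "do some coding" route, here is a minimal sketch in Python (my choice of language, not something the tools above provide). It walks the start of a class file and records the byte range of each constant-pool entry, so each range can be matched against the #N lines of javap -v. The names CP_ENTRIES and annotate_constant_pool are hypothetical, and the sketch stops after the constant pool; fields, methods and attributes would need the same treatment.

```python
import struct

# Payload sizes in bytes (excluding the 1-byte tag) of the fixed-size
# constant-pool entries, following the JVM class file format.
# Utf8 (tag 1) is variable-length and handled separately below.
CP_ENTRIES = {
    3: ("Integer", 4), 4: ("Float", 4), 5: ("Long", 8), 6: ("Double", 8),
    7: ("Class", 2), 8: ("String", 2), 9: ("Fieldref", 4),
    10: ("Methodref", 4), 11: ("InterfaceMethodref", 4),
    12: ("NameAndType", 4), 15: ("MethodHandle", 3), 16: ("MethodType", 2),
    17: ("Dynamic", 4), 18: ("InvokeDynamic", 4),
    19: ("Module", 2), 20: ("Package", 2),
}


def annotate_constant_pool(data):
    """Return (offset, length, label) spans for the header and constant pool."""
    spans = [
        (0, 4, "magic"),
        (4, 2, "minor_version"),
        (6, 2, "major_version"),
        (8, 2, "constant_pool_count"),
    ]
    (count,) = struct.unpack_from(">H", data, 8)
    pos, index = 10, 1  # constant-pool indices start at 1
    while index < count:
        tag = data[pos]
        if tag == 1:  # Utf8: tag, u2 length, then that many bytes
            (length,) = struct.unpack_from(">H", data, pos + 1)
            name, size = "Utf8", 3 + length
        else:
            name, payload = CP_ENTRIES[tag]
            size = 1 + payload
        spans.append((pos, size, f"#{index} {name}"))
        pos += size
        # Long and Double entries occupy two constant-pool indices.
        index += 2 if tag in (5, 6) else 1
    return spans


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as f:
        for offset, length, label in annotate_constant_pool(f.read()):
            print(f"{offset:08x} +{length:<4} {label}")
```

Each span gives an offset and length into the file, which is enough to drive coloring of the matching bytes in an xxd-style dump.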
I have this model:
poss_in = layers.Input((1,))
poss_lr = layers.Dense(8, activation='relu')(poss_in)
hist_in = layers.Input((100,))
hist_lr = layers.Reshape((100, 1))(hist_in)
hist_lr = layers.LSTM(32)(hist_lr)
hist_lr = layers.Dense(32, activation='relu')(hist_lr)
sent_in = layers.Input((10,))
sent_lr = layers.Reshape((10, 1))(sent_in)
sent_lr = layers.Conv1D(4, 3)(sent_lr)
sent_lr = layers.GRU(4)(sent_lr)
root_lr = layers.concatenate([poss_lr, hist_lr, sent_lr])
root_lr = layers.Reshape((44, 1))(root_lr)
root_lr = Attention(16)(root_lr)
root_lr = layers.Dense(16)(root_lr)
root_lr = layers.Dense(1)(root_lr)
model = Model([poss_in, hist_in, sent_in], root_lr)
and I'm trying to create a DQN agent with:
dqn = agents.DQNAgent(
model=model,
memory=memory.SequentialMemory(limit=50000, window_length=1),
policy=policy.BoltzmannQPolicy(),
nb_actions=1,
nb_steps_warmup=64,
target_model_update=1e-2
)
dqn.compile('Adam', metrics=['mae'])
but I receive this error:
/usr/local/lib/python3.7/dist-packages/keras/optimizer_v2/adam.py:105: UserWarning: The `lr` argument is deprecated, use `learning_rate` instead.
super(Adam, self).__init__(name, **kwargs)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-18-3d71fb800af2> in <module>
7 target_model_update=1e-2
8 )
----> 9 dqn.compile(opt.Adam(lr=1e-3), metrics=['mae'])
17 frames
/usr/local/lib/python3.7/dist-packages/rl/agents/dqn.py in compile(self, optimizer, metrics)
165
166 # We never train the target model, hence we can set the optimizer and loss arbitrarily.
--> 167 self.target_model = clone_model(self.model, self.custom_model_objects)
168 self.target_model.compile(optimizer='sgd', loss='mse')
169 self.model.compile(optimizer='sgd', loss='mse')
/usr/local/lib/python3.7/dist-packages/rl/util.py in clone_model(model, custom_objects)
13 'config': model.get_config(),
14 }
---> 15 clone = model_from_config(config, custom_objects=custom_objects)
16 clone.set_weights(model.get_weights())
17 return clone
/usr/local/lib/python3.7/dist-packages/keras/saving/model_config.py in model_from_config(config, custom_objects)
50 '`Sequential.from_config(config)`?')
51 from keras.layers import deserialize # pylint: disable=g-import-not-at-top
---> 52 return deserialize(config, custom_objects=custom_objects)
53
54
/usr/local/lib/python3.7/dist-packages/keras/layers/serialization.py in deserialize(config, custom_objects)
209 module_objects=LOCAL.ALL_OBJECTS,
210 custom_objects=custom_objects,
--> 211 printable_module_name='layer')
212
213
/usr/local/lib/python3.7/dist-packages/keras/utils/generic_utils.py in deserialize_keras_object(identifier, module_objects, custom_objects, printable_module_name)
681 custom_objects=dict(
682 list(_GLOBAL_CUSTOM_OBJECTS.items()) +
--> 683 list(custom_objects.items())))
684 else:
685 with CustomObjectScope(custom_objects):
/usr/local/lib/python3.7/dist-packages/keras/engine/functional.py in from_config(cls, config, custom_objects)
707 'name', 'layers', 'input_layers', 'output_layers']):
708 input_tensors, output_tensors, created_layers = reconstruct_from_config(
--> 709 config, custom_objects)
710 model = cls(
711 inputs=input_tensors,
/usr/local/lib/python3.7/dist-packages/keras/engine/functional.py in reconstruct_from_config(config, custom_objects, created_layers)
1324 # First, we create all layers and enqueue nodes to be processed
1325 for layer_data in config['layers']:
-> 1326 process_layer(layer_data)
1327 # Then we process nodes in order of layer depth.
1328 # Nodes that cannot yet be processed (if the inbound node
/usr/local/lib/python3.7/dist-packages/keras/engine/functional.py in process_layer(layer_data)
1306 from keras.layers import deserialize as deserialize_layer # pylint: disable=g-import-not-at-top
1307
-> 1308 layer = deserialize_layer(layer_data, custom_objects=custom_objects)
1309 created_layers[layer_name] = layer
1310
/usr/local/lib/python3.7/dist-packages/keras/layers/serialization.py in deserialize(config, custom_objects)
209 module_objects=LOCAL.ALL_OBJECTS,
210 custom_objects=custom_objects,
--> 211 printable_module_name='layer')
212
213
/usr/local/lib/python3.7/dist-packages/keras/utils/generic_utils.py in deserialize_keras_object(identifier, module_objects, custom_objects, printable_module_name)
684 else:
685 with CustomObjectScope(custom_objects):
--> 686 deserialized_obj = cls.from_config(cls_config)
687 else:
688 # Then `cls` may be a function returning a class.
/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer_v1.py in from_config(cls, config)
515 A layer instance.
516 """
--> 517 return cls(**config)
518
519 def compute_output_shape(self, input_shape):
/usr/local/lib/python3.7/dist-packages/keras/layers/dense_attention.py in __init__(self, use_scale, **kwargs)
321
322 def __init__(self, use_scale=False, **kwargs):
--> 323 super(Attention, self).__init__(**kwargs)
324 self.use_scale = use_scale
325
/usr/local/lib/python3.7/dist-packages/keras/layers/dense_attention.py in __init__(self, causal, dropout, **kwargs)
70
71 def __init__(self, causal=False, dropout=0.0, **kwargs):
---> 72 super(BaseDenseAttention, self).__init__(**kwargs)
73 self.causal = causal
74 self.dropout = dropout
/usr/local/lib/python3.7/dist-packages/tensorflow/python/training/tracking/base.py in _method_wrapper(self, *args, **kwargs)
627 self._self_setattr_tracking = False # pylint: disable=protected-access
628 try:
--> 629 result = method(self, *args, **kwargs)
630 finally:
631 self._self_setattr_tracking = previous_value # pylint: disable=protected-access
/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py in __init__(self, seed, force_generator, **kwargs)
3436 **kwargs: other keyword arguments that will be passed to the parent class
3437 """
-> 3438 super().__init__(**kwargs)
3439 self._random_generator = backend.RandomGenerator(
3440 seed, force_generator=force_generator)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/training/tracking/base.py in _method_wrapper(self, *args, **kwargs)
627 self._self_setattr_tracking = False # pylint: disable=protected-access
628 try:
--> 629 result = method(self, *args, **kwargs)
630 finally:
631 self._self_setattr_tracking = previous_value # pylint: disable=protected-access
/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer_v1.py in __init__(self, trainable, name, dtype, dynamic, **kwargs)
138 }
139 # Validate optional keyword arguments.
--> 140 generic_utils.validate_kwargs(kwargs, allowed_kwargs)
141
142 # Mutable properties
/usr/local/lib/python3.7/dist-packages/keras/utils/generic_utils.py in validate_kwargs(kwargs, allowed_kwargs, error_message)
1172 for kwarg in kwargs:
1173 if kwarg not in allowed_kwargs:
-> 1174 raise TypeError(error_message, kwarg)
1175
1176
TypeError: ('Keyword argument not understood:', 'units')
I have tried replacing the DQN agent with SARSA and DDPG agents, but they all generated the same error.
I searched the internet for a while and asked on r/tensorflow, but I haven't resolved anything yet.
For additional information, I'm using Google Colab.
Thanks for every reply!
UPDATE:
I tried to simplify the model to check whether the problem was in a particular layer, so I created this model:
poss_in = layers.Input((1,))
poss_lr = layers.Dense(1)(poss_in)
hist_in = layers.Input((100,))
hist_lr = layers.Dense(1)(hist_in)
sent_in = layers.Input((10,))
sent_lr = layers.Dense(1)(sent_in)
root_lr = layers.concatenate([poss_lr, hist_lr, sent_lr])
root_lr = layers.Dense(1)(root_lr)
model = Model([poss_in, hist_in, sent_in], root_lr)
Using this model the DQN agent was compiled with no errors.
The data set has 1511 observations. I used the first 1400 values to fit an ARIMA model of order (1,1,9), keeping the rest for predictions. But when I look at the predictions, apart from the first 16 or so values, all the remaining values are the same. Here's what I tried:
model2=ARIMA(tstrain,order=(1,1,9))
fitted_model2=model2.fit()
And for prediction:
start=len(tstrain)
end=len(tstrain)+len(tstest)-1
predictions=fitted_model2.predict(start,end,typ='levels')
Here tstrain and tstest are the train and test sets.
predictions.head(30)
1400 214.097742
1401 214.689674
1402 214.820804
1403 215.621131
1404 215.244980
1405 215.349230
1406 215.392444
1407 215.022312
1408 215.020736
1409 215.021384
1410 215.021118
1411 215.021227
1412 215.021182
1413 215.021201
1414 215.021193
1415 215.021196
1416 215.021195
1417 215.021195
1418 215.021195
1419 215.021195
1420 215.021195
1421 215.021195
1422 215.021195
1423 215.021195
1424 215.021195
1425 215.021195
1426 215.021195
1427 215.021195
1428 215.021195
1429 215.021195
Please help me out here. What am I missing?
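One possible reading of this output, sketched under assumptions rather than checked against the actual fit: in an ARIMA(1,1,q) model without a drift term, the MA terms contribute only to the first q forecast steps. After that the forecast differences shrink geometrically by the AR coefficient, so the level forecast settles at a constant. The function name and every parameter value below are hypothetical, not taken from the fitted model.

```python
def arima_11q_forecast(last_level, last_diff, phi, thetas, last_resids, steps):
    """Point forecasts for an ARIMA(1,1,q) model with no drift (illustrative only).

    last_resids[k] is the in-sample residual at time T-k (most recent first)
    and must have at least len(thetas) entries.
    """
    q = len(thetas)
    level, diff = last_level, last_diff
    forecasts = []
    for h in range(1, steps + 1):
        # Only residuals up to time T are known; future ones have expectation 0,
        # so the MA part is an empty sum once h > q.
        ma = sum(thetas[j - 1] * last_resids[j - h] for j in range(h, q + 1))
        diff = phi * diff + ma  # forecast of the differenced series
        level += diff           # integrate back to the original level
        forecasts.append(level)
    return forecasts


if __name__ == "__main__":
    # Hypothetical coefficients and residuals chosen only for illustration.
    resids = [0.5, -0.3, 0.8, -0.1, 0.4, -0.6, 0.2, 0.7, -0.5]
    for step, value in enumerate(
        arima_11q_forecast(214.0, 0.3, -0.4, [0.1] * 9, resids, 30), start=1
    ):
        print(step, round(value, 6))
```

With these made-up numbers the first nine forecasts move around and the later ones flatten out, which resembles the shape of the predictions above. If that is what is happening, flat long-horizon forecasts are the expected behavior of such a model rather than a bug.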
I am trying to build a temperature control application for a 68000 processor. I am currently using GCC 8.2.0. I am compiling with the -msoft-float flag. However, the floating point library routines appear to be broken. Example:
000174f4 <__ltdf2>:
174f4: 4e56 0000 linkw %fp,#0
174f8: 4878 0001 pea 1 <ADD>
174fc: 2f2e 0014 movel %fp@(20),%sp@-
17500: 2f2e 0010 movel %fp@(16),%sp@-
17504: 2f2e 000c movel %fp@(12),%sp@-
17508: 2f2e 0008 movel %fp@(8),%sp@-
1750c: 61ff bsrs 1750d <__ltdf2+0x19>
1750e: ffff .short 0xffff
17510: fd94 .short 0xfd94
17512: 4e5e unlk %fp
17514: 4e75 rts
17516: 4e71 nop
Can someone explain why this code is generated or what is happening here? No way will a 68000 branch to an odd address.
UPDATE
I've been digging into this, and the problem appears to be injected during linking. Dumping the code for this function from libgcc.a shows the following:
00000000 <__ltdf2>:
0: 4e56 0000 linkw %fp,#0
4: 4878 0001 pea 1 <__ltdf2+0x1>
8: 2f2e 0014 movel %fp@(20),%sp@-
c: 2f2e 0010 movel %fp@(16),%sp@-
10: 2f2e 000c movel %fp@(12),%sp@-
14: 2f2e 0008 movel %fp@(8),%sp@-
18: 61ff 0000 0000 bsrl 1a <__ltdf2+0x1a>
1e: 4e5e unlk %fp
20: 4e75 rts
So the linker must be trying to fill in the branch offset and messing up. Since the source for this function is a string of macros, I'm not sure where it really wanted to branch to.
On the MC68020 and later, bra, bsr and bcc are followed by a 32-bit displacement if the 8-bit displacement field is 0xff. In your case the long form is not even necessary, since 0xfffffd94 would fit into 16 bits.
If you don't have a 68020 or later, make sure that your code, and your soft-float library, are compiled for the 68000.
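To make the two readings of those bytes concrete, here is a small Python sketch (the function name bsr_target is hypothetical) that decodes the branch at 0x1750c both ways. It assumes the usual rule that the displacement is relative to the instruction address plus two.

```python
def bsr_target(insn_addr, code, is_68020):
    """Decode the target of a bra/bsr/bcc given its address and raw bytes."""
    disp8 = code[1]
    pc = insn_addr + 2  # displacements are relative to instruction address + 2
    if disp8 == 0x00:
        # 16-bit displacement in the following extension word
        disp = int.from_bytes(code[2:4], "big", signed=True)
    elif disp8 == 0xFF and is_68020:
        # 68020 and later: 32-bit displacement in the following two words
        disp = int.from_bytes(code[2:6], "big", signed=True)
    else:
        # Plain 68000: the low byte is a signed 8-bit displacement
        disp = disp8 - 0x100 if disp8 >= 0x80 else disp8
    return pc + disp


code = bytes.fromhex("61fffffffd94")
print(hex(bsr_target(0x1750C, code, is_68020=True)))   # 0x172a2
print(hex(bsr_target(0x1750C, code, is_68020=False)))  # 0x1750d
```

Read as 68020 code, the branch goes to the even address 0x172a2. Read as 68000 code, 0xff is a displacement of -1, which gives the odd address 0x1750d shown in the disassembly.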
I download a CSV file and save it with this code:
body = HTTPoison.get!(url).body
|> String.replace("Ã¼", "ü")
|> String.replace("Ã¶", "ö")
File.write!("/tmp/example.csv", body)
Using String.replace/3 to replace Ã¼ with ü is of course not a good way to do this. HTTPoison tells me that the response has the header {"Content-Type", "csv;charset=utf-8"}.
How can I solve this without String.replace/3?
What you have here is data that was first encoded as UTF-8; then those bytes were treated as latin1 and encoded to UTF-8 again.
A hex dump snippet from the data in that URL shows this:
00007d20: 2c22 222c 2c2c 224f 7269 6769 6e3a 2044 ,"",,,"Origin: D
00007d30: c383 c2bc 7373 656c 646f 7266 222c 224b ....sseldorf","K
00007d40: 6579 776f 7264 733a 204c 6173 7420 4d69 eywords: Last Mi
ü is encoded as <<0xc3, 0x83, 0xc2, 0xbc>> which was probably created like this:
iex(1)> "ü\0"
<<195, 188, 0>>
iex(2)> <<195::utf8, 188::utf8>> == <<0xc3, 0x83, 0xc2, 0xbc>>
true
To reverse this process, you can use a combination of :unicode.characters_to_list and :erlang.list_to_binary.
iex(3)> <<0xc3, 0x83, 0xc2, 0xbc>> |> :unicode.characters_to_list |> :erlang.list_to_binary
"ü"
That URL also includes a BOM at the start:
00000000: efbb bf22 5a75 7069 6422 2c22 5072 6f67 ..."Zupid","Prog
^^^^ ^^
00000010: 7261 6d49 6422 2c22 4d65 7263 6861 6e74 ramId","Merchant
00000020: 5072 6f64 7563 744e 756d 6265 7222 2c22 ProductNumber","
This can be removed using |> Enum.drop(1) after :unicode.characters_to_list.
So the following should work for you:
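HTTPoison.get!(url).body
# decode the outer UTF-8 layer into a list of code points
|> :unicode.characters_to_list()
# drop the BOM (U+FEFF) at the start
|> Enum.drop(1)
# each remaining code point is one byte of the original UTF-8 data
|> :erlang.list_to_binary()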
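For comparison, the same round trip can be sketched in Python. This is only an illustration of the technique, not part of the Elixir solution, and the function name undo_double_utf8 is hypothetical. It assumes the whole body is double-encoded and that every decoded code point fits in latin1.

```python
def undo_double_utf8(data: bytes) -> str:
    """Reverse 'UTF-8 bytes read as latin1 and encoded as UTF-8 again'."""
    # First decode: the outer UTF-8 layer, giving one code point per original byte.
    text = data.decode("utf-8")
    # Drop a leading BOM (U+FEFF) if present; it cannot be encoded as latin1.
    if text.startswith("\ufeff"):
        text = text[1:]
    # Map each code point back to a single byte, then decode the inner UTF-8.
    return text.encode("latin-1").decode("utf-8")


print(undo_double_utf8(b"\xef\xbb\xbfD\xc3\x83\xc2\xbcsseldorf"))  # Düsseldorf
```

As with the Elixir version, this raises an error if the input contains characters above U+00FF after the first decode, which would mean the data is not uniformly double-encoded.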