Jackson XML "Undeclared general entity" caused by custom entity - jackson-dataformat-xml

I'm deserializing a large XML file (not mine) and it contains custom entities defined as:
<!ENTITY math "mathematics">
and elements used this way:
<field>&math;</field>
When I try to deserialize it by:
XmlMapper xmlMapper = new XmlMapper();
ClassLoader classloader = Thread.currentThread().getContextClassLoader();
return xmlMapper.readValue(classloader.getResourceAsStream("file.xml"), MyClass.class);
I get this error:
com.fasterxml.jackson.databind.JsonMappingException: Undeclared general entity "math"
I think it might be a security measure to prevent XML External Entity (XXE) injection.
Is there a way to mark these custom entities as valid? Like create an Enum for them or something?
If not, is there a flag to just parse these as Strings?
Update:
I was able to work around this problem by basically doing a find-replace on the text file. It's quite an ugly solution and if anyone has a better idea, I'm all ears. :)

I know it may be a little late, but just in case someone else is stuck on the same issue:
You have to set a custom XMLResolver as XMLInputFactory's property:
import com.fasterxml.jackson.dataformat.xml.XmlMapper;
import com.ctc.wstx.api.WstxInputProperties;
import javax.xml.stream.XMLResolver;
import javax.xml.stream.XMLStreamException;
var xmlMapper = new XmlMapper();
xmlMapper.getFactory().getXMLInputFactory().setProperty(
    WstxInputProperties.P_UNDECLARED_ENTITY_RESOLVER,
    new XMLResolver() {
        @Override
        public Object resolveEntity(String publicId, String systemId, String baseUri, String ns) throws XMLStreamException {
            // the name of the undeclared entity arrives in the last argument (ns)
            // replace the entity with a string of your choice, e.g.
            switch (ns) {
                case "nbsp":
                    return " ";
                default:
                    return "";
            }
            // a useful tool is org.apache.commons.text.StringEscapeUtils, e.g.
            // return StringEscapeUtils.escapeXml10(StringEscapeUtils.unescapeHtml4('&' + ns + ';'));
        }
    }
);
// then xmlMapper.readValue(...)
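If the entities are declared in the file's own internal DTD subset (as the <!ENTITY math ...> line in the question suggests), another option may be to let the parser expand them itself. The following is a minimal sketch using plain StAX from the JDK, independent of Jackson. The class and method names are made up for illustration, and whether XmlMapper's default input factory has DTD support switched off is an assumption, not something the post confirms. External entities stay disabled, so only declarations inside the document are honoured.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class InternalEntitySketch {

    /** Returns the text of the first <field> element, or null if there is none. */
    static String firstFieldText(String xml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Read the DOCTYPE so that internally declared entities are known to the parser.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.TRUE);
        // Expand entity references into their replacement text.
        factory.setProperty(XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES, Boolean.TRUE);
        // Keep external entities off; this is the part that matters for XXE.
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE);
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "field".equals(reader.getLocalName())) {
                    return reader.getElementText();
                }
            }
            return null;
        } catch (XMLStreamException e) {
            throw new IllegalStateException("XML could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE doc [<!ENTITY math \"mathematics\">]>"
                + "<doc><field>&math;</field></doc>";
        System.out.println(firstFieldText(xml));
    }
}
```

A factory configured like this is ordinary StAX; wiring it into XmlMapper is not shown here. Enabling DTD processing on untrusted input still allows entity-expansion attacks, so the resolver approach above may be the safer choice for files you do not control.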


InvalidProtocolBufferException in Java client when deserializing protobuf data from C++ server

I have a protobuf message like this:
message Update {
Path path = 1; // The path (key) for the update.
Value value = 2 [deprecated=true]; // The value (value) for the update.
TypedValue val = 3; // The explicitly typed update value.
}
// TypedValue is used to encode a value being sent between the client and
// target (originated by either entity).
message TypedValue {
oneof value {
string string_val = 1; // String value.
int64 int_val = 2; // Integer
....
google.protobuf.Any any_val = 9; // protobuf.Any encoded bytes.
....
}
}
On the server side (C++), we are setting this field as follows (LLDP is the outer class and Interfaces is inside that):
openconfig_lldp::Lldp out;
GetLldpProto(&out);
update->mutable_val()->mutable_any_val()->PackFrom(out.interfaces());
On the client side (Java), we are extracting this field like this:
OpenconfigLldp.Lldp.Interfaces interfaces = update.getVal().getAnyVal().unpack(OpenconfigLldp.Lldp.Interfaces.class);
This throws an InvalidProtocolBufferException. When I dump the "update" in my Java client, I see this:
path {
elem {
name: "lldp"
}
elem {
name: "interfaces"
}
}
val {
any_val {
type_url: "type.googleapis.com/openconfig_lldp.Lldp.Interfaces"
value: "\212\207\237\334\v\374\001\022\371\001\262\211\267l\031\342\367\304\260\002\v\n\tEth 1/1/1\242\340\247\230\017\002\b\001\352\316\234\250\017\324\001\262\217\304\272\017/\022-\302\340\317\247\001\'\202\225\377\302\001\b\n\006Neigh1\342\253\214\353\001\v\n\tEth 1/1/1\242\364\301\261\a\002\b\n\262\217\304\272\017/\022-\302\340\317\247\001\'\202\225\377\302\001\b\n\006Neigh2\342\253\214\353\001\v\n\tEth 1/1/2\242\364\301\261\a\002\b\n\262\217\304\272\017/\022-\302\340\317\247\001\'\202\225\377\302\001\b\n\006Neigh3\342\253\214\353\001\v\n\tEth 1/1/3\242\364\301\261\a\002\b\n\262\217\304\272\017/\022-\302\340\317\247\001\'\202\225\377\302\001\b\n\006Neigh4\342\253\214\353\001\v\n\tEth 1/1/4\242\364\301\261\a\002\b\n\212\207\237\334\v\374\001\022\371\001\262\211\267l\031\342\367\304\260\002\v\n\tEth 1/1/2\242\340\247\230\017\002\b\001\352\316\234\250\017\324\001\262\217\304\272\017/\022-\302\340\317\247\001\'\202\225\377\302\001\b\n\006Neigh1\342\253\214\353\001\v\n\tEth 1/1/1\242\364\301\261\a\002\b\n\262\217\304\272\017/\022-\302\340\317\247\001\'\202\225\377\302\001\b\n\006Neigh2\342\253\214\353\001\v\n\tEth 1/1/2\242\364\301\261\a\002\b\n\262\217\304\272\017/\022-\302\340\317\247\001\'\202\225\377\302\001\b\n\006Neigh3\342\253\214\353\001\v\n\tEth 1/1/3\242\364\301\261\a\002\b\n\262\217\304\272\017/\022-\302\340\317\247\001\'\202\225\377\302\001\b\n\006Neigh4\342\253\214\353\001\v\n\tEth 1/1/4\242\364\301\261\a\002\b\n\212\207\237\334\v\374\001\022\371\001\262\211\267l\031\342\367\304\260\002\v\n\tEth 1/1/3\242\340\247\230\017\002\b\001\352\316\234\250\017\324\001\262\217\304\272\017/\022-\302\340\317\247\001\'\202\225\377\302\001\b\n\006Neigh1\342\253\214\353\001\v\n\tEth 1/1/1\242\364\301\261\a\002\b\n\262\217\304\272\017/\022-\302\340\317\247\001\'\202\225\377\302\001\b\n\006Neigh2\342\253\214\353\001\v\n\tEth 1/1/2\242\364\301\261\a\002\b\n\262\217\304\272\017/\022-\302\340\317\247\001\'\202\225\377\302\001\b\n\006Neigh3\342\253\214\353\001\v\n\tEth 
1/1/3\242\364\301\261\a\002\b\n\262\217\304\272\017/\022-\302\340\317\247\001\'\202\225\377\302\001\b\n\006Neigh4\342\253\214\353\001\v\n\tEth 1/1/4\242\364\301\261\a\002\b\n\212\207\237\334\v\374\001\022\371\001\262\211\267l\031\342\367\304\260\002\v\n\tEth 1/1/4\242\340\247\230\017\002\b\001\352\316\234\250\017\324\001\262\217\304\272\017/\022-\302\340\317\247\001\'\202\225\377\302\001\b\n\006Neigh1\342\253\214\353\001\v\n\tEth 1/1/1\242\364\301\261\a\002\b\n\262\217\304\272\017/\022-\302\340\317\247\001\'\202\225\377\302\001\b\n\006Neigh2\342\253\214\353\001\v\n\tEth 1/1/2\242\364\301\261\a\002\b\n\262\217\304\272\017/\022-\302\340\317\247\001\'\202\225\377\302\001\b\n\006Neigh3\342\253\214\353\001\v\n\tEth 1/1/3\242\364\301\261\a\002\b\n\262\217\304\272\017/\022-\302\340\317\247\001\'\202\225\377\302\001\b\n\006Neigh4\342\253\214\353\001\v\n\tEth 1/1/4\242\364\301\261\a\002\b\n"
}
}
The type_url seems correct to me. What am I doing wrong here?
Thanks for your time.
EDIT #1:
I looked at the exception string. It is "Type of the Any message does not match the given class."
The same proto file is used for C++ and Java, but I see "openconfig_lldp.Lldp.Interfaces" in C++, whereas it is "OpenconfigLldp.Lldp.Interfaces" in Java. I need to find out why.
EDIT #2:
The same .proto file is used. In this case, it is:
openconfig_lldp.proto
---------------------
syntax = "proto3";
package openconfig.openconfig_lldp;
message Lldp {
message Config {
....
....
}
....
....
}
In case of Java, I see the parent class as OpenconfigLldp in a package called openconfig_lldp.
package openconfig.openconfig_lldp;
public final class OpenconfigLldp {
private OpenconfigLldp() {}
....
....
/**
 * Protobuf type {@code openconfig.openconfig_lldp.Lldp}
 */
public static final class Lldp extends com.google.protobuf.GeneratedMessageV3 implements
// @@protoc_insertion_point(message_implements:openconfig.openconfig_lldp.Lldp)
....
....
}
In C++, I don't see any generated class called "OpenconfigLldp". Instead it is just "Lldp".
So the type_url in the protobuf Any is a mismatch. The C++ side sets it to:
type_url: "type.googleapis.com/openconfig_lldp.Lldp.Interfaces"
While on the Java side I use:
OpenconfigLldp.Lldp.Interfaces interfaces = update.getVal().getAnyVal().unpack(OpenconfigLldp.Lldp.Interfaces.class);
Does anyone have thoughts on why there is a wrapper class in the Java protoc output?
EDIT #3
Apparently this is because of the "outer_class_name". In the Java code, I have an outer class "OpenconfigLldp".
The type_url format is:
type.googleapis.com/packagename.messagename
So, C++ code sets this to openconfig_lldp.Lldp.Interfaces.
But, this maps to OpenconfigLldp.Lldp.Interfaces in Java.
How could I work around this?
FINAL EDIT and FINAL QUESTION
After some digging around, this is what I found out.
By default, type_url is:
type_url: "type.googleapis.com/openconfig_lldp.Lldp.Interfaces"
On the Java side, I looked at the Any implementation. It tries to compare this with:
openconfig.openconfig_lldp.Lldp.Interfaces
I found this out by printing:
Lldp.Interfaces defaultInstance = (Lldp.Interfaces)Internal.getDefaultInstance(Lldp.Interfaces.class);
logger.info("full descriptor name: " + defaultInstance.getDescriptorForType().getFullName());
So, I hacked the C++ side to send:
update->mutable_val()->mutable_any_val()->set_type_url(std::string("type.googleapis.com/openconfig.openconfig_lldp.Lldp.Interfaces"));
So, I think I know what is happening here!
Thanks for reading through all the edits.
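The comparison described in the final edit can be sketched in plain Java. This is only an illustration of the string check, not protobuf's actual code; the class and method names are hypothetical. The point it shows is that the name after the last '/' in type_url is compared with the proto full name (package plus message names), and the Java outer class name plays no part in it.

```java
final class AnyTypeUrlSketch {

    /** Returns the part of the type URL after the last '/', i.e. the message's full name. */
    static String typeNameFromUrl(String typeUrl) {
        int slash = typeUrl.lastIndexOf('/');
        return slash < 0 ? typeUrl : typeUrl.substring(slash + 1);
    }

    /** True when the URL names exactly the given proto full name. */
    static boolean matches(String typeUrl, String protoFullName) {
        return typeNameFromUrl(typeUrl).equals(protoFullName);
    }

    public static void main(String[] args) {
        // Full name reported by getDescriptorForType().getFullName() in the question.
        String javaSideFullName = "openconfig.openconfig_lldp.Lldp.Interfaces";

        // URL originally sent by the C++ server: the leading "openconfig." package part is missing.
        System.out.println(matches(
                "type.googleapis.com/openconfig_lldp.Lldp.Interfaces", javaSideFullName));

        // URL after the manual set_type_url() workaround.
        System.out.println(matches(
                "type.googleapis.com/openconfig.openconfig_lldp.Lldp.Interfaces", javaSideFullName));
    }
}
```

Seen this way, the two sides appear to disagree about the proto package rather than about the Java wrapper class, which would suggest checking that both were generated from the same package declaration.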
I am not sure I understand correctly what's going wrong. Ideally the qualified names would live in protocol buffer namespaces, so language-specific mappings shouldn't matter.
If the question is still open, I'd recommend moving the core of it to the top and preserving the edits as "what I have done so far".
Perhaps this is some kind of bug that could be worked around with one of these options:
java_multiple_files
java_outer_classname
More details about these options can be found here:
https://developers.google.com/protocol-buffers/docs/proto3

Java 8 Stream: How to get a new list from one list property in a list

I'm new to Java 8 and have run into a problem that is troubling me a lot.
I have a List like the one below:
List<objMain>
class objMain{
Long rid;
List<objUser> list;
String rname;
}
class objUser{
String userid;
}
Now I want to get a new List like the one below:
List<objUserMain>
class objUserMain{
Long rid;
String rname;
String userid;
}
How can I do this with Java 8 streams? Thanks to anyone who answers.
It seems as though you want to map each objMain instance into an objUserMain type.
To do this, use flatMap together with map, then collect the result into a list.
Assuming you have getters and setters where necessary, the following produces the required result:
List<objUserMain> result =
objMainsList.stream()
.flatMap(obj -> obj.getList().stream().map(e -> {
objUserMain user = new objUserMain();
user.setRid(obj.getRid());
user.setRname(obj.getRname());
user.setUserid(e.getUserid());
return user;
})).collect(Collectors.toList());
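For readers who want to run the idea end to end, here is a self-contained sketch of the same flatMap approach. The class names are capitalised stand-ins for objMain, objUser and objUserMain, and the constructors are an assumption made to keep the example short (the snippet above uses setters instead).

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class FlattenUsersSketch {

    static final class ObjUser {
        final String userid;
        ObjUser(String userid) { this.userid = userid; }
    }

    static final class ObjMain {
        final Long rid;
        final String rname;
        final List<ObjUser> list;
        ObjMain(Long rid, String rname, List<ObjUser> list) {
            this.rid = rid;
            this.rname = rname;
            this.list = list;
        }
    }

    static final class ObjUserMain {
        final Long rid;
        final String rname;
        final String userid;
        ObjUserMain(Long rid, String rname, String userid) {
            this.rid = rid;
            this.rname = rname;
            this.userid = userid;
        }
    }

    /** Produces one ObjUserMain per (ObjMain, ObjUser) pair, in encounter order. */
    static List<ObjUserMain> flatten(List<ObjMain> mains) {
        return mains.stream()
                .flatMap(m -> m.list.stream()
                        .map(u -> new ObjUserMain(m.rid, m.rname, u.userid)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<ObjMain> mains = Arrays.asList(
                new ObjMain(1L, "r1", Arrays.asList(new ObjUser("a"), new ObjUser("b"))),
                new ObjMain(2L, "r2", Arrays.asList(new ObjUser("c"))));
        for (ObjUserMain row : flatten(mains)) {
            System.out.println(row.rid + " " + row.rname + " " + row.userid);
        }
    }
}
```

An ObjMain with an empty user list contributes no rows, which is usually what is wanted when flattening.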
You can do this using streams the following way
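// generic parameters are needed so that the result is a typed List rather than a raw one
public static <T, R> List<R> convert(List<T> existing, Function<? super T, ? extends R> func) {
    return existing.stream().map(func).collect(Collectors.toList());
}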
The above method will help you to convert your list from one object type to another. Its parameters are the list you want to convert and the function to use for the conversion. Import the following in your main class:
import java.util.*;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.function.*;
Now call the method for the conversion in your main class as follows:
List<objUserMain> result = convert(newList,
    l -> {
        objUserMain r = new objUserMain();
        r.rid = l.rid;
        r.rname = l.rname;
        r.userid = l.list.get(0).userid; // note: this takes only the first user of each objMain
        return r;
    });
System.out.println(result.get(0).rid);
System.out.println(result.get(0).rname);
System.out.println(result.get(0).userid);
The above is a mixture of lambda functions and streams to allow you to convert object from one type to another. Let me know if you have any queries then I am happy to help. Happy coding.

Hibernate flush optimization using `hibernate.ejb.use_class_enhancer`

I am trying to use the hibernate feature that enhances the flush performance without making code changes. I came across the option hibernate.ejb.use_class_enhancer.
I made the following changes.
1) I set the property hibernate.ejb.use_class_enhancer to true.
The build failed with the error 'Cannot apply class transformer without LoadTimeWeaver specified'.
2) I added <context:load-time-weaver/> to the context files.
The build failed with the following error:
Specify a custom LoadTimeWeaver or start your Java virtual machine with Spring’s agent: -javaagent:spring-agent.jar
3) I added the following to the maven-surefire-plugin configuration:
-javaagent:${settings.localRepository}/org/springframework/spring-agent/2.5.6.SEC03/spring-agent-2.5.6.SEC03.jar
The build is successful now.
We have an interceptor that tracks the number of entities being flushed in a transaction.
After making the above changes, I expected that number to come down significantly, but it did not.
My question is:
Are the above changes correct/enough for getting the 'entity flush optimization'?
How to verify that the application is indeed using the optimization?
Edit:
After debugging, I found the following.
At some point our DO class is submitted for transformation, but the logic that decides whether a given class should be transformed does not handle the class name correctly (in my case), so the DO class is never transformed.
Is there a way I can plug in my own logic instead?
The relevant code is below.
The statement return copyEntities.contains( className ); evaluates to false for the following inputs:
copyEntities contains the strings "com.x.y.abcDO" and "com.x.y.asxDO", whereas className is "com.x.y.abcDO_$$_jvsteb8_48".
public InterceptFieldClassFileTransformer(List<String> entities) {
    final List<String> copyEntities = new ArrayList<String>( entities.size() );
    copyEntities.addAll( entities );
    classTransformer = Environment.getBytecodeProvider().getTransformer(
            //TODO change it to a static class to make it faster?
            new ClassFilter() {
                public boolean shouldInstrumentClass(String className) {
                    return copyEntities.contains( className );
                }
            },
            //TODO change it to a static class to make it faster?
            new FieldFilter() {
                @Override
                public boolean shouldInstrumentField(String className, String fieldName) {
                    return true;
                }
                @Override
                public boolean shouldTransformFieldAccess(
                        String transformingClassName, String fieldOwnerClassName, String fieldName
                ) {
                    return true;
                }
            }
    );
}
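To make the failing lookup concrete, here is a small stand-alone sketch of the name comparison. It only illustrates why an exact contains() check misses the suffixed class name; it does not use Hibernate, and it does not settle whether such a suffixed (proxy-looking) class should be instrumented at all. The class name, the helper methods and the idea of treating "_$$_" as a general marker are assumptions drawn from the single example name in the question.

```java
import java.util.Arrays;
import java.util.List;

final class ProxyNameFilterSketch {

    // Marker taken from the class name quoted in the question; treating it as a
    // general naming convention is an assumption.
    private static final String PROXY_MARKER = "_$$_";

    /** Drops everything from the marker onwards, or returns the name unchanged. */
    static String stripProxySuffix(String className) {
        int idx = className.indexOf(PROXY_MARKER);
        return idx < 0 ? className : className.substring(0, idx);
    }

    /** The check as written in the transformer: exact string equality against the list. */
    static boolean exactMatch(List<String> entities, String className) {
        return entities.contains(className);
    }

    /** Hypothetical alternative: compare the base name without the suffix. */
    static boolean baseNameMatch(List<String> entities, String className) {
        return entities.contains(stripProxySuffix(className));
    }

    public static void main(String[] args) {
        List<String> entities = Arrays.asList("com.x.y.abcDO", "com.x.y.asxDO");
        String seen = "com.x.y.abcDO_$$_jvsteb8_48";
        System.out.println(exactMatch(entities, seen));
        System.out.println(baseNameMatch(entities, seen));
    }
}
```

If the suffixed name belongs to a generated subclass, the entity class itself would normally pass through the transformer under its plain name when it is first loaded, so it may be worth checking whether that load happens before the transformer is registered.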
edited on June 15th
I updated my project to use Spring 4.0.5.RELEASE and hibernate to 4.3.5.Final
I started using org.hibernate.jpa.HibernatePersistenceProvider
and
org.springframework.instrument.classloading.InstrumentationLoadTimeWeaver
and
hibernate.ejb.use_class_enhancer=true
With these changes, I am debugging the flush behavior. I have a question about this code block:
private boolean isUnequivocallyNonDirty(Object entity) {
if(entity instanceof SelfDirtinessTracker)
return ((SelfDirtinessTracker) entity).$$_hibernate_hasDirtyAttributes();
final CustomEntityDirtinessStrategy customEntityDirtinessStrategy =
persistenceContext.getSession().getFactory().getCustomEntityDirtinessStrategy();
if ( customEntityDirtinessStrategy.canDirtyCheck( entity, getPersister(), (Session) persistenceContext.getSession() ) ) {
return ! customEntityDirtinessStrategy.isDirty( entity, getPersister(), (Session) persistenceContext.getSession() );
}
if ( getPersister().hasMutableProperties() ) {
return false;
}
if ( getPersister().getInstrumentationMetadata().isInstrumented() ) {
// the entity must be instrumented (otherwise we cant check dirty flag) and the dirty flag is false
return ! getPersister().getInstrumentationMetadata().extractInterceptor( entity ).isDirty();
}
return false;
}
In my case, the method returns false because the persister reports true for hasMutableProperties. I think the interceptor never gets a chance to answer.
Shouldn't the bytecode transformer install an interceptor here? Or should the bytecode transformation make the entity a SelfDirtinessTracker?
Can anyone explain what behavior I should expect from the bytecode transformation here?

MongoDB - override default Serializer for a C# primitive type

I'd like to change the representation of C# Doubles to rounded Int64 with a four decimal place shift in the serialization C# Driver's stack for MongoDB. In other words, store (Double)29.99 as (Int64)299900
I'd like this to be transparent to my app. I've had a look at custom serializers but I don't want to override everything and then switch on the Type with fallback to the default, as that's a bit messy.
I can see that RegisterSerializer() won't let me add one for an existing type, and that BsonDefaultSerializationProvider has a static list of primitive serializers and it's marked as internal with private members so I can't easily subclass.
I can also see that it's possible to RepresentAs Int64 for Doubles, but this is a cast not a conversion. I need essentially a cast AND a conversion in both serialization directions.
I wish I could just give the default serializer a custom serializer to override one of its own, but that would mean a dirty hack.
Am I missing a really easy way?
You can definitely do this, you just have to get the timing right. When the driver starts up there are no serializers registered. When it needs a serializer, it looks it up in the dictionary where it keeps track of the serializers it knows about (i.e. the ones that have been registered). Only if it can't find one in the dictionary does it start figuring out where to get one (including calling the serialization providers), and if it finds one it registers it.
The limitation in RegisterSerializer is there so that you can't replace an existing serializer that has already been used. But that doesn't mean you can't register your own if you do it early enough.
However, keep in mind that registering a serializer is a global operation, so if you register a custom serializer for double it will be used for all doubles, which could lead to unexpected results!
Anyway, you could write the custom serializer something like this:
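// Namespaces as used by the legacy 1.x C# driver
using System;
using MongoDB.Bson.IO;
using MongoDB.Bson.Serialization;
using MongoDB.Bson.Serialization.Serializers;

public class CustomDoubleSerializer : BsonBaseSerializer
{
    // Four-decimal-place shift, as asked in the question: 29.99 <-> 299900
    private const double Scale = 10000.0;

    public override object Deserialize(BsonReader bsonReader, Type nominalType, Type actualType, IBsonSerializationOptions options)
    {
        var rep = bsonReader.ReadInt64();
        return rep / Scale;
    }

    public override void Serialize(BsonWriter bsonWriter, Type nominalType, object value, IBsonSerializationOptions options)
    {
        // Round instead of truncating so floating-point error cannot drop the last digit
        var rep = (long)Math.Round((double)value * Scale);
        bsonWriter.WriteInt64(rep);
    }
}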
And register it like this:
BsonSerializer.RegisterSerializer(typeof(double), new CustomDoubleSerializer());
You could test it using the following class:
public class C
{
public int Id;
public double X;
}
and this code:
BsonSerializer.RegisterSerializer(typeof(double), new CustomDoubleSerializer());
var c = new C { Id = 1, X = 29.99 };
var json = c.ToJson();
Console.WriteLine(json);
var r = BsonSerializer.Deserialize<C>(json);
Console.WriteLine(r.X);
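The conversion itself is independent of the driver. As a rough sketch of the arithmetic only (written in Java to stay consistent with the other examples here; the class and method names are made up), the four-decimal-place shift from the question with rounding looks like this. Rounding matters because a plain cast truncates: for instance (long) (4.35 * 100) yields 434, not 435.

```java
final class FixedPointSketch {

    // Four decimal places, as in the question: 29.99 is stored as 299900.
    static final long SCALE = 10_000L;

    /** Converts a double to its scaled integer representation, rounding to nearest. */
    static long toFixed(double value) {
        return Math.round(value * SCALE);
    }

    /** Converts the stored integer back to a double. */
    static double fromFixed(long stored) {
        return stored / (double) SCALE;
    }

    public static void main(String[] args) {
        long stored = toFixed(29.99);
        System.out.println(stored);
        System.out.println(fromFixed(stored));
    }
}
```

Values with more than four decimal places lose the extra digits on the way in, so the round trip is only exact for inputs that already fit the chosen scale.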
You can also use your own serialization provider to tell Mongo which serializer to use for certain types, which I ended up doing to mitigate some of the timing issues mentioned when trying to override existing serializers. Here's an example of a serialisation provider that overrides how to serialize decimals:
public class CustomSerializationProvider : IBsonSerializationProvider
{
public IBsonSerializer GetSerializer(Type type)
{
if (type == typeof(decimal)) return new DecimalSerializer(BsonType.Decimal128);
return null; // falls back to Mongo defaults
}
}
If you return null from your custom serialization provider, it will fall back to using Mongo's default serialization provider.
Once you've written your provider, you just need to register it:
BsonSerializer.RegisterSerializationProvider(new CustomSerializationProvider());
I looked through the latest iteration of the driver's code and checked if there's some sort of backdoor to set custom serializers. I am afraid there's none; you should open an issue in the project's bug tracker if you think this needs to be looked at for future iterations of the driver (https://jira.mongodb.org/).
Personally, I'd open a ticket -- and if a quick workaround is necessary or required, I'd subclass DoubleSerializer, implement the new behavior, and then use Reflection to inject it into either MongoDB.Bson.Serialization.Serializers.DoubleSerializer.__instance or MongoDB.Bson.Serialization.BsonDefaultSerializationProvider.__serializers.

How to do validation with XML that is not well formed while unmarshalling?

I have an unmarshaller along with a MySchema.xsd file.
// ClassLoader.getResourceAsStream expects a path without a leading slash
StreamSource sources = new StreamSource(getClass().getClassLoader().getResourceAsStream("xmlValidation.xsd"));
SchemaFactory sf = SchemaFactory.newInstance( XMLConstants.W3C_XML_SCHEMA_NS_URI );
unmarshaller.setSchema(sf.newSchema(sources));
I then call unmarshaller.setEventHandler() to register a custom validation event handler, which formats an error message string:
// String is immutable and String.format is static, so the message is collected in a StringBuilder
final StringBuilder errorString = new StringBuilder();
unmarshaller.setEventHandler(new ValidationEventHandler() {
    @Override
    public boolean handleEvent(ValidationEvent validationevent) {
        if (validationevent.getSeverity() != ValidationEvent.WARNING) {
            errorString.append("Line:Col[").append(validationevent.getLocator().getLineNumber())
                    .append(":").append(validationevent.getLocator().getColumnNumber())
                    .append("]:").append(validationevent.getMessage());
            return false;
        }
        return true;
    }
});
The code above seems to work (I get a Java object when the input is valid, and the error string is formatted as expected).
The problem is that when the input XML is not well formed, it also throws a SAXParseException.
Thanks in advance.
Andrew
Well formed relates to the XML syntax itself, as opposed to being valid with respect to an XML schema:
http://en.wikipedia.org/wiki/Well-formed_element
If you have XML that is not well formed then you will get a ValidationEvent.FATAL_ERROR and unmarshalling will not be able to continue, as the underlying parser used by JAXB cannot continue.
For more information:
http://bdoughan.blogspot.com/2010/12/jaxb-and-marshalunmarshal-schema.html
OK, I had mixed something up and ran into this problem.
I have figured it out now. If I am wrong, please correct me. Below is what I found in the Javadoc and tested in my project:
javax.xml.bind.ValidationEventHandler handles violations of the given schema's constraints while the unmarshaller is unmarshalling.
unmarshaller.unmarshal(xmlInputStream);
The ValidationEventHandler is called during unmarshalling if an error occurs.
A SAXException is thrown if the xmlInputStream is not well formed.
I can't find a way to catch the SAXException thrown by the SAX parser, so I guess validation during unmarshalling can't deal with XML that is not well formed.
Instead, I use javax.xml.validation.Validator to check that the XML string is well formed and satisfies the schema constraints before unmarshalling.
jaxbValidator.validate(xmlSource);
The above code throws a SAXException if the check fails.
If no exception is thrown, I then unmarshal the XML string into an object.
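The "check first, unmarshal afterwards" idea can be sketched with the JDK's SAX parser alone. This is a minimal sketch that covers only the well-formedness half (no schema and no JAXB); the class and method names are hypothetical. It reports the problem in the same Line:Col[...] format used by the event handler in the question.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

final class WellFormedCheckSketch {

    /** Returns null when the XML is well formed, otherwise a "Line:Col[l:c]:message" string. */
    static String wellFormednessError(String xml) {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        try {
            // DefaultHandler ignores all content and rethrows fatal errors,
            // so a parse that completes means the input is well formed.
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            return "Line:Col[" + e.getLineNumber() + ":" + e.getColumnNumber() + "]:" + e.getMessage();
        } catch (SAXException | ParserConfigurationException | IOException e) {
            throw new IllegalStateException("XML check could not be run", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(wellFormednessError("<a><b/></a>"));
        System.out.println(wellFormednessError("<a><b></a>"));
    }
}
```

Only when this returns null would the string be handed on to schema validation or to unmarshaller.unmarshal(...). For untrusted input, the parser factory would also need hardening against external entities, which this sketch leaves out.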
