Using .generate function for beam search over predictions in custom model extending TFPreTrainedModel class - huggingface-transformers

I want to use the .generate() functionality of Hugging Face for my model's predictions.
My model is a custom model inheriting from the "TFPreTrainedModel" class. It has a custom transformer inheriting from tf.keras.layers, followed by a few hidden layers and a final dense layer (also inherited from tf.keras.layers).
I am not able to use .generate() in spite of adding a get_lm_head() function (as described here https://huggingface.co/docs/transformers/main_classes/model) that returns my last dense layer.
When I call .generate() it throws
TypeError: The current model class (NextCateModel) is not compatible with `.generate()`, as it doesn't have a language model head.
Can anyone suggest how to use the .generate() functionality of Hugging Face in custom transformer-based models, without using one of Hugging Face's pre-trained model classes?
PS: It checks against the Hugging Face model classes listed in these mappings, which are defined in generation_tf_utils.py:
generate_compatible_mappings = [
    TF_MODEL_FOR_CAUSAL_LM_MAPPING,
    TF_MODEL_FOR_VISION_2_SEQ_MAPPING,
    TF_MODEL_FOR_SEQ_TO_SEQ_CAUSAL_LM_MAPPING,
    TF_MODEL_FOR_SPEECH_SEQ_2_SEQ_MAPPING,
]
I do not intend to use the pretrained models given in the above mappings (one of them is shown below):
TF_MODEL_FOR_CAUSAL_LM_MAPPING = [
    ("bert", "TFBertLMHeadModel"),
    ("camembert", "TFCamembertForCausalLM"),
    ("ctrl", "TFCTRLLMHeadModel"),
    ("gpt2", "TFGPT2LMHeadModel"),
    ("gptj", "TFGPTJForCausalLM"),
    ("openai-gpt", "TFOpenAIGPTLMHeadModel"),
    ("opt", "TFOPTForCausalLM"),
    ("rembert", "TFRemBertForCausalLM"),
    ("roberta", "TFRobertaForCausalLM"),
    ("roformer", "TFRoFormerForCausalLM"),
    ("transfo-xl", "TFTransfoXLLMHeadModel"),
    ("xglm", "TFXGLMForCausalLM"),
    ("xlm", "TFXLMWithLMHeadModel"),
    ("xlnet", "TFXLNetLMHeadModel"),
]
1340 if generate_compatible_classes:
1341 exception_message += f" Please use one of the following classes instead: {generate_compatible_classes}"
-> 1342 raise TypeError(exception_message)

Related

How do I finetune a model while preserving layer names

When I fine-tune a pretrained resnet152 model, I seem to lose all the named layers I'd like access to. I've included the simple fine-tuned model code and the printout of the named layers of both the pretrained and the fine-tuned model. I'd like to keep the layer names so I can visualize their output in a Class Activation Map.
Code
import torch.nn as nn
import torchvision.models as models
class ConvNet3(nn.Module):
    def __init__(self):
        super().__init__()
        model = models.resnet152(pretrained=True)
        model.fc = nn.Linear(2048, 10)  # replace the final layer with a 10-class head
        self.model = model
    def forward(self, x):
        return self.model(x)  # [batch_size, 10]
model = ConvNet3().eval()
print([n for n, _ in model.named_children()])
model = models.resnet152(pretrained=True).eval()
print([n for n, _ in model.named_children()])
Output
['model']
['conv1', 'bn1', 'relu', 'maxpool', 'layer1', 'layer2', 'layer3', 'layer4', 'avgpool', 'fc']
The layers are not lost; you are encapsulating the original ResNet model in your own class. You can still list them with:
print([n for n, _ in model.model.named_children()])
since the ResNet model is stored under the model attribute of the ConvNet3 class.
Unless you need it for another reason, the wrapper class seems unnecessary. A simpler approach would be something like the following:
model = models.resnet152(pretrained=True)
model.fc = nn.Linear(2048, 10)
model.eval()
print([n for n, _ in model.named_children()])

How to average out value returned by an instance method for collection?

I have a simple method inside a model:
def term_months
((started_at - injected_at) / 1.month).to_i
end
This returns a simple integer.
In my View, I have a collection of this model type and I want to average out the results of each model's term_months value.
If this were a column, I could use something like @terms.average(:term_months), but this isn't the case.
Is there some way to average them out inline?
You'll have to do it manually with a map:
@terms.map(&:term_months).inject(:+).to_f / @terms.length
What you can do is define that as a class method on Term:
def self.average_term_months
  scoped.map(&:term_months).inject(:+).to_f / scoped.length
end
and use it as @terms.average_term_months
This method is not meant to be used as a classic class method, but more as a scope. However, I do not define it as a scope because (personal taste here) I want scopes to be chainable.
@terms.sum(&:term_months).to_f / @terms.size
If started_at and injected_at are columns in your DB, then the following is possible and should perform better than using Enumerable methods (such as sum), because it delegates the averaging to the DB and just returns an integer/float object. The term_months method would then not be required:
Model.average("(started_at - injected_at)/ #{1.month}") # where Model is the name of your ActiveRecord model
You might consider using the quickstats gem, which is designed to update basic statistics on an observation-by-observation basis as new observations become available. This can be very useful if the dataset is large and you just want the summary stats without having to retain all of the individual observations. Quickstats uses recurrence relationships Xbar(n+1) <- f(Xbar(n), x_(n+1)) and s^2(n+1) <- g(s^2(n), x_(n+1)), where Xbar(n) and s^2(n) are the average and sample variance, respectively, based on n observations; x_(n+1) is the newly arrived observation; and f and g represent the appropriate update functions.
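To make the idea of such recurrence updates concrete, here is a minimal sketch in Python using Welford's method, one common way to update a mean and sample variance one observation at a time. It is not the quickstats gem's code or API; the RunningStats class and its method names are made up for this illustration.

```python
class RunningStats:
    """Running mean and sample variance, updated one observation at a time."""

    def __init__(self):
        self.n = 0        # number of observations seen so far
        self.mean = 0.0   # running average, Xbar(n)
        self._m2 = 0.0    # running sum of squared deviations from the mean

    def add(self, x):
        # Welford's update: only n, the mean and _m2 are kept in memory.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Sample variance s^2(n); defined as 0.0 here for fewer than two observations.
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


stats = RunningStats()
for months in [12, 24, 36]:   # e.g. term_months values arriving one by one
    stats.add(months)
print(stats.mean, stats.variance)
```

Each call to add() touches only three numbers, so the individual observations never need to be retained.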

Sentence detection using opennlp on hadoop

I want to do sentence detection using OpenNLP and Hadoop. I have implemented this successfully in plain Java and now want to implement the same on the MapReduce platform. Can anyone help me out?
I have done this two different ways.
One way is to push your sentence detection model out to a standard directory on each node (e.g. /opt/opennlpmodels/), read in the serialized model at the class level in your mapper class, and then use it as needed in your map or reduce function.
Another way is to put the model in a database or the distributed cache (as a blob or something; I have used Accumulo to store document categorization models like this before). Then, at the class level, make the connection to the database and get the model as a ByteArrayInputStream.
I have used Puppet to push out the models, but use whatever you typically use to keep files up to date on your cluster.
Depending on your Hadoop version, you may be able to sneak the model in as a property on job setup, so that only the master (or wherever you launch jobs from) needs to have the actual model file on it. I've never tried this.
If you need to know how to actually use the OpenNLP sentence detector let me know and I'll post an example.
HTH
import java.io.File;
import java.io.FileInputStream;
import opennlp.tools.sentdetect.SentenceDetector;
import opennlp.tools.sentdetect.SentenceDetectorME;
import opennlp.tools.sentdetect.SentenceModel;
import opennlp.tools.util.Span;
public class SentenceDetection {
    SentenceDetector sd;
    public Span[] getSentences(String docTextFromMapFunction) throws Exception {
        // load the model only once and reuse it for every call
        if (sd == null) {
            sd = new SentenceDetectorME(new SentenceModel(new FileInputStream(new File("/standardized-on-each-node/path/to/en-sent.zip"))));
        }
        /**
         * this gives you the actual sentences as a string array
         */
        // String[] sentences = sd.sentDetect(docTextFromMapFunction);
        /**
         * this gives you the spans (the char indexes of the start and end of each
         * sentence in the doc)
         */
        Span[] sentenceSpans = sd.sentPosDetect(docTextFromMapFunction);
        /**
         * you can do this as well to get the actual sentence strings based on the spans
         */
        // String[] spansToStrings = Span.spansToStrings(sentenceSpans, docTextFromMapFunction);
        return sentenceSpans;
    }
}
HTH... just make sure the file is in place. There are more elegant ways of doing this but this works and it's simple.

Is it ok for a Django mixin to inherit another mixin?

I'm pretty sure the answer to this question is obviously "NO", since Django mixins are supposed to inherit from "object", but I can't find an alternative solution to my problem :(
To make the question as simple as possible:
views.py
import json
from django import http
class JSONResponseMixin(object):
    def render_to_response(self, context):
        "Returns a JSON response containing 'context' as payload"
        return self.get_json_response(self.convert_context_to_json(context))
    def get_json_response(self, content, **httpresponse_kwargs):
        "Construct an `HttpResponse` object."
        return http.HttpResponse(content,
                                 content_type='application/json',
                                 **httpresponse_kwargs)
    def convert_context_to_json(self, context):
        "Convert the context dictionary into a JSON object"
        # Note: This is *EXTREMELY* naive; in reality, you'll need
        # to do much more complex handling to ensure that arbitrary
        # objects -- such as Django model instances or querysets
        # -- can be serialized as JSON.
        return json.dumps(context)
class HandlingAJAXPostMixin(JSONResponseMixin):
    def post(self, request, *args, **kwargs):
        .....
        data = {'somedata': somedata}
        return self.render_to_response(data)
class UserDetailView(HandlingAJAXPostMixin, DetailView):
    model = MyUser
    .....
So the problem I have is that, for multiple views, I want to respond to their "post" request with the same JSON response. That is why I defined the HandlingAJAXPostMixin, so that I could reuse it for other views. Since the HandlingAJAXPostMixin returns a JSON response, it requires the render_to_response method, which is defined in the JSONResponseMixin.
This is the reason why I am making my HandlingAJAXPostMixin inherit from the JSONResponseMixin, but this obviously seems wrong :(
Any suggestions..?
Thanks!!!
It's perfectly valid for a mixin to inherit from another mixin - in fact, this is how most of Django's more advanced mixins are made.
However, the idea of mixins is that they are reusable parts that, together with other classes, build a complete, usable class. Right now, your JSONResponseMixin might as well be a separate class that you don't inherit from, or the methods might just be module-wide methods. It definitely works, there's nothing wrong with it, but that's not the idea of a mixin.
If you look at Django's BaseDetailView, you see the following get() method:
def get(self, request, *args, **kwargs):
    self.object = self.get_object()
    context = self.get_context_data(object=self.object)
    return self.render_to_response(context)
get_object() and get_context_data() are defined in the superclasses of BaseDetailView, but render_to_response() isn't. It's okay for a mixin to rely on methods that its superclasses don't define; this allows different classes that inherit from BaseDetailView to supply their own implementation of render_to_response(). Right now, in Django, there's only one subclass, though.
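As a rough illustration of that pattern, here is a minimal sketch in plain Python with no Django involved. All class names are hypothetical stand-ins: BaseDetail plays the role of BaseDetailView, and the two mixins each supply a different render_to_response().

```python
import json


class BaseDetail:
    """Stand-in for BaseDetailView: calls render_to_response() but never defines it."""

    def get_context_data(self):
        return {"name": "example"}

    def get(self):
        # Resolved through the MRO of whatever concrete class is instantiated.
        return self.render_to_response(self.get_context_data())


class JSONRenderMixin:
    def render_to_response(self, context):
        return json.dumps(context)


class TextRenderMixin:
    def render_to_response(self, context):
        return ", ".join(f"{key}={value}" for key, value in context.items())


class JSONDetail(JSONRenderMixin, BaseDetail):
    pass


class TextDetail(TextRenderMixin, BaseDetail):
    pass


print(JSONDetail().get())  # {"name": "example"}
print(TextDetail().get())  # name=example
```

Calling get() on BaseDetail directly would raise an AttributeError, which is the point: the base class is only complete once a mixin provides the missing method.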
However, logic is delegated as much as possible to those small, reusable methods that the mixins supply. That's what you want to aim for. If/else logic is avoided as much as possible - the most advanced logic in Django's default views is:
if form.is_valid():
    return self.form_valid(form)
else:
    return self.form_invalid(form)
That's why very similar views, like CreateView and UpdateView are in fact two separate views, while they could easily be a single view with some additional if/else logic. The only difference is that CreateView does self.object = None, while UpdateView does self.object = self.get_object().
Right now you are using a DetailView that defines a get() method that returns the result of self.render_to_response(). However, you override render_to_response() to return a JSON response instead of a template-based HTML response. You're using a mixin that you don't want to use (SingleObjectTemplateResponseMixin) and then override its behavior to do something that you don't want to do either, just to get the view doing what you want it to do. A better idea would be to write an alternative to DetailView whose only job is to supply a JSON response based on a single object. To do this, I would create a SingleObjectJSONResponseMixin, similar to the SingleObjectTemplateResponseMixin, and create a class JSONDetailView that combines all the needed mixins into a single class:
import json
from django.http import HttpResponse
from django.views.generic import View
from django.views.generic.detail import SingleObjectMixin
class SingleObjectJSONResponseMixin(object):
    def to_json(self, context):
        return json.dumps(context)
    def render_to_response(self, context, **httpresponse_kwargs):
        return HttpResponse(self.to_json(context),
                            content_type='application/json',
                            **httpresponse_kwargs)
class BaseJSONDetailView(SingleObjectMixin, View):
    # if you want to do the same for get, inherit just from BaseDetailView
    def post(self, request, *args, **kwargs):
        self.object = self.get_object()
        context = self.get_context_data(object=self.object)
        return self.render_to_response(context)
class JSONDetailView(SingleObjectJSONResponseMixin, BaseJSONDetailView):
    """
    Return JSON detail data of a single object.
    """
Notice that this is almost exactly the same as the BaseDetailView and the SingleObjectTemplateResponseMixin provided by Django. The difference is that you define a post() method and that the rendering is much simpler: just a conversion of the context data to JSON, not a complete template rendering. However, the logic is deliberately kept as simple as possible, and methods that don't depend on each other are kept separate. This way, SingleObjectJSONResponseMixin can e.g. be mixed with BaseUpdateView to easily create an AJAX/JSON-based UpdateView. Subclasses can easily override the different parts of the mixins, like overriding to_json() to supply a certain data structure. Rendering logic is where it belongs (in render_to_response()).
Now all you need to do to create a specific JSONDetailView is to subclass and define which model to use:
class UserJSONDetailView(JSONDetailView):
    model = MyUser

Gson, How to write a JsonDeserializer for Generic Typed Classes?

Situation
I have a class that holds a generic type, and it also has a non-zero arg constructor. I don't want to expose a zero arg constructor because it can lead to erroneous data.
public class Geometries<T extends AbstractGeometry> {
    private final GeometryType geometryType;
    private Collection<T> geometries;
    public Geometries(Class<T> classOfT) {
        this.geometryType = lookup(classOfT); // strict typing
    }
}
There are several (known and final) classes that may extend AbstractGeometry.
public final class Point extends AbstractGeometry { .... }
public final class Polygon extends AbstractGeometry { .... }
Example json:
{
    "geometryType" : "point",
    "geometries" : [
        { ...contents differ... hence AbstractGeometry},
        { ...contents differ... hence AbstractGeometry},
        { ...contents differ... hence AbstractGeometry}
    ]
}
Question
How can I write a JsonDeserializer that will deserialize a generic typed class (such as Geometries)?
CHEERS :)
p.s. I don't believe I need a JsonSerializer, this should work out of the box :)
Note: This answer was based on the first version of the question. The edits and subsequent question(s) change things.
You wrote: "p.s. I don't believe I need a JsonSerializer, this should work out of the box :)"
That's not the case at all. The JSON example you posted does not match the Java class structure you apparently want to bind to and generate.
If you want JSON like that from Java like that, you'll definitely need custom serialization processing.
The JSON structure is:
- an object with two elements
- element 1 is a string named "geometryType"
- element 2 is an object named "geometries", with differing elements based on type
The Java structure is:
- an object with two fields
- field 1, named "geometryType", is a complex type GeometryType
- field 2, named "geometries", is a Collection of AbstractGeometry objects
Major differences:
- JSON string does not match Java type GeometryType
- JSON object does not match Java type Collection
Given this Java structure, a matching JSON structure would be:
- an object with two elements
- element 1, named "geometryType", is a complex object, with elements matching the fields in GeometryType
- element 2, named "geometries", is a collection of objects, where the elements of the different objects in the collection differ based on specific AbstractGeometry types
Are you sure that what you posted is really what you intended? I'm guessing that either or both of the structures should be changed.
Regarding any question on polymorphic deserialization, please note that the issue has been discussed a few times on StackOverflow.com already. I posted links to four different such questions and answers (some with code examples) at "Can I instantiate a superclass and have a particular subclass be instantiated based on the parameters supplied".
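For readers who want the general shape of polymorphic deserialization, here is a language-neutral sketch written in Python rather than Gson: read the "geometryType" discriminator first, then build every element of "geometries" with the matching class. The fields of Point and Polygon are invented for this example, since the question does not show them, and parse_geometries is a hypothetical helper, not a Gson API.

```python
import json
from dataclasses import dataclass


@dataclass
class Point:
    x: float   # hypothetical field
    y: float   # hypothetical field


@dataclass
class Polygon:
    vertices: list   # hypothetical field


# Discriminator value in the JSON -> concrete class to instantiate.
GEOMETRY_TYPES = {"point": Point, "polygon": Polygon}


@dataclass
class Geometries:
    geometry_type: str
    geometries: list


def parse_geometries(text):
    raw = json.loads(text)
    try:
        cls = GEOMETRY_TYPES[raw["geometryType"]]
    except KeyError:
        raise ValueError(f"unknown geometryType: {raw.get('geometryType')!r}")
    # Every element is built with the class selected by the discriminator.
    return Geometries(raw["geometryType"], [cls(**item) for item in raw["geometries"]])


doc = '{"geometryType": "point", "geometries": [{"x": 1, "y": 2}, {"x": 3, "y": 4}]}'
result = parse_geometries(doc)
print(result.geometries[0])
```

A custom deserializer in Gson would follow the same two steps: inspect the type field of the parsed JSON tree, then delegate each element to the deserialization of the chosen concrete class.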

Resources