OpenNLP TokenNameFinder for entities other than names

I'm new to OpenNLP. I started training a model (TokenNameFinderTrainer) to identify names. So far so good, but now I want to identify organizations (such as "Microsoft").
My question is: which types of entities does OpenNLP recognize by default (if there are any)?
I see that it can handle <START:person> Daryl Williams <END> .
But is it okay to create something like <START:organization> Metro-Goldwyn-Mayer Studios Inc. <END>, or <START:company> Metro-Goldwyn-Mayer Studios Inc. <END>?
Meaning: can I label categories as I please, or
do I have to use a default category for that? If that is the case, which are the default ones?
EDIT:
I have found the answers via further reading. I am asking now for confirmation...
I can label entities as I please, and it is wiser to make one model per entity. Am I right?
For instance: one for locations, one for names, one for companies?
Any ideas on how to proceed when the same company (for instance) is written both as
Microsoft and as microsoft?
Thanks for the help!

You can train an NER model for any entity type you want; I recommend one model per type.
OpenNLP uses machine learning to find entities, so it will find what your model tells it to. If you annotate both microsoft and Microsoft, or even a misspelling of Microsoft, it will try to find them.
If you have a small list of names with only a few variants each, and you need the results to be normalized, consider using a RegexNameFinder. If you pull the trunk, you can construct the RegexNameFinder with a Map that maps a label to a set of regex patterns.
EDIT: here is a link to the OpenNLP unit test cases for the RegexNameFinder. This is the 1.6 snapshot:
http://svn.apache.org/viewvc/opennlp/trunk/opennlp-tools/src/test/java/opennlp/tools/namefind/RegexNameFinderTest.java?view=co
If the link doesn't work, here is a basic example.
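import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;
import opennlp.tools.namefind.RegexNameFinder;
import opennlp.tools.util.Span;

public void test() {
    Pattern testPattern = Pattern.compile("test");
    String[] sentence = new String[]{"a", "test", "b", "c"};
    Pattern[] patterns = new Pattern[]{testPattern};
    // Map each entity type label to the patterns that identify it.
    Map<String, Pattern[]> regexMap = new HashMap<>();
    String type = "testtype";
    regexMap.put(type, patterns);
    RegexNameFinder finder = new RegexNameFinder(regexMap);
    // Run the finder over the tokenized sentence.
    Span[] result = finder.find(sentence);
}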
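For the Microsoft/microsoft case, one option is to compile the patterns with Pattern.CASE_INSENSITIVE and map every variant back to a single canonical name. Below is a minimal sketch of that idea using only java.util.regex. The class CompanyNormalizer, its VARIANTS map, and the variant spellings are hypothetical names made up for illustration; none of them are part of OpenNLP.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not part of OpenNLP: maps token variants to one canonical name.
final class CompanyNormalizer {
    // Canonical name -> pattern accepting its known variants (illustrative entries only).
    private static final Map<String, Pattern> VARIANTS = new LinkedHashMap<>();
    static {
        VARIANTS.put("Microsoft",
                Pattern.compile("microsoft|micro-soft", Pattern.CASE_INSENSITIVE));
        VARIANTS.put("IBM",
                Pattern.compile("ibm|i\\.b\\.m\\.", Pattern.CASE_INSENSITIVE));
    }

    // Returns the canonical names found among the tokens, in token order.
    static List<String> normalize(String[] tokens) {
        List<String> found = new ArrayList<>();
        for (String token : tokens) {
            for (Map.Entry<String, Pattern> entry : VARIANTS.entrySet()) {
                // matches() requires the whole token to match, not just a substring.
                if (entry.getValue().matcher(token).matches()) {
                    found.add(entry.getKey());
                    break;
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String[] tokens = {"I", "work", "at", "microsoft", "and", "MICROSOFT", "loves", "IBM"};
        System.out.println(normalize(tokens)); // prints [Microsoft, Microsoft, IBM]
    }
}
```

Patterns compiled this way are ordinary java.util.regex.Pattern objects, so they should also be usable as the values in the regexMap from the example above.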

Related

How to access View Template Properties for Revit and compare them in Real Time?

I am trying to list the view template’s properties so we can compare them with another old template.
For example what model elements are hidden or have overrides in a given template or which Revit links have been hidden or overridden in a given template.
View Template
(https://www.google.com/search?q=view+template+revit&rlz=1C1GGRV_enUS770US770&source=lnms&tbm=isch&sa=X&ved=0ahUKEwjLndrd2cTbAhVESq0KHX1cAPwQ_AUICygC&biw=1536&bih=824#imgrc=Q0v-pV7Nxl4kfM:)
I'm looking to devise a View Template Compare tool and to get access to the owner and creator of each template.
public void ApplyViewTemplateToActiveView()
{
    Document doc = this.ActiveUIDocument.Document;
    View viewTemplate = (from v in new FilteredElementCollector(doc)
                             .OfClass(typeof(View))
                             .Cast<View>()
                         where v.IsTemplate == true && v.Name == "MyViewTemplate"
                         select v)
                        .First();
    using (Transaction t = new Transaction(doc, "Set View Template"))
    {
        t.Start();
        doc.ActiveView.ViewTemplateId = viewTemplate.Id;
        t.Commit();
    }
}
With the Revit API you can access them through:
GetTemplateParameterIds method / ViewTemplateId property
The Revit API exposes almost all the ViewTemplate properties.
For instance this method returns all the Visibility/Graphic Overrides for a specific category:
https://apidocs.co/apps/revit/2019/ed267b82-56be-6e3b-0c6d-4de7df1ed312.htm
The only thing I couldn't get for a ViewTemplate is the "includes", but all the rest seems to be there.
Update:
The list of properties "not included" can be retrieved with GetNonControlledTemplateParameterIds().
Yes, and no.
Yes, I guess you can use the Forge Model Derivative API to export the RVT file and then build a dashboard around the View Templates data. That's assuming that View Templates data actually gets exported when the model is translated. That data is not attached to any geometry, so I would not be surprised if it was skipped. The question here is why? This is like renting a 16-wheel truck to move a duffel bag across the street.
No, if your intention is to interact directly with the RVT model. Forge can view it, but pushing anything back or requesting changes to the model is not available yet. Then again, I am not even sure that the view template data is available via Model Derivative exports.
This brings me to another alternative. Why not just collect the data using the Revit API, the standard way, and then push it out to a database and build on top of that? There is no reason to employ Forge for any of that.
Thanks Jeremy, I have dug into your amazing website and also a solution that Konrad posted in the Dynamo Forum about this. In Revit it seems pretty achievable: you filter the Views that are View Templates and then extract these properties, is that correct?
I am wondering if someone can point me in the right direction with Forge.
Some amazing guys are developing BQL: https://www.retriever.works/.
BQL (Building Query Language) is a query language for buildings, similar to how SQL is a query language for databases. It is fast and flexible. BQL helps improve efficiency for QA/QC (quality assurance and quality control) and building data extraction without leaving Revit. I am also trying this out, and I would like to know whether there is existing work I could start from with Forge next week.

Querying by nested pointers?

I'm rewriting a "Yelp for Dorm rooms" web app in Node/Express with a Parse backend (PHP version here for context).
Since I am very used to SQL databases, I have organized my data into four tables of one-to-many pointers:
Rooms
Halls
Clusters
Campuses
Every room has a pointer to its hall, each hall has a pointer to its cluster (a small group of halls) and each cluster has a pointer to its campus.
However, since each hall/cluster/campus has its own culture, I want to be able to search by each level (e.g. I want to live on South campus, or Norris Hall). However, since the pointers are nested three levels deep, I'm having a problem searching by campus and returning rooms. I'd hate to have to duplicate data and copy/paste cluster and campus data into each room object.
Searching for a cluster is easy. I can just:
var clusterQuery = new Parse.Query("clusters");
clusterQuery.equalTo("cluster", req.params.cluster);
var hallsQuery = new Parse.Query("halls");
hallsQuery.matchesQuery("cluster", clusterQuery);
query.matchesQuery("hall", hallsQuery);
So I figured doing a campus search would be simply
var campusQuery = new Parse.Query("campuses");
campusQuery.equalTo("cluster", req.params.campus);
var clusterQuery = new Parse.Query("clusters");
clusterQuery.matchesQuery("campus", campusQuery);
var hallsQuery = new Parse.Query("halls");
hallsQuery.matchesQuery("cluster", clusterQuery);
query.matchesQuery("hall", hallsQuery);
But of course, that would be too easy.
Instead, I get an error 154: Query had too many nested queries.
So my question for you, almighty Stackoverflow community: What should I do instead?
It makes more sense to name your classes with singular names, Campus rather than Campuses. So, I will go with singular names.
Your model is a tree structure and there are some patterns for it. The one you use keeps parent references, which is simple but requires multiple queries for subtrees, as you realized. Since Parse is using MongoDB, you can check the use cases and model patterns of MongoDB, such as Product Catalog and Model Tree Structures.
Consider the Array of Ancestors pattern, where you have something like {_id: "Room1", ancestors: [pointerToAHall, pointerToACluster, pointerToACampus], parent: pointerToAHall}.
You can find rooms where ancestors array contains a campus:
var query = new Parse.Query("Room");
query.equalTo("ancestors", aCampusObject);
Note that equalTo knows that ancestors is an array. You may want to check Parse docs for queries on array.
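To make the trade-off concrete, here is a rough in-memory sketch of the Array of Ancestors idea in plain Java. It does not use the Parse SDK at all; the AncestorsDemo and Node classes, the roomsUnder helper, and all ids are hypothetical names chosen for illustration. Each object stores its direct parent plus a flat list of everything above it, so a search by campus becomes a single containment check instead of three nested queries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// In-memory illustration of the Array of Ancestors pattern (no Parse SDK involved).
final class AncestorsDemo {
    // A node stores its direct parent plus the ids of every ancestor above it.
    static final class Node {
        final String id;
        final Node parent;
        final List<String> ancestors = new ArrayList<>();

        Node(String id, Node parent) {
            this.id = id;
            this.parent = parent;
            if (parent != null) {
                // Built once at creation time: the parent itself, then everything above it.
                ancestors.add(parent.id);
                ancestors.addAll(parent.ancestors);
            }
        }
    }

    // One flat containment check replaces the chain of nested queries.
    static List<String> roomsUnder(List<Node> rooms, String ancestorId) {
        return rooms.stream()
                .filter(room -> room.ancestors.contains(ancestorId))
                .map(room -> room.id)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Node south = new Node("SouthCampus", null);
        Node north = new Node("NorthCampus", null);
        Node clusterA = new Node("ClusterA", south);
        Node clusterB = new Node("ClusterB", north);
        Node norris = new Node("NorrisHall", clusterA);
        Node elm = new Node("ElmHall", clusterB);
        List<Node> rooms = List.of(
                new Node("Room101", norris),
                new Node("Room102", norris),
                new Node("Room201", elm));

        System.out.println(roomsUnder(rooms, "SouthCampus")); // prints [Room101, Room102]
        System.out.println(roomsUnder(rooms, "ElmHall"));     // prints [Room201]
    }
}
```

The cost of the pattern is on the write side: if a hall ever moves to another cluster, the ancestors list of each of its rooms has to be rewritten.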

How to implement custom resource provider dependent on different criteria than UI culture?

I am working on .NET 4.0 MVC3 web application. The application is all in English and allows users to fill information regarding different regions. For simplicity let's say we have two regions: United States and Western Europe.
Now in the view I present a string, let's say "Project opening", but if the user works on the United States region I would like it to read "Project initiation".
When I think about this functionality I immediately think about resource files for different regions, but independent from the UI culture.
Does anyone have a recipe for how to achieve what I want?
It would also be nice if, in the future, I could make it read e.g. ExtendedDisplayAttribute(string displayName, int regionId) placed over properties of my ViewModels.
EDIT
I am already at the stage where I can access region information in a helper that should return the string for this region. Now I have a problem with the resource files. I want to create multiple resource files with a failover mechanism. I expected there would be something working out of the box, but the ResourceManager cannot be used to read resx files.
Is there any technique that will allow me to read the values from specific resource files without some nonsense like resgen.exe?
I also do not want to use System.Resources.ResXResourceReader, because it belongs to System.Windows.Forms.dll and this is a Web app.
Just in case someone wants to do the same in the future. This article turned out to be really helpful: http://www.jelovic.com/articles/resources_in_visual_studio.htm
The piece of code that I use (VB) is:
<Extension()>
Public Function Resource(Of TModel)(ByVal htmlHelper As HtmlHelper(Of TModel), resourceKey As String) As MvcHtmlString
    Dim regionLocator As IRegionLocator = DependencyResolver.Current.GetService(GetType(IRegionLocator))
    Dim resources = New List(Of String)
    If Not String.IsNullOrEmpty(regionLocator.RegionName) Then
        resources.Add(String.Format("Website.Resources.{0}", regionLocator.RegionName))
    End If
    resources.Add("Website.Resources")
    Dim value = String.Empty
    For Each r In resources
        Dim rManager = New System.Resources.ResourceManager(r, System.Reflection.Assembly.GetExecutingAssembly())
        rManager.IgnoreCase = True
        Try
            value = rManager.GetString(resourceKey)
            If Not String.IsNullOrEmpty(value) Then
                Exit For
            End If
        Catch
        End Try
    Next
    Return New MvcHtmlString(value)
End Function

Model for localizable attribute values in Core Data Entity?

If I want to create an entity in Core Data that has attributes with values that should be localizable, I'm wondering what the most efficient way would look like.
As an example, let's assume the following structure:
Book
    name (localizable)
    description (localizable)
    author
A localized book entry would look like this:
name:
    "A great novel" (en/international),
    "Ein großartiger Roman" (de),
    "Un grand roman" (fr)
description:
    "Great!" (en/international),
    "Großartig!" (de),
    "Grand!" (fr)
author: "John Smith"
In a SQL/SQLite implementation I would use two tables: a books table containing the book information (author, the English/international name and description) and a localizationBooks table that is related through the primary key of the corresponding book. This second table contains the localized name and description values as well as the languageCode. This would allow a scalable number of localizations.
For fetching the data I would use a
SELECT COALESCE(localizationBooks.name, books.name)
to get the actual values for the given language code. This allows using the international English value as a fallback for unsupported languages.
Would this require a separate entity in Core Data (e.g. BookLocalization) that has a relation to the Book or is there another recommended way of doing this?
The strategy you mention is usually the best. As you suggest, this requires an extra entity for the localized properties.
You can then write a helper method on your Books entity to get the appropriate localization object for a given language and use it like this:
[book bookLocalizationForLanguage:@"de"].name
You could even take it a step further and just add properties like localizedName, localizedDescription on the Books entity which will fetch the appropriate localized value.
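The fallback behaviour itself is independent of Core Data, so here is a small sketch of it in plain Java rather than Objective-C. The class LocalizableBook and its methods are hypothetical and are not Core Data API; the map stands in for the to-many relationship to the localization entity, and localizedName plays the role of the COALESCE from the question.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java illustration of the fallback lookup (hypothetical class, not Core Data API).
final class LocalizableBook {
    // English/international value, used when no localization exists.
    private final String defaultName;
    // Stands in for the to-many relationship to the localization entity.
    private final Map<String, String> namesByLanguage = new HashMap<>();

    LocalizableBook(String defaultName) {
        this.defaultName = defaultName;
    }

    LocalizableBook addLocalization(String languageCode, String name) {
        namesByLanguage.put(languageCode, name);
        return this;
    }

    // Same idea as COALESCE(localizationBooks.name, books.name).
    String localizedName(String languageCode) {
        return namesByLanguage.getOrDefault(languageCode, defaultName);
    }

    public static void main(String[] args) {
        LocalizableBook book = new LocalizableBook("A great novel")
                .addLocalization("de", "Ein großartiger Roman")
                .addLocalization("fr", "Un grand roman");
        System.out.println(book.localizedName("fr"));
        // No Japanese entry exists, so this falls back to the international name.
        System.out.println(book.localizedName("ja"));
    }
}
```

In Core Data the equivalent accessor would live on the book's managed object subclass and look through its localization relationship instead of a map.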
Well, even though this topic is 3 years old... I just stumbled upon it, asking myself the very same thing as the original poster. I found another thread with an answer.
I'll just repeat it here in case somebody else hits this thread (answer from Gordon Hughes):
Good practices for multiple language data in Core Data
To summarize:
Let's say you have an entity Books. Then you will have to make an additional one, called LocalizedBook for example. In Books you have the attribute "title" and in LocalizedBook you have "localizedTitle" and "locale" for international strings like "en_US".
You now have to set the relationship title -> localizedTitle (one to many, as one original title can have multiple translations).
So, every time you fetch "title" you will get the "localizedTitle" for a specific locale, if the relationships are set correctly.

How can I force a complete load along a navigation relationship in Entity Framework?

Okay, so I'm doing my first foray into using the ADO.NET Entity Framework.
My test case right now includes a SQL Server 2008 database with 2 tables, Member and Profile, with a 1:1 relationship.
I then used the Entity Data Model wizard to auto-generate the EDM from the database. It generated a model with the correct association. Now I want to do this:
ObjectQuery<Member> members = entities.Member;
IQueryable<Member> membersQuery = from m in members select m;
foreach (Member m in membersQuery)
{
    Profile p = m.Profile;
    ...
}
Which halfway works. I am able to iterate through all of the Members. But the problem I'm having is that m.Profile is always null. The examples for LINQ to Entities on the MSDN library seem to suggest that I will be able to seamlessly follow the navigation relationships like that, but it doesn't seem to work that way. I found that if I first load the profiles in a separate call somehow, such as using entities.Profile.ToList, then m.Profile will point to a valid Profile.
So my question is, is there an elegant way to force the framework to automatically load the data along the navigation relationships, or do I need to do that explicitly with a join or something else?
Thanks
Okay I managed to find the answer I needed here http://msdn.microsoft.com/en-us/magazine/cc507640.aspx. The following query will make sure that the Profile entity is loaded:
IQueryable<Member> membersQuery = from m in members.Include("Profile") select m;
I used this technique on a one-to-many relationship and it works well. I have a Survey class with many questions that come from a different db table, and using this technique I managed to extract the related questions ...
context.Survey.Include("SurveyQuestion").Where(x => x.Id == id).First()
(context being the generated ObjectContext).
context.Survey.Include<T>().Where(x => x.Id == id).First()
I just spent 10 minutes trying to put together an extension method to do this; the closest I could come up with is ...
public static ObjectQuery<T> Include<T,U>(this ObjectQuery<T> context)
{
    string path = typeof(U).ToString();
    string[] split = path.Split('.');
    return context.Include(split[split.Length - 1]);
}
Any pointers for the improvements would be most welcome :-)
On doing a bit more research I found this ... StackOverflow link, which has a post to Func link that is a lot better than my extension method attempt :-)

Resources