What does the representative mention mean in Coref - stanford-nlp

See link:
https://stanfordnlp.github.io/CoreNLP/faq.html#what-is-the-format-of-the-xml-output-for-coref
In the XML output for coref, each mention can carry an attribute labelled representative, which can be either true or false. What does that mean? Are the nouns true and pronouns false?

You can learn more about how "representativeness" is calculated by looking at edu.stanford.nlp.coref.data.Mention#moreRepresentativeThan.
In short, you want to choose the "best" mention. So for instance "Barack Obama" is a more complete mention than "Obama." How this is calculated is represented in that method.
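As a rough illustration of what choosing the "best" mention could look like, the Python sketch below ranks the mentions of one chain by type (proper name over common noun over pronoun) and then by length. The Mention class, the TYPE_RANK table and the tie-break are assumptions made for illustration only; they are not a transcription of CoreNLP's moreRepresentativeThan, whose actual rules should be read in the source.

```python
from dataclasses import dataclass

# Hypothetical ranking: a higher number means a more informative mention type.
TYPE_RANK = {"PRONOMINAL": 0, "NOMINAL": 1, "PROPER": 2}


@dataclass
class Mention:
    text: str
    mention_type: str  # one of the keys of TYPE_RANK


def representative(chain):
    """Pick one mention of a coreference chain as the representative one."""
    # Compare by type first, then by token count (longer = more complete).
    return max(chain, key=lambda m: (TYPE_RANK[m.mention_type], len(m.text.split())))


chain = [
    Mention("he", "PRONOMINAL"),
    Mention("Obama", "PROPER"),
    Mention("Barack Obama", "PROPER"),
    Mention("the president", "NOMINAL"),
]
best = representative(chain)
# Only the chosen mention is flagged; every other mention is False.
flags = {m.text: (m is best) for m in chain}
print(flags)
```

In this sketch exactly one mention per chain is flagged, so the flag is not simply "nouns true, pronouns false": "Obama" and "the president" are both nouns, and both lose to "Barack Obama".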


Ruby: difference between hexencode and hexdigest

Today I read the documentation on Ruby's hexdigest method, e.g.
Digest::SHA256.hexdigest('123')
=> "a665a45920422f9d417e4867efdc4fb8a04a1f3fff1fa07e998e86f7f7a27ae3"
The documentation says:
Returns the hex-encoded hash value of a given string. This is almost equivalent to Digest.hexencode(Digest::Class.new(*parameters).digest(string)).
The emphasis on "almost" is mine. What does "almost" mean here? How is it different?
Of course my example string above yields the same result:
Digest.hexencode(Digest::SHA256.digest('123'))
=> "a665a45920422f9d417e4867efdc4fb8a04a1f3fff1fa07e998e86f7f7a27ae3"
Can anyone point me to the cases where the result can be different? I want to understand whether the "almost" points to an important difference or if the difference is irrelevant for me.
In the source of the module Digest::Instance, hexdigest(string) ends with return hexencode_str_new(value);. In the module Digest, hexencode(string) ends with return hexencode_str_new(value); too. So there is no difference in the result as long as the same digest class is used. The documentation says "almost" because Digest::Class in its example is a placeholder; the actual class can be Digest::SHA512 or any other.
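The same relationship can be checked outside Ruby. The sketch below uses Python's hashlib and binascii purely as an analogy (it is not the Ruby Digest API): hex-encoding the raw digest in a separate step gives the same string as the one-step hexdigest.

```python
import binascii
import hashlib

data = b"123"

# One-step form: the library hashes and hex-encodes in a single call.
one_step = hashlib.sha256(data).hexdigest()

# Two-step form: take the raw 32-byte digest, then hex-encode it separately.
raw = hashlib.sha256(data).digest()
two_step = binascii.hexlify(raw).decode("ascii")

print(one_step == two_step)
```

Both forms give the value shown in the question for the string '123'.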

Ontologies, OWL, Sparql: Modelling that "something is not there" and performance considerations

We want to model that "something is not there" as opposed to missing information, e.g. an explicit statement that "a patient did not get chemotherapy" or that "a patient does not have dyspnea" is different from missing information about whether a patient has dyspnea.
We thought about several approaches, e.g.
Using a negation class: "No_Dyspnea". But that seems semantically problematic, since what type would that class be? It cannot be a descendant of the "Dyspnea" class.
Using "not there" object properties, e.g. "denies" or "does_not_have" and then an individual of the Dyspnea root class as the object of that triple.
Using blank nodes that describe that the individual belongs to the group of things that do not have dyspnea. E.g.:
dat:PatientW2 a [ rdf:type owl:Class ;
    owl:complementOf [
        rdf:type owl:Restriction ;
        owl:onProperty roo:has_finding ;
        owl:someValuesFrom nci:Dyspnea ;
    ]
] .
We feel like the 3rd option is the most "ontologically correct" way of expressing this. However, when playing around with it we encountered severe performance problems in simple scenarios.
We are using Sesame with an OWLIM-Lite store and imported the NCI thesaurus (280MB, about 80,000 concepts) and another very small ontology into the store and added two individuals, one having that complementOf/restriction class.
The following query took forever to execute and I terminated it after 15 minutes:
SELECT *
WHERE {
    ?s a [ rdf:type owl:Class ;
        owl:complementOf [
            rdf:type owl:Restriction ;
            owl:onProperty roo:has_finding ;
            owl:someValuesFrom nci:Dyspnea ;
        ]
    ] .
} LIMIT 100
Does anybody know why? I would assume that this approach creates a lot of blank nodes and the query engine has to go through the entire NCI thesaurus and compare all blank nodes with this one?
If I put this triple in a separate graph and only query that graph, the query returns the result instantaneously.
To sum up, the two basic questions are:
Is the third approach really the best for modelling "something is not there"?
Is this going to affect query performance?
EDIT 1
We discussed the proposed options. It actually helped us in clarifying what we are really trying to achieve:
We want to be able to state that "Patient has Dyspnea" or "Patient does not have Dyspnea" at a particular point in time.
In the future there may/will be more information about that patient, e.g. that he/she now has dyspnea.
We want to be able to write SPARQL queries that ask for "all patients that have dyspnea" and "all patients that do not have dyspnea".
We want to keep the SPARQL as simple and intuitive as possible, e.g. by using only one property "has_finding" rather than having to know about two properties (one for "has_exclusion") or about some complex blank node construct.
We played around with options:
Negative Property Assertions: This sounded like the best solution to this problem, since we are stating that one individual is not related to another individual on that property. The issues are that we have to create an individual of Dyspnea for the sake of having something as owl:targetIndividual, and that we cannot find a way of querying the negative assertion easily other than going through the whole owl:sourceIndividual and owl:targetIndividual chain. That makes the SPARQL quite lengthy and puts a burden on the person writing the query to know about it.
Blank node with complementOf: We would be stating something with this that we do not want to state. This would state that "Patient1 can never have a finding of dyspnea", whereas we want to state that "Patient1 does not have a dyspnea finding now (or at date X)". So we should not use this approach.
Using Exclusion/Inclusion types (Options 1 and 2): After a closer look at Jeen's suggestion, we believe that using general :Exclusion and :Inclusion classes along with only one property has_finding, and giving the dyspnea individual the inclusion/exclusion type, is the easiest to understand and query, and provides enough reasoning abilities. Example:
:Patient1 a :Patient .
:Dyspnea1 a :Dyspnea .
:Dyspnea1 a :Exclusion.
:Patient1 ex:has_finding :Dyspnea1 .
That way, the person writing the SPARQL query only has to know that:
There is one property has_finding, which represents the intention properly, since "No dyspnea" is technically a finding as well.
Querying on has_finding alone will not give sufficient information about whether the person actually has the finding or not. The query also needs to contain a triple about whether the dyspnea individual is an :Exclusion (or an :Inclusion, depending on the goal of the query).
While this puts some additional burden on the query writer, it is less than with negative property assertions and easier to understand.
We would really appreciate some feedback on these conclusions!
If your diseases are represented as individuals, then you can use negative object property assertions to literally say, e.g.,
¬hasFinding(john,Dyspnea)
NegativeObjectPropertyAssertion(hasFinding john Dyspnea)
Of course, if you have lots of things that aren't the case, then this might get a bit involved. It's probably the most semantically correct, though. It also means that your query could match directly against the data in the ontology, which might make for quicker results. (Of course, you'd still have the issues of trying to infer when the negative object property holds.)
This doesn't work if diseases are represented as classes, though. If diseases are represented by classes, then you can use class expressions, similar to what you propose. E.g.,
(∀ hasFinding.¬Dyspnea)(john)
ClassAssertion(ObjectAllValuesFrom(hasFinding ObjectComplementOf(Dyspnea)) john)
This is similar to your third option, but I wonder if it might perform better. It seems like a slightly more direct way of saying what you're trying to say (i.e., if someone has a disease, it's not one of these diseases).
I do agree with Jeen's answer, though; there's a lot of subjectivity here, and a great deal of getting it "right" is actually just a matter of finding something that's reasonable to work with, performs well enough for you, and that seems not entirely unnatural.
With respect to the modeling question, I'd like to offer a fourth alternative, which is, in fact, a mix of your options 1 and 2: introduce a separate class (hierarchy) for these 'excluded/missing' symptoms, diseases or treatments, and have the specific exclusions as instances:
:Exclusion a owl:Class .
:ExcludedSymptom rdfs:subClassOf :Exclusion .
:ExcludedTreatment rdfs:subClassOf :Exclusion .
:excludedDyspnea a :ExcludedSymptom .
:excludedChemo a :ExcludedTreatment .
:Patient a owl:Class ;
owl:equivalentClass [ a owl:Restriction ;
owl:onProperty :excluded ;
owl:allValuesFrom :Exclusion ] .
# john is a patient without Dyspnea
:john a :Patient ;
:excluded :excludedDyspnea .
Optionally, you can link the exclusion instances semantically with the treatment/symptom/diseases:
:excludedDyspnea :ofSymptom :Dyspnea .
In my view, this is just as "ontologically correct" (this kind of thing is quite subjective to be honest) as your other options, and possibly a lot easier to maintain, query, and indeed reason with.
As for your second question: while I can't speak for the behavior of the particular reasoner you're using, in general any construction involving complementOf is computationally very heavy, but perhaps more importantly, it probably does not capture what you intend.
OWL has an open world assumption, which (in broad terms) means that we cannot decide a certain fact is untrue simply because that fact is currently unknown. Your complementOf construction will logically be an empty class, because for any individual X, even if we currently do not know that X has been diagnosed with Dyspnea, there is a possibility that in the future that fact will become known, and therefore X will not be in the complement class.
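The difference between "explicitly absent" and "not known" can be sketched in a few lines of Python. This is only an illustration of the open world idea, not of how any OWL reasoner works; the patient identifiers and the two sets of statements are hypothetical.

```python
from enum import Enum


class Status(Enum):
    PRESENT = "present"
    ABSENT = "absent"
    UNKNOWN = "unknown"


# Explicit statements only; anything not listed here is simply not known.
has_finding = {("patient1", "Dyspnea")}
lacks_finding = {("patient2", "Dyspnea")}


def status(patient, finding):
    """Open world reading: three possible answers."""
    if (patient, finding) in has_finding:
        return Status.PRESENT
    if (patient, finding) in lacks_finding:
        return Status.ABSENT
    return Status.UNKNOWN  # no statement is not the same as a negative statement


def closed_world_absent(patient, finding):
    """Closed world reading, for contrast: missing data counts as 'no'."""
    return (patient, finding) not in has_finding


print(status("patient3", "Dyspnea"), closed_world_absent("patient3", "Dyspnea"))
```

For a patient with no statement at all, the open world reading answers "unknown", while the closed world reading would wrongly report a confirmed absence.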
EDIT
In response to your edit, with the proposal using a single :hasFinding property, I think that generally looks good, though I would perhaps modify it slightly:
:patient1 a :Patient;
:hasFinding :dyspneaFinding1 .
:dyspneaFinding1 a :Finding ;
:of :Dyspnea ;
:conclusion false .
You have now separated the 'finding' as a concept a bit more cleanly from the symptom/treatment that it is a finding of. Also, whether the finding is positive or negative is explicitly modeled (rather than implied by the presence/absence of an 'excluded' property or an 'Exclusion' type).
(As an aside: since we link an individual with a class here via a non-typing relation (... :of :Dyspnea) we must rely on OWL 2 punning to make this valid in OWL DL)
To query for a patient with a finding (whether positive or negative) about Dyspnea:
SELECT ?x
WHERE {
?x a :Patient;
:hasFinding [ :of :Dyspnea ] .
}
And to query for patients with confirmed absence of Dyspnea:
SELECT ?x
WHERE {
?x a :Patient;
:hasFinding [ :of :Dyspnea ;
:conclusion false ] .
}
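For readers who want to see the shape of this model outside a triple store, here is a minimal Python sketch of the same idea: a finding is its own record with an explicit conclusion, and both queries become simple filters. The Finding class, the sample data and patients_with_finding are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    patient: str
    of: str           # the symptom or treatment the finding is about
    conclusion: bool  # True = confirmed present, False = confirmed absent


findings = [
    Finding("patient1", "Dyspnea", False),
    Finding("patient2", "Dyspnea", True),
    Finding("patient3", "Chemotherapy", False),
]


def patients_with_finding(of, conclusion=None):
    """Patients with a finding about `of`; optionally filter by conclusion."""
    return sorted({
        f.patient for f in findings
        if f.of == of and (conclusion is None or f.conclusion == conclusion)
    })


print(patients_with_finding("Dyspnea"))         # any finding about Dyspnea
print(patients_with_finding("Dyspnea", False))  # confirmed absence only
```

A patient with no Dyspnea finding at all (patient3 here) appears in neither result, which keeps "not recorded" distinct from "confirmed absent".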

Describe a film (entity and attribute) using the first order logic

Good morning,
I want to understand how I can describe something using first-order logic.
For example, I want to describe what a film (an entity) is and what an attribute of the film is (for example actor: Clooney). How can I describe that using first-order logic?
******* UPDATE ********
What I need to explain in first-order logic is:
ENTITY: an element, an abstraction or an object that can be described with a set of properties or attributes. So I think that I must say that the entity has a set of attributes with their respective values. An entity describes an element, an abstraction or an object.
ATTRIBUTE: an attribute always has a value and is always associated with an entity. It describes a specific feature/property of the entity.
DOCUMENT: a pure text description (pure text, i.e. it does not contain any HTML tags). Every document describes only ONE entity through its attributes.
To state that an object has a certain property you would use a one-place predicate. For example, to state that x is a film you could write Film(x). If you want to attribute some value to an object you can use a predicate with two (or more) places. Using your example, you could say that Clooney starred in a film as Starred(clooney, x).
There are certain conventions that people use. For example, predicates start with capital letters (Actor, Film, FatherOf) and constants start with a lower case letter (x, clooney, batman). Constants denote objects and predicates say something about the objects. In case of predicates with more than one argument the first argument is usually the subject about which you are making the statement. That way you can naturally read the logical formula as a sentence in normal language. For example, FatherOf(x, y) would read as "x is the father of y".
Answer for the update:
I am not sure whether you can do that in first order logic. You could describe an Entity as something that has certain properties by formula such as
\forall x (Entity(x) ==> Object(x) | Element(x) | Abstraction(x))
This is a bit more difficult for the Attribute. In first order logic an attribute ascribes some quality to an object or relates it to another object. You could probably use a three place predicate as in:
\forall attribute (\exists object (\exists value (Has(object, attribute, value))))
As to the document, that would be just a conjunction of such statements. For example, the description of George Clooney could be the following:
Entity(clooney) & Has(clooney, starred, gravity) & Has(clooney, bornIn, lexington) & ...
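To make the three-place Has(object, attribute, value) idea concrete, here is a small Python sketch in which each tuple stands for one atomic formula and a "document" is the conjunction of the facts about one entity. The has and describe helpers are hypothetical names, and the ("gravity", "type", "film") fact is an invented example.

```python
# Each tuple plays the role of one atomic formula Has(object, attribute, value).
facts = {
    ("clooney", "starred", "gravity"),
    ("clooney", "bornIn", "lexington"),
    ("gravity", "type", "film"),  # invented fact, for illustration only
}


def has(obj, attribute, value):
    """Truth value of the atomic formula Has(obj, attribute, value)."""
    return (obj, attribute, value) in facts


def describe(entity):
    """The 'document' for an entity: the conjunction of its Has(...) facts."""
    atoms = [f"Entity({entity})"]
    atoms += [f"Has({o}, {a}, {v})" for (o, a, v) in sorted(facts) if o == entity]
    return " & ".join(atoms)


print(describe("clooney"))
```

Here has behaves like a predicate (it yields true or false), while describe only renders the known facts as a formula string.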
The typical way to do this is to explain that a specific object exists and this object has certain attributes. For example:
(∃x)(property1(x) & property2(x) & ~property3(x))
aka: There exists a thing that satisfies properties 1 and 2 but does not satisfy property 3.
Your current question formulation makes it unclear as to what you mean by attributes and documents. Perhaps towards your idea of attributes: it's possible to describe as the domain of property1 all the entities that satisfy it; so, for example, the domain of blue is all blue objects.
First-order logic has nothing to do with HTML -- are you trying to use HTML to represent an entity in first-order logic somehow? It remains incredibly unclear what your question is.

In Prolog (SWI), how to build a knowledge base of user supplied pairs and assert to be equal

I am very new to Prolog and trying to learn.
For my program, I would like to have the user provide pairs of strings which are "types of".
For example, user provides at command line the strings "john" and "man". These atoms would be made to be equal, i.e. john(man).
At next prompt, then user provides "man" and "tall", again program asserts these are valid, man(tall).
Then the user could query the program and ask "Is john tall?". Or in Prolog: john(tall) becomes true by transitive property.
I have been able to parse the strings from the user's input and assign them to variables Subject and Object.
I tried a clause (where Subject and Object are different strings):
attribute(Subject, Object) :-
assert(term_to_atom(_ , Subject),
term_to_atom(_ , Object)).
I want to assert the fact that Subject and Object are a valid pair. If the user asserts it, then they belong together. How do I force this equality of the pairs?
What's the best way to go about this?
Questions of this sort have been asked a lot recently (I guess your professors all share notes or something) so a browse through recent history might have been productive for you. This one comes to mind, for instance.
Your code is pretty wide of the mark. This is what you're trying to do:
attribute(Subject, Object) :-
    Fact =.. [Object, Subject],
    assertz(Fact).
Using it works like this:
?- attribute(man, tall).
true.
?- tall(X).
X = man.
So, here's what you should notice about this code:
We're using =../2, the "univ" operator, to build structures from lists. This is the only way to create a fact from some atoms.
I've swapped subject and object, because doing it the other way is almost certainly not what you want.
The predicate you want is assertz/1 or asserta/1, not assert/2. The a and z on the end just tell Prolog whether you want the fact at the beginning or end of the database.
Based on looking at your code, I think you have a lot of baggage you need to shed to become productive with Prolog.
Prolog predicates do not return values. So assert(term_to_atom(... wasn't even on the right track, because you seemed to think that term_to_atom would "return" a value and it would get substituted into the assert call like in a functional or imperative language. Prolog just plain works completely differently from that.
I'm not sure why you have an empty variable in your term_to_atom predicates. I think you did that to satisfy the predicate's arity, but this predicate is pretty useless unless you have one ground term and one variable.
There is an assert/2, but it doesn't do what you want. It should be clear why assert normally only takes one argument.
Prolog facts should look like property(subject...). It is not easy to construct facts and then query them, which is what you'd have to do using man(tall). What you want to say is that there is a property, being tall, and man satisfies it.
I would strongly recommend you back up and go through some basic Prolog tutorials at this point. If you try to press forward you're only going to get more lost.
Edit: In response to your comment, I'm not sure how general you want to go. In the basic case where you're dealing with a 4-item list with [is,a] in the middle, this is sufficient:
build_fact([Subject,is,a,Object], is_a(Subject, Object)).
If you want to isolate the first and last and create the fact, you have to use univ again:
build_fact([Subject|Rest], Fact) :-
    append(PredicateAtoms, [Object], Rest),
    atomic_list_concat(PredicateAtoms, '_', Predicate),
    Fact =.. [Predicate, Subject, Object].
Not sure if you want to live with the articles ("a", "the") that will wind up on the end though:
?- build_fact([john,could,be,a,man], Fact).
Fact = could_be_a(john, man)
Don't use variable fact heads. Prolog works best when the set of term names is fixed. Instead, make a generic place for storing properties using a predefined, static term name, e.g.:
is_a(john, man).
property(man, tall).
property(john, thin).
(think SQL tables in a normal form). Then you can use simple assertz/1 to update the database:
add_property(X, Y) :- assertz(property(X, Y)).
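The fixed-name layout also makes the "Is john tall?" question from the original post straightforward. The Python sketch below mirrors the is_a/property tables and answers the question by following is_a links. The rule that a property is inherited through is_a is an assumption about what the asker wants, and has_property is a hypothetical helper that appears in none of the answers.

```python
is_a = set()  # pairs (subject, category), like is_a(john, man).
prop = set()  # pairs (subject, property), like property(man, tall).


def add_is_a(subject, category):
    is_a.add((subject, category))


def add_property(subject, value):
    prop.add((subject, value))


def has_property(subject, value, _seen=None):
    """True if subject has the property directly or via a chain of is_a links."""
    seen = _seen if _seen is not None else set()
    if subject in seen:  # guard against cycles such as a is_a b, b is_a a
        return False
    seen.add(subject)
    if (subject, value) in prop:
        return True
    return any(has_property(parent, value, seen)
               for (child, parent) in is_a if child == subject)


add_is_a("john", "man")
add_property("man", "tall")
add_property("john", "thin")
print(has_property("john", "tall"))
```

The property flows only downward along is_a: john inherits "tall" from man, but man does not become "thin" because john is.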

Predicate vs Functions in First order logic

I have been confused lately about the difference between a predicate and a function in first-order logic.
My understanding so far is:
A predicate shows a comparison or a relation between two objects, such as
President(Obama, America)
A function specifies what a particular object is, such as
Human(Obama)
Am I on the right track in differentiating these two terms, or am I completely wrong and in need of a brief explanation? I would like an expert opinion to correct or confirm my understanding. Thanks in advance.
Krio
A predicate is a function that returns true or false.
• Function symbols, which map individuals to individuals
  – father-of(Mary) = John
  – color-of(Sky) = Blue
• Predicate symbols, which map individuals to truth values
  – greater(5,3)
  – green(Grass)
  – color(Grass, Green)
From what I understand:
A function returns a value that is in the domain, mapping n elements to a single member of the domain.
A predicate states whether the relation you are describing is true or not, according to the axioms and inference rules of your system.
A predicate confirms that a particular property holds for an object, or that a relation holds between objects. If you are given a predicate P for "is president of", then
P(Obama,America)=true.
tells you that the relation of Obama being president of America holds, but
P(Putin,America)=false.
tells you that Putin being America's president is false. A predicate thus tells you whether an object or objects do or do not have a particular property or relation.
A function, by contrast, returns the value associated with a specific property of an object, such as America's president or Ann's mother. You give it a value and it returns a value. For example, let P be a function that returns the president of the country passed as its argument:
P(America)=Obama.
P(Russia)=Putin.
Functions are relations in which there is only one value for a given input.
Source: AIMA (Artificial Intelligence: A Modern Approach).
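The distinction can be sketched in Python, where a function symbol becomes a function returning an individual and a predicate symbol becomes a function returning a truth value. The names president_of, president and human are hypothetical, and the lookup table simply restates the examples above.

```python
# Function symbol: maps an individual to exactly one individual.
PRESIDENT_OF = {"America": "Obama", "Russia": "Putin"}


def president_of(country):
    return PRESIDENT_OF[country]


# Predicate symbol: maps individuals to a truth value.
def president(person, country):
    return PRESIDENT_OF.get(country) == person


# A one-place predicate is still a predicate: it yields true or false.
def human(x):
    return x in {"Obama", "Putin"}


print(president_of("America"), president("Putin", "America"), human("Obama"))
```

Read this way, Human(Obama) from the question is a predicate rather than a function, because its result is a truth value and not another individual.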
