Avoid data redundancy in Prolog - prolog

I'm tinkering with Prolog and ran into the following problem.
Assume I want to create a little knowledge base about the courses of a university.
I need the following two relation schemes:
relation scheme for lecturer: lecturer(Name,Surname)
relation scheme for course: course(Topic,Lecturer,Date,Location).
I have a lecturer, John Doe:
lecturer(doe,john).
John Doe teaches the complexity class:
course(complexity,lecturer(doe,john),monday,roomA).
Now I have a redundancy in the information - not good!
Is there any way to achieve something like this:
l1 = lecturer(doe,john).
course(complexity,l1,monday,roomA).
Many thanks in advance!

The same normalization possibilities as in databases apply:
id_firstname_surname(1, john, doe).
and:
course_day_room_lecturer(complexity, monday, 'A', 1).
That is, we have introduced a unique ID for each lecturer, and use that to refer to the person.

Related

Firebase many to many performance

I'm wondering about the performance of Firebase when making n + 1 queries. Let's consider the example in this article https://www.firebase.com/blog/2013-04-12-denormalizing-is-normal.html where a link has many comments. If I want to get all of the comments for a link I have to:
Make 1 query to get the index of comments under the link
For each comment ID, make a query to get that comment.
Here's the sample code from that article that fetches all comments belonging to a link:
var commentsRef = new Firebase("https://awesome.firebaseio-demo.com/comments");
var linkRef = new Firebase("https://awesome.firebaseio-demo.com/links");
var linkCommentsRef = linkRef.child(LINK_ID).child("comments");
linkCommentsRef.on("child_added", function(snap) {
  commentsRef.child(snap.key()).once("value", function() {
    // Render the comment on the link page.
  });
});
I'm wondering whether this is a performance concern compared to the equivalent query in a SQL database, where I could make a single query on comments: SELECT * FROM comments WHERE link_id = LINK_ID.
Imagine I have a link with 1000 comments. In SQL this would be a single query, but in Firebase this would be 1001 queries. Should I be worried about the performance of this?
One thing to keep in mind is that Firebase works over web sockets (where available), so while there may be 1001 round trips, only one connection needs to be established. Also, a lot of the round trips will happen in parallel. So you might be surprised at how little time this takes.
Should I worry about this?
In general people over-estimate the amount of use they'll get. So (again: in general) I recommend that you don't worry about it until you actually have that many comments. But from day 1, ensure that nothing you do today precludes optimizing later.
One way to optimize is to further denormalize your data. If you already know that you need all comments every time you render an article, you can also consider duplicating the comments into the article.
A fairly common scenario:
/users
  twitter:4784
    name: "Frank van Puffelen"
    otherData: ....
/messages
  -J4377684
    text: "Hello world"
    uid: "twitter:4784"
    name: "Frank van Puffelen"
  -J4377964
    text: "Welcome to StackOverflow"
    uid: "twitter:4784"
    name: "Frank van Puffelen"
So in the above data snippet I store both the user's uid and their name for every message. While I could look up the name from the uid, having the name in the messages means I can display the messages without the lookup. I'm also keeping the uid, so that I can provide a link to the user's profile page (or other messages).
We recently had a good question about this, where I wrote more about the approaches I consider for keeping the derived data up to date: How to write denormalized data in Firebase
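The trade-off can be sketched in Python, with plain dictionaries standing in for the database tree. This is not the Firebase API; the helper names `render_messages` and `rename_user` are hypothetical and only illustrate that reads need no lookup while a rename has to touch every duplicated copy.

```python
# Plain dicts stand in for the /users and /messages trees shown above.
db = {
    "users": {
        "twitter:4784": {"name": "Frank van Puffelen"},
    },
    "messages": {
        "-J4377684": {"text": "Hello world", "uid": "twitter:4784",
                      "name": "Frank van Puffelen"},
        "-J4377964": {"text": "Welcome to StackOverflow", "uid": "twitter:4784",
                      "name": "Frank van Puffelen"},
    },
}


def render_messages(db):
    """Display every message using only /messages: no per-user lookup."""
    return [f'{m["name"]}: {m["text"]}' for m in db["messages"].values()]


def rename_user(db, uid, new_name):
    """Hypothetical helper: update the user and every duplicated copy of the name."""
    db["users"][uid]["name"] = new_name
    for message in db["messages"].values():
        if message["uid"] == uid:
            message["name"] = new_name
```

Rendering stays a single pass over the messages, while the cost of the duplication shows up only when a name changes.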

Defining a flexible structure in Prolog

Well, I'm a bit new to Prolog, so my question is on Prolog pattern/logic.
I have a relationship called tablet. It has many parameters, such as name, operatingSystem, ramCapacity, etc. I have many objects/predicates of this relationship, like
tablet(
    name("tablet1"),
    operatingSystem("ios"),
    ramCapacity(1024),
    screen(
        type("IPS"),
        resolution(1024,2048)
    )
).
tablet(
    name("tablet2"),
    operatingSystem("android"),
    ramCapacity(2048),
    screen(
        type("IPS"),
        resolution(1024,2048),
        protected(yes)
    ),
    isSupported(yes)
).
There are other similar relationships, BUT with different numbers of parameters. Some objects do not need certain attributes, or I created some tablets and later added one more field that I only use in new tablets.
There are several requirements:
I need to use the most flexible structure possible in Prolog. Some of the tablets have attributes/innerPredicates and some do not, but they are all tablets.
I need to access data in the easiest way, for example find all tablets that have ramCapacity(1024), excluding those that do not have this attribute.
I need to change some attributes' values in the easiest way. For example, a query that changes ramCapacity to 2048 for the tablet that has the name "tablet1".
If it's possible it should be pretty to read in a word editor :)
Is this structure flexible? Should I use another one? Do I need additional rules to manipulate this structure? Is this structure easy to change with a query? (I keep this structure in a file.)
Since the number of attributes is not fixed and needs to be so flexible, consider representing these items as option lists, like this:
tablet([name=tablet1,
        operating_system=ios,
        ram_capacity=1024,
        screen=screen([type="IPS",
                       resolution = res(1024,2048)])]).
tablet([name=tablet2,
        operating_system=android,
        ram_capacity=2048,
        screen=screen([type="IPS",
                       resolution = res(1024,2048)]),
        is_supported=yes]).
You can easily query and arbitrarily extend such lists. Example:
?- tablet(Ts), member(name=tablet2, Ts).
Ts = [name=tablet2, operating_system=android, ram_capacity=2048, screen=screen([type="IPS", resolution=res(..., ...)]), is_supported=yes] ;
false.
Notice also the common Prolog naming_convention_to_use_underscores_for_readability instead of mixingCasesAndMakingEverythingExtremelyHardToRead.

How to limit the number of retrieved characters from a database field in rails?

Consider a passage (~400 characters) in a database table(text).
Like
There is only one more week to Easter. I have already started my
holiday. The idea of visiting my uncle during this Easter is
wonderful. His farm is in this village down in Cornwall. This village
is very peaceful and beautiful. I have asked my aunt if I can bring
Sam, my dog, with me. I promise her I will keep him under control. He
attacked and he ate some animals from her farm in October. But he is
part of the family and I cannot leave him behind.
But I need to retrieve only a limited number of characters from it, about 150:
There is only one more week to Easter. I have already started my
holiday. The idea of visiting my uncle during this Easter is
wonderful. His farm is in this village down in Cornwall. This village
is very peaceful...
Is there a function in Rails that produces this output, or only the truncate(:limit,:option{}) function?
Assuming you have a model Passage with a field text, you can select specific fields (and use SQL functions within the select) like this:
passages = Passage.select("id, LEFT(text,10) as text_short, CHAR_LENGTH(text) as text_length")
# => [#<Passage id: 1>, #<Passage id: 2>, #<Passage id: 3>]
passages.first.id
# => 1
passages.first.text_short
# => "There is o"
passages.first.text_length
# => 453
Why not get the whole string and only use the first 150 characters? I doubt it will slow things down much at all.
somehow_access_string[0...150] + '...'

Prolog list issue

I have the following rules:
/*The structure of a subject teaching team takes the form:
team(Subject, Leader, Non_management_staff, Deputy).
Non_management_staff is a (possibly empty) list of teacher
structures and excludes the teacher structures for Leader and
Deputy.
teacher structures take the form:
teacher(Surname, Initial,
profile(Years_teaching,Second_subject,Club_supervision)).
Assume that each teacher has his or her team's Subject as their
main subject.*/
team(computer_science,teacher(may,j,profile(20,ict,model_railways)),
[teacher(clarke,j,profile(32,ict,car_maintenance))],
teacher(hamm,p,profile(11,ict,science_club))).
team(maths,teacher(vorderly,c,profile(25,computer_science,chess)),
[teacher(o_connell,d,profile(10,music,orchestra)),
teacher(brankin,p,profile(20,home_economics,cookery_club))],
teacher(lynas,d,profile(10,pe,football))).
team(english,teacher(brewster,f,profile(30,french,french_society)),
[ ],
teacher(flaxman,j,profile(35,drama,debating_society))).
team(art,teacher(lawless,m,profile(20,english,film_club)),
[teacher(walker,k,profile(25,english,debating_society)),
teacher(brankin,i,profile(20,home_economics,writing)),
teacher(boyson,r,profile(30,english,writing))],
teacher(carthy,m,profile(20,music,orchestra))).
I am supposed to bring back the initial and surname of any leader in a team that contains a total of 2 or more teachers with ict as their second subject.
I am new to Prolog so I am unsure about this. Also, I get the correct results back, but they are returned 3 times.
Any help on this would be greatly appreciated.
Also, my apologies if this is terribly easy.
You didn't provide the code you use to find these teachers, so I can't say for sure, but if there were a team with 3 members w/ ict as their second subject (for example, computer_science), then there would be 3 ways to find 2 (AB, AC, and BC), which would explain your multiple results. But saying how to modify your code to fix that would require seeing the code to be fixed.
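The counting argument can be checked with a short Python sketch (Python is used here only for illustration; the query itself is assumed, since the original code was not posted). The names come from the computer_science team in the question, where the leader, the single non-management teacher and the deputy all have ict as their second subject.

```python
from itertools import combinations

# Teachers in the computer_science team with ict as their second subject.
ict_teachers = ["may", "clarke", "hamm"]

# "Find any two ict teachers" succeeds once per unordered pair,
# so a team with three such teachers yields three solutions.
pairs = list(combinations(ict_teachers, 2))
print(pairs)  # [('may', 'clarke'), ('may', 'hamm'), ('clarke', 'hamm')]

# Counting the ict teachers and comparing once gives a single answer per team.
qualifies = len(ict_teachers) >= 2
```

If the duplicates do come from pair selection, counting the matching teachers once per team (rather than choosing two of them) would return each leader only once.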

Algorithm for grouping names

What's a good way to group this list of names:
Doctor Watson.
Dr. John Watson.
Dr. J Watson.
Watson.
J Watson.
Sherlock.
Mr. Holmes.
S Holmes.
Holmes.
Sherlock Holmes.
Into a grouped list of unique and complete names:
Dr. John Watson.
Mr. Sherlock Holmes.
Also interesting:
Mr Watson
Watson
Mrs Watson
Watson
John Watson
Since the algorithm doesn't need to make inferences about whether the first Watson is a Mr (likely) or Mrs but only group them uniquely, the only problem here is that John Watson obviously belongs to Mr and not Mrs Watson. Without a dictionary of given names for each gender, this can't be deduced.
So far I've thought of iterating through the list and checking each item with the remaining items. At each match, you group and start from the beginning again, and on the first pass where no grouping occurs you stop.
Here's some rough (and still untested) Python. You'd call it with a list of names.
def groupedNames(ns):
    # areMatchingNames() and groupNames() are helpers that still have to be written.
    if len(ns) > 1:
        # First item is query, rest are target names to try matching
        q = ns[0]
        # For storing unmatched names, passed on later
        unmatched = []
        for i in range(1, len(ns)):
            t = ns[i]
            if areMatchingNames(q, t):
                # groupNames() groups two names into one, retaining all info
                return groupedNames([groupNames(q, t)] + unmatched + ns[i+1:])
            else:
                unmatched.append(t)
        # q matched nothing: keep it and go on grouping the remaining names
        return [q] + groupedNames(unmatched)
    # When matching is finished
    return ns
If your names are always of the form [honorific][first name or initial]LastName, then you can start by extracting and sorting by the last name. If some names have the form LastName[,[honorific][first name or initial]], you can parse them and convert to the first form. Or, you might want to convert everything to some other form.
In any case, you put the names into some canonical form and then sort by last name. Your problem is greatly reduced. You can then sort by first name and honorific within a last name group and then go sequentially through them to extract the complete names from the fragments.
As you noted, there are some ambiguities that you'll have to resolve. For example, you might have:
John Watson
Jane Watson
Dr. J. Watson
There's not enough information to say which of the two (if either!) is the doctor. And, as you pointed out, without information about the gender of names, you can't resolve Mr. J. Watson or Mrs. J. Watson.
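The canonicalize-then-sort step described above could look roughly like this in Python. It is only a sketch: the honorific list and the function names `canonical` and `group_by_last_name` are assumptions, and it relies on names having the form [honorific][first name or initial]LastName, so a lone first name such as "Sherlock" would be filed as a last name.

```python
from itertools import groupby

HONORIFICS = {"dr", "doctor", "mr", "mrs"}  # assumed list, extend as needed


def canonical(name):
    """Split 'Dr. John Watson.' into (last, first, honorific); missing parts are ''."""
    words = [w.strip(".").lower() for w in name.split()]
    words = [w for w in words if w]
    honorific = words.pop(0) if words and words[0] in HONORIFICS else ""
    last = words.pop() if words else ""
    first = " ".join(words)
    return (last, first, honorific)


def group_by_last_name(names):
    """Sort canonical names, then collect them into one group per last name."""
    keyed = sorted(canonical(n) for n in names)
    return {last: list(group) for last, group in groupby(keyed, key=lambda k: k[0])}
```

Within each group the entries are already ordered by first name and honorific, so the fragments can be walked sequentially and merged into the most complete name, subject to the ambiguities noted above.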
I suggest using hashing here.
Define a hash function that interprets each word of the name as a base-26 number, where a = 0 and z = 25.
Now just hash the individual words. So
h(sherlock holmes) = h(sherlock) + h(holmes) = h(holmes) + h(sherlock).
Using this you can easily identify names like:
John Watson and Watson John
For ambiguities like Dr. John Watson and Mr John Watson you can define the hash value for Mr and Dr to be the same.
To resolve conflicts like J. Watson and John Watson, you can hash just the first letter and the last name. You can extend the idea for similar conflicts.
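A minimal Python sketch of this hashing idea follows. The set of titles, the choice to hash every title to 0, and the function names are assumptions made for illustration; `initial_key` also assumes the name contains at least one word besides the title.

```python
TITLES = {"dr", "doctor", "mr", "mrs"}  # assumed set of honorifics


def word_hash(word):
    """Read a word as a base-26 number with a = 0 ... z = 25; other characters are skipped."""
    value = 0
    for ch in word.lower():
        if "a" <= ch <= "z":
            value = value * 26 + (ord(ch) - ord("a"))
    return value


def name_hash(name):
    """Sum the word hashes so that word order does not matter; every title hashes to 0."""
    words = [w.strip(".").lower() for w in name.split()]
    return sum(0 if w in TITLES else word_hash(w) for w in words)


def initial_key(name):
    """Hash only the first initial and the last name, ignoring titles."""
    words = [w.strip(".").lower() for w in name.split()]
    words = [w for w in words if w and w not in TITLES]
    return word_hash(words[0][0]) + word_hash(words[-1])
```

Because the word hashes are simply added, different names can produce the same sum, so equal hashes are best treated as candidate matches to be confirmed by comparing the names themselves.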
