I'm trying to figure out how patient search works on Epic's FHIR API.
I'm testing everything against the sandbox here: https://fhir.epic.com/Documentation?docId=testpatients.
The docs:
Starting in May 2019, Patient.Search requests require one of the following minimum data sets by default in order to match and return a patient record:
FHIR ID
{IDType}|{ID}
SSN identifier
Given name, family name, and birthdate
Given name, family name, legal sex, and phone number/email
This is working correctly (returning one patient):
/api/FHIR/R4/Patient?family=Lin&given=Derrick&birthdate=1973-06-03
But this also returns the same record (extra character in the family name, wrong gender):
/api/FHIR/R4/Patient?family=Lina&given=Derrick&birthdate=1973-06-03&gender=female
And this one also returns one record (extra character in the family name, no given name):
/api/FHIR/R4/Patient?family=Lina&birthdate=1973-06-03
Am I doing something wrong, or is this expected behaviour?
There is a bunch of history here, but Epic's current Patient.search behaves more like Patient.$match. Specifically, the criteria provided to Patient.search are combined using (approximately) OR logic rather than AND logic. Behind the scenes it is actually more of a weighted score, but ultimately, the more criteria you provide, the more possible results you might get. This is often counterintuitive if you are used to how REST API query params normally work. Technically it is spec-legal, though, since FHIR explicitly allows servers to return additional results they deem appropriate:
https://build.fhir.org/search.html#Introduction
However, the server has the prerogative to return additional search results if it believes them to be relevant.
We don't have any specific updates right now, but there may be changes coming Soon(tm).
I'm surprised the last one returns any results, but as for the first two searches, this is quite possible, even expected, with Epic. Epic has special logic in the background that evaluates the parameter values you pass in against certain criteria, such as whether the name matches exactly, the name is similar, the birthdate matches exactly, and so on. As a result, the Patient.Search API often returns not only exact matches but also similar ones. The weighting of the criteria is customizable per Epic customer, so some may have stricter logic than others.
I'd recommend always validating the returned result against your input parameters to verify you are working with an exact match.
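As a concrete illustration of that validation step, here is a minimal sketch in Python. It assumes OAuth is already handled; base_url and token are placeholders, and the exact-match rules (case-insensitive name comparison plus an exact birthDate) are my own choice for the example, not anything Epic specifies:

import requests

# Re-check each returned Patient against the search inputs, since the
# server may include similar (not exact) matches in the bundle.
def exact_matches(base_url, token, family, given, birthdate):
    resp = requests.get(
        f"{base_url}/api/FHIR/R4/Patient",
        params={"family": family, "given": given, "birthdate": birthdate},
        headers={"Authorization": f"Bearer {token}",
                 "Accept": "application/fhir+json"},
    )
    resp.raise_for_status()
    matches = []
    for entry in resp.json().get("entry", []):
        patient = entry["resource"]
        names = patient.get("name", [])
        family_ok = any(n.get("family", "").lower() == family.lower()
                        for n in names)
        given_ok = any(given.lower() in (g.lower() for g in n.get("given", []))
                       for n in names)
        if family_ok and given_ok and patient.get("birthDate") == birthdate:
            matches.append(patient)
    return matches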
I am trying to map an existing domain into HL7 FHIR.
So far it was pretty easy to find FHIR resources that more or less represent the same data and can be used for that purpose. But now I am running into a problem that I am not sure how to solve.
The existing domain allows data to be anonymized depending on the user's access level: e.g. a patient's name or address might be removed and marked as anonymized. Other data will be pseudonymised; for example, a birthdate in 1980 will be replaced with 01.01.1980, and an age of 37 will be replaced with a category of 30-40.
So I am unsure how to integrate that into the FHIR domain. I was thinking I could create an extension holding a boolean, indicating whether a value was anonymized, and always replace or remove the original value. This might work, but I will run into big problems when the anonymized value is of a different type than the original value (e.g. an age is replaced by a range of values).
Is that even a valid approach? I thought this might be a common problem, but I could not find any examples where people described methods of marking data as altered. Unfortunately the documentation at http://build.fhir.org/extensibility-registry.html does not contain anything that would help my case.
You can use security labels for this purpose (Resource.meta.security). Take a look at REDACTED and SUBSETTED in the security label value set: https://www.hl7.org/fhir/valueset-security-labels.html
If you need to convey a data type other than the one allowed by the resource (e.g. wanting to convey a range rather than a birthdate), you'd need to use an extension. (Note that dates are valid even if you only include the year.)
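For illustration, here is a sketch of what a redacted Patient might look like under that approach, shown as a Python dict. Only REDACTED and the year-precision birthDate come from the answer above; the extension URL is a made-up placeholder you would replace with your own StructureDefinition:

# REDACTED comes from the v3-ObservationValue code system referenced by
# the security-labels value set linked above.
patient = {
    "resourceType": "Patient",
    "meta": {
        "security": [{
            "system": "http://terminology.hl7.org/CodeSystem/v3-ObservationValue",
            "code": "REDACTED",
        }]
    },
    "birthDate": "1980",  # year-only precision is still a valid FHIR date
    "extension": [{
        # placeholder URL; define your own StructureDefinition for this
        "url": "http://example.org/fhir/StructureDefinition/age-range",
        "valueRange": {
            "low": {"value": 30, "unit": "a"},
            "high": {"value": 40, "unit": "a"},
        },
    }],
}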
I am looking up geocodes with the Google Geocoding API:
http://maps.googleapis.com/maps/api/geocode/json?address=london%2C+UK&sensor=false
The problem is that the input isn't very accurate (especially the street), and sometimes Google mixes things up and ignores the UK because the street has a perfect match (as street and city) somewhere else, e.g. in the US.
I cannot solve this issue (it's in the input data), but I am wondering if there is a parameter that forces Google to search only in the UK and return no result instead of a completely wrong one.
You can add component filters to the URL to constrain the results. In this case you can use:
http://maps.googleapis.com/maps/api/geocode/json?address=london%2C+UK&components=country:UK&sensor=false
For more information about how to use component filtering see:
https://developers.google.com/maps/documentation/geocoding/intro#ComponentFiltering
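If it helps, here is a minimal sketch of the same request in Python. YOUR_API_KEY is a placeholder (current versions of the API require a key, and the sensor parameter is no longer needed); note that component filtering expects ISO 3166-1 codes, so country:GB is the safer spelling for the United Kingdom:

import requests

resp = requests.get(
    "https://maps.googleapis.com/maps/api/geocode/json",
    params={
        "address": "london, UK",
        "components": "country:GB",  # restrict matches to the UK
        "key": "YOUR_API_KEY",
    },
)
data = resp.json()
# With the filter in place, an unmatched street yields status ZERO_RESULTS
# instead of a confident hit in the wrong country.
print(data["status"],
      [r["formatted_address"] for r in data.get("results", [])])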
I'm creating a class that needs to parse user contact info to determine whether the presented user already exists in the db. Because the source is unvalidated, user-generated data, I have to test for matches under a variety of conditions.
The content is presented in 3 fields: Name (first & last combined); Company Name; Email.
I need to return a result based on each of these possible match conditions:
Exact Match
Email Match
Domain Name Only
Full Name Exact
Last Name Only
Institution Match
I have a rough idea of how I'd go about coding this, and I'm sure the result would be inferior to what a formal TDD approach would produce. My TDD learning is just past the very basics, and I don't have the depth to see how the above scenario is staged and developed through the full lifecycle.
I'd like some help structuring the project from an architectural point of view.
thx
Seems like you already listed the primary positive test cases in your list of match types. So take those from the top: write a small test for the first case (exact match), watch it fail, make it pass, and iterate until exact match works. Then do the same for the other match types.
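To make that first iteration concrete, here is one way the opening red/green step might look in Python with the standard unittest module. The class and field names are made up for illustration; the point is only that the first test pins down exact-match behaviour before any other match type exists:

import unittest

class ContactMatcher:
    def __init__(self, existing):
        self.existing = existing  # list of (name, company, email) tuples

    def match(self, name, company, email):
        # Only the first rule so far; later iterations add email-only,
        # domain-only, full-name, last-name and institution matches.
        if (name, company, email) in self.existing:
            return "exact"
        return None

class ExactMatchTest(unittest.TestCase):
    def setUp(self):
        self.matcher = ContactMatcher(
            [("Ada Lovelace", "Analytical Engines", "ada@example.com")]
        )

    def test_exact_match(self):
        self.assertEqual(
            self.matcher.match("Ada Lovelace", "Analytical Engines",
                               "ada@example.com"),
            "exact",
        )

    def test_no_match_for_unknown_contact(self):
        self.assertIsNone(
            self.matcher.match("Grace Hopper", "Navy", "grace@example.com")
        )

if __name__ == "__main__":
    unittest.main()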
I'm currently developing a website that allows searching a PostgreSQL database. The search works with to_tsquery(), and I'm trying to find a way to validate the input before it's sent as a query.
Other than that, I'm also trying to add a phrasing capability, so that if someone searches for HELLO | "I LIKE CATS" it will only find results with "hello" or the entire phrase "i like cats" (as opposed to I & LIKE & CATS, which will find articles that contain all 3 words regardless of where they appear).
Is there some reason why it's too expensive to let the DB server validate it? It does seem a bit excessive to duplicate the to_tsquery parsing algorithm in the client.
If the concern is that you don't want it to try running the whole query (which presumably will involve table access) each time it validates, you could use the input in a smaller query that only parses it. For example, in Python with psycopg2 (the driver choice is just for illustration):
import psycopg2

def is_valid_query(conn, text):
    # Let the server parse the input: to_tsquery raises a syntax error
    # for malformed input without touching any tables.
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT to_tsquery(%s)", (text,))
        return True
    except psycopg2.Error:
        conn.rollback()  # clear the failed transaction before reuse
        return False
With regard to phrasing, it's probably easiest to search by the non-phrased query first (using the indexes), then filter the results for those containing the phrase. That could be done server side or client side. Depending on the language being parsed, it might be easiest to construct a simple regex of the phrase that deals with repeated whitespace or other ignorable symbols.
Search for to_tsquery('HELLO|(I&LIKE&CATS)'), getting back a list of documents which loosely match.
In the client, filter that to those matching the regex "HELLO|(I\s+LIKE\s+CATS)".
The downside is you do need some additional code for translating your query into the appropriate looser query, and then for translating it into a regex.
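Putting the two steps together, a rough sketch (the table and column names articles, body, and tsv are placeholders):

import re
import psycopg2

conn = psycopg2.connect("dbname=mydb")  # placeholder DSN

# Step 1: loose, index-friendly tsquery narrows the candidate set;
# step 2: the regex enforces the exact phrase on those candidates.
phrase_re = re.compile(r"HELLO|(I\s+LIKE\s+CATS)", re.IGNORECASE)
with conn.cursor() as cur:
    cur.execute(
        "SELECT id, body FROM articles "
        "WHERE tsv @@ to_tsquery('hello | (i & like & cats)')"
    )
    hits = [(doc_id, body) for doc_id, body in cur
            if phrase_re.search(body)]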
Finally, there might be a technique in PostgreSQL to do proper phrase searching using the lexeme positions that are stored in ts_vectors. I'm guessing that phrase searches are one of the intended uses, but I couldn't find an example of it in my cursory search. There's a section on it near the bottom of http://linuxgazette.net/164/sephton.html at least.
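As a follow-up to that last point: PostgreSQL 9.6 and later expose the stored lexeme positions for exactly this, via the <-> (followed-by) tsquery operator and phraseto_tsquery. A minimal sketch, reusing the placeholder schema and connection from above:

with conn.cursor() as cur:
    # phraseto_tsquery('i like cats') yields 'i <-> like <-> cats',
    # which matches only adjacent lexemes in order; || is tsquery OR.
    cur.execute(
        "SELECT id FROM articles "
        "WHERE tsv @@ (to_tsquery('hello') || phraseto_tsquery('i like cats'))"
    )
    phrase_hits = [row[0] for row in cur]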
This is silly, but I haven't found this information. If you have names of concepts and suitable references, just let me know.
I'd like to understand how I should validate a given named id for a generic entity (say, an email login), the way Yahoo, Google, and Microsoft do.
I mean: if you already have a user named foo, an attempt to create foo2 will be denied, as it is likely someone trying to mislead users with a fake id.
Coming to mind:
Levenshtein Distance
Hamming Distance
You're going to have to take a two-pass approach.
The first pass is a regex to validate that the entity name meets your specifications as far as possible, for example by disallowing certain characters.
The second pass is to perform some type of fuzzy search during name creation. This could be as simple as a LIKE '%value%' WHERE clause, or as complicated as some type of full-text search with hits limited to a certain relevance rating.
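As an illustrative sketch of both passes in Python (the regex rules and the distance threshold are arbitrary placeholders, and the Levenshtein implementation is the classic dynamic-programming one, matching the first item in the question's list):

import re

VALID_ID = re.compile(r"^[a-z][a-z0-9_]{2,31}$")  # placeholder shape rules

def levenshtein(a, b):
    # Classic two-row dynamic-programming edit distance.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def is_acceptable(candidate, existing_ids, min_distance=2):
    # Pass 1: shape check; pass 2: reject ids too close to existing ones,
    # so 'foo2' is refused when 'foo' already exists (distance 1).
    if not VALID_ID.match(candidate):
        return False
    return all(levenshtein(candidate, e) >= min_distance
               for e in existing_ids)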
That said, I would guess the failure rate (both false positives and false negatives) would be high enough to justify not doing this.
Good luck.