Can a person have a null name? - validation

I am writing an app that has a sign-up form. This article made me doubt everything I knew about human names. My question is: does a person's name necessarily have positive length? Or can I validate names in this way and be confident that I have not denied anyone their identity?
P.S.: one might ask why I am validating at all. The answer is that this is for a school project and proper validation is part of the mark. The article above proves that a person's name can be pretty much any string of positive length, but I don't know if zero length is OK.

With all types of programming, you have to draw a distinction between what is meaningful in the real world, and what is meaningful for your software solution.
How the data is to be used will dictate what type of validation is required.
For instance, if your software interfaces with a government API, and the government API requires a first name and surname, you should do the same.
If you're interacting with bank accounts, you may have a single string which represents the account name, which may or may not be a human name, but may have other constraints around length.
If the name is only to be used for display purposes, maybe there is no point to capture the name at all, and instead you should capture a preferred display name (which doesn't needlessly assume a certain number of name components).
When writing software, you should aim to make as few assumptions as possible, unless avoiding an assumption would increase the complexity of your software solution. If the software requires people to have non-empty names, then you should validate at the boundary that this is true.
In addition, if you were my student, you would have already lost marks for conflating null and an empty string. In this instance, null would mean that you lack data about the name, and an empty string would indicate that the user has specified that their name is empty.
Also, if you decide not to validate something, you should at least leave a comment to indicate that you thought of it. If you do something unusual, it's possible a future developer may come along and fix the "bug". In addition, this helps you avoid losing marks.
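To make the null versus empty string distinction concrete, here is a minimal sketch in Python. The function name validate_name and the require_non_empty flag are made up for illustration; whether empty names are rejected depends on what the software actually needs.

```python
from typing import List, Optional


def validate_name(name: Optional[str], require_non_empty: bool = True) -> List[str]:
    """Return a list of validation problems; an empty list means accepted."""
    if name is None:
        # None means we have no data about the name at all.
        return ["name is missing"]
    if name == "" and require_non_empty:
        # An empty string means the user explicitly supplied an empty name.
        return ["name is empty"]
    # Deliberately no further checks (character set, number of parts, length):
    # real names do not follow a fixed pattern, so none is assumed here.
    return []
```

The comment on the skipped checks is the kind of note that tells a future developer the omission was intentional.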

Related

Internationalisation - displaying gendered adjectives

I'm currently working on an internationalisation project for a large web application - initially we're just implementing French but more languages will follow in time. One of the issues we've come across is how to display adjectives.
Let's take "Active" as an example. When we received translations back from the company we're using, they returned "Actif(ve)", as English "Active" translates to masculine "Actif" or feminine "Active". We're unsure of how to display this, and wondered if there are any well established conventions in the web development world.
As far as I see it there are three possible scenarios:
1) We know at development time which noun a given adjective is referring to. In this case we can determine and use the correct gender.
2) We're referring to a user, either directly ("you") or in the third person. Short of making every user have a gender, I don't see a better approach than displaying both, i.e. "Actif(ve)".
3) We are displaying the adjective in isolation, not knowing which noun it's referring to. For example, in a table of data, some rows might be dealing with a masculine entity, some with a feminine one.
Scenarios 2 and 3 seem to be the toughest ones. Does anyone have any experience handling these issues? Any tips would be appreciated!
This is complex, because we cannot imagine all the cases, and there is a risk of drifting into an "opinion-based" answer, so I will keep it short and generic.
Usually I prefer to give the translator context in the string to translate, e.g. by providing a template: _("active {user_name}") (so the ordering will also be correct if a language wants a different word order).
Then you may need to change the code and template into _("active {first_name_feminine}") and _("active {first_name_masculine}") (and possibly more for duals, trials, plurals, collectives, honorifics, etc.). Note: check that the translator will not mangle the {} and the string inside. Usually you need specific export/import scripts. Alternatively, I add a note to the translator inside the string, and I quickly produce an English translation that removes the note. This can also be automated (be creative in using special Unicode characters, which should not occur in normal text, to delimit such notes).
But if you cannot know the gender, Actif(ve) may be the polite form used in that language. You need testing by a native speaker, and some back-and-forth changes.
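As a rough sketch of the idea, here is a gender-aware lookup with a neutral fallback in Python. The catalog, its keys and the function name are hypothetical; a real project would keep these entries in its translation files rather than in a dict.

```python
# Hypothetical French catalog: one entry per grammatical gender, plus a
# neutral fallback for when the gender of the subject is unknown.
CATALOG_FR = {
    ("active", "masculine"): "Actif",
    ("active", "feminine"): "Active",
    ("active", None): "Actif(ve)",
}


def translate_adjective(key, gender=None, catalog=CATALOG_FR):
    """Look up the gendered form, falling back to the neutral form."""
    if (key, gender) in catalog:
        return catalog[(key, gender)]
    # Unknown or unsupported gender: use the combined form.
    return catalog[(key, None)]
```

Scenario 1 would pass the known gender of the noun; scenarios 2 and 3 would pass nothing and get the combined form.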

HL7 FHIR mark resources as anonymized

I am trying to map an existing domain into HL7 FHIR.
So far it was pretty easy to find FHIR resources that more or less represent the same data and can be used for that purpose. But now I am running into a problem that I am not sure how to solve.
The existing domain allows data to be anonymized depending on the user's access level. For example, a patient's name or address might be removed and marked as anonymized. Other data will be pseudonymised: for example, a birthdate in 1980 will be replaced with 01.01.1980, and an age of 37 will be replaced with a category of 30-40.
So I am unsure how to integrate that into the FHIR domain. I was thinking I could create an extension holding a boolean, indicating if a value was anonymized or not, and always replace or remove the original value. This might work, but I will run into big problems when the anonymized value is of a different type than the original value (e.g. an age is replaced by a range of values).
Is that even a valid approach? I thought this might be a common problem, but I could not find any examples where people described methods of how to mark data as altered. Unfortunately the documentation at http://build.fhir.org/extensibility-registry.html does not contain anything that would help my case.
You can use security labels for this purpose (Resource.meta.security). Take a look at REDACTED and SUBSETTED in the security label value set: https://www.hl7.org/fhir/valueset-security-labels.html
If you need to convey a data type other than the one allowed by the resource (e.g. wanting to convey a range rather than a birthdate), you'd need to use an extension. (Note that dates are valid even if you only include the year.)
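For illustration, here is a minimal Python sketch that builds such a resource as a plain dict. The code system URI is an assumption on my part (it is the one I believe FHIR R4 uses for these codes) and should be checked against the value set linked above; the function name is made up.

```python
import json

# Assumed code system URI for the REDACTED / SUBSETTED codes; check it against
# the security label value set for the FHIR version in use.
OBSERVATION_VALUE_SYSTEM = "http://terminology.hl7.org/CodeSystem/v3-ObservationValue"


def redacted_patient(patient_id, birth_year):
    """Build a Patient with name/address omitted and a REDACTED label."""
    return {
        "resourceType": "Patient",
        "id": patient_id,
        "meta": {
            "security": [
                {"system": OBSERVATION_VALUE_SYSTEM, "code": "REDACTED"}
            ]
        },
        # A FHIR date may carry the year only, so no day or month is invented.
        "birthDate": str(birth_year),
    }


if __name__ == "__main__":
    print(json.dumps(redacted_patient("example", 1980), indent=2))
```

An age range in place of an age would still need an extension, as noted above; this sketch does not cover that case.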

Using did as boolean flag name

I named one of my boolean parameters didInfoChange.
Many people on my team tell me to change it to isInfoChanged, which I don't agree with. It may be because my team members aren't native English speakers (neither am I), but I feel that isInfoChanged just isn't right.
didInfoChange -> Did information change? -> True/False
is pretty understandable in my opinion
isInfoChanged -> is info changed?
just does not sound right.
It's probably not a big deal to fight about this, but I did some searching and people do not really use did for flag names. I'm OK with hasInfoChanged, but has and did are basically the same thing.
I'm wondering why did is not OK.
There are two questions here:
1) Which is better, didInfoChange or isInfoChanged?
The English word "change" can be transitive or intransitive, but in this context it is clear that "the info is changed" and "the info did change" mean exactly the same thing. (There is a subtle difference in connotation, but it is of no importance here.) The two names have the same length. There seems to be no difference except style convention.
2) If your way is better than theirs, what should you do?
Consider the consequences of your actions.
If you have the power to persuade the rest of the team to use your variable name, at no cost, then do so. If doing so would cause stress (e.g. by commanding your subordinates to do something they consider a bad idea), then the improvement in style probably isn't worth the cost to the group dynamic.
If you cannot persuade them, but you can prolong the argument and prevent the team from doing constructive work, then... don't. Use their variable name.
If you cannot prolong the argument, but you can make yourself unpopular by being argumentative, then... don't. Use their variable name.
Besides is, it is also sometimes admissible to use has in naming Boolean getter methods, depending on which auxiliary verb would be used in spoken language; I have never seen did as part of a Boolean identifier.
With hasInfoChanged you would keep the participle ending (e)d. Maybe that satisfies the rest of your team.
infoChanged could be mistaken for an EventHandler-Delegate.
Unfortunately I am not a native English speaker, either.
It depends on the context and what this field "really" expresses in semantics.
didInfoChange puts emphasis on the completed action (past-tense implied) by "did" + action-verb
isInfoChanged puts emphasis on the current state at the time of asking, by is + state; the past tense is indicated by the passive "changed"
Note: Info is the vague part in the name. is is a common indicator for boolean fields or getters, indicating a question (just like has or can). did is rarely used because we usually ask for the current state at runtime using is. The completion or history can be expressed by other parts of the name, like a specific action verb in the past tense.
Other ways of recording/asking for change
What about asking more about the context of this change:
Who changed the information? Like an audit field (changedBy), which also conveys who changed something.
When was the information changed? Like an audit date/time field (changedAt), telling not only that it was changed, but also when.
What information was changed? Like capturing the change itself (lastChange), which could also be null if nothing was changed at all.
In most ORM-frameworks which capture audit-information like this (when/who/what was changed) we can see fields like createdBy or createdAt for the initial user and timestamp when created, modifiedBy or modifiedAt for the user and timestamp that last updated or changed the object.
Sometimes also a version-indicator helps to keep track of the number of changes.
Keep it simple
One compromise could be inspired by KISS-principle:
Have a boolean field changed (which could, for enrichment purposes, hold a timestamp instead) along with a getter named isChanged that queries the current state, like asking a human question:
is [this] changed?
Note: [this] is implied when invoking the method on the object like this.isChanged().
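A minimal Python sketch of that compromise, assuming a hypothetical Info class: the timestamp doubles as the flag, and is_changed (isChanged in camelCase) derives the boolean from it.

```python
from datetime import datetime, timezone
from typing import Optional


class Info:
    def __init__(self, value: str) -> None:
        self._value = value
        # None means "never changed"; otherwise the time of the last change.
        self.changed_at: Optional[datetime] = None

    def update(self, new_value: str) -> None:
        # Only a real change of value counts as a change.
        if new_value != self._value:
            self._value = new_value
            self.changed_at = datetime.now(timezone.utc)

    def is_changed(self) -> bool:
        """Answer the question "is [this] changed?" for the current state."""
        return self.changed_at is not None
```

Keeping only the timestamp avoids a separate boolean that could drift out of sync with it.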

Build a website: should I use a number or random unique string as ID in URLs?

Hi I am building an Internet website with Java and Spring framework. I believe my question is not technology or framework related.
I need to have links in the user interface so that visitors can click to see records. These links have the format of
http://mysite.com?id=number-id-or-random-unique-string
Not every visitor is allowed to view every record. For the ID parameter in the URL, I could use the database-generated number as the ID value, so that I do not need any additional programming. Or I could use a unique random string (for example: jcTDjhdDUls) as the ID value (I would have to program this part). Numbers allow curious people (with good or bad intentions) to EASILY guess and try other IDs. Unique random strings seem better in this regard.
However, whether the ID is a number or a string, I have a security check in the backend code to see whether a visitor is allowed to see a record. From this perspective, I am not sure what the real benefit of having a random string as the ID is.
I hope to have input from experienced people. What design decision do you choose? Or other better ideas?
Thanks and regards.
You certainly can if you want to, but I would not go to the trouble of randomizing the ID. This is, at its root, "security through obscurity" (STO). Sometimes STO is useful, but in this case I don't think it is worth complicating and bloating the code and memory footprint. It's surprisingly easy to enumerate the valid IDs whether they're randomized or not, using a tool like Burp Suite. All the security controls that really matter should be implemented in the backend.
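A minimal sketch of such a backend check, written in Python for brevity (the same shape applies in a Spring controller). The in-memory store, the Forbidden exception and get_record are all hypothetical names; the point is only that the check does not care whether the ID is a number or a random string.

```python
# Hypothetical in-memory store: record ID -> (owner, content).
RECORDS = {
    "1": ("alice", "Alice's record"),
    "jcTDjhdDUls": ("bob", "Bob's record"),
}


class Forbidden(Exception):
    """Raised when the record is missing or the visitor may not see it."""


def get_record(record_id, visitor):
    record = RECORDS.get(record_id)
    # Same error for "does not exist" and "not allowed", so the response
    # does not reveal which IDs are valid.
    if record is None or record[0] != visitor:
        raise Forbidden(record_id)
    return record[1]
```

With this in place, guessing another ID only ever yields the same refusal.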

Is it acceptable to normalize text box content when it loses focus?

I have received requirements that ask to normalize text box content when the user changes the focus to another control on the same data input form. Example normalizations:
whitespace at the start and end of the input is trimmed
If the text box was made empty and this is not valid, replace the content of the text box with the default value
I have a feeling that this is not in line with good GUI design. I have read the Windows UX Guidelines for text boxes but I did not immediately find any relevant rules.
Is normalizing text box content in this way acceptable?
I have definitely seen this before (examples elude me right now) but I personally don't like it when the UI changes my input.
If the UI is smart enough to change my input on me then it should accept it as is and change the value when it needs to process it.
When the input changes auto-magically you are now forcing the user to stop and ask themselves why it changed and if they did something wrong or if the application has an error. Don't make the user think!
Generally, you should accept user input exactly as they entered it. Chances are users did it that way for a good reason. For example, imagine a user entering a foreign address, and then your app screws it up trying to format it like a domestic address. At the very least, users entered the input the way they’re used to it being, so changing it can make it hard for them to cross-check it.
However, there are several exceptions:
Add defaults to incomplete input. Adding input the user left off (e.g., years to dates, units to dimensions) provides good feedback on how the app is interpreting the input that would otherwise be ambiguous. This also encourages the user to use defaults, making their input more efficient.
Resolve other ambiguities. Change to an unambiguous format if the user’s format is open to interpretation. For example, if you have international users, you may want to change “9-8-09” to “Sep 8 2009” (or “9 Aug 2009”) to provide feedback on what your app considers the month and day to be.
Add delimiters when none provided. Automagically adding standard or even arbitrary delimiters to long alphanumeric strings (e.g., phone numbers, credit card numbers, serial numbers) provides an input display that the users can crosscheck more easily. Sometimes users may enter a string without delimiters in order to go faster or because they are the victim of web abuse by sites that refuse to accept even standard delimiters.
Spelling, grammar, and capitalization correction. Users often appreciate this, but only if there’s also a means to override it. Some users like to use "i" as the first person pronoun.
If the field is used by more than one user, then you probably should automatically format the value in some standard way that accommodates the majority of your users, but that should be done when the value is stored on the backend, not when focus leaves the field. For example, if a user enters a time of 15:30 it should remain as 15:30 as long as the user views the page. However, the next time a user (any user) retrieves the data, it should appear as 3:30pm (if that’s how most of your users are used to seeing time).
Such backend formatting applies to trimming whitespace so that all users can search, find, and sort on the field consistently. It’s probably not a good idea to replace a blank value (or any invalid value) with the default because users are unlikely to anticipate getting that value. An exception would perhaps be changing blank to 0 for numeric fields in situations where obviously blank == none == zero, but again this probably should be done when storing in the backend, not in the field itself. If blank is ambiguous, (e.g., may mean 0 or may mean "I don't know") then the second bullet above applies, and you may want to autocorrect in the field when focus is lost.
Of course, if your users vary in how they need to have a data type formatted, then you can have different variants of the app that display the data type in different ways for different user groups, or you can make the format of the data type a user preference, but that’s really another issue.
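The backend normalization described above might look like the following Python sketch. The function name and the blank_means_zero flag are assumptions for illustration; storing None for a blank value is one possible way to keep "no value" distinct from a default.

```python
from typing import Optional


def normalize_for_storage(raw: str, blank_means_zero: bool = False) -> Optional[str]:
    """Normalize a field value when it is stored, not when focus leaves it."""
    # Trim so that all users can search, find and sort consistently.
    trimmed = raw.strip()
    if trimmed == "":
        # Only substitute 0 where blank obviously means zero; otherwise
        # record that no value was given instead of inventing a default.
        return "0" if blank_means_zero else None
    return trimmed
```

The text box itself keeps showing exactly what the user typed; only the stored value is normalized.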
If the users want it and the stakeholders ask for it, then it is perfectly safe.
Trimming is very common, and the replacement is common when you are filling a text box with numbers (a 0 instead of a blank).
It's a fairly standard feature, especially the whitespace trimming. The default value replacement raises a larger flag just because it is less common.
I'm pretty sure that I've seen versions of Microsoft Office that do this - putting "pt." after a value in points, for instance. Microsoft's endorsement should be a good sign.
We have quite a few of these kinds of requirements. The reason given for forcing a default value rather than a blank space is that it looks better in reports or if the client wants to see the live system. A blank looks a bit like "couldn't be bothered to enter anything". For a similar reason, we often upper-case the text for consistency, as the users never use consistent formatting.

Resources