Extract capabilities for a single FHIR resource - hl7-fhir

On the Capabilities interaction (/metadata) what would be the best way to extract the capabilities for a single resource type, like Patient?
The intent is to determine the available search parameters of that resource type. Currently I can get the whole capability statement, which weighs around 700 KB.

Right now, there's no mechanism to do what you're interested in. For R4 or R5 we're exploring alternatives that allow clients to retrieve more limited information. Feel free to submit your requirements as a change request against the specification.
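In the meantime, a common client-side workaround is to fetch the full statement once and filter it locally. A minimal TypeScript sketch, assuming a standard FHIR JSON endpoint (the helper name and loose typing are mine):

// Sketch: pull the CapabilityStatement once, then extract the search
// parameters declared for a single resource type (Patient here).
interface SearchParam {
  name: string;
  type: string;
  documentation?: string;
}

async function searchParamsFor(baseUrl: string, resourceType: string): Promise<SearchParam[]> {
  const res = await fetch(`${baseUrl}/metadata`, {
    headers: { Accept: 'application/fhir+json' },
  });
  const capability = await res.json();
  // CapabilityStatement.rest[].resource[] holds per-resource-type capabilities
  const resources = (capability.rest ?? []).flatMap((r: any) => r.resource ?? []);
  const match = resources.find((r: any) => r.type === resourceType);
  return match?.searchParam ?? [];
}

// Usage: searchParamsFor('https://example.org/fhir', 'Patient').then(console.log);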

Related

Intended design for implementing Cordite DGL tokens

I have a few questions regarding the cordite dgl tokens.
I want to tokenize an asset which is represented as LinearState.
Is there a way to only allow a single issuance of tokens?
Is the preferred way to link the tokens to other states through the TokenSymbol?
What is the intended design of accounts? One account per use case, or one per TokenType?
How do I query if a specific token already exists? Is the only way to query for an account and look at the result (so there is no function to search for balance by TokenSymbol)?
Thanks in advance!
Great questions, thanks!
It's not possible right now, but it's definitely recognised as being desirable. It will require a change to the core, which is fine. I think this work will require Corda 4, specifically reference states and linear state pointer types. We will want:
a. There are many ways of limiting issuance: not only by final amount, but also by distribution rate, signing parties (in the case where multiple are required for the issuance), etc. We need a means of optionally inserting the notion of finite issuance into the token type. This could be done either by adding additional fields to the existing TokenType or, better, by making TokenType open so that it can be extended. Another way is to provide a field on a base type or interface that encodes the contract rules.
b. We could transfer the token type as an attached StateAndRef, but we need to be conscious of the per-tx storage, network, signing, and verification cost of doing this. A better approach is to use the reference data feature in Corda 4, which we are eagerly awaiting.
If the other states are part of the same transaction that emits the tokens, then the linking is implicit. If the other state is not in the same transaction as the token, then linking, for now in Corda 3, would need to use the TokenType descriptor. Alternatively, it can also reference a StateAndRef in the tx that generates the other state. We think the most efficient approach is to use Corda 4's reference states (scheduled for Dec/Jan this year, I believe).
Accounts are designed to store tokens of multiple token types. They are really aligned with the business use case and aren't limited to specific token types, unless you want to enforce that in the application layer.
Do you mean you want to get the balance of a TokenType across all accounts? You can certainly use Corda's API to locate tokens; this isn't exposed via Braid yet but certainly could be. Another approach is to tag/alias all your accounts with the same tag, e.g. { category: 'all-accounts', value: '' }. You can then do ledger.balanceForAccountTag({ category: 'all-accounts', value: '' }) to get balances across all accounts. However, this returns the balances for all TokenTypes. What would the ideal API look like for you?
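As a rough TypeScript illustration of that tag-based approach (only balanceForAccountTag is taken from the answer above; the connected ledger object and the result shape are assumptions):

// Assumed: `ledger` is an already-connected Braid/Cordite ledger proxy.
async function tokenBalances(ledger: any) {
  const allAccountsTag = { category: 'all-accounts', value: '' };
  // Balances across every account carrying the shared tag; the return shape
  // is an assumption - inspect it and filter for your TokenSymbol client-side.
  const balances = await ledger.balanceForAccountTag(allAccountsTag);
  console.log(balances);
}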

go-diameter: support for different AVP dictionary for different network provider (i.e. Ericsson, Nokia) and different nodes (i.e. GGSN, Tango)

We are working on creating a diameter adapter for OCS. Currently our AVP dictionary is as supplied by go-diameter.
We are trying to provide a configurable dictionary to support the following:
Vendor-specific AVPs for different network providers, like Nokia and Ericsson
Support for different network traffic, like VoLTE, GGSN, Tango.
Following are the two approaches we are currently considering.
Include a single dictionary with all supported AVPs and have a single release of the diameter adapter. The intelligence for identifying which AVPs are required for which node would be built into the code.
Have different releases for each dictionary that we want to support, and deploy whichever is required by the service provider.
I have searched the internet to see if something similar has been done as a proof of concept. I need help identifying which is the better solution for implementation.
I'm not familiar with go-diameter, but my suggestion: use one dictionary.
This dictionary should be used by all vendors and providers.
Reasons:
You don't know how many different releases you will end up with, and you might need to support many dictionaries in the end.
If you use several dictionaries, most of the AVPs will be the same in all of them.
The bigger your one dictionary is, the more AVPs you will support everywhere, and you are never 100% sure which AVPs might arrive from different clients.

How to store User Fitness / Fitness Device data in FHIR?

We are currently in the process of evaluating FHIR for use as part of our medical record infrastructure. For the EHR data (Allergies, Visits, Rx, etc.), HL7 FHIR seems to have appropriate mappings.
However, lots of data that we deal with is related to personal Fitness - think Fitbit or Apple HealthKit:
Active exercise (aerobic or workout): quantity, energy, heart-rate
Routine activities such as daily steps or water consumption
Sleep patterns/quality (an odd case of overlapping states within the same timespan)
Other user-provided: emotional rating, eating activity, women's health, UV
While there is the Observation resource, it still seems best suited (!) to the EHR domain. In particular, the user fitness data is not collected during a visit and is not human-verified.
The goal is to find a "standardized FHIR way" to model this sort of data.
Use an Observation (?) with Extensions? Profiles? Domain-specific rules?
FHIR allows extraordinary flexibility, but each extension/profile may increase the cost of being able to exchange the resource directly later.
An explanation of the appropriate use of a FHIR resource - including when to extend, when to use profiles/tags, and when to encode differentiation via coded values - would be useful.
Define a new/custom Resource type?
FHIR DSTU2 does not define a way to define a new Resource type. Wanting to do so may indicate that the role of resources - logical concept vs. an implementation interface? - is not understood.
Don't use FHIR at all? Don't use FHIR except on summary interchanges?
It could also be the case that FHIR is not suitable for our messaging format. But would it be any "worse" to go FHIRa <-> FHIRb than x <-> FHIRc when dealing with external interoperability?
The FHIR Registry did not seem to contain any User-Fitness specific Observation Profiles and none of the Proposed Resources seem to add appropriate resource-refinements.
At the end of the day, it would be nice to be able to claim that, with minimal or no translation, i.e. in a "standard manner", we can exchange User Fitness data as a FHIR stream.
Certainly the intent is to use Observation, and there are lots of projects already doing this.
There's no need for extensions; it's just a straightforward use. Note that this: "In particular, the user fitness data is not collected during a visit and is not human-verified" doesn't matter. There's lots of EHR data of dubious provenance...
You just need to use the right codes, and bingo, it all works. I've provided a bit more detail in the answer here:
http://www.healthintersections.com.au/?p=2487
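To make that concrete, here is a hedged sketch of a daily-step-count Observation as a TypeScript object literal. LOINC 55423-8 is a code commonly used for pedometer step counts; the patient reference, period, and device are placeholders to validate against your own profile:

// Illustrative Observation for one day's step count from a consumer device.
const stepsObservation = {
  resourceType: 'Observation',
  status: 'final',
  code: {
    coding: [{
      system: 'http://loinc.org',
      code: '55423-8', // "Number of steps in unspecified time Pedometer"
      display: 'Number of steps in unspecified time Pedometer',
    }],
  },
  subject: { reference: 'Patient/example' }, // placeholder
  effectivePeriod: {
    start: '2016-06-01T00:00:00Z',
    end: '2016-06-01T23:59:59Z',
  },
  valueQuantity: { value: 8421, unit: 'steps' },
  device: { display: 'Consumer fitness tracker' }, // placeholder
};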

What is the difference between Falcor and GraphQL?

GraphQL consists of a type system, query language and execution
semantics, static validation, and type introspection, each outlined
below. To guide you through each of these components, we've written an
example designed to illustrate the various pieces of GraphQL.
- https://github.com/facebook/graphql
Falcor lets you represent all your remote data sources as a single
domain model via a virtual JSON graph. You code the same way no matter
where the data is, whether in memory on the client or over the network
on the server.
- http://netflix.github.io/falcor/
What is the difference between Falcor and GraphQL (in the context of Relay)?
I have viewed the Angular Air Episode 26: FalcorJS and Angular 2 where Jafar Husain answers how GraphQL compares to FalcorJS. This is the summary (paraphrasing):
FalcorJS and GraphQL are tackling the same problem (querying data, managing data).
The important distinction is that GraphQL is a query language and FalcorJS is not.
When you are asking FalcorJS for resources, you are very explicitly asking for finite series of values. FalcorJS does support things like ranges, e.g. genres[0..10]. But it does not support open-ended queries, e.g. genres[0..*].
GraphQL is set-based: give me all records where true, order by this, etc. In this sense, the GraphQL query language is more powerful than FalcorJS's.
With GraphQL you have a powerful query language, but you have to interpret that query language on the server.
Jafar argues that in most applications, the types of queries that go from client to server share the same shape. Therefore, having specific and predictable operations like get and set exposes more opportunities to leverage the cache. Furthermore, a lot of developers are familiar with mapping requests using a simple router in a REST architecture.
The end discussion revolves around whether the power that comes with GraphQL outweighs the complexity.
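To make the bounded-range versus open-ended distinction concrete, here is a hedged TypeScript sketch (the /model.json and /graphql endpoints and the genres schema are hypothetical):

import falcor from 'falcor';
import HttpDataSource from 'falcor-http-datasource';

// Falcor: you always ask for an explicit, finite range of indices.
const model = new falcor.Model({ source: new HttpDataSource('/model.json') });
model
  .get(['genres', { from: 0, to: 10 }, 'name']) // i.e. genres[0..10].name
  .then(response => console.log(response.json));

// GraphQL: the query language can express an open-ended, set-based read.
fetch('/graphql', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ query: '{ genres { name } }' }),
})
  .then(r => r.json())
  .then(console.log);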
I have now written apps with both libraries, and I can agree with everything in Gajus' post, but I found somewhat different things most important in my own use of the frameworks.
Probably the biggest practical difference is that most of the examples and presumably work done up to this point on GraphQL has been concentrated on integrating GraphQL with Relay - Facebook's system for integrating ReactJS widgets with their data requirements. FalcorJS on the other hand tends to act separately from the widget system which means both that it may be easier to integrate into a non-React/Relay client and that it will do less for you automatically in terms of matching widget data dependencies with widgets.
The flip side of FalcorJS being flexible in client-side integrations is that it can be very opinionated about how the server needs to act. FalcorJS actually does have a straight-up "call this query over HTTP" capability - although Jafar Husain doesn't seem to talk about it very much - and once you include those, the way the client libraries react to server information is quite similar, except that GraphQL/Relay adds a layer of configuration. In FalcorJS, if you are returning a value for movie, your return value had better say 'movie', whereas in GraphQL you can declare that even though the query returns 'film', you should put it in the client-side datastore as 'movie'. This is part of the power vs. complexity tradeoff that Gajus mentioned.
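That client-side renaming is GraphQL's field alias feature; a tiny example (the film field and its id argument are a hypothetical schema):

// The server-side field is `film`, but the response key becomes `movie`.
const query = `
  query {
    movie: film(id: "7") {
      title
    }
  }
`;
// POSTing { query } to a GraphQL endpoint yields { data: { movie: { title } } }.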
On a practical basis, GraphQL and Relay seem to be more developed. Jafar Husain has mentioned that the next version of the Netflix frontend will be running at least in part on FalcorJS, whereas the Facebook team has mentioned that they've been using some version of the GraphQL/Relay stack in production for over 3 years.
The open source developer community around GraphQL and Relay seems to be thriving. There are a large number of well-attended supporting projects around GraphQL and Relay whereas I have personally found very few around FalcorJS. Also the base github repository for Relay (https://github.com/facebook/relay/pulse) is significantly more active than the github repository for FalcorJS (https://github.com/netflix/falcor/pulse). When I first pulled the Facebook repo, the examples were broken. I opened a github issue and it was fixed within hours. On the other hand, the github issue I opened on FalcorJS has had no official response in two weeks.
Lee Byron, one of the engineers behind GraphQL, did an AMA on Hashnode; here is his answer when asked this question:
Falcor returns Observables, GraphQL just values. For how Netflix wanted to use Falcor, this makes a lot of sense for them. They make multiple requests and present data as it's ready, but it also means that the client developer has to work with the Observables directly. GraphQL is a request/response model, and returns back JSON, which is trivially easy to then use. Relay adds back in some of the dynamism that Falcor presents while still using only plain values.
Type system. GraphQL is defined in terms of a type system, and that has allowed us to build lots of interesting tools like GraphiQL, code generators, error detection, etc. Falcor is much more dynamic, which is valuable in its own right but limits the ability to do this kind of thing.
Network usage. GraphQL was originally designed for operating Facebook's news feed on low end devices on even lower end networks, so it goes to great lengths to allow you to declare everything you need in a single network request in order to minimize latency. Falcor, on the other hand, often performs multiple round trips to collect additional data. This is really just a tradeoff between the simplicity of the system and the control of the network. For Netflix, they also deal with very low end devices (e.g. Roku stick) but the assumption is the network will be good enough to stream video.
Edit: Falcor can indeed batch requests, making the comment about the network usage inaccurate. Thanks to #PrzeoR
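For reference, a hedged sketch of that batching behavior (the endpoint is hypothetical; batch() is Falcor's request-coalescing mode):

import falcor from 'falcor';
import HttpDataSource from 'falcor-http-datasource';

// batch() wraps the model so paths requested in the same tick are
// coalesced into a single round trip to the data source.
const model = new falcor.Model({
  source: new HttpDataSource('/model.json'),
}).batch();

// Both of these reads go out as one network request:
model.get(['titles', 0, 'name']).then(console.log);
model.get(['titles', 1, 'name']).then(console.log);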
UPDATE: I've found a very useful comment under my post that I want to share with you as a complement to the main content:
Regarding the lack of examples, you can find the awesome-falcorjs repo useful; there are different examples of Falcor's CRUD usage:
https://github.com/przeor/awesome-falcorjs ... Second thing: there is a book called "Mastering Full Stack React Development" which covers Falcor as well (a good way to learn how to use it):
ORIGINAL POST BELOW:
FalcorJS (https://www.facebook.com/groups/falcorjs/) is much simpler to become efficient with than Relay/GraphQL.
The learning curve for GraphQL+Relay is HUGE:
My short summary: go for Falcor. Use Falcor in your next project, until YOU have a large budget and a lot of learning time for your team; then use RELAY+GRAPHQL.
GraphQL+Relay has a huge API that you must master. Falcor has a small API and is very easy to grasp for any front-end developer who is familiar with JSON.
If you have an AGILE project with limited resources -> then go for FalcorJS!
MY SUBJECTIVE opinion: with FalcorJS it is 500%+ easier to be efficient in full-stack JavaScript.
I have also published some FalcorJS starter kits on my project (+more full-stack falcor's example projects): https://www.github.com/przeor
To get into more technical detail:
1) When you are using Falcor, you can use it on both the front end and the back end:
import falcor from 'falcor';
and then build your model upon it.
... You also need two libraries which are simple to use on the back end (a combined sketch follows the links below):
a) falcor-express - you use it once (e.g. app.use('/model.json', FalcorServer.dataSourceRoute(() => new NamesRouter()))). Source: https://github.com/przeor/falcor-netflix-shopping-cart-example/blob/master/server/index.js
b) falcor-router - there you define SIMPLE routes (e.g. route: '_view.length'). Source:
https://github.com/przeor/falcor-netflix-shopping-cart-example/blob/master/server/router.js
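Here is that combined sketch in TypeScript, putting the two libraries together (the 'greeting' route and port are examples, not from the linked repos):

import express from 'express';
import falcorExpress from 'falcor-express';
import Router from 'falcor-router';

const app = express();

// Serve a Falcor model at /model.json; a new Router is created per request.
app.use('/model.json', falcorExpress.dataSourceRoute(() => new Router([
  {
    route: 'greeting', // client asks: model.get(['greeting'])
    get() {
      return { path: ['greeting'], value: 'Hello from the Falcor router' };
    },
  },
])));

app.listen(3000);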
Falcor is a piece of cake in terms of learning curve.
You can also see the documentation, which is much simpler than FB's library's, and check out the article "why you should care about falcorjs (netflix falcor)".
2) Relay/GraphQL is more like a huge enterprise tool.
For example, you have two different sets of documentation that separately cover:
a) Relay: https://facebook.github.io/relay/docs/tutorial.html
- Containers
- Routes
- Root Container
- Ready State
- Mutations
- Network Layer
- Babel Relay Plugin
- GraphQL Relay Specification
  - Object Identification
  - Connection
  - Mutations
  - Further Reading
- API Reference
  - Relay
  - RelayContainer
  - Relay.Route
  - Relay.RootContainer
  - Relay.QL
  - Relay.Mutation
  - Relay.PropTypes
  - Relay.Store
- Interfaces
  - RelayNetworkLayer
  - RelayMutationRequest
  - RelayQueryRequest
b) GraphQL: https://facebook.github.io/graphql/
2. Language (source text; lexical and ignored tokens; the query document: operations, selection sets, fields, arguments, field aliases, fragments, input values, variables, input types, directives)
3. Type System (scalars, objects, interfaces, unions, enums, input objects, lists, non-null; directives; starting types)
4. Introspection (general principles; schema introspection; type kinds)
5. Validation (operations, fields, arguments, fragments, values, directives, variables)
6. Execution (evaluating requests; coercing variables; evaluating operations, selection sets, and grouped field sets; error handling; nullability)
7. Response (serialization format; response format: data and errors)
Appendices (notation conventions; grammar summary)
...and roughly 150 numbered sub-sections under those headings.
It's your choice:
Simple, sweet, and concisely documented FalcorJS VERSUS a huge enterprise-grade tool with long and advanced documentation, GraphQL & Relay.
As I said before, if you are a front-end dev who grasps the idea of using JSON, then the JSON Graph implementation from Falcor's team is the best way to do your full-stack dev project.
In short, Falcor, GraphQL, and REST solve the same problem: providing a tool to query/manipulate data effectively.
How they differ is in how they present their data:
Falcor wants you to think of your data as a very big virtual JSON tree, and uses get, set, and call to read and write data.
GraphQL wants you to think of your data as a group of predefined typed objects, and uses queries and mutations to read and write data.
REST wants you to think of your data as a group of resources, and uses HTTP verbs to read and write data.
Whenever we need to provide data for a user, we end up with something like: client -> query -> {a layer that translates the query into data ops} -> data.
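A compact TypeScript illustration of those three shapes (all endpoints and the user schema are made up; the point is only the shape of each read):

import falcor from 'falcor';
import HttpDataSource from 'falcor-http-datasource';

// Falcor: one virtual JSON graph, read by path.
const model = new falcor.Model({ source: new HttpDataSource('/model.json') });
model.get(['user', 42, 'name']).then(console.log);

// GraphQL: a typed query posted to a single endpoint.
fetch('/graphql', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ query: '{ user(id: 42) { name } }' }),
}).then(r => r.json()).then(console.log);

// REST: a resource URL plus an HTTP verb.
fetch('/users/42').then(r => r.json()).then(console.log);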
After struggling with GraphQL, Falcor, and JSON API (and even OData), I wrote my own data query layer. It's simpler, easier to learn, and roughly equivalent to GraphQL.
Check it out at:
https://github.com/giapnguyen74/nextql
It also integrates with FeathersJS for real-time query/mutation.
https://github.com/giapnguyen74/nextql-feathers
OK, let's just start with a simple but important difference: GraphQL is query-based while Falcor is not!
But how do they help you?
Basically, they both help us manage and query data, but GraphQL has a request/response model and returns data as JSON. The idea in GraphQL is to make a single request to get all your data in one go, and to get an exact response for an exact request; so it's something to run on low-speed internet and mobile devices, like 3G networks. So if you have many mobile users, or if for some reason you'd like fewer requests and a faster response, use GraphQL. (Falcor is not too far from this, so read on...)
Falcor by Netflix, on the other hand, usually makes extra requests (usually more than one) to retrieve all your data, even though they are trying to improve it toward a single request. Falcor is more limited in its queries and has fewer pre-defined query helpers.
But for more clarification, let's see how each of them introduces itself:
GraphQL, A query language for your API
GraphQL is a query language for APIs and a runtime for fulfilling
those queries with your existing data. GraphQL provides a complete and
understandable description of the data in your API, gives clients the
power to ask for exactly what they need and nothing more, makes it
easier to evolve APIs over time, and enables powerful developer tools.
Send a GraphQL query to your API and get exactly what you need,
nothing more and nothing less. GraphQL queries always return
predictable results. Apps using GraphQL are fast and stable because
they control the data they get, not the server.
GraphQL queries access not just the properties of one resource but
also smoothly follow references between them. While typical REST APIs
require loading from multiple URLs, GraphQL APIs get all the data your
app needs in a single request. Apps using GraphQL can be quick even on
slow mobile network connections.
GraphQL APIs are organized in terms of types and fields, not
endpoints. Access the full capabilities of your data from a single
endpoint. GraphQL uses types to ensure Apps only ask for what’s
possible and provide clear and helpful errors. Apps can use types to
avoid writing manual parsing code.
Falcor, a JavaScript library for efficient data fetching
Falcor lets you represent all your remote data sources as a single
domain model via a virtual JSON graph. You code the same way no matter
where the data is, whether in memory on the client or over the network
on the server.
A JavaScript-like path syntax makes it easy to access as much or as
little data as you want, when you want it. You retrieve your data
using familiar JavaScript operations like get, set, and call. If you
know your data, you know your API.
Falcor automatically traverses references in your graph and makes
requests as needed. Falcor transparently handles all network
communications, opportunistically batching and de-duping requests.

How do you perform address validation? [closed]

Is it even possible to perform address (physical, not e-mail) validation? It seems like the sheer number of address formats, even in the US alone, would make this a fairly difficult task. On the other hand it seems like a task that would be necessary for several business requirements.
Here's a free and sort of "outside the box" way to do it. Not 100% perfect, but it should reject blatantly non-existent addresses.
Submit the entire address to Google's geocoding web service. This service attempts to return the exact coordinates of the location you feed it, i.e. latitude and longitude.
In my experience if the address is invalid you will get a result of 602 from the service. There's definitely a possibility of false positives or false negatives, but used in conjunction with other consistency checks it could be useful.
(Yahoo's geocoding web service, on the other hand, will return the coordinates of the center of the town if the town exists but the rest of the address is bogus. Potentially useful as long as you pay close attention to the "precision" field in the result).
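The numeric 602 above comes from the long-retired v2 of Google's geocoder; against the current Geocoding JSON endpoint the same idea looks roughly like this TypeScript sketch (the API key and the pass/fail rule are yours to supply):

// Hedged sketch: treat "no geocoder match" as "probably not a real address".
async function looksGeocodable(address: string): Promise<boolean> {
  const url = 'https://maps.googleapis.com/maps/api/geocode/json' +
    `?address=${encodeURIComponent(address)}&key=${process.env.GOOGLE_API_KEY}`;
  const { status, results } = await (await fetch(url)).json();
  // status is e.g. 'OK' or 'ZERO_RESULTS'; beware false positives/negatives.
  return status === 'OK' && results.length > 0;
}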
There are a number of good answers in here but most of them make the assumption that the user wants an "API" solution where they must write code to connect to a 3rd-party service and/or screen scrape the USPS. This is all well and good, but should be factored into the business requirements and costs associated with the implementation and then weighed against the desired benefits.
Depending upon the business requirements and the way the data is received into the system, a real-time address processing solution may be the best bet. If a real-time solution is required, you will want to consider the license agreements and technical limitations of the Google Maps/Bing/Yahoo APIs. They typically limit the number of calls you can make each day. The USPS Web Tools API is the same; in addition, they restrict how and why you can use their system and how you are allowed to use the data thereafter.
At the same time, there are a handful of great service providers that can easily process a static list of addresses. Essentially, you give the service provider a CSV file or Excel file, they clean it up and get it back to you. It's a one-time deal with no long-term commitment or obligation—usually.
Full disclosure: I'm the founder of SmartyStreets. We do address verification for addresses within the United States. We are easily able to CASS-certify a list and we also offer an address verification web service API. We have no hidden fees, contracts, or anything. You use our service until you no longer need it and you can walk away. (Unlike cell phone companies that require a contract.)
USPS has an address cleaner online, which someone has screen scraped into a poor man's webservice. However, if you're doing this often enough, it'd be a better idea to apply for a USPS account and call their own webservice.
I will refer you to my blog post, A lesson in address storage, where I go into some of the techniques and algorithms used in the process of address validation. My key thought is: "Don't be lazy with address storage, it will cause you nothing but headaches in the future!"
Also, there is another Stack Overflow question on this topic entitled How should international geographic addresses be stored in a relational database.
In the course of developing an in-house address verification service at a German company I used to work for, I've come across a number of ways to tackle this issue. I'll do my best to sum up my findings below:
Free, Open Source Software
Clearly, the first approach anyone would take is an open-source one (like openstreetmap.org), which is never a bad idea. But whether or not you can really put this to good and reliable use depends very much on how much you need to rely on the results.
Addresses are an incredibly variable thing. Verifying U.S. addresses is not an easy task, but it is bearable; once you're going for Europe, though, especially the U.K. with its extensive postal code system, the open-source approach will simply lack data.
Web Services / APIs
Enterprise-Class Software
Money gets it done, obviously. But not every business or developer can spend ~$0.15 per address lookup (that's $150 for 1,000 API requests), a very expensive business model that the vast majority of address validation APIs have implemented.
What I ended up integrating: streetlayer API
Since I was not willing to take on the programmatic approach of verifying address data manually I finally came to the conclusion that I was in need of an API with a price tag that would not make my boss want to fire me and still deliver solid and reliable international verification results.
Long story short, I ended up integrating an API built by apilayer, called "streetlayer API". I was easily convinced by a simple JSON integration, surprisingly accurate validation results and their developer-friendly pricing. Also, 100 requests/month are entirely free.
Hope this helps!
I have used the services of http://www.melissadata.com. Their "address object" works very well. It's pricey, yes. But when you consider the costs of writing your own solution, the cost of dirty data in your application, returned mailers, lost sales, and the like, the costs can be justified.
For US-based address data my company has used GeoStan. It has bindings for C and Java (and we created a Perl binding). Note that it is a commercial product and isn't cheap. It is quite fast though (~300 addresses per second) and offers features like CASS certification (USPS bulk mail discount), DPV (Delivery Point Validation) flagging, and LON/LAT geocoding.
There is a Perl module Geo::PostalAddress, but it uses heuristics and doesn't have the other features mentioned for GeoStan.
Edit: some have mentioned 'doing it yourself', if you do decide to do this, a good source of information to start with is the US Census Tiger Data Set, which contains a lot of information about the US including address information.
As seen on reddit:
<?php
// Geocode the address with Yahoo's geocoding service (since retired) and
// inspect the JSON result to judge whether the address resolves.
$address = urlencode('1600 Pennsylvania Avenue, Washington, DC');
$json = json_decode(file_get_contents("http://where.yahooapis.com/geocode?q=$address&flags=J"));
print_r($json);
The Fixaddress.com service provides the following:
1) Address validation.
2) Address correction.
3) Address spell correction.
4) Correction of phonetic mistakes in addresses.
Fixaddress.com uses USPS and TIGER data as reference data.
For more detail, visit the link below:
http://www.fixaddress.com/
One area where address lookups have to be performed reliably is for VOIP E911 services. I know companies reliably using the following services for this:
Bandwidth.com 9-1-1 Access API MSAG Address Validation
MSAG = Master Street Address Guide
https://www.bandwidth.com/9-1-1/
SmartyStreets US Street Address API
https://smartystreets.com/docs/cloud/us-street-api
There are companies that provide this service. Service bureaus that deal with mass mailing will scrub an entire mailing list so that it's in the proper format, which results in a discount on postage. The USPS sells databases of address information that can be used to develop custom solutions. They also have lists of approved vendors who provide this kind of software and service.
There are some (but not many) packages that have APIs for hooking address validation into your software.
However, you're right that it's a pretty nasty problem.
http://www.usps.com/ncsc/ziplookup/vendorslicensees.htm
As mentioned, there are many services out there; if you are looking to truly validate the entire address, then I highly recommend going with a web-service-type solution to ensure that changes can quickly be recognized by your application.
In addition to the services listed above, webservicex.net has this US Address Validation service: http://www.webservicex.net/WCF/ServiceDetails.aspx?SID=24
We have had success with Perfect Address.
Their database has all the US street names and street number ranges. Also acts as a pretty decent parser for free-form address fields, if you are lucky enough to have that kind of data.
Validating that an address is valid is one thing.
But if you're trying to validate that a given person lives at a given address, your only near-guarantee would be a test mailing to the address, and even that is not certain if the person is organised or knows somebody at that address.
Otherwise people could just specify an arbitrary random address which they know exists, and it would mean nothing to you.
The best you can do for immediate results is to request that the user send a photographed/scanned copy of the head of their bank statement or some other proof of recent residence, because at least then they have to work harder to forge it, and forged documents show up easily under a basic level of image forensic analysis.
There is no global solution. For any given country it is at best rather tricky.
In the UK, the Post Office controls postal addresses, and can provide (at a cost) address information for validation purposes.
Government agencies also keep an extensive list of addresses, and these are centrally collated in the NLPG (National Land and Property Gazetteer).
Actually validating against these lists is very difficult. Most people don't even know exactly how their address is held by the Post Office. Some businesses don't even know what number they are on a particular street.
Your best bet is to approach a company that specialises in this kind of thing.
Yahoo also has a Placemaker API. It is good only for locations, but it has a universal ID for all world locations.
It looks like there is no standard in the ISO list.
You could also try SAP's Data Quality solutions, which are available both as a server platform for processing a large number of requests and as an embeddable SDK if you want to run it in-process with your application. We use it in our application and it's very robust and scalable.
NAICS.com is coming out with an API that will add all kinds of key business data, including street address. This happens on the fly as your site's forms are processed. https://www.naics.com/business-intelligence-api/
You can try Pitney Bowes' "IdentifyAddress" API, available at https://identify.pitneybowes.com/
The service analyses and compares the input addresses against the known address databases around the world to output a standardized detail. It corrects addresses, adds missing postal information, and formats them using the format preferred by the applicable postal authority. It also uses additional address databases so it can provide enhanced detail, including address quality, type of address, transliteration (such as from Chinese Kanji to Latin characters), and whether an address is validated to the premise/house-number, street, or city level of reference information.
You will find a lot of samples and SDKs available on the site, and I found it extremely easy to integrate.
For US addresses you can require a valid state and verify that the ZIP code is valid. You could even check that the ZIP code is in the right state, but beyond that I don't think there are many tests you could run that wouldn't produce a lot of false negatives.
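A minimal TypeScript sketch of those first two checks (state code and ZIP shape only; it cannot catch a wrong-but-well-formed address):

// Cheap sanity checks: a known two-letter state code and a plausible ZIP.
const US_STATES = new Set([
  'AL','AK','AZ','AR','CA','CO','CT','DE','FL','GA','HI','ID','IL','IN','IA',
  'KS','KY','LA','ME','MD','MA','MI','MN','MS','MO','MT','NE','NV','NH','NJ',
  'NM','NY','NC','ND','OH','OK','OR','PA','RI','SC','SD','TN','TX','UT','VT',
  'VA','WA','WV','WI','WY','DC',
]);

function basicUsAddressCheck(state: string, zip: string): boolean {
  const zipOk = /^\d{5}(-\d{4})?$/.test(zip); // 12345 or 12345-6789
  return US_STATES.has(state.toUpperCase()) && zipOk;
}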
What are you trying to do: prevent simple mistakes, or enforce some kind of identity check?
