In Google Cloud Platform «Start training» is disabled - google-cloud-automl

I am training a model in GCP's AutoML Natural Language Entity extraction.
I have 50+ annotations for each label but still can't start training a model.
Take a look at a screenshot of the train section: the Start training button remains grey and cannot be selected.

Looking at the screenshot, it seems you are training an AutoML Entity Extraction model. This issue appears to be the same as in Unable to start training my GCP AutoML Entity Extraction model on Web UI.
There are a couple of reasons that may result in this behavior:
Your dataset is located in a specific region (e.g. "EU") and you need to specify the proper endpoint, as shown in the official documentation.
You might need to increase the number of "Training items per label" to 100 at minimum (see Natural Language limits).
From the aforementioned post, the solution seems to be the first one.
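For that first case, a minimal sketch of pointing the Python client at the EU regional endpoint might look like this (YOUR_PROJECT_ID is a placeholder for your own project):

# Sketch assuming the google-cloud-automl Python client;
# YOUR_PROJECT_ID is a placeholder for your own project.
from google.cloud import automl

# Datasets stored in the EU must be reached through the EU endpoint.
client = automl.AutoMlClient(
    client_options={"api_endpoint": "eu-automl.googleapis.com:443"}
)

project_location = "projects/YOUR_PROJECT_ID/locations/eu"
for dataset in client.list_datasets(parent=project_location):
    print(dataset.name, dataset.display_name)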

Using Graphite/Grafana for non time based data

I have been reading through documentation and have been able to get Graphite to receive data I have been sending it. I can definitely see the benefit in tracking things like concurrent users and network load.
But now I have been tasked with implementing it on the client to show things like RAM and CPU usage. In addition to this, I must also track actions (users buying things). Maybe I am missing a large chunk of the picture here, but I am not sure how I would do these things. Do I need a timestamp? I've also seen plugins for pie charts and such, which would indicate I could perhaps create graphs from devices with different statistics.
What am I missing?
I don't think you're missing anything.
Any data you send to, say, InfluxDB (as that's what I've used the most) will be timestamped automatically when it arrives, unless you specify an explicit timestamp yourself.
If you're showing, for example, CPU load you can write your query to pick up the latest value, or perhaps an average (mean value) over time, or the max or min over a period of time as appropriate.
Pie charts can also be used successfully to graph relationships over time.
The key is to create a specific query (I use SQL directly) to craft the data used for the panel type. There are excellent examples in the documentation.
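As a concrete sketch (assuming the influxdb Python client for InfluxDB 1.x; the database and measurement names are just examples), a point written without a "time" key is timestamped by the server on arrival, while an explicit "time" key overrides that:

# Sketch assuming the influxdb Python client (InfluxDB 1.x); database
# and measurement names are example values.
from influxdb import InfluxDBClient

client = InfluxDBClient(host="localhost", port=8086, database="metrics")

# No "time" key: the server timestamps the point when it arrives.
client.write_points([
    {"measurement": "cpu_load",
     "tags": {"host": "client42"},
     "fields": {"value": 0.64}},
])

# Explicit "time" key: the point is stored with this timestamp instead.
client.write_points([
    {"measurement": "purchase",
     "tags": {"item": "widget"},
     "fields": {"amount": 9.99},
     "time": "2023-01-01T12:00:00Z"},
])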

Correct recognition results of Google Speech API

Is it possible to train Google Speech API with sample data to help the recognition in my application?
What I mean is an approach like the one provided by wit.ai and described here (even though the example applies to NLP processing). Basically, if you can predict the interactions your users will have with your bot, you can train it to perform better. For instance, I know the subset of cities that will be used; e.g. it seems I cannot make the bot understand me when I say Zurich, it becomes Syria or Siberia, but I already know those are not possible. So if I could, say, upload a list of preferred words to be matched first, and fall back to standard recognition (or some similar approach) when no match is found there, I think it would achieve better results.
Any idea if it is possible and how? I know those APIs are in beta stage and subject to change, but I would still like to give it a try.
I can upload a code sample of what I am currently doing, though so far it just sends audio and analyzes the result, so it isn't really close to this problem.
In the recognition config you can request multiple alternatives with the maxAlternatives field (up to 30). Once you have 30 alternatives with confidence scores, you might get Syria with confidence 0.5, Siberia with confidence 0.01 and Zurich with confidence 0.1. Usually the proper answer is present, although it might not be at the top. You can then select the best alternative according to your application's current state.
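A minimal sketch with the google-cloud-speech Python client (the encoding, sample rate and gs:// URI are assumptions; adjust them to match your audio):

# Sketch assuming the google-cloud-speech Python client and LINEAR16
# audio at 16 kHz; the gs:// URI is a placeholder.
from google.cloud import speech

client = speech.SpeechClient()
config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
    max_alternatives=30,  # ask for up to 30 ranked hypotheses
)
audio = speech.RecognitionAudio(uri="gs://your-bucket/your-audio.raw")

response = client.recognize(config=config, audio=audio)
for result in response.results:
    for alternative in result.alternatives:
        # Match each hypothesis against the subset of cities your
        # application actually expects, instead of trusting the top one.
        print(alternative.confidence, alternative.transcript)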
The current Google Cloud Speech-to-Text API allows the user to specify a list of words and phrases that provide hints to the speech recognition task.
From https://cloud.google.com/speech-to-text/docs/basics (mirror):
speechContext - (optional) contains additional contextual information for processing this audio. A context contains the following sub-field:
phrases - contains a list of words and phrases that provide hints to the speech recognition task.
For more details, see: https://cloud.google.com/speech-to-text/docs/basics#phrase-hints (mirror).
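A sketch of passing phrase hints with the google-cloud-speech Python client (the phrase list and gs:// URI are example values):

# Sketch assuming the google-cloud-speech Python client; the phrase
# list and gs:// URI are example values.
from google.cloud import speech

client = speech.SpeechClient()
config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
    # Bias recognition toward the city names you know can occur.
    speech_contexts=[speech.SpeechContext(phrases=["Zurich", "Geneva", "Basel"])],
)
audio = speech.RecognitionAudio(uri="gs://your-bucket/your-audio.raw")
response = client.recognize(config=config, audio=audio)
print(response.results[0].alternatives[0].transcript)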

Performance Issue with Doctrine, PostGIS and MapFish

I am developing a WebGIS application using Symfony with the MapFish plugin http://www.symfony-project.org/plugins/sfMapFishPlugin
I use the GeoJSON produced by MapFish to render layers through OpenLayers, in a vector layer of course.
When I show layers with up to 3k features everything works fine. When I try with layers of 10k features or more, the application crashes. I don't know the exact threshold, because I either have layers with 2-3k features or layers with 10-13k features.
I think the problem is related to Doctrine, because the last entry in the log is something like:
Sep 02 13:22:40 symfony [info] {Doctrine_Connection_Statement} execute :
and then the query to fetch the geographical records.
As I said, I think the problem is the number of features, so I used OpenLayers.Strategy.BBox() to decrease the number of features to fetch and show. The result is the same: the app seems stuck while executing the query.
If I add a limit to the query string used to get the features' GeoJSON, the application works. So I don't think this is related to the MapFish plugin, but to Doctrine.
Does anyone have some enlightenment?
Thanks!
Even if it's theoretically possible, it's a bad idea to try to show so many vector features on a map.
You'd better change the way features are displayed (e.g. raster for low zoom levels, get feature on click…).
Even if your service answers in a reasonable time, your browser will be stuck, or at least will have very bad performance…
I'm the author of sfMapFishPlugin and I never ever tried to query so many features, much less to show them on an OL map.
Check out the OpenLayers FAQ on this subject: http://trac.osgeo.org/openlayers/wiki/FrequentlyAskedQuestions#WhatisthemaximumnumberofCoordinatesFeaturesIcandrawwithaVectorlayer . It's a bit outdated given recent browser improvements, but 10k vector features on a map is not reasonable.
HTH,

Separated Presentation on a UI Centric Application

I'm having trouble figuring out the correct architecture for this kind of application: it's a diagramming application, which resembles MS Visio. The diagrams are used to generate data which is passed to another application.
When designing applications, I've always tried to use layering, but now I can't decide how to do this when the data is so tightly coupled with the presentation. For example, a certain object in my canvas has (X,Y) data, which is used for presentation purposes only, but has to be stored like domain data.
Where am I getting things wrong? I'm pretty sure I'm looking at this from the wrong angle, but I can't figure out the right one.
Thanks again!
UPDATE:
I'm also aware that maybe I shouldn't be separating UI from domain in this case. If that is so, please provide me with some rationale for when to apply separation and when not to.
In a diagramming tool the x/y position of a shape is part of the domain data (the location of the shapes is part of the diagram: you can't draw the diagram without it); the code that uses those x/y coordinates to draw a shape on the screen is part of the presentation tier.
I know some people think that data used only for display should be saved separately, but in every project I've ever worked on that saved such data separately, this turned out to be a huge maintenance and support nightmare.
In a simple diagramming tool (if the tool just draws and edits the diagram without any fancy processing based on it) there is no business logic; there's only the code that draws and edits the diagram (which belongs in the presentation tier) and the diagram data (which is the domain model).
If there is no business logic, then by using a separate set of objects for domain and presentation you'll have to keep all your model data twice (once in the model objects and once in the presentation objects), and you won't get any advantage from separating the business logic from the presentation (because there isn't any).
On the other hand, if you do have some algorithms you run on the data you do have something to gain by separating the graph data from the drawing code - you can run the algorithm outside the tool, you can have better automated tests, etc.
Also, if you write another system that operates on the same data, you can at least share the model definition and the save/load code if you separate them from the drawing code.
So, let's summarize:
All the diagram data is part of the model (including data only used for presentation purposes).
Anything that draws to the screen or handles user input is in the presentation tier (obviously).
If those two cover all your code and data, then your application doesn't have any "business logic" and the tier separation is probably overkill.
If you have any code that doesn't fit into those two categories and you think it should be part of the model, then you should build the two separate tiers.
If there's any chance for code sharing between systems you should make sure the shared code is not mixed in with the presentation code.
And one last "bonus" point: if this is a project that's likely to be in active development for a long time, with new features added in the future, you may want to separate the UI and data anyway just to make future work easier. You have to decide whether this future saving is worth the extra time now, and whether the separation is really likely to help later.
I think you need to make sure you're keeping the what and the how separate. What you are displaying is abstract: sets of coordinates, shape types. How you're displaying it is very specific. I'd make sure the domain model dealt purely with the what and the view layer dealt uniquely with the how. It's hard to get into specifics, though, without knowing more about your app.
You could try to implement some kind of view model which saves the current layout of your objects. This way, x/y values together with the id of the object are stored in a layout file, while pure model data is stored elsewhere.
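As a rough sketch of that split (all names here are invented for illustration, not taken from any framework):

# Rough sketch of separating domain data from a layout view model;
# every name here is invented for illustration.
from dataclasses import dataclass

@dataclass
class Shape:            # pure domain data: what the diagram contains
    shape_id: int
    kind: str           # e.g. "rectangle", "connector"
    label: str

@dataclass
class ShapeLayout:      # view model: where the shape is drawn
    shape_id: int       # references the domain object by id
    x: float
    y: float

# The model and its layout can be saved to separate files, so the data
# passed to the downstream application never needs to see x/y at all.
model = [Shape(1, "rectangle", "Order")]
layout = [ShapeLayout(1, x=120.0, y=80.0)]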
Maybe this helps a bit,

Documents for a project? [closed]

I work for a CMMI level 5 certified company and one thing I hate about it is the amount of documents we prepare (as a programmer I already hate documents). We have lots and lots of documents like the PID (project initiation doc), business requirements, system requirements, tech spec, code review checklist, issue logs, defect logs, configuration management plan, configuration management checklist(s), release documents and lots more...
Almost 90% of these docs are done just for the sake of the QA audit :) What do you think are the most important documents for a project? What documents can be used in the long run by another developer?
Please share your good practices here. I would like to use them for my own projects or the company I am planning to start in the long run.
Thanks
The key document is a good functional spec. There should be one and only one reference document for a system.
Overdoing documentation proliferates a large number of small requirements and spec documents every time someone changes a system or interface. For a system of any complexity, before long your spec is distributed across several hundred assorted Word, Excel, Visio and even PowerPoint files. When this happens you lose clarity about what is current, or even whether you have located and identified all the pertinent documentation.
The BRD-SRD-Tech spec progression is based on an assumption that the business signs off the BRD, a business analyst signs off the SRD against requirements documented in the BRD and the technical specification is signed off against the SRD. This generates a web of sign-offs, multiple documents with redundant information and makes it difficult and clumsy to keep the spec documents up to date.
Because of this, subsequent requirements documentation tends to take the form of a series of change requests and supplemental requirement and spec docs, each with its own sign-off and audit process. You gain CYA and an audit trail (or at least the appearance of one), but you lose clarity. There is now no definitive reference document for the system and it is difficult to establish what is current or relevant to any particular activity. The net result is that your business analysis process gets bogged down in forensic research, which adds overhead and latency to delivery schedules.
A spec document should be built in such a way that there is one definitive reference for any given system or subsystem. The document should be kept up to date and versioned. Get a good technical documentation tool like FrameMaker, so your process can scale and the document has some structural integrity of the sort lacking in Word.
For me the only real document I ever use is a spec, and the more detail the better. However, it doesn't need to be completed all at once, and it doesn't need to be particularly formal. Far more useful to me than documents that are checked and signed and double-checked and double-signed is always being able to get the latest version of a document, being able to talk to people about what they have written, and being able to get a decision in the case of any ambiguity.
To sum up: a spec is the only document I have ever found useful; however, it pales in comparison to having a project manager who knows the proposed system inside out and can make sensible decisions based on what they know.
Documentation is like tofu -- most people hate it until they realize that under the right conditions, it can be really good.
The problem is that what you consider documentation is mostly made for documentation's sake. You, as a developer, don't see any immediate value in the documents you produce because you know you can do your job without all the TPS reports which you're required to make.
Unfortunately, I'm going to wager that there's not a lot you can do about it in a company where you're being forced to eat raw tofu all the time. You'll probably just have to suck it up and write the docs your company requires, but you can at least do one thing: write documents which are useful to you, and pass them along with your code for whoever maintains it.
Aside from inline documentation, you could set up a wiki to be used by yourself and people on your team. This type of documentation is searchable, which is already a big plus to developers, plus it's more of a living document instead of a homework-like paper you had to write. You already post to SO, so just think of your documentation as pooling your knowledge in a more useful place.
What do you think are the most important documents for a project?
Different people have different needs: for example the documents which the owner needs (e.g. the business contract) aren't the same as the documents which QA needs.
What documents can be used in the long run by another developer?
IMO the most important document (except for the source code) is the functional specification, because what the software is supposed to do (as opposed to what it is doing) is the one thing that can't necessarily be reverse-engineered. See also How does a good developer keep from creating code with a low bus hit factor?
User Stories, burndown chart, code
I'm a fan of the old 4+1 views:
Use Case view (a/k/a user stories). There are several forms: proper use cases, forward-looking use cases that aren't as well defined and epics which need to be decomposed.
Logical view. The "static" view. UML Class diagrams and the like work well here as a design document. This also includes request and response formats for various protocols. Here is where we document the RESTful requests and responses. This includes the REST URI design.
Process view. The "dynamic" view. UML activity diagrams, sequence diagrams, statecharts and the like work here as design documents. In some cases, simple narratives work well. In other cases, there's a State design pattern, and it requires a combination of class diagrams and statecharts to show how the stateful objects interact.
This also includes protocols (e.g. REST). Here is where we define any special processing for the various REST requests.
This also includes any authentication or authorization rules, and any other cross-cutting aspects like security, logging, etc.
Component view. The pieces we're building for deployment. This includes the stuff we depend on, the structure of the modules and packages, etc. This is often a simple component diagram or a list of components and their dependencies.
Deployment view. We try to generate this from the code as deployed. Since we're using Python, we use epydoc to create the API documentation. We also use Sphinx to import module documentation into this view of the software.
This also includes the parameters, settings, and configuration details.
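As an aside, the Sphinx side of that Deployment view can be a very small configuration. A minimal conf.py sketch, with placeholder project metadata:

# Minimal Sphinx conf.py sketch; the project name is a placeholder.
# sphinx.ext.autodoc pulls module docstrings into the generated docs.
project = "YourProject"
extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.viewcode",  # link the documentation back to the source
]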
These five views, however, aren't sufficient on their own.
When projects start, you have to work up to this through a series of sprints.
The first sprints build just the use case view.
Subsequent sprints build an "architecture" to implement the use cases. The architecture document has 4+1 views, but at a high level of abstraction. It summarizes the structure of the model schemas, the requests and replies, the RESTful processing, other processing, the expected componentry, etc. It never has a Deployment view; we generally reference the operator guide and API documents as the deployment view of an architecture.
Then design-and-construction sprints build (and update) detailed 4+1 view documents for various components.
Then release sprints build (and update) the deployment views.
From the project point of view, the most important documents are those that normally include the word Plan, such as the Project Plan, Configuration Management Plan, Quality Plan, etc.
What you are describing is common in process improvement efforts, and normally stems from two major causes. One is that the system really is overreaching and getting in the way of real work being done. The other is actually answered in your question: it is not that the documents are done only for the sake of audits; your focus should be not just how useful the doc is for other developers, but how useful it is for the project or the company as a whole.
One usually looks at things from one's own perspective; sometimes it's necessary to look at the bigger picture.
