I need to know how I can automate labeling in Serena Dimensions. I want to improve my workflow, and one idea is to automate the labeling step.
Thanks for your help.
It depends what you mean by a label. In Dimensions there are projects/streams and baselines, along with 'named branches' (a single branch name for a stream, but many available for projects). You can set your defaults under Project | Preferences. If you have the privilege, you can create new branch names from the console using the DVB command - e.g. DVB release_1.0 /desc="Release 1.0". Be careful to have a decent naming convention, as branch names must be unique across all your products in your base database. They can be reused across projects, though.
I'm trying to help my team streamline a data ingestion process that is taking up a substantial amount of time. We receive data in multiple formats and with attributes arranged differently. Is there a way using RapidMiner to create a process that:
Processes files that are dropped into a folder on a schedule (this one I think I know, but I'd love tips, as scheduled processes are new to me)
Automatically identifies input filetype and routes to the correct operator ("Read CSV" for example)
Recognizes a relatively small number of attributes and arranges them accordingly. In some cases, attributes are named the same way as our ingestion format and in others they are not (phone vs phone # vs Phone for example)
The attributes we process mostly consist of name, id, phone, email, address. Also, in some cases names are split first/last and in some they are full name.
I recognize that munging files for such simple attributes shouldn't be that hard but the number of files we receive and lack of order makes it very difficult to streamline a process without a bit of automation. I'm also going to move to a standardized receiving format but for a number of reasons that's on the horizon and not an immediate solution.
I appreciate any tips or guidance you can share.
Your question is relatively broad, so unfortunately I can't give you a complete answer. But here are some ideas on how I would tackle the points you mentioned:
For full process scheduling, RapidMiner Server is what you are looking for. In that case you can either define a schedule (e.g., check regularly for new files) or even define a web service to trigger the process.
For selecting the correct operator depending on file type, you could use a combination of "Loop Files" and macro extraction to get the correct type, and then use either "Branch" or "Select Subprocess" for switching to different input routes.
The "Select Attributes" operator has some very powerful options to
select specific subsets only. In your example I would go for a
regular expression akin to [pP]hone.* to get the different spelling
variants. Also very helpful in that case would be the "Reorder
Attributes" operator and "Rename by Replacing" to create a common
naming schema.
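Outside RapidMiner, the same rename-by-pattern idea can be sketched in a few lines of JavaScript. The variant patterns and canonical names below are assumptions based on the attribute examples in the question (phone vs phone # vs Phone), not an exhaustive mapping:

```javascript
// Map assorted header spellings onto one canonical attribute name.
// Patterns are illustrative; extend them for your own data sources.
const CANONICAL = [
  { name: "phone",   pattern: /^phone\s*#?$/i },
  { name: "email",   pattern: /^e-?mail$/i },
  { name: "id",      pattern: /^id$/i },
  { name: "name",    pattern: /^(full\s*)?name$/i },
  { name: "address", pattern: /^address$/i },
];

function normalizeHeader(header) {
  const trimmed = header.trim();
  const match = CANONICAL.find((c) => c.pattern.test(trimmed));
  // Fall back to a lowercased copy for headers we don't recognize.
  return match ? match.name : trimmed.toLowerCase();
}
```

Inside RapidMiner itself, the equivalent is a "Rename by Replacing" operator with a regex such as `[pP]hone.*` as the replace-what expression.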
A general tip when building more complex process pipelines is to organize your different tasks in sub-processes and use the "Execute Process" operator. This makes everything much more readable and maintainable. Also a good error handling strategy is important to handle unforeseen data formats.
For more elaborate answers and tips from many advanced RapidMiner users, I also highly recommend the RapidMiner community.
I hope this gives a good starting point for your project.
I'm developing a desktop app in `Electron` which allows the "non-pro" user to import (copy) images from their local drive into a project directory which they created earlier. Through the platform dialog (OSX or Windows), the user can select single or multiple images, or single or multiple directories, which could also include sub-directories.
I know how to handle the coding but I am stumped on a strategy to avoid naming conflicts, particularly as images may be coming from camera files which use a simple naming scheme, with batch imports from different camera sessions having the same names.
For a simple example, a user could pick both the "DCIM" directories below, or make selections from within each of the directories of files with the same name.
This is likely a very common programming issue and there must be some solutions which people smarter than me have come up with – but I don't know what this problem is called, in order to search for them.
The solution that I've seen is to look for a naming conflict, and then append something to the name of the thing being imported, before the ending. So you'll see files named foo.txt, foo-001.txt, foo-002.txt, and so on.
If you expect a great many conflicts, the appended text should be random instead of sequential. That's because sequential naming takes 51 duplicate checks before settling on foo-050.txt, while a random suffix needs only about two checks on average before settling on something like foo-kyc.txt. The performance difference can be quite noticeable when many files conflict.
I was reluctant to post this question at first as it seems like functionality that could be pretty fundamental to the way in which TFS manages work items, but I cannot find anything documented that covers it; how do I categorise TFS work items (more specifically, new tasks)?
I've created a bunch of tasks. Some may fall under 'solution setup', others fall under 'core development' for example. How do I represent this categorisation in TFS? Is it something I need to consider when I'm creating the new tasks? Or are the work items brought back in this way during the query?
There are a handful of ways that people typically categorize/organize their Tasks:
Group tasks by User Story. This is done by linking the WIs, and this information will show up in WI queries and on the Task Board (the task board is only available in TFS 2012 and up).
Use the Area field and Area hierarchy to categorize your Tasks. The Area hierarchy is typically used to represent a functional breakdown of your application, then applied to WI's to categorize them based on which functional area they affect.
Activity field. There is a field on Task work items called Activity that by default has the values: Deployment, Design, Development, Documentation, Testing, Requirements. This list can be customized by editing the Work Item Type Definition.
Surprisingly difficult to find anything on this. I was already using 'linked item' as I should; the problem was the query. I created a new query and set the type to 'Tree...' so that the hierarchical structure was pulled back in a way that mimicked a tasks-and-sub-tasks structure.
I have been optimizing our continuous integration builds, and the remaining bottleneck seems to be the following ClearCase commands:
cleartool.exe mklbtype -nc -ordinary BUILD_ApplicationWorkspace_1.0.0.0#vob_example
For a view with 1800 files, this is taking over 6 minutes to complete. Our MSBuild task takes half that. I am guessing the bulk of the bottleneck is network bandwidth but also how we are labeling the files used in this build.
Based on this, I have a few questions:
Are we efficiently labeling the source code files, or is there a more efficient command we can run?
How can I get better metrics to understand where this ClearCase command is spending the bulk of its time?
Do prior labels slow ClearCase labeling down?
Related, does ClearCase have anything similar to Git Sub-modules or svn:externals? Currently we are creating a view of everything, including dependencies, before we do the build.
Thanks for your help.
cleartool mklbtype shouldn't take that long: it is about creating the label type, not about applying it to each and every one of your files.
If anything, it is mklabel that should take time.
Applying the UCM methodology (as opposed to your current "Base ClearCase" usage) might help in that:
it forces you to define "components" (coherent groups of files, i.e. not "1800 files", which is quite large)
it proposes "incremental baseline", which only labels what has changed.
the UCM components are a bit akin to git submodules (you can group them in a composite baseline), but that is not the same as svn:externals, as mentioned here and in "Why are git submodules incompatible with svn externals?".
But if you are stuck with Base ClearCase, you are stuck with labelling everything, and one avenue for optimization would be to label only a subset of those files.
I use multiple build boxes which have StarTeam on them. I see a peculiar thing: all the build boxes differ in checkout label. Is there any specific setting that would let me maintain a uniform default checkout label, so that I don't need to pick the desired label from the drop-down each time on every build box?
Jeremy Murray said:
I don't quite get the question - when you go to the checkout dialog,
the ordering of the labels is different from machine to machine?
In pre-2005R2 clients, the labels would be in order of creation time.
In 2005R2 and later clients, they should be in alphabetical order.
I think this actually has to do with the Java version used,
as we had some weird dialog differences on identical client installs with different Java installs.