Music21 and D3.js for music feature extraction and visualization?

I am looking for suggestions on what tools could be used for the following scenarios about music feature extraction and visualization (on my Mac):
identify and group notes in a score (from different voices/instruments) that sound concurrently (even if they are attacked at different time offsets but overlap at some point because of their different durations); then connect them graphically (e.g. on a score representation, with a line connecting them)
identify melodic and accompanying parts (assigned to different voices/instruments, perhaps interchangeably within the same voice/instrument)
extract initial tonality and following modulations; then map all extracted tonalities on a scale based on the circle of 5ths (where 0 is the initial tonality, -1 is one 5th lower, +1 one 5th higher, etc.)
I have been thinking of using music21 (the music works I am interested in are part of its corpus), but I am not sure if this is the right way to go. Are there also other tools (e.g. jSymbolic2??) that could help?
And what about visualization? Could the above scenarios be visually "solved" within music21 or would I need an additional tool, like D3.js (which I have briefly used in the past)?
If you have any advice on any of the above scenarios, that would help me a lot! Thanks, Ilias
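To make the third scenario concrete, the circle-of-fifths scale can be sketched in plain Python, independent of whichever tool ends up extracting the keys. The function names are hypothetical, and placing a minor key at the position of its relative major is an assumption made for this sketch, not something any library prescribes. Flats may be written as "b" or, as music21 does, as "-".

```python
FIFTHS_ORDER = "FCGDAEB"  # letter names in ascending fifths; C is at index 1


def fifths_position(tonic: str, mode: str = "major") -> int:
    """Position of a key on the line of fifths, with C major at 0."""
    letter, accidentals = tonic[0].upper(), tonic[1:]
    position = FIFTHS_ORDER.index(letter) - 1
    # Each sharp moves seven fifths up, each flat seven fifths down.
    sharps = accidentals.count("#")
    flats = accidentals.count("b") + accidentals.count("-")
    position += 7 * (sharps - flats)
    if mode == "minor":
        # Assumption: a minor key shares the position of its relative major.
        position -= 3
    return position


def fifths_offsets(keys):
    """Map a sequence of (tonic, mode) pairs to offsets from the first key."""
    positions = [fifths_position(tonic, mode) for tonic, mode in keys]
    return [p - positions[0] for p in positions]


keys = [("C", "major"), ("G", "major"), ("F", "major"),
        ("A", "minor"), ("Bb", "major")]
print(fifths_offsets(keys))
```

The first key always maps to 0, one fifth higher to +1 and one fifth lower to -1, which matches the scale described in the question.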

Related

Cytoscape vs STRING for long list of proteins

I am mid-way through my university project, and I have run into an issue. I have a long list of around 1000 proteins that I wanted to analyse in STRING, however, my list is too large. I decided to try and utilise Cytoscape (and downloaded the stringApp), but the networks generated are still very messy. I've attached a screenshot here. Is there any way to improve the presentation of the network by downloading any Cytoscape apps or by tweaking the settings?
Thanks in advance
Well, the short answer is "no". A slightly longer answer is "it depends".
Showing a hairball really isn't helpful, usually, so you need to refine things somewhat. What is your data source (i.e. where did the 1000 proteins come from)? What do you hope to see in the network? If you are looking for particular groups of proteins (e.g. complexes), you would probably want to use MCL to cluster them first. If you have some other data you want to map, such as transcriptomic or proteomic data, you could refine your network based on fold change or abundance values.
All that being said, there are some things you might try. First, you are seeing the "fast" version of the network. Try clicking on the show graphics details button (the diamond in the network view tool bar). That will give you the full graphics details. Second, you might try spreading the network out a bit by using the Layout->Layout Tools. Turn off the "Selected Only" and then adjust the scale. Finally, depending on your biological question, you might want to eliminate proteins that are only present in the nucleus or cytoplasm, or are only in lung tissue. This is all possible using the sliders provided by the stringApp's Results Panel.
-- scooter

Teller transactions archive - print barcode on papers

I am looking into options for auto-indexing the daily documents generated by tellers in bank operations. The documents do not have any reference number and are handwritten by the customer.
So to auto-index these documents and store them in the EDMS, we have to put the core bank transaction reference number on each one. What options do I have? Print a barcode label that contains this transaction number and attach it to the paper? Or have a machine that I can feed the paper into and that can print a barcode on it?
Does anyone know what the right HW or SW for this is?
Thanks
Depends on how complex you want to be. Perhaps these documents could be multiple (stapled?) pages. Would you want to index each page, and would the documents then form an associated sequence (e.g. doc. 00001-01 to -20)?
Next caper is to consider the form of the number. It's best to formulate a check-digiting system so that a printed number can be manually entered and the check-digit verifies that the number hasn't been miskeyed.
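As a rough illustration of check-digiting, here is a minimal sketch that uses the Luhn algorithm. Luhn is only one possible scheme and is swapped in here as an assumption; the answer does not name a specific one. The function names are hypothetical.

```python
def luhn_check_digit(payload: str) -> str:
    """Compute the Luhn check digit for a string of decimal digits."""
    total = 0
    # Walk right to left; the rightmost payload digit is doubled first.
    for index, char in enumerate(reversed(payload)):
        digit = int(char)
        if index % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return str((10 - total % 10) % 10)


def is_valid_reference(number: str) -> bool:
    """Check a keyed-in number whose last digit is the check digit."""
    if not number.isdigit() or len(number) < 2:
        return False
    return luhn_check_digit(number[:-1]) == number[-1]


reference = "7992739871"
printed = reference + luhn_check_digit(reference)
print(printed, is_valid_reference(printed))
```

A number printed this way can be re-keyed by hand, and a single mistyped digit or most swaps of adjacent digits will fail the validation.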
Now - if these documents may be different sizes for instance, or potentially a wad of paperwork, how would you feed them through a printer?
So I'd suggest that a good choice would be to produce your numbers on a specialist barcode-printer with human-readable line on the same label. Some idiot will want to insist on using cheap thermally-sensitive labels, but these almost inevitably deteriorate with time. I'd choose thermal-transfer labels which are a little more complex - your tellers would need to be able to load label-rolls and also the transfer-ribbon (a little like a typewriter-ribbon, if you remember those) but basically any monkey could do it.
Even then, there are three grades of ribbon - wax, resin and a combination. Problem with wax is that it can become worn - same thing as you get with laser-printing where the pages get stuck together if they are left to their own devices for a while. Another reason you don't use laser printers in this role - apart from the fact that you'd need to produce sheets of labels to attach rather than ones and twos on-demand is that the laser processing will cook the glue on the sheets. Fine for an address label with a lifetime of a few days, but disastrous when you may be storing documents for years. Document goes one way, label another...
Resin is the best but most expensive choice. It has better wearing characteristics.
My choice would be a Zebra TLP2824plus using thermal-transfer paper and resin ribbon. Software is easy - just means you need to go back 20 years in time and forget all about drivers - just send a string to the printer as if it was a generic text printer. The formatting of the label - well, the manual will show you that...
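To give an idea of what "just send a string" can look like, here is a minimal sketch that builds a label job as plain text. It assumes the printer is set to Zebra's EPL2 command language; the coordinates, bar widths and height are placeholder values, and the function name is hypothetical. Check every parameter against the printer's programming manual before relying on it.

```python
def build_label_job(reference: str) -> bytes:
    """Build a one-label print job as raw text (assumes EPL2 mode)."""
    if '"' in reference:
        raise ValueError("reference must not contain a double quote")
    commands = [
        "N",  # clear the printer's image buffer
        # Barcode: x, y, rotation, type (1 = Code 128), narrow bar,
        # wide bar, height in dots, B = print the human-readable line.
        f'B20,20,0,1,2,4,60,B,"{reference}"',
        "P1",  # print one copy of the label
    ]
    return ("\n".join(commands) + "\n").encode("ascii")


job = build_label_job("79927398713")
print(job.decode("ascii"))
# Sending is then just a raw write to wherever the printer is attached,
# for example a serial port or a raw print queue (not shown here).
```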
Other technologies and approaches would probably be more complex than simply producing and attaching barcode labels. For instance, if you were to have an inkjet printer like those that are used to mark (milk/juice) cartons - well, it would have to deal with different sizes of paper, and different weights from near-cardboard to airmail paper. It would also have a substantial footprint since the paper would need to be physically presented to the printer. Then there's all the problems of disassembling and reassembling a stapled wad. And who can control precisely where the printing would occur? What may suit one document may not suit another - it may have inconveniently-placed logos or other artwork in the "standard" position for that-sized paper.
Another issue is colour. There's no restriction on background colour with a label (yellow or fluoro pink for example) - it would be easy to locate when necessary. Contrast that with the-ink's-running-low washed-out ink printing on a grey background. White labels wouldn't stand out all that well on the majority of (white) documents.
BUT a strong alternative technology would be to have reels of labels pre-printed by a commercial printing establishment rather than producing them with a special printer on-demand. Reels are better than sheets - they are easier to use especially for people with short fingernails.

OCR for scanning printed receipts. [duplicate]

Would OCR Software be able to reliably translate an image such as the following into a list of values?
UPDATE:
In more detail the task is as follows:
We have a client application, where the user can open a report. This report contains a table of values.
But not every report looks the same - different fonts, different spacing, different colors, maybe the report contains many tables with different number of rows/columns...
The user selects an area of the report which contains a table. Using the mouse.
Now we want to convert the selected table into values - using our OCR tool.
At the time when the user selects the rectangular area I can ask for extra information
to help with the OCR process, and ask for confirmation that the values have been correctly recognised.
It will initially be an experimental project, and therefore most likely with an OpenSource OCR tool - or at least one that does not cost any money for experimental purposes.
The simple answer is YES; you just need to choose the right tools.
I don't know if open source can ever get close to 100% accuracy on those images, but based on the answers here it probably can, if you spend some time on training and solve the table analysis problem and things like that.
When we talk about commercial OCR like ABBYY or others, it will give you 99%+ accuracy out of the box and it will detect tables automatically. No training, no anything, it just works. The drawback is that you have to pay for it. Some would object that with open source you pay with your time to set it up and maintain it, but everyone decides that for themselves.
However, if we talk about commercial tools, there is actually more choice, and it depends on what you want. Boxed products like FineReader are aimed at converting input documents into editable documents such as Word or Excel. Since you want the data rather than a Word document, you may need to look into a different product category: Data Capture, which is essentially OCR plus some additional logic to find the necessary data on the page. In the case of an invoice that could be the company name, total amount, due date, line items in the table, etc.
Data Capture is a complicated subject and requires some learning, but used properly it can give guaranteed accuracy when capturing data from documents. It uses different rules for data cross-checks, database lookups, etc. When necessary it may send data for manual verification. Enterprises widely use Data Capture applications to enter millions of documents every month and rely heavily on the extracted data in their everyday workflow.
And there are also OCR SDKs, of course, which give you API access to the recognition results so that you can program what to do with the data.
If you describe your task in more detail, I can advise which direction is easier to go.
UPDATE
So what you are building is basically a Data Capture application, but not a fully automated one, using the so-called "click to index" approach. There are a number of applications like that on the market: you scan images, and the operator clicks on the text in the image (or draws a rectangle around it) and then populates fields in the database. It is a good approach when the number of images to process is relatively small and the manual workload is not big enough to justify the cost of a fully automated application (yes, there are fully automated systems that can handle images with different fonts, spacing, layouts, numbers of rows in the tables and so on).
If you have decided to develop this yourself instead of buying, then all you need here is to choose an OCR SDK. You are going to write all the UI yourself, right? The big choice is between open source and commercial.
The best open-source option is Tesseract OCR, as far as I know. It is free and may have real problems with table analysis, but with a manual zoning approach this should not be a problem. As for OCR accuracy, people often train the OCR on a font to increase accuracy, but that will not work in your case, since the fonts can differ. So you can just try Tesseract and see what accuracy you get; this will determine the amount of manual work needed to correct it.
Commercial OCR will give higher accuracy but will cost you money. I think you should take a look anyway to see whether it is worth it or whether Tesseract is good enough for you. I think the simplest way would be to download a trial version of a boxed OCR product like FineReader. That will give you a good idea of the accuracy you would get from an OCR SDK.
If you always have solid borders in your table, you can try this solution:
Locate the horizontal and vertical lines on each page (long runs of black pixels)
Segment the image into cells using the line coordinates
Clean up each cell (remove borders, threshold to black and white)
Perform OCR on each cell
Assemble results into a 2D array
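The first two steps above can be sketched in plain Python on a binarised image held as a list of rows (1 = black pixel). This is only a sketch under simplifying assumptions: the page is already deskewed, the borders span the whole table, and a "long run" is approximated as a row or column that is at least 90% black. The function names and the threshold are hypothetical.

```python
def line_positions(image, axis, min_fraction=0.9):
    """Indices of rows (axis=0) or columns (axis=1) that are mostly black."""
    height, width = len(image), len(image[0])
    if axis == 0:
        return [y for y in range(height)
                if sum(image[y]) >= min_fraction * width]
    return [x for x in range(width)
            if sum(image[y][x] for y in range(height)) >= min_fraction * height]


def cell_boxes(image):
    """Return a 2D list of (top, left, bottom, right) boxes between borders.

    bottom and right are exclusive, so a box can be used directly for slicing.
    """
    rows = line_positions(image, 0)
    cols = line_positions(image, 1)
    table = []
    for top, bottom in zip(rows, rows[1:]):
        if bottom - top < 2:  # neighbouring indices belong to one thick border
            continue
        row = []
        for left, right in zip(cols, cols[1:]):
            if right - left < 2:
                continue
            row.append((top + 1, left + 1, bottom, right))
        table.append(row)
    return table


# A tiny 7x9 "scan" with borders on rows 0, 3, 6 and columns 0, 4, 8.
image = [[1 if y in (0, 3, 6) or x in (0, 4, 8) else 0 for x in range(9)]
         for y in range(7)]
for row in cell_boxes(image):
    print(row)
```

Each box can then be cropped, cleaned and passed to the OCR engine, and the results land in the same row and column order as the boxes.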
Otherwise, if your document has a borderless table, you can try following this line of thought:
Optical Character Recognition is pretty amazing stuff, but it isn’t
always perfect. To get the best possible results, it helps to use the
cleanest input you can. In my initial experiments, I found that
performing OCR on the entire document actually worked pretty well as
long as I removed the cell borders (long horizontal and vertical
lines). However, the software compressed all whitespace into a single
empty space. Since my input documents had multiple columns with
several words in each column, the cell boundaries were getting lost.
Retaining the relationship between cells was very important, so one
possible solution was to draw a unique character, like “^” on each
cell boundary – something the OCR would still recognize and that I
could use later to split the resulting strings.
I found all this information in this link by asking Google for "OCR to table". The author published a full algorithm using Python and Tesseract, both open-source solutions!
If you want to try the Tesseract power, maybe you should try this site:
http://www.free-ocr.com/
Which OCR are you talking about?
Will you be developing code based on that OCR, or will you be using something off the shelf?
FYI:
Tesseract OCR
It includes an executable for reading whole documents, so you can feed the whole page in and it will extract the characters for you. It recognizes blank spaces pretty well, so it might be able to help with tab spacing.
I've been OCR'ing scanned documents since '98. This is a recurring problem for scanned docs, especially for those that include rotated and/or skewed pages.
Yes, there are several good commercial systems and some could provide, once well configured, terrific automatic data-mining rate, asking for the operator's help only for those very degraded fields. If I were you, I'd rely on some of them.
If commercial choices threaten your budget, OSS can lend a hand. But "there's no free lunch", so you'll have to rely on a bunch of tailor-made scripts to scaffold an affordable solution for processing your docs. Fortunately, you are not alone: over the last decades many people have been dealing with this. So, IMHO, the best and most concise answer to this question is provided by this article:
https://datascience.blog.wzb.eu/2017/02/16/data-mining-ocr-pdfs-using-pdftabextract-to-liberate-tabular-data-from-scanned-documents/
It is worth reading! The author offers useful tools of his own, but the article's conclusion is very important for giving you a good mindset about how to solve this kind of problem.
"There is no silver bullet."
(Fred Brooks, The Mythical Man-Month)
It really depends on implementation.
There are a few parameters that affect the OCR's ability to recognize:
1. How well the OCR is trained - the size and quality of the examples database
2. How well it is trained to detect "garbage" (besides knowing what's a letter, you need to know what is NOT a letter).
3. The OCR's design and type
4. If it's a neural network, the network structure affects its ability to learn and "decide".
So, if you're not making one of your own, it's just a matter of testing different kinds until you find one that fits.
You could try another approach. With Tesseract (or other OCR engines) you can get coordinates for each word. Then you can try to group those words by vertical and horizontal coordinates to get rows and columns, for example to tell the difference between an ordinary space and a tab-sized gap. It takes some practice to get good results, but it is possible. With this method you can detect tables even if they use invisible separators, with no lines. The word coordinates are a solid base for table recognition.
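A minimal sketch of that grouping, assuming the word boxes have already been obtained from the OCR engine as (text, left, top, width) tuples in pixels. The tuple layout, the function name and both tolerances are hypothetical and would need tuning for real scans.

```python
def words_to_table(words, y_tolerance=5, column_gap=30):
    """Group OCR word boxes into rows, then split each row into cells.

    words: iterable of (text, left, top, width) tuples.
    Words whose tops differ by at most y_tolerance share a row; a horizontal
    gap wider than column_gap starts a new cell (a "tab" rather than a space).
    """
    rows = []  # each entry: [reference_top, [word, ...]]
    for word in sorted(words, key=lambda w: w[2]):
        if rows and abs(word[2] - rows[-1][0]) <= y_tolerance:
            rows[-1][1].append(word)
        else:
            rows.append([word[2], [word]])

    table = []
    for _, row_words in rows:
        row_words.sort(key=lambda w: w[1])  # left to right
        cells = []
        previous_right = None
        for text, left, _top, width in row_words:
            if previous_right is not None and left - previous_right <= column_gap:
                cells[-1] += " " + text  # ordinary space: same cell
            else:
                cells.append(text)  # wide gap: next column
            previous_right = left + width
        table.append(cells)
    return table


words = [
    ("Total", 10, 100, 50), ("amount", 65, 102, 60), ("42.00", 300, 101, 45),
    ("Item", 10, 50, 40), ("Price", 300, 51, 45),
]
for row in words_to_table(words):
    print(row)
```

Fixed tolerances are the weak point here; deriving them from the median word height and median gap on the page is one way to make the sketch less brittle.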
We also have struggled with the issue of recognizing text within tables. There are two solutions which do it out of the box, ABBYY Recognition Server and ABBYY FlexiCapture. Rec Server is a server-based, high volume OCR tool designed for conversion of large volumes of documents to a searchable format. Although it is available with an API for those types of uses we recommend FlexiCapture. FlexiCapture gives low level control over extraction of data from within table formats including automatic detection of table items on a page. It is available in a full API version without a front end, or the off the shelf version that we market. Reach out to me if you want to know more.
Here are the basic steps that have worked for me. Tools needed include Tesseract, Python, OpenCV, and ImageMagick if you need to do any rotation of images to correct skew.
Use Tesseract to detect rotation and ImageMagick mogrify to fix it.
Use OpenCV to find and extract tables.
Use OpenCV to find and extract each cell from the table.
Use OpenCV to crop and clean up each cell so that there is no noise that will confuse OCR software.
Use Tesseract to OCR each cell.
Combine the extracted text of each cell into the format you need.
The code for each of these steps is extensive, but if you want to use a python package, it's as simple as the following.
pip3 install table_ocr
python3 -m table_ocr.demo https://raw.githubusercontent.com/eihli/image-table-ocr/master/resources/test_data/simple.png
That package and demo module will turn the following table into CSV output.
Cell,Format,Formula
B4,Percentage,None
C4,General,None
D4,Accounting,None
E4,Currency,"=PMT(B4/12,C4,D4)"
F4,Currency,=E4*C4
If you need to make any changes to get the code to work for table borders with different widths, there are extensive notes at https://eihli.github.io/image-table-ocr/pdf_table_extraction_and_ocr.html

Techniques for visualising change over time in graphs

I'm looking to display a graph (network diagram, not a chart) and show its changes over time. Is there a standard or best way to do this, or any kind of 'network diff' tool?
I'm looking for an overview of the general layout decisions involved, i.e. a list of options and trade-offs to be made, and best-practice guidelines where these exist.
Wow. Not an easy question! I'm curious if anyone can come up with some authoritative resources for you.
I haven't found any standard or best practice documented anywhere from a design standpoint, nor do I know of any tool specifically designed for determining and displaying the changes, but I have some ideas.
First, a few technical notes. There's GraphML, which you can use (and extend) to represent your graph in a standard format, and there are some parsers available, and it works with Prefuse and probably other display libraries. It's just XML, though - nothing too special. Creating the "diff" by comparing two GraphML files should be pretty simple.
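As a rough idea of how simple that comparison can be, here is a sketch using only Python's standard library. It assumes node ids are stable between the two files, treats edges as directed (source, target) pairs, and ignores attribute changes; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard GraphML namespace.
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def read_graph(graphml_text):
    """Return (node ids, edge pairs) found in a GraphML document."""
    root = ET.fromstring(graphml_text)
    nodes = {n.get("id") for n in root.iterfind(".//g:node", NS)}
    edges = {(e.get("source"), e.get("target"))
             for e in root.iterfind(".//g:edge", NS)}
    return nodes, edges


def diff_graphs(before_text, after_text):
    """Set-based diff of two GraphML snapshots."""
    nodes_a, edges_a = read_graph(before_text)
    nodes_b, edges_b = read_graph(after_text)
    return {
        "added_nodes": nodes_b - nodes_a,
        "removed_nodes": nodes_a - nodes_b,
        "added_edges": edges_b - edges_a,
        "removed_edges": edges_a - edges_b,
    }


TEMPLATE = ('<graphml xmlns="http://graphml.graphdrawing.org/xmlns">'
            '<graph edgedefault="directed">{body}</graph></graphml>')
before = TEMPLATE.format(body='<node id="a"/><node id="b"/><node id="c"/>'
                              '<edge source="a" target="b"/>'
                              '<edge source="b" target="c"/>')
after = TEMPLATE.format(body='<node id="a"/><node id="b"/><node id="d"/>'
                             '<edge source="a" target="b"/>'
                             '<edge source="b" target="d"/>')
print(diff_graphs(before, after))
```

The four sets map directly onto a display scheme: unchanged elements drawn in a neutral colour, added ones highlighted one way and removed ones another.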
The really interesting part is how to communicate the differences to the user.
In all cases, you should have a visual indicator for nodes and edges that are added or removed. You may use color, showing existing nodes as something neutral, say gray, new nodes as green, and removed nodes as red. There are lots of options.
You might find this slideshow interesting.
It's probably obvious, but, over time, the nodes should not move more than necessary to adapt to the new state of the graph - the layout should evolve, not start from scratch for every state. This is crucial for comparing the states.
Side-by-side before/after comparison. Present before and after snapshots of the same graph side-by-side. If your graph is very large and complicated, a side-by-side layout may be impractical. You could try overlaying one graph over the other, though that is likely to be disorienting.
Side-by-side series comparison. AKA small multiples. Same as above but showing as many points in time as is useful. Even more restrictive than before-after in terms of how much space required, and difficult for.
Animate a single graph. I think the most intuitive method is to smoothly animate the graph changes, though a choppy slideshow could work if the changes between slides are not too drastic.
Showing details. If useful, you can spell out the change event details in a few different ways.
Show labels on the graph node (could be interactive if there are too many to show at once)
Show a list in a sidebar / legend. Nice if reading the progression of changes is useful, but harder to connect to the visual.
Show a timeline instead of a list. This shows the 'real' progression of events better than a simple list, which gives the impression that all the events are evenly spaced over time.
What you actually choose to do would depend largely on the nature of your dataset and your goals. A simple graph of a few dozen nodes and a few changes is a much different challenge than a huge network, like say every constellation in the night sky!
Here is an interesting study: http://publik.tuwien.ac.at/files/PubDat_198995.pdf
This paper presents a prototype, and user tests will be published soon in:
P. Federico, W. Aigner, S. Miksch, F. Windhager, M. Smuc: "Vertigo zoom: combining relational and temporal perspectives on dynamic networks"; accepted as talk for: 11th International Working Conference on Advanced Visual Interfaces (AVI2012), Capri Island; 2012-05-21 - 2012-05-25; in: "Proceedings of the 11th International Working Conference on Advanced Visual Interfaces (AVI2012)", ACM, (2012), ISBN: 978-1-4503-1287-5.
http://ieg.ifs.tuwien.ac.at/~federico/pub.php
Your question is kind of general, and I'm not clear exactly what kinds of analysis you are aiming for. There are several network analysis packages that have some dynamics capacity. Gephi is one. The networkDynamic and ndtv R packages provide tools for representing and visualizing dynamics as animations and static layouts (disclaimer: I'm a maintainer).

Determine the differences between two nearly identical photographs

This is a fairly broad question; what tools/libraries exist to take two photographs that are not identical, but extremely similar, and identify the specific differences between them?
An example would be to take a picture of my couch on Friday after my girlfriend is done cleaning and before a long weekend of having friends over, drinking, and playing Rock Band. Two days later I take a second photo of the couch; lighting is identical, the couch hasn't moved a millimeter, and I use a tripod in a fixed location.
What tools could I use to generate a diff of the images, or a third heatmap image of the differences? Are there any tools for .NET?
This depends largely on the image format and compression. But, at the end of the day, you are probably taking two rasters and comparing them pixel by pixel.
Take a look at the Perceptual Image Difference Utility.
The most obvious way to see every tiny, normally nigh-imperceptible difference, would be to XOR the pixel data. If the lighting is even slightly different, though, it might be too much. Differencing (subtracting) the pixel data might be more what you're looking for, depending on how subtle the differences are.
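To see why XOR can be "too much", here is a small sketch on 8-bit grayscale values held as nested lists. The function names and the threshold are hypothetical; a real implementation would read the pixels from an image library.

```python
def xor_diff(a, b):
    """Bitwise XOR of two equally sized grayscale images."""
    return [[pa ^ pb for pa, pb in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def abs_diff(a, b, threshold=0):
    """Absolute difference, with changes up to threshold treated as noise."""
    result = []
    for row_a, row_b in zip(a, b):
        result.append([abs(pa - pb) if abs(pa - pb) > threshold else 0
                       for pa, pb in zip(row_a, row_b)])
    return result


friday = [[100, 127], [200, 50]]
sunday = [[100, 128], [190, 50]]

print(xor_diff(friday, sunday))     # [[0, 255], [118, 0]]
print(abs_diff(friday, sunday, 5))  # [[0, 0], [10, 0]]
```

The pixel that moved from 127 to 128 is a one-level brightness change, yet XOR reports the maximum value because every bit flipped. Subtraction with a small threshold ignores it and keeps the genuine change of 10 levels.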
One place to start is with a rich image processing library such as IM. You can dabble with its operators interactively with the IMlab tool, call it directly from C or C++, or use its really decent Lua binding to drive it from Lua. It supports a wide array of operations on bitmaps, as well as an extensible library of file formats.
Even if you haven't deliberately moved anything, you might want to use an algorithm such as SIFT to get good sub-pixel quality alignment between the frames. Unless you want to treat the camera as fixed and detect motion of the couch as well.
I wrote this free .NET application using the toolkit my company makes (DotImage). It has a very simple algorithm, but the code is open source if you want to play with it -- you could adapt the algorithm to .NET Image classes if you don't want to buy a copy of DotImage.
http://www.atalasoft.com/cs/blogs/31appsin31days/archive/2008/05/13/image-difference-utility.aspx
Check out Andrew Kirillov's article on CodeProject. He wrote a C# application using the AForge.NET computer vision library to detect motion. On the AForge.NET website, there's a discussion of two frame differences for motion detection.
It's an interesting question. I can't refer you to any specific libraries, but the process you're asking about is basically a minimal case of motion compensation. This is the way that MPEG (MP4, DIVX, whatever) video manages to compress video so extremely well; you might look into MPEG for some information about the way those motion compensation algorithms are implemented.
One other thing to keep in mind: JPEG compression is block-based, and much of the benefit MPEG brings comes from doing block comparisons. If most of your image (say the background) is the same from one image to the next, those blocks will be unchanged. It's a quick way to reduce the amount of data that needs to be compared.
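A minimal sketch of that block-wise idea on grayscale values held as nested lists. It compares decoded pixels rather than the compressed JPEG blocks themselves, and the function name, block size and threshold are hypothetical.

```python
def changed_blocks(a, b, block=8, threshold=0):
    """Return the (top, left) corner of every block that differs.

    A block counts as changed when the summed absolute pixel difference
    inside it exceeds threshold.
    """
    height, width = len(a), len(a[0])
    changed = []
    for top in range(0, height, block):
        for left in range(0, width, block):
            total = sum(
                abs(a[y][x] - b[y][x])
                for y in range(top, min(top + block, height))
                for x in range(left, min(left + block, width))
            )
            if total > threshold:
                changed.append((top, left))
    return changed


before = [[10] * 4 for _ in range(4)]
after = [row[:] for row in before]
after[3][2] = 90  # something changed in the bottom-right corner

print(changed_blocks(before, after, block=2))  # [(2, 2)]
```

Only the blocks returned here need a detailed pixel-level comparison; everything else can be skipped.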
Just use the .NET imaging classes: create two new Bitmap objects and look at the R, G and B values of each pixel. You can also look at the A (alpha/transparency) values if you want to when determining the difference.
Also note that using the GetPixel(x, y) method can be very slow. There is another (less elegant) way to get the entire image and loop through it yourself; if I remember correctly it was called getBitmap or something similar. Look in the imaging/bitmap classes and read some tutorials. They really are all you need and aren't that difficult to use; don't go third party unless you have to.

Resources