I am writing a complex application (a compiler analysis). To debug it I need to examine the application's execution trace to determine how its values and data structures evolve during a run. It is quite common for me to generate megabytes of text output for a single run, and sifting through all of that is very labor-intensive. To help manage these logs I've written my own library that formats them as HTML and makes it easy to color text from different code regions and indent output from called functions. An example of the output is here.
My question is: is there any better solution than my own home-spun library? I need some way to emit debug logs that may include arbitrary text and images, structure them visually, and, if possible, index them so that I can easily find the region of the output I'm most interested in. Is there anything like this out there?
Although you didn't mention which language you're using, I'd like to propose the Apache Log4XXX family: http://logging.apache.org/
It offers customizable detail levels as well as tag-driven loggers. The GUI tool (Chainsaw) can be combined with the good old grep approach, so you see only what you're interested in at the moment.
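If your application happens to be in Java, the basic pattern is one named logger per class or code region, with the configuration file deciding which loggers are enabled and at what level. A minimal log4j 1.x sketch, where the class and messages are just placeholders:

    import org.apache.log4j.Logger;

    // Hypothetical class standing in for one "code region" of the analysis.
    public class PassRunner {

        // One named logger per class/region; log4j.properties (or log4j.xml)
        // controls which loggers are active and at what level, so you can
        // turn whole regions of output on or off without touching the code.
        private static final Logger log = Logger.getLogger(PassRunner.class);

        void runPass(String name) {
            log.debug("starting pass: " + name);
            // ... analysis work ...
            log.info("finished pass: " + name);
        }
    }

You can then point Chainsaw or plain grep at the resulting log file to filter down to the regions you care about.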
Colorizing, search and filtering using an expression syntax is available in the latest developer snapshot of Chainsaw. The expression syntax also supports regular expressions (using the 'like' keyword).
Chainsaw can parse any regular text log file, not just log files generated by log4j.
The latest developer snapshot of Chainsaw is available here:
http://people.apache.org/~sdeboy
The File > Load Chainsaw configuration menu item is where you define the 'format' and location of the log file you want to process, and the expression syntax can be found in the tutorial, available from the Help menu.
Feel free to email the log4j users list if you have additional questions.
I created a framework that might help you, https://github.com/pablito900/VisualLogs
I am new to this site and I would like to get some input regarding parsing STDF files. Generally speaking, I am trying to parse an STDF file to gather only the results (numbers) and not the rest of the line. If I am able to achieve this, I would then like to compare all the numbers to each other, via a bubble sort or insertion sort, and see if any of them are equal. I am capable of doing this in C/C++ and Java, but I have no experience parsing documents using scripts.
Could anyone push me in the right direction? What should I be reading to learn my way around this?
Are you already using an STDF library?
You did not mention one, so I assume not.
You should find a library you are comfortable with (the list changes over time, but you can find some by Googling or looking at the STDF page on Wikipedia) rather than attempting to parse STDF yourself, unless you have a good reason to reinvent the STDF-parsing wheel.
An STDF file contains many tests. It generally does not make sense to compare the results for different tests, so I assume you are looking for matching values within the set of results for each test.
I would use your chosen STDF parser to read the value of each test for each part. Keep a set of the results for each test. As you read each new result, check the set to see if it already exists. If it does, you have found the case you were looking for; otherwise add the result to the set.
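To illustrate, here's a sketch of that set-based check in Java, assuming your chosen STDF library has already given you the parsed results grouped by test name (the input shape here is an assumption for illustration, not part of any particular STDF API):

    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;

    public class DuplicateResults {

        // For each test, return the values that occur more than once.
        // resultsByTest maps a test name to the list of measured results
        // your STDF parser extracted for that test, one entry per part.
        public static Map<String, Set<Double>> findDuplicates(Map<String, List<Double>> resultsByTest) {
            Map<String, Set<Double>> duplicates = new HashMap<>();
            for (Map.Entry<String, List<Double>> entry : resultsByTest.entrySet()) {
                Set<Double> seen = new HashSet<>();
                for (Double value : entry.getValue()) {
                    // Set.add() returns false if the value was already there,
                    // which is exactly the "already exists" check described above.
                    if (!seen.add(value)) {
                        duplicates.computeIfAbsent(entry.getKey(), k -> new HashSet<>()).add(value);
                    }
                }
            }
            return duplicates;
        }
    }

This also avoids the pairwise comparison a bubble or insertion sort approach implies; the hash set gives you the equality check in roughly constant time per result.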
There must exist some application to do the following, but I am not even sure how to google for it.
The dilemma is that we have to trace defects back to their source, and doing so requires seeing how certain fields in the output XML were generated by the XSL. The hard part is spending hours in the XSL and XML trying to figure out where a field was even generated. Even debugging is difficult if you are working with multiple XSL transformations and edits, as you still need to find out which primary keys come into play in the specific scenario for that transform.
Is there some software program that could take an XSL and perhaps do one of two things:
1. Feed it an output field name and it would generate a list of all the possible criteria that would generate this field, so you can figure out which one of a dozen in the XSL meets your criteria, or
2. Somehow convert the XSL into some more readable if/then type format (kind of like how you can use Javadoc to produce readable documentation).
You don't say what tools you are currently using. Tools like oXygen and Stylus Studio have some quite sophisticated XSLT debugging capability. oXygen's output mapping tool (see http://www.oxygenxml.com/xml_editor/working_with_xslt_debugger.html#xsltOutputMapping) sounds very much like the thing you are asking for.
Using schema-aware stylesheets can greatly ease debugging. At least in the Saxon implementation, if you declare in your stylesheet that you want the output to be valid against a particular schema, then if it isn't, Saxon will tell you what instruction in the stylesheet caused invalid output to be generated. Sometimes it will show you the error at stylesheet compile time, before you even supply a source document. This capability is greatly under-used, in my view. More details here: http://www.stylusstudio.com/schema_aware.html
It's an interesting question. Your suggestions are also interesting but would be quite challenging to develop; I know of no COTS or FOSS solution to either, but here are some thoughts:
Your first possibility is essentially data-flow analysis from compiler design. I know of no tools that expose this to the user, but you might ask XSLT processor developers if they have ever considered externalizing such an analysis in a manner that would be useful to XSLT developers.
Your second possibility is essentially a documentation generator against XSLT source. I have actually helped to complete one for a client in financial services in the past (see Document XSLT Automatically), but the solution was the property of the client and was never released publicly as far as I know. It would be possible to recreate such a meta-transformation between XSLT input and HTML or Docbook output, but it's not simple to do in the most general case.
There's another approach that you might consider:
Tighten up your interface definition. In your comment, you mention uncertainty as to whether a problem's source is bad data from the sender or a bug in the XSLT. You would be well served by a stricter interface definition. You could implement this via better typing in XSD, the addition of assertion (xs:assert) statements if XSD 1.1 is an option, or an added Schematron-based interface-checking layer, which would give you the full power of XPath-based assertions over the input. Having such an improved and more specific interface definition would help both you and your clients know what should and should not be sent into your systems.
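For example, one cheap way to add such an interface-checking step in front of the transform is to validate each incoming document against the interface XSD before running the XSLT, so bad sender data is rejected explicitly instead of surfacing as a mysterious output field. A minimal sketch using the JDK's built-in javax.xml.validation API (the file names are placeholders):

    import java.io.File;
    import javax.xml.XMLConstants;
    import javax.xml.transform.stream.StreamSource;
    import javax.xml.validation.Schema;
    import javax.xml.validation.SchemaFactory;
    import javax.xml.validation.Validator;

    public class InputGate {
        public static void main(String[] args) throws Exception {
            // Load the interface schema (placeholder file name).
            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new File("interface.xsd"));

            // Validate the incoming document; validate() throws SAXException
            // with a line/column message if the input violates the schema.
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new File("incoming.xml")));
            System.out.println("incoming.xml conforms to interface.xsd");
        }
    }

Note that the JDK's default validator covers XSD 1.0; for xs:assert you would need an XSD 1.1-capable processor such as a recent Xerces or Saxon-EE.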
I am using AsciiDoc for an article. At the end of my document I want to have a list of figures. How do I create a list of figures? I did not find anything useful in the documentation.
Nope, there isn't one at the time of this answer. I checked the docs (which you indicated you did as well) and I also grepped the codebase. There is good news though! You should be able to do this with an extension.
Extensions can be written in any JVM language if you're using asciidoctorj, or in Ruby if you're using the core asciidoctor (I'm not sure about JavaScript for asciidoctorjs). You'll probably need to create two extensions: a TreeProcessor extension that goes through the whole AST looking for images and pulls them out into a storage structure, and then an inline or block macro to actually place the list within the page (a rough sketch of the first half is below).
I strongly recommend examining the API for the nodes and functions you'll want to make use of. There are some other examples of processors that may also be helpful to examine.
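To give a feel for the first half, here's a rough sketch of such a TreeProcessor in Java with asciidoctorj; the class and method names (Treeprocessor, getContext, getBlocks) reflect my recollection of the asciidoctorj 2.x AST API and should be double-checked against the current docs, and the static list is only the simplest possible "storage structure" for illustration:

    import java.util.ArrayList;
    import java.util.List;

    import org.asciidoctor.ast.Document;
    import org.asciidoctor.ast.StructuralNode;
    import org.asciidoctor.extension.Treeprocessor;

    // Walks the whole AST and records every image block it finds, so a
    // separate block macro can later render them as a list of figures.
    public class ImageCollector extends Treeprocessor {

        public static final List<StructuralNode> IMAGES = new ArrayList<>();

        @Override
        public Document process(Document document) {
            collect(document);
            return document;
        }

        private void collect(StructuralNode node) {
            if ("image".equals(node.getContext())) {
                IMAGES.add(node);
            }
            if (node.getBlocks() != null) {
                for (StructuralNode child : node.getBlocks()) {
                    collect(child);
                }
            }
        }
    }

You would register this (plus the companion macro that actually emits the list) through the extension registry before converting the document.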
Sadly, a project that I have been working on lately has a large amount of copy-and-paste code, even within single files. Are there any tools or techniques that can detect duplication or near-duplication within a single file? I have Beyond Compare 3 and it works well for comparing separate files, but I am at a loss for comparing single files.
Thanks in advance.
Edit:
Thanks for all the great tools! I'll definitely check them out.
This project is an ASP.NET/C# project, but I work with a variety of languages including Java; I'm interested in what tools are best (for any language) to remove duplication.
Check out Atomiq. It finds duplicate code that is a prime candidate for extracting to one location.
http://www.getatomiq.com/
If you're using Eclipse, you can use the copy paste detector (CPD) https://olex.openlogic.com/packages/cpd.
You don't say what language you are using, which is going to affect what tools you can use.
For Python there is CloneDigger. It also supports Java, but I have not tried that. It can find code duplication both within a single file and between files, and gives you the result as a diff-like report in HTML.
See SD CloneDR, a tool for detecting copy-paste-edit code within and across multiple files. It detects exact copies, copies that have been reformatted, and near-miss copies with different identifiers, literals, and even different sequences of statements.
CloneDR handles many languages, including Java (1.4, 1.5, 1.6) and C#, up to C# 4.0. You can see sample clone detection reports at the website, including one for C#.
ReSharper does this automagically: it suggests when it thinks code should be extracted into a method, and will do the extraction for you.
Check out PMD; once you have configured it (which is fairly simple), you can run its copy/paste detector (CPD) to find duplicate code.
Someone with basic Office skills can do the following sequence in a minute:
use an ordinary formatter to unify the code style, preferably without line wrapping
feed the code text into Microsoft Excel as a single column
search and replace all double spaces with single ones, and do any other normalizing replacements
sort the column
At this point duplicated lines will already stand out. To go further:
add a comparison formula to the 2nd column and a counter to the 3rd
copy and paste as values again, sort, and see the most repeated lines
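If you would rather script it than use a spreadsheet, the same normalize/sort/count idea fits in a few lines; here's a rough Java sketch (the whitespace normalization is just an example, adjust it to your formatter's output):

    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.LinkedHashMap;
    import java.util.Map;

    // Reads a source file, normalizes whitespace, counts identical lines,
    // and prints the most repeated ones first.
    public class LineDuplicates {
        public static void main(String[] args) throws Exception {
            Map<String, Integer> counts = new LinkedHashMap<>();
            for (String line : Files.readAllLines(Paths.get(args[0]))) {
                String normalized = line.trim().replaceAll("\\s+", " ");
                if (normalized.isEmpty()) continue;
                counts.merge(normalized, 1, Integer::sum);
            }
            counts.entrySet().stream()
                  .filter(e -> e.getValue() > 1)
                  .sorted((a, b) -> b.getValue() - a.getValue())
                  .forEach(e -> System.out.println(e.getValue() + "x  " + e.getKey()));
        }
    }

Like the spreadsheet trick, this only catches exact line-level duplicates; the dedicated tools mentioned in the other answers detect multi-line and near-miss clones.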
There is an analysis tool, called Simian, which I haven't yet tried. Supposedly it can be run on any kind of text and point out duplicated items. It can be used via a command line interface.
Another option similar to those above, but with a different tool chain: https://www.npmjs.com/package/jscpd
I would like to present data that gives an overview but allows the user to drill down in an inline fashion - so if you had a grouping of, say, 6 objects, the user could expand it and the 6 objects would be shown immediately below it, before any more high-level data.
It would appear that MSHFlexGrid gives this ability, but I can't find any information about actually using it, or what its limitations are (can you have differing numbers of fields and/or can they have different spacing, what about column headers, indentation at the start, etc.).
I found this site, but the images are broken (in IE8 and FF3.5). Google searches show people just using the flat data representation, but nothing using the hierarchical properties. Does anyone know any good tutorials or forums with a good discussion about pitfalls?
Due to the lack of information about using it, I am thinking of coding my own version, but if anyone has done work in this area I haven't found it - I would have thought it would be a natural wish for data representation. If someone has coded a version of this (any language) then I wouldn't mind reading about it - maybe my idea of how to do it wouldn't be the best way.
You might want to check out vbAccelerator. He has a Multi-Column Treeview control that sounds like what you may be looking for. He gives you the source and has some pretty decent samples.
The MSHFlexGrid reference pages and the "using the MSHFlexGrid" topic in the Visual Basic manual?
Sorry if you've already looked at these!