Are logs included when creating the dot file in GStreamer? - debugging

I am trying to understand how the dot file is created from GStreamer logs.
When I generated the GStreamer logs with GST_DEBUG=4, it produced a huge number of log lines.
At the same time, when I check the dot file generated by GStreamer, it only contains information about pipeline creation, not the log information from after the pipeline is created (playing, paused, seeking, ...).
I have some questions:
What information does the dot file contain compared to the complete log file?
If not all of the logs are included in the dot file, how can we debug that log information using the dot graph (with tools like Graphviz)?

The dot file is a graphical representation of your complete pipeline: the interconnection of the different elements in the pipeline, along with information about the caps negotiation. For example, when your pipeline grows too large and you need information about how the elements are connected and how data flows between them, dot files prove useful.
With GST_DEBUG=4, all the logs, warnings, and errors of the different elements will be output. This is particularly useful when you want to understand, at a lower level, what is going on inside the elements as data flows along the pipeline. You get information about different events, pad information, buffer information, etc.
To get more information about a specific element you could also use the following:
GST_DEBUG=<element_name>:4
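If you want to dump a dot file yourself at a specific moment (for example, right after a state change), the debug helper can also be called from application code. Here is a minimal Python sketch; the pipeline description is just an example, and GST_DEBUG_DUMP_DOT_DIR must point at a writable directory before the process starts, or no file is written:

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# Example pipeline; replace with your own.
pipeline = Gst.parse_launch("videotestsrc ! autovideosink")
pipeline.set_state(Gst.State.PLAYING)

# Writes <GST_DEBUG_DUMP_DOT_DIR>/snapshot.dot containing the current
# topology and negotiated caps. Convert it with, e.g.:
#   dot -Tpng snapshot.dot -o snapshot.png
Gst.debug_bin_to_dot_file(pipeline, Gst.DebugGraphDetails.ALL, "snapshot")

pipeline.set_state(Gst.State.NULL)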

Related

How to create a partially modifiable binary file format?

I'm creating my custom binary file extension.
I use the RIFF standard for encoding data, and it seems to work pretty well.
But there are some additional requirements:
Binary files can be large, up to 500 MB.
Data must be saved into the binary file in real time, at intervals, whenever the data in the application changes.
The application may run in the browser.
The problem I face is that when I want to save data, the application has to serialize everything from memory and rewrite the whole binary file.
This isn't a problem while the data is small, but as it grows, the real-time saving feature does not scale.
So the main requirements for this binary file format are:
Able to partially read the binary file (because the file is huge).
Able to partially write changed data into the file without rewriting the whole file.
A streaming protocol like .m3u8 is not an option; we can't split the file into chunks and point to them with separate URLs.
Any guidance on how to design a binary file format that scales in this scenario?
There was an answer from a user here that has since been deleted.
It seemed great to me.
You can claim your answer back and I'll delete this one.
They said:
If we design the file format to support appending, then we are able to add whatever data we want without needing to rewrite the whole file.
This idea gives me a very good starting point.
I can append more and more changes at the end of the file,
then mark old chunks of data in the middle of the file as obsolete.
I can reuse these obsolete data slots later if I want to.
The downside is that I need to clean up the obsolete slots when I have a chance to rewrite the whole file.
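To make the deleted answer's idea concrete, here is a minimal Python sketch of an append-only record store with tombstones. The record layout (a one-byte live/obsolete flag plus a 4-byte length) is an assumption for illustration, not part of RIFF or any standard:

import os
import struct

# Hypothetical record layout: a 1-byte status flag (1 = live,
# 0 = obsolete) followed by a 4-byte little-endian payload length,
# then the payload bytes themselves.
HEADER = struct.Struct("<BI")

def append_record(f, payload):
    """Append a live record and return its file offset."""
    f.seek(0, os.SEEK_END)
    offset = f.tell()
    f.write(HEADER.pack(1, len(payload)))
    f.write(payload)
    return offset

def obsolete_record(f, offset):
    """Mark the record at `offset` obsolete by flipping its status
    byte in place; no other bytes are rewritten."""
    f.seek(offset)
    f.write(b"\x00")

# Updating a value = append the new version, then tombstone the old
# one. Only a handful of bytes are written, however big the file is.
with open("data.bin", "w+b") as f:
    old = append_record(f, b"first version of chunk A")
    append_record(f, b"second version of chunk A")
    obsolete_record(f, old)

Reclaiming the obsolete slots then becomes an occasional compaction pass that copies only live records into a fresh file.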

Nifi: how to avoid copying file that are partially written

I am trying to use NiFi to get a file from an SFTP server. Potentially the file can be big, so my question is how to avoid fetching the file while it is still being written. I am planning to use ListSFTP+FetchSFTP, but I am also okay with GetSFTP if it can avoid copying partially written files.
Thank you.
In addition to Andy's solid answer, you can also be a bit more flexible with the ListSFTP/FetchSFTP processor pair by doing some metadata-based routing.
After ListSFTP, each flowfile will have attributes such as 'file.lastModifiedTime'. You can read about them here: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.3.0/org.apache.nifi.processors.standard.ListSFTP/index.html
You can put a RouteOnAttribute processor between the List and the Fetch to detect objects that, at least based on the reported last-modified time, are 'too new'. You could route those to a processor that is just a slow pass-through, to intentionally wait a bit, and then run them back through the first router until they are 'old enough'. This is admittedly a power-user approach, but it gives you a lot of flexibility and control. It is not foolproof: the source system may not report the last-modified time correctly, and a stable timestamp does not guarantee the source file is done being written. But it gives you additional options if you cannot do the definitively correct thing that Andy talks about.
If you have control over the process that writes the file, a common pattern to solve this is to initially write the file with a specific naming structure, such as a name beginning with a dot ('.'). After the write operation succeeds, the file is renamed without the leading dot and is then picked up by the processor. Both GetSFTP and ListSFTP have a processor property called Ignore Dotted Files, which is set to true by default and means those processors will not operate on or return files beginning with the dot character.
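For the producer side, here is a minimal Python sketch of that write-then-rename pattern (the directory and file names are just illustrative):

import os

def publish(directory, name, data):
    # Write to a dotted temporary name that ListSFTP/GetSFTP skip
    # while Ignore Dotted Files is enabled...
    tmp_path = os.path.join(directory, "." + name)
    final_path = os.path.join(directory, name)
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # ensure the bytes are on disk
    # ...then rename in one step, so consumers only ever see a
    # completely written file.
    os.rename(tmp_path, final_path)

publish("/data/outgoing", "report.csv", b"col1,col2\n1,2\n")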
There is also a Minimum File Age property you can use. The last-modified time gets updated as the file is being written, so setting this property to something other than 0 will help fix the problem.

Apache Nifi GetTwitter

I have a simple question, as I am new to NiFi.
I have a GetTwitter processor set up and configured (correctly, I assume). I have the Twitter Endpoint set to Sample Endpoint. I run the processor and it runs, but nothing happens; I get no input/output.
How do I troubleshoot what it is doing (or in this case not doing)?
A couple things you might look at:
What activity does the processor show? You can look at the metrics to see if anything has been attempted (Tasks/Time) as well as whether it succeeded (Out).
Stop the downstream processor temporarily to make any output FlowFiles visible in the connection queue.
Are there errors? Typically these appear in the top-left corner as a yellow icon
Are there related messages in the logs/nifi-app.log file?
It might also help us help you if you describe the GetTwitter Property settings a bit more. Can you share a screenshot (minus keys)?
In my case it was because there were two sensitive values set. According to the documentation, when a sensitive value is set, the nifi.properties file's nifi.sensitive.props.key value must also be set; it is an empty string by default in the Hortonworks Data Platform distribution. I set this to some random string (literally random_STRING, but you can use anything), re-created my process from the template, and it began working.
In general, I suppose this topic can be debugged by setting the log level to DEBUG.
However, in my case the issue was resolved more easily:
I had just set up a new cluster and decided to copy all the Twitter keys and secrets to Notepad first.
It turned out that, despite carefully copying the keys from Twitter, one of them had a leading tab. When pasting directly into the GetTwitter processor this would not show, but fortunately it showed up in Notepad, and I was able to remove it and make this work.
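A quick way to catch that kind of invisible character before pasting credentials anywhere is to print their repr in Python (the values below are made up):

# A leading tab is invisible in most input fields but shows up
# clearly in repr() output as '\t'.
credentials = {
    "consumer_key": "\tABCDEF123456",  # hypothetical value with a tab
    "consumer_secret": "XYZ789",       # hypothetical clean value
}

for name, value in credentials.items():
    if value != value.strip():
        print(f"{name} contains hidden whitespace: {value!r}")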

Bosun adding external collectors

What is the procedure for defining new external collectors in Bosun using scollector?
Can we write Python or shell scripts to collect data?
The documentation around this is not quite up to date. You can do it as described in http://godoc.org/bosun.org/cmd/scollector#hdr-External_Collectors, but we also support JSON output, which is better.
Either way, you write something and put it in the external collectors directory, under a frequency directory, as an executable script or binary. Something like:
<external_collectors_dir>/<freq_sec>/foo.sh
If the frequency directory is 0, then the script is expected to run continuously, and you put a sleep inside the code (this is my preferred method for external collectors). The script outputs the telnet format, or the undocumented JSON format, to stdout. scollector picks it up and queues that information for sending.
I created an issue to get this documented not long ago: https://github.com/bosun-monitor/bosun/issues/1225. Until one of us gets around to that, here is the PR that added JSON: https://github.com/bosun-monitor/bosun/commit/fced1642fd260bf6afa8cba169d84c60f2e23e92
Adding to what Kyle said, you can take a look at some existing external collectors to see what they output. One of our colleagues wrote one in Java to monitor JVM stuff. It uses the text format, which is simply:
metricname timestamp value tag1=foo tag2=bar
If you want to use the JSON format, here is an example from one of our collectors:
{"metric":"exceptional.exceptions.count","timestamp":1438788720,"value":0,"tags":{"application":"AdServer","machine":"ny-web03","source":"NY_Status"}}
And you can also send metadata:
{"Metric":"exceptional.exceptions.count","Name":"rate","Value":"counter"}
{"Metric":"exceptional.exceptions.count","Name":"unit","Value":"errors"}
{"Metric":"exceptional.exceptions.count","Name":"desc","Value":"The number of exceptions thrown per second by applications and machines. Data is queried from multiple sources. See status instances for details on exceptions."}`
Or send error messages to stderr:
2015/08/05 15:32:00 lookup OR-SQL03: no such host
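To make that concrete, here is a minimal continuously running external collector sketch in Python that emits the JSON line format shown above (the metric name and tag are made up; it would live at something like <external_collectors_dir>/0/heartbeat.py and be marked executable):

#!/usr/bin/env python
import json
import sys
import time

# Emit one datapoint per minute in the JSON line format that
# scollector reads from stdout.
while True:
    datapoint = {
        "metric": "example.heartbeat",   # hypothetical metric name
        "timestamp": int(time.time()),
        "value": 1,
        "tags": {"host": "my-host"},     # hypothetical tag
    }
    print(json.dumps(datapoint))
    sys.stdout.flush()  # flush so scollector sees each line promptly
    time.sleep(60)      # the 0 freq dir means we manage our own sleep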

JMeter - saving results + configuring "graph results" time span

I am using JMeter and have 2 questions (I have read the FAQ + Wiki etc):
I use the Graph Results listener. It seems to have a fixed time span, e.g. 2 hours (just guessing; this is not indicated anywhere, AFAIK), after which it wraps around and starts drawing on the same canvas from the left again. Hence, after a long weekend run it only shows the results of the last 2 hours. Can I configure that span or other properties (beyond the check boxes I see on the Graph Results listener itself)?
Can I save the results of a run and open them later? I know I can save the test plan or parts of it. I am unclear on whether I can separately save just the test result data and later open it to perform comparisons, etc. Furthermore, can I open the data with listeners that weren't part of the original test? (I think of the test as accumulating data; later on I want to view and interpret the data using different "viewers".)
Thanks,
-- Shaul
I don't know about 1. Regarding 2: listeners typically have a configuration field, "Write All Data to a File", which lets you specify the file name. You can use the Simple Data Writer to store results efficiently for later analysis.
You can load results from a previous test into a visualizer by choosing "Write All Data to a File" and browsing for the file you wish to load. Somewhat counterintuitively, selecting a file for writing also loads that file into the visualizer and displays the results. Just make sure you don't run the test again while that file is selected; otherwise you will lose your saved test data. :-)
Well, I later found a JMeter group that was discussing the issue raised in my first question, and B.Ramann gave me an excellent suggestion to use a better graph instead.
-- Shaul
