Seaborn boxplot and Violinplot median is different - seaborn

I am comparing a boxplot and a violin plot in seaborn, and the inner box of the violin plot differs from the boxplot itself. I would like to use a violin plot so that I can visualize the distribution of the data at the same time. Is it possible that the violin plot has a different median because of the kernel density estimation? The boxplots show the correct median of the data.
EDIT:
When I set inner='quartiles', the violin plots show the appropriate scale. I would still like to use inner='box' for appearance, if I can.
Here are the graphs themselves:
and now the boxplot:

Related

What is the right scale for highly divergent values in d3 data visualization?

I have two maps that plot values against counties in the United States.
The values being plotted on both are highly divergent.
On the first, around 2.5k of 3.3k counties have values of 0, while the remaining counties have values ranging up to 257,519,000.
This map visualizes just fine using scaleLog, like so:
const color = d3
  .scaleLog()
  .domain([d3.max(data), d3.min(data)]) // domain expects an array
  .range(["black", "purple"]);
The next map has values of 0 for about 3.2k of 3.3k counties, with the remaining c. 100 counties having values that range up to 881,587,418. The values are substantially more divergent.
Using a logarithmic scale to assign color does not work on this second map. All values are black.
What would be the best scale to use here? Or is there another technique for plotting mostly empty, highly divergent values in D3?

multiple ROC curve in R with a matrix of prediction values and labels

I want to plot multiple ROC curves with a matrix of predictions and labels. I have more than 100 samples, with a matrix of predictions and labels for each sample. The samples have different lengths. How could I design a single matrix for all the samples and get multiple ROC curves in a single plot? I would appreciate any suggestions. Thanks

Interpretation of Horizontal and Vertical Summations of an Image

I have a binary image which has some text in different parts of the image, like at the bottom, top, center, right middle center, etc.
Original Image
The areas I would like to focus on are the manually drawn regions shown in red.
I calculated the horizontal and vertical summations of the image and plotted them:
plot(sum(edgedImage1,1))
plot(sum(edgedImage1,2))
Can somebody explain what these plots are telling me about the original image, with regard to the structure I explained above?
Moreover, how could these plots help me extract the regions I manually drew in red?
There's nothing sophisticated about the sum operation. Simply put, sum(edgedImage1,1) computes the sum of all rows for each column in the image and that is what you are plotting. Effectively, you are computing the sum of all non-zero values (i.e. white pixels) over all rows for each column. The horizontal axis in the plot denotes which column's sum you are observing. Similarly, sum(edgedImage1,2) computes the sum of all columns for each row of the image and that is what you are plotting.
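As a rough illustration of the two sum calls, here is a minimal Python sketch (plain lists rather than MATLAB) on a made-up 4x5 binary image. The names image, column_sums and row_sums are hypothetical and only mirror what the two MATLAB sum calls compute.

```python
# Hypothetical 4x5 binary image: 1 = white (edge) pixel, 0 = black pixel.
image = [
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
]


def column_sums(img):
    """Sum over all rows for each column (the role of sum(img, 1) in MATLAB)."""
    return [sum(col) for col in zip(*img)]


def row_sums(img):
    """Sum over all columns for each row (the role of sum(img, 2) in MATLAB)."""
    return [sum(row) for row in img]


print(column_sums(image))  # one value per column
print(row_sums(image))     # one value per row
```

The blank first and last rows give a row sum of zero, which is the property the rest of this answer relies on.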
Because your text is displayed in a horizontal fashion, sum(edgedImage1,1) won't be particularly useful. What is very useful is the sum(edgedImage1,2) operation. For lines in your image that are blank, the horizontal sum of columns for each row of your image should be a very small value, whereas for lines in your image that contain text or strokes, the sum should be quite large. A good example of what I'm talking about is seen towards the bottom of your image. If you look between rows 600 and 700, you see a huge spike in your plot, as there is a lot of text between those rows.
Using this result, a crude way to determine which areas in your image contain text or strokes would be to find all rows whose sum surpasses some threshold. Combined with finding modes or peaks in the result of the sum operation that was just performed, you can very easily localize and separate out each region of text.
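The thresholding idea can be sketched as follows, in Python rather than MATLAB. The row profile and the threshold value are made up for illustration, and text_bands is a hypothetical helper name.

```python
def text_bands(profile, threshold):
    """Return (start, end) index pairs, end exclusive, of runs above threshold."""
    bands = []
    start = None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i                      # a band of text begins here
        elif value <= threshold and start is not None:
            bands.append((start, i))       # the band ended just before row i
            start = None
    if start is not None:                  # profile ends while still inside a band
        bands.append((start, len(profile)))
    return bands


# Made-up row sums: three bands of "text" separated by nearly blank rows.
row_profile = [0, 1, 9, 12, 8, 0, 0, 7, 11, 1, 0, 6, 6]
print(text_bands(row_profile, threshold=2))
```

Each returned pair can then be used to crop the corresponding rows out of the image.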
You may want to smooth the curve provided by sum(edgedImage1,2) if you want to determine how many blobs of text there are. Once you smooth out this signal, you will clearly see that there are 5 modes corresponding to 5 lines of text.
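A minimal sketch of the smoothing step, assuming a simple moving average as the smoother (the answer does not say which filter to use) and a naive local-maximum count. The data, window size and min_height are invented for illustration.

```python
def moving_average(profile, window):
    """Centred moving average; the window shrinks at the borders."""
    half = window // 2
    smoothed = []
    for i in range(len(profile)):
        lo = max(0, i - half)
        hi = min(len(profile), i + half + 1)
        smoothed.append(sum(profile[lo:hi]) / (hi - lo))
    return smoothed


def count_modes(profile, min_height):
    """Count interior samples that rise from the left and do not rise to the right."""
    count = 0
    for i in range(1, len(profile) - 1):
        rises = profile[i - 1] < profile[i]
        stops_rising = profile[i] >= profile[i + 1]
        if rises and stops_rising and profile[i] >= min_height:
            count += 1
    return count


# Made-up row sums: two lines of text, each with a dip in the middle.
row_profile = [0, 0, 4, 8, 6, 9, 4, 0, 0, 0, 3, 7, 5, 8, 3, 0, 0]
smoothed = moving_average(row_profile, window=3)

print(count_modes(row_profile, min_height=2))  # counts the jitter as extra peaks
print(count_modes(smoothed, min_height=2))     # one peak per line of text
```

On the raw profile every small dip creates an extra peak, while the smoothed profile leaves one mode per line, which is the effect described above.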
The second plot shows the sum of each row. This can tell you which rows contain a lot of information and which contain none.
You can use this plot to find the rectangles by looking for a sharp incline in the value for the start of a rectangle and a sharp decline in the value for the end of the rectangle. Before you do this, I would low-pass filter the data, then take the derivative and look for large values of it.
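One way this could look, as a Python sketch under stated assumptions: the low-pass filter is a 3-sample moving average, the derivative is a first difference, and consecutive steep samples are merged into one edge. The helper names, the data and min_slope are all hypothetical.

```python
def moving_average(profile, window=3):
    """Simple low-pass filter: centred mean, with a shorter window at the borders."""
    half = window // 2
    out = []
    for i in range(len(profile)):
        lo, hi = max(0, i - half), min(len(profile), i + half + 1)
        out.append(sum(profile[lo:hi]) / (hi - lo))
    return out


def find_edges(profile, min_slope):
    """Return (rises, falls): centres of the runs where the slope is steep."""
    slopes = [b - a for a, b in zip(profile, profile[1:])]
    rises, falls = [], []
    i = 0
    while i < len(slopes):
        if slopes[i] >= min_slope:
            sign = 1
        elif slopes[i] <= -min_slope:
            sign = -1
        else:
            i += 1
            continue
        j = i
        # Extend the run while the slope stays steep in the same direction.
        while j + 1 < len(slopes) and sign * slopes[j + 1] >= min_slope:
            j += 1
        # slopes[k] lies between samples k and k + 1, so a run i..j is centred here.
        centre = (i + j + 1) / 2
        (rises if sign > 0 else falls).append(centre)
        i = j + 1
    return rises, falls


# Made-up row sums: one rectangle occupying rows 3 to 6.
row_profile = [0, 0, 0, 10, 10, 10, 10, 0, 0, 0]
rises, falls = find_edges(moving_average(row_profile), min_slope=3)
print(rises, falls)
```

A rise at 2.5 means the rectangle starts between rows 2 and 3, and a fall at 6.5 means it ends between rows 6 and 7. The filter smears each edge over a few samples, which is why the runs are merged before reporting a position.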
You can do the same with the first plot, but it is more sensitive.
The minimums in your last plot are the gaps between lines of text ...
You just take the graph, align its y axis to the y axis of the image, and then threshold the areas with too few pixels per row. These areas (red) are the gaps between lines of text or whatever you have in the image:
Now you should de-skew the image if needed. If the skew is too big you need to apply the de-skew even before the y axis summation.
De-skew operation characters in binary image (matlab)
After this, you make the x axis summation plot for each non-red region separately and, in the same manner, detect the gaps between characters/words to get the area of each character for OCR. This time you align the plot to the x axis.
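A small Python sketch of this second pass, assuming the text line has already been located as a band of rows. The image, the band limits and the helper names are made up for illustration.

```python
def runs_above(profile, threshold=0):
    """Return (start, end) pairs, end exclusive, of runs with values above threshold."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def characters_in_band(img, top, bottom):
    """Column sums restricted to rows top..bottom-1, split at the empty columns."""
    band = img[top:bottom]
    column_profile = [sum(col) for col in zip(*band)]
    return runs_above(column_profile)


# Hypothetical binary image with one line of text in rows 1 and 2.
image = [
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 1, 1],
    [1, 1, 0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0],
]
print(characters_in_band(image, top=1, bottom=3))
```

Each pair is the column range of one character or stroke group inside that line. Whether a gap separates characters or words would have to be decided from the gap width.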
These plots can also be used for OCR if they are taken on a character region instead; see
OCR and character similarity
If you do a statistical analysis of the gap/non-gap sizes, then the most frequent one is usually the font spacing/size for regular text.
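The statistical analysis could be as simple as taking the most frequent run length. This is a Python sketch, not a full method: the column profile is invented and run_lengths is a hypothetical helper.

```python
from collections import Counter
from itertools import groupby


def run_lengths(profile, threshold=0):
    """Split the profile into gap and non-gap runs and return their lengths."""
    gaps, marks = [], []
    for is_mark, group in groupby(profile, key=lambda v: v > threshold):
        length = sum(1 for _ in group)
        (marks if is_mark else gaps).append(length)
    return gaps, marks


# Made-up column sums for one line of text.
column_profile = [3, 3, 0, 4, 4, 0, 2, 2, 0, 0, 0, 5, 5, 0, 1, 1]
gaps, marks = run_lengths(column_profile)

most_common_gap = Counter(gaps).most_common(1)[0][0]
most_common_mark = Counter(marks).most_common(1)[0][0]
print(gaps, most_common_gap)
print(marks, most_common_mark)
```

In this toy profile the most frequent gap is one column wide, so the single three-column gap stands out as a likely word break.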

Matlab: polar coordinates grey scale plot

edit: I decided to split this question into two parts, because it was really two questions: 1. how to make a polar surface plot in MATLAB (this question) and 2. how to fit polar data points into a coarse (and non-polar) matrix
I have a matrix that contains certain grey values (values between zero and one). These points are stored in a rectangular matrix, but really the data points are acquired by rotating the detector. This means that I actually have polar coordinates (I know the polar coordinates for every single pixel in my starting matrix).
I want to make a polar plot of the data points. I have the example of this below.
Because MATLAB stores images as matrices, the polar coordinates I have do not exactly match the 'bins' of the matrix. Therefore, we currently use an interpolation algorithm to put the polar coordinates into a square matrix. However, this is extremely slow. I see two methods to solve this issue:
let MATLAB directly plot the data points as polar.
calculate once how to convert from the start matrix to the end matrix and let MATLAB do this through matrix multiplication.
Some basic information:
Input matrix size: 512×960
Current output matrix size: 1024×1024
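To make the second idea concrete, here is a Python sketch that computes the mapping once and then reuses it. It swaps the matrix multiplication for a nearest-neighbour lookup table, which plays the same role as multiplying by a sparse 0/1 matrix. The storage layout (one row per radius, one column per angle), the tiny sizes and every name here are assumptions for illustration only.

```python
import math


def build_lookup(n_r, n_theta, out_size):
    """For each output pixel, precompute the nearest (radius, angle) index or None."""
    centre = (out_size - 1) / 2
    max_radius = centre  # assumption: the outermost radial sample touches the edge
    table = []
    for y in range(out_size):
        row = []
        for x in range(out_size):
            dx, dy = x - centre, y - centre
            radius = math.hypot(dx, dy)
            if radius > max_radius:
                row.append(None)  # outside the scanned disc
                continue
            angle = math.atan2(dy, dx) % (2 * math.pi)
            ri = min(n_r - 1, round(radius / max_radius * (n_r - 1)))
            ti = round(angle / (2 * math.pi) * n_theta) % n_theta
            row.append((ri, ti))
        table.append(row)
    return table


def apply_lookup(table, polar, fill=0.0):
    """Fill the square image by pure indexing; no trigonometry at this stage."""
    return [
        [fill if idx is None else polar[idx[0]][idx[1]] for idx in row]
        for row in table
    ]


# Tiny made-up data set: 3 radial samples, 4 angles, 5x5 output image.
polar = [[0.1] * 4, [0.5] * 4, [0.9, 0.8, 0.7, 0.6]]
table = build_lookup(n_r=3, n_theta=4, out_size=5)  # computed once
square = apply_lookup(table, polar)                 # repeated for every new frame
```

The table depends only on the geometry, so it would be built a single time for the 512×960 input and the 1024×1024 output and then applied to each new acquisition. Nearest-neighbour lookup is coarser than interpolation, so this trades some quality for speed.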
I think there is a built-in function for polar plots in MATLAB.
Z = [2+3i 2 -1+4i 3-4i 5+2i -4-2i -2+3i -2 -3i 3i-2i];
polarplot(Z,'*')
this command plots:
plot polar
See this link:
http://www.mathworks.com/help/matlab/ref/polarplot.html
To plot in grayscale, use "pcolor" and set the colormap to "gray"
http://www.mathworks.com/help/matlab/ref/pcolor.html
The question was solved (apart from a minor flaw), partly because K.M. Shihab Uddin pointed me in the right direction. Unfortunately, using surf means the image is actually rendered in a figure every time, and this is slow as well.
So I have X and Y values both in separate matrices and greyscale values (in a matrix called C) for every X and Y combination.
I found out that pcolor is just surf with a viewpoint from the top. So I used the following code to plot my graph.
surf(X,Y,C*255)
view([0,0,500])
However, this gave me a completely black image. This is because surf (and pcolor) draw 960 radial grid lines in my case, and the black edges hide the faces. The solution is to turn the edges off:
surf(X,Y,C*255,'EdgeColor','none')
view([0,0,500])
Now I have an almost perfect image, like I had before. Only, of my 960 radial lines, one is left white, so I still have to solve that. However, I feel this is a technical detail of the function surf, and answering this part does not belong in this question.
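For reference, the X and Y matrices passed to surf are just the polar coordinates converted to Cartesian ones. The Python sketch below shows that conversion on a tiny made-up grid. The close_seam option is an assumption on my part: a surface drawn from N angular samples has N-1 angular cells, so repeating the first angle at the end may be one way to fill the remaining wedge. The function name and sizes are hypothetical.

```python
import math


def polar_grid(radii, angles, close_seam=False):
    """Return X and Y grids with one row per radius and one column per angle."""
    angles = list(angles)
    if close_seam:
        # Repeat the first angle one full turn later so the last cell wraps around.
        angles.append(angles[0] + 2 * math.pi)
    xs = [[r * math.cos(t) for t in angles] for r in radii]
    ys = [[r * math.sin(t) for t in angles] for r in radii]
    return xs, ys


radii = [0.0, 1.0, 2.0]
angles = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]

xs, ys = polar_grid(radii, angles)
xs_closed, ys_closed = polar_grid(radii, angles, close_seam=True)
print(len(xs[0]), len(xs_closed[0]))  # number of angular columns before and after
```

If the seam is closed this way, the colour matrix would also need its first column repeated at the end so that all three matrices keep the same shape.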
The resulting image

multidimensional scatter plot with d3

I have a dataset which has 9 attributes out of which 2 are numerical and the rest are categorical.
I wish to plot as many attributes as possible within a scatter plot matrix. D3 examples have shown me scatter plot matrices with a majority of numerical values.
Are there any ways to plot multidimensional categorical data as scatter plots?
If yes, are there any samples available on the web?
