How to fill a sub-area of a density curve with Python / seaborn

I want to shade the area under the density curve for the standard normal distribution over the following ranges:
1) mean-2std to mean-std ---> in red
2) mean+std to mean+2std ---> in red
3) mean-std to mean+std ---> in blue
This is a variant of the question "Shade (fill or color) area under density curve by quantile".
The data used to draw the density curve is taken from a column of a dataframe.
E.g., this is only part of the data; the column has 256 values.
Gap
1 -3.260010
2 -7.790009
3 -1.179993
4 2.270019
5 9.000000
6 -4.930023
7 -7.920014
To draw the plot I did the following code:
sns.kdeplot(TeslaStock18_19['Gap'], label = 'Gap Density', color = 'darkblue')
Considering all the data, I found that the distribution is normal. This allows me to use the empirical rule (68-95-99.7) to make some statistical considerations.
What I would like to obtain is the following plot:
https://www.nku.edu/~statistics/images/Using_1.gif
N.B. I am just starting to use Python; this is for a university project.
This is what I tried, but it does not completely fill the area:
ptx = np.linspace(meanGap - stdGap, meanGap + stdGap)
pty = scipy.stats.norm.pdf(ptx, meanGap, stdGap)
plt.fill_between(ptx, pty, color='#0b559f', alpha=0.35)
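One way to shade all three bands is to plot the fitted normal curve over the full range and call fill_between once per interval, restricting each fill with a where mask. A minimal sketch, assuming meanGap and stdGap were computed from the 'Gap' column (stand-in values are used here so the snippet runs on its own):

import numpy as np
import scipy.stats
import matplotlib.pyplot as plt

# Assumed to exist already, e.g.:
# meanGap = TeslaStock18_19['Gap'].mean()
# stdGap  = TeslaStock18_19['Gap'].std()
meanGap, stdGap = 0.0, 1.0  # stand-ins to make the sketch runnable

x = np.linspace(meanGap - 3 * stdGap, meanGap + 3 * stdGap, 500)
y = scipy.stats.norm.pdf(x, meanGap, stdGap)
plt.plot(x, y, color='darkblue', label='Gap Density')

# (mean-2std, mean-std) and (mean+std, mean+2std) in red
plt.fill_between(x, y, where=(x >= meanGap - 2 * stdGap) & (x <= meanGap - stdGap),
                 color='red', alpha=0.35)
plt.fill_between(x, y, where=(x >= meanGap + stdGap) & (x <= meanGap + 2 * stdGap),
                 color='red', alpha=0.35)
# (mean-std, mean+std) in blue
plt.fill_between(x, y, where=(x >= meanGap - stdGap) & (x <= meanGap + stdGap),
                 color='#0b559f', alpha=0.35)

plt.legend()
plt.show()

If the shading should follow the KDE drawn by sns.kdeplot rather than the fitted normal, the same fill_between calls can be applied to the curve's line data, retrieved with ax.lines[0].get_data() after plotting.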

Related

How is homogeneity calculated? [closed]

I'm trying to determine whether an image is homogeneous in MATLAB. My image contains 5 coins. I used the function improfile to create the intensity profile, but I don't know how to identify the homogeneous circle.
Homogeneity is judged using the intensity inside the coin.
How do I code it?
close all; clear all;
I = imread('coins.png');
bw = im2bw(I, graythresh(I));        % Otsu threshold -> binary mask
[L, N] = bwlabel(bw);                % label the connected components
ele = find(L == 3);                  % linear indices of the 3rd coin
Im1 = zeros(size(I, 1), size(I, 2));
Im1(ele) = 1;                        % binary mask of that coin
figure, imshow(Im1)
ML = I;
ML(Im1 == 0) = 0;                    % keep only the coin's gray levels
figure, imshow(ML)
figure, imhist(ML(Im1 == 1))         % histogram of the coin's pixels
st = regionprops(L, I, 'PixelValues');
pv = st(3).PixelValues;              % same pixels via regionprops
figure, imhist(pv)
I plotted the histogram, but I don't know how to proceed.
The standard deviation histogram and the normal histogram plot as the same.
You can derive homogeneity criteria from the histogram of gray levels or from the histogram of gradient intensities.
A low dispersion (small variance) indicates quasi-constant lightness, but is sensitive to smooth variations.
A low gradient average indicates a lack of detail, but is sensitive to texture/noise.
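For illustration only (the question is MATLAB, but the two criteria are easy to express in Python/OpenCV; gray and mask below are hypothetical stand-ins for a grayscale image and a boolean mask of one coin):

import numpy as np
import cv2

# Hypothetical stand-ins for a grayscale image and one region's mask.
gray = np.random.randint(0, 256, (128, 128), dtype=np.uint8)
mask = np.zeros((128, 128), dtype=bool)
mask[32:96, 32:96] = True

def homogeneity_stats(gray, mask):
    # Dispersion of gray levels inside the region:
    # small => quasi-constant lightness.
    vals = gray[mask].astype(float)
    dispersion = vals.std()

    # Mean gradient magnitude inside the region:
    # small => lack of detail/texture.
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1)
    grad = np.sqrt(gx ** 2 + gy ** 2)
    mean_gradient = grad[mask].mean()

    return dispersion, mean_gradient

print(homogeneity_stats(gray, mask))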

Automatically group / cluster multi-colored objects by color

I have between 2 and 20 multi-colored objects per image.
Those objects can belong to up to 5 different color groups.
I need to group similarly colored objects into color groups.
I have no prior knowledge of the number of groups or the number of colors per object (it could be a single color as well), and the color patterns vary, so I cannot use pre-defined color templates.
What I tried so far:
K-Means requires knowledge of the number of color groups; a wrong k gives a bad grouping.
Gaussian Mixture Models require knowledge of the number of colors per object.
Self-Organizing Maps over-segment the color groups, and simple color templates don't work for multi-colored objects.
The closest I've gotten is the following:
Using OpenCV, I converted to HSV and took the histogram of the H and S values only. Then I used cv2.compareHist() with the correlation metric to calculate the similarity between the objects' histograms.
In the image below, I have 17 objects: 10 red-ish, 4 blue-ish and 3 yellow-ish objects in order.
I manually took 2 histograms per color group, plotted the distance from each of them to each of the 17 objects, and drew the corresponding histogram underneath.
As you can see, the first red object has a small distance to the first 10 objects, which are also red. The second red object, however, has a bit more green, which makes its distance to the yellow objects small; and the distance plot for the yellow objects shows that their distance to the red objects is relatively close as well. That makes it hard to simply threshold the grouping at a fixed distance, such as the line at 0.6.
Given that I have the distance between each object's histograms, how can I automatically create the color groups?
Or is there a better way of clustering the objects into color groups?
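For reference, a sketch of that histogram comparison (obj_a and obj_b are hypothetical stand-ins for BGR crops of two objects; the bin counts are a free choice):

import cv2
import numpy as np

# Hypothetical stand-ins for two object crops (BGR images).
obj_a = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
obj_b = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)

def hs_histogram(bgr_crop):
    # H-S histogram only, as described in the question.
    hsv = cv2.cvtColor(bgr_crop, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1], None, [30, 32], [0, 180, 0, 256])
    return cv2.normalize(hist, hist)

# Correlation similarity in [-1, 1]; 1.0 means identical distributions.
sim = cv2.compareHist(hs_histogram(obj_a), hs_histogram(obj_b), cv2.HISTCMP_CORREL)
print(sim)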
For anyone who is interested: using the distance matrix, I applied scipy's hierarchical clustering to form the color groups.
My distance matrix was a normalized NxN (N = number of objects) symmetric matrix with 1's along the diagonal, holding the correlation of each object's histogram against every other object's histogram. To group the objects, I use the dissimilarity, which is 1 minus the normalized value:
dissimilarityMatrix = 1 - np.array(distanceMatrix)
hierarchy = scipy.cluster.hierarchy.linkage(dissimilarityMatrix, 'single')
Using method='single', I then clustered them with fcluster, and max_d = 1.0 worked for me. max_d might differ for other use cases, and it might help to take the squared distance matrix when calculating the dissimilarity.
clusters = scipy.cluster.hierarchy.fcluster(hierarchy, max_d, criterion='distance')
No. of color groups = max(clusters)
As you can see, it clusters the red objects together (0-9), the blue (10-13) and yellow (14-16) successfully.
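Putting the steps above together as a runnable sketch (a toy 4x4 correlation matrix stands in for the real one; note that scipy's linkage expects a condensed distance vector, so squareform is used to convert the square dissimilarity matrix):

import numpy as np
from scipy.cluster import hierarchy
from scipy.spatial.distance import squareform

# Toy correlation matrix: objects 0-1 are similar, objects 2-3 are similar.
distanceMatrix = np.array([
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
])

dissimilarityMatrix = 1 - distanceMatrix

# linkage wants the condensed (upper-triangle) form of a square matrix.
condensed = squareform(dissimilarityMatrix, checks=False)
linked = hierarchy.linkage(condensed, method='single')

max_d = 0.5  # cut-off distance; the answer above used 1.0 for its data
clusters = hierarchy.fcluster(linked, max_d, criterion='distance')
print(clusters)        # e.g. [1 1 2 2]: one cluster label per object
print(clusters.max())  # number of color groups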

Model 'base-value' from RGB

I have a raster file which consists only of RGB bands. The raster represents elevation and colors the cells accordingly, but it does not contain any elevation data. Now I want to re-color it with my own custom scale, and for that I need the elevation data.
I have a small portion of the raster expressed as sample points (i.e. i, j and elevation). I can sample the raster's RGB values onto these points, so that each point carries (i, j, elevation, r, g, b, a) in a list.
Now I want to analyse the correlation(?) between the RGB values and elevation using the sample points. Afterwards I want to go back to the raster and compute the elevation for any cell. Basically I want something like this: aR + bG + cB + dA + e = elevation.
Is this possible, or is the assignment of the RGB color scale just too arbitrary? Otherwise, any tips on workflows or suggested correlation formulas to make this work? Preferably in Python 2.
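Fitting that linear model by least squares is straightforward with numpy; a minimal sketch with hypothetical sample rows of (r, g, b, a, elevation):

import numpy as np

# Hypothetical sample points: (r, g, b, a, elevation) per row.
samples = np.array([
    [ 30,  60, 200, 255,  12.0],
    [ 60,  90, 180, 255,  34.0],
    [ 80, 120, 160, 255,  55.0],
    [120, 140, 120, 255,  90.0],
    [160, 140,  90, 255, 130.0],
    [220,  70,  40, 255, 240.0],
])

rgba = samples[:, :4]
elev = samples[:, 4]

# Design matrix [R G B A 1] for the model aR + bG + cB + dA + e = elevation.
# Note: with constant alpha, the A column and the intercept are collinear;
# lstsq still returns a least-squares solution.
X = np.column_stack([rgba, np.ones(len(rgba))])
coeffs, residuals, rank, sv = np.linalg.lstsq(X, elev, rcond=None)

def predict_elevation(r, g, b, a):
    return np.dot(coeffs, [r, g, b, a, 1.0])

One caveat: typical elevation ramps (blue-to-red) are not linear in RGB, so a single linear model may fit poorly; if the original color ramp is known, inverting it with a nearest-color lookup per cell may work better.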
Edit:
This is a small snippet of the raster:
Blue is low elevation and Red is high elevation

How to traverse the whole RGB cube for a 16.7-million-color gradient

Assume a color space defined by a 256*256*256 RGB cube. A 1-D color gradient is formed by traversing the cube along some path between two points, e.g. 1, 2. These are examples of paths that cover only part of the color space. I am interested in a 1-D path that traverses all points in the cube continuously and forms a 16.7-million-color gradient. Are there any known formulations for this?
Edit: this is one possible answer: Algorithm for generating a 3D Hilbert space-filling curve in Python
Edit2: this shows some implementations: http://www.alanzucconi.com/2015/09/30/colour-sorting/
A 6-segment gradient.
For color paths, see fig. 10.22.
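One simple formulation that is not a Hilbert curve but does visit all 16,777,216 points continuously is a boustrophedon ("snake") scan of the cube, where each step changes exactly one channel by 1. A sketch:

def snake_rgb_path():
    # Yields all 256**3 RGB triples; consecutive triples differ by
    # exactly 1 in exactly one channel, so the gradient is continuous.
    b_dir = 1
    for r in range(256):
        g_vals = range(256) if r % 2 == 0 else range(255, -1, -1)
        for g in g_vals:
            b_vals = range(256) if b_dir == 1 else range(255, -1, -1)
            for b in b_vals:
                yield (r, g, b)
            b_dir = -b_dir  # reverse b on every new (r, g) scanline

# Sanity check: the first few steps move one unit at a time.
path = snake_rgb_path()
prev = next(path)
for _ in range(3):
    cur = next(path)
    print(prev, "->", cur)
    prev = cur

The 3-D Hilbert curve linked in the edit gives much better locality (nearby gradient positions stay nearby in color space), at the cost of a more involved implementation.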

Using pcolor to plot 3 arrays in python

I read a satellite image and extracted the data, lat and lon from it into arrays. The dimensions of lat and lon are both (135, 90). The data originally had dimension (135, 90, 4, 9, 8), where 4 is the band of the image. After processing (a for loop that put all bands into a single image), the dimension of the data is now (1215, 720), which is (135 x 9, 90 x 8). I have this piece of code:
x = lat # dimension (135,90)
y = lon # dimension (135,90)
z = data # dimension ( 1215, 720)
plt.figure()
plt.pcolor(x,y,z)
plt.colorbar()
plt.savefig("proj1.png")
But it produced a very bad image, shown below:
A friend told me I should take more points in lat and lon, to make them the same dimensions as the data, but I don't know how to do that. Is his method correct?
It's me again... The matplotlib documentation (http://matplotlib.sourceforge.net/api/pyplot_api.html#matplotlib.pyplot.pcolor) says that
X and Y, if given, specify the (x, y) coordinates of the colored quadrilaterals; the quadrilateral for C[i,j] has corners at:
(X[i, j], Y[i, j]),
(X[i, j+1], Y[i, j+1]),
(X[i+1, j], Y[i+1, j]),
(X[i+1, j+1], Y[i+1, j+1]).
Ideally the dimensions of X and Y should be one greater than those of C; if the dimensions are the same, then the last row and column of C will be ignored.
Yet the dimension (or shape) of C is totally different from X and Y. Matplotlib ideally wants you to prepare (1) X and Y as the x and y coordinates of the grid points (i.e. corner points), and (2) C as the value of each tile surrounded by 4 adjacent grid points. So, with x and y shaped 135 by 90, the color array should be either 134 by 89, or 135 by 90 (with the last row and column ignored).
My understanding is that the data for C comes from MODIS pixels, and you already have them as 135x90. So you should specify the corner points of those 12150 tiles. Makes sense? If you know the lat/lon of the center points, shift them by half the grid distance to the left/below, then add one row and column at the right/above to create the grid points; a sketch of this follows below. If you use projected coordinates instead of lat/lon, it's the same thing. Or you can forget about the half-distance business and plug in the X and Y you already have (135x90) as is, along with C, which then has to be 135x90, in order to use pcolor.
What's the meaning of 9 and 8 in (135, 90, 4, 9, 8)? Do you have 9*8 different properties at each horizontal grid cell, e.g. vertical layers, different chemical species, or physical properties? If so, you have to pick one and make one plot at a time (i.e., feed only a 135x90 C along with your X and Y). Also, you mentioned that 4 is for "band". If this is a color band like RGBK and you want to show that color, then pcolor is probably not a good fit, and you have to look for some other function that understands those 4 numbers. pcolor simply reads the range of the numbers, scales between the min and max, and applies a color scale from blue to red (or whatever colormap you choose).
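Here is a sketch of that half-distance shift, assuming roughly uniform grid spacing (centers_to_corners is a hypothetical helper name):

import numpy as np

def centers_to_corners(c):
    # (M, N) cell-center coordinates -> (M+1, N+1) cell corners.
    # Interior corners average the 4 surrounding centers; the outer
    # ring is extrapolated by half a cell via odd reflection.
    g = np.pad(c, 1, mode='reflect', reflect_type='odd')
    return 0.25 * (g[:-1, :-1] + g[1:, :-1] + g[:-1, 1:] + g[1:, 1:])

centers = np.arange(6.0).reshape(2, 3)
print(centers_to_corners(centers).shape)  # (3, 4)

# For the question's grids: X = centers_to_corners(lat) and
# Y = centers_to_corners(lon) give (136, 91) corner arrays for a (135, 90) C.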
EDIT
I grabbed a data set for Level-1B, VISIBLE along with documentation from http://disc.sci.gsfc.nasa.gov/AIRS/data-holdings/by-access-method.
This data set is generated from AIRS level 1A digital numbers (DN), including 4 channels in the 0.4 to 1.0 um region of the spectrum. A day's worth of AIRS data is divided into 240 scenes each of 6 minute duration. For the AIRS visible/near IR measurements, an individual scene contains 135 scanlines with a scanline containing 720 cross-track pixels and 9 along-track pixels; there is a total of 720 x 9 x 135 = 874,800 visible/near-IR pixels per scene.
So the easiest would be to average the 9x8 values at each location, and pick one of the four bands at a time. Alternatively, since these bands correspond to different colors, with wavelengths as shown below,
Channel 1: 0.41 um - 0.44 um
Channel 2: 0.58 um - 0.68 um
Channel 3: 0.71 um - 0.92 um
Channel 4: 0.49 um - 0.94 um
you may be able to use these as RGBK values for pylab's imshow() function, maybe. You may not like the coarse resolution of the output after spatial averaging; in that case you have to somehow obtain the coordinates of each of the (9, 8) pixels within each location. There should be a standard way, though, since this is a widely used public data set.
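A sketch of the averaging approach (random stand-ins replace the real arrays; the shapes follow the question):

import numpy as np
import matplotlib.pyplot as plt

# Stand-ins; replace with the real lat/lon/data read from the file.
lat = np.linspace(-10.0, 10.0, 135 * 90).reshape(135, 90)
lon = np.linspace(100.0, 120.0, 135 * 90).reshape(135, 90)
data = np.random.rand(135, 90, 4, 9, 8)

band = 0                                 # pick one of the 4 bands
C = data[:, :, band].mean(axis=(2, 3))   # average the 9x8 sub-pixels -> (135, 90)

plt.pcolor(lat, lon, C)   # same shape as X/Y: last row/column of C is ignored
plt.colorbar()
plt.savefig("proj1_band0.png")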
