How to find the center / midpoint / center of mass of n-dimensional vectors [v1,v2,v3,…] using Python - python

I have vectors v1, v2, v3, v4, v5, each with 100 dimensions. I need to find a center vector that is equally distant from each of them.

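It depends on how the vectors are represented in Python.
Suppose v1, v2, ... , v5 are lists of values, each of length 100.
In that case I would do the following:
np.vstack([v1, v2, v3, v4, v5]).mean(axis=0)
If the vectors are already stacked in a 5x100 array, e.g. arr with arr.shape == (5, 100),
you can get the solution as follows:
arr.mean(axis=0)
Averaging over axis=0 averages across the five vectors and returns a vector of length 100 (with axis=1 you would get one mean per vector, i.e. 5 values). The result is the centroid (center of mass). Note that it is generally not at exactly equal distance from every vector; it is the point that minimizes the sum of squared distances to them.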
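As a quick sanity check, here is a minimal sketch with made-up random data (the names vectors, center and dists are only for illustration). It shows that averaging over axis 0 gives a 100-dimensional center, and it prints the distance from each vector to that center so you can see that they need not be equal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Five made-up vectors of length 100, stored as plain Python lists.
vectors = [rng.normal(size=100).tolist() for _ in range(5)]

arr = np.vstack(vectors)        # shape (5, 100): one row per vector
center = arr.mean(axis=0)       # shape (100,): average across the 5 rows

# Euclidean distance from each vector to the center, shape (5,).
dists = np.linalg.norm(arr - center, axis=1)

print(center.shape)
print(dists)
```

If you really need a point at equal distance from all vectors, that is a different problem (the center of a circumscribed sphere), and the mean is not the answer to it.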

Related

How to write a function distance_matrix which computes the distance matrix of a graph without using NetworkX functions in Python?

I have to write a function distance_matrix in Python which computes the distance matrix of a graph. I can use the NetworkX function adjacency_matrix to compute the adjacency matrix of the input graph, but I cannot use any other NetworkX functions.
I know that the function has to compute the distance matrix of a graph. It needs to return a matrix, represented as an array of type numpy.ndarray, of the same shape as the adjacency matrix of the graph.
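import numpy as np
import networkx as nx

def distance_matrix(G):
    # adjacency_matrix is the only NetworkX function used.
    A = nx.adjacency_matrix(G).toarray()
    n = A.shape[0]
    # Off the diagonal, 0 means "no path found (yet)".
    dist = np.zeros((n, n), dtype=int)
    Am = np.eye(n, dtype=int)
    for m in range(1, n):
        # Am[i, j] > 0 iff there is a walk of length m from i to j.
        # Clamping to 0/1 keeps the entries from overflowing.
        Am = (Am @ A > 0).astype(int)
        # The first m at which j becomes reachable from i is the distance.
        dist[(Am > 0) & (dist == 0)] = m
    # Every node is at distance 0 from itself.
    np.fill_diagonal(dist, 0)
    return dist

# G is the input NetworkX graph.
print(distance_matrix(G))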
I know that the diagonal has to be equal to 0 and the rest of the entries have to be the shortest path lengths from one node to the other. I think...
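For comparison, here is a sketch of a different technique: a breadth-first search from every node, working directly on an adjacency array. It assumes an unweighted graph, and both the function name distance_matrix_from_adjacency and the use of -1 for unreachable pairs are choices made for this example, not part of the assignment.

```python
from collections import deque

import numpy as np


def distance_matrix_from_adjacency(A):
    """Shortest-path lengths (in edges) between all node pairs."""
    A = np.asarray(A)
    n = A.shape[0]
    # -1 marks pairs with no connecting path (an assumption of this sketch).
    dist = np.full((n, n), -1, dtype=int)
    for source in range(n):
        dist[source, source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            # Neighbours of u are the nonzero entries in row u.
            for v in np.flatnonzero(A[u]):
                if dist[source, v] == -1:
                    dist[source, v] = dist[source, u] + 1
                    queue.append(v)
    return dist


# Path graph 0 - 1 - 2 - 3
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
D = distance_matrix_from_adjacency(A)
print(D)
```

With a NetworkX graph you would pass nx.adjacency_matrix(G).toarray() as A, which stays within the "adjacency_matrix only" restriction.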

Which axis for coordinates in a list of points [NumPy]

Say I have a list of N points each with d coordinates. There is a choice of representing this as a numpy array of shape:
- (d, N)
- (N, d)
which are mathematically equivalent.
Question. Are there general guidelines or good-practice principles for choosing one over the other? Computationally speaking, is numpy designed with one choice in mind?
An interpretation in favour of (N, d).
If I were to store coordinates of a list of points in a spreadsheet, then I would find it more natural (or is it just me?) to go through the list vertically (downwards) and through the coordinates horizontally. In other words, each row of the spreadsheet corresponds to a Python tuple (fixed length, immutable), and the spreadsheet corresponds to a Python list of such tuples. The number of coordinates in the spreadsheet is fixed, but more points could be added to (or removed from) the list, and thus the length of this list is unbounded, and I find it easier to scroll vertically than horizontally.
Example.
In the k-means clustering algorithm, one wants to calculate the distance between each of N points and each of k cluster centers (all of which have d coordinates). From this article I am learning a way to do this by exploiting broadcasting. If X is an array of shape (N, d) of sample points and C is an array of shape (k, d) of centers, then the distances can be conveniently calculated by taking the element-wise length (the norm along the last axis) of the array
X - C[:, None]
of shape (k, N, d).
It would be less convenient to do this with arrays of shapes (d, N) and (d, k), respectively.
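To make the broadcasting step concrete, here is a small sketch with made-up random data (the sizes N, k, d and the names diff, dist and labels are only for illustration). It is not a full k-means implementation, just the distance computation described above.

```python
import numpy as np

rng = np.random.default_rng(0)
N, k, d = 6, 3, 2

X = rng.normal(size=(N, d))   # sample points, one per row
C = rng.normal(size=(k, d))   # cluster centers, one per row

# C[:, None] has shape (k, 1, d); broadcasting against (N, d) gives (k, N, d).
diff = X - C[:, None]

# Length of each difference vector: reduce over the coordinate axis.
dist = np.linalg.norm(diff, axis=-1)   # shape (k, N)

# Index of the nearest center for each point, shape (N,).
labels = dist.argmin(axis=0)

print(diff.shape, dist.shape, labels)
```

Here dist[j, i] is the distance between center j and point i, so the (N, d) layout lets the coordinate axis stay last throughout.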

Calculate eigenvalues in Python in the same order as Matlab

This is the Matlab code which returns the eigenvectors in V and the eigenvalues in D. If C is a 9*9 matrix, then V is a 9*9 matrix and D is a 9*9 diagonal matrix.
[V,D] = eig(C);
I want the same thing in Python, in the same order as Matlab. I am using this code:
V1, D = np.linalg.eig(C)
V = np.zeros((9, 9))
for i in range(9):
    V[i][i] = V1[i]
This code gives me the eigenvalues in V1 and the eigenvectors in D. I copy V1 onto the diagonal of V to get a 9*9 diagonal matrix.
The problem is that I want the eigenvalues and eigenvectors in the same order as Matlab, which is not what I get in Python. Please help me get the values in the same order as Matlab.
See the link below for the difference in values between Matlab and python.
https://drive.google.com/drive/folders/1zjhbKH0q_XXbBziZhfpL1-qS3B5oDuMb
For a real symmetric matrix, Matlab typically places the eigenvalues on the diagonal of D in ascending order (i.e. the lowest eigenvalue is D(1,1) and the largest one is D(9,9)), although eig does not guarantee a sorted order in general.
np.linalg.eig makes no promise about ordering, so its outputs (eigenvalues and eigenvectors) must be sorted with something like:
ind = np.argsort(V1)
V1 = V1[ind]
D = D[:, ind]
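Here is a self-contained sketch of that sorting step on a small made-up symmetric matrix (the 3x3 matrix C and the names eigvals, eigvecs and D_diag are assumptions for the example, not the asker's data). It also shows np.linalg.eigh, which returns eigenvalues in ascending order but is only applicable when the matrix is symmetric (or Hermitian).

```python
import numpy as np

# Made-up symmetric matrix for illustration.
C = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])

# np.linalg.eig returns (eigenvalues, eigenvectors), in that order.
eigvals, eigvecs = np.linalg.eig(C)

# Sort eigenvalues ascending and reorder the eigenvector columns to match.
order = np.argsort(eigvals)
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]

# Matlab-style diagonal matrix of eigenvalues.
D_diag = np.diag(eigvals)

# For symmetric input, eigh already returns ascending eigenvalues.
w, v = np.linalg.eigh(C)

print(eigvals)
print(w)
```

Even after sorting, individual eigenvector columns can differ from Matlab's by sign, because an eigenvector is only determined up to a scale factor.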

Convolve more than two vectors?

numpy.convolve only convolves two vectors. I want to convolve a list of vectors:
v1, v2, ..., vn
Note that using the FFT it is possible to do the complete convolution much more efficiently than the naive approach:
numpy.convolve(v1, numpy.convolve(v2, ...., numpy.convolve(vn-1, vn)...)
This is because you can transform each sequence once and then inverse-transform the product of the transforms.
Is there a way to do this?
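One possible way, sketched below under the assumption that all inputs are real-valued 1-D sequences and that you want the full convolution: zero-pad every vector to the final output length, multiply the transforms, and invert once. The function name convolve_many is made up for this example.

```python
from functools import reduce

import numpy as np


def convolve_many(vectors):
    """Full linear convolution of several real 1-D sequences via the FFT."""
    vectors = [np.asarray(v, dtype=float) for v in vectors]
    # Length of the full convolution of all inputs.
    n = sum(len(v) for v in vectors) - len(vectors) + 1
    # rfft zero-pads each input to length n before transforming.
    spectra = [np.fft.rfft(v, n) for v in vectors]
    # Convolution in time is a pointwise product in frequency.
    product = reduce(np.multiply, spectra)
    return np.fft.irfft(product, n)


vs = [[1, 2, 3], [0, 1, 0.5], [2, -1]]
fast = convolve_many(vs)
naive = reduce(np.convolve, vs)
print(np.allclose(fast, naive))
```

The result matches repeated numpy.convolve only up to floating-point rounding, so integer inputs may need rounding afterwards.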

Cosine similarity in Theano

What is the easiest way to compute cosine similarity with numpy and theano?
The vectors are given as numpy arrays.
I've tried to calculate the cosine similarity matrix using only numpy, and it is maddeningly slow. I am completely new to theano, but I suppose this library may help me build my cosine similarity matrix.
Well, help! :)
Here's a post about cosine similarity in Python: Cosine Similarity between 2 Number Lists.
I rewrote this answer in Numpy and Theano:
import math
import numpy
import theano

def cos_sim_numpy(v1, v2):
    numerator = numpy.sum(v1 * v2)
    denominator = math.sqrt(numpy.sum(v1 ** 2) * numpy.sum(v2 ** 2))
    return numerator / denominator

def compile_cos_sim_theano():
    # Symbolic input vectors; the expression is compiled once and reused.
    v1 = theano.tensor.vector(dtype=theano.config.floatX)
    v2 = theano.tensor.vector(dtype=theano.config.floatX)
    numerator = theano.tensor.sum(v1 * v2)
    denominator = theano.tensor.sqrt(theano.tensor.sum(v1 ** 2) * theano.tensor.sum(v2 ** 2))
    return theano.function([v1, v2], numerator / denominator)

cos_sim_theano_fn = compile_cos_sim_theano()
v1 = numpy.asarray([3, 45, 7, 2], dtype=numpy.float32)
v2 = numpy.asarray([2, 54, 13, 15], dtype=numpy.float32)
print(cos_sim_theano_fn(v1, v2), cos_sim_numpy(v1, v2))
Output: 0.972284251712 0.972284251712
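Since the question asks about a whole similarity matrix, here is a NumPy-only sketch that avoids the Python-level loop over pairs. It assumes the vectors are stacked as rows of a 2-D array and that no row is all zeros; the function name cosine_similarity_matrix is made up for this example.

```python
import numpy as np


def cosine_similarity_matrix(X):
    """Pairwise cosine similarity between the rows of X."""
    X = np.asarray(X, dtype=float)
    # Row norms, kept as a column so the division broadcasts per row.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    unit = X / norms           # assumes no all-zero rows
    # Dot products of unit vectors are the cosines.
    return unit @ unit.T


X = np.array([[3, 45, 7, 2],
              [2, 54, 13, 15]])
S = cosine_similarity_matrix(X)
print(S)
```

S[i, j] is the cosine similarity between rows i and j, so S[0, 1] should agree with the pairwise functions above for the same two vectors.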
