Find the closest/most similar value (vector) in a matrix

Problem description

Let's say I have the following numpy matrix (simplified):

import numpy as np
matrix = np.array([[1, 1], [2, 2], [5, 5], [6, 6]])

Now I want to get the vector from the matrix that is closest to a "search" vector:

search_vec = np.array([3, 3])

What I did is the following:

min_dist = None
result_vec = None
for ref_vec in matrix:
    # Euclidean distance between the search vector and the current row
    distance = np.linalg.norm(search_vec - ref_vec)
    distance = abs(distance)  # redundant: a norm is never negative
    print(ref_vec, distance)
    if min_dist is None or min_dist > distance:
        min_dist = distance
        result_vec = ref_vec

This works, but is there a native numpy solution that does it more efficiently? My problem is that the bigger the matrix becomes, the slower the entire process gets. Are there other solutions that handle this in a more elegant and efficient way?

Recommended answer

Approach 1

We can use the Cython-powered kd-tree for quick nearest-neighbor lookup, which is very efficient in both memory and performance -

In [276]: from scipy.spatial import cKDTree

In [277]: matrix[cKDTree(matrix).query(search_vec, k=1)[1]]
Out[277]: array([2, 2])
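Since the question is about cost growing with the matrix size, it may help to see the tree built once and reused for several lookups. The following is a minimal sketch, not part of the original answer: the `queries` array is a made-up example, and it assumes SciPy is installed.

```python
import numpy as np
from scipy.spatial import cKDTree

matrix = np.array([[1, 1], [2, 2], [5, 5], [6, 6]])

# Build the tree once; building is the expensive step, querying is cheap.
tree = cKDTree(matrix)

# Hypothetical batch of search vectors (not from the original question).
queries = np.array([[3, 3], [5.4, 5.4], [0, 0]])

# For a 2-D query array and k=1, query returns one distance and one
# row index per search vector.
distances, indices = tree.query(queries, k=1)
closest_vecs = matrix[indices]

print(indices)
print(closest_vecs)
```

If only a single lookup is ever needed, building the tree probably does not pay off, and one of the brute-force approaches is likely simpler.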

Approach 2

With SciPy's cdist -

In [286]: from scipy.spatial.distance import cdist

In [287]: matrix[cdist(matrix, np.atleast_2d(search_vec)).argmin()]
Out[287]: array([2, 2])
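For a single search vector, the same brute-force distance column can also be computed with plain NumPy broadcasting, which answers the "native numpy" part of the question directly. This is a sketch added for illustration and is not one of the original answer's approaches; the variable names `distances` and `closest_idx` are my own.

```python
import numpy as np

matrix = np.array([[1, 1], [2, 2], [5, 5], [6, 6]])
search_vec = np.array([3, 3])

# Broadcasting subtracts search_vec from every row of matrix;
# axis=1 then yields one Euclidean norm per row.
distances = np.linalg.norm(matrix - search_vec, axis=1)

# Index of the smallest distance, then the corresponding row.
closest_idx = distances.argmin()
closest_vec = matrix[closest_idx]

print(closest_vec)
```

This replaces the Python loop from the question with one vectorized pass, but it still compares against every row, so the work grows linearly with the number of rows.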

Approach 3

With Scikit-learn's Nearest Neighbors -

from sklearn.neighbors import NearestNeighbors

nbrs = NearestNeighbors(n_neighbors=1).fit(matrix)
closest_vec = matrix[nbrs.kneighbors(np.atleast_2d(search_vec))[1][0,0]]

Approach 4

With Scikit-learn's kdtree -

from sklearn.neighbors import KDTree

kdt = KDTree(matrix, metric='euclidean')
cv = matrix[kdt.query(np.atleast_2d(search_vec), k=1, return_distance=False)[0,0]]

Approach 5

From the eucl_dist package (disclaimer: I am its author), and following its wiki contents, we can leverage matrix multiplication -

M = matrix.dot(search_vec)
d = np.einsum('ij,ij->i', matrix, matrix) + np.inner(search_vec, search_vec) - 2*M
closest_vec = matrix[d.argmin()]
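The expression above relies on the expansion ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b, so `d` holds squared distances, which have the same argmin as the distances themselves. The sketch below extends the same idea to several search vectors at once and checks it against a direct computation. The `queries` array and the clamping step are my own additions for illustration, not something taken from the eucl_dist package.

```python
import numpy as np

matrix = np.array([[1, 1], [2, 2], [5, 5], [6, 6]], dtype=float)

# Hypothetical batch of search vectors (not from the original question).
queries = np.array([[3, 3], [5.4, 5.4]], dtype=float)

# Squared norms of every matrix row and of every query vector.
row_sq = np.einsum('ij,ij->i', matrix, matrix)      # shape (n,)
query_sq = np.einsum('ij,ij->i', queries, queries)  # shape (m,)

# All dot products between queries and rows in one matrix multiplication.
cross = queries @ matrix.T                          # shape (m, n)

# ||q - r||^2 = ||q||^2 + ||r||^2 - 2 q.r for every (query, row) pair.
sq_dists = query_sq[:, None] + row_sq[None, :] - 2 * cross

# Floating-point cancellation can produce tiny negative values; clamp them.
np.maximum(sq_dists, 0, out=sq_dists)

closest_idx = sq_dists.argmin(axis=1)
closest_vecs = matrix[closest_idx]

# Direct computation for comparison: explicit differences, squared and summed.
direct = ((queries[:, None, :] - matrix[None, :, :]) ** 2).sum(axis=2)

print(closest_vecs)
```

The expanded form avoids materializing the (m, n, d) array of differences that the direct version builds, which is where the memory saving comes from on larger inputs.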