numpy, fill sparse matrix with rows from other matrix


I have trouble figuring out what would be the most efficient way to do the following:

    import numpy as np

    M = 10
    K = 10
    ind = np.array([0,1,0,1,0,0,0,1,0,0])
    full = np.random.rand(sum(ind),K)
    output = np.zeros((M,K))
    output[1,:] = full[0,:]
    output[3,:] = full[1,:]
    output[7,:] = full[2,:]

I want to build output, which is a sparse matrix, whose rows are given in a dense matrix (full) and the row indices are specified through a binary vector. Ideally, I want to avoid a for-loop. Is that possible? If not, I'm looking for the most efficient way to for-loop this.

I need to perform this operation quite a few times. ind and full will keep changing, so the values above are only examples for illustration. I expect ind to be pretty sparse (at most 10% ones), and both M and K to be large numbers (10e2 - 10e3). Ultimately, I might need to perform this operation in pytorch, but a decent procedure for numpy would already get me quite far.

Please also help me find a more appropriate title for the question, if you have one or more appropriate categories for this question.

Many thanks, Max

Best answer

output[ind.astype(bool)] = full


By converting the integer values in ind to boolean values, you can do boolean indexing to select the rows in output that you want to populate with values in full.
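As a quick sanity check, here is a minimal sketch that compares the boolean-mask assignment with an explicit row-by-row loop, using the sizes from the question. The seeded generator `rng` and the name `expected` are assumptions added for illustration only; they are not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
M, K = 10, 10
ind = np.array([0, 1, 0, 1, 0, 0, 0, 1, 0, 0])
full = rng.random((int(ind.sum()), K))

# Reference: explicit row-by-row assignment, as in the question.
expected = np.zeros((M, K))
for row, target in zip(full, np.flatnonzero(ind)):
    expected[target] = row

# Boolean-mask assignment: the i-th True position receives full[i].
output = np.zeros((M, K))
output[ind.astype(bool)] = full

print(np.array_equal(output, expected))  # True
```

Note that full needs exactly as many rows as ind has ones, otherwise the assignment raises a shape error.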

Example with a 4x4 array:

    import numpy as np

    M = 4
    K = 4
    ind = np.array([0,1,0,1])
    full = np.random.rand(sum(ind),K)
    output = np.zeros((M,K))
    # rows 1 and 3 of output receive full[0] and full[1]
    output[ind.astype(bool)] = full
    print(output)

Output (the non-zero values are random and will differ between runs):

    [[ 0.          0.          0.          0.        ]
     [ 0.32434109  0.11970721  0.57156261  0.35839647]
     [ 0.          0.          0.          0.        ]
     [ 0.66038644  0.00725318  0.68902177  0.77145089]]
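Since the question mentions repeating this many times with changing ind and full, one possible variation is to reuse a preallocated buffer and index with integer row positions from `np.flatnonzero`. This is only a sketch: `fill_rows` is a hypothetical helper name, not something from the answer, and whether it is actually faster than allocating a fresh array each time would have to be measured on your own sizes.

```python
import numpy as np

def fill_rows(ind, full, out=None):
    """Hypothetical helper: place the rows of `full` at the positions where ind != 0."""
    rows = np.flatnonzero(ind)  # integer positions of the non-zero entries of ind
    if out is None:
        out = np.zeros((ind.shape[0], full.shape[1]), dtype=full.dtype)
    else:
        out.fill(0)  # clear whatever a previous call left behind
    out[rows] = full
    return out

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
M, K = 1000, 1000
ind = (rng.random(M) < 0.1).astype(int)  # roughly 10% ones, as in the question
full = rng.random((int(ind.sum()), K))

buf = np.empty((M, K))  # allocated once, reused across calls
output = fill_rows(ind, full, out=buf)

print(output is buf)                                    # True
print(np.array_equal(output[ind.astype(bool)], full))   # True
```

Passing `out=buf` writes into the same memory on every call, so copy the result if you need to keep an earlier output around.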
