I need to evaluate my model's performance with limited training data, by randomly selecting a fraction p of the original training set. Assume p is 0.2 in this case. Here are the initial lines of code:
```python
import random

p = p * 100
# data.shape = (100, 50, 50, 3); cast to int so it can be used as a count
data_samples = int(data.shape[0] * p / 100)

# for randomly selecting data
random.seed(1234)
filter_indices = [random.randrange(0, data.shape[0]) for _ in range(data_samples)]
```
It's giving me filter indices ranging randomly between 0 and the total data size.
Now I want to select from `data` the samples at the positions in `filter_indices`, keeping all remaining dimensions. How can I do that effectively and efficiently?
You can use NumPy's integer array indexing to use your generated list of indices directly as an index. When used on its own, the trailing dimensions are automatically tacked on to the result. Smaller example:
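A minimal sketch of such an example (the array contents and shapes here are chosen for illustration; which two indices get drawn depends on the seed):

```python
import numpy as np

np.random.seed(1234)
data = np.arange(90).reshape(10, 3, 3)   # 10 subarrays, each of shape (3, 3)

# draw 2 distinct subarray indices at random
indices = np.random.choice(data.shape[0], size=2, replace=False)

# integer array indexing: the trailing (3, 3) dimensions come along automatically
out = data[indices]
print(out.shape)   # (2, 3, 3)
```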
Note that above I've used NumPy's built-in random module to streamline your code a little bit via `np.random.choice`. Results: `out` is exactly the two `(3, 3)` subarrays in `data` at indices 5 and 0, so the result has shape `(2, 3, 3)` instead of `(10, 3, 3)`.
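Applied to your setup, a sketch might look like this (using random placeholder data of your stated shape, and `replace=False` so the same sample is never picked twice):

```python
import numpy as np

p = 0.2
data = np.random.rand(100, 50, 50, 3)   # stand-in for your real training data
n_samples = int(data.shape[0] * p)      # 20 samples for p = 0.2

np.random.seed(1234)
filter_indices = np.random.choice(data.shape[0], size=n_samples, replace=False)

# all trailing dimensions (50, 50, 3) are kept automatically
subset = data[filter_indices]
print(subset.shape)   # (20, 50, 50, 3)
```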