I have two questions about TensorFlow 2.0, with the focus on how TensorFlow handles combined conditional tests in its operations graph.
The task: cut up a volume of data points into blocks, and store the indices of the samples that belong to each volume (not the samples themselves).
My initial approach: loop over all elements and collect the indices of the data points that are inside the 'bounding volume'. This was pretty slow, no matter how I reordered the comparisons on the coordinates.
# X.shape == [elements, features]
# xmin.shape == xmax.shape == [features]
def getIndices(X, xmin, xmax):
    i = 0
    indices = tf.zeros([0], dtype=tf.int32)  # start with an empty index tensor
    for x in X:
        if x[0] > xmin[0]:
            if x[1] > xmin[1]:
                if x[2] <= xmax[2]:
                    # ...and so on...
                    indices = tf.concat([indices, [i]], axis=0)
        i = i + 1
    return indices
I then came up with the idea of producing boolean tensors and logically 'and'-ing them to get the indices of the elements I need. A whole lot faster, as shown in the next sample:
# X.shape == [elements, features]
# xmin.shape == xmax.shape == [features]
def getIndices(X, xmin, xmax):
    # example of 3 different conditions to clip to (a part of) the bounding volume
    # X is the data and xmin and xmax are tensors containing the bounding volume
    c0 = (X[:, 0] > xmin[0])
    c1 = (X[:, 1] > xmin[1])   # processing all elements
    c2 = (X[:, 2] <= xmax[2])  # idem
    # ... there could be many more conditions, you get the idea...
    # tf.where on a 1-D mask returns shape [N, 1]; [:, 0] flattens it to plain indices
    indices = tf.where(tf.math.logical_and(c0, tf.math.logical_and(c1, c2)))[:, 0]
    return indices
# ...
indices = getIndices(X, xmin, xmax)
trimmedX = tf.gather(X, indices)
This code produces the correct result, but I wonder if it is optimal.
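As an aside on the overall task (one index tensor per block, samples stay in X), a driver loop could look roughly like the sketch below; blocks is an assumed list of (xmin, xmax) pairs, not something defined above:

# hypothetical per-block loop: collect one index tensor per block
block_indices = []
for xmin, xmax in blocks:   # 'blocks' is an assumed list of (xmin, xmax) bound tensors
    block_indices.append(getIndices(X, xmin, xmax))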
The first question is about scheduling:
Will the TensorFlow graph that holds the operations cull (blocks of) conditional tests if it knows some (blocks of) elements already tested False? Because of the logical_and combining the logical conditionals, no subsequent conditional test on these elements will ever yield a True.
Indeed, in the above example c1 and c2 are asking questions about elements that c0 may already have excluded from the set. Especially when you have a high number of elements to test, this could be a waste of time, even on parallel hardware platforms.
So, what if we cascade the tests based on the results of a previous test? Although it looks like this solves the problem, the solution is incorrect, because the final indices tensor will refer to a subset _X, not to the total set X:
# X.shape == [elements, features]
# xmin.shape == xmax.shape == [features]
def getIndices(X, xmin, xmax):
    c0 = (X[:, 0] > xmin[0])
    indices = tf.where(c0)[:, 0]
    _X = tf.gather(X, indices)
    c1 = (_X[:, 1] > xmin[1])   # processing only trimmed elements
    indices = tf.where(c1)[:, 0]
    _X = tf.gather(_X, indices)
    c2 = (_X[:, 2] <= xmax[2])  # idem
    indices = tf.where(c2)[:, 0]
    return indices
...
indices = getIndices(X, xmin, xmax)
trimmedX = tf.gather(X, indices) # wrong result: indices refer to a trimmed subset, not to X
I could of course 'solve' this by simply expanding X, so that each element also includes its own index in the original list, and then proceed as before.
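For completeness, a minimal sketch of that bookkeeping, with one small variation: instead of appending the index as an extra feature column, a separate tensor of original row indices is carried through each trimming step (the name getIndicesCascaded is made up for illustration):

import tensorflow as tf

# sketch: carry the original row indices alongside the data through every trimming step
def getIndicesCascaded(X, xmin, xmax):
    orig = tf.range(tf.shape(X)[0], dtype=tf.int64)  # index of every element in the original X

    keep = tf.where(X[:, 0] > xmin[0])[:, 0]
    X, orig = tf.gather(X, keep), tf.gather(orig, keep)

    keep = tf.where(X[:, 1] > xmin[1])[:, 0]
    X, orig = tf.gather(X, keep), tf.gather(orig, keep)

    keep = tf.where(X[:, 2] <= xmax[2])[:, 0]
    return tf.gather(orig, keep)   # indices valid in the original X

With this, tf.gather(X, getIndicesCascaded(X, xmin, xmax)) picks rows of the original X, but it spends extra gathers and memory on the index tensor, which is exactly the overhead the second question is about.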
So my second question is about functionality:
Does tf have a method to make the GPU/tensor infrastructure provide the bookkeeping without spending memory / time on this seemingly simple problem?
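For comparison, a minimal sketch of what such a single-pass, vectorized bound test could look like; the names get_indices, minimum and maximum are placeholders for illustration, not a built-in TensorFlow method:

import tensorflow as tf

# minimum.shape == maximum.shape == [features], X.shape == [elements, features]
def get_indices(X, minimum, maximum):
    inside = tf.math.logical_and(X > minimum, X < maximum)     # [elements, features] booleans
    return tf.where(tf.math.reduce_all(inside, axis=1))[:, 0]  # combine over the feature axis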
This will return the indices of all elements larger than minimum and less than maximum, when both of these have the same number of features as X.