Algorithm to find some rows from a matrix, whose sum is equal to a given row

428 Views Asked by At

For example, here is a matrix:

[1, 0, 0, 0],
[1, 1, 0, 0],
[1, 0, 1, 0],
[1, 1, 1, 0],
[1, 1, 1, 1],

I want to find some rows, whose sum is equal to [4, 3, 2, 1]. The expected answer is rows: {0,1,3,4}. Because:

[1, 0, 0, 0] + [1, 1, 0, 0] + [1, 1, 1, 0] + [1, 1, 1, 1] = [4, 3, 2, 1]

Is there some famous or related algrithoms to resolve the problem?

Thank @sascha and @N. Wouda for the comments. To clarify it, here I provide some more details.

In my problem, the matrix will have about 50 rows and 25 columns. But echo row will just have less than 4 elements (other is zero). And every solution has 8 rows.

If I try all combinations, c(8, 50) is about 0.55 billion times of attempt. Too complex. So I want to find a more effective algrithom.

2

There are 2 best solutions below

4
AirSquid On BEST ANSWER

If you want to make the jump to using a solver, I'd recommend it. This is a pretty straightforward Integer Program. Below solutions use python, python's pyomo math programming package to formulate the problem, and COIN OR's cbc solver for Integer Programs and Mixed Integer Programs, which needs to be installed separately (freeware) available: https://www.coin-or.org/downloading/

Here is the an example with your data followed by an example with 100,000 rows. The example above solves instantly, the 100,000 row example takes about 2 seconds on my machine.

# row selection Integer Program
import pyomo.environ as pyo

data1 = [   [1, 0, 0, 0],
            [1, 1, 0, 0],
            [1, 0, 1, 0],
            [1, 1, 1, 0],
            [1, 1, 1, 1],]


data_dict = {(i, j): data1[i][j] for i in range(len(data1)) for j in range(len(data1[0]))}

model = pyo.ConcreteModel()

# sets
model.I = pyo.Set(initialize=range(len(data1)))     # a simple row index
model.J = pyo.Set(initialize=range(len(data1[0])))  # a simple column index

# parameters
model.matrix = pyo.Param(model.I , model.J, initialize=data_dict)   # hold the sparse matrix of values
magic_sum = [4, 3, 2, 1 ]

# variables
model.row_select = pyo.Var(model.I, domain=pyo.Boolean) # row selection variable

# constraints
# ensure the columnar sum is at least the magic sum for all j
def min_sum(model, j):
    return sum(model.row_select[i] * model.matrix[(i, j)] for i in model.I) >= magic_sum[j] 
model.c1 = pyo.Constraint(model.J, rule=min_sum)

# objective function
# minimze the overage
def objective(model):
    delta = 0
    for j in model.J:
        delta += sum(model.row_select[i] * model.matrix[i, j] for i in model.I) - magic_sum[j] 

    return delta

model.OBJ = pyo.Objective(rule=objective)


model.pprint()  # verify everything

solver = pyo.SolverFactory('cbc')  # need to have cbc solver installed

result = solver.solve(model)

result.write()  # solver details

model.row_select.display() # output

Output:

# ----------------------------------------------------------
#   Solver Information
# ----------------------------------------------------------
Solver: 
- Status: ok
  User time: -1.0
  System time: 0.0
  Wallclock time: 0.0
  Termination condition: optimal
  Termination message: Model was solved to optimality (subject to tolerances), and an optimal solution is available.
  Statistics: 
    Branch and bound: 
      Number of bounded subproblems: 0
      Number of created subproblems: 0
    Black box: 
      Number of iterations: 0
  Error rc: 0
  Time: 0.01792597770690918
# ----------------------------------------------------------
#   Solution Information
# ----------------------------------------------------------
Solution: 
- number of solutions: 0
  number of solutions displayed: 0
row_select : Size=5, Index=I
    Key : Lower : Value : Upper : Fixed : Stale : Domain
      0 :     0 :   1.0 :     1 : False : False : Boolean
      1 :     0 :   1.0 :     1 : False : False : Boolean
      2 :     0 :   0.0 :     1 : False : False : Boolean
      3 :     0 :   1.0 :     1 : False : False : Boolean
      4 :     0 :   1.0 :     1 : False : False : Boolean

A more stressful rendition with 100,000 rows:

# row selection Integer Program stress test
import pyomo.environ as pyo
import numpy as np

# make a large matrix 100,000 x 8
data1 = np.random.randint(0, 1000, size=(100_000, 8))
# inject "the right answer into 3 rows"
data1[42602]    = [8, 0, 0, 0, 0, 0, 0, 0 ]
data1[3]        = [0, 0, 0, 0, 4, 3, 2, 1 ]
data1[10986]    = [0, 7, 6, 5, 0, 0, 0, 0 ]

data_dict = {(i, j): data1[i][j] for i in range(len(data1)) for j in range(len(data1[0]))}

model = pyo.ConcreteModel()

# sets
model.I = pyo.Set(initialize=range(len(data1)))     # a simple row index
model.J = pyo.Set(initialize=range(len(data1[0])))  # a simple column index

# parameters
model.matrix = pyo.Param(model.I , model.J, initialize=data_dict)   # hold the sparse matrix of values
magic_sum = [8, 7, 6, 5, 4, 3, 2, 1 ]

# variables
model.row_select = pyo.Var(model.I, domain=pyo.Boolean) # row selection variable

# constraints
# ensure the columnar sum is at least the magic sum for all j
def min_sum(model, j):
    return sum(model.row_select[i] * model.matrix[(i, j)] for i in model.I) >= magic_sum[j] 
model.c1 = pyo.Constraint(model.J, rule=min_sum)

# objective function
# minimze the overage
def objective(model):
    delta = 0
    for j in model.J:
        delta += sum(model.row_select[i] * model.matrix[i, j] for i in model.I) - magic_sum[j] 

    return delta

model.OBJ = pyo.Objective(rule=objective)

solver = pyo.SolverFactory('cbc')

result = solver.solve(model)
result.write()


print('\n\n======== row selections =======')
for i in model.I:
    if model.row_select[i].value > 0:
        print (f'row {i} selected')

Output:

# ----------------------------------------------------------
#   Solver Information
# ----------------------------------------------------------
Solver: 
- Status: ok
  User time: -1.0
  System time: 2.18
  Wallclock time: 2.61
  Termination condition: optimal
  Termination message: Model was solved to optimality (subject to tolerances), and an optimal solution is available.
  Statistics: 
    Branch and bound: 
      Number of bounded subproblems: 0
      Number of created subproblems: 0
    Black box: 
      Number of iterations: 0
  Error rc: 0
  Time: 2.800779104232788
# ----------------------------------------------------------
#   Solution Information
# ----------------------------------------------------------
Solution: 
- number of solutions: 0
  number of solutions displayed: 0


======== row selections =======
row 3 selected
row 10986 selected
row 42602 selected
1
SirPilan On

This one picks and not picks an element (recursivly). As soon as the tree is impossible to solve (no elements left or any target value negative) it will return false. In case the sum of the target is 0 a solution is found and returned in form of the picked elements.

Feel free to add time and memory complexity in the comments. Worst case should be 2^(n+1)

Please let me know how it performs on your 8/50 data.

const elements = [
  [1, 0, 0, 0],
  [1, 1, 0, 0],
  [1, 0, 1, 0],
  [1, 1, 1, 0],
  [1, 1, 1, 1]
];

const target = [4, 3, 2, 1];

let iterations = 0;

console.log(iter(elements, target, [], 0));
console.log(`Iterations: ${iterations}`);

function iter(elements, target, picked, index) {
  iterations++;
  const sum = target.reduce(function(element, sum) {
    return sum + element;
  });

  if (sum === 0) return picked;
  if (elements.length === 0) return false;

  const result = iter(
    removeElement(elements, 0),
    target,
    picked,
    index + 1
  );

  if (result !== false) return result;

  const newTarget = matrixSubtract(target, elements[0]);
  const hasNegatives = newTarget.some(function(element) {
    return element < 0;
  });

  if (hasNegatives) return false;

  return iter(
    removeElement(elements, 0),
    newTarget,
    picked.concat(index),
    index + 1
  );
}

function removeElement(target, i) {
  return target.slice(0, i).concat(target.slice(i + 1));
}

function matrixSubtract(minuend, subtrahend) {
  let i = 0;
  return minuend.map(function(element) {
    return minuend[i] - subtrahend[i++]
  });
}