Optimize numbers so that the sum of the rounded numbers is equal to zero

80 Views Asked by At

Within python3, if I have a dictionary containing for example:

d = {'C1': 0.68759, 'C2': -0.21432, 'H1': 0.49062, 'H2': -0.13267, 'H3': 0.08092, 'O1': -0.8604}

And I have the equation:

n1*d['C1'] + n2*d['H2'] + n3*d['C2'] + n4*d['H3'] + n5*d['O1'] + n6*d['H1'] = small_number

that I want to equal zero when rounding each d[key] value to 5 digits, for some scalars n1, n2, n3... that are positive integers. I also want to limit the amount that each dictionary value changes, say only change each value by a max of +- 0.0005. In this specific example: n1, n2, n3, n4, n6 = 2, 2, 2, 4, 4, 2

Is there an easy, generalizable way to do this? By generalizable I mean the dictionary d could contain more key : value pairs, and the scalars n1, n2,... may be different (but always positive integers) as well. The length of dictionary d and number of scalars n will always be equal. It is fair to assume that in each case the value of small_number is on the order of 1E-5. If necessary I could possibly round to +- 1 more than 5 digits

I have thought about trying to subtract the nonzero portion of the equation from one specific key in the dictionary d, but I run into problems if for example the equation sums to 1E-5 and like in my example there are no scalars = 1. I could also use scipy.optimize.minimize and somehow define an objective function on a per-case basis as a function of the dictionary d and scalar values n, however I am not sure how to do this properly.

The scalar values n should not change

2

There are 2 best solutions below

0
Reinderien On

What is a "good solution"? This isn't as obvious as it seems. Goodness could be low error, or low coefficients, or a weighted mix of both. This demonstrates a solution where scipy tries to minimize error while treating coefficients as non-cost bounded decision variables.

import numpy as np
from scipy.optimize import milp, Bounds, LinearConstraint


def solve(
    d: dict[str, float],
    small: float,
    nmax: int = 20,
) -> list[int]:
    N = len(d)

    # Variables: n (positive integers), absolute error (continuous)

    # Minimize absolute error.
    c = np.concatenate((np.zeros(N), (1,)))

    # n are integral, error is continuous
    integrality = np.ones(N+1, dtype=np.uint8)
    integrality[-1] = 0

    # 1 <= n <= some reasonable max
    bounds = Bounds(
        lb=np.concatenate((np.ones(N), (-np.inf,))),
        ub=np.concatenate((np.full(shape=N, fill_value=nmax), (+np.inf,))),
    )

    d_array = np.array(tuple(d.values()))
    #  n.d - small <= error   -d, error >= -small
    # -n.d + small <= error   +d, error >= +small
    constraints = LinearConstraint(
        A=np.block([
            [-d_array, np.ones(1)],
            [+d_array, np.ones(1)],
        ]),
        lb=[-small, small], ub=[np.inf, np.inf],
    )
    result = milp(
        c=c, integrality=integrality, bounds=bounds, constraints=constraints,
    )

    if not result.success:
        raise ValueError(result.message)
    n, (error,) = np.split(result.x, (-1,))
    return n, error


def demo() -> None:
    small = 1e-5
    d = {
        'C1': 0.68759, 'C2': -0.21432,
        'H1': 0.49062, 'H2': -0.13267,
        'H3': 0.08092, 'O1': -0.86040,
    }
    n, error = solve(d=d, small=small)
    print(
        ' + '.join(
            f'{xn:.0f}*{xd:.5f}'
            for xn, xd in zip(n, d.values())
        ),
        f'= {small} + {error}'
    )


if __name__ == '__main__':
    demo()
1*0.68759 + 3*-0.21432 + 12*0.49062 + 18*-0.13267 + 20*0.08092 + 6*-0.86040  = 1e-05 + -0.0
0
scaldedolive On

I have found a solution that works for me. If the first digit of the leftover sum small_number is equal to any of the coefficients n, I can just subtract the small_number from the d[key] that corresponds to that coefficient.

In the case where the first digit of small_number is not equal to any of the coefficients n, I can take the smallest coefficient n and divide small_number by that coefficient, and subtract this from the appropriate value in the dictionary, while rounding to the necessary number of digits. It is a little messy, but the code I found to work is:

import numpy as np

# Atom types and charges of each atom
d = {'C1': 0.68759, 'C2': -0.21432, 'H1': 0.49062, 'H2': -0.13267, 'H3': 0.08092, 'O1': -0.8604}
# Number of each atom occuring in molecule
n = (2, 2, 2, 4, 4, 2)

# calculate leftover charge
def calc_sum(n, d):
    repeats = np.repeat(np.array(list(d.values())), n)
    return round(np.sum(repeats), 5)

# Get first digit of leftover charge
def get_divisor(final_sum):
    final_sum = str(final_sum)
    final_sum = final_sum.replace("-", '')
    final_sum = final_sum.replace("0", '')
    final_sum = final_sum.replace(".", '')
    return int(final_sum[0])

# Get the power of a number in scientific notation; assumed to be less than 1E-04
def get_power(number):
    number = str(number)
    return(int(number[-2:].replace('0', '')))


final_sum = calc_sum(n, d)

# quit if leftover charge is too large or too small
if abs(final_sum) > 1E-04 or abs(final_sum) < 1E-05:
    exit()


divisor = get_divisor(final_sum)

# if the divisor is equal to the number of one atom type, then subtract the leftover charge from
# the first atom type found with that number of atoms
if divisor in n:
    divisor_index = n.index(divisor)
    divisor_key = list(d)[divisor_index]
    new_value = round(d[divisor_key] - final_sum/divisor, 5)
    d[divisor_key] = new_value

# if the divisor is not equal to any items in n, divide leftover charge
# by the smallest number n, subtract this value from the d[key] that corresponds to the 
# smallest number n, and round to appropriate digits so that the linear combination is zero
else:
    min_value_n = min(n)
    min_value_index = n.index(min_value_n)
    to_subtract = final_sum/min_value_n
    round_to = get_power(to_subtract)
    divisor_key = list(d)[min_value_index]
    new_value = round(d[divisor_key] - to_subtract, round_to)
    d[divisor_key] = new_value

print("Final rounded linear combination: ", calc_sum(n, d)) # Will be zero if everything is correct
print(d)

I believe this will work for any dictionary d and linear combination values n as long as len(n) = len(d), and the linear combination gives a value that is in the range of 1E-04 and 1E-05