Merge similar objects together based on object elements is O(n²). How to make it simpler?

294 Views Asked by At

Problem Description

We have a vector<A> v containing for 4 objects. Each object has the following members x, y, z and w. Two objects are considered equal if the have the same x and y. In that case we merge the objects: we merge the vector w and we change the value of z if and only if the value of object that we want to check if it exists is different from zero. Else, we consider that it's a new object.

In the following source code, I was able to implement the algorithm, but the main issue that it is O(n²) (because I am looping over each object of the vector then using find_if to check if we have a similar object or not in the merged vector).

Question

Is it possible to make it simpler (that is, less time complexity)? I can't find a way.


Source Code

#include <iostream>
#include <vector>
#include <algorithm>

class A{
    public:
      int x, y, z;
      std::vector<int> w;
};

int main() {
    A a1, a2, a3, a4;
    a1.x = 1; a1.y =2; a1.z = 3; a1.w = {1,2,3,4,5};
    a2.x = 4; a2.y =5; a2.z = 6; a2.w = {6,7,8,9};
    a3.x = 13;a3.y =14; a3.z = 14; a3.w = {10,11,12};
    a4.x = 1; a4.y =2; a4.z = 0;a4.w = {44,45,46,47,48};
   
     std::vector<A> v = {a1, a2, a3, a4};
     std::vector<A> merged;
     
     /* If 2 objects have the same x and y then merge objects */
     for(const A&a:v){
         auto it = std::find_if(merged.begin(),merged.end(),[&](const A&ma){
             /*2 objects are the same if they have the same x and y*/
             return a.x == ma.x and a.y == ma.y;
         });
         
         /* if 2 objects have the same x and y then merge*/
         if(it != merged.end()){
             /* Replace z in the merged vector only if a.z is different from 0*/
             if(a.z != 0){
                 it->z = a.z;
             }
             /* Merge vectors*/
            std::vector<int> mergedws;
            std::set_union(a.w.begin(),a.w.end(),it->w.begin(),it->w.end(), std::back_inserter(mergedws));
            it->w = mergedws;
         } else {
             /*We consider that a is a new object, since we couldn't find a similar object in the merged vector*/
             merged.push_back(a);
         }
     }
     /* merged vector should have 3 objects because a1 and a4 the same*/
     std::cout <<"Number of Objects is: "<< merged.size() << std::endl;
     for(const auto&m:merged){
         std::cout <<"Element "<< m.x <<", "<< m.y <<","<<m.z << std::endl;
     }
      
    return 0;
}
2

There are 2 best solutions below

0
Caleth On BEST ANSWER

You can do it in O(NlogN * MlogM) if you sort the input, and then do a linear pass to merge.

N is the length of v, and M is the length of the A::ws.

bool compare(const A & lhs, const A & rhs) {
    return std::tie(lhs.x, lhs.y) < std::tie(rhs.x, rhs.y);
}

std::sort(v.begin(), v.end(), compare);
for (auto first = v.begin(), last = {}; it != v.end(); it = last) {
    A result = *first;
    last = std::upper_bound(first++, v.end(), result, compare);
    for (; first != last; ++first) {
        if (first->z) {
            result.z = first->z;
        }

        // this is an in-place set_union
        std::merge(result.w.begin(), result.w.end(), first->w.begin(), first->w.end());
        auto unique_end = std::unique(result.w.begin(), result.w.end());
        result.w.erase(unique_end, result.w.end());
    }
    merged.push_back(result);
}
1
Marek R On

The best way is to use std::unordered_map this will allow to find matching item in constant time, so final time complexity will be O(n):

class A {
public:
    int x, y, z;
    std::vector<int> w;
};

struct Point2d {
    int x, y;

    Point2d(int x, int y)
        : x { x }
        , y { y }
    {
    }
    Point2d(const A& a)
        : x { a.x }
        , y { a.y }
    {
    }
};

bool operator==(const Point2d& l, const Point2d& r)
{
    return l.x == r.x && l.y == r.y;
}

template <>
struct std::hash<Point2d> {
    size_t operator()(const Point2d& p) const
    {
        std::hash<int> sub_hash {};
        return (sub_hash(p.x) * 16777619) ^ sub_hash(p.y);
    }
};

template <typename T, typename F>
std::unordered_map<Point2d, A> merge_collection(const T& collection, F f)
{
    std::unordered_map<Point2d, A> r;
    for (const auto& item : collection) {
        f(r[item], item);
    }

    return r;
}

void merge_a(A& dest, const A& toMerge)
{
    std::vector<int> w;
    w.reserve(dest.w.size() + toMerge.w.size());

    std::merge(dest.w.begin(), dest.w.end(), toMerge.w.begin(), toMerge.w.end(), std::back_inserter(w));
    dest = {dest.x, dest.y, dest.z, std::move(w)};
}

template <typename T>
std::unordered_map<Point2d, A> merge_collection(const T& collection)
{
    return merge_collection(collection, merge_a);
}

https://godbolt.org/z/Wz8vbz4h1