How to implement a general pointer type in C++


In C, a data pointer can be assigned to a void pointer and later cast back to its original type, and the original pointer is recovered. The language standard guarantees that this round trip loses no information. On most platforms (though not necessarily all) this means a void pointer has the same size as any data pointer. One can therefore use void pointers as general pointers to heterogeneous types, while the void pointers themselves have a uniform size and representation. For example, one can build an array of void pointers whose elements point to dynamically allocated objects of different types, which makes certain things convenient.

My question is: how does one implement something similar in C++, a general pointer type that meets the following requirements? (Assume g_pointer is the class name.)

  • It can be constructed from any pointer type, so one can write code like

    g_pointer g_ptr = g_pointer(new T());
    
  • Recover the original pointer

    T* ptr = g_ptr.recover(), or
    auto* ptr = g_ptr.recover()
    
  • Update: According to some comments, the above cannot be done in C++. In that case, something like

    recover<Type>(g_ptr)
    

    should suffice, throwing an exception if Type is not compatible.

  • g_pointer can be stored in a std::vector or a plain array, which basically means that

    sizeof(g_pointer) // a predetermined constant number,
    

    (Update: This is always true, provided such a class can be implemented correctly. Thanks for pointing that out.)
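
For reference, the void-pointer idiom described at the start of the question carries over to C++ almost unchanged, except that the conversion back needs an explicit static_cast. The following is only a minimal sketch; Point and round_trip_demo are made-up names, and nothing in it checks that the caller asks for the right type.

```cpp
#include <cassert>
#include <vector>

struct Point { int x; int y; };   // hypothetical example type

// Store pointers to unrelated types in one container of void*.
// The caller must remember which slot holds which type; nothing checks it.
inline int round_trip_demo() {
    int   n = 42;
    Point p{3, 4};

    std::vector<void*> slots;
    slots.push_back(&n);   // implicit conversion int*   -> void*
    slots.push_back(&p);   // implicit conversion Point* -> void*

    // Converting back to the original pointer type yields the original pointer.
    int*   pn = static_cast<int*>(slots[0]);
    Point* pp = static_cast<Point*>(slots[1]);
    assert(pn == &n && pp == &p);

    return *pn + pp->x + pp->y;   // 42 + 3 + 4
}
```

Casting slots[0] to Point* instead would compile just as well, which is exactly the gap a checked g_pointer is meant to close.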

I have just found boost::any. A peek at its introduction suggests that it may be what I want, although that might not be the case. Anyone who is familiar with boost::any is welcome to comment.

Update: (response to some comments)

  • An object of type g_pointer should be aware of the underlying type of the object it points to. Thus the recover method should always return a pointer of that type.
  • A general pointer type, meaning a reference to ANY object, is IMHO a reasonable thing to ask of any language that supports the object-oriented paradigm.

Update: Thanks @Caleth, std::any is great.
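
As a rough illustration of how std::any could cover the requirements listed above (assuming C++17; any_pointer_demo is a made-up name), the pointers themselves can be stored in std::any objects. Recovery goes through std::any_cast with the exact pointer type, and a mismatch throws std::bad_any_cast.

```cpp
#include <any>
#include <cassert>
#include <string>
#include <vector>

// Returns true if a mismatched recovery is rejected with std::bad_any_cast.
inline bool any_pointer_demo() {
    int         n = 7;
    std::string s = "hello";

    // sizeof(std::any) is a fixed constant, so it fits in a vector or array.
    std::vector<std::any> ptrs;
    ptrs.emplace_back(&n);   // holds an int*
    ptrs.emplace_back(&s);   // holds a std::string*

    // Recover the original pointer by naming its exact type.
    int* pn = std::any_cast<int*>(ptrs[0]);
    assert(pn == &n);
    std::string* ps = std::any_cast<std::string*>(ptrs[1]);
    assert(ps == &s);

    try {
        (void)std::any_cast<double*>(ptrs[0]);   // wrong type requested
    } catch (const std::bad_any_cast&) {
        return true;
    }
    return false;
}
```

Note that std::any compares the full stored type, so an int* stored this way cannot be recovered as const int*.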


There are 2 best solutions below

xskxzr On

As originally stated, this is impossible in C++. The type of the expression g_ptr.recover() is fixed at compile time, so it cannot depend on the underlying type of the pointee, which is only known at run time.


If you can tolerate expressions like g_ptr.recover<T>(), you can implement g_pointer by wrapping a void* and a const std::type_info& that stores the information of the actual type the pointer points to, e.g.

#include <typeinfo>   // std::type_info, std::bad_cast

class g_pointer {
public:
    template <class T>
    constexpr g_pointer(T *data) noexcept : _data(data), _object_type(typeid(T)) {}

    template <class T>
    T* recover() const {
         if (typeid(T) == _object_type) return static_cast<T*>(_data);
         else throw std::bad_cast{};
    }
private:
    void *_data;
    const std::type_info &_object_type;
};

Note this g_pointer behaves like a raw pointer rather than a smart pointer, which means it does not own the object it points to.
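
One side effect of the reference member is worth knowing about: a class with a reference data member has no implicit copy assignment, so operations such as std::vector::erase or std::sort would not compile for it. A possible variant, sketched here under the hypothetical name g_pointer_a, keeps a pointer to the std::type_info object instead (those objects have static storage duration, so the address stays valid).

```cpp
#include <cassert>
#include <typeinfo>
#include <vector>

// Hypothetical variant of g_pointer: the type is remembered through a pointer
// to std::type_info, which keeps the class copy-assignable.
class g_pointer_a {
public:
    template <class T>
    g_pointer_a(T* data) noexcept : _data(data), _object_type(&typeid(T)) {}

    template <class T>
    T* recover() const {
        if (typeid(T) != *_object_type) throw std::bad_cast{};
        return static_cast<T*>(_data);
    }
private:
    void* _data;
    const std::type_info* _object_type;
};

inline int erase_demo() {
    int a = 1; double b = 2.5; int c = 3;

    std::vector<g_pointer_a> v;
    v.push_back(&a);
    v.push_back(&b);
    v.push_back(&c);

    v.erase(v.begin() + 1);   // shifts elements by assignment
    assert(v.size() == 2);

    return *v[0].recover<int>() + *v[1].recover<int>();   // 1 + 3
}
```

Like the original, this sketch does not own the pointee and does not handle pointers to const.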


There is still a defect in the implementation above: const T* cannot be implicitly converted to void*, so the general pointer cannot hold a const T*. To handle const-qualifiers, you can change the type of _data to const void* and use const_cast when recovering. In addition, recover should refuse to return a pointer to non-const from a g_pointer that holds a pointer to a const object. However, the typeid operator ignores top-level const-qualifiers, so an additional data member is needed to record whether the pointer originally pointed to a const object.

#include <type_traits>   // std::is_const_v
#include <typeinfo>      // std::type_info, std::bad_cast

class g_pointer {
public:
    template <class T>
    constexpr g_pointer(T *data) noexcept : _data(data), 
                                            _object_type(typeid(T)),
                                            _is_const(std::is_const_v<T>) 
                                         // ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ change here
    {
    }

    template <class T>
    T* recover() const {
         if (
             typeid(T) != _object_type ||
             (_is_const && !std::is_const_v<T>) // try to obtain T* while const T* is held
         ) {
             throw std::bad_cast{};
         }
         else return static_cast<T*>(const_cast<void*>(_data));
                                  // ^^^^^^^^^^^^^^^^^ change here
    }
private:
    const void *_data;
 // ^^^^^ change here
    const std::type_info &_object_type;
    bool _is_const; // <-- record whether the pointer points to const T
};
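
The reason for the extra _is_const member can be checked in isolation. The small sketch below (typeid_const_demo is a made-up name) shows that typeid discards a top-level const on T, while the compile-time trait std::is_const_v still sees it.

```cpp
#include <cassert>
#include <type_traits>
#include <typeinfo>

inline bool typeid_const_demo() {
    // Top-level const is ignored by typeid, so T and const T look identical.
    assert(typeid(const int) == typeid(int));

    // Const below the top level is part of the type and is preserved.
    assert(typeid(const int*) != typeid(int*));

    // The trait distinguishes the two at compile time, which is what the
    // constructor records in _is_const.
    static_assert(std::is_const_v<const int> && !std::is_const_v<int>);
    return true;
}
```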
5
txtechhelp On

Nothing stops you from using C constructs such as void* in C++. It is generally frowned upon, however, because it opens the door to various bugs if the code is used in unintended ways or if the consequences of doing so are not fully documented.

That being said, you're essentially asking to wrap a void* in a class that can then be used in a std::vector and then accessed later.

Here's some code from a framework I wrote some time ago to sort of achieve a similar effect:

generic_ptr.hpp

#include <cstddef>    // std::size_t
#include <exception>
#include <typeinfo>   // std::bad_cast
#include <map>

namespace so {
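    // inline (C++17) so that including this header from several
    // translation units does not define the map more than once
    inline std::map<std::size_t, std::size_t> type_sizes;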

    template < typename T >
    std::size_t type_id()
    {
        static char tid;
        std::size_t sz = reinterpret_cast<std::size_t>(&tid);
        so::type_sizes[sz] = sizeof(T);
        return sz;
    }
    
    template < typename T >
    inline std::size_t type_id(const T& t)
    {
        return so::type_id<T>();
    }
    
    template < typename T >
    inline std::size_t type_id(const T *const t)
    {
        return so::type_id<T>();
    }
    
    template < typename T, typename C >
    inline bool type_of()
    {
        return so::type_id<T>() == so::type_id<C>();
    }
    
    template < typename T, typename C >
    inline bool type_of(const C& c)
    {
        return so::type_of<T, C>();
    }
    
    template < typename T, typename C >
    inline bool type_of(const C *const c)
    {
        return so::type_of<T, C>();
    }
    
    template < typename T, typename C >
    inline bool type_of(const T& t, const C& c)
    {
        return so::type_of<T, C>();
    }
    
    template < typename T, typename C >
    inline bool type_of(const T *const t, const C *const c)
    {
        return so::type_of<T, C>();
    }

    class generic_ptr
    {
        public:
            generic_ptr() : m_ptr(0), m_id(0) { }

            template < typename T >
            generic_ptr(T *const obj) : 
                m_ptr(obj), m_id(so::type_id<T>())
            {
            }

            generic_ptr(const generic_ptr &o) : 
                m_ptr(o.m_ptr), m_id(o.m_id)
            {
            }
            
            ~generic_ptr()
            {
                this->invalidate();
            }

            static generic_ptr null()
            {
                return generic_ptr();
            }
            
            void invalidate()
            {
                this->m_ptr = 0;
                this->m_id = 0;
            }

            template < typename T >
            bool is_type() const
            {
                return this->m_id == so::type_id<T>();
            }

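            // Deletes the pointee as a T. No type check is made here:
            // the caller must pass the type that was actually stored.
            template < typename T >
            void gc()
            {
                delete ((T*)this->m_ptr);
                this->invalidate();
            }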
            
            bool valid() const
            {
                return (this->m_ptr != 0);
            }
            
            operator bool() const
            {
                return (this->m_ptr != 0);
            }
            
            bool operator!() const
            {
                return (!operator bool());
            }
            
            generic_ptr& operator=(const generic_ptr &o)
            {
                this->m_ptr = o.m_ptr;
                this->m_id = o.m_id;
                return *this;
            }

            template < typename T >
            const generic_ptr& operator=(T *const obj)
            {
                this->m_ptr = obj;
                this->m_id = so::type_id<T>();
                return *this;
            }

            template < typename T >
            operator T *const() const
            {
                if (this->m_id != so::type_id<T>()) {
                    throw std::bad_cast();
                }
                return static_cast<T *const>(
                    const_cast<void *const>(this->m_ptr)
                );
            }

            template < typename T >
            operator const T *const() const
            {
                if ((this->m_id != so::type_id<T>()) && (this->m_id != so::type_id<const T>())) {
                    throw std::bad_cast();
                }
                return static_cast<const T *const>(this->m_ptr);
            }
            
            operator void *const() const
            {
                return const_cast<void*>(this->m_ptr);
            }
            
            operator const void *const() const
            {
                return this->m_ptr;
            }
            
            bool operator==(const generic_ptr& o) const
            {
                return (this->m_ptr == o.m_ptr && this->m_id == o.m_id);
            }
            
            bool operator!=(const generic_ptr& o) const
            {
                return !(*this == o);
            }
            
            std::size_t hash() const
            {
                return this->m_id;
            }

        private:
            const void* m_ptr;
            std::size_t m_id;
    };
}
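
The type_id helper in the header relies on one idea: every instantiation of a function template has its own function-local static, so the address of that static can act as a per-type tag. A stripped-down sketch of just that idea follows; type_tag is a made-up name, and std::uintptr_t is used here in place of std::size_t as the integer type for the address.

```cpp
#include <cassert>
#include <cstdint>

// Each instantiation of the template owns a distinct static variable,
// so its address serves as an identifier for T.
template <typename T>
std::uintptr_t type_tag() {
    static char tag;
    return reinterpret_cast<std::uintptr_t>(&tag);
}

inline bool type_tag_demo() {
    bool stable   = type_tag<int>() == type_tag<int>();      // same type, same tag
    bool distinct = type_tag<int>() != type_tag<double>();   // different types differ
    // Unlike typeid, this scheme treats int and const int as different types.
    bool cv_aware = type_tag<int>() != type_tag<const int>();
    return stable && distinct && cv_aware;
}
```

The last check is why the const conversion operator in generic_ptr compares against both type_id<T>() and type_id<const T>(). One caveat to keep in mind is that such address-based tags may not agree across shared-library boundaries on some platforms.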

Then to use it:

main.cpp

#include <iostream>
#include <vector>
#include "generic_ptr.hpp"

class MyClass {
    public:
        MyClass() : m_val1(10), m_val2(20), m_val3(10), m_val4(2) {}
        MyClass(int a, int b, int c, int d) : m_val1(a), m_val2(b), m_val3(c), m_val4(d) {}

        friend std::ostream& operator<<(std::ostream& os, const MyClass& mc)
        {
            os << mc.m_val1 << " + " <<
                mc.m_val2 << " + " <<
                mc.m_val3 << " + " <<
                mc.m_val4 << " = " <<
                (mc.m_val1 + mc.m_val2 + mc.m_val3 + mc.m_val4);
            return os;
        }
    private:
        int m_val1;
        int m_val2;
        int m_val3;
        int m_val4;
};

template < typename T >
void print(so::generic_ptr& g_ptr)
{
    std::cout << "sizeof = " << so::type_sizes[g_ptr.hash()]
            << ", val = " << *((T*)g_ptr) << std::endl;
}

template < typename T >
void cleanup(so::generic_ptr& g_ptr)
{
    delete ((T*)g_ptr);
}

int main(int argc, char* argv[])
{
    std::vector<so::generic_ptr> items;
    items.push_back(new int(10));
    items.push_back(new double(3.14159));
    items.push_back(new MyClass());
    items.push_back(new char(65));
    items.push_back(new MyClass(42,-42,65536,9999));
    items.push_back(new int(999));
    
    for (auto i : items) {
        if (i.is_type<int>()) { print<int>(i); }
        else if (i.is_type<char>()) { print<char>(i); }
        else if (i.is_type<double>()) { print<double>(i); }
        else if (i.is_type<MyClass>()) { print<MyClass>(i); }
    }
    
    int* i = (int*)items[0];
    std::cout << "i = " << *i << std::endl;
    *i = 500;
    std::cout << "i = " << *i << std::endl;

    try {
        double* d = (double*)items[0];
        std::cout << "d = " << *d << std::endl;
    } catch (std::bad_cast& ex) {
        std::cout << ex.what() << std::endl;
    }

    for (auto i : items) {
        if (i.is_type<int>()) {
            print<int>(i);
            cleanup<int>(i);
        } else if (i.is_type<char>()) {
            print<char>(i);
            cleanup<char>(i);
        } else if (i.is_type<double>()) {
            print<double>(i);
            cleanup<double>(i);
        } else if (i.is_type<MyClass>()) {
            print<MyClass>(i);
            cleanup<MyClass>(i);
        }
    }

    return 0;
}

Of course, you still have to know the type and keep track of the memory yourself, though you could modify the code to handle that. Thanks to the conversion operators you do not need a recover function: a plain cast is enough, as in the print code, *((T*)g_ptr). You can also work through the raw pointer, as in the lines before the last range-based for loop:

int* i = (int*)items[0];
*i = 500;
print<int>(items[0]);
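
As for modifying the code to keep track of memory, one possible direction is to capture a deleter for T at construction time, so the wrapper can free the object later without being told the type again. The sketch below is not part of the framework above; owning_ptr, Tracked and ownership_demo are hypothetical names, and the class is move-only to avoid double deletion.

```cpp
#include <cassert>
#include <typeinfo>
#include <utility>
#include <vector>

// Hypothetical owning variant: remembers how to delete what it holds.
class owning_ptr {
public:
    template <typename T>
    explicit owning_ptr(T* obj)
        : m_ptr(obj), m_type(&typeid(T)),
          m_deleter([](void* p) { delete static_cast<T*>(p); }) {}

    owning_ptr(owning_ptr&& o) noexcept
        : m_ptr(std::exchange(o.m_ptr, nullptr)),
          m_type(o.m_type), m_deleter(o.m_deleter) {}

    owning_ptr(const owning_ptr&) = delete;
    owning_ptr& operator=(const owning_ptr&) = delete;
    owning_ptr& operator=(owning_ptr&&) = delete;

    ~owning_ptr() { if (m_ptr) m_deleter(m_ptr); }

    template <typename T>
    T* get() const {
        if (typeid(T) != *m_type) throw std::bad_cast();
        return static_cast<T*>(m_ptr);
    }
private:
    void* m_ptr;
    const std::type_info* m_type;
    void (*m_deleter)(void*);   // type-erased "delete as T"
};

struct Tracked {                 // counts live instances for the demo
    static inline int alive = 0;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};

inline int ownership_demo() {
    {
        std::vector<owning_ptr> items;
        items.emplace_back(new int(10));
        items.emplace_back(new Tracked());
        assert(Tracked::alive == 1);
        assert(*items[0].get<int>() == 10);
    }   // vector destroyed: every element deletes its object as the right type
    return Tracked::alive;
}
```

With this shape the cleanup loop over is_type<...>() branches would no longer be needed. A std::shared_ptr<void> created from a std::shared_ptr<T> remembers its deleter in a similar way, if shared ownership is acceptable.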

The class also has a check for invalid casts built in, for the case where you try to cast to a type other than the one that was stored:

try {
    double* d = (double*)items[0];
    // items[0] is an int, so this will throw a std::bad_cast
    std::cout << "d = " << *d << std::endl;
} catch (std::bad_cast& ex) {
    std::cout << ex.what() << std::endl;
}

To be honest though, this approach trades safety for convenience. If you need an array of objects whose types are inconsistent or cannot be described through a common base class, you might want to rethink what you are trying to achieve and look for a more idiomatic C++ design.

I hope that can help you get some clarity.