What is the purpose of using a helper function to build nodes for data structures like tries and linked lists?

158 Views Asked by At

I've been working on Leetcode questions trying to improve my coding skills and I became curious about this when doing this problem https://leetcode.com/problems/implement-trie-prefix-tree/

Most answers I found online implement the trie by creating a node struct and then using a build method to instantiate the node and fill it with default values. But I'm wondering why this is necessary and they don't just define the Node with those default values? I tried it and it seems to work exactly the same without the build method so I've got to assume it's some sort of best practices reason. I can understand why a build method would be better for a more complicated node structure. But, what I'm really interested in is what can go wrong with what I did because I haven't been able to find any conclusive answers to that.

Typical Solution:


class Trie {

private:

    struct Node {
        bool isWord = false;
        Node* children[26];
    };


    Node* buildNode(){// helper method to build the default node
        Node* root = new Node();
        root-> isWord = false;
        for (int i = 0; i<26; i++){
            root->children[i] = NULL;
        }
        return root;
    }

    Node* root;

public:
    Trie() {
        root = buildNode();
    }
    
    void insert(string word) {
        Node* curr = root;
        for (char ch : word){
            int index = ch - 'a'; // convert char to int 0-25
            if (!curr->children[index]){//if NULL
                curr->children[index] = buildNode();
            }
            curr = curr->children[index];
        }
        curr->isWord = true;
    }

My Solution:

class Trie {

private:

    struct Node {
        bool isWord = false;
        Node* children[26] = {};
    };

    Node* root;

public:
    Trie() {
        root = new Node();
    }
    
    void insert(string word) {
        Node* curr = root;
        for (char ch : word){
            int index = ch - 'a'; // convert char to int 0-25
            if (!curr->children[index]){//if NULL
                curr->children[index] = new Node();
            }
            curr = curr->children[index];
        }
        curr->isWord = true;
    }

both implementations seem to work and perform the same way. So why is the first one preferred over the 2nd?

1

There are 1 best solutions below

0
tbxfreeware On

LeetCode has issues

As was noted in the comments, your code gets the job done. I expect it passes all LeetCode tests.

But it sure does not pass mine! Although I like your solution better than the one with a separate "build" routine, you would probably get into trouble if you submitted your code to a job interviewer. On the up-side, you have gotten the insertion algorithm exactly right. That's better than about half of the small sample of solutions I checked on the LeetCode website. Unfortunately, however, your use of raw pointers and operator new, with no matching calls to operator delete, has created a gigantic memory leak. When you use operator new, you have a responsibility to write a proper destructor to clean up, when your objects are deallocated.

The fact that LeetCode allows you to skip this is the sort of thing that makes folks say:

...because leetcode doesn't teach C++ and good C++ practices.

Most of leetcode solutions (including yours unfortunately) would never make it to production in a decent engineering environment.

Avoid competitive coding sites when learning a language. Many (most) promote horrible basic coding practices...

Even if you had supplied a destructor, I think your solution should still be rejected by LeetCode, because it does not use one of the smart pointers from the Standard Library. In a production setting, class Trie would almost certainly be coded using std::unique_ptr.

It is revealing that LeetCode does not ask programmers to code a function to erase keys from a Trie. That would open the issue of deleting nodes. With a smart pointer, it is not too difficult (see below), but with raw pointers you will find yourself coding some sort of "tear-down" routine that is analogous to the "build" function cited in the OP.

So, it is not just the destructor that is missing in the raw-pointer implementation. It is also the "tear-down" function that deallocates a Node. The good news is that once you have a tear-down routine, you can call it as a helper from the destructor function, as well as from function erase!

Is anything else missing? Unfortunately, yes. For a class like this, which manages dynamically allocated memory, you need to follow the "Rule of Three" or the "Rule of Five." At a minimum, that means coding a copy constructor and a copy-assignment operator, to go along with the destructor. See CppReference.

Use the tools in the Standard Library

When I solved this, I side-stepped the whole issue of cleanup, by using std::unique_ptr and std::array. This allowed me to follow the "Rule of Zero." Not only did I avoid the task of writing a destructor, I also obviated the need to write any sort of "build" or "tear-down" code at all. That's a whole bunch of error-prone tedium that I ducked!

My code follows the example on Wikipedia, so it uses the name is_terminal rather than isWord. Other than that, our insert routines are basically identical.

Mine adds a check for valid characters, but that may be a defect, because the spec promises that all the characters in a word are lower-case letters. My double-checks just slow things down. If I could prove to my satisfaction that all the clients of class Trie respected the precondition, I would remove the extra checks.

As an exercise, I tossed in an erase function, and a stream insertion operator. Function erase is substantially different than the one on Wikipedia. It is more aggressive about deallocating nodes, when they are no longer in use. operator<< outputs a list of the keys stored in a Trie. It was helpful during testing of class Trie.

// StackOverflow_77491953_LeetCode_208_Trie_Answer.ixx
// "What is the purpose of using a helper function to build nodes for 
// data structures like tries and linked lists?"
// https://stackoverflow.com/q/77491953
// https://stackoverflow.com/a/77505923
// 
// LeetCode 208. Implement Trie (Prefix Tree)
// https://leetcode.com/problems/implement-trie-prefix-tree/
//
// Wikipedia
// https://en.wikipedia.org/wiki/Trie#Operations
// 

export module main;
import std;

// This using declaration allows the LeetCode `Trie` to be used 
// without modifying the function signatures provided by LeetCode.
using std::string;

class Trie
{
    // This implementation anticipates that the letters 'a' to 'z' 
    // have consectutive code points in the character set used 
    // by std::string. 
    // 
    // When such is not true, it will still work, albeit, with a 
    // lot of wasted storage, so long as 'a' precedes all other 
    // lower-case letters, and `z` comes after all others.
    struct Node
    {
        enum : std::size_t { alphabet_size = 'z' - 'a' + 1u };
        std::array<std::unique_ptr<Node>, alphabet_size> child;
        bool is_terminal{ false };
        bool is_leaf_node() const noexcept
        {
            for (auto const& c : child) if (c) return false;
            return true;
        }
        void put(std::ostream& ost, std::string& key) const
        {
            if (is_terminal)
                ost << "    { \"key\": \"" << key << "\" }" << "\n";
            for (char ch{ 'a' }; ch <= 'z'; ++ch)
                if (auto const i{ subscript(ch) }; child[i])
                {
                    key.push_back(ch);
                    child[i]->put(ost, key);
                    key.pop_back();
                }
        }
        std::size_t static subscript(char const ch)
        {
            auto const i{ std::tolower(static_cast<unsigned char>(ch)) };
            if (!std::isalpha(i))
                throw std::runtime_error(std::format(
                    "ERROR: Trie character is not alphabetic: '{}'", ch));
            return static_cast<std::size_t>(i - 'a');
        }
    };
    std::unique_ptr<Node> root;
public:
    Trie()
        : root{ std::make_unique<Node>() }
    {}
    bool contains(std::string_view key) const noexcept
    {
        Node* p{ root.get() };
        for (auto const& ch : key)
        {
            auto const i{ Node::subscript(ch) };
            if (!p->child[i])
                return false;
            p = p->child[i].get();
        }
        return p->is_terminal;
    }
    bool contains_prefix(std::string_view prefix) const noexcept
    {
        Node* p{ root.get() };
        for (auto const& ch : prefix)
        {
            auto const i{ Node::subscript(ch) };
            if (!p->child[i])
                return false;
            p = p->child[i].get();
        }
        return true;
    }
    void erase(std::string_view key)
    {
        erase(key, root.get());
    }
private:
    bool erase(std::string_view key, Node* const p)
    {
        if (key.empty())
        {
            // found key
            if (!p->is_terminal)
                return false;        // key is not in trie
            p->is_terminal = false;  // erase key
        }
        else
        {
            auto const i{ Node::subscript(key.front()) };
            if (!p->child[i])
                return false;  // key is not in trie
            if (erase(key.substr(1u), p->child[i].get()))
                p->child[i] = nullptr;  // deallocate non-terminal leaf nodes
            if (p->is_terminal)
                return false;  // partial key is also a key
        }
        return p->is_leaf_node();  // deallocate non-terminal leaf nodes
    }
public:
    void insert(string word)
    {
        Node* p{ root.get() };
        for (auto const& ch : word)
        {
            auto const i{ Node::subscript(ch) };
            if (!p->child[i])
                p->child[i] = std::make_unique<Node>();
            p = p->child[i].get();
        }
        p->is_terminal = true;
    }
    bool search(string word)
    {
        // Really, LeetCode?
        // This is how you teach people to code?
        // There are at least three things wrong with this 
        // function prototype.
        //    1. It apparently relies on `using namespace std`, 
        //       something professional programmers shun.
        //    2. Argument is passed by value, even though the 
        //       "proper" algorithm does not need to modify it.
        //       This is a missed opportunity to use `std::string_view`,
        //       but failing that, this should be `std::string const&`.
        //    3. NO SUPPORT FOR CONSTANT OBJECTS. This function cannot 
        //       be invoked on a const Trie. It should be declared 
        //       `bool search(string const& word) const`.
        // Compare with function `contains`, which corrects all three 
        // of these mistakes.
        //
        return contains(word);
    }
    bool startsWith(string prefix)
    {
        // See foregoing remarks.
        return contains_prefix(prefix);
    }
    friend auto operator<< (std::ostream& ost, Trie const& trie)
        -> std::ostream&
    {
        std::string key;
        trie.root->put(ost, key);
        return ost;
    }
};
void put_trie(
    std::ostream& ost,
    std::string_view heading,
    Trie& trie)
{
    ost << std::boolalpha
        << heading << '\n' << trie
        << "\n   trie.search(\"a\") returns " << trie.search("a")
        << "\n   trie.search(\"app\") returns " << trie.search("app")
        << "\n   trie.search(\"appl\") returns " << trie.search("appl")
        << "\n   trie.search(\"apple\") returns " << trie.search("apple")
        << "\n   trie.search(\"apples\") returns " << trie.search("apples")
        << "\n   trie.startsWith(\"a\") returns " << trie.startsWith("a")
        << "\n   trie.startsWith(\"app\") returns " << trie.startsWith("app")
        << "\n   trie.startsWith(\"appl\") returns " << trie.startsWith("appl")
        << "\n   trie.startsWith(\"apple\") returns " << trie.startsWith("apple")
        << "\n   trie.startsWith(\"apples\") returns " << trie.startsWith("apples")
        << "\n\n";
}
void LeetCode_0208_Implement_Trie_Prefix_Tree()
{
    std::cout << std::boolalpha
        << "LeetCode 208. Implement Trie (Prefix Tree)"
        "\nhttps://leetcode.com/problems/implement-trie-prefix-tree/"
        "\n"
        "\nExample 1"
        "\n\n";
    Trie trie;
    put_trie(std::cout, "Trie contents after construction", trie);

    trie.insert("apple");
    put_trie(std::cout, "Trie contents after `trie.insert(\"apple\")`", trie);

    trie.insert("app");
    put_trie(std::cout, "Trie contents after `trie.insert(\"app\")`", trie);

    trie.erase("apple");
    put_trie(std::cout, "Trie contents after `trie.erase(\"apple\")`", trie);

    trie.erase("app");
    put_trie(std::cout, "Trie contents after `trie.erase(\"app\")`", trie);
}
export int main()
{
    LeetCode_0208_Implement_Trie_Prefix_Tree();
    return 0;
}
// end file: StackOverflow_77491953_LeetCode_208_Trie_Answer.ixx

Here is the output.

LeetCode 208. Implement Trie (Prefix Tree)
https://leetcode.com/problems/implement-trie-prefix-tree/

Example 1

Trie contents after construction

   trie.search("a") returns false
   trie.search("app") returns false
   trie.search("appl") returns false
   trie.search("apple") returns false
   trie.search("apples") returns false
   trie.startsWith("a") returns false
   trie.startsWith("app") returns false
   trie.startsWith("appl") returns false
   trie.startsWith("apple") returns false
   trie.startsWith("apples") returns false

Trie contents after `trie.insert("apple")`
{ key: "apple" }

   trie.search("a") returns false
   trie.search("app") returns false
   trie.search("appl") returns false
   trie.search("apple") returns true
   trie.search("apples") returns false
   trie.startsWith("a") returns true
   trie.startsWith("app") returns true
   trie.startsWith("appl") returns true
   trie.startsWith("apple") returns true
   trie.startsWith("apples") returns false

Trie contents after `trie.insert("app")`
{ key: "app" }
{ key: "apple" }

   trie.search("a") returns false
   trie.search("app") returns true
   trie.search("appl") returns false
   trie.search("apple") returns true
   trie.search("apples") returns false
   trie.startsWith("a") returns true
   trie.startsWith("app") returns true
   trie.startsWith("appl") returns true
   trie.startsWith("apple") returns true
   trie.startsWith("apples") returns false

Trie contents after `trie.erase("apple")`
{ key: "app" }

   trie.search("a") returns false
   trie.search("app") returns true
   trie.search("appl") returns false
   trie.search("apple") returns false
   trie.search("apples") returns false
   trie.startsWith("a") returns true
   trie.startsWith("app") returns true
   trie.startsWith("appl") returns false
   trie.startsWith("apple") returns false
   trie.startsWith("apples") returns false

Trie contents after `trie.erase("app")`

   trie.search("a") returns false
   trie.search("app") returns false
   trie.search("appl") returns false
   trie.search("apple") returns false
   trie.search("apples") returns false
   trie.startsWith("a") returns false
   trie.startsWith("app") returns false
   trie.startsWith("appl") returns false
   trie.startsWith("apple") returns false
   trie.startsWith("apples") returns false