Using two objects as hash key for an unordered_map or alternatives

Question

Having defined my objects myType, I need to store relations between these objects. These relations are stored on a matrix.

The number of elements is not known in advance, not all elements have a relation (element1 can have a relation with element3, but may not have one with 5) and memory is an issue. For example it could look like:

element45 is connected with:

element3 with characteristic [3,1;1,4]
element12 with characteristic [1,1;1,1]
element1780 with characteristic [8,1;1,4]

element1661 is connected with:

element3 with characteristic [3,1;6,4]
element1 with characteristic [1,1;1,9]
element1780 with characteristic [8,1;1,1]

Having:

myType* element1;
myType* element2;

I would like to have something like (properly pointed the elements):

my_table[element1][element2][1][2]=7;

I have thought on creating a nested hash table using boost library:

boost::unordered_map<myType*, boost::unordered_map<myType*,
                std::vector<std::vector <unsigned short int> > > > my_table;

However, even if the code compiles, it crashes (Segmentation fault, it points to a line calculating the hash key) running a simple line like:

my_table[element1][element2].resize(2);
for(int t=0; t<2; ++t)
    my_table[element1][element2][t].resize(2);

Anyone can give me some light about this? Is this practically or conceptually wrong?

Any other approach to this problem is welcome.

Thank you

_`myType* element1;`_ Hmm, I don't think that's ever going to work well with pointers. — πάντα ῥεῖ, Aug 28 '15 at 20:31
In case you're interested: https://www.livecoding.tv/sehe/ -- I'm gonna attempt to treat this the Boost Graph way (as it should :)) — sehe, Aug 28 '15 at 22:54
why do you think it's not going to work well with pointers? I have been using the hash table with pointers and had no problems, my question here is how to use a pair of them as hash key — gui, Aug 30 '15 at 12:03

score 6 · Accepted Answer · answered Aug 29 '15 at 00:03

Right off the bat it seems obvious to me that your datastructure represent a graph (with attributed vertices and edges connecting them).

Furthermore when you say "These relations are stored on a matrix." you apparently mean "I visualize this as a matrix", since a true matrix representation¹ would become horrifically space-inefficient for larger number of vertices and sparse edge coverage.

Boost has a library for that: Boost Graph Library (BGL)

If we assume you want to be able to read a graph like²

graph X {
    element1; element12; element166; element1780; element3; element4; 

    element45   -- element3    [ label="[3,1;1,4]" ];
    element45   -- element12   [ label="[1,1;1,1]" ];
    element45   -- element1780 [ label="[8,1;1,4]" ];

    element1661 -- element1    [ label="[1,1;1,9]" ];
    element1661 -- element3    [ label="[3,1;6,4]" ];
    element1661 -- element1780 [ label="[8,1;1,1]" ];
}

Into a BGL compatible model, use typedefs like e.g.:

struct Vertex { 
    std::string node_id;
};
struct Edge {
    Box box; 
};

using Graph = boost::adjacency_list<boost::vecS, boost::vecS, boost::undirectedS, Vertex, Edge>;

Now you leverage the full facilities of the BGL:

Reading the graph from a file

Reading from a graphviz file is a feature³:

std::ifstream ifs("input.txt");
Graph result;

boost::dynamic_properties dp;
dp.property("node_id", boost::get(&Vertex::node_id, result));
dp.property("label",   boost::get(&Edge::box, result));

read_graphviz(ifs, result, dp);

Manipulating the graph

The many algorithms (dijkstra, flow, spanning trees, connected components, etc.) are at your disposal. Or you can mix and match. For example let's filter the nodes that have no connections out:

struct Filter {
    Graph const* _g;

    bool operator()(Graph::vertex_descriptor v) const {
        return boost::size(boost::adjacent_vertices(v, *_g))>0;
    }

    template <typename T>
    bool operator()(T&&) const { return true; /*catch-all*/ }
};

using Filtered = filtered_graph<Graph, Filter, Filter>;
Filter filter { &graph };
Filtered filtered(graph, filter, filter);

Let's write it to graphviz again:

boost::dynamic_properties dp;
dp.property("node_id", boost::get(&Vertex::node_id, filtered));
dp.property("label",   boost::get(&Edge::box, filtered));

write_graphviz_dp(std::cout, filtered, dp);

DEMO TIME

The full demo takes your input graph:

And filters it into:

Full Code

Live On Coliru

// http://stackoverflow.com/questions/32279268/using-two-objects-as-hash-key-for-an-unordered-map-or-alternatives
#include <cassert>
#include <iostream>

template <typename T> struct BasicBox {
    struct Point { T x, y; };

    Point tl, br;

    friend std::ostream& operator<<(std::ostream& os, Point const& p) { return os << p.x << ',' << p.y;                 } 
    friend std::ostream& operator<<(std::ostream& os, BasicBox const& b) { return os << '[' << b.tl << ';' << b.br << ']'; } 

    friend std::istream& operator>>(std::istream& is, Point& p) {
        char comma;

        if (!(is >> p.x >> comma >> p.y) && (comma == ',')) {
            is.setstate(std::ios::failbit | is.rdstate());
        }
        return is;
    } 
    friend std::istream& operator>>(std::istream& is, BasicBox& b) {
        char lbrace, semi, rbrace;
        if (!(
            (is >> lbrace >> b.tl >> semi >> b.br >> rbrace) &&
            (lbrace == '[' && semi == ';' && rbrace == ']')
        )) {
            is.setstate(std::ios::failbit | is.rdstate());
        }
        return is;
    }
};
using Box = BasicBox<int>;

#include <boost/graph/adjacency_list.hpp>
#include <boost/graph/graphviz.hpp>
#include <libs/graph/src/read_graphviz_new.cpp>

struct Vertex { 
    std::string node_id;
}; 
struct Edge {
    Box box; 
};

using Graph = boost::adjacency_list<boost::vecS, boost::vecS, boost::undirectedS, Vertex, Edge>;

#include <fstream>
#include <boost/graph/filtered_graph.hpp>

struct Filter {
    Graph const* _g;

    bool operator()(Graph::vertex_descriptor v) const {
        return boost::size(boost::adjacent_vertices(v, *_g))>0;
    }

    template <typename T>
    bool operator()(T&&) const { return true; /*catch-all*/ }
};

int main() {
    using namespace boost;

    Graph const graph = []{ 
        std::ifstream ifs("input.txt");
        Graph result;

        boost::dynamic_properties dp;
        dp.property("node_id", boost::get(&Vertex::node_id, result));
        dp.property("label",   boost::get(&Edge::box, result));

        read_graphviz(ifs, result, dp);
        return result;
    }();

    // let's do some random task. Like. You know. Like... Filter out the unconnected nodes
    using Filtered = filtered_graph<Graph, Filter, Filter>;
    Filter filter { &graph };
    Filtered filtered(graph, filter, filter);

    boost::dynamic_properties dp;
    dp.property("node_id", boost::get(&Vertex::node_id, filtered));
    dp.property("label",   boost::get(&Edge::box, filtered));

    write_graphviz_dp(std::cout, filtered, dp);
}

¹ like e.g. BGL's AdjacencyMatrix

² the format chosen being Graphviz' DOT format: http://www.graphviz.org/

³ Of course you can also use Boost Serialization with BGL models, so you can opt for a more compact binary representation e.g.

In case you're still interested in how I stumble my way through the BGL: [part #1](http://tinyurl.com/bgl-rw-1), [part #2](http://tinyurl.com/bgl-rw-2) and [part #3](http://tinyurl.com/bgl-rw-3) of the recorded livestreams ([experiment](http://chat.stackoverflow.com/transcript/10?m=24182469#24182469)) — sehe, Aug 29 '15 at 00:16
Thank you very much for the thorough reply. The relations between elements (here nodes) I learn them using machine learning algorithms, so I would need to access these values and modify them "on the fly", is this possible with this data structure? Relations between elements are not bidirectional, so element3 to element5 is different from elements, extension to bidirectional graph is it straightforward? And the one than concerns me the most... Practically I will have the order of 10000 nodes and I will need to create structures of about 300 graphs, is it likely to be a memory problem here? — gui, Aug 30 '15 at 11:56
Yes. `adjacency_list` models [MutableGraph](http://www.boost.org/doc/libs/1_59_0/libs/graph/doc/MutableGraph.html) so these operations are supported by this. — sehe, Aug 30 '15 at 12:30
The bidirectionality was a choice (because you didn't mention it, and a "connection" in English is intuitively bi-directional unless mentioned otherwise). Here's a directed-graph version: **[Live On Coliru](http://coliru.stacked-crooked.com/a/778cdde1d397446c)**, which also does some random mutations for demo purposes: **[after.png](http://i.imgur.com/Pfh3m2N.png)** — sehe, Aug 30 '15 at 12:30
About the practical memory requirements, it's impossible to tell without knowing the usage patterns. In general, adjacency list is not a bad choice for memory efficiency. You could run some tests with randomly generated graphs. You're reaching the realm of datastructure design and we need (lots) more information to be able to think along there — sehe, Aug 30 '15 at 12:32

score 2 · Answer 2 · answered Aug 28 '15 at 21:03

2

You could use a boost::unordered_map with key std::pair<myType*, myType*> in combine with boost::hash. You could declare it as:

boost::unordered_map<std::pair<myType*, myType*>, 
                     std::vector<std::vector<char>>, 
                     boost::hash<std::pair<myType*, myType*>>> dictionary;

Then you could load the characteristics of each pair in the dictionary like in the example below:

dictionary[std::make_pair(&a, &b)] = std::vector<std::vector<char>>(1, {1, 2, 3, 4, 5});

And access them as:

dictionary[std::make_pair(&a, &b)][0][0];

LIVE DEMO

answered Aug 28 '15 at 21:03

101010

41,839
11
94
168

so, this solution is just making a custom hash key using pair of data? if this works, I find it easy to implement and elegant, but not so sure about memory – gui Aug 30 '15 at 12:06
@gui What about the memory? – 101010 Aug 31 '15 at 09:03
sorry, my wrong, nothing about the memory. Btw, make_pair(&a, &b), the use of memory addresses there confuses me, what should define the key of the hash table is the memory address stored on the pointer, so make_pair(a,b) should be fine, shouldn't? – gui Aug 31 '15 at 14:48
@gui The hash value is defined by the the two addresses stored in the `std::pair` object (i.e., by the address of `a` and `b` respectively) returned by `std::make_pair` and not by the returned `std::pair` object itself. – 101010 Aug 31 '15 at 15:03

Using two objects as hash key for an unordered_map or alternatives

2 Answers2

Reading the graph from a file

Manipulating the graph

DEMO TIME

Full Code

Linked