I am working on a spatial analysis problem using Python 2.7. I have a dictionary edges
representing the edges of a graph, where each key is an edge ID and each value is that edge's start and end points:
{e1: [(12.8254, 55.3880), (12.8343, 55.3920)],
e2: [(12.8254, 55.3880), (12.8235, 55.3857)],
e3: [(12.2432, 57.1120), (12.2426, 57.1122)]}
And I have another dictionary nodes, where each key is a node ID and each value is that node's coordinates:
{n14: (12.8254, 55.3880),
n15: (12.8340, 55.3883),
n16: (12.8235, 55.3857),
n17: (12.8343, 55.3920)}
I need to build a list that looks like this (the 'n' and 'e' prefixes in the keys are only for illustration in this question; the real keys are integers):
[(e1,n14,n17),(e2,n14,n16)..]
That is, I iterate over the edges dict, take every key, find the IDs of the nodes in the nodes
dict whose coordinates match the edge's start and end points, and add them to a tuple. This is how I do it now:
edgesList = []
for featureId in edges:
    edgeFeatureId = [k for k, v in edges.iteritems() if k == featureId][0]
    edgeStartPoint = [k for k, v in nodes.iteritems() if v == edges[featureId][0]][0]  # start point
    edgeEndPoint = [k for k, v in nodes.iteritems() if v == edges[featureId][1]][0]  # end point
    edgesList.append((edgeFeatureId, edgeStartPoint, edgeEndPoint))
This works, but it is very slow on large datasets (with 100K edges and 90K nodes it takes about 10 minutes).
I've figured out how to use a list comprehension to get each of the tuple's items, but is it possible to combine my 3 list comprehensions into one and avoid iterating over the edges with the for loop (if that would speed things up)?
Is there another way I can build such a list faster?
UPDATE
As Martin suggested, I've inverted my nodes dict:
nodesDict = dict((v,k) for k,v in oldnodesDict.iteritems())
so that the node coordinates tuple is the key and the nodeID is the value. Unfortunately, it did not speed up the lookup process (here is the updated code; I flipped the k and v for edgeStartPoint and edgeEndPoint):
edgesList = []
for featureId in edges:
    edgeFeatureId = [k for k, v in edges.iteritems() if k == featureId][0]
    edgeStartPoint = [v for k, v in nodes.iteritems() if k == edges[featureId][0]][0]  # start point
    edgeEndPoint = [v for k, v in nodes.iteritems() if k == edges[featureId][1]][0]  # end point
    edgesList.append((edgeFeatureId, edgeStartPoint, edgeEndPoint))
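For comparison, here is a minimal sketch of what indexing the inverted dict directly by key might look like, instead of scanning it with iteritems() for every edge. The updated code above still walks all the nodes once per endpoint, so inverting the dict alone would not be expected to change the running time. The names node_by_coord and edges_list are hypothetical, the integer IDs stand in for the e1/n14 labels, and skipping edges whose endpoints are not found is an assumption (the sample e3 has no matching nodes). It uses items() rather than iteritems() so it runs on both Python 2.7 and Python 3, and it relies on the coordinate tuples matching exactly, as the == comparison above does.

```python
# Sample data from the question, with integer IDs in place of e1, n14, ...
edges = {
    1: [(12.8254, 55.3880), (12.8343, 55.3920)],
    2: [(12.8254, 55.3880), (12.8235, 55.3857)],
    3: [(12.2432, 57.1120), (12.2426, 57.1122)],
}
nodes = {
    14: (12.8254, 55.3880),
    15: (12.8340, 55.3883),
    16: (12.8235, 55.3857),
    17: (12.8343, 55.3920),
}

# Invert once: coordinate tuple -> node ID (hypothetical name).
node_by_coord = dict((coord, node_id) for node_id, coord in nodes.items())

edges_list = []
for edge_id, (start, end) in edges.items():
    # Direct key lookups replace the per-edge scans over all nodes.
    start_node = node_by_coord.get(start)
    end_node = node_by_coord.get(end)
    # Assumption: edges with an unmatched endpoint are skipped, not an error.
    if start_node is None or end_node is None:
        continue
    edges_list.append((edge_id, start_node, end_node))

print(sorted(edges_list))  # [(1, 14, 17), (2, 14, 16)]
```

With this shape, each edge costs two dictionary lookups, so the total work grows roughly with the number of edges plus the number of nodes rather than with their product.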