Performance with property map

classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|

Performance with property map

Petit Lepton
Hi everyone, to compute some functions I needed to loop over a bunch of edges. I realized that the call to property maps seems slow compare to a dictionary. I am a bit surprised since I was told - I am not an expert in python - that a query in a dictionary was already. So I was wondering if I made a mistake in using graph_tool. Following is an example of comparison.
Best,
F.


In [3]:
_network = gt.load_graph(_dataFolder + 'networkLTA-2.0-scc.xml')
_network = gt.load_graph(_dataFolder + 'networkLTA-2.0-scc.xml')
In [4]:
_network.list_properties()
destination    (vertex)  (type: long double)
_graphml_vertex_id (vertex)  (type: string)
origin         (vertex)  (type: long double)
_graphml_edge_id (edge)    (type: string)
speed          (edge)    (type: long double)
name           (edge)    (type: string)
time           (edge)    (type: long double)
In [5]:
_edgeIds = _network.edge_properties['_graphml_edge_id']
_times = _network.edge_properties["time"]
_speeds = _network.edge_properties["speed"]
 
_origin = _network.vertex_properties['origin']
_destination = _network.vertex_properties['destination']
In [8]:
_edges = [_e for _e in _network.edges()]
%time for _e in _network.edges() : a = _speeds[_e]
 
_speedDict = {}
for _e in _edges :
    _speedDict[_network.edge_index[_e]] = _speeds[_e]
_indexes = [_network.edge_index[_e] for _e in _network.edges()]
%time for _n in _indexes : a = _speedDict[_n]
CPU times: user 102 ms, sys: 5 ms, total: 107 ms
Wall time: 103 ms
CPU times: user 2 ms, sys: 0 ns, total: 2 ms
Wall time: 1.94 ms

_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool
Reply | Threaded
Open this post in threaded view
|

Re: Performance with property map

Elliot

You know you can get the edges, vertices, and property maps as numpy arrays by using the .a method. You should be able to do whatever you need much faster with arrays than dictionaries.

On Jul 16, 2014 3:50 AM, "Flavien Lambert" <[hidden email]> wrote:
Hi everyone, to compute some functions I needed to loop over a bunch of edges. I realized that the call to property maps seems slow compare to a dictionary. I am a bit surprised since I was told - I am not an expert in python - that a query in a dictionary was already. So I was wondering if I made a mistake in using graph_tool. Following is an example of comparison.
Best,
F.


In [3]:

_network = gt.load_graph(_dataFolder + 'networkLTA-2.0-scc.xml')

_network = gt.load_graph(_dataFolder + 'networkLTA-2.0-scc.xml')
In [4]:

_network.list_properties()

destination    (vertex)  (type: long double)
_graphml_vertex_id (vertex)  (type: string)
origin         (vertex)  (type: long double)
_graphml_edge_id (edge)    (type: string)
speed          (edge)    (type: long double)
name           (edge)    (type: string)
time           (edge)    (type: long double)
In [5]:

_edgeIds = _network.edge_properties['_graphml_edge_id']

_times = _network.edge_properties["time"]

_speeds = _network.edge_properties["speed"]

 

_origin = _network.vertex_properties['origin']

_destination = _network.vertex_properties['destination']
In [8]:

_edges = [_e for _e in _network.edges()]

%time for _e in _network.edges() : a = _speeds[_e]

 

_speedDict = {}

for _e in _edges :

    _speedDict[_network.edge_index[_e]] = _speeds[_e]

_indexes = [_network.edge_index[_e] for _e in _network.edges()]

%time for _n in _indexes : a = _speedDict[_n]

CPU times: user 102 ms, sys: 5 ms, total: 107 ms
Wall time: 103 ms
CPU times: user 2 ms, sys: 0 ns, total: 2 ms
Wall time: 1.94 ms

_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool
Reply | Threaded
Open this post in threaded view
|

Re: Performance with property map

Elliot

Also, I can't check right now, but I'd guess the time difference is because integers are much more efficient to hash and compare than long strings.  Just a barely educated guess.

-elliot

On Jul 16, 2014 9:02 AM, "..." <[hidden email]> wrote:

You know you can get the edges, vertices, and property maps as numpy arrays by using the .a method. You should be able to do whatever you need much faster with arrays than dictionaries.

On Jul 16, 2014 3:50 AM, "Flavien Lambert" <[hidden email]> wrote:
Hi everyone, to compute some functions I needed to loop over a bunch of edges. I realized that the call to property maps seems slow compare to a dictionary. I am a bit surprised since I was told - I am not an expert in python - that a query in a dictionary was already. So I was wondering if I made a mistake in using graph_tool. Following is an example of comparison.
Best,
F.


In [3]:


_network = gt.load_graph(_dataFolder + 'networkLTA-2.0-scc.xml')


_network = gt.load_graph(_dataFolder + 'networkLTA-2.0-scc.xml')
In [4]:


_network.list_properties()


destination    (vertex)  (type: long double)
_graphml_vertex_id (vertex)  (type: string)
origin         (vertex)  (type: long double)
_graphml_edge_id (edge)    (type: string)
speed          (edge)    (type: long double)
name           (edge)    (type: string)
time           (edge)    (type: long double)
In [5]:


_edgeIds = _network.edge_properties['_graphml_edge_id']


_times = _network.edge_properties["time"]


_speeds = _network.edge_properties["speed"]


 


_origin = _network.vertex_properties['origin']


_destination = _network.vertex_properties['destination']
In [8]:


_edges = [_e for _e in _network.edges()]


%time for _e in _network.edges() : a = _speeds[_e]


 


_speedDict = {}


for _e in _edges :


    _speedDict[_network.edge_index[_e]] = _speeds[_e]


_indexes = [_network.edge_index[_e] for _e in _network.edges()]


%time for _n in _indexes : a = _speedDict[_n]


CPU times: user 102 ms, sys: 5 ms, total: 107 ms
Wall time: 103 ms
CPU times: user 2 ms, sys: 0 ns, total: 2 ms
Wall time: 1.94 ms

_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool
Reply | Threaded
Open this post in threaded view
|

Re: Performance with property map

Petit Lepton
In reply to this post by Elliot

Hi Elliott, I know that the get_array is very efficient but the thing is I have to know exactly what are the edges I am dealing with.

To be more precise, I run loops over the shortest paths which I computed before and stored in a file. Therefore, each iteration makes access to a tiny fraction of the network and I must keep track of the edges involved. That is why I was giving the example of single access instead of global one though .a.

Best,
F.

On 16 Jul 2014 22:04, "..." <[hidden email]> wrote:

You know you can get the edges, vertices, and property maps as numpy arrays by using the .a method. You should be able to do whatever you need much faster with arrays than dictionaries.

On Jul 16, 2014 3:50 AM, "Flavien Lambert" <[hidden email]> wrote:
Hi everyone, to compute some functions I needed to loop over a bunch of edges. I realized that the call to property maps seems slow compare to a dictionary. I am a bit surprised since I was told - I am not an expert in python - that a query in a dictionary was already. So I was wondering if I made a mistake in using graph_tool. Following is an example of comparison.
Best,
F.


In [3]:


_network = gt.load_graph(_dataFolder + 'networkLTA-2.0-scc.xml')


_network = gt.load_graph(_dataFolder + 'networkLTA-2.0-scc.xml')
In [4]:


_network.list_properties()


destination    (vertex)  (type: long double)
_graphml_vertex_id (vertex)  (type: string)
origin         (vertex)  (type: long double)
_graphml_edge_id (edge)    (type: string)
speed          (edge)    (type: long double)
name           (edge)    (type: string)
time           (edge)    (type: long double)
In [5]:


_edgeIds = _network.edge_properties['_graphml_edge_id']


_times = _network.edge_properties["time"]


_speeds = _network.edge_properties["speed"]


 


_origin = _network.vertex_properties['origin']


_destination = _network.vertex_properties['destination']
In [8]:


_edges = [_e for _e in _network.edges()]


%time for _e in _network.edges() : a = _speeds[_e]


 


_speedDict = {}


for _e in _edges :


    _speedDict[_network.edge_index[_e]] = _speeds[_e]


_indexes = [_network.edge_index[_e] for _e in _network.edges()]


%time for _n in _indexes : a = _speedDict[_n]


CPU times: user 102 ms, sys: 5 ms, total: 107 ms
Wall time: 103 ms
CPU times: user 2 ms, sys: 0 ns, total: 2 ms
Wall time: 1.94 ms

_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool
Reply | Threaded
Open this post in threaded view
|

Re: Performance with property map

Gareth Simons
Flavian,

I’m not exactly sure what you’re trying to do, but it seems you want to select only a handful of vertices based on shortest distance calculations?

You could try something like this, which is IMHO very fast:

# You already have your shortest paths, so you can probably skip this step, but it is possible to
# search all shortest paths from a source vertex to all vertices within a certain distance, or to
# search the shortest path from and to a specific vertex using the built-in shortest_distance function:
v_localDist = shortest_distance(g, source=v, weights=e_dist, max_dist=dist)

# You can then use the built-in masks functionality to filter out only relevant nodes.

# First create the mask property map:
v_mask = g.new_vertex_property('bool')

# Then assign the property map values from another property map array, using .a  / get_array() to access arrays:
v_mask.a = v_localDist.a <= dist  # set the array values to 1 (boolean property map) where distance parameter is met

# Then activate the filter
g.set_vertex_filter(v_mask)  # activate mask based on currently selected vertices

# And deactivate when done
g.set_vertex_filter(None)

I use this method to iterate calculations on local vertex clusters (based on distance) within some fairly big graphs.
Gareth

On 16 Jul 2014, at 15:49, Flavien Lambert <[hidden email]> wrote:

Hi Elliott, I know that the get_array is very efficient but the thing is I have to know exactly what are the edges I am dealing with.

To be more precise, I run loops over the shortest paths which I computed before and stored in a file. Therefore, each iteration makes access to a tiny fraction of the network and I must keep track of the edges involved. That is why I was giving the example of single access instead of global one though .a.

Best,
F.

On 16 Jul 2014 22:04, "..." <[hidden email]> wrote:

You know you can get the edges, vertices, and property maps as numpy arrays by using the .a method. You should be able to do whatever you need much faster with arrays than dictionaries.

On Jul 16, 2014 3:50 AM, "Flavien Lambert" <[hidden email]> wrote:
Hi everyone, to compute some functions I needed to loop over a bunch of edges. I realized that the call to property maps seems slow compare to a dictionary. I am a bit surprised since I was told - I am not an expert in python - that a query in a dictionary was already. So I was wondering if I made a mistake in using graph_tool. Following is an example of comparison.
Best,
F.


In [3]:

_network = gt.load_graph(_dataFolder + 'networkLTA-2.0-scc.xml')

_network = gt.load_graph(_dataFolder + 'networkLTA-2.0-scc.xml')
In [4]:

_network.list_properties()

destination    (vertex)  (type: long double)
_graphml_vertex_id (vertex)  (type: string)
origin         (vertex)  (type: long double)
_graphml_edge_id (edge)    (type: string)
speed          (edge)    (type: long double)
name           (edge)    (type: string)
time           (edge)    (type: long double)
In [5]:

_edgeIds = _network.edge_properties['_graphml_edge_id']

_times = _network.edge_properties["time"]

_speeds = _network.edge_properties["speed"]

 

_origin = _network.vertex_properties['origin']

_destination = _network.vertex_properties['destination']
In [8]:

_edges = [_e for _e in _network.edges()]

%time for _e in _network.edges() : a = _speeds[_e]

 

_speedDict = {}

for _e in _edges :

    _speedDict[_network.edge_index[_e]] = _speeds[_e]

_indexes = [_network.edge_index[_e] for _e in _network.edges()]

%time for _n in _indexes : a = _speedDict[_n]

CPU times: user 102 ms, sys: 5 ms, total: 107 ms
Wall time: 103 ms
CPU times: user 2 ms, sys: 0 ns, total: 2 ms
Wall time: 1.94 ms

_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool

_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool
Reply | Threaded
Open this post in threaded view
|

Re: Performance with property map

Elliot
In reply to this post by Petit Lepton

Are the indices in the array not the indices of the edges?  I'm not sure, but I think you could be accessing edges based on indices you precomputed rather than edge objects.

On Jul 16, 2014 9:51 AM, "Flavien Lambert" <[hidden email]> wrote:

Hi Elliott, I know that the get_array is very efficient but the thing is I have to know exactly what are the edges I am dealing with.

To be more precise, I run loops over the shortest paths which I computed before and stored in a file. Therefore, each iteration makes access to a tiny fraction of the network and I must keep track of the edges involved. That is why I was giving the example of single access instead of global one though .a.

Best,
F.

On 16 Jul 2014 22:04, "..." <[hidden email]> wrote:

You know you can get the edges, vertices, and property maps as numpy arrays by using the .a method. You should be able to do whatever you need much faster with arrays than dictionaries.

On Jul 16, 2014 3:50 AM, "Flavien Lambert" <[hidden email]> wrote:
Hi everyone, to compute some functions I needed to loop over a bunch of edges. I realized that the call to property maps seems slow compare to a dictionary. I am a bit surprised since I was told - I am not an expert in python - that a query in a dictionary was already. So I was wondering if I made a mistake in using graph_tool. Following is an example of comparison.
Best,
F.


In [3]:



_network = gt.load_graph(_dataFolder + 'networkLTA-2.0-scc.xml')



_network = gt.load_graph(_dataFolder + 'networkLTA-2.0-scc.xml')
In [4]:



_network.list_properties()



destination    (vertex)  (type: long double)
_graphml_vertex_id (vertex)  (type: string)
origin         (vertex)  (type: long double)
_graphml_edge_id (edge)    (type: string)
speed          (edge)    (type: long double)
name           (edge)    (type: string)
time           (edge)    (type: long double)
In [5]:



_edgeIds = _network.edge_properties['_graphml_edge_id']



_times = _network.edge_properties["time"]



_speeds = _network.edge_properties["speed"]



 



_origin = _network.vertex_properties['origin']



_destination = _network.vertex_properties['destination']
In [8]:



_edges = [_e for _e in _network.edges()]



%time for _e in _network.edges() : a = _speeds[_e]



 



_speedDict = {}



for _e in _edges :



    _speedDict[_network.edge_index[_e]] = _speeds[_e]



_indexes = [_network.edge_index[_e] for _e in _network.edges()]



%time for _n in _indexes : a = _speedDict[_n]



CPU times: user 102 ms, sys: 5 ms, total: 107 ms
Wall time: 103 ms
CPU times: user 2 ms, sys: 0 ns, total: 2 ms
Wall time: 1.94 ms

_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool
Reply | Threaded
Open this post in threaded view
|

Re: Performance with property map

Petit Lepton

That is a good question. I was wondering if so.
On the same topic, I did not find the function for getting the edge from the edge index. There is one for vertices like vertex(n) but for edges?

On 16 Jul 2014 23:22, "..." <[hidden email]> wrote:

Are the indices in the array not the indices of the edges?  I'm not sure, but I think you could be accessing edges based on indices you precomputed rather than edge objects.

On Jul 16, 2014 9:51 AM, "Flavien Lambert" <[hidden email]> wrote:

Hi Elliott, I know that the get_array is very efficient but the thing is I have to know exactly what are the edges I am dealing with.

To be more precise, I run loops over the shortest paths which I computed before and stored in a file. Therefore, each iteration makes access to a tiny fraction of the network and I must keep track of the edges involved. That is why I was giving the example of single access instead of global one though .a.

Best,
F.

On 16 Jul 2014 22:04, "..." <[hidden email]> wrote:

You know you can get the edges, vertices, and property maps as numpy arrays by using the .a method. You should be able to do whatever you need much faster with arrays than dictionaries.

On Jul 16, 2014 3:50 AM, "Flavien Lambert" <[hidden email]> wrote:
Hi everyone, to compute some functions I needed to loop over a bunch of edges. I realized that the call to property maps seems slow compare to a dictionary. I am a bit surprised since I was told - I am not an expert in python - that a query in a dictionary was already. So I was wondering if I made a mistake in using graph_tool. Following is an example of comparison.
Best,
F.


In [3]:




_network = gt.load_graph(_dataFolder + 'networkLTA-2.0-scc.xml')




_network = gt.load_graph(_dataFolder + 'networkLTA-2.0-scc.xml')
In [4]:




_network.list_properties()




destination    (vertex)  (type: long double)
_graphml_vertex_id (vertex)  (type: string)
origin         (vertex)  (type: long double)
_graphml_edge_id (edge)    (type: string)
speed          (edge)    (type: long double)
name           (edge)    (type: string)
time           (edge)    (type: long double)
In [5]:




_edgeIds = _network.edge_properties['_graphml_edge_id']




_times = _network.edge_properties["time"]




_speeds = _network.edge_properties["speed"]




 




_origin = _network.vertex_properties['origin']




_destination = _network.vertex_properties['destination']
In [8]:




_edges = [_e for _e in _network.edges()]




%time for _e in _network.edges() : a = _speeds[_e]




 




_speedDict = {}




for _e in _edges :




    _speedDict[_network.edge_index[_e]] = _speeds[_e]




_indexes = [_network.edge_index[_e] for _e in _network.edges()]




%time for _n in _indexes : a = _speedDict[_n]




CPU times: user 102 ms, sys: 5 ms, total: 107 ms
Wall time: 103 ms
CPU times: user 2 ms, sys: 0 ns, total: 2 ms
Wall time: 1.94 ms

_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool
Reply | Threaded
Open this post in threaded view
|

Re: Performance with property map

Petit Lepton
In reply to this post by Gareth Simons

Thanks Gareth, I will try to dig into this method!

On 16 Jul 2014 23:14, "Gareth Simons" <[hidden email]> wrote:
Flavian,

I’m not exactly sure what you’re trying to do, but it seems you want to select only a handful of vertices based on shortest distance calculations?

You could try something like this, which is IMHO very fast:

# You already have your shortest paths, so you can probably skip this step, but it is possible to
# search all shortest paths from a source vertex to all vertices within a certain distance, or to
# search the shortest path from and to a specific vertex using the built-in shortest_distance function:
v_localDist = shortest_distance(g, source=v, weights=e_dist, max_dist=dist)

# You can then use the built-in masks functionality to filter out only relevant nodes.

# First create the mask property map:
v_mask = g.new_vertex_property('bool')

# Then assign the property map values from another property map array, using .a  / get_array() to access arrays:
v_mask.a = v_localDist.a <= dist  # set the array values to 1 (boolean property map) where distance parameter is met

# Then activate the filter
g.set_vertex_filter(v_mask)  # activate mask based on currently selected vertices

# And deactivate when done
g.set_vertex_filter(None)

I use this method to iterate calculations on local vertex clusters (based on distance) within some fairly big graphs.
Gareth

On 16 Jul 2014, at 15:49, Flavien Lambert <[hidden email]> wrote:

Hi Elliott, I know that the get_array is very efficient but the thing is I have to know exactly what are the edges I am dealing with.

To be more precise, I run loops over the shortest paths which I computed before and stored in a file. Therefore, each iteration makes access to a tiny fraction of the network and I must keep track of the edges involved. That is why I was giving the example of single access instead of global one though .a.

Best,
F.

On 16 Jul 2014 22:04, "..." <[hidden email]> wrote:

You know you can get the edges, vertices, and property maps as numpy arrays by using the .a method. You should be able to do whatever you need much faster with arrays than dictionaries.

On Jul 16, 2014 3:50 AM, "Flavien Lambert" <[hidden email]> wrote:
Hi everyone, to compute some functions I needed to loop over a bunch of edges. I realized that the call to property maps seems slow compare to a dictionary. I am a bit surprised since I was told - I am not an expert in python - that a query in a dictionary was already. So I was wondering if I made a mistake in using graph_tool. Following is an example of comparison.
Best,
F.


In [3]:


_network = gt.load_graph(_dataFolder + 'networkLTA-2.0-scc.xml')


_network = gt.load_graph(_dataFolder + 'networkLTA-2.0-scc.xml')
In [4]:


_network.list_properties()


destination    (vertex)  (type: long double)
_graphml_vertex_id (vertex)  (type: string)
origin         (vertex)  (type: long double)
_graphml_edge_id (edge)    (type: string)
speed          (edge)    (type: long double)
name           (edge)    (type: string)
time           (edge)    (type: long double)
In [5]:


_edgeIds = _network.edge_properties['_graphml_edge_id']


_times = _network.edge_properties["time"]


_speeds = _network.edge_properties["speed"]


 


_origin = _network.vertex_properties['origin']


_destination = _network.vertex_properties['destination']
In [8]:


_edges = [_e for _e in _network.edges()]


%time for _e in _network.edges() : a = _speeds[_e]


 


_speedDict = {}


for _e in _edges :


    _speedDict[_network.edge_index[_e]] = _speeds[_e]


_indexes = [_network.edge_index[_e] for _e in _network.edges()]


%time for _n in _indexes : a = _speedDict[_n]


CPU times: user 102 ms, sys: 5 ms, total: 107 ms
Wall time: 103 ms
CPU times: user 2 ms, sys: 0 ns, total: 2 ms
Wall time: 1.94 ms

_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool

_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool
Reply | Threaded
Open this post in threaded view
|

Re: Performance with property map

Elliot
In reply to this post by Petit Lepton

I'm new to graph tool and still not sure about how to use it idiomatically.  So, why would one go through the process of creating a property map, activating and deactivating a filter rather than just creating a numpy mask and masking an array of edges. (I guess edges is an iterator so masking it isn't as super simple and efficient as with an array)

Flavian, I dont see one either. Look at Graph.edges.__doc__

On Jul 16, 2014 10:47 AM, "Flavien Lambert" <[hidden email]> wrote:

That is a good question. I was wondering if so.
On the same topic, I did not find the function for getting the edge from the edge index. There is one for vertices like vertex(n) but for edges?

On 16 Jul 2014 23:22, "..." <[hidden email]> wrote:

Are the indices in the array not the indices of the edges?  I'm not sure, but I think you could be accessing edges based on indices you precomputed rather than edge objects.

On Jul 16, 2014 9:51 AM, "Flavien Lambert" <[hidden email]> wrote:

Hi Elliott, I know that the get_array is very efficient but the thing is I have to know exactly what are the edges I am dealing with.

To be more precise, I run loops over the shortest paths which I computed before and stored in a file. Therefore, each iteration makes access to a tiny fraction of the network and I must keep track of the edges involved. That is why I was giving the example of single access instead of global one though .a.

Best,
F.

On 16 Jul 2014 22:04, "..." <[hidden email]> wrote:

You know you can get the edges, vertices, and property maps as numpy arrays by using the .a method. You should be able to do whatever you need much faster with arrays than dictionaries.

On Jul 16, 2014 3:50 AM, "Flavien Lambert" <[hidden email]> wrote:
Hi everyone, to compute some functions I needed to loop over a bunch of edges. I realized that the call to property maps seems slow compare to a dictionary. I am a bit surprised since I was told - I am not an expert in python - that a query in a dictionary was already. So I was wondering if I made a mistake in using graph_tool. Following is an example of comparison.
Best,
F.


In [3]:





_network = gt.load_graph(_dataFolder + 'networkLTA-2.0-scc.xml')





_network = gt.load_graph(_dataFolder + 'networkLTA-2.0-scc.xml')
In [4]:





_network.list_properties()





destination    (vertex)  (type: long double)
_graphml_vertex_id (vertex)  (type: string)
origin         (vertex)  (type: long double)
_graphml_edge_id (edge)    (type: string)
speed          (edge)    (type: long double)
name           (edge)    (type: string)
time           (edge)    (type: long double)
In [5]:





_edgeIds = _network.edge_properties['_graphml_edge_id']





_times = _network.edge_properties["time"]





_speeds = _network.edge_properties["speed"]





 





_origin = _network.vertex_properties['origin']





_destination = _network.vertex_properties['destination']
In [8]:





_edges = [_e for _e in _network.edges()]





%time for _e in _network.edges() : a = _speeds[_e]





 





_speedDict = {}





for _e in _edges :





    _speedDict[_network.edge_index[_e]] = _speeds[_e]





_indexes = [_network.edge_index[_e] for _e in _network.edges()]





%time for _n in _indexes : a = _speedDict[_n]





CPU times: user 102 ms, sys: 5 ms, total: 107 ms
Wall time: 103 ms
CPU times: user 2 ms, sys: 0 ns, total: 2 ms
Wall time: 1.94 ms

_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool
Reply | Threaded
Open this post in threaded view
|

Re: Performance with property map

Tiago Peixoto
Administrator
In reply to this post by Petit Lepton
On 07/16/2014 05:44 PM, Flavien Lambert wrote:
> On the same topic, I did not find the function for getting the edge
> from the edge index. There is one for vertices like vertex(n) but for
> edges?

There is no such function, and this is related to the speed difference
you are seeing. The edges are not stored in one big vector, and thus
cannot be addressed simply by its index. Instead, they are stored in
different vectors across all nodes (i.e. and adjacency list), and thus
one needs to iterate first through the vectors.

Now lets look at your code:

> _edges = [_e for _e in _network.edges()]
> %time for _e in _network.edges() : a = _speeds[_e]
>
> _speedDict = {}
> for _e in _edges :
>     _speedDict[_network.edge_index[_e]] = _speeds[_e]
> _indexes = [_network.edge_index[_e] for _e in _network.edges()]
> %time for _n in _indexes : a = _speedDict[_n]

What makes the second loop slower has little to do with property maps vs
dicts, but instead with the loop over the Graph.edges() iterator and the
loop ofter the list you created in the second part. Whenever you loop
over the edges, not only is the loop slightly slower because it is not a
simple list, but a "list of lists", but also (more importantly) because
it has to *create* edge descriptor objects at each iteration! In the
second loop you created these objects and stored them in a list, and
then looped over this list, which does not involve object creation, and
is therefore much faster.

Best,
Tiago

--
Tiago de Paula Peixoto <[hidden email]>


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool

signature.asc (919 bytes) Download Attachment
--
Tiago de Paula Peixoto <tiago@skewed.de>
Reply | Threaded
Open this post in threaded view
|

Re: Performance with property map

Petit Lepton
Thanks a lot for the explanation! So is there a best way to have access to the properties?


On 18 July 2014 21:39, Tiago de Paula Peixoto <[hidden email]> wrote:
On 07/16/2014 05:44 PM, Flavien Lambert wrote:
> On the same topic, I did not find the function for getting the edge
> from the edge index. There is one for vertices like vertex(n) but for
> edges?

There is no such function, and this is related to the speed difference
you are seeing. The edges are not stored in one big vector, and thus
cannot be addressed simply by its index. Instead, they are stored in
different vectors across all nodes (i.e. and adjacency list), and thus
one needs to iterate first through the vectors.

Now lets look at your code:

> _edges = [_e for _e in _network.edges()]
> %time for _e in _network.edges() : a = _speeds[_e]
>
> _speedDict = {}
> for _e in _edges :
>     _speedDict[_network.edge_index[_e]] = _speeds[_e]
> _indexes = [_network.edge_index[_e] for _e in _network.edges()]
> %time for _n in _indexes : a = _speedDict[_n]

What makes the second loop slower has little to do with property maps vs
dicts, but instead with the loop over the Graph.edges() iterator and the
loop ofter the list you created in the second part. Whenever you loop
over the edges, not only is the loop slightly slower because it is not a
simple list, but a "list of lists", but also (more importantly) because
it has to *create* edge descriptor objects at each iteration! In the
second loop you created these objects and stored them in a list, and
then looped over this list, which does not involve object creation, and
is therefore much faster.

Best,
Tiago

--
Tiago de Paula Peixoto <[hidden email]>


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool



_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool
Reply | Threaded
Open this post in threaded view
|

Re: Performance with property map

Elliot

Well, if you build a list of edge objects before hand, you can get and set properties from maps quickly all day, or until you create or destroy edges is the graph and have to rebuild the list.  If building that list takes a long time and you're changing edges often, you could use property map masks to keep track of new and deleted edges and add to and subtract from the list.

Elliot

On Jul 19, 2014 2:21 AM, "Flavien Lambert" <[hidden email]> wrote:
Thanks a lot for the explanation! So is there a best way to have access to the properties?


On 18 July 2014 21:39, Tiago de Paula Peixoto <[hidden email]> wrote:
On 07/16/2014 05:44 PM, Flavien Lambert wrote:
> On the same topic, I did not find the function for getting the edge
> from the edge index. There is one for vertices like vertex(n) but for
> edges?

There is no such function, and this is related to the speed difference
you are seeing. The edges are not stored in one big vector, and thus
cannot be addressed simply by its index. Instead, they are stored in
different vectors across all nodes (i.e. and adjacency list), and thus
one needs to iterate first through the vectors.

Now lets look at your code:

> _edges = [_e for _e in _network.edges()]
> %time for _e in _network.edges() : a = _speeds[_e]
>
> _speedDict = {}
> for _e in _edges :
>     _speedDict[_network.edge_index[_e]] = _speeds[_e]
> _indexes = [_network.edge_index[_e] for _e in _network.edges()]
> %time for _n in _indexes : a = _speedDict[_n]

What makes the second loop slower has little to do with property maps vs
dicts, but instead with the loop over the Graph.edges() iterator and the
loop ofter the list you created in the second part. Whenever you loop
over the edges, not only is the loop slightly slower because it is not a
simple list, but a "list of lists", but also (more importantly) because
it has to *create* edge descriptor objects at each iteration! In the
second loop you created these objects and stored them in a list, and
then looped over this list, which does not involve object creation, and
is therefore much faster.

Best,
Tiago

--
Tiago de Paula Peixoto <[hidden email]>


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool



_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool
Reply | Threaded
Open this post in threaded view
|

Re: Performance with property map

Elliot

That's assuming you really need to loop through the edges, and that getting edges from masked graphs won't do.

On Jul 19, 2014 12:01 PM, "..." <[hidden email]> wrote:

Well, if you build a list of edge objects before hand, you can get and set properties from maps quickly all day, or until you create or destroy edges is the graph and have to rebuild the list.  If building that list takes a long time and you're changing edges often, you could use property map masks to keep track of new and deleted edges and add to and subtract from the list.

Elliot

On Jul 19, 2014 2:21 AM, "Flavien Lambert" <[hidden email]> wrote:
Thanks a lot for the explanation! So is there a best way to have access to the properties?


On 18 July 2014 21:39, Tiago de Paula Peixoto <[hidden email]> wrote:
On 07/16/2014 05:44 PM, Flavien Lambert wrote:
> On the same topic, I did not find the function for getting the edge
> from the edge index. There is one for vertices like vertex(n) but for
> edges?

There is no such function, and this is related to the speed difference
you are seeing. The edges are not stored in one big vector, and thus
cannot be addressed simply by its index. Instead, they are stored in
different vectors across all nodes (i.e. and adjacency list), and thus
one needs to iterate first through the vectors.

Now lets look at your code:

> _edges = [_e for _e in _network.edges()]
> %time for _e in _network.edges() : a = _speeds[_e]
>
> _speedDict = {}
> for _e in _edges :
>     _speedDict[_network.edge_index[_e]] = _speeds[_e]
> _indexes = [_network.edge_index[_e] for _e in _network.edges()]
> %time for _n in _indexes : a = _speedDict[_n]

What makes the second loop slower has little to do with property maps vs
dicts, but instead with the loop over the Graph.edges() iterator and the
loop ofter the list you created in the second part. Whenever you loop
over the edges, not only is the loop slightly slower because it is not a
simple list, but a "list of lists", but also (more importantly) because
it has to *create* edge descriptor objects at each iteration! In the
second loop you created these objects and stored them in a list, and
then looped over this list, which does not involve object creation, and
is therefore much faster.

Best,
Tiago

--
Tiago de Paula Peixoto <[hidden email]>


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool



_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool