Hi all,
I haven't been on the list very long but I see this question keeps coming up. I just thought I put some numbers out so people know. Graph tool is super fast at doing the c++ stuff it does, and super convenient for picking through the data on a python level also, but it doesn't make python fast. So, here are some examples of working with a mask in Python. Notice that numpy is not fast just because it is numpy. I chose an Nx3 array because numpy masking is extra slow with shapes other than Nx1. ================ import numpy as np a_generator = ((x,x+2,x*x) for x in range(1000)) a_list =[(x,x+2,x*x) for x in range(1000)] a_array = np.array(a_list) mask = np.ones((len(a_array)),dtype=np.bool) mask[::3] = False def tupled(): for x,b in zip(a_generator,mask): if b: c,d,e =x def looped(): for x,b in zip(a_list,mask): if b: c,d,e =x def masked(): for x in a_array[mask]: c,d,e = x #IPython magic function: timeit %timeit tupled() 1000000 loops, best of 3: 510 ns per loop %timeit looped() 10000 loops, best of 3: 76.6 µs per loop %timeit masked() 1000 loops, best of 3: 445 µs per loop ================ Notice the nano seconds vs micro seconds. The generator is the clear winner. Now, here are some graph tool specific examples. I left out the mask, but clearly you can create and manipulate a mask as you see fit. ================= >>>graph <Graph object, directed, reversed, with 32183 vertices and 199381 edges at 0x7f62842ee9d0> >>>def graph_loop0(): for i in range(10): #Only 10 times because this is soooo slooow for e in graph.edges(): v1,v2 = e.source(),e.target() >>>def graph_loop1(): edges = [[e.source(),e.target()] for e in graph.edges()] for i in range(1000): for e in edges: v1,v2 = e[0],e[1] >>>def graph_loop2(): edges = [[e.source(),e.target()] for e in graph.edges()] gen = (e for e in edges) for x in range(1000): for e in gen: v1,v2 = e[0],e[1] >>>from time import time >>>a=time(); graph_loop0(); print time()a 23.1095559597 >>>a=time(); graph_loop1(); print time()a 15.721350193 >>>a=time(); graph_loop2(); print time()a 2.80044198036 ======================= The loop1 and loop2 are doing 100x more, so the generator is 1000x faster. If you're just looping through once, it makes sense to use the convenience graph_tool provides. But if you are implementing a graph algorithm, just grab what you need from the graph_tool graph into a list or whatever python object makes sense for what you're doing. Elliot _______________________________________________ graphtool mailing list [hidden email] http://lists.skewed.de/mailman/listinfo/graphtool 
Administrator

On 06.08.2014 21:02, ... wrote:
> If you're just looping through once, it makes sense to use the > convenience graph_tool provides. But if you are implementing a graph > algorithm, just grab what you need from the graph_tool graph into a > list or whatever python object makes sense for what you're doing. Could you try those examples again with the newest version in git? I've made some changes which should make the loops slightly faster. However nothing is going to beat iterating through a list of edges, since using the Graph.edges() iterator will include all the time necessary for the creation of the edge descriptors. Best, Tiago  Tiago de Paula Peixoto <[hidden email]> _______________________________________________ graphtool mailing list [hidden email] http://lists.skewed.de/mailman/listinfo/graphtool signature.asc (836 bytes) Download Attachment

Tiago de Paula Peixoto <tiago@skewed.de> 
Sorry, I don't want to ignore you but I don't think I'll get around to upgrading any time soon. Should be easy enough for anyone to check for themselves though.Thanks, Elliot On Wed, Aug 6, 2014 at 5:33 PM, Tiago de Paula Peixoto <[hidden email]> wrote:
_______________________________________________ graphtool mailing list [hidden email] http://lists.skewed.de/mailman/listinfo/graphtool 
Free forum by Nabble  Edit this page 