Fastest way to iterate through edges

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

Fastest way to iterate through edges

Elliot
Hi all,

I haven't been on the list very long but I see this question keeps coming up.  I just thought I put some numbers out so people know.  Graph tool is super fast at doing the c++ stuff it does, and super convenient for picking through the data on a python level also, but it doesn't make python fast. 

So, here are some examples of working with a mask in Python.  Notice that numpy is not fast just because it is numpy.  I chose an Nx3 array because numpy masking is extra slow with shapes other than Nx1.

================

import numpy as np
a_generator = ((x,x+2,x*x) for x in range(1000))
a_list =[(x,x+2,x*x) for x in range(1000)]
a_array = np.array(a_list)
mask = np.ones((len(a_array)),dtype=np.bool)
mask[::3] = False

def tupled():
    for x,b in zip(a_generator,mask):
        if b:
            c,d,e =x

def looped():
    for x,b in zip(a_list,mask):
        if b:
            c,d,e =x

def masked():
    for x in a_array[mask]:
        c,d,e = x

#IPython magic function: timeit

%timeit tupled()
1000000 loops, best of 3: 510 ns per loop

%timeit looped()
10000 loops, best of 3: 76.6 µs per loop

%timeit masked()
1000 loops, best of 3: 445 µs per loop

================

Notice the nano seconds vs micro seconds.  The generator is the clear winner.

Now, here are some graph tool specific examples.  I left out the mask, but clearly you can create and manipulate a mask as you see fit.

=================

>>>graph
    <Graph object, directed, reversed, with 32183 vertices and 199381 edges at 0x7f62842ee9d0>

>>>def graph_loop0():
    for i in range(10):  #Only 10 times because this is soooo slooow
        for e in graph.edges():
            v1,v2 = e.source(),e.target()

>>>def graph_loop1():
    edges = [[e.source(),e.target()] for e in graph.edges()]
    for i in range(1000):
        for e in edges:
            v1,v2 = e[0],e[1]

>>>def graph_loop2():
    edges = [[e.source(),e.target()] for e in graph.edges()]
    gen = (e for e in edges)
    for x in range(1000):
        for e in gen:
            v1,v2 = e[0],e[1]

>>>from time import time

>>>a=time(); graph_loop0(); print time()-a
23.1095559597

>>>a=time(); graph_loop1(); print time()-a
15.721350193

>>>a=time(); graph_loop2(); print time()-a
2.80044198036

=======================

The loop1 and loop2 are doing 100x more, so the generator is 1000x faster.

If you're just looping through once, it makes sense to use the convenience graph_tool provides.   But if you are implementing a graph algorithm, just grab what you need from the graph_tool graph into a list or whatever python object makes sense for what you're doing.

-Elliot

_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool
Reply | Threaded
Open this post in threaded view
|

Re: Fastest way to iterate through edges

Tiago Peixoto
Administrator
On 06.08.2014 21:02, ... wrote:
> If you're just looping through once, it makes sense to use the
> convenience graph_tool provides.  But if you are implementing a graph
> algorithm, just grab what you need from the graph_tool graph into a
> list or whatever python object makes sense for what you're doing.

Could you try those examples again with the newest version in git? I've
made some changes which should make the loops slightly faster.

However nothing is going to beat iterating through a list of edges,
since using the Graph.edges() iterator will include all the time
necessary for the creation of the edge descriptors.

Best,
Tiago

--
Tiago de Paula Peixoto <[hidden email]>


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool

signature.asc (836 bytes) Download Attachment
--
Tiago de Paula Peixoto <tiago@skewed.de>
Reply | Threaded
Open this post in threaded view
|

Re: Fastest way to iterate through edges

Elliot
Sorry,

I don't want to ignore you but I don't think I'll get around to upgrading any time soon.  Should be easy enough for anyone to check for themselves though.

Glad it's getting better though.  The edge iterator isn't near as bad slow as generating some other objects I've been working with (gdal.OGRGeometry for instance).

Thanks,
  Elliot


On Wed, Aug 6, 2014 at 5:33 PM, Tiago de Paula Peixoto <[hidden email]> wrote:
On 06.08.2014 21:02, ... wrote:
> If you're just looping through once, it makes sense to use the
> convenience graph_tool provides.  But if you are implementing a graph
> algorithm, just grab what you need from the graph_tool graph into a
> list or whatever python object makes sense for what you're doing.

Could you try those examples again with the newest version in git? I've
made some changes which should make the loops slightly faster.

However nothing is going to beat iterating through a list of edges,
since using the Graph.edges() iterator will include all the time
necessary for the creation of the edge descriptors.

Best,
Tiago

--
Tiago de Paula Peixoto <[hidden email]>


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool



_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool