Hello,
I try to dig up an old idea that we've discussed previously (the thread is here). summary: when one runs shortest distance on a graph with several million vertices the distance initialization cost is high with respect to the actual Dijsktra cost. We want to circumvent the initialization [see L326 of src/graph/topology/graph_distance.cc] The proposed approach (as per this boost thread) is to use the `dijkstra_shortest_paths_no_color_map_no_init`. The distance initialization is done "on the fly" by the `examine_edge` function. More precisely, the Dijkstra visitor maintains a set of discovered vertices. This set is read in the `examine_edge` function to set the distance to infinity when a the target vertex does not belong to the set of discovered vertices. The `examine_edge` function is also responsible to update the set of discovered vertices. As proposed in the initial thread, we would like to add a initornot switch to the shortest_distance parameters (see L1568 of /src/graph_tool/topology/__init__.py). I plan to :
How does it sounds ? Am I missing something ? I have two side questions :
Best regards, François. _______________________________________________ graphtool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graphtool 
Administrator

On 28.06.2017 15:00, François Kawala wrote:
> Hello, > > I try to dig up an old idea that we've discussed previously (the thread is > here > <http://maindiscussionlistforthegraphtoolproject.982480.n3.nabble.com/Shortestdistancecomplexitywhenusedwithmaxdisttd4026018.html>). > > summary: when one runs shortest distance on a graph with several million > vertices the distance initialization cost is high with respect to the actual > Dijsktra cost. We want to circumvent the initialization [see L326 of > src/graph/topology/graph_distance.cc > <https://git.skewed.de/count0/graphtool/blob/master/src/graph/topology/graph_distance.cc#L326>] > > The proposed approach (as per this boost thread > <https://groups.google.com/forum/#!topic/boostlist/7nCzvkkEXCk>) is to use > the `dijkstra_shortest_paths_no_color_map_no_init`. The distance > initialization is done "on the fly" by the `examine_edge` function. More > precisely, the Dijkstra visitor maintains a set of discovered vertices. This > set is read in the `examine_edge` function to set the distance to infinity > when a the target vertex does not belong to the set of discovered vertices. > The `examine_edge` function is also responsible to update the set of > discovered vertices. > > As proposed in the initial thread, we would like to add a initornot switch > to the shortest_distance parameters (see L1568 of > /src/graph_tool/topology/__init__.py > <https://git.skewed.de/count0/graphtool/blob/master/src/graph_tool/topology/__init__.py#L1568>). > > I plan to : > > * define a new Dijkstra visitor named `djk_max_visitor_no_init` (and maybe > a second one to handle multiple targets scenario) > * update the `do_djk_search` function to use > `dijkstra_shortest_paths_no_color_map_no_init` > * update the python shortest_distance (add the initornot switch) > > How does it sounds ? Am I missing something ? visitor that always does implicit initialization. The init=True version would be redundant, but at almost no extra cost. If the extra cost is important, the user should be using init=False anyways... > I have two side questions : > > * what is the "standard" development environment ? I tried to use CLION > <https://www.jetbrains.com/clion/> but I cannot figure out how to > configure it properly so that all the boost dependencies are correctly > resolved. There is no such thing... I just use a text editor (emacs). > * how do you debug during development ? GDB, valgrind, etc. Best, Tiago  Tiago de Paula Peixoto <[hidden email]> _______________________________________________ graphtool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graphtool signature.asc (849 bytes) Download Attachment

Tiago de Paula Peixoto <tiago@skewed.de> 
Administrator

In reply to this post by François Kawala
On 28.06.2017 15:00, François Kawala wrote:
> > I try to dig up an old idea that we've discussed previously (the thread is > here > <http://maindiscussionlistforthegraphtoolproject.982480.n3.nabble.com/Shortestdistancecomplexitywhenusedwithmaxdisttd4026018.html>). > > summary: when one runs shortest distance on a graph with several million > vertices the distance initialization cost is high with respect to the actual > Dijsktra cost. We want to circumvent the initialization [see L326 of > src/graph/topology/graph_distance.cc > <https://git.skewed.de/count0/graphtool/blob/master/src/graph/topology/graph_distance.cc#L326>] > > The proposed approach (as per this boost thread > <https://groups.google.com/forum/#%21topic/boostlist/7nCzvkkEXCk>) is to > use the `dijkstra_shortest_paths_no_color_map_no_init`. The distance > initialization is done "on the fly" by the `examine_edge` function. More > precisely, the Dijkstra visitor maintains a set of discovered vertices. This > set is read in the `examine_edge` function to set the distance to infinity > when a the target vertex does not belong to the set of discovered vertices. > The `examine_edge` function is also responsible to update the set of > discovered vertices. https://git.skewed.de/count0/graphtool/commit/bf174831 that replaces the search functions by no_init ones. The initialization is still done, but only once via a single numpy call in the Python side of things. The speed of the numpy call should be comparable to what we get with allocation, which is unavoidable. Please see if the performance improvements you observe with this is acceptable. Best, Tiago  Tiago de Paula Peixoto <[hidden email]> _______________________________________________ graphtool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graphtool signature.asc (849 bytes) Download Attachment

Tiago de Paula Peixoto <tiago@skewed.de> 
Hello Tiago,
Thanks a lot for swift update. I timed the 2.22 against commit bf174831 (compiled with CXXFLAGS=O3). That does not give an accurate drill On my typical ~7e6 vertices graph, for a typical search that reach about 1e5 vertices, the improvement is about 20%. We have 83ms (average over 100 runs) for the 2.22 release versus 69ms for commit bf174831. Interestingly, the bf174831 version is more stable with std = 4ms versus 6ms for the 2.22 release. What I understand from the boost thread is that we could have only initialization for the whole lifespan of the program. If I get it right, their proposal would, prior to a search, to "reset" the values from distances vector and predecessors vector only for the reached vertices of the last search. I think it worth a try because, from what I timed the python side of initialization could cost around 18ms (5ms for distance map, 13ms for predecessor map) see timeit snipet below. import timeit # pred map timeit.timeit(setup='import numpy as np; x=np.zeros(int(7e6),dtype="int64")', stmt='x[:] = np.arange(7e6)', number=1000)/1000 # distances timeit.timeit(setup='import numpy as np; x=np.zeros(int(7e6),dtype="int64")', stmt='x[:] = np.inf)', number=1000)/1000 I did the above timings on a modern work station, to switch to a cloud provider induces a non negligible overhead (+63% for the distances / +469% for the pred map). What do you think ? F. Le mer. 28 juin 2017 à 18:58, Tiago de Paula Peixoto <[hidden email]> a écrit : On 28.06.2017 15:00, François Kawala wrote: _______________________________________________ graphtool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graphtool 
Administrator

On 29.06.2017 11:58, François Kawala wrote:
> On my typical ~7e6 vertices graph, for a typical search that reach about 1e5 > vertices, the improvement is about 20%. We have 83ms (average over 100 runs) > for the 2.22 release versus 69ms for commit bf174831. Interestingly, the > bf174831 version is more stable with std = 4ms versus 6ms for the 2.22 release. I wonder how much time is spent in the actual search vs. initialization. If you use something like cProfile you can see how much time is spent in the actual C++ function vs. the rest. This would give us an idea of how much actual improvement can be done. > What I understand from the boost thread > <https://groups.google.com/forum/#%21topic/boostlist/7nCzvkkEXCk> is that > we could have only initialization for the whole lifespan of the program. If > I get it right, their proposal would, prior to a search, to "reset" the > values from distances vector and predecessors vector only for the reached > vertices of the last search. I think it worth a try because, from what I > timed the python side of initialization could cost around 18ms (5ms for > distance map, 13ms for predecessor map) see timeit snipet below. In order for this to work, the function would need to keep and return a list of reached vertices. This is is not difficult to do, but I wonder how much it would actually improve... Best, Tiago  Tiago de Paula Peixoto <[hidden email]> _______________________________________________ graphtool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graphtool signature.asc (849 bytes) Download Attachment

Tiago de Paula Peixoto <tiago@skewed.de> 
Administrator

In reply to this post by François Kawala
On 29.06.2017 11:58, François Kawala wrote:
> I did the above timings on a modern work station, to switch to a cloud > provider induces a non negligible overhead (+63% for the distances / +469% > for the pred map). This I don't understand at all. Why would the behavior in a "cloud" environment be any different?  Tiago de Paula Peixoto <[hidden email]> _______________________________________________ graphtool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graphtool signature.asc (849 bytes) Download Attachment

Tiago de Paula Peixoto <tiago@skewed.de> 
About the run time in cloud computing scenario my measure might be useless. A cloud instance is a VM that runs on I don't know what hardware and might be shared with I don't know who. So even if inside my VM the CPU was idle, I don't now how the CPU was used by other VMs. That being said, for our particular cloud service provider we observe (on a daily basis for several months) that the performances are below what we have on our modern workstations. Just to be clear : I'm neither complaining about graph tool performances, nor thinking about improvements that would be cloud computing driven. I just wanted to make clear that modest improvement (as measured on a modern work station) might be very meaningful in another environment. 23484 function calls (23331 primitive calls) in 6.733 seconds Ordered by: cumulative time ncalls tottime percall cumtime percall filename:lineno(function) 100 3.723 0.037 6.731 0.067 graph_tool/topology/__init__.py:1512(shortest_distance) 100 0.001 0.000 1.833 0.018 graph_tool/topology/__init__.py:1507(init_pmap) 100 0.000 0.000 1.833 0.018 graph_tool/decorators.py:1(copy_property) 200/100 0.003 0.000 1.832 0.018 graph_tool/decorators.py:126(wrapper) 100 1.275 0.013 1.827 0.018 graph_tool/__init__.py:2348(copy_property) 100 0.001 0.000 1.163 0.012 graph_tool/topology/__init__.py:1500(init_dist) 100 0.000 0.000 0.645 0.006 graph_tool/__init__.py:661(<lambda>) 100 0.643 0.006 0.644 0.006 graph_tool/__init__.py:614(__get_set_f_array) 700 0.001 0.000 0.540 0.001 graph_tool/__init__.py:166(_prop) 700 0.003 0.000 0.538 0.001 graph_tool/__init__.py:392(_get_any) 700 0.534 0.001 0.534 0.001 graph_tool/__init__.py:814(reserve) 200 0.000 0.000 0.518 0.003 graph_tool/__init__.py:559(get_array) 200 0.515 0.003 0.517 0.003 graph_tool/__init__.py:581(_get_data) 200 0.004 0.000 0.011 0.000 graph_tool/__init__.py:2289(new_vertex_property) 200 0.002 0.000 0.010 0.000 graph_tool/__init__.py:3071(__init__) 600 0.002 0.000 0.008 0.000 graph_tool/__init__.py:377(__init__) 200 0.002 0.000 0.007 0.000 graph_tool/__init__.py:1531(__init__) 100 0.000 0.000 0.006 0.000 graph_tool/__init__.py:2274(new_property) 800 0.002 0.000 0.004 0.000 graph_tool/__init__.py:202(_type_alias) 599 0.001 0.000 0.003 0.000 graph_tool/__init__.py:437(__del__) 600 0.001 0.000 0.002 0.000 graph_tool/__init__.py:264(_converter) 599 0.002 0.000 0.002 0.000 graph_tool/__init__.py:432(__unregister_map) 200 0.001 0.000 0.002 0.000 graph_tool/__init__.py:909(__new__) The complete cprofile output is attached to this mail. Le jeu. 29 juin 2017 à 13:16, Tiago de Paula Peixoto <[hidden email]> a écrit : On 29.06.2017 11:58, François Kawala wrote: _______________________________________________ graphtool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graphtool profile (9K) Download Attachment 
To elaborate on this, a quick an dirty profiling (with the timeit module) gives 0.27 ms to reset 10e5 random items to np.inf in a numpy array with 7e6 items. I'm confident that this improvement worth to be done and I can try to do it. Once initialized, the distance map, the predecessor map, and the "last reached index" would be internal property maps, correct ? Would it be relevant to do the "reset" in the CPP part ? François _______________________________________________ graphtool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graphtool 
I realize that I'm not familiar enough with boost to do this change.
From what I get, I'll add three private members : _distance and _predecessor they would be initialized as follows : _distance(get(vertex_distance, *_mg)), _predecessor(get(predecessor, *_mg)), I don't know if the _distance member should be filled with zeros ? The last private member would be _reach a std::vector<size_t> From that I'll declare two functions : set_reached to update the reached vertices reset_distance to reset the _distance, _predecessor and _reach to their default values. Does that sounds right ? If so, I'll guess that I'll have to make the do_djk_search function to call the set_reached and reset_distance functions. Am I missing something ? Sorry for this stuttering approach. François. Le jeu. 29 juin 2017 à 18:22, François Kawala <[hidden email]> a écrit :
_______________________________________________ graphtool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graphtool 
Administrator

On 30.06.2017 18:44, François Kawala wrote:
> I realize that I'm not familiar enough with boost to do this change. > > From what I get, I'll add three private members : _distance and _predecessor > they would be initialized as follows : > > _distance(get(vertex_distance, *_mg)), > _predecessor(get(predecessor, *_mg)), > > I don't know if the _distance member should be filled with zeros ? > > The last private member would be _reach a std::vector<size_t> > > From that I'll declare two functions : > > set_reached to update the reached vertices > reset_distance to reset the _distance, _predecessor and _reach to their > default values. > > Does that sounds right ? If so, I'll guess that I'll have to make > the do_djk_search function to call the set_reached and reset_distance functions. > > Am I missing something ? > > Sorry for this stuttering approach. for the details. In short, you can now do multiple searches as follows: dist_map = g.new_vp("double", numpy.inf) # must be infinity pred_map = g.vertex_index.copy() # must be identity for source in sources: # the following does not initialize dist_map and pred_map, and # returns an array of reached vertices dist_map, pred_map, reached = \ shortest_distance(g, source, weights=w, pred_map=pred_map, dist_map=dist_map, return_reached=True) # reset property maps for next search; this is O(len(reached)) dist_map.a[reached] = numpy.inf pred_map.a[reached] = reached Please tell me if this brings the improvements you were seeking. Best, Tiago _______________________________________________ graphtool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graphtool signature.asc (849 bytes) Download Attachment

Tiago de Paula Peixoto <tiago@skewed.de> 
I cProfiled old versus your last commit. The graph has 7 953 158 vertices, I sample 100 vertices and do shortest_distance from each one. I specify no target, and set the max_dist parameter. After each call to shortest_distance, I use the reached array to reset the predecessor map. In average each search reaches 120177 vertices.
That more than twice faster, great news ! About the third configuration (no dist map init), I removed the distance map initialization (graph_tool/topology/__init__.py L1779) and used the reached array to reset dist_map to infinity. We could have a do_initialization parameter, I think it would be more explicit, see my proposal. François. Le ven. 30 juin 2017 à 21:59, Tiago de Paula Peixoto <[hidden email]> a écrit : On 30.06.2017 18:44, François Kawala wrote: _______________________________________________ graphtool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graphtool 
Administrator

On 03.07.2017 12:16, François Kawala wrote:
> I cProfiled old versus your last commit. The graph has 7 953 158 vertices, I > sample 100 vertices and do shortest_distance from each one. I specify no > target, and set the max_dist parameter. After each call to > shortest_distance, I use the reached array to reset the predecessor map. In > average each search reaches 120177 vertices. > > * 2.22 : 9.101 seconds > * commit 26404ef4 : 4.536 seconds > * commit 26404ef4 + no dist map init : 4.141 seconds Well, the last improvement is only about 9%... We're clearly seeing diminishing returns. As I had expected, numpy initialization is pretty fast. > About the third configuration (no dist map init), I removed the distance map > initialization (graph_tool/topology/__init__.py L1779 > <https://git.skewed.de/count0/graphtool/blob/master/src/graph_tool/topology/__init__.py#L1779>) > and used the reached array to reset dist_map to infinity. Did you do the same with the predecessor map as well? > We could have a do_initialization parameter, I think it would be more > explicit, see my proposal > <https://git.skewed.de/francois/graphtool/commit/946826546a80f1f080681a4eaaa5fb2590b5abd4>. I don't like this very much. The list of reached vertices might be useful even if the user is not interested in the no_init optimization, so I think it is better decoupled. Furthermore, you are ignoring the predecessor map, which will give a cryptic error if the user chooses do_initialization=False and forgets to pass pred_map, and will ignore the value passed by the user otherwise.  Tiago de Paula Peixoto <[hidden email]> _______________________________________________ graphtool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graphtool signature.asc (849 bytes) Download Attachment

Tiago de Paula Peixoto <tiago@skewed.de> 
Le lun. 3 juil. 2017 à 13:00, Tiago de Paula Peixoto <[hidden email]> a écrit : On 03.07.2017 12:16, François Kawala wrote: Yes, numpy does its job quite well, still 9% is nonnegligible to me. > About the third configuration (no dist map init), I removed the distance map Yep (see above). > We could have a do_initialization parameter, I think it would be more I agree. We could split the function into "shortest_distance" and "shortest_distance_no_init". But, if I'm the sole user of this function it'll be much more sensible that I implement it my projet. Another related question, why it is necessary to copy the reached array ? Regards, François  _______________________________________________ graphtool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graphtool 
Administrator

On 03.07.2017 14:06, François Kawala wrote:
> > Another related question, why it is necessary to copy the reached array ? Because ownership issues. The function works with a std::vector<> internally, because numpy's ndarrays cannot be dynamically increased. We could return a wrapped std::vector (called Vector_size_t internally in graphtool), but these are not documented. So instead, the std::vector's contents are copied as a ndarray.  Tiago de Paula Peixoto <[hidden email]> _______________________________________________ graphtool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graphtool signature.asc (849 bytes) Download Attachment

Tiago de Paula Peixoto <tiago@skewed.de> 
Free forum by Nabble  Edit this page 