我正在尝试解决与图论相关的问题,但似乎无法记住/发现/理解正确/最佳方法,因此我想请教专家...
I am trying to solve a problem related to graph theory but can't seem to remember/find/understand the proper/best approach so I figured I'd ask the experts...
我有一个来自两个节点的路径列表(示例代码中为1和10).我正在尝试找到要删除以削减所有路径的最小节点数.我也只能删除某些节点.
I have a list of paths from two nodes (1 and 10 in example code). I'm trying to find the minimum number of nodes to remove to cut all paths. I'm also only able to remove certain nodes.
我目前已将其实现为暴力搜索.这在我的测试集上工作正常,但在按比例缩放到在100K中具有路径而在100中具有可用节点的图形时,这将是一个问题(因果关系).现在,我不在乎删除节点的顺序,但是在某些时候我会考虑这一点(将开关设置为在下面的代码中列出).
I currently have it implemented (below) as a brute force search. This works fine on my test set but is going to be an issue when scaling up to a graphs that have paths in the 100K and available nodes in the 100 (factorial issue). Right now, I'm not caring about the order I remove nodes in, but I will at some point want to take that into account (switch sets to list in code below).
我相信应该有一种使用最大流量/最小切割算法来解决此问题的方法.我正在阅读的所有内容都以某种方式超出了我的脑海.自从进行此类工作已经有好几年了,我似乎什么也记不起来了.
I believe there should be a way to solve this using a max flow/min cut algorithm. Everything I'm reading though is going way over my head in some way. It's been several (SEVERAL) years since doing this type of stuff and I can't seem to remember anything.
所以我的问题是:
1)除了测试所有组合并选择最小的组合之外,还有解决此问题的更好方法吗?
1) Is there a better way to solve this problem other than testing all combinations and taking the smallest set?
2)如果是这样,您可以解释它还是最好提供伪代码来帮助解释?我猜想可能已经有一个库以某种方式做到了这一点(我最近一直在寻找并使用networkX,但对其他人开放)
2) If so, can you either explain it or, preferably, give pseudo code to help explain? I'm guessing there is probably a library that already does this in some way (I have been looking and using networkX lately but am open to others)
3)如果不是(或什至不是),那么有关如何多线程/进程解决方案的建议?我想尝试从计算机中获得所有性能.(我在这个问题上发现了一些很好的线索,我只是没有机会实施,所以我想趁机问一下.我首先想让所有东西在优化之前都能正常工作.)
3) If not (or even of so), suggestions for how to multithread/process solution? I want to try to get every bit of performance I can from computer. (I have found a few good threads on this question I just haven't had a chance to implement so figured I'd ask at same time just in chance. I first want to get everything working properly before optimizing.)
4)关于使代码更具"Pythonic"性的一般建议(可能也会对性能有所帮助).我知道我可以做一些改进,但是对Python来说仍然是新的.
4) General suggestions on making code more "Pythonic" (probably will help with performance too). I know there are improvements I can make and am still new to Python.
感谢您的帮助.
#!/usr/bin/env python def bruteForcePaths(paths, availableNodes, setsTested, testCombination, results, loopId): #for each node available, we are going to # check if we have already tested set with node # if true- move to next node # if false- remove the paths effected, # if there are paths left, # record combo, continue removing with current combo, # if there are no paths left, # record success, record combo, continue to next node #local copy currentPaths = list(paths) currentAvailableNodes = list(availableNodes) currentSetsTested = set(setsTested) currentTestCombination= set(testCombination) currentLoopId = loopId+1 print "loop ID: %d" %(currentLoopId) print "currentAvailableNodes:" for set1 in currentAvailableNodes: print " %s" %(set1) for node in currentAvailableNodes: #add to the current test set print "%d-current node: %s current combo: %s" % (currentLoopId, node, currentTestCombination) currentTestCombination.add(node) # print "Testing: %s" % currentTestCombination # print "Sets tested:" # for set1 in currentSetsTested: # print " %s" % set1 if currentTestCombination in currentSetsTested: #we already tested this combination of nodes so go to next node print "Already test: %s" % currentTestCombination currentTestCombination.remove(node) continue #get all the paths that don't have node in it currentRemainingPaths = [path for path in currentPaths if not (node in path)] #if there are no paths left if len(currentRemainingPaths) == 0: #save this combination print "successful combination: %s" % currentTestCombination results.append(frozenset(currentTestCombination)) #add to remember we tested combo currentSetsTested.add(frozenset(currentTestCombination)) #now remove the node that was add, and go to the next one currentTestCombination.remove(node) else: #this combo didn't work, save it so we don't test it again currentSetsTested.add(frozenset(currentTestCombination)) newAvailableNodes = list(currentAvailableNodes) newAvailableNodes.remove(node) bruteForcePaths(currentRemainingPaths, newAvailableNodes, currentSetsTested, currentTestCombination, results, currentLoopId) currentTestCombination.remove(node) print "-------------------" #need to pass "up" the tested sets from this loop setsTested.update(currentSetsTested) return None if __name__ == '__main__': testPaths = [ [1,2,14,15,16,18,9,10], [1,2,24,25,26,28,9,10], [1,2,34,35,36,38,9,10], [1,3,44,45,46,48,9,10], [1,3,54,55,56,58,9,10], [1,3,64,65,66,68,9,10], [1,2,14,15,16,7,10], [1,2,24,7,10], [1,3,34,35,7,10], [1,3,44,35,6,10], ] setsTested = set() availableNodes = [2, 3, 6, 7, 9] results = list() currentTestCombination = set() bruteForcePaths(testPaths, availableNodes, setsTested, currentTestCombination, results, 0) print "results:" for result in sorted(results, key=len): print result更新:我使用itertool修改了代码以生成组合.它将使代码更简洁,更快捷(并且应该更易于进行多进程.现在尝试找出建议的支配节点和多进程功能.
UPDATE: I reworked the code using itertool for generating the combinations. It make the code cleaner and faster (and should be easier to multiprocess. Now to try to figure out the dominate nodes as suggested and multiprocess function.
def bruteForcePaths3(paths, availableNodes, results): #start by taking each combination 2 at a time, then 3, etc for i in range(1,len(availableNodes)+1): print "combo number: %d" % i currentCombos = combinations(availableNodes, i) for combo in currentCombos: #get a fresh copy of paths for this combiniation currentPaths = list(paths) currentRemainingPaths = [] # print combo for node in combo: #determine better way to remove nodes, for now- if it's in, we remove currentRemainingPaths = [path for path in currentPaths if not (node in path)] currentPaths = currentRemainingPaths #if there are no paths left if len(currentRemainingPaths) == 0: #save this combination print combo results.append(frozenset(combo)) return None 推荐答案此处是一个忽略路径列表的答案.它只需要一个网络,一个源节点和一个目标节点,并找到网络中最小的节点集,而不是源节点或目标节点,因此删除这些节点会使源与目标断开连接.
Here is an answer which ignores the list of paths. It just takes a network, a source node, and a target node, and finds the minimum set of nodes within the network, not either source or target, so that removing these nodes disconnects the source from the target.
如果我想找到最小的边集,我可以通过搜索Max-Flow最小切割来找出边.请注意,维基百科文章位于 en.wikipedia/wiki/Max-flow_min-cut_theorem#Generalized_max-flow_min-cut_theorem 指出,存在一个广义的最大流最小割定理,该定理考虑了顶点容量以及边缘容量,这至少是令人鼓舞的.还要注意,边缘容量以Cuv给出,其中Cuv是从u到v的最大容量.在图中,它们似乎绘制为u/v.因此,向前的边缘容量可以不同于向后的边缘容量.
If I wanted to find the minimum set of edges, I could find out how just by searching for Max-Flow min-cut. Note that the Wikipedia article at en.wikipedia/wiki/Max-flow_min-cut_theorem#Generalized_max-flow_min-cut_theorem states that there is a generalized max-flow min-cut theorem which considers vertex capacity as well as edge capacity, which is at least encouraging. Note also that edge capacities are given as Cuv, where Cuv is the maximum capacity from u to v. In the diagram they seem to be drawn as u/v. So the edge capacity in the forward direction can be different from the edge capacity in the backward direction.
为了将最小的顶点切割问题伪装成最小的边缘切割问题,我建议利用这种不对称性.首先,为所有现有的边缘提供巨大的容量-例如,图形中节点数的100倍.现在,将每个顶点X替换为两个顶点Xi和Xo,我将它们称为传入和传出顶点.为X和Y之间的每个边创建Xo和Yi之间的边,现有容量向前移动,而0容量向后移动-这是单向边缘.现在为每个X在Xi和Xo之间创建一条边,容量1向前,容量0向后.
To disguise a minimum vertex cut problem as a minimum edge cut problem I propose to make use of this asymmetry. First of all give all the existing edges a huge capacity - for example 100 times the number of nodes in the graph. Now replace every vertex X with two vertices Xi and Xo, which I will call the incoming and outgoing vertices. For every edge between X and Y create an edge between Xo and Yi with the existing capacity going forwards but 0 capacity going backwards - these are one-way edges. Now create an edge between Xi and Xo for each X with capacity 1 going forwards and capacity 0 going backwards.
现在在结果图上运行最大流最小切割.因为所有原始链接都具有巨大的容量,所以最小切割必须全部由1个链接组成(实际上,最小切割定义为将节点集分为两部分:您真正想要的是成对的节点集节点Xi,Xo的一半在Xi的另一半,但Xo却在另一半).如果断开这些链接,则将图形分为两部分,如标准的最大流最小切割一样,因此删除这些节点将使源与目标断开连接.因为您拥有最少的切割,所以这是最小的此类节点集.
Now run max-flow min-cut on the resulting graph. Because all the original links have huge capacity, the min cut must all be made up of the capacity 1 links (actually the min cut is defined as a division of the set of nodes into two: what you really want is the set of pairs of nodes Xi, Xo with Xi in one half and Xo in the other half, but you can easily get one from the other). If you break these links you disconnect the graph into two parts, as with standard max-flow min-cut, so deleting these nodes will disconnect the source from the target. Because you have the minimum cut, this is the smallest such set of nodes.
如果您可以找到最大流量最小切割的代码,例如 www.cs.sunysb.edu/~algorith/files/network-flow.shtml 我希望它会为您提供最低限度的服务.如果不是这样(例如,由于碰巧拥有线性规划求解器而通过解决线性规划问题来实现),请例如从 www.cse.yorku.ca/~aaw/Wang/MaxFlowMinCutAlg.html 最小切割的一半是可从其到达的节点集修改图以减去解决方案实际使用的边缘容量后的来源-因此,仅考虑最大流量下使用的边缘容量,您就可以很容易地找到它.
If you can find code for max-flow min-cut, such as those pointed to by www.cs.sunysb.edu/~algorith/files/network-flow.shtml I would expect that it will give you the min-cut. If not, for instance if you do it by solving a linear programming problem because you happen to have a linear programming solver handy, notice for example from www.cse.yorku.ca/~aaw/Wang/MaxFlowMinCutAlg.html that one half of the min cut is the set of nodes reachable from the source when the graph has been modifies to subtract out the edge capacities actually used by the solution - so given just the edge capacities used at max flow you can find it pretty easily.