Note
Finding dense k-subgraphs using Gaussian Boson Sampling¶
In this tutorial, we roughly follow Ref. [1] to implement a stochastic algorithm for finding dense k-subgraphs of a graph.
Background¶
By embedding an adjacency matrix \(A\) into a Gaussian Boson Sampling [2] (GBS) setup via the Takagi-Autonne decomposition described in Ref. [3], one obtains a sampler that preferentially produces outcomes corresponding to subgraphs with many perfect matchings. This property can be utilized in applications of GBS to graph problems, such as finding dense subgraphs, since graphs with many perfect matchings are typically dense. The density of a \(k\)-subgraph can be characterized by the ratio of the number of edges and the number of vertices, and a perfect matching is a matching (a set of pairwise non-adjacent edges) that covers every vertex of the graph. Consequently, GBS can accelerate heuristic classical algorithms that rely on subgraph sampling as a subroutine, including a stochastic algorithm for the densest \(k\)-subgraph problem [1]: given a graph \(G\) and \(k < |G|\), find a subgraph of \(k\) vertices with the largest density! This problem has a natural connection to clustering problems with the goal of finding highly correlated subsets of data, having a wide range of applications. However, this problem is NP-hard [4], and acquiring an approximation for the densest \(k\)-subgraph is also believed to be inefficient [5]. One alternative approach is to make use of stochastic algorithms to provide dense \(k\)-subgraphs.
We will introduce a hybrid stochastic algorithm, which uses GBS to provide guesses for dense subgraphs, originally described in Ref. [1]. In Gaussian Boson Sampling, the probability of observing a given photon configuration is proportional to the number of perfect matchings in the corresponding subgraph. Also, intuitively, a graph with many perfect matchings is expected to contain many edges. The idea has been studied rigorously in Ref. [6].
We can embed the adjacency matrix in the GBS circuit, and we can obtain samples corresponding to subgraphs with a high number of perfect matchings. However, we can also exploit the obtained samples, i.e., we can provide an enhanced strategy that tweaks the samples obtained from GBS, possibly increasing the density. This algorithm can be used within an optimization algorithm searching for dense subgraphs, e.g., in a simulated annealing algorithm [7]. It is important to mention that from the resulting samples, we have to postselect the samples that correspond to \(k\)-subgraphs, i.e., samples that only contain \(0\)s and \(1\)s (called collision-free samples), and the number of \(1\)s is exactly \(k\). In such a sample, the indices where the \(1\)s appear correspond to the indices of the vertices in the sampled subgraph.
Implementation¶
Let us first start with the core of the algorithm, the dense \(k\)-subgraph sampling with GBS. This can be done by using the Graph instruction. We also use `dask <https://www.dask.org/>`__ (an open source Python library for parallel computing) with the use_dask=True configuration variable, so make sure to have it installed, or just set use_dask=False. We run the algorithm for a random Erdős-Rényi graph with 20
edges with \(p=0.5\) probability for edge creation.
[ ]:
import networkx as nx
import numpy as np
import random
import piquasso as pq
from functools import partial
# networkx object for representing the graph
MAIN_GRAPH = nx.erdos_renyi_graph(20, p=0.5)
k = 8 # The number of vertices in the dense subgraph
shots = 50 # Number of samples to be taken from the GBS distribution
def is_collision_free_length_k(nodes, k):
return len(nodes) == len(np.unique(nodes)) and len(nodes) == k
def postselect(subgraph_nodes, k):
return list(filter(partial(is_collision_free_length_k, k=k), subgraph_nodes))
def generate_dense_subgraph(graph, k):
"""Generates a dense k-subgraph using the GBS distribution.
Args:
graph (networkx.Graph): The graph to be embedded into the GBS circuit.
k (int): Number of vertices in the sampled subgraph.
Returns:
networkx.Graph: A dense k-subgraph.
"""
adjacency_matrix = nx.adjacency_matrix(graph).toarray()
with pq.Program() as program:
pq.Q() | pq.Vacuum()
pq.Q() | pq.Graph(adjacency_matrix, mean_photon_number=k / len(graph))
pq.Q() | pq.ParticleNumberMeasurement()
simulator = pq.GaussianSimulator(d=len(graph), config=pq.Config(use_dask=True))
result = simulator.execute(program, shots=shots)
dense_subgraph_nodes = result.to_subgraph_nodes()
filtered_dense_subgraph_nodes = postselect(dense_subgraph_nodes, k)
if len(filtered_dense_subgraph_nodes) == 0:
# No dense subgraph is found, call again!
return generate_dense_subgraph(graph, k)
randomly_chosen_nodes = random.choice(filtered_dense_subgraph_nodes)
return graph.subgraph(np.array(graph.nodes)[randomly_chosen_nodes])
We can use this function to get a \(k\)-subgraph corresponding to a high hafnian value, implying high density:
[ ]:
subgraph = generate_dense_subgraph(MAIN_GRAPH, k)
print("Subgraph nodes:", subgraph.nodes)
print("Subgraph density:", nx.density(subgraph))
Subgraph nodes: [5, 6, 7, 9, 11, 14, 17, 19]
Subgraph density: 0.6428571428571429
Now, we aim to tweak this subgraph, using the GBS-Tweak algorithm described in Ref. [1]:
[ ]:
def try_generate_tweaked_subgraph(subgraph, l):
"""
Tries to tweak the subgraph by using the GBS distribution.
Note: the resulting subgraph might be smaller than the original.
Args:
subgraph (networkx.Graph): The initial subgraph.
l (int): Number of vertices to keep.
Returns:
networkx.Graph: The tweaked subgraph.
"""
k = len(subgraph)
R = generate_dense_subgraph(subgraph, l) # l-subgraph of the original k-subgraph
T = generate_dense_subgraph(MAIN_GRAPH, k - l) # k-l-subgraph of the original graph
# Adding m extra nodes randomly
m = np.random.randint(0, k - l)
# Add m nodes to R randomly from the original subgraph
subgraph_minus_R = subgraph.copy()
subgraph_minus_R.remove_nodes_from(R)
extra_nodes = np.random.choice(subgraph_minus_R.nodes, size=m, replace=False)
R_with_extra_nodes = R.copy()
R_with_extra_nodes.add_nodes_from(extra_nodes)
# Remove m nodes randomly from T
reject_nodes = np.random.choice(T.nodes, size=m, replace=False)
T_with_rejected_nodes = T.copy()
T_with_rejected_nodes.remove_nodes_from(reject_nodes)
nodes_to_keep = R_with_extra_nodes.nodes
nodes_to_add = T_with_rejected_nodes.nodes
tweaked_subgraph_nodes = np.concatenate([nodes_to_keep, nodes_to_add])
return MAIN_GRAPH.subgraph(tweaked_subgraph_nodes)
def gbs_tweak(subgraph, l):
"""An implementation for the GBS-Tweak algorithm.
Args:
subgraph (networkx.Graph): The subgraph to be tweaked.
l (int): The number of vertices to keep.
Returns:
networkx.Graph: The tweaked subgraph.
"""
tweaked_subgraph = try_generate_tweaked_subgraph(subgraph, l)
while len(tweaked_subgraph) != len(subgraph):
# Try generating another if the resulting subgraph is smaller than the original.
tweaked_subgraph = try_generate_tweaked_subgraph(subgraph, l)
return tweaked_subgraph
A parameter of this algorithm is \(l\), the number of vertices to be kept from the original subgraph. After fixing this, we can generate a new subgraph:
[187]:
l = 4 # Number of vertices to be kept in the GBS-tweak algorithm
new_subgraph = gbs_tweak(subgraph, l)
print("Original subgraph:", subgraph)
print("New subgraph:", new_subgraph)
Original subgraph: Graph with 8 nodes and 18 edges
New subgraph: Graph with 8 nodes and 21 edges
Now, we can implement a simple simulated annealing optimization for finding \(k\)-dense subgraphs.
[ ]:
initial_temperature = 100 # Initial temperature for simulated annealing
alpha = 0.85 # Parameter for geometric cooling
ITER = 100 # Number of iterations
subgraph = generate_dense_subgraph(MAIN_GRAPH, k)
best_subgraph_so_far = subgraph
best_density_so_far = nx.density(subgraph)
T = initial_temperature
for i in range(ITER):
density = nx.density(subgraph)
print(f"{i}. density={density}")
new_subgraph = gbs_tweak(subgraph, l) # Tweak the previous subgraph
new_density = nx.density(new_subgraph)
if new_density > density:
# If the new density is higher, accept the new subgraph
subgraph = new_subgraph
else:
# Otherwise, calculate an temperature-dependent acceptance probability
probability = np.exp((new_density - density) / T)
if np.random.rand() < probability:
# Accept anyway with some probability
subgraph = new_subgraph
if new_density > best_density_so_far:
# If the new density is higher than any density observed so far, save this
# subgraph!
best_subgraph_so_far = new_subgraph
best_density_so_far = new_density
T *= alpha # Decrease temperature to decrease acceptance probability
print("Solution:", best_subgraph_so_far)
print("Density:", best_density_so_far)
0. density=0.6071428571428571
1. density=0.5714285714285714
2. density=0.5357142857142857
3. density=0.5357142857142857
4. density=0.6071428571428571
5. density=0.6071428571428571
6. density=0.6071428571428571
7. density=0.5714285714285714
8. density=0.5357142857142857
9. density=0.6428571428571429
10. density=0.39285714285714285
11. density=0.5
12. density=0.5714285714285714
13. density=0.5714285714285714
14. density=0.4642857142857143
15. density=0.5
16. density=0.4642857142857143
17. density=0.5357142857142857
18. density=0.5714285714285714
19. density=0.5357142857142857
20. density=0.5
21. density=0.5
22. density=0.42857142857142855
23. density=0.42857142857142855
24. density=0.42857142857142855
25. density=0.5
26. density=0.5
27. density=0.5714285714285714
28. density=0.5357142857142857
29. density=0.39285714285714285
30. density=0.4642857142857143
31. density=0.6428571428571429
32. density=0.5714285714285714
33. density=0.5357142857142857
34. density=0.5357142857142857
35. density=0.4642857142857143
36. density=0.6071428571428571
37. density=0.5
38. density=0.6785714285714286
39. density=0.6785714285714286
40. density=0.5357142857142857
41. density=0.5357142857142857
42. density=0.7857142857142857
43. density=0.7142857142857143
44. density=0.7142857142857143
45. density=0.7142857142857143
46. density=0.6428571428571429
47. density=0.6428571428571429
48. density=0.6428571428571429
49. density=0.6428571428571429
50. density=0.6428571428571429
51. density=0.6428571428571429
52. density=0.6785714285714286
53. density=0.6428571428571429
54. density=0.7142857142857143
55. density=0.7142857142857143
56. density=0.7142857142857143
57. density=0.7142857142857143
58. density=0.7142857142857143
59. density=0.7142857142857143
60. density=0.7142857142857143
61. density=0.7142857142857143
62. density=0.7142857142857143
63. density=0.7142857142857143
64. density=0.7142857142857143
65. density=0.7142857142857143
66. density=0.7142857142857143
67. density=0.7142857142857143
68. density=0.7142857142857143
69. density=0.7142857142857143
70. density=0.7142857142857143
71. density=0.7142857142857143
72. density=0.7142857142857143
73. density=0.7142857142857143
74. density=0.7142857142857143
75. density=0.7142857142857143
76. density=0.7857142857142857
77. density=0.7857142857142857
78. density=0.7857142857142857
79. density=0.7857142857142857
80. density=0.7857142857142857
81. density=0.7857142857142857
82. density=0.7857142857142857
83. density=0.7857142857142857
84. density=0.7857142857142857
85. density=0.7857142857142857
86. density=0.7857142857142857
87. density=0.7857142857142857
88. density=0.7857142857142857
89. density=0.7857142857142857
90. density=0.7857142857142857
91. density=0.7857142857142857
92. density=0.7857142857142857
93. density=0.7857142857142857
94. density=0.7857142857142857
95. density=0.7857142857142857
96. density=0.7857142857142857
97. density=0.7857142857142857
98. density=0.7857142857142857
99. density=0.7857142857142857
Solution: Graph with 8 nodes and 22 edges
Density: 0.7857142857142857