{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "6b4e67e0",
   "metadata": {},
   "source": [
    "# Finding dense k-subgraphs using Gaussian Boson Sampling"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "489aa7cc",
   "metadata": {},
   "source": [
    "In this tutorial, we roughly follow Ref. <cite data-footcite=\"graphgbs\"></cite> to implement a stochastic algorithm for finding dense k-subgraphs of a graph."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "099111b3",
   "metadata": {},
   "source": [
    "### Background\n",
    "\n",
    "By embedding an adjacency matrix $A$ into a Gaussian Boson Sampling <cite data-footcite=\"hamilton2017\"></cite> (GBS) setup via the Takagi-Autonne decomposition described in Ref. <cite data-footcite=\"Houde_2024\"></cite>, one obtains a sampler that preferentially produces outcomes corresponding to subgraphs with many perfect matchings. This property can be utilized in applications of GBS to graph problems, such as finding dense subgraphs, since graphs with many perfect matchings are typically dense. The density of a $k$-subgraph can be characterized by the ratio of the number of edges and the number of vertices, and a perfect matching is a matching (a set of pairwise non-adjacent edges) that covers every vertex of the graph. Consequently, GBS can accelerate heuristic classical algorithms that rely on subgraph sampling as a subroutine, including a stochastic algorithm for the densest $k$-subgraph problem <cite data-footcite=\"graphgbs\"></cite>:\n",
    "given a graph $G$ and $k < |G|$, find a subgraph of $k$ vertices with the largest density!\n",
    "This problem has a natural connection to clustering problems with the goal of finding highly correlated subsets of data, having a wide range of applications.\n",
    "However, this problem is NP-hard <cite data-footcite=\"dense_subgraph\"></cite>, and acquiring an approximation for the densest $k$-subgraph is also believed to be inefficient <cite data-footcite=\"bhaskara2010detectinghighlogdensities\"></cite>. One alternative approach is to make use of stochastic algorithms to provide dense $k$-subgraphs."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "883c5b22",
   "metadata": {},
   "source": [
    "We will introduce a hybrid stochastic algorithm, which uses GBS to provide guesses for dense subgraphs, originally described in Ref. <cite data-footcite=\"graphgbs\"></cite>.\n",
    "In Gaussian Boson Sampling, the probability of observing a given photon configuration is proportional to the number of perfect matchings in the corresponding subgraph.\n",
    "Also, intuitively, a graph with many perfect matchings is expected to contain many edges. The idea has been studied rigorously in Ref. <cite data-footcite=\"aaghabali2014upperboundsnumberperfect\"></cite>.\n",
    "\n",
    "We can embed the adjacency matrix in the GBS circuit, and we can obtain samples corresponding to subgraphs with a high number of perfect matchings. However, we can also exploit the obtained samples, i.e., we can provide an enhanced strategy that tweaks the samples obtained from GBS, possibly increasing the density. This algorithm can be used within an optimization algorithm searching for dense subgraphs, e.g., in a simulated annealing algorithm <cite data-footcite=\"simulated_annealing\"></cite>. It is important to mention that from the resulting samples, we have to postselect the samples that correspond to $k$-subgraphs, i.e., samples that only contain $0$s and $1$s (called collision-free samples), and the number of $1$s is exactly $k$. In such a sample, the indices where the $1$s appear correspond to the indices of the vertices in the sampled subgraph."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7d92f38c",
   "metadata": {},
   "source": [
    "### Implementation\n",
    "\n",
    "Let us first start with the core of the algorithm, the dense $k$-subgraph sampling with GBS. This can be done by using the [Graph](../instructions/gates.rst#piquasso.instructions.gates.Graph) instruction. We also use [`dask`](https://www.dask.org/) (an open source Python library for parallel computing) with the `use_dask=True` configuration variable, so make sure to have it installed, or just set `use_dask=False`. We run the algorithm for a random Erdős-Rényi graph with 20 edges with $p=0.5$ probability for edge creation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0c3ff26e",
   "metadata": {},
   "outputs": [],
   "source": [
    "import networkx as nx\n",
    "import numpy as np\n",
    "import random\n",
    "import piquasso as pq\n",
    "\n",
    "from functools import partial\n",
    "\n",
    "\n",
    "# networkx object for representing the graph\n",
    "MAIN_GRAPH = nx.erdos_renyi_graph(20, p=0.5)\n",
    "\n",
    "k = 8  # The number of vertices in the dense subgraph\n",
    "shots = 50  # Number of samples to be taken from the GBS distribution\n",
    "\n",
    "\n",
    "def is_collision_free_length_k(nodes, k):\n",
    "    return len(nodes) == len(np.unique(nodes)) and len(nodes) == k\n",
    "\n",
    "\n",
    "def postselect(subgraph_nodes, k):\n",
    "    return list(filter(partial(is_collision_free_length_k, k=k), subgraph_nodes))\n",
    "\n",
    "\n",
    "def generate_dense_subgraph(graph, k):\n",
    "    \"\"\"Generates a dense k-subgraph using the GBS distribution.\n",
    "\n",
    "    Args:\n",
    "        graph (networkx.Graph): The graph to be embedded into the GBS circuit.\n",
    "        k (int): Number of vertices in the sampled subgraph.\n",
    "\n",
    "    Returns:\n",
    "        networkx.Graph: A dense k-subgraph.\n",
    "    \"\"\"\n",
    "    adjacency_matrix = nx.adjacency_matrix(graph).toarray()\n",
    "\n",
    "    with pq.Program() as program:\n",
    "        pq.Q() | pq.Vacuum()\n",
    "        pq.Q() | pq.Graph(adjacency_matrix, mean_photon_number=k / len(graph))\n",
    "        pq.Q() | pq.ParticleNumberMeasurement()\n",
    "\n",
    "    simulator = pq.GaussianSimulator(d=len(graph), config=pq.Config(use_dask=True))\n",
    "\n",
    "    result = simulator.execute(program, shots=shots)\n",
    "\n",
    "    dense_subgraph_nodes = result.to_subgraph_nodes()\n",
    "\n",
    "    filtered_dense_subgraph_nodes = postselect(dense_subgraph_nodes, k)\n",
    "\n",
    "    if len(filtered_dense_subgraph_nodes) == 0:\n",
    "        # No dense subgraph is found, call again!\n",
    "        return generate_dense_subgraph(graph, k)\n",
    "\n",
    "    randomly_chosen_nodes = random.choice(filtered_dense_subgraph_nodes)\n",
    "\n",
    "    return graph.subgraph(np.array(graph.nodes)[randomly_chosen_nodes])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "69a3cacf",
   "metadata": {},
   "source": [
    "We can use this function to get a $k$-subgraph corresponding to a high hafnian value, implying high density:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "67283b9e",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Subgraph nodes: [5, 6, 7, 9, 11, 14, 17, 19]\n",
      "Subgraph density: 0.6428571428571429\n"
     ]
    }
   ],
   "source": [
    "subgraph = generate_dense_subgraph(MAIN_GRAPH, k)\n",
    "\n",
    "print(\"Subgraph nodes:\", subgraph.nodes)\n",
    "print(\"Subgraph density:\", nx.density(subgraph))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4dc16c34",
   "metadata": {},
   "source": [
    "Now, we aim to tweak this subgraph, using the GBS-Tweak algorithm described in Ref. <cite data-footcite=\"graphgbs\"></cite>:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d7394fb6",
   "metadata": {},
   "outputs": [],
   "source": [
    "def try_generate_tweaked_subgraph(subgraph, l):\n",
    "    \"\"\"\n",
    "    Tries to tweak the subgraph by using the GBS distribution.\n",
    "\n",
    "    Note: the resulting subgraph might be smaller than the original.\n",
    "\n",
    "    Args:\n",
    "        subgraph (networkx.Graph): The initial subgraph.\n",
    "        l (int): Number of vertices to keep.\n",
    "\n",
    "    Returns:\n",
    "        networkx.Graph: The tweaked subgraph.\n",
    "    \"\"\"\n",
    "    k = len(subgraph)\n",
    "\n",
    "    R = generate_dense_subgraph(subgraph, l)  # l-subgraph of the original k-subgraph\n",
    "    T = generate_dense_subgraph(MAIN_GRAPH, k - l)  # k-l-subgraph of the original graph\n",
    "\n",
    "    # Adding m extra nodes randomly\n",
    "    m = np.random.randint(0, k - l)\n",
    "\n",
    "    # Add m nodes to R randomly from the original subgraph\n",
    "    subgraph_minus_R = subgraph.copy()\n",
    "    subgraph_minus_R.remove_nodes_from(R)\n",
    "    extra_nodes = np.random.choice(subgraph_minus_R.nodes, size=m, replace=False)\n",
    "    R_with_extra_nodes = R.copy()\n",
    "    R_with_extra_nodes.add_nodes_from(extra_nodes)\n",
    "\n",
    "    # Remove m nodes randomly from T\n",
    "    reject_nodes = np.random.choice(T.nodes, size=m, replace=False)\n",
    "    T_with_rejected_nodes = T.copy()\n",
    "    T_with_rejected_nodes.remove_nodes_from(reject_nodes)\n",
    "\n",
    "    nodes_to_keep = R_with_extra_nodes.nodes\n",
    "    nodes_to_add = T_with_rejected_nodes.nodes\n",
    "\n",
    "    tweaked_subgraph_nodes = np.concatenate([nodes_to_keep, nodes_to_add])\n",
    "\n",
    "    return MAIN_GRAPH.subgraph(tweaked_subgraph_nodes)\n",
    "\n",
    "\n",
    "def gbs_tweak(subgraph, l):\n",
    "    \"\"\"An implementation for the GBS-Tweak algorithm.\n",
    "\n",
    "    Args:\n",
    "        subgraph (networkx.Graph): The subgraph to be tweaked.\n",
    "        l (int): The number of vertices to keep.\n",
    "\n",
    "    Returns:\n",
    "        networkx.Graph: The tweaked subgraph.\n",
    "    \"\"\"\n",
    "\n",
    "    tweaked_subgraph = try_generate_tweaked_subgraph(subgraph, l)\n",
    "    while len(tweaked_subgraph) != len(subgraph):\n",
    "        # Try generating another if the resulting subgraph is smaller than the original.\n",
    "        tweaked_subgraph = try_generate_tweaked_subgraph(subgraph, l)\n",
    "\n",
    "    return tweaked_subgraph"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eec719e8",
   "metadata": {},
   "source": [
    "A parameter of this algorithm is $l$, the number of vertices to be kept from the original subgraph. After fixing this, we can generate a new subgraph:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 187,
   "id": "42ce3348",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Original subgraph: Graph with 8 nodes and 18 edges\n",
      "New subgraph: Graph with 8 nodes and 21 edges\n"
     ]
    }
   ],
   "source": [
    "l = 4  # Number of vertices to be kept in the GBS-tweak algorithm\n",
    "\n",
    "new_subgraph = gbs_tweak(subgraph, l)\n",
    "\n",
    "print(\"Original subgraph:\", subgraph)\n",
    "print(\"New subgraph:\", new_subgraph)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9a009748",
   "metadata": {},
   "source": [
    "Now, we can implement a simple simulated annealing optimization for finding $k$-dense subgraphs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9a433120",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0. density=0.6071428571428571\n",
      "1. density=0.5714285714285714\n",
      "2. density=0.5357142857142857\n",
      "3. density=0.5357142857142857\n",
      "4. density=0.6071428571428571\n",
      "5. density=0.6071428571428571\n",
      "6. density=0.6071428571428571\n",
      "7. density=0.5714285714285714\n",
      "8. density=0.5357142857142857\n",
      "9. density=0.6428571428571429\n",
      "10. density=0.39285714285714285\n",
      "11. density=0.5\n",
      "12. density=0.5714285714285714\n",
      "13. density=0.5714285714285714\n",
      "14. density=0.4642857142857143\n",
      "15. density=0.5\n",
      "16. density=0.4642857142857143\n",
      "17. density=0.5357142857142857\n",
      "18. density=0.5714285714285714\n",
      "19. density=0.5357142857142857\n",
      "20. density=0.5\n",
      "21. density=0.5\n",
      "22. density=0.42857142857142855\n",
      "23. density=0.42857142857142855\n",
      "24. density=0.42857142857142855\n",
      "25. density=0.5\n",
      "26. density=0.5\n",
      "27. density=0.5714285714285714\n",
      "28. density=0.5357142857142857\n",
      "29. density=0.39285714285714285\n",
      "30. density=0.4642857142857143\n",
      "31. density=0.6428571428571429\n",
      "32. density=0.5714285714285714\n",
      "33. density=0.5357142857142857\n",
      "34. density=0.5357142857142857\n",
      "35. density=0.4642857142857143\n",
      "36. density=0.6071428571428571\n",
      "37. density=0.5\n",
      "38. density=0.6785714285714286\n",
      "39. density=0.6785714285714286\n",
      "40. density=0.5357142857142857\n",
      "41. density=0.5357142857142857\n",
      "42. density=0.7857142857142857\n",
      "43. density=0.7142857142857143\n",
      "44. density=0.7142857142857143\n",
      "45. density=0.7142857142857143\n",
      "46. density=0.6428571428571429\n",
      "47. density=0.6428571428571429\n",
      "48. density=0.6428571428571429\n",
      "49. density=0.6428571428571429\n",
      "50. density=0.6428571428571429\n",
      "51. density=0.6428571428571429\n",
      "52. density=0.6785714285714286\n",
      "53. density=0.6428571428571429\n",
      "54. density=0.7142857142857143\n",
      "55. density=0.7142857142857143\n",
      "56. density=0.7142857142857143\n",
      "57. density=0.7142857142857143\n",
      "58. density=0.7142857142857143\n",
      "59. density=0.7142857142857143\n",
      "60. density=0.7142857142857143\n",
      "61. density=0.7142857142857143\n",
      "62. density=0.7142857142857143\n",
      "63. density=0.7142857142857143\n",
      "64. density=0.7142857142857143\n",
      "65. density=0.7142857142857143\n",
      "66. density=0.7142857142857143\n",
      "67. density=0.7142857142857143\n",
      "68. density=0.7142857142857143\n",
      "69. density=0.7142857142857143\n",
      "70. density=0.7142857142857143\n",
      "71. density=0.7142857142857143\n",
      "72. density=0.7142857142857143\n",
      "73. density=0.7142857142857143\n",
      "74. density=0.7142857142857143\n",
      "75. density=0.7142857142857143\n",
      "76. density=0.7857142857142857\n",
      "77. density=0.7857142857142857\n",
      "78. density=0.7857142857142857\n",
      "79. density=0.7857142857142857\n",
      "80. density=0.7857142857142857\n",
      "81. density=0.7857142857142857\n",
      "82. density=0.7857142857142857\n",
      "83. density=0.7857142857142857\n",
      "84. density=0.7857142857142857\n",
      "85. density=0.7857142857142857\n",
      "86. density=0.7857142857142857\n",
      "87. density=0.7857142857142857\n",
      "88. density=0.7857142857142857\n",
      "89. density=0.7857142857142857\n",
      "90. density=0.7857142857142857\n",
      "91. density=0.7857142857142857\n",
      "92. density=0.7857142857142857\n",
      "93. density=0.7857142857142857\n",
      "94. density=0.7857142857142857\n",
      "95. density=0.7857142857142857\n",
      "96. density=0.7857142857142857\n",
      "97. density=0.7857142857142857\n",
      "98. density=0.7857142857142857\n",
      "99. density=0.7857142857142857\n",
      "Solution: Graph with 8 nodes and 22 edges\n",
      "Density: 0.7857142857142857\n"
     ]
    }
   ],
   "source": [
    "initial_temperature = 100  # Initial temperature for simulated annealing\n",
    "alpha = 0.85  # Parameter for geometric cooling\n",
    "ITER = 100  # Number of iterations\n",
    "\n",
    "subgraph = generate_dense_subgraph(MAIN_GRAPH, k)\n",
    "\n",
    "best_subgraph_so_far = subgraph\n",
    "best_density_so_far = nx.density(subgraph)\n",
    "\n",
    "T = initial_temperature\n",
    "\n",
    "for i in range(ITER):\n",
    "    density = nx.density(subgraph)\n",
    "    print(f\"{i}. density={density}\")\n",
    "    new_subgraph = gbs_tweak(subgraph, l)  # Tweak the previous subgraph\n",
    "    new_density = nx.density(new_subgraph)\n",
    "\n",
    "    if new_density > density:\n",
    "        # If the new density is higher, accept the new subgraph\n",
    "        subgraph = new_subgraph\n",
    "    else:\n",
    "        # Otherwise, calculate an temperature-dependent acceptance probability\n",
    "        probability = np.exp((new_density - density) / T)\n",
    "\n",
    "        if np.random.rand() < probability:\n",
    "            # Accept anyway with some probability\n",
    "            subgraph = new_subgraph\n",
    "\n",
    "    if new_density > best_density_so_far:\n",
    "        # If the new density is higher than any density observed so far, save this\n",
    "        # subgraph!\n",
    "        best_subgraph_so_far = new_subgraph\n",
    "        best_density_so_far = new_density\n",
    "\n",
    "    T *= alpha  # Decrease temperature to decrease acceptance probability\n",
    "\n",
    "print(\"Solution:\", best_subgraph_so_far)\n",
    "print(\"Density:\", best_density_so_far)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv13",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}