Note

You can download this notebook here.

Differentiating CVQNN layers with Piquasso and Tensorflow

Photonic variational quantum circuits (VQCs) are the analogs of traditional qubit-based VQCs adapted to photonic setting. A photonic VQC consists of the following:

  1. An initial state \(\ket{\psi(\mathbf{x})}\) (optionally) depending on some data \(\mathbf{x}\).

  2. A photonic quantum circuit \(\hat{U}(\mathbf{\theta})\), which is parametrized by a collection of free parameters \(\mathbf{\theta}\).

  3. A set of observables \(\{ \hat{O}_j \}_{j=1}^N\) to be measured.

In this setup, the expectation value of an observable \(\hat{O}_j\) is a function of the circuit parameters \(\mathbf{\theta}\), i.e.,

\[f_j(\mathbf{\theta}) := \bra{\psi_0} \hat{U}^\dagger(\mathbf{\theta}) \hat{O}_j \hat{U}(\mathbf{\theta}) \ket{\psi_0}.\]

Typically a loss function is a function of the expectation values of these observables (or estimates thereof), and the training of a VQC amounts to tuning the circuit parameters \(\mathbf{x}\) to minimize the loss function.

The increased success of solving difficult problems by deep learning in classical computing led to similar developments in variational quantum computing [1]. In particular, a family of photonic VQCs called continuous-variable quantum neural networks (CVQNNs) can be considered as analogs of deep neural networks [2].

The construction of CVQNNs is based on their classical counterpart. In classical deep learning, a fully connected neural network layer is often formulated as

\[\mathcal{L}(\mathbf{x}) = \varphi(W\mathbf{x} + \mathbf{b}),\]

where \(\mathbf{x} \in \mathbb{R}^n,\,\mathbf{b}\ \in \mathbb{R}^m\) are the input and the bias vectors, respectively, \(W \in \mathbb{R}^{m \times n}\) is the matrix containing the weights of the layer, and \(\varphi\) is a non-linear activation function, usually a sigmoid, ReLU or hyperbolic tangent function.

Building upon this, the CVQNN construction admits layers of linear optical gates, which can implement any symplectic affine transformations on the phase space [2]. Subsequently, a layer of nonlinear optical gates corresponds to an activation function. Hence, a single CVQNN layer depending on weights \(\mathbf{x}\) as gate parameters can be written as

\[\mathcal{L}(\mathbf{x}) := \Phi(\mathbf{x}_{\Phi})\, D(\mathbf{x}_{D})\, I_2(\mathbf{x}_{I_2}) \, S(\mathbf{x}_{S}) \, I_1(\mathbf{x}_{I_1}),\]

where \(\Phi\) represents a column of non-linear optical gates, \(I_i\) represent interferometers, and

\[\begin{split}D(\mathbf{\alpha}) = \bigotimes_{i=1}^d D_i(\alpha_i) \qquad (\mathbf{\alpha} \in \mathbb{C}^d), \\ S(\mathbf{z}) = \bigotimes_{i=1}^d S_i(z_i) \qquad (\mathbf{z} \in \mathbb{C}^d)\end{split}\]

are columns of Displacement and Squeezing gates, respectively. Usually, the \(\Phi\) is chosen to be a column of Kerr gates, i.e.,

\[\Phi(\mathbf{x}_\Phi) := K(\mathbf{\kappa}) = \bigotimes_{i=1}^d K_i(\kappa_i) \qquad (\kappa_i \in \mathbb{R}).\]

The parameters \(\mathbf{x}_{\Phi}, \mathbf{x}_{I_2}, \mathbf{x}_{I_1}\) are real-valued, whereas \(\mathbf{x}_{D}\) and \(\mathbf{x}_{S}\) are complex-valued. Interferometers can be further decomposed into phaseshifter and beamsplitter gates using the Clements decomposition which yields a parametrization in terms of phaseshifter and beamsplitter angles.

In parallel with the expression for a fully connected layer in the classical setting, one can also get a similar expression for a CVQNN layer as follows. Considering the effect of linear optical gates on the ladder operators \(\mathbf{\xi}\), the layer can be written as

\[\mathcal{L}^\dagger(\mathbf{x}) \mathbf{\xi} \mathcal{L}(\mathbf{x}) = K^\dagger( \mathbf{x}_{\text{non-linear}}) (S_{(c)} \mathbf{\xi} + \mathbf{\alpha} I) K( \mathbf{x}_{\text{non-linear}}),\]

where \(\mathbf{x} = (\mathbf{x}_{\text{non-linear}}, \mathbf{x}_{\text{linear}}) = (\mathbf{x}_{\text{non-linear}}, \mathbf{x}_{D}, \mathbf{x}_{I_2}, \mathbf{x}_{S}, \mathbf{x}_{I_1})\) and \(S_{(c)}, \mathbf{\alpha}\) depend on \(\mathbf{x}_{\text{linear}}\). Here, the term \(S_{(c)} \mathbf{\xi} + \mathbf{\alpha} I\) represents a symplectic affine transformation on the ladder operator \(\mathbf{\xi}\), and conjugation by the Kerr gates represent a non-linear transformation.

In Piquasso, one can easily create CVQNN circuits using the Piquasso CVQNN module.

[8]:
import piquasso as pq

d = 2  # Number of qumodes

layer_count = 10  # Number of CVQNN layers

# Generating random weights
weights = pq.cvqnn.generate_random_cvqnn_weights(layer_count=layer_count, d=d)

# Creating CVQNN layers as a Piquasso subprogram
cvqnn_layers = pq.cvqnn.create_layers(weights)

Piquasso automatically sets up a subprogram containing the instructions of the desired CVQNN layer. Now we can embed this subprogram in any Piquasso program. Let’s choose the input state as a pure displaced state.

[9]:
# The program definition
with pq.Program() as program:
    pq.Q() | pq.Vacuum()

    for i in range(d):
        pq.Q(i) | pq.Displacement(r=0.1)

    pq.Q() | cvqnn_layers

Now, we need to choose the simulator which executes the instructions. Since a CVQNN layer includes non-linear terms, we definitely need to perform the simulation in Fock space. Since our initial state is pure, we can use PureFockSimulator.

[10]:
cutoff = 5

simulator = pq.PureFockSimulator(d, pq.Config(cutoff=cutoff))

final_state = simulator.execute(program).state

After obtaining the state, we can calculate several things, e.g. the expectation value of the position operator on mode 0.

[11]:
print("Mean position on mode 0:")
final_state.mean_position(mode=0)
Mean position on mode 0:
[11]:
0.08810452589019563

In order to differentiate quantitities, we need to modify the simulation. In itself, PureFockSimulator is unable to perform automatic differentiation. In order to do that, we can use TensorflowConnector, which replaces NumPy to TensorFlow under the hood. For a concrete example, let the loss function be

\[J(w) = \| \ket{\psi(w)} - \ket{\psi_*} \|_2,\]

where \(\ket{\psi(w)}\) is the final state of the circuit, and \(\ket{\psi_*}\) is some random final state.

[12]:
import numpy as np
from scipy.special import comb

state_vector_size = comb(d + cutoff - 1, cutoff - 1, exact=True)

psi_star = np.random.rand(state_vector_size) + 1j * np.random.rand(state_vector_size)

psi_star /= np.sum(np.abs(psi_star))

Then, by using tf.GradientTape, we can differentiate this loss function.

[13]:
import tensorflow as tf

tf.get_logger().setLevel("ERROR")  # Turns off complex->float casting warnings

simulator = pq.PureFockSimulator(
    d, pq.Config(cutoff=cutoff), connector=pq.TensorflowConnector()
)

w = tf.Variable(weights)
psi_star = tf.Variable(psi_star)

with tf.GradientTape() as tape:
    cvqnn_layers = pq.cvqnn.create_layers(w)

    with pq.Program() as program:
        pq.Q() | pq.Vacuum()

        for i in range(d):
            pq.Q(i) | pq.Displacement(r=0.1)

        pq.Q() | cvqnn_layers

    simulator.execute(program)

    final_state = simulator.execute(program).state

    psi = final_state.state_vector

    loss = tf.math.reduce_sum(tf.math.abs(psi - psi_star))

loss_grad = tape.gradient(loss, w)

print(f"Loss: {loss.numpy()}")
print(f"Loss gradient: {loss_grad.numpy()}")
Loss: 1.9604476480647497
Loss gradient: [[ 2.48370504e-02  6.91253512e-03 -5.17538228e-02  5.91197083e-01
   8.35270137e-01  4.33414809e-02 -1.44455276e-03 -6.10485008e-02
   6.30518617e-01  7.87589392e-01  4.15688561e-04  6.45192857e-03
  -7.05893753e-02 -1.81552492e-02]
 [ 5.51070724e-02  1.14482055e-02 -7.20810177e-02  8.70276885e-01
   6.50958369e-01  4.79129674e-02 -1.05355280e-02 -6.39677085e-02
   7.03435046e-01  7.72832475e-01  6.62571242e-03  1.03948801e-03
  -7.14190194e-02 -1.32908739e-02]
 [ 5.91775056e-02  4.75369308e-03 -6.20956892e-02  8.06572060e-01
   7.06752388e-01  4.33605125e-02 -4.52911124e-04 -6.26476669e-02
   7.98818963e-01  7.21751009e-01  1.01077259e-02 -2.26709094e-03
  -6.75334937e-02 -1.38186903e-02]
 [ 4.01203598e-02  1.99768691e-03 -5.45376280e-02  8.81982242e-01
   6.53387621e-01  5.93868412e-03 -6.30429008e-04 -5.38798294e-02
   9.05422584e-01  7.15949040e-01 -3.64702223e-04 -5.77410545e-05
  -6.82197114e-02  1.74116637e-03]
 [ 1.07959822e-03  1.73058480e-04 -5.44175901e-02  8.66129564e-01
   6.70398558e-01  1.66385446e-02  4.30945595e-04 -5.41922390e-02
   9.00744709e-01  7.08804040e-01  4.50634059e-03  5.91755415e-04
  -6.23273595e-02  5.69621412e-03]
 [ 4.21102694e-03  3.93742722e-03 -5.36233257e-02  1.03238139e+00
   5.26661099e-01 -1.52360597e-03 -3.54072884e-03 -4.91789820e-02
   9.16184120e-01  6.90672795e-01  1.16414634e-03 -2.31269771e-03
  -5.94908198e-02  9.16505767e-03]
 [-2.04315766e-02  5.76912708e-04 -4.85917483e-02  8.56742917e-01
   6.53319067e-01 -1.99088351e-02  8.87190361e-04 -4.30689726e-02
   9.27679869e-01  6.49967198e-01  1.48523330e-04 -1.12318622e-03
  -4.57634914e-02  1.02198966e-02]
 [-2.21043277e-02  1.45639573e-03 -4.43768450e-02  9.80622059e-01
   5.40051984e-01 -4.77245102e-02  2.81496583e-03 -6.23762968e-02
   1.05210311e+00  4.76482110e-01 -8.71547209e-04  6.93330613e-05
  -8.64155789e-02  8.37255149e-03]
 [-4.67122038e-02 -3.90292564e-05 -6.32088147e-02  1.12961323e+00
   4.26270074e-01 -2.25618686e-02 -5.57393577e-03 -5.11763569e-02
   9.44769422e-01  6.50075251e-01 -1.20220424e-03 -2.19911083e-03
  -6.92235563e-02  6.31057580e-03]
 [-3.85113231e-02 -1.01100459e-03 -5.13675566e-02  7.89344621e-01
   6.40349720e-01 -3.04770202e-02 -4.07158955e-03 -4.26573559e-02
   8.30388031e-01  7.82317691e-01  2.13340682e-03  1.06722282e-03
  -4.83288207e-02  5.57853420e-03]]

However, Piquasso is written in a way that it supports tf.function (see Better performance with tf.function) one can also use tf.function for this task. Refactoring everything into a function, we can use the tf.function decorator. Note, that we have to avoid side effects in any function decorated with tf.function, because side effects are only executed at the tracing step. Therefore, instantiation of pq.Program should happen by providing the instructions as constructor arguments, instead of using the with keyword.

[14]:
def calculate_loss(w, psi_star, cutoff):
    d = pq.cvqnn.get_number_of_modes(w.shape[1])

    simulator = pq.PureFockSimulator(
        d,
        pq.Config(cutoff=cutoff),
        connector=pq.TensorflowConnector(decorate_with=tf.function),
    )

    with tf.GradientTape() as tape:
        cvqnn_layers = pq.cvqnn.create_layers(w)

        final_state = simulator.execute_instructions(
            instructions=[pq.Vacuum()] + cvqnn_layers.instructions
        ).state

        psi = final_state.state_vector

        loss = tf.math.reduce_sum(tf.math.abs(psi - psi_star))

    return loss, tape.gradient(loss, w)


improved_calculate_loss = tf.function(calculate_loss)

loss, loss_grad = improved_calculate_loss(w, psi_star, cutoff)

print(f"Loss: {loss.numpy()}")
print(f"Loss gradient: {loss_grad.numpy()}")
Loss: 2.018569979174099
Loss gradient: [[ 0.00000000e+00  0.00000000e+00  0.00000000e+00  5.75160857e-01
   6.83322458e-01  1.28450675e-02 -4.00008041e-05 -6.30548348e-03
  -7.49845629e-01 -8.00313533e-01  7.09929231e-05  1.03634563e-02
  -1.18960333e-02  1.04187002e-02]
 [-1.35800236e-03 -5.49585214e-04 -5.68490534e-03  7.68436902e-01
   5.05490527e-01  6.69074565e-03  6.61089567e-04 -5.98396524e-03
  -7.70615878e-01 -7.65752853e-01  2.44147270e-03  2.03468039e-03
  -4.38080826e-03  1.26999161e-02]
 [ 1.33582634e-02 -1.68655402e-03 -1.85593853e-03  7.07517913e-01
   5.44470265e-01  2.49539746e-02  6.42390173e-05 -1.95025626e-03
  -8.25024061e-01 -6.44185279e-01  6.53653384e-03 -7.65527162e-03
   9.10541653e-03  3.40278572e-03]
 [ 4.26398634e-02 -1.80787941e-03  6.39415698e-03  7.10539771e-01
   5.16317806e-01  4.19027819e-02  3.11151157e-04  6.34437434e-03
  -7.11899830e-01 -5.86043987e-01 -1.78246076e-03 -3.32723452e-04
   1.04650909e-02  4.07298677e-03]
 [ 4.29666003e-02 -5.13803858e-05  4.61329396e-03  6.98424383e-01
   5.25555203e-01  3.78096485e-02 -3.26929949e-04  6.85206072e-03
  -7.86593321e-01 -5.59964717e-01  8.99000290e-03  4.50767474e-03
   2.76895247e-02  7.51784650e-03]
 [ 4.84763174e-02 -5.06656057e-03  2.09086242e-02  7.78957857e-01
   4.11735915e-01  4.47041965e-02  5.57595820e-03  1.66618932e-02
  -7.45897045e-01 -6.70781724e-01  4.18901129e-03 -5.51458453e-03
   3.53793166e-02  2.84834471e-03]
 [ 6.30724855e-02 -6.41540211e-04  2.14924447e-02  6.11241592e-01
   5.05390366e-01  6.89479439e-02 -1.41398304e-03  2.94560107e-02
  -8.62929995e-01 -6.09878053e-01  1.88758840e-04 -3.11469367e-03
   5.47604296e-02 -3.52522121e-03]
 [ 6.68569098e-02 -2.65481869e-03  3.22995882e-02  7.03713041e-01
   4.32114569e-01  4.74536273e-02 -3.67396656e-03  2.02959096e-02
  -8.42787956e-01 -4.25121040e-01 -3.99655314e-03  2.98835285e-04
   1.63926716e-02 -1.21762767e-02]
 [ 4.79882382e-02  6.12290682e-05  1.62381274e-02  8.37636878e-01
   3.50979911e-01  5.30524902e-02  7.52035617e-03  1.63628512e-02
  -8.23920872e-01 -6.25385991e-01 -2.62846126e-03 -5.16820365e-03
   2.70939337e-02 -7.64525657e-03]
 [ 6.78217682e-02  5.96498824e-04  1.31378911e-02  5.59849745e-01
   5.20217392e-01  7.60063612e-02  2.26111302e-03  1.50965198e-02
  -4.33486232e-01 -6.76081337e-01 -5.44203449e-03  4.00677625e-03
   2.97522697e-02  1.53461233e-03]]

The first run is called the tracing step, and it takes some time, because Tensorflow captures a tf.Graph here. The size of the graph can be decreased by passing the decorate_with=tf.function argument to pq.TensorflowConnector, which also decreases the execution time of the tracing step. After the first run, a significant speedup is observed. We can also compare the runtimes of the compiled and non-compiled function.

[15]:
import time
import numpy as np

regular_runtimes = []
improved_runtimes = []

for i in range(10):
    w = tf.Variable(pq.cvqnn.generate_random_cvqnn_weights(layer_count, d))

    start_time = time.time()
    calculate_loss(w, psi_star, cutoff)
    end_time = time.time()

    regular_runtimes.append(end_time - start_time)

    start_time = time.time()
    improved_calculate_loss(w, psi_star, cutoff)
    end_time = time.time()

    improved_runtimes.append(end_time - start_time)

print(f"Regular: {np.mean(regular_runtimes)} s (+/- {np.std(regular_runtimes)} s).")
print(f"Improved: {np.mean(improved_runtimes)} s (+/- {np.std(improved_runtimes)} s).")
Regular: 2.4323306560516356 s (+/- 0.25203824522512575 s).
Improved: 0.011875104904174805 s (+/- 0.0002783859661023313 s).

But that is not everything yet! One can also create a similar function with the jit_compile=True flag, since every operation in Piquasso can be JIT-compiled using XLA through tf.function.

[16]:
jit_compiled_calculate_loss = tf.function(jit_compile=True)(calculate_loss)

jit_compiled_calculate_loss(w, psi_star, cutoff)
2024-03-02 09:40:10.970536: I external/local_xla/xla/service/service.cc:168] XLA service 0x23ff2f70 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2024-03-02 09:40:10.970561: I external/local_xla/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2024-03-02 09:40:14.366437: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2024-03-02 09:42:47.120088: E external/local_xla/xla/service/slow_operation_alarm.cc:65]
********************************
[Compiling module a_inference_calculate_loss_184206__XlaMustCompile_true_config_proto_3175580994766145631_executor_type_11160318154034397263_.64111] Very slow compile? If you want to file a bug, run with envvar XLA_FLAGS=--xla_dump_to=/tmp/foo and attach the results.
********************************
2024-03-02 09:45:40.915775: E external/local_xla/xla/service/slow_operation_alarm.cc:133] The operation took 4m53.795791141s

********************************
[Compiling module a_inference_calculate_loss_184206__XlaMustCompile_true_config_proto_3175580994766145631_executor_type_11160318154034397263_.64111] Very slow compile? If you want to file a bug, run with envvar XLA_FLAGS=--xla_dump_to=/tmp/foo and attach the results.
********************************
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1709394340.917125   34412 device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
2024-03-02 09:45:41.016959: E external/local_xla/xla/stream_executor/stream_executor_internal.h:177] SetPriority unimplemented for this stream.
[16]:
(<tf.Tensor: shape=(), dtype=float64, numpy=1.9352979440546>,
 <tf.Tensor: shape=(10, 14), dtype=float64, numpy=
 array([[ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
          2.82974314e-01,  1.00761968e+00, -4.86587480e-04,
          2.35804220e-04, -1.15113732e-04, -4.06551101e-01,
         -1.11506339e+00, -3.28812640e-04,  7.26509004e-03,
         -5.50407493e-04,  1.95176250e-02],
        [-1.27337958e-02, -4.32230679e-04, -1.16956930e-05,
          2.97284980e-01,  9.58032369e-01,  7.80146796e-03,
         -1.15800353e-03, -5.29595553e-04, -6.03508410e-01,
         -1.00960026e+00, -6.00128740e-05, -5.43138082e-03,
          1.13088525e-03, -6.98910091e-03],
        [ 1.50953348e-02,  1.06793890e-04, -6.96402317e-04,
          2.78072391e-01,  8.89425807e-01,  1.65284781e-02,
         -3.26193983e-04, -7.32702009e-04, -4.35469885e-01,
         -1.08220389e+00, -2.19933132e-03, -2.47865138e-03,
         -2.93182299e-03, -1.68765726e-02],
        [ 1.09036764e-02, -1.39943937e-04, -2.79208939e-03,
          2.70519193e-01,  9.50698120e-01,  1.51859239e-02,
         -1.67403741e-04, -2.13834584e-03, -4.22355256e-01,
         -1.12201492e+00,  5.67571939e-04,  1.58293338e-02,
         -2.39030263e-03, -5.86237362e-03],
        [ 8.30080898e-03, -1.82451373e-04, -1.38832253e-03,
          1.76303654e-01,  9.95779073e-01,  4.75550916e-03,
          6.22530734e-03, -8.94953191e-03, -4.22817330e-01,
         -1.02663235e+00,  4.82082440e-03, -1.31465644e-02,
         -1.01452761e-02, -5.70510595e-02],
        [ 2.50325193e-02,  6.29469112e-03, -1.04233986e-02,
          3.23611952e-01,  8.98244843e-01,  3.23882176e-02,
         -2.04379300e-03,  3.59544328e-03, -4.59367176e-01,
         -1.02342603e+00, -2.11534639e-03, -1.15771764e-03,
          3.74358915e-03, -3.37975680e-02],
        [ 2.35414600e-02, -1.02818825e-03,  2.50828514e-03,
          1.33153804e-01,  1.02103626e+00,  2.06230600e-02,
          1.96143023e-03, -5.50583838e-03, -3.60952140e-01,
         -1.03240699e+00, -4.49581492e-04,  8.07466573e-03,
         -1.35877540e-02, -2.15258038e-02],
        [ 1.43627038e-02,  1.12035672e-03, -7.07577659e-03,
          8.26398345e-02,  9.67306681e-01,  1.21847068e-02,
          3.93848334e-03, -1.30241315e-02, -5.96429663e-01,
         -9.72524487e-01,  4.51563428e-03, -2.90174828e-03,
         -2.43580857e-02, -1.92727297e-02],
        [ 3.10054604e-02,  2.81632895e-04, -8.79013015e-03,
          2.34154711e-01,  8.87158522e-01,  2.62144967e-02,
          1.71721715e-03, -3.41037966e-03, -7.37362510e-01,
         -8.97797687e-01,  2.97914053e-03,  4.39680362e-03,
         -1.40056766e-02, -1.96915555e-03],
        [ 2.65714043e-02,  8.56081649e-07, -4.32095216e-04,
          3.61163933e-01,  7.91136183e-01,  2.11602898e-02,
          1.01903539e-03, -1.55889975e-02, -8.24757241e-01,
         -8.52458879e-01,  2.70664645e-03,  2.46375745e-03,
         -4.45933265e-02, -1.36795389e-04]])>)

Compiling the same function takes significantly more time, but the compilation step results in an extra order of magnitude runtime improvement:

[17]:
jit_compiled_runtimes = []

for i in range(10):
    w = tf.Variable(pq.cvqnn.generate_random_cvqnn_weights(layer_count, d))

    start_time = time.time()
    jit_compiled_calculate_loss(w, psi_star, cutoff)
    end_time = time.time()

    jit_compiled_runtimes.append(end_time - start_time)

print(f"Regular:\t{np.mean(regular_runtimes)} s (+/- {np.std(regular_runtimes)} s).")
print(f"Improved:\t{np.mean(improved_runtimes)} s (+/- {np.std(improved_runtimes)} s).")
print(
    f"JIT compiled:\t{np.mean(jit_compiled_runtimes)} s "
    f"(+/- {np.std(jit_compiled_runtimes)} s)."
)
Regular:        2.4323306560516356 s (+/- 0.25203824522512575 s).
Improved:       0.011875104904174805 s (+/- 0.0002783859661023313 s).
JIT compiled:   0.0013802051544189453 s (+/- 0.00014840260989162237 s).

We are able to use this function for. e.g., quantum state learning. Consider the state

\[\ket{\psi_*} = \frac{1}{\sqrt{2}} \left ( \ket{03} + \ket{30} \right ),\]

which is an example of a so-called NOON state. We can produce this using Piquasso:

[18]:
with pq.Program() as target_state_preparation:
    pq.Q(all) | pq.StateVector([0, 3]) / np.sqrt(2)
    pq.Q(all) | pq.StateVector([3, 0]) / np.sqrt(2)


target_state = simulator.execute(target_state_preparation).state

target_state_vector = target_state.state_vector

psi_star = tf.Variable(target_state_vector)

Now we can demonstrate the speed of the JIT-compiled calculation by creating a simple optimization algorithm as follows:

[19]:
learning_rate = 0.01
iterations = 10000

optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)

w = tf.Variable(pq.cvqnn.generate_random_cvqnn_weights(layer_count, d))


for i in range(iterations):
    loss, loss_grad = jit_compiled_calculate_loss(w, psi_star, cutoff)

    optimizer.apply_gradients(zip([loss_grad], [w]))

    if (i + 1) % (iterations // 20) == 0:
        print(f"{i + 1}:\t\t", loss.numpy())

print("Final loss:\t", loss.numpy())
500:             0.8607086996549679
1000:            0.8473090910243798
1500:            0.5849377384183669
2000:            0.40879991260231896
2500:            0.29813437623431027
3000:            0.2761913995731654
3500:            0.26621822690380204
4000:            0.23435492471006458
4500:            0.23874428115482763
5000:            0.2205772331178741
5500:            0.2416911247101965
6000:            0.2146732713315681
6500:            0.1993502777279203
7000:            0.22721179422812682
7500:            0.21320290731217184
8000:            0.20555180951647123
8500:            0.21018212032348688
9000:            0.20439647783148554
9500:            0.19759860554592923
10000:           0.18922056395791387
Final loss:      0.18922056395791387

We can use the final weights to calculate the final state, and calculate its fidelity with the target state.

[20]:
program = pq.cvqnn.create_program(w)

final_state = simulator.execute(program).state

print("Final state fidelity: ", final_state.fidelity(target_state))
Final state fidelity:  0.9994946525680765