Note
Differentiating CVQNN layers with Piquasso and Tensorflow¶
Photonic variational quantum circuits (VQCs) are the analogs of traditional qubit-based VQCs adapted to photonic setting. A photonic VQC consists of the following:
An initial state \(\ket{\psi(\mathbf{x})}\) (optionally) depending on some data \(\mathbf{x}\).
A photonic quantum circuit \(\hat{U}(\mathbf{\theta})\), which is parametrized by a collection of free parameters \(\mathbf{\theta}\).
A set of observables \(\{ \hat{O}_j \}_{j=1}^N\) to be measured.
In this setup, the expectation value of an observable \(\hat{O}_j\) is a function of the circuit parameters \(\mathbf{\theta}\), i.e.,
Typically a loss function is a function of the expectation values of these observables (or estimates thereof), and the training of a VQC amounts to tuning the circuit parameters \(\mathbf{x}\) to minimize the loss function.
The increased success of solving difficult problems by deep learning in classical computing led to similar developments in variational quantum computing [1]. In particular, a family of photonic VQCs called continuous-variable quantum neural networks (CVQNNs) can be considered as analogs of deep neural networks [2].
The construction of CVQNNs is based on their classical counterpart. In classical deep learning, a fully connected neural network layer is often formulated as
where \(\mathbf{x} \in \mathbb{R}^n,\,\mathbf{b}\ \in \mathbb{R}^m\) are the input and the bias vectors, respectively, \(W \in \mathbb{R}^{m \times n}\) is the matrix containing the weights of the layer, and \(\varphi\) is a non-linear activation function, usually a sigmoid, ReLU or hyperbolic tangent function.
Building upon this, the CVQNN construction admits layers of linear optical gates, which can implement any symplectic affine transformations on the phase space [2]. Subsequently, a layer of nonlinear optical gates corresponds to an activation function. Hence, a single CVQNN layer depending on weights \(\mathbf{x}\) as gate parameters can be written as
where \(\Phi\) represents a column of non-linear optical gates, \(I_i\) represent interferometers, and
are columns of Displacement and Squeezing gates, respectively. Usually, the \(\Phi\) is chosen to be a column of Kerr gates, i.e.,
The parameters \(\mathbf{x}_{\Phi}, \mathbf{x}_{I_2}, \mathbf{x}_{I_1}\) are real-valued, whereas \(\mathbf{x}_{D}\) and \(\mathbf{x}_{S}\) are complex-valued. Interferometers can be further decomposed into phaseshifter and beamsplitter gates using the Clements decomposition which yields a parametrization in terms of phaseshifter and beamsplitter angles.
In parallel with the expression for a fully connected layer in the classical setting, one can also get a similar expression for a CVQNN layer as follows. Considering the effect of linear optical gates on the ladder operators \(\mathbf{\xi}\), the layer can be written as
where \(\mathbf{x} = (\mathbf{x}_{\text{non-linear}}, \mathbf{x}_{\text{linear}}) = (\mathbf{x}_{\text{non-linear}}, \mathbf{x}_{D}, \mathbf{x}_{I_2}, \mathbf{x}_{S}, \mathbf{x}_{I_1})\) and \(S_{(c)}, \mathbf{\alpha}\) depend on \(\mathbf{x}_{\text{linear}}\). Here, the term \(S_{(c)} \mathbf{\xi} + \mathbf{\alpha} I\) represents a symplectic affine transformation on the ladder operator \(\mathbf{\xi}\), and conjugation by the Kerr gates represent a non-linear transformation.
In Piquasso, one can easily create CVQNN circuits using the Piquasso CVQNN module.
[8]:
import piquasso as pq
d = 2 # Number of qumodes
layer_count = 10 # Number of CVQNN layers
# Generating random weights
weights = pq.cvqnn.generate_random_cvqnn_weights(layer_count=layer_count, d=d)
# Creating CVQNN layers as a Piquasso subprogram
cvqnn_layers = pq.cvqnn.create_layers(weights)
Piquasso automatically sets up a subprogram containing the instructions of the desired CVQNN layer. Now we can embed this subprogram in any Piquasso program. Let’s choose the input state as a pure displaced state.
[9]:
# The program definition
with pq.Program() as program:
pq.Q() | pq.Vacuum()
for i in range(d):
pq.Q(i) | pq.Displacement(r=0.1)
pq.Q() | cvqnn_layers
Now, we need to choose the simulator which executes the instructions. Since a CVQNN layer includes non-linear terms, we definitely need to perform the simulation in Fock space. Since our initial state is pure, we can use PureFockSimulator.
[10]:
cutoff = 5
simulator = pq.PureFockSimulator(d, pq.Config(cutoff=cutoff))
final_state = simulator.execute(program).state
After obtaining the state, we can calculate several things, e.g. the expectation value of the position operator on mode 0.
[11]:
print("Mean position on mode 0:")
final_state.mean_position(mode=0)
Mean position on mode 0:
[11]:
0.08810452589019563
In order to differentiate quantitities, we need to modify the simulation. In itself, PureFockSimulator is unable to perform automatic differentiation. In order to do that, we can use TensorflowConnector, which replaces NumPy to TensorFlow under the hood. For a concrete example, let the loss function be
where \(\ket{\psi(w)}\) is the final state of the circuit, and \(\ket{\psi_*}\) is some random final state.
[12]:
import numpy as np
from scipy.special import comb
state_vector_size = comb(d + cutoff - 1, cutoff - 1, exact=True)
psi_star = np.random.rand(state_vector_size) + 1j * np.random.rand(state_vector_size)
psi_star /= np.sum(np.abs(psi_star))
Then, by using tf.GradientTape, we can differentiate this loss function.
[13]:
import tensorflow as tf
tf.get_logger().setLevel("ERROR") # Turns off complex->float casting warnings
simulator = pq.PureFockSimulator(
d, pq.Config(cutoff=cutoff), connector=pq.TensorflowConnector()
)
w = tf.Variable(weights)
psi_star = tf.Variable(psi_star)
with tf.GradientTape() as tape:
cvqnn_layers = pq.cvqnn.create_layers(w)
with pq.Program() as program:
pq.Q() | pq.Vacuum()
for i in range(d):
pq.Q(i) | pq.Displacement(r=0.1)
pq.Q() | cvqnn_layers
simulator.execute(program)
final_state = simulator.execute(program).state
psi = final_state.state_vector
loss = tf.math.reduce_sum(tf.math.abs(psi - psi_star))
loss_grad = tape.gradient(loss, w)
print(f"Loss: {loss.numpy()}")
print(f"Loss gradient: {loss_grad.numpy()}")
Loss: 1.9604476480647497
Loss gradient: [[ 2.48370504e-02 6.91253512e-03 -5.17538228e-02 5.91197083e-01
8.35270137e-01 4.33414809e-02 -1.44455276e-03 -6.10485008e-02
6.30518617e-01 7.87589392e-01 4.15688561e-04 6.45192857e-03
-7.05893753e-02 -1.81552492e-02]
[ 5.51070724e-02 1.14482055e-02 -7.20810177e-02 8.70276885e-01
6.50958369e-01 4.79129674e-02 -1.05355280e-02 -6.39677085e-02
7.03435046e-01 7.72832475e-01 6.62571242e-03 1.03948801e-03
-7.14190194e-02 -1.32908739e-02]
[ 5.91775056e-02 4.75369308e-03 -6.20956892e-02 8.06572060e-01
7.06752388e-01 4.33605125e-02 -4.52911124e-04 -6.26476669e-02
7.98818963e-01 7.21751009e-01 1.01077259e-02 -2.26709094e-03
-6.75334937e-02 -1.38186903e-02]
[ 4.01203598e-02 1.99768691e-03 -5.45376280e-02 8.81982242e-01
6.53387621e-01 5.93868412e-03 -6.30429008e-04 -5.38798294e-02
9.05422584e-01 7.15949040e-01 -3.64702223e-04 -5.77410545e-05
-6.82197114e-02 1.74116637e-03]
[ 1.07959822e-03 1.73058480e-04 -5.44175901e-02 8.66129564e-01
6.70398558e-01 1.66385446e-02 4.30945595e-04 -5.41922390e-02
9.00744709e-01 7.08804040e-01 4.50634059e-03 5.91755415e-04
-6.23273595e-02 5.69621412e-03]
[ 4.21102694e-03 3.93742722e-03 -5.36233257e-02 1.03238139e+00
5.26661099e-01 -1.52360597e-03 -3.54072884e-03 -4.91789820e-02
9.16184120e-01 6.90672795e-01 1.16414634e-03 -2.31269771e-03
-5.94908198e-02 9.16505767e-03]
[-2.04315766e-02 5.76912708e-04 -4.85917483e-02 8.56742917e-01
6.53319067e-01 -1.99088351e-02 8.87190361e-04 -4.30689726e-02
9.27679869e-01 6.49967198e-01 1.48523330e-04 -1.12318622e-03
-4.57634914e-02 1.02198966e-02]
[-2.21043277e-02 1.45639573e-03 -4.43768450e-02 9.80622059e-01
5.40051984e-01 -4.77245102e-02 2.81496583e-03 -6.23762968e-02
1.05210311e+00 4.76482110e-01 -8.71547209e-04 6.93330613e-05
-8.64155789e-02 8.37255149e-03]
[-4.67122038e-02 -3.90292564e-05 -6.32088147e-02 1.12961323e+00
4.26270074e-01 -2.25618686e-02 -5.57393577e-03 -5.11763569e-02
9.44769422e-01 6.50075251e-01 -1.20220424e-03 -2.19911083e-03
-6.92235563e-02 6.31057580e-03]
[-3.85113231e-02 -1.01100459e-03 -5.13675566e-02 7.89344621e-01
6.40349720e-01 -3.04770202e-02 -4.07158955e-03 -4.26573559e-02
8.30388031e-01 7.82317691e-01 2.13340682e-03 1.06722282e-03
-4.83288207e-02 5.57853420e-03]]
However, Piquasso is written in a way that it supports tf.function (see Better performance with tf.function) one can also use tf.function for this task. Refactoring everything into a function, we can use the tf.function decorator. Note, that we have to avoid side effects in any function decorated with tf.function, because side effects are only executed at the tracing step. Therefore, instantiation of pq.Program should happen by
providing the instructions as constructor arguments, instead of using the with keyword.
[14]:
def calculate_loss(w, psi_star, cutoff):
d = pq.cvqnn.get_number_of_modes(w.shape[1])
simulator = pq.PureFockSimulator(
d,
pq.Config(cutoff=cutoff),
connector=pq.TensorflowConnector(decorate_with=tf.function),
)
with tf.GradientTape() as tape:
cvqnn_layers = pq.cvqnn.create_layers(w)
final_state = simulator.execute_instructions(
instructions=[pq.Vacuum()] + cvqnn_layers.instructions
).state
psi = final_state.state_vector
loss = tf.math.reduce_sum(tf.math.abs(psi - psi_star))
return loss, tape.gradient(loss, w)
improved_calculate_loss = tf.function(calculate_loss)
loss, loss_grad = improved_calculate_loss(w, psi_star, cutoff)
print(f"Loss: {loss.numpy()}")
print(f"Loss gradient: {loss_grad.numpy()}")
Loss: 2.018569979174099
Loss gradient: [[ 0.00000000e+00 0.00000000e+00 0.00000000e+00 5.75160857e-01
6.83322458e-01 1.28450675e-02 -4.00008041e-05 -6.30548348e-03
-7.49845629e-01 -8.00313533e-01 7.09929231e-05 1.03634563e-02
-1.18960333e-02 1.04187002e-02]
[-1.35800236e-03 -5.49585214e-04 -5.68490534e-03 7.68436902e-01
5.05490527e-01 6.69074565e-03 6.61089567e-04 -5.98396524e-03
-7.70615878e-01 -7.65752853e-01 2.44147270e-03 2.03468039e-03
-4.38080826e-03 1.26999161e-02]
[ 1.33582634e-02 -1.68655402e-03 -1.85593853e-03 7.07517913e-01
5.44470265e-01 2.49539746e-02 6.42390173e-05 -1.95025626e-03
-8.25024061e-01 -6.44185279e-01 6.53653384e-03 -7.65527162e-03
9.10541653e-03 3.40278572e-03]
[ 4.26398634e-02 -1.80787941e-03 6.39415698e-03 7.10539771e-01
5.16317806e-01 4.19027819e-02 3.11151157e-04 6.34437434e-03
-7.11899830e-01 -5.86043987e-01 -1.78246076e-03 -3.32723452e-04
1.04650909e-02 4.07298677e-03]
[ 4.29666003e-02 -5.13803858e-05 4.61329396e-03 6.98424383e-01
5.25555203e-01 3.78096485e-02 -3.26929949e-04 6.85206072e-03
-7.86593321e-01 -5.59964717e-01 8.99000290e-03 4.50767474e-03
2.76895247e-02 7.51784650e-03]
[ 4.84763174e-02 -5.06656057e-03 2.09086242e-02 7.78957857e-01
4.11735915e-01 4.47041965e-02 5.57595820e-03 1.66618932e-02
-7.45897045e-01 -6.70781724e-01 4.18901129e-03 -5.51458453e-03
3.53793166e-02 2.84834471e-03]
[ 6.30724855e-02 -6.41540211e-04 2.14924447e-02 6.11241592e-01
5.05390366e-01 6.89479439e-02 -1.41398304e-03 2.94560107e-02
-8.62929995e-01 -6.09878053e-01 1.88758840e-04 -3.11469367e-03
5.47604296e-02 -3.52522121e-03]
[ 6.68569098e-02 -2.65481869e-03 3.22995882e-02 7.03713041e-01
4.32114569e-01 4.74536273e-02 -3.67396656e-03 2.02959096e-02
-8.42787956e-01 -4.25121040e-01 -3.99655314e-03 2.98835285e-04
1.63926716e-02 -1.21762767e-02]
[ 4.79882382e-02 6.12290682e-05 1.62381274e-02 8.37636878e-01
3.50979911e-01 5.30524902e-02 7.52035617e-03 1.63628512e-02
-8.23920872e-01 -6.25385991e-01 -2.62846126e-03 -5.16820365e-03
2.70939337e-02 -7.64525657e-03]
[ 6.78217682e-02 5.96498824e-04 1.31378911e-02 5.59849745e-01
5.20217392e-01 7.60063612e-02 2.26111302e-03 1.50965198e-02
-4.33486232e-01 -6.76081337e-01 -5.44203449e-03 4.00677625e-03
2.97522697e-02 1.53461233e-03]]
The first run is called the tracing step, and it takes some time, because Tensorflow captures a tf.Graph here. The size of the graph can be decreased by passing the decorate_with=tf.function argument to pq.TensorflowConnector, which also decreases the execution time of the tracing step. After the first run, a significant speedup is observed. We can also compare the runtimes of the compiled and non-compiled function.
[15]:
import time
import numpy as np
regular_runtimes = []
improved_runtimes = []
for i in range(10):
w = tf.Variable(pq.cvqnn.generate_random_cvqnn_weights(layer_count, d))
start_time = time.time()
calculate_loss(w, psi_star, cutoff)
end_time = time.time()
regular_runtimes.append(end_time - start_time)
start_time = time.time()
improved_calculate_loss(w, psi_star, cutoff)
end_time = time.time()
improved_runtimes.append(end_time - start_time)
print(f"Regular: {np.mean(regular_runtimes)} s (+/- {np.std(regular_runtimes)} s).")
print(f"Improved: {np.mean(improved_runtimes)} s (+/- {np.std(improved_runtimes)} s).")
Regular: 2.4323306560516356 s (+/- 0.25203824522512575 s).
Improved: 0.011875104904174805 s (+/- 0.0002783859661023313 s).
But that is not everything yet! One can also create a similar function with the jit_compile=True flag, since every operation in Piquasso can be JIT-compiled using XLA through tf.function.
[16]:
jit_compiled_calculate_loss = tf.function(jit_compile=True)(calculate_loss)
jit_compiled_calculate_loss(w, psi_star, cutoff)
2024-03-02 09:40:10.970536: I external/local_xla/xla/service/service.cc:168] XLA service 0x23ff2f70 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2024-03-02 09:40:10.970561: I external/local_xla/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2024-03-02 09:40:14.366437: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2024-03-02 09:42:47.120088: E external/local_xla/xla/service/slow_operation_alarm.cc:65]
********************************
[Compiling module a_inference_calculate_loss_184206__XlaMustCompile_true_config_proto_3175580994766145631_executor_type_11160318154034397263_.64111] Very slow compile? If you want to file a bug, run with envvar XLA_FLAGS=--xla_dump_to=/tmp/foo and attach the results.
********************************
2024-03-02 09:45:40.915775: E external/local_xla/xla/service/slow_operation_alarm.cc:133] The operation took 4m53.795791141s
********************************
[Compiling module a_inference_calculate_loss_184206__XlaMustCompile_true_config_proto_3175580994766145631_executor_type_11160318154034397263_.64111] Very slow compile? If you want to file a bug, run with envvar XLA_FLAGS=--xla_dump_to=/tmp/foo and attach the results.
********************************
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1709394340.917125 34412 device_compiler.h:186] Compiled cluster using XLA! This line is logged at most once for the lifetime of the process.
2024-03-02 09:45:41.016959: E external/local_xla/xla/stream_executor/stream_executor_internal.h:177] SetPriority unimplemented for this stream.
[16]:
(<tf.Tensor: shape=(), dtype=float64, numpy=1.9352979440546>,
<tf.Tensor: shape=(10, 14), dtype=float64, numpy=
array([[ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
2.82974314e-01, 1.00761968e+00, -4.86587480e-04,
2.35804220e-04, -1.15113732e-04, -4.06551101e-01,
-1.11506339e+00, -3.28812640e-04, 7.26509004e-03,
-5.50407493e-04, 1.95176250e-02],
[-1.27337958e-02, -4.32230679e-04, -1.16956930e-05,
2.97284980e-01, 9.58032369e-01, 7.80146796e-03,
-1.15800353e-03, -5.29595553e-04, -6.03508410e-01,
-1.00960026e+00, -6.00128740e-05, -5.43138082e-03,
1.13088525e-03, -6.98910091e-03],
[ 1.50953348e-02, 1.06793890e-04, -6.96402317e-04,
2.78072391e-01, 8.89425807e-01, 1.65284781e-02,
-3.26193983e-04, -7.32702009e-04, -4.35469885e-01,
-1.08220389e+00, -2.19933132e-03, -2.47865138e-03,
-2.93182299e-03, -1.68765726e-02],
[ 1.09036764e-02, -1.39943937e-04, -2.79208939e-03,
2.70519193e-01, 9.50698120e-01, 1.51859239e-02,
-1.67403741e-04, -2.13834584e-03, -4.22355256e-01,
-1.12201492e+00, 5.67571939e-04, 1.58293338e-02,
-2.39030263e-03, -5.86237362e-03],
[ 8.30080898e-03, -1.82451373e-04, -1.38832253e-03,
1.76303654e-01, 9.95779073e-01, 4.75550916e-03,
6.22530734e-03, -8.94953191e-03, -4.22817330e-01,
-1.02663235e+00, 4.82082440e-03, -1.31465644e-02,
-1.01452761e-02, -5.70510595e-02],
[ 2.50325193e-02, 6.29469112e-03, -1.04233986e-02,
3.23611952e-01, 8.98244843e-01, 3.23882176e-02,
-2.04379300e-03, 3.59544328e-03, -4.59367176e-01,
-1.02342603e+00, -2.11534639e-03, -1.15771764e-03,
3.74358915e-03, -3.37975680e-02],
[ 2.35414600e-02, -1.02818825e-03, 2.50828514e-03,
1.33153804e-01, 1.02103626e+00, 2.06230600e-02,
1.96143023e-03, -5.50583838e-03, -3.60952140e-01,
-1.03240699e+00, -4.49581492e-04, 8.07466573e-03,
-1.35877540e-02, -2.15258038e-02],
[ 1.43627038e-02, 1.12035672e-03, -7.07577659e-03,
8.26398345e-02, 9.67306681e-01, 1.21847068e-02,
3.93848334e-03, -1.30241315e-02, -5.96429663e-01,
-9.72524487e-01, 4.51563428e-03, -2.90174828e-03,
-2.43580857e-02, -1.92727297e-02],
[ 3.10054604e-02, 2.81632895e-04, -8.79013015e-03,
2.34154711e-01, 8.87158522e-01, 2.62144967e-02,
1.71721715e-03, -3.41037966e-03, -7.37362510e-01,
-8.97797687e-01, 2.97914053e-03, 4.39680362e-03,
-1.40056766e-02, -1.96915555e-03],
[ 2.65714043e-02, 8.56081649e-07, -4.32095216e-04,
3.61163933e-01, 7.91136183e-01, 2.11602898e-02,
1.01903539e-03, -1.55889975e-02, -8.24757241e-01,
-8.52458879e-01, 2.70664645e-03, 2.46375745e-03,
-4.45933265e-02, -1.36795389e-04]])>)
Compiling the same function takes significantly more time, but the compilation step results in an extra order of magnitude runtime improvement:
[17]:
jit_compiled_runtimes = []
for i in range(10):
w = tf.Variable(pq.cvqnn.generate_random_cvqnn_weights(layer_count, d))
start_time = time.time()
jit_compiled_calculate_loss(w, psi_star, cutoff)
end_time = time.time()
jit_compiled_runtimes.append(end_time - start_time)
print(f"Regular:\t{np.mean(regular_runtimes)} s (+/- {np.std(regular_runtimes)} s).")
print(f"Improved:\t{np.mean(improved_runtimes)} s (+/- {np.std(improved_runtimes)} s).")
print(
f"JIT compiled:\t{np.mean(jit_compiled_runtimes)} s "
f"(+/- {np.std(jit_compiled_runtimes)} s)."
)
Regular: 2.4323306560516356 s (+/- 0.25203824522512575 s).
Improved: 0.011875104904174805 s (+/- 0.0002783859661023313 s).
JIT compiled: 0.0013802051544189453 s (+/- 0.00014840260989162237 s).
We are able to use this function for. e.g., quantum state learning. Consider the state
which is an example of a so-called NOON state. We can produce this using Piquasso:
[18]:
with pq.Program() as target_state_preparation:
pq.Q(all) | pq.StateVector([0, 3]) / np.sqrt(2)
pq.Q(all) | pq.StateVector([3, 0]) / np.sqrt(2)
target_state = simulator.execute(target_state_preparation).state
target_state_vector = target_state.state_vector
psi_star = tf.Variable(target_state_vector)
Now we can demonstrate the speed of the JIT-compiled calculation by creating a simple optimization algorithm as follows:
[19]:
learning_rate = 0.01
iterations = 10000
optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)
w = tf.Variable(pq.cvqnn.generate_random_cvqnn_weights(layer_count, d))
for i in range(iterations):
loss, loss_grad = jit_compiled_calculate_loss(w, psi_star, cutoff)
optimizer.apply_gradients(zip([loss_grad], [w]))
if (i + 1) % (iterations // 20) == 0:
print(f"{i + 1}:\t\t", loss.numpy())
print("Final loss:\t", loss.numpy())
500: 0.8607086996549679
1000: 0.8473090910243798
1500: 0.5849377384183669
2000: 0.40879991260231896
2500: 0.29813437623431027
3000: 0.2761913995731654
3500: 0.26621822690380204
4000: 0.23435492471006458
4500: 0.23874428115482763
5000: 0.2205772331178741
5500: 0.2416911247101965
6000: 0.2146732713315681
6500: 0.1993502777279203
7000: 0.22721179422812682
7500: 0.21320290731217184
8000: 0.20555180951647123
8500: 0.21018212032348688
9000: 0.20439647783148554
9500: 0.19759860554592923
10000: 0.18922056395791387
Final loss: 0.18922056395791387
We can use the final weights to calculate the final state, and calculate its fidelity with the target state.
[20]:
program = pq.cvqnn.create_program(w)
final_state = simulator.execute(program).state
print("Final state fidelity: ", final_state.fidelity(target_state))
Final state fidelity: 0.9994946525680765