Extended-variable probabilistic computing with probabilistic d-dimensional bits

Duffee, Christian 1 ; Athas, Jordan1; Grimaldi, Andrea2; Volpe, Deborah3; Finocchio, Giovanni4; Wei, Ermin1; Khalili Amiri, Pedram1

Published Sep 19, 2025 on Dryad. https://doi.org/10.5061/dryad.sxksn03gd

Data files

Sep 19, 2025 version files 3.73 MB

Extended_P_Computing_Data.zip

3.72 MB
README.md

10.17 KB

Abstract

Ising machines can solve combinatorial optimization problems by representing them as energy minimization problems. A common implementation is the probabilistic Ising machine (PIM), which uses probabilistic (p-) bits to represent coupled binary spins. However, many real-world problems have complex data representations that do not map naturally into a binary encoding, leading to a significant increase in hardware resources and time-to-solution. Here, we describe a generalized spin model that supports an arbitrary number of spin dimensions, each with an arbitrary real component. We define the probabilistic d-dimensional bit (p-dit) as the base unit of a p-computing implementation of this model. We further describe two restricted forms of p-dits for specific classes of common problems and implement them experimentally on an application-specific integrated circuit (ASIC): (A) isotropic p-dits, which simplify the implementation of categorical variables resulting in ~34x performance improvement compared to a p-bit implementation on an example 3-partition problem. (B) Probabilistic integers (p-ints), which simplify the representation of numeric values and provide ~5x improvement compared to a p-bit implementation of an example integer linear programming (ILP) problem. Additionally, we report a field-programmable gate array (FPGA) p-int-based integer quadratic programming (IQP) solver which shows ~64x faster time-to-solution compared to the best of a series of state-of-the-art software solvers. The generalized formulation of probabilistic variables presented here provides a path to solving large-scale optimization problems on various hardware platforms including digital CMOS.

Dataset DOI: 10.5061/dryad.sxksn03gd

Description of the data and file structure

Files were generated that represent individual probabilistic problems (generate✱.py), which were then simulated using various simulator scripts (✱.cpp). A probabilistic Ising machine application-specific integrated circuit was specified using Verilog (rtl/✱.v) alongside look-up tables (rtl/✱.mem), and controlled using C code (✱.c) that ran on an included RISC-V CPU. Verilog (rtlFPGA/✱.v) code using look-up tables (dataFPGA/✱.txt) was also used as an integer programming probabilistic Ising machine. The same problem was run using GAMS (gamsSolvers/✱) as a comparison.

Files and variables

File: Extended_P_Computing_Data.zip

Description: Compressed file of all included data and code

File: data/✱.txt

Description: Human-readable format describing various linear programming problems. The first line lists the used integer variables, the next few lines specify constraints, and the last line specifies the objective function.

File: data/✱.csv

Description: J/h matrix specification file of the used problems. The first lines are an NxN grid which each cell represents the weighted connection between the row and column index probabilistic element. The last row represents the offset of each element.

File: dataFPGA/✱.txt

Description: The look up tables that are used to implement the probabilistic integers' updates on the field-programable gate array implementation.

File: gamsSolver/cplex.opt

Description: Files used to specify additional needed parameters for the Cplex solver in GAMS. The main .gms file is required to load this file when using the Cplex solver.

File: rtl/✱.mem

Description: The look up tables that are used to implement the hyperbolic arch-tangent LUT to implement updates on the application-specific integrated circuit implementation.

File: FigureData.xlsx

Description: Data plotted in each of the figures in the main text. Each sheet corresponded to one (sub)-figure.

3a: The unitless energy of all integer pairs for the ILP problem with the constraint x_1+ 3x_2=0, objective expression of -x_1+x_2, C=1, and O=1/5 . The values have been linearly normalized so the ground state has an energy of 1. x_1 is determined by the row and x_2 by the column.

3b: The observed state probability of a p-bit implementation of the same problem with β=0.1. As with all probability data, this is represented as a proportion.

3c: The observed state probability of same problem implemented using the same parameters with p-ints.

4a: Results from a p-bit implementation of a 3-partition problem composed of 14 randomly generated integers, ranging between 1 and 6, over trials of 512 iterations with a constant β of 1/32. As C is increased, an increasing proportion of the PIM’s states during each trial are valid encodings. This pattern is displayed as both the proportion of trials in which any iteration had a valid encoding, and the proportion of all iterations which had a valid encoding. For valid encodings the unitless error was calculated as the total absolute difference between each partition’s sum and that of the true solution. The lowest error found in each trial increases with C for most of its range after a minimum. Notably, for low C values, few or no states were valid in 10,000 trials, resulting in a noisy or undefined mean error, respectively. O is kept constant at 1.

4b: The same setup, except that a constant C of 94 is used and β is swept. Lower temperature correlates with a higher proportion of valid states; however the best error is minimized at a medium temperature.

4c: The same problem, sweeping the number of iterations performed for a p-bit PIM with a C of 94 and an isotropic p-dit³ PIM, both at a β of 1/32. For low numbers of iterations, no trials resulted in the p-bit PIM finding the true ground state solution. For each number of iterations, the isotropic p-dit³ system had a significantly higher success rate. In both cases, results from the ASIC over at least 250 trials closely matched those from simulations over 10,000 trials.

5: Results from an implementation of a 6-partition problem of 1,000 randomly generated numbers between 1 and 100, using isotropic p-dits⁶. The unitless error reduces as the number of iterations increases, with a sharp decrease starting at around 27 iterations. When the error is sufficiently low, beginning around 2¹⁰ iterations, the true solution success rate grows rapidly. A linear β sweep from 1/32 to 1 is used over 1,000 trials for each data point.

6a: The proportion of 300 iteration-long trials using variable temperatures in which the true solution to a change-making problem was found. Specifically, the problem was composed to find change for a sum of $1.34 using the minimum number of coins worth 3¢, 4¢, 7¢, and 11¢, respectively. A C of 1 and O of 96 were used.

6b: Simulated and ASIC results showing the p-int implementation greatly outperforming the p-bit implementation, with both using the best β found from the first plot (1/64 for p-int and 1/128 for p-bit). Each simulated data point is averaged over 1,000 trials, while each experimental data point is averaged over 250 trials.

7a_: Heat maps of the proportion of 500 trials of 8,192 iterations in which the optimum solution was found in a fixed-charge problem. During each trial, β was swept from the listed β_0 value to 32 times its value, using a base-2 logarithmic sweep. C was kept constant at 1. Each row represents a value of the constraint constant, while each column represents a value of β_0. β_0 was swept from 1/524288 to 4096. C was swept from 1/2048 to 2048 at which point the success probability had gone to zero for all β_0 values across all implementations.

7a_Slack_Even: As above where a traditional slack variable representation of the problem with even sampling. A relatively wide range of both β_0 and O resulted in decent performance.

7a_Slack_Scaled: As above where a slack variable representation and variable range scaled sampling were used. Compared to the even sampling, the optimal range is slightly more defined.

7a_Violation_Even: As above where a violation variable representation with an even sampling as used. Good performance is found for a large region of β_0 and O.

7a_Violation_Scaled: As above where a violation variable representation and scaled sampling were used. Great performance is found for a large region of the explored variables.

7b: Performance of the four representations for the same problem over 1,000 trials of varying length. A pronounced increase in performance is found by using a violation variable representation over traditional slack ones. A clear increase is found by using adjusted sampling over even sampling in the violation variable case, and a lesser but still distinguishable increase is shown after 5,000 iteration trials in the slack variable case. For both slack representation formulations, β_0 =1/256 and O=2, while both violation representation formulations used β_0 =1/4 and O =1/4.

8: A comparison of the time required to produce a solution, and the quality of solutions for an IQP problem for each solver. The generated problem includes 17 integer variables, 51 linear inequality constraints (34 being bounds), and a quadratic objective function. The p-int implementation was synthetized on an FPGA. It contained two cores with two pipelined instances each and used β_0=3.3 and O=0.0008. If no solution was found by the iteration cutoff, the state of the system was reset, with the trial continuing. The remainder are state-of-the-art software solvers included in GAMS. The p-int data is composed from 25 trials, while the solvers data arose from a trial with the default seed. Despite the slower 50 MHz clock speed of the FPGA compared to the 4.31 GHz boost of the CPU, the p-int system in both configurations outperformed all software solvers which found the problem’s true solution. Configured with a cutoff of 2²¹, a ~64x improvement in time-to-solution was observed for the p-int solver compared to the MOSEK solver. The time in seconds to the best found solution and the unitless error of the best found solution is shown for each solver.

Code/software

File: gamesSolvers/test17.gms

Description: Code used to specify the chosen program for use by the GAMS solver. The code must be modified to choose the specific solver. General algebraic modeling system (GAMS) release 49.3.0 was used.

File: rtl/✱.v

Description: Verilog files describing the application-specific integrated circuit design which was used. The design is based on the Caravel Harness and was manufactured through Efabless.

File: rtlFLGA/✱.v

Description: Verilog files describing the field programmable gate array design which was used. A Terasic Cyclone IV Altera DE2-115 was used. Quartus Prime was used to program the device.

File: generateInstanceQIP.py

Description: Helper Python 3 which was used to generate a random quadratic integer programming problem.

File: generateVerilogFromJ.py

Description: Helper Python 3 script which was used to generate Verilog code from a probabilistic program design.

File: generate✱.py

Description: Helper Python 3 script which was used to generate J/h files (typically .csv) for specific programs.

File: getEnergy.cpp

Description: Helper C++ program which is used to print the energy levels of each system configuration.

File: ✱.cpp

Description: Simulator programs which are used to simulate the evolution of different probabilistic Ising machines.

File: ✱.c

Description: C code which is executed on the RISC-V CPU built into the application-specific integrated circuit to setup and readout the state of the probabilistic Ising machine.

ASIC design

The ASIC PIM was defined using RTL Verilog code which was processed by the OpenLane RTL to GDSII pipeline, using the Skywater 130 nm open-source process design kit (PDK). The PIM was manufactured using the Efabless multi-project wafer (MPW) service, which also provided the design for a co-integrated small CPU based on a VexRiscv minimal+debug configuration [61]. An oscillator running at 10 MHz was used as the ASIC’s clock for all experiments shown.

Experimental code

All non-trivial problems and their Ising representations were created using self-made Python 3 code. All simulated data were gathered using self-made C++ code. Self-made C code executed on the RISC-V CPU was used to run all experimental trials. Communication with the RISC-V CPU, to upload trial code and to record results, was done using the UART protocol over a USB cable.

IQP Comparison

Solvers used for the IQP comparison were bundled with GAMS 49.3.0 [62] and accessed using GAMS Studio 1.20.2. To the best of our knowledge, all local, standalone (not utilizing other solvers included in GAMS as subsolvers) non-convex MIQCP solvers that were included in an academic license have been considered. For each trial, the solvers were instructed to use 4 threads, and to have a timeout of 10 minutes, with all other parameters kept at their default values, including the seed. For CPLEX, ‘OptimalityTarget’ was set to ‘3’ to allow it to process a non-convex problem. The solution time for each solver was extracted from the produced logfiles, where it is reported differently for each solver. CPLEX and MOSEK reported solution time to the nearest hundredth of the second, while ALPHAECP and XPRESS reported solution time to the nearest second. Total execution time was not considered. All solvers finished execution on their own apart from XPRESS which was halted after the timeout period. This comparison was conducted on a Ryzen 7 4700U which has a frequency of 2.0 GHz and a Turbo Clock of 4.1 GHz. Memory utilization was not a limiting factor for any solver.

FPGA design

All FPGA demonstrations were conducted using a Terasic Cyclone IV Altera DE2-115. A 50 MHz oscillator included on the FPGA was used as the driving clock. The design was defined by RTL SystemVerilog code compiled by Quartus Prime Version 23.1std.0 Build 991. One push-button was used to reset the pseudo-random number generator to an initial seed. The other independently reset the state of the PIM. The pseudo-random number generator is based on the state update procedure of the 32-bit PCG pseudo-random number generator [63]. It has a period of 2³³ states, corresponding to ~86 seconds. For each set of initial, randomly generated seeds, 5 trials were conducted. Readout of the iteration in which the true solution was found was done through LEDs.