This branch contains the relevant hardware and test/synthesis flows for cvw's unified integer/fp divide/sqrt recurrence unit. The recurrence unit can be generated in a variety of configurations, which span radix = {2,4}, floating-point precision = {float,double,quad}, integer width = {unsupported,32,64}, and divider copies = {1,2,4,8}.
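As a rough illustration of the size of this configuration space, the sketch below enumerates the four parameters listed above. It assumes every combination is a valid build, which this README does not confirm; the constant and function names are hypothetical and do not come from the cvw scripts.

```python
from itertools import product

# Parameter values as listed in this README.
RADIX = (2, 4)
PRECISION = ("float", "double", "quad")
INT_WIDTH = (None, 32, 64)  # None stands for "integer division unsupported"
COPIES = (1, 2, 4, 8)


def all_configs():
    """Return every (radix, precision, int_width, copies) combination.

    Assumption: all combinations are legal; the real flow may skip some.
    """
    return list(product(RADIX, PRECISION, INT_WIDTH, COPIES))


if __name__ == "__main__":
    configs = all_configs()
    print(len(configs))  # 2 * 3 * 3 * 4 = 72
```

Under that assumption the full sweep covers 72 flavors of the unit.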
The fpu postprocessor on cvw handles inputs not only from the div/sqrt unit, but also the fma and convert units. This branch's drsu unit contains a postprocessor with logic only relevant to division/sqrt.
# file hierarchy
The RTL files for the divider can be found under `cvw/src/fpu`
The majority of divider modules are found in `cvw/src/fpu/divremsqrt`, which also borrows some modules from `cvw/src/fpu/fdivsqrt`
`divremsqrt/drsu` describes the top-level unit for the divider, taking in unpacked floating-point signals, including Xs, Xm, Xe, Ys, Ym, and Ye.
drsu first feeds signals to `divremsqrt/divremsqrt`, which contains the preprocessor, iteration units, fsm, and postprocessing logic. The postprocessor in `divremsqrt/divremsqrt` also contains all integer postprocessing logic. Outputs from `divremsqrt/divremsqrt` are then sent to `divremsqrt/divremsqrtpostprocess`, which handles rounding and flags.
# verification flow
drsu is verified with the RISC-V arch test Berkeley SoftFloat suite of floating-point test vectors for square root and division. To run the top-level regression, run `regression-wally-intdiv -intdiv`.
The top-level regression Python script is found in `cvw/bin/regression-wally-intdiv`. The testbench is found in `cvw/testbench/testbench_fp`, which runs drsu against test vectors. Batches of test vectors are listed in `cvw/testbench/tests-fp.vh`, and the raw binary test vectors are read from `tests/fp/vectors`.
Regression log files can be found in `cvw/sim/questa/logs` after running `regression-wally-intdiv -intdiv`. Files are named `{precision}_ieee_div_{R}_{K}_{integer}_rv{XLEN}gc_{TESTNAME}.log`, where:
* precision denotes the floating-point precision types supported by the divider: f, fd, fdq, fdqh
* R denotes the radix of the divider: 2,4
* K denotes the number of divider copies in the unit: 1,2,4,8
* integer denotes whether integer division/remainder is supported on the divider: i
* XLEN denotes the width of integers: 32, 64 (this only matters if integer is supported on the divider)
* TESTNAME denotes which tests are being run:
  * fdivremsqrt: runs fdiv, fsqrt, intdiv, intrem
  * fdiv: runs fdiv
  * fsqrt: runs fsqrt
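The naming scheme above can be decoded mechanically. The following is a minimal sketch, not part of the regression script: the `parse_log_name` helper and the example file name are hypothetical, and the pattern assumes the `integer` field is either `i` or empty, which the list above does not spell out.

```python
import re

# Pattern built from the documented template:
# {precision}_ieee_div_{R}_{K}_{integer}_rv{XLEN}gc_{TESTNAME}.log
LOG_NAME = re.compile(
    r"^(?P<precision>fdqh|fdq|fd|f)_ieee_div_"
    r"(?P<radix>[24])_(?P<copies>[1248])_"
    r"(?P<integer>i?)_rv(?P<xlen>32|64)gc_"  # assumption: 'i' or empty
    r"(?P<testname>fdivremsqrt|fdiv|fsqrt)\.log$"
)


def parse_log_name(name):
    """Return the fields of a log file name as a dict, or None if no match."""
    match = LOG_NAME.match(name)
    return match.groupdict() if match else None


if __name__ == "__main__":
    # Hypothetical file name, constructed from the template for illustration.
    print(parse_log_name("fd_ieee_div_4_2_i_rv64gc_fdivremsqrt.log"))
```

A helper like this could be used to group the logs in `cvw/sim/questa/logs` by radix or by number of copies when reviewing a regression run.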
# synthesis flow
To generate synthesis results for all flavors of the recurrence unit, go to `cvw/synthDC/scripts` and run `python3`. This executes a Python script that runs the installed version of Synopsys Design Compiler on the divider permutations at target frequencies of 5 GHz and 100 MHz. To then write the area, delay, and energy results to a CSV, run `./`. Results can then be viewed in `fp-synthresults_reordered.csv` in a format similar to the one presented in the paper.
Wally is a 5-stage pipelined processor configurable to support all the standard RISC-V options, including RV32/64, A, B, C, D, F, M, Q, and Zk* extensions, virtual memory, PMP, and the various privileged modes and CSRs. It provides optional caches, branch prediction, and standard RISC-V peripherals (CLINT, PLIC, UART, GPIO). Wally is written in SystemVerilog. It passes the RISC-V Arch Tests and boots Linux on an FPGA. Configurations range from a minimal RV32E core to a fully featured RV64GC application processor.
Wally is described in an upcoming textbook, *RISC-V System-on-Chip Design*, by Harris, Stine, Thompson, and Harris. Users should follow the setup instructions below. A system administrator must install CAD tools using the directions further down.
Wally is presently at Technology Readiness Level 4, passing the RISC-V compatibility test suite and custom tests, and booting Linux in simulation and on an FPGA. See the Test Plan (`docs/testplans/`) for details.
This section describes the open source toolchain installation.
### Compatibility
The current version of the toolchain has been tested on Ubuntu (versions 20.04 LTS, 22.04 LTS, and 24.04 LTS) and on Red Hat/Rocky/AlmaLinux (versions 8 and 9).
- NOTE: Verilator does not currently work reliably for simulating Wally on Ubuntu 20.04 LTS and Red Hat 8
- RISC-V Sail Model: golden reference model for RISC-V
- OSU Skywater 130 cell library: standard cell library
- RISCOF: RISC-V compliance test framework
Additionally, Buildroot Linux is built for Wally and Linux test vectors are generated for simulation. See the Linux README (`linux/`) for more details.
If this script is run as root or using `sudo`, it will also install all of the prerequisite packages using the system package manager. The default installation directory when run in this manner is `/opt/riscv`.
If a user-level installation is desired, the script can instead be run by any user without `sudo` and the installation directory will be `~/riscv`. In this case, the prerequisite packages must first be installed by running
In either case, the installation directory can be overridden by passing the desired directory as the last argument to the installation script. For example,
See `` for a detailed description of each component, or to issue the commands one at a time to install on the command line.
**NOTE:** The complete installation process requires ~55 GB of free space. If the `--clean` flag is passed as the first argument to the installation script then the final consumed space is only ~26 GB, but upgrading the tools requires reinstalling from scratch.
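Because the installation can fail late if the disk fills up, it may help to check free space first. The sketch below is not part of the installation script; `has_free_space` is a hypothetical helper, and the 55 GB default simply mirrors the figure in the note above, interpreted here as decimal gigabytes.

```python
import shutil

GB = 10**9  # decimal gigabytes; an assumption about how the ~55 GB figure is counted


def has_free_space(path, required_gb=55):
    """Return True if the filesystem holding `path` has at least required_gb free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * GB


if __name__ == "__main__":
    # Check the current directory; substitute the intended installation directory.
    print(has_free_space("."))
```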
`$WALLY/` sources `$RISCV/`. If the toolchain was installed in either of the default locations (`/opt/riscv` or `~/riscv`), `$RISCV` will automatically be set to the correct path when `` is run. If a custom installation directory was used, then `$WALLY/` must be modified to set the correct path.
`$RISCV/` allows for customization of the site specific information such as commercial licenses and PATH variables. It is automatically copied into your `$RISCV` folder when the installation script is run.
Change the following lines to point to the path and license server for your Siemens Questa, Synopsys Design Compiler, and VCS installations. If you only have Questa or VCS, you can still simulate but cannot run logic synthesis. If Questa, VCS, or Design Compiler are already set up on this system, do not set these variables.
Electronic Design Automation (EDA) tools are vital for implementing System on Chip architectures and for validating different designs. Open-source and commercial tools exist for multiple strategies, and although one can spend a lifetime using combinations of different tools, only a small subset is used for this text. The tools were chosen for their ease of access and for their repeatability in accomplishing many of the tasks needed to design Wally. Additional tools may be documented after this text is published to improve use and access.
Siemens Questa is the primary tool utilized for simulating and validating Wally. For logic synthesis, you will need Synopsys Design Compiler. Questa and Design Compiler are commercial tools that require an educational or commercial license.
Note: Some EDA tools use the `LM_LICENSE_FILE` environment variable to point to their license server. Others may use `MGLS_LICENSE_FILE` instead, so it is important to read the user manual to find which environment variable is required to point to a user’s license file. Although there are different mechanisms for making licenses work, many companies use the FlexLM (i.e., Flex-enabled) license server manager running off a node-locked license.
Although most EDA tools are Linux-friendly, they tend to have issues when not installed on recommended OS flavors. Red Hat Enterprise Linux and SUSE Linux products are typically recommended for installing commercial EDA tools and for complex simulation and architecture exploration. Questa can also be installed on Microsoft Windows, as well as on macOS with a virtual machine such as Parallels.
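A quick way to see which of these variables is defined in the current shell is sketched below. This is an illustrative helper, not part of Wally's setup scripts: `find_license_vars` is a hypothetical name, and the `port@host` value in the usage example is a made-up placeholder.

```python
import os

LICENSE_VARS = ("LM_LICENSE_FILE", "MGLS_LICENSE_FILE")


def find_license_vars(env=None, names=LICENSE_VARS):
    """Return the license variables that are set and non-empty, with their values."""
    env = os.environ if env is None else env
    return {name: env[name] for name in names if env.get(name)}


if __name__ == "__main__":
    # Inspect the real environment.
    print(find_license_vars())
    # Inspect a hypothetical environment (placeholder server name).
    print(find_license_vars({"LM_LICENSE_FILE": "1717@license.example.edu"}))
```

An empty result means neither variable is set, in which case the tool's user manual should say which one it expects.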
Siemens Questa simulates behavioral, RTL and gate-level HDL. To install Siemens Questa first go to a web browser and navigate to Click Sign In and log in with your credentials and the product can easily be downloaded and installed. Some Windows-based installations also require gcc libraries that are typically provided as a compressed zip download through Siemens.
Many commercial synthesis and place-and-route tools require a common installer. These installers are provided by the EDA vendor; Synopsys has one called Synopsys Installer. To use Synopsys Installer, you will need to acquire a license through Synopsys, typically called Synopsys Common Licensing (SCL). The Synopsys Installer, the license key file, and Design Compiler can all be downloaded through Synopsys SolvNet. First open a web browser, log into Synopsys SolvNet, and download the installer and Design Compiler installation files. Then, install the Installer.
The Installer can be used in graphical or text-based mode. The text-based mode is far easier to use. To install DC, navigate to the location of your downloaded DC files and type `installer`. You should be prompted with questions about where you wish to have your files installed.
The Synopsys Installer automatically installs all downloaded product files into a single top-level target directory. You do not need to specify the installation directory for each product. For example, if you specify /import/programs/synopsys as the target directory, your installation directory structure might look like this after installation:
Note: Although most parts of Wally, including the Questa simulator, will work on most modern Linux platforms, as of 2022, the Synopsys CAD tools for SoC design are only supported on Red Hat Enterprise Linux 7.4 or 8 or SUSE Linux Enterprise Server (SLES) 12 or 15. Moreover, the RISC-V formal specification (sail-riscv) does not build gracefully on RHEL7.
The Verilog simulation has been tested with Siemens Questa/ModelSim. This package is available to universities worldwide as part of the Design Verification Bundle through the Siemens Academic Partner Program for $990/year.
If you want to implement your own version of the chip, your tool and license complexity rises significantly. Logic synthesis uses Synopsys Design Compiler. Placement and routing uses Cadence Innovus. Both Synopsys and Cadence offer their tools at a steep discount to their university program members, but the cost is still several thousand dollars per year. Most research universities with integrated circuit design programs have Siemens, Synopsys, and Cadence licenses. You also need a process design kit (PDK) for a specific integrated circuit technology and its libraries. The open-source Google Skywater 130 nm PDK is sufficient to synthesize the core but lacks memories. Google presently funds some fabrication runs for universities. IMEC and Muse Semiconductor offer full access to multiproject wafer fabrication on the TSMC 28 nm process, including logic, I/O, and memory libraries; this involves three non-disclosure agreements. Fabrication costs on the order of $10,000 for a batch of 1 mm² chips.
Startups can expect to spend more than $1 million on CAD tools to get a chip to market. Commercial CAD tools are not realistically available to individuals without a university or company connection.