VeriGen: a novel AST-based fuzzer to stress-test Verilog tools on under‑explored features

Based on a thesis submitted in fulfilment of the requirements for the degree of Master of Engineering in Electronic and Information Engineering

This article introduces Verilog fuzzing, explains my approach, gives a brief demo, and presents results.


Motivation: Why Fuzz Verilog?

Verilog is a Hardware Description Language (HDL) used to describe the structure and behaviour of digital circuits for ASICs, FPGAs, CPUs and more. It was standardised as IEEE‑1364 (1995) and later subsumed into SystemVerilog (IEEE‑1800, 2005), which added assertions, interfaces and object‑oriented features suitable for both design and verification.

Designs typically flow through two classes of Verilog‑consuming tools:

  • Simulators — test circuit behaviour without physically building the circuit.
  • Synthesis tools — translate HDL into a low‑level gate‑level netlist and map to real hardware components.

These tools are widely trusted, yet they’re complex, often closed‑source, and effectively treated as black boxes. That combination means subtle bugs can slip through.

[DIAGRAM PLACEHOLDER: HDL synthesis flow: Parse HDL → Elaborate structure → Optimise logic → Map to real components → Output gate-level netlist]

Figure 1: Design flow of a synthesis tool

What is Hardware Fuzzing?

Fuzzing is the automated generation of unexpected or “weird” inputs to stress systems and uncover bugs, crashes, or inconsistencies. In a hardware toolchain context, that means generating random (but valid) Verilog designs to probe simulators and synthesisers beyond standard workflows. Even well‑tested tools can fail under unusual, edge‑case inputs; fuzzing helps reveal those weaknesses early.

So, Why Would One Want to Fuzz Verilog Tools?

Bugs in HDL tools can be very costly:

  • Silent functional errors in hardware designs.
  • ASIC spins or FPGAs that pass simulation but fail in the real world.
  • Divergent behaviour across vendors and from the language standard (recent work has shown inconsistent support across front‑ends).

Fuzzing lets us push these tools into corners where such misbehaviours surface.


Prior Work: Progress and Gaps

Two notable fuzzers illustrate the landscape:

  • VeriSmith: generates deterministic Verilog and uses equivalence checking to find synthesis bugs. It avoids undefined behaviour but does not support the generate construct or module hierarchy. It has previously found 11 bugs across Yosys, Vivado and Icarus.
  • VlogHammer: uses non‑deterministic designs for differential testing, but lacks support for behavioural Verilog and multi‑module structures.

Historically, fuzzers have avoided two features that are common in real designs yet tricky for tools: the generate construct and hierarchical naming. This project targets exactly those.

  Feature                                    VeriSmith      VlogHammer           VeriGen
  Deterministic design generation            Yes            No                   Yes
  Equivalence checking                       Yes (formal)   No (differential)    Yes (expected-value calculation)
  generate construct + hierarchical naming   No             No                   Yes

Table 1: feature comparison between Verilog fuzzers


Target 1: The Generate Construct

Introduced in IEEE‑1364‑2001 (and refined in the 2005 revision), generate comes in three forms that mirror software control flow:

  • if
  • for
  • case (including casez and casex)

Synthesis tools resolve these during elaboration, enabling scalable structural replication (e.g., a ripple‑carry adder built from repeated module instances connected in sequence).

module ripple_carry_adder #(parameter SIZE = 4) (
    input [SIZE-1:0] a,
    input [SIZE-1:0] b,
    input ci,
    output [SIZE-1:0] sum,
    output co
);
    wire [SIZE:0] carry;
    assign carry[0] = ci;
    genvar i;
    generate
        for (i = 0; i < SIZE; i = i + 1) begin: adder_array
            full_adder fa(
                .sum(sum[i]),
                .carry_out(carry[i+1]),
                .a(a[i]),
                .b(b[i]),
                .carry_in(carry[i])
            );
        end
    endgenerate
    assign co = carry[SIZE];
endmodule

Figure 2: A ripple-carry adder constructed using the for generate block
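The if form is likewise resolved during elaboration: a parameter selects which hardware structure is built, and only the chosen branch produces logic. A minimal sketch (the module and parameter names are illustrative, not taken from the fuzzer):

```verilog
module op_select #(parameter USE_XOR = 1, parameter WIDTH = 4) (
    input  [WIDTH-1:0] a,
    input  [WIDTH-1:0] b,
    output [WIDTH-1:0] y
);
    generate
        // Resolved at elaboration time: only one branch becomes hardware.
        if (USE_XOR) begin: g_xor
            assign y = a ^ b;
        end else begin: g_add
            assign y = a + b;
        end
    endgenerate
endmodule
```

A case generate behaves analogously, selecting one labelled branch according to a constant expression.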


Target 2: Hierarchical Naming

Formalised in IEEE‑1364‑1995 (though non‑standard implementations existed earlier), hierarchical names reference signals or modules via scoped paths, e.g. top.top_c1.out referenced from inside top without an explicit wire. This is common in simulation for debugging/monitoring and design reuse. Synthesis support is limited; notably, cross‑module references (XMRs) are broadly disallowed.

This fuzzer tests naming across nested modules, relative references, $root‑prefixed paths, and defparam‑style overrides.

[DIAGRAM PLACEHOLDER: Module tree with absolute/relative hierarchical references and optional $root/defparam]
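As a concrete sketch of the constructs under test (module names mirror the top.top_c1.out example above; the code is illustrative rather than actual generator output):

```verilog
module child #(parameter VAL = 4'd0);
    wire [3:0] out = VAL;                      // driven by the parameter
endmodule

module top;
    child top_c1();                            // instance scope: top.top_c1
    defparam top_c1.VAL = 4'd7;                // hierarchical parameter override
    wire [3:0] probe = top_c1.out;             // relative hierarchical reference
    initial $display("%d", top.top_c1.out);    // absolute path from the root
endmodule
```

Simulators generally accept designs like this; many synthesis tools reject the cross-module reference, which is exactly the kind of divergence the fuzzer probes.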


Aims and Requirements

Functional requirements:

  • Deterministic Verilog code generation, with a fresh randomised test case each iteration.
  • Tool invocation and robust output handling.
  • Discrepancy detection (expected vs observed).
  • Logging and reproducibility/determinism (seeded runs).

Non‑functional requirements:

  • Performance (≲ 1 min per iteration for simulation; synthesis may take longer, e.g. ~7 min in Vivado).
  • Extensibility and maintainability.
  • Portability (Windows and Linux).

Architecture at a Glance

The fuzzer is AST‑based: designs are constructed as Abstract Syntax Trees representing Verilog structures, enabling fine‑grained control over generation and mutation.

[DIAGRAM PLACEHOLDER: Run-time CLI config feeds parameters to the Verilog Fuzzer (C++, AST + generation libraries), which emits generated Verilog plus an expected result; a Tool Wrapper (Bash + C++) drives the toolchain and returns status + output to the Equivalence Checker, whose pass/fail verdict goes to the Results Recorder/Reporter]

Figure 3: Overview of the fuzzer architecture

Key Design Choices

  • AST‑based generation for structural control and flexibility.
  • Seeded randomness for reproducible, debuggable fuzz cases.
  • No external stimulus needed: each generated design computes a known constant output, sidestepping testbench input generation.
    • This uses expected‑value propagation so every design encodes a “golden” result for automated checking.
  • Tool‑agnostic CLI wrapper to support multiple back‑ends (e.g. Vivado, Quartus, Icarus).
  • Expression operators limited to ADD and XOR to reduce error masking (e.g., a bad sum cancelled by SUB).
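To make expected-value propagation concrete, here is a hypothetical generated design (the name and constants are invented for illustration): every net is a function of literals only, so the generator can fold the expression as it builds the AST and record the "golden" output.

```verilog
module fuzz_case_42 (output [7:0] y);
    wire [7:0] t1 = 8'd5 + 8'd3;   // generator tracks: 8
    wire [7:0] t2 = t1 ^ 8'd6;     // generator tracks: 8 ^ 6 = 14
    assign y = t2;                 // recorded expected value: 8'd14
endmodule
```

The tool wrapper then simulates (or synthesises and simulates) the design and flags a discrepancy if y differs from the recorded value, with no external stimulus required.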

Design Generation

Two complementary generators produce the HDL designs:

  1. Generate‑focused: builds nested generate blocks with tunable loop parameters (start values, bounds) per seed.
  2. Hierarchy‑focused: creates multi‑level module trees with options such as $root prefixing and defparam overrides. Leaf modules can themselves include generate constructs.

All designs are produced from the AST and serialised for analysis. Seeding ensures deterministic regeneration of any failing case.
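A hypothetical output of the generate-focused generator, with seed-chosen loop bounds and the constant result the checker expects (the structure is illustrative, not a verbatim fuzzer artefact):

```verilog
module gen_nested (output [7:0] y);
    wire [7:0] part [0:5];
    genvar i, j;
    generate
        for (i = 0; i < 2; i = i + 1) begin: outer    // bound chosen per seed
            for (j = 0; j < 3; j = j + 1) begin: inner
                assign part[i*3 + j] = i + j;         // elaborates to 6 assigns
            end
        end
    endgenerate
    // 0+1+2 + 1+2+3 = 9: the expected value recorded at generation time
    assign y = part[0] + part[1] + part[2] + part[3] + part[4] + part[5];
endmodule
```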


Demo: CLI Examples

# Hierarchical naming + ModelSim simulation
./fuzz -t 5 --hier -n 10 --depth 3 --root-prefix --defparam

# Generate-construct designs: Quartus (synth) + ModelSim (sim)
./fuzz -t 1 -n 5 --depth 4

# Hierarchy with embedded generate blocks: simulate in ModelSim and Icarus
./fuzz -t 6 --hier -n 5 --depth 3 --include-gen

Results

Across 67k generated designs, all back-ends (Quartus, ModelSim, Vivado, Icarus) completed with no synthesis crashes, simulation errors, or equivalence mismatches. Determinism checks produced bit-identical output for fixed seeds.

  ID    Depth  Tool(s)              Purpose                                            Iterations
  TC1   3      Quartus + ModelSim   Nested-loop correctness                            6,500
  TC2   3      Vivado               Nested-loop correctness                            500
  TC3   5      Icarus + ModelSim    Nested-loop correctness with hierarchical naming   10,000
  TC4   5      Icarus               Hierarchical naming test                           50,000

Table 2: Iterations per Test Campaign (TC)

Note: Vivado support was enabled late in the project; ModelSim was added near the end. Coverage therefore reflects limited runtime on those back‑ends relative to Icarus.

On coverage, the generator—despite being much smaller than VeriSmith—already reaches comparable regions of Icarus’s codebase. The generate-only and hierarchy-only modes land within 1–2% of each other, hinting that each explores distinct parts of the front-end. Combining both gives the best overall result and comes closest to VeriSmith. Icarus was rebuilt with coverage to obtain these measurements.

Figure 4: Coverage results recorded through Icarus, comparing VeriSmith and VeriGen


Conclusions and Further Work

This work delivers a novel AST‑based Verilog fuzzer focused on two under‑tested yet practical features (the generate construct and hierarchical naming), using deterministic, constant‑evaluating designs that avoid external test vectors and enable robust automated checking.

Key takeaways:

  • Competitive coverage despite a simpler grammar.
  • Determinism and expected‑value propagation make debugging tractable.
  • Targeting language features (not just tokens) reveals meaningful behaviours in real tools.

Future extensions:

  • Broader expression/operator set and deeper nesting.
  • More synthesis/simulation back‑ends.
  • Richer discrepancy oracles (e.g., cross‑tool differential checks, waveform‑level comparisons).

Full technical report (PDF)
Source code (GitHub)