How fuzz testing finds vulnerabilities other tools miss

Every security tool has a blind spot. Static analysis finds code-level issues but cannot see runtime behaviour. Penetration testing finds what a skilled tester thinks to look for but covers a fraction of the possible input space. Unit testing confirms that components behave correctly under planned conditions. None of these approaches systematically explores what happens when a system receives inputs it was never designed to handle.

That is the space fuzz testing occupies. Not as a replacement for the tools already in use, but as the method that explores the input space those tools structurally cannot reach. The vulnerability classes it finds are not rare or obscure. They are among the most consistently exploited in real-world attacks, and they are found disproportionately through the kind of systematic unexpected-input generation that fuzz testing provides.

This guide explains what those vulnerability classes are, why other tools miss them, and why protocol-aware fuzz testing finds vulnerabilities that even generic fuzzing approaches leave behind.

What Fuzz Testing Finds

Fuzz testing finds vulnerabilities that arise when software receives inputs it was not designed to handle. The specific classes it surfaces consistently include buffer overflows, input validation failures, state machine bugs, authentication bypasses triggered by malformed inputs, and denial-of-service conditions caused by resource exhaustion or unexpected parsing behaviour.

These are not edge case vulnerabilities. Buffer overflows have appeared in the OWASP Top 10 and the CWE Top 25 most dangerous software weaknesses for years. Input validation failures underpin a significant proportion of published CVEs across every major software category. State machine bugs in protocol implementations are a consistently underassessed attack surface in connected devices and industrial systems. Their regular appearance in real-world exploit data is no coincidence: these are the classes most likely to arise under the conditions fuzz testing specifically explores, and the least likely to be found by the tools most commonly used to assess software security.

Fuzz testing does not, however, find everything. It does not assess business logic flaws that require contextual understanding of what the software is supposed to do. It does not replace the human judgment that a skilled penetration tester brings to exploitability assessment. It does not find vulnerabilities that only manifest in specific deployment configurations that the test environment does not replicate. Understanding what fuzz testing finds, and what it does not, is the starting point for understanding where it belongs in a security programme.


Fuzz Testing vs Other Security Tools

The comparison between fuzz testing and other security tools is most useful when framed around what each approach can and cannot see, rather than which is better. They are not alternatives. They have different blind spots, and a programme that uses only one of them has the blind spots of the ones it does not use.

Static analysis examines source code or compiled binaries without executing them. It finds issues that are visible in the code itself: known dangerous function calls, code paths that could produce out-of-bounds writes, missing null checks, and similar structural issues. Static analysis is fast, can be integrated into build pipelines, and finds issues early. What it cannot do is observe runtime behaviour. A vulnerability that only manifests when a specific sequence of valid-looking inputs drives the system into an unexpected state is invisible to static analysis, because static analysis never runs the code.

Penetration testing applies human expertise and judgment to assess the security of a system. A skilled tester understands attack techniques, reasons about the system’s architecture, identifies likely vulnerability locations, and assesses exploitability. Penetration testing produces high-quality, contextualised findings. Its fundamental constraint is coverage. A penetration test explores the input space the tester thinks to explore. The inputs that cause the most significant vulnerabilities in fuzz testing are often the ones nobody thought to try, which is precisely why fuzzing finds them.

Unit and integration testing verify that software behaves correctly under planned conditions. They are written by developers who understand what the software is supposed to do, and they test that it does it. They are not designed to explore what happens outside the planned conditions, and they are not effective at finding the vulnerabilities that arise there. A unit test suite with 100% code coverage can coexist with a buffer overflow triggered by a field value two bytes longer than the maximum the developer thought to test.

Fuzz testing occupies the space that all three of these approaches leave uncovered. It runs the code, which static analysis does not. It explores the input space systematically at scale, which penetration testing cannot. And it specifically targets the conditions outside the planned inputs that unit testing does not reach. The vulnerability classes it finds are the ones that fall into all three blind spots simultaneously.


Why These Vulnerability Classes Matter

The vulnerability classes that fuzz testing finds are not academically interesting edge cases. They are the classes that appear most frequently in real-world exploits, in published CVE data, and in the incident reports of organisations that have been compromised.

Buffer overflows remain one of the most exploited vulnerability classes despite decades of awareness. They arise when a programme writes data beyond the allocated memory boundary for a buffer, potentially overwriting adjacent memory in ways that allow an attacker to control programme execution. They are found most reliably by providing inputs that exceed expected length boundaries, which is exactly what boundary value testing within a fuzz testing programme does systematically. A developer who tests a field with values up to the documented maximum will not find the buffer overflow triggered by a value one byte longer.

Input validation failures cover a broad category of vulnerabilities arising from software that does not adequately verify the format, type, range, or content of inputs before processing them. They underpin SQL injection, command injection, XML parsing vulnerabilities, and a significant proportion of the protocol parsing bugs found in industrial and connected device security assessments. Fuzz testing finds them by generating inputs that violate format rules in specific, targeted ways and observing whether the software handles the violation safely.

State machine bugs arise in protocol implementations and stateful software when the system enters an unexpected state as a result of an input sequence that its developers did not test. A protocol implementation that handles each individual message type correctly may still crash, hang, or enter an exploitable state when it receives a valid-looking message in a state where that message type is not expected. Finding these bugs requires testing not just individual inputs but sequences of inputs that drive the implementation through its state machine, which is a capability that distinguishes protocol-aware fuzz testing from generic approaches.

Denial-of-service conditions caused by malformed inputs are particularly significant in operational technology environments where availability is a safety-critical requirement. An input that causes a control system component to crash, hang, or consume excessive resources may have operational consequences well beyond the security implications. Robustness testing within a fuzz testing programme finds these conditions before they are exposed by a fault in the field or exploited by an attacker deliberately targeting availability.

Authentication bypasses triggered by malformed inputs represent a distinct and serious vulnerability class. These arise when the authentication logic in a system can be bypassed not by providing valid credentials but by providing inputs that cause the authentication process itself to fail in a way that grants access. They are found by testing the authentication interface with inputs that violate the expected format and observing whether the failure mode is safe.
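
To make the observation concrete, here is a minimal sketch of probing an authentication interface with malformed inputs, assuming a hypothetical line-based login protocol; the target address, the message formats, and the "OK" success marker are all illustrative assumptions, not any real product's interface.

    import socket

    HOST, PORT = "192.0.2.10", 4000  # illustrative target address

    # Malformed credentials for a hypothetical line-based login. The dangerous
    # outcome is not a rejected login but a failure mode that grants access.
    MALFORMED_LOGINS = [
        b"LOGIN \x00admin password\n",       # null byte ahead of the username
        b"LOGIN admin\n",                    # credential field missing entirely
        b"LOGIN " + b"A" * 10000 + b" x\n",  # oversized username
        b"LOGIN admin %s%n%x\n",             # format string in the password field
    ]

    for attempt in MALFORMED_LOGINS:
        with socket.create_connection((HOST, PORT), timeout=5) as conn:
            conn.sendall(attempt)
            reply = conn.recv(4096)
            if reply.startswith(b"OK"):      # a granted session is a bypass finding
                print("possible authentication bypass:", attempt)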


How Fuzz Testing Surfaces Each Vulnerability Class

The mechanism by which fuzz testing finds each of these vulnerability classes is worth understanding, because it explains why systematic input generation is necessary and why the vulnerabilities are not found by other means.

Buffer overflows are found through boundary value testing at the edges of valid input ranges, combined with generation of inputs significantly beyond those ranges. A fuzz testing platform generates field values at the documented maximum, one byte beyond it, two bytes beyond it, at the integer maximum for the field type, and at other boundary points that trigger the conditions under which buffer overflows arise. This is not sophisticated. It is systematic. The reason it finds vulnerabilities that developers miss is that developers test the values they expect and fuzz testing tests the values they do not.
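
As an illustration, the following minimal sketch shows the kind of boundary value generation described above, assuming a hypothetical field with a documented maximum length of 64 bytes and a 16-bit length field; the names and values are illustrative, not ProtoCrawler's API.

    # Boundary-length values for a single length-constrained field.
    DOCUMENTED_MAX = 64    # the maximum length the specification documents
    UINT16_MAX = 0xFFFF    # the integer maximum for a 16-bit length field

    def boundary_lengths(documented_max: int) -> list[int]:
        """Lengths at and around the boundaries where overflows tend to arise."""
        return [
            0,                    # empty field
            documented_max - 1,   # just inside the boundary
            documented_max,       # exactly at the documented maximum
            documented_max + 1,   # one byte beyond: the classic off-by-one
            documented_max + 2,   # two bytes beyond
            documented_max * 2,   # well beyond the boundary
            UINT16_MAX,           # the integer maximum for the field type
        ]

    test_cases = [b"A" * n for n in boundary_lengths(DOCUMENTED_MAX)]

A real platform generates boundaries like these for every field of every message type, which is where the scale of systematic testing comes from.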

Input validation failures are found by generating inputs that violate format rules in specific ways: wrong data types in typed fields, prohibited characters in character-constrained fields, null bytes in string fields, negative values in unsigned integer fields, and format strings in fields expected to contain plain data. A fuzz testing platform generates these variations systematically across all input fields, finding the places where validation is absent, incomplete, or inconsistently applied.
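
A minimal sketch of the same idea, assuming a field expected to contain a plain ASCII string; the specific variant values are illustrative, and a real platform's catalogue is far richer and protocol-specific.

    # Format-violating variants for a field expected to hold plain ASCII text.
    def format_violations(valid_value: bytes) -> list[bytes]:
        return [
            valid_value + b"\x00trailing",    # embedded null byte
            b"%s%n%x" * 4,                    # format string content
            b"-1",                            # negative value for an unsigned field
            b"\xff\xfe" + valid_value,        # bytes outside the allowed character set
            b"'; DROP TABLE users;--",        # injection-shaped content
            b"3.14159",                       # wrong data type in a typed field
        ]

    # Each variant replaces the valid value in an otherwise well-formed message,
    # so the format violation is the only thing the target has to mishandle.
    variants = format_violations(b"expected")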

State machine bugs are found by generating sequences of inputs that drive the implementation through state transitions, then testing each state with inputs that are invalid for that state. This requires the fuzz testing platform to model the state machine of the protocol or system being tested, which is why protocol-aware fuzz testing finds these bugs and generic fuzz testing often does not. A platform with no model of the protocol can only send individual messages. A platform with a formal protocol model can generate the sequences that reach specific states and then test those states with targeted invalid inputs.
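
The sketch below illustrates the shape of such a test, assuming a hypothetical line-based stateful protocol; the handshake messages, the target address, and the states they reach are assumptions for illustration, not a real protocol model.

    import socket

    HOST, PORT = "192.0.2.10", 4000  # illustrative target address

    # A sequence that walks the implementation into a specific state, and a
    # probe that is well-formed but invalid once that state has been reached.
    VALID_SEQUENCE = [b"HELLO\n", b"AUTH user\n", b"SELECT channel\n"]
    OUT_OF_STATE_PROBE = b"HELLO\n"  # a repeat handshake this state should reject

    def probe_state(sequence: list[bytes], probe: bytes) -> bytes:
        with socket.create_connection((HOST, PORT), timeout=5) as conn:
            for msg in sequence:    # drive the documented state transitions
                conn.sendall(msg)
                conn.recv(4096)     # consume the response for each step
            conn.sendall(probe)     # inject the message this state does not expect
            return conn.recv(4096)  # a crash, hang, or reset here is a
                                    # state machine finding

    print(probe_state(VALID_SEQUENCE, OUT_OF_STATE_PROBE))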

Denial-of-service conditions are found through robustness testing: sustained generation of malformed inputs, resource exhaustion conditions, and inputs designed to trigger processing that consumes disproportionate CPU or memory. The conditions that cause a system to become unavailable under malformed input are often distinct from the conditions that cause crashes or exploitable behaviour, and finding them requires specific test case generation targeting resource consumption rather than just memory safety.
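
A minimal sketch of that kind of robustness loop, assuming an illustrative target address, a hypothetical known-good "PING" request for the liveness check, and a crude oversized payload as the malformed input.

    import socket
    import time

    HOST, PORT = "192.0.2.10", 4000  # illustrative target address
    MALFORMED = b"\x00" * 65536      # a crude oversized request body

    def still_responsive(deadline: float = 2.0) -> bool:
        """Check that a known-good request still answers within a normal time."""
        start = time.monotonic()
        try:
            with socket.create_connection((HOST, PORT), timeout=deadline) as conn:
                conn.sendall(b"PING\n")  # hypothetical known-good request
                conn.recv(64)
        except OSError:
            return False
        return time.monotonic() - start < deadline

    for batch in range(100):
        try:
            with socket.create_connection((HOST, PORT), timeout=2) as conn:
                conn.sendall(MALFORMED)  # sustained malformed input
        except OSError:
            pass                         # refused connections are data too
        if not still_responsive():       # degradation, not just crashes, counts
            print(f"target degraded after batch {batch}")
            break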


Why Protocol-Aware Fuzzing Goes Further

Generic fuzz testing generates unexpected inputs. Protocol-aware fuzz testing generates unexpected inputs that are structurally plausible within the protocol being tested. The distinction determines whether the test cases reach the application logic where most vulnerabilities sit, or are rejected at the protocol framing layer before they get there.

A generic fuzzer sending random bytes to a Modbus endpoint will have most of those bytes rejected immediately. The framing layer checks for the correct function code, the correct data length, and the correct CRC before passing the message to the application logic. Random bytes fail these checks. They are discarded. The application logic never sees them. The vulnerabilities in the application logic are never triggered.

A protocol-aware fuzzer with a formal model of Modbus generates messages with valid function codes and correct CRCs, but with field values, lengths, or sequences that are invalid in specific, targeted ways. These messages pass the framing layer checks. They reach the application logic. They trigger the parsing and processing behaviour where the interesting vulnerabilities sit. The difference in finding rate between generic and protocol-aware fuzzing for protocol implementations is not marginal. It is the difference between finding real vulnerabilities and generating a lot of traffic that is discarded before it reaches anything meaningful.
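
To make the distinction concrete, here is a minimal sketch of how such a test case might be constructed for Modbus RTU: the CRC is computed correctly so the frame passes framing checks, while the quantity field is set beyond the 125-register limit the specification allows for function 0x03. The frame layout and CRC algorithm follow the Modbus specification; the choice of flaw is one illustrative example among many.

    import struct

    def crc16_modbus(frame: bytes) -> bytes:
        """Standard Modbus RTU CRC-16 (polynomial 0xA001), low byte first."""
        crc = 0xFFFF
        for byte in frame:
            crc ^= byte
            for _ in range(8):
                crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
        return struct.pack("<H", crc)

    # Read Holding Registers (function 0x03): the specification caps the
    # quantity of registers at 125 (0x007D). A quantity of 0x07D0 is
    # structurally valid but semantically illegal, a targeted flaw that
    # passes framing and exercises the application logic behind it.
    slave, function = 0x01, 0x03
    start_addr, quantity = 0x0000, 0x07D0  # 2000 registers: over the limit
    body = struct.pack(">BBHH", slave, function, start_addr, quantity)
    frame = body + crc16_modbus(body)      # valid CRC: framing accepts it

A random byte stream almost never produces a correct CRC, so frames like this one are effectively unreachable for a generic fuzzer.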

State coverage is the second dimension where protocol-aware fuzzing goes further. A fuzzer with a formal protocol model can drive the implementation through specific state transitions and then test each state with targeted invalid inputs. A generic fuzzer has no model of the state machine and cannot reliably reach specific states. The state machine bugs that are among the most significant vulnerability classes in protocol implementations are only consistently findable by a fuzzer that understands the protocol well enough to drive the implementation into the states where those bugs manifest.

The protocols used in industrial and telecoms environments, including Modbus, DNP3, IEC 61850, IEC 60870-5-104, DLMS COSEM, GTP-C, GTP-U, SS7, and Diameter, are binary protocols with specific framing requirements, complex state machines, and application logic that generic fuzz testing cannot reach. Protocol-aware fuzzing designed specifically for these protocols is the only approach that finds the vulnerabilities their implementations carry at the depth and scale a security assurance programme requires.


Where Fuzz Testing Fits Alongside Existing Tools

Fuzz testing is most effective when it is integrated into a security programme that already includes the other tools, rather than substituted for them. The combination of static analysis, penetration testing, and fuzz testing covers a significantly larger proportion of the vulnerability space than any of them covers individually.

In the development lifecycle, fuzz testing fits at the integration and system testing stage, after unit testing has verified component behaviour under planned conditions and before penetration testing assesses exploitability of the integrated system. At this stage, fuzz testing explores the input space that unit testing has left uncovered, finds the vulnerabilities that arise from component interactions under unexpected inputs, and produces a finding set that informs the scope and focus of subsequent penetration testing.

For protocol implementations specifically, fuzz testing should be conducted against each significant protocol interface before the product ships. IEC 62443-4-1 Practice 6 requires vulnerability testing that includes robustness and negative testing techniques for all external interfaces, which fuzz testing directly satisfies. The compliance case and the security case align: the testing that the standard requires is also the testing most likely to find the vulnerabilities that matter most in these environments.

In ongoing security assurance, fuzz testing provides regression coverage for protocol implementations that change over time. A new message type added to a protocol implementation, a change to parsing logic, or an update to an underlying library can introduce new vulnerabilities. Running the fuzz test corpus against the updated implementation catches regressions before they reach the field, at a fraction of the cost of discovering them through a field incident or a customer security assessment.
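
A minimal sketch of what corpus replay in a pipeline can look like, assuming a hypothetical corpus layout of one JSON file per previously-found trigger, each recording the prior message sequence and the triggering input as hex strings; the target address and file format are illustrative assumptions.

    import json
    import pathlib
    import socket

    HOST, PORT = "192.0.2.10", 4000        # illustrative address of the build under test
    CORPUS = pathlib.Path("fuzz_corpus")   # hypothetical layout: one JSON file per trigger

    failures = []
    for case in sorted(CORPUS.glob("*.json")):
        entry = json.loads(case.read_text())
        try:
            with socket.create_connection((HOST, PORT), timeout=5) as conn:
                for msg in entry["sequence"]:   # re-establish the protocol state
                    conn.sendall(bytes.fromhex(msg))
                    conn.recv(4096)
                conn.sendall(bytes.fromhex(entry["trigger"]))
                conn.recv(4096)                 # a healthy build still answers
        except OSError:
            failures.append(case.name)          # the old trigger works again

    if failures:
        raise SystemExit(f"regressions reintroduced: {failures}")  # fail the build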


What Good Fuzz Testing Output Looks Like

The output of a fuzz testing programme determines whether the findings it produces drive remediation or sit unactioned. Understanding what good output looks like helps teams commission fuzz testing that produces results they can use.

Each finding needs the exact input that triggered it. For protocol testing, that means the precise message content, the protocol state at the time it was sent, and the sequence of prior messages that established that state. Without the exact triggering input, the finding cannot be reproduced, the root cause cannot be identified, and the fix cannot be verified. A fuzz testing report that describes vulnerability classes without providing triggering inputs is not actionable.

The observed behaviour needs to be documented precisely. A crash needs to be described with the memory address, the register state, and the input field that caused it, not just noted as a crash. An unexpected response needs to capture the exact response content and explain why it is unexpected relative to the protocol specification. An authentication bypass needs to document the exact input sequence that triggered it and the access that was obtained. Precision in finding documentation is what enables the engineering team to understand, reproduce, and fix the issue.
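
Taken together, the reproduction and observation data a single finding should carry might look like this minimal sketch; the field names are illustrative, not ProtoCrawler's report schema.

    from dataclasses import dataclass

    @dataclass
    class Finding:
        protocol: str                   # e.g. "Modbus RTU"
        state: str                      # protocol state when the trigger was sent
        prior_sequence: list[bytes]     # messages that established that state
        trigger: bytes                  # the exact input that caused the behaviour
        observed: str                   # crash detail, response content, access gained
        severity: str = "unclassified"  # set after exploitability assessment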

Severity classification needs to reflect exploitability in the specific deployment context. A crash triggered by a message that requires prior authentication is a different risk from the same crash triggered by an unauthenticated connection. The classification needs to distinguish these cases and provide the security team with a prioritised list of what to fix first, not an undifferentiated list of everything that went wrong.

For IEC 62443 compliance purposes, the output needs to map findings to specific standard requirements with documented methodology and traceability. SVV-3 requires that vulnerability testing evidence includes the scope of testing, the methodology used, and traceability from test cases to the requirements being verified. A fuzz testing report that does not include this structure will not satisfy a certification audit regardless of how thorough the testing was.


How ProtoCrawler Finds What Other Tools Miss

ProtoCrawler is CyTAL’s automated protocol fuzz testing platform. It is built specifically to find the vulnerability classes that generic fuzz testing and other security tools miss in protocol implementations and connected devices.

For each supported protocol, ProtoCrawler uses a formal protocol model to generate test cases that are structurally plausible but contain specific, targeted flaws. The test cases pass framing layer checks and reach the application logic. They target the boundary conditions, state transitions, and field combinations where implementation vulnerabilities are most likely to sit. They are not random inputs. They are systematically generated negative test cases derived from knowledge of the protocol specification.

The monitoring layer captures the target’s response to each test case with the precision that useful output requires. Crashes are captured with the triggering input and the protocol state at the time of the crash. Unexpected responses are recorded with the exact response content. Protocol state violations are flagged with the message sequence that produced them. Every finding is scored and classified to give the security team a prioritised remediation list they can act on.

ProtoCrawler supports more than 100 protocols including Modbus, DNP3, IEC 61850, IEC 60870-5-104, GTP-C, GTP-U, DLMS COSEM, MQTT, SS7, and Diameter. The output maps directly to IEC 62443 compliance requirements, producing the SVV-3 vulnerability testing evidence, CR 3.5 input validation findings, and CR 7.1 denial-of-service protection assessments that certification audits require. For the full protocol list, see the protocol models page.

Ready to see what fuzz testing finds in the protocols your systems depend on? Book a demo to see ProtoCrawler in action against the specific protocols in your environment.


Common Questions About Fuzz Testing and Vulnerabilities

Does fuzz testing find all vulnerabilities in a system?

No. Fuzz testing finds vulnerabilities that arise from unexpected inputs, which covers a significant and consistently underassessed portion of the vulnerability space. It does not find business logic flaws that require contextual understanding of what the software is supposed to do. It does not assess the exploitability of findings in the way a penetration tester does. It does not replace static analysis for code-level issues that do not manifest at runtime. A programme that uses fuzz testing alongside static analysis and penetration testing covers significantly more of the vulnerability space than one that uses any of them alone.

How is fuzz testing different from fuzzing in the context of AI?

The term fuzzing appears in different contexts with different meanings. In software security, fuzzing and fuzz testing refer to the technique of generating unexpected inputs to find vulnerabilities in software. In the context of AI and machine learning, fuzzing sometimes refers to techniques for testing the robustness of machine learning models. The two uses of the term are distinct. This guide, and ProtoCrawler, address software security fuzzing: the systematic generation of unexpected inputs to find vulnerabilities in protocol implementations and connected devices.

What percentage of vulnerabilities does fuzz testing typically find?

There is no universal figure, because it depends on the type of software, the protocols in use, the quality of the fuzz testing approach, and what other testing has already been conducted. What the research and field data consistently show is that fuzz testing finds a significant proportion of the memory safety and input validation vulnerabilities that are not found by other methods. For protocol implementations specifically, protocol-aware fuzz testing regularly finds vulnerabilities in products that have passed penetration testing and static analysis, because it reaches input conditions that neither of those approaches systematically explores.

How long does fuzz testing take to find vulnerabilities?

It depends on the complexity of the target, the number of protocols and interfaces being tested, and the depth of testing required. Simple interfaces with limited state can be tested in hours. Complex protocol implementations with many message types and rich state machines require longer campaigns to achieve meaningful coverage. The practical answer for most protocol security assessments is that meaningful findings begin to emerge within the first few hours, with deeper findings accumulating over days of testing. ProtoCrawler is designed to produce prioritised findings continuously rather than at the end of a fixed testing period.

Can fuzz testing be run continuously in a CI/CD pipeline?

Yes. Continuous fuzz testing integrated into a development pipeline is one of the most effective ways to catch regressions as protocol implementations change. Each build triggers a fuzz testing run against the updated implementation, and any new findings are flagged before the change reaches production. For products with active development cycles, continuous fuzz testing provides ongoing regression coverage that periodic point-in-time assessments cannot match. ProtoCrawler supports integration into development pipelines for organisations that want this level of ongoing coverage.

Ready to see what fuzz testing finds in your protocol implementations? Book a demo to discuss how ProtoCrawler fits into your security programme.
