Architecture

WASMShark is composed of three independent analysis layers that each produce a verdict, which are then correlated for high-confidence detection.

┌─────────────────────────────────────────────────────────────┐
│                        WASMShark                            │
│                                                             │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │   STATIC    │  │   DYNAMIC   │  │   eBPF RUNTIME      │  │
│  │  ANALYSIS   │  │  (Wasabi)   │  │    MONITOR          │  │
│  │             │  │             │  │                     │  │
│  │ • Parser    │  │ • Instr     │  │ • execve()          │  │
│  │ • CFG       │  │   counting  │  │ • mmap W+X          │  │
│  │ • Taint     │  │ • Call graph│  │ • mprotect EXEC     │  │
│  │ • Entropy   │  │ • State     │  │ • connect()         │  │
│  │ • Rules     │  │   machine   │  │ • /proc monitor     │  │
│  │ • Plugins   │  │ • Dyn CFG   │  │                     │  │
│  └──────┬──────┘  └──────┬──────┘  └──────────┬──────────┘  │
│         │                │                    │             │
│         └────────────────┴────────────────────┘             │
│                          │                                  │
│                ┌─────────┴──────────┐                       │
│                │     CORRELATOR     │                       │
│                │                    │                       │
│                │ CONFIRMED_AUTORUN  │                       │
│                │   CONFIRMED_XOR    │                       │
│                └─────────┬──────────┘                       │
│                          │                                  │
│                   ┌──────┴──────┐                           │
│                   │   VERDICT   │                           │
│                   │ MALICIOUS   │                           │
│                   │ 100.0/100   │                           │
│                   └─────────────┘                           │
└─────────────────────────────────────────────────────────────┘

Module Map

Module	Responsibility
`wasmshark.py`	CLI entry point, orchestrates all modules
`wasmshark_core.py`	Parser, disassembler, CFG builder, taint analysis, rule engine, scoring, HTML/JSON/SARIF generation
`wasmshark_advanced.py`	WASI capability analyzer, loop characterizer, obfuscation classifier, section anomaly detector, scan history
`wasmshark_cfg_analysis.py`	Dominance tree, SCC, natural loops, irreducibility, path counting, CFG fingerprinting
`wasmshark_dynamic.py`	State machine extraction, dynamic CFG reconstruction, static/dynamic divergence analysis
`wasmshark_wasabi.py`	Wasabi instrumentation runner, Node.js integration, analysis JS, result parser
`wasmshark_ebpf.py`	bpftrace eBPF monitor, /proc polling, alert generation, runtime report
`wasmshark_watch.py`	File system watcher, CI/CD integration
`wasmshark_yara.py`	YARA rule integration (optional)

Data Flow

WASM binary
    │
    ▼
WASMParser.parse()
    │ AnalysisReport
    ▼
ScoringEngine.score()
    │
    ├──► RuleEngine.evaluate()
    │        │ matched_rules, findings
    │
    ├──► PluginManager.run()
    │        │ plugin_results
    │
    ├──► WasabiRunner.run()          (--wasabi)
    │        │ WasabiResult
    │        ▼
    │    extract_state_machine()
    │    reconstruct_dynamic_cfg()
    │    analyze_divergence()
    │
    ├──► generate_html_report()      (--html)
    ├──► to_json_report()            (--json)
    └──► to_sarif()                  (--sarif)

Plugin Interface

class WASMPlugin:
    name:        str   # Plugin identifier
    description: str   # Human-readable description
    version:     str   # Version string

    def analyze(self, report: AnalysisReport) -> dict:
        """
        Receive the complete analysis report.
        Return a dict of plugin results.
        Keys should be snake_case strings.
        Must include a 'summary' key.
        """
        ...

Rule Engine

Rules are parsed from .wsr files using a simple block format. The rule engine evaluates conditions against the AnalysisReport object and creates Finding objects for each match.

Condition evaluation supports:

String matching against report fields
Numeric comparisons (>, <, >=, <=)
Boolean flags (has_start_func, is_wasi, etc.)
Crypto constant presence checks
Score threshold checks