Semester of Graduation

Spring 2026

Degree

Master of Science (MS)

Department

Computer Science & Engineering

Document Type

Thesis

Abstract

Pooled multi-dataset benchmarks are an attractive way to evaluate intrusion detection systems (IDS) across heterogeneous public corpora, but they can quietly reward shortcut features tied to capture schedules and dataset identity. This work introduces TRACER, an auditable benchmark specification that standardizes seven public IDS corpora into a shared transaction-window prediction unit and a shared label ontology, enabling controlled comparisons between compact sequence backbones and strong tabular baselines under matched splits, training budgets, and scoring rules.

Under this protocol, absolute clock time acts as a strong shortcut when pooled data are split at random. Enforcing time-robust controls (timestamp rebasing, circular shifts, and schedule-token masking) reduces pooled classification macro-F1 by 0.31–0.48 across sequence models and changes architecture rankings, indicating that protocol choice can dominate apparent model gains. In this benchmark, Mamba is the strongest compact sequence model under time-robust evaluation (0.594±0.102 pooled macro-F1, n = 3), while tree ensembles on window features remain stronger for pooled classification (XGBoost 0.761±0.023 pooled macro-F1, n = 3). Per-dataset results are heterogeneous and include rank reversals that pooled averages hide.
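To make the three time-robust controls concrete, the following is a minimal sketch and not the thesis's implementation. It assumes that rebasing means subtracting a window's earliest timestamp, that a circular shift rotates time-of-day modulo a fixed period, and that schedule tokens are members of a known vocabulary. The function names, the 86400-second period, and the `[MASK]` placeholder are all hypothetical.

```python
from typing import Collection, List, Sequence


def rebase_timestamps(ts: Sequence[float]) -> List[float]:
    """Shift timestamps so the earliest event in the window sits at t = 0.

    Assumption: rebasing is per window and relative to the minimum timestamp.
    """
    if not ts:
        return []
    t0 = min(ts)
    return [t - t0 for t in ts]


def circular_shift(ts: Sequence[float], offset: float,
                   period: float = 86400.0) -> List[float]:
    """Rotate timestamps by `offset` seconds, wrapping around at `period`.

    Assumption: the period is one day, so absolute time-of-day is scrambled
    while the cyclic structure is preserved.
    """
    return [(t + offset) % period for t in ts]


def mask_schedule_tokens(tokens: Sequence[str],
                         schedule_vocab: Collection[str],
                         mask: str = "[MASK]") -> List[str]:
    """Replace any token in the (hypothetical) schedule vocabulary with a mask."""
    return [mask if tok in schedule_vocab else tok for tok in tokens]
```

For example, `rebase_timestamps([100.0, 101.5, 103.0])` keeps the inter-event gaps but discards the absolute capture time, which is the information a model could otherwise use to infer dataset identity.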

Bounded split-hygiene audits sharpen the interpretation: zero exact-ID overlap does not imply content disjointness, and coarse group keys are still shared across splits in sampled audits. The practical implication is methodological rather than architectural: pooled IDS claims should report protocol controls, negative-control baselines, and split-hygiene evidence as first-class results.
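The distinction between exact-ID overlap, content overlap, and shared group keys can be illustrated with a small audit. This is a sketch under stated assumptions, not the audit procedure used in the thesis: the record fields `id`, `payload`, and `group` are hypothetical, and content identity is approximated here by a SHA-256 hash of the payload string.

```python
import hashlib
from typing import Dict, List

Record = Dict[str, str]


def content_hash(payload: str) -> str:
    """Hash a record's payload so identical content can be compared across splits."""
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def audit_split(train: List[Record], test: List[Record]) -> Dict[str, int]:
    """Count items shared between two splits at three levels of granularity.

    Assumed (hypothetical) record fields: "id", "payload", "group".
    """
    train_ids = {r["id"] for r in train}
    test_ids = {r["id"] for r in test}
    train_hashes = {content_hash(r["payload"]) for r in train}
    test_hashes = {content_hash(r["payload"]) for r in test}
    train_groups = {r["group"] for r in train}
    test_groups = {r["group"] for r in test}
    return {
        "id_overlap": len(train_ids & test_ids),
        "content_overlap": len(train_hashes & test_hashes),
        "group_overlap": len(train_groups & test_groups),
    }
```

Two records with different IDs but the same payload and group key yield zero ID overlap alongside nonzero content and group overlap, which is the situation the abstract warns about.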

Date

3-10-2026

Committee Chair

Ghawaly, James Michael, Jr.

LSU Acknowledgement

1

LSU Accessibility Acknowledgment

1
