Why Small, Purpose-Trained AI Models Beat Frontier LLMs at Offensive Security

In live-browser exploit benchmarks, Novee’s 4B-parameter model achieved up to 90% accuracy, outperforming Claude 4 Sonnet and other frontier LLMs by about 55% or more.