Enhancing Small Language Models for Code Generation via Strategic Decomposition and Filtering

Code Generation · Language Models · Decomposition · Small Models · Benchmarking

This study addresses the challenge of enhancing Small Language Models (SLMs) for complex code generation tasks that require structured planning, which current models struggle with due to their monolithic, single-pass generation approach. A three-stage pipeline architecture is proposed that decouples strategic planning from implementation: (1) an SLM generates diverse natural language strategies at high temperature, (2) a filtering mechanism selects high-quality strategies while removing noise, and (3) the refined strategies guide a specialized coding model for final implementation. The approach was evaluated on the ClassEval benchmark for class-level code generation. The pipeline enabled a 1.5B-parameter model to achieve a 13% class success rate, a 30% relative improvement over direct generation (10%) and competitive with models 5-8 times larger. Critically, effective strategy filtering proved more important than strategy diversity, with simple pattern-based filters successfully mitigating SLM artifacts such as few-shot contamination. This work demonstrates that structured, inference-time computation offers an efficient alternative to parameter scaling, with strategic noise reduction being the key driver of performance gains in resource-constrained models.
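The three-stage pipeline and the pattern-based filter can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `propose` and `implement` callables stand in for the strategy-generating SLM and the specialized coding model, and the artifact patterns (leaked few-shot examples, embedded code fences) are assumed examples of the kind of SLM noise the paper describes.

```python
import re
from typing import Callable, List

# Hypothetical pattern-based filter: drop strategies showing common SLM
# artifacts such as leaked few-shot prompt fragments or embedded code,
# keeping only clean natural-language plans (patterns are illustrative).
ARTIFACT_PATTERNS = [
    re.compile(r"```"),                             # embedded code fences
    re.compile(r"^Example\s*\d*:", re.MULTILINE),   # leaked few-shot examples
    re.compile(r"^(Input|Output):", re.MULTILINE),  # leaked I/O demonstrations
]

def filter_strategies(strategies: List[str], min_words: int = 10) -> List[str]:
    """Stage 2: keep strategies that are substantive and artifact-free."""
    kept = []
    for s in strategies:
        if len(s.split()) < min_words:               # too short to be a plan
            continue
        if any(p.search(s) for p in ARTIFACT_PATTERNS):
            continue
        kept.append(s)
    return kept

def generate_code(task: str,
                  propose: Callable[[str, float], List[str]],
                  implement: Callable[[str, str], str],
                  temperature: float = 1.0) -> str:
    """Three-stage pipeline: propose strategies, filter, then implement."""
    strategies = propose(task, temperature)       # stage 1: diverse plans
    strategies = filter_strategies(strategies)    # stage 2: noise removal
    plan = strategies[0] if strategies else task  # fall back to the raw task
    return implement(task, plan)                  # stage 3: coding model
```

The key design point mirrored here is that filtering is cheap and model-free: simple lexical checks remove the bulk of the noise before any strategy reaches the coding model.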