This is a guest post written by Miguel del Rio Fernandez, senior speech scientist at Rev, a company known for its speech-to-text service for the legal trade and elsewhere.
If you were to stop a (relatively technical) stranger on the street and ask what AI is, chances are they would describe a large language model (LLM) like ChatGPT, Gemini, Claude, or DeepSeek-R1, which have defined the AI space lately… so what does del Rio Fernandez think we need to know next?
He writes in full as follows…
These well-known AI models are just one representation of what generative AI technology offers. Not to be overlooked are small language models (SLMs), which offer efficiency and edge-computing capabilities that LLMs can’t match.
Though SLMs’ applications are narrower than those of LLMs, they offer distinct advantages at a lower cost. Selecting the right kind of model for a specific use case – and understanding LLMs’ and SLMs’ respective strengths and limitations – is key to maximising efficiency, accuracy and sustainability in AI solutions.
SLM advantages
SLMs are more cost-effective, requiring less computational power and reducing training time and costs. This allows for faster prototyping, fine-tuning, and rollout of solutions. That rapid iteration also makes diagnosing issues with the data much easier and helps optimise solutions to tasks faster.
SLMs are often associated with sensitive, mission-critical information, but that doesn’t make them less secure than LLMs – in fact, the opposite is true. Deploying an SLM on-premises, as close to the endpoint as possible, allows for better data privacy than cloud deployments, as well as lower latency for the end user.
SLMs are also more environmentally sustainable than LLMs. Because of their smaller size, they consume less energy and resources in both training and inference environments, shrinking their environmental footprints.
SLM limitations
The specificity of SLMs cuts both ways. They perform well only on the specific tasks they are trained on and, like all machine learning models, are affected by biases in their training data. Their capabilities are bounded by the scope of their training: performance degrades fast when they’re tasked with something outside that scope, and drawing on outside knowledge to reduce dataset bias isn’t an option.
Unlike SLMs, LLMs can perform reasonably well on tasks beyond their training scope and are better suited to tackle complex tasks. Though they also suffer from bias, LLMs’ broader training and generalisation capabilities allow them to learn from other sources without as much dataset-specific bias as SLMs encounter.
Optimal SLM use cases
SLMs are generally best suited for speed- and resource-constrained tasks or tasks where domain-specific knowledge will solve a problem. These are proven solutions with a wide range of applications, even in today’s post-LLM world.
SLMs can be impactful in key verticals including customer relationship management, finance and retail – for example, as categorisation, risk assessment or sentiment analysis solutions. Regardless of the vertical, stakeholders must carefully analyse the task at hand to choose the best solution, whether that’s an SLM or LLM, support vector machine or decision tree.
Even the choice between the two types of models has its complexities.
To figure out whether an LLM, an SLM, or even multiple SLMs are best suited to an application, ask yourself three things:
- Is the problem well-defined? In cases where the task is straightforward, specialised and has a reasonable amount of data to train your model on, odds are an SLM can be just as good as, if not better than, a general LLM.
- Does the problem require broad generalisation or complex understanding? These tasks often require the extensive knowledge and capabilities of an LLM. You could theoretically train an SLM to get the job done, but an LLM is almost certainly going to outperform it.
- What are your constraints? This is the ultimate deciding factor between using an LLM and multiple SLMs. Speed- or resource-constrained environments will struggle to run, or even load, LLMs – in that case, SLMs are the only choice, even if it means worse performance. But if managing multiple models is too difficult or expensive for your team, an LLM can be a valid choice, even if SLMs could produce better results.
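As a rough illustration, the three questions above can be encoded as a simple selection heuristic. This is a hypothetical sketch of the reasoning, not a production decision procedure – the flag names and the ordering of the checks are assumptions for illustration only.

```python
def choose_model(well_defined: bool,
                 needs_broad_generalisation: bool,
                 resource_constrained: bool,
                 can_manage_many_models: bool) -> str:
    """Toy heuristic mirroring the three questions above (illustrative only)."""
    # Hard constraint first: tight speed/resource budgets rule out LLMs.
    if resource_constrained:
        return "SLM"
    # Broad generalisation or complex understanding favours an LLM.
    if needs_broad_generalisation:
        return "LLM"
    # A well-defined, specialised task with training data suits an SLM --
    # unless the operational cost of managing many models is too high.
    if well_defined and can_manage_many_models:
        return "SLM"
    return "LLM"

print(choose_model(True, False, False, True))   # -> SLM
print(choose_model(False, True, False, True))   # -> LLM
```

In practice the answers are rarely clean booleans, but the priority order – constraints first, then generalisation needs, then task specificity – is the useful part of the exercise.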
You might also consider domain-specific LLMs, which can be as good as or better than SLMs, since they’re trained on both more general and domain-specific data. This edge can allow domain-specific LLMs to generalise much more effectively than SLMs, even on within-domain tasks.

Rev’s del Rio Fernandez: fine-tuned SLMs are often preferable to domain-specific LLMs on narrowly defined domains and tasks.
That said, fine-tuned SLMs are often preferable to domain-specific LLMs on narrowly defined domains and tasks, as well as in cases with strict speed and/or resource constraints.
Key takeaways
With the spread of open-source models fuelling innovation, developers can spin up new SLMs and domain-specific LLMs more easily than ever. There is an ever-expanding world of AI solutions out there.
It takes deep problem knowledge – as well as an understanding of each model type’s optimal use cases – to determine where in your AI system to deploy SLMs, LLMs, or both. With intelligent routing to direct tasks to the right model, a multi-model system can make a big impact on both your bottom line and the user experience.
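A multi-model system of that kind can be sketched as a small router that inspects each incoming task and dispatches it to a registered model. Everything here – the task fields, the latency threshold, the stub model callables – is a hypothetical, minimal sketch of the idea, not any particular product’s architecture.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Task:
    text: str
    domain: str            # e.g. "legal", "finance", "general"
    latency_budget_ms: int

class ModelRouter:
    """Route a task to a fine-tuned SLM when one covers its domain and the
    latency budget is tight; fall back to a general LLM otherwise.
    (Illustrative sketch; real routing might use a learned classifier.)"""

    def __init__(self, llm: Callable[[str], str]):
        self.llm = llm
        self.slms: Dict[str, Callable[[str], str]] = {}

    def register_slm(self, domain: str, model: Callable[[str], str]) -> None:
        self.slms[domain] = model

    def route(self, task: Task) -> str:
        slm = self.slms.get(task.domain)
        if slm and task.latency_budget_ms < 200:
            return slm(task.text)
        return self.llm(task.text)

# Stub callables standing in for real inference endpoints.
router = ModelRouter(llm=lambda t: f"LLM handled: {t}")
router.register_slm("legal", lambda t: f"legal SLM handled: {t}")

print(router.route(Task("summarise this deposition", "legal", 100)))
print(router.route(Task("write a poem", "general", 100)))
```

The design choice worth noting is that the fallback path always exists: tasks outside any SLM’s domain degrade gracefully to the general model rather than failing outright.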