As large language models (LLMs) become more widely adopted across industries, securing these powerful AI tools has become a growing concern. At Black Hat Asia 2025 in Singapore this week, a panel of experts delved into the question of whether LLM firewalls are the solution to AI security or simply another piece of the puzzle.

Matthias Chin, founder of CloudsineAI, a cyber security firm, kicked off the discussion by noting the differences between guardrails and firewalls: “Guardrails are a protective mechanism while a firewall is a security concept with a lot more functionalities – for example, it is a control point, incorporates guardrails, has a threat vector database and integrates with SIEM [security information and event management] workflows.”
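To make that distinction concrete, the sketch below shows roughly what a firewall-style control point might look like in code: it wraps a guardrail check and a lookup against a list of known attack patterns around a model call, and forwards each decision to a SIEM-style log. All names, patterns and functions here are illustrative assumptions, not any vendor's actual product or API.

```python
# Illustrative sketch only: an LLM "firewall" acting as a control point that
# combines a guardrail check, a threat-pattern lookup and SIEM-style logging.
# Every name here is hypothetical, not tied to any specific product.
import json
import logging
from typing import Callable

siem_logger = logging.getLogger("siem")  # stand-in for a real SIEM integration

# Stand-in for a threat vector database of known jailbreak phrases.
KNOWN_ATTACK_PATTERNS = {"ignore previous instructions", "reveal your system prompt"}

def guardrail_check(prompt: str) -> bool:
    """Guardrail: block prompts containing known attack phrases."""
    lowered = prompt.lower()
    return not any(pattern in lowered for pattern in KNOWN_ATTACK_PATTERNS)

def llm_firewall(prompt: str, call_model: Callable[[str], str]) -> str:
    """Control point: inspect the prompt, log the decision, then call the model."""
    allowed = guardrail_check(prompt)
    siem_logger.info(json.dumps({"event": "llm_request", "allowed": allowed}))
    if not allowed:
        return "Request blocked by LLM firewall policy."
    return call_model(prompt)
```

The point of the sketch is the architecture Chin describes: the guardrail is one function inside a larger control point that also records events for downstream security workflows.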

Pan Yong Ng, chief innovation officer and chief cloud engineer of Home Team Science and Technology Agency (HTX), noted the importance of integrating AI security into the foundation of an organisation’s IT infrastructure, though there remains uncertainty over where the security controls should be placed. He suggested a combination of controls at various layers, from inference-serving models to web application security and even extending to AI agents.

Laurence Liew, director for AI innovation at AI Singapore, said the growing use of AI agents and LLMs by developers will require the use of LLM firewalls to enforce guardrails and corporate policies. “We tell our young engineers to make sure they have certain guardrails in place, and they will do it, but they are often so busy with all the coding that the guardrails may not be updated,” he said.

Xiaojun Jia, research fellow at Singapore’s Nanyang Technological University, noted the limitations of traditional firewalls in addressing LLM-specific security issues. “Traditional firewalls focus on network security, making them less effective in defending against LLM jailbreaking attacks which exploit the logic flow of LLMs,” he said.

Chin added that LLM firewalls are not just for ensuring AI security – they also play a role in AI safety by preventing model hallucination and the generation of biased and toxic outputs. They can also guard against new-generation attacks that are executed in human language and prompts, rather than code.

The panellists also explored the question of whether a universal LLM firewall would be effective across all industries, or if customisation would be necessary.

Liew singled out AI Singapore’s transcription projects with governmental agencies, where speech-to-text engines are fine-tuned to the needs of each agency. In the same vein, he said, LLM firewalls should be carefully designed to handle specific scenarios such as healthcare and financial services.

In terms of implementing LLM firewalls, Jia advocated for a multi-layered approach that includes input detection, model tuning and output filtering.

“Input detection detects malicious inputs before the prompts are fed to the models, model tuning ensures outputs are aligned with human values, and output filtering detects harmful outputs,” Jia said, even as he acknowledged the challenge of balancing security with usability, calling for adaptive defences that can respond to evolving attacks.
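As a rough sketch of how the two runtime layers Jia describes – input detection and output filtering – might be chained around a model call (model tuning happens offline and is omitted), the snippet below uses trivial placeholder checks; in practice these would be learned detectors or policy engines, and none of the names here come from the panel.

```python
# Rough sketch of a multi-layered defence: input detection before the model
# call and output filtering after it. The checks are deliberately simplistic
# placeholders used only to show how the layers compose.
from typing import Callable

def detect_malicious_input(prompt: str) -> bool:
    """Input detection layer: flag prompts that look like jailbreak attempts."""
    suspicious_markers = ("ignore previous instructions", "disable your safety")
    return any(marker in prompt.lower() for marker in suspicious_markers)

def filter_harmful_output(response: str) -> str:
    """Output filtering layer: withhold responses that match a harmful-content check."""
    harmful_markers = ("how to build a weapon",)
    if any(marker in response.lower() for marker in harmful_markers):
        return "[response withheld by output filter]"
    return response

def guarded_completion(prompt: str, call_model: Callable[[str], str]) -> str:
    """Chain the layers: input detection -> model call -> output filtering."""
    if detect_malicious_input(prompt):
        return "[prompt rejected by input detection]"
    return filter_harmful_output(call_model(prompt))
```

The usability trade-off Jia mentions shows up directly in sketches like this: the stricter the markers and thresholds, the more legitimate prompts and answers get blocked.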

Testing and benchmarking are key to ensuring that LLMs and their firewalls work as intended. Chin said this area is still evolving and that the work will depend on test cases aligned with the sector where the firewalls will be deployed, whether banking or healthcare. He pointed to Meta’s CyberSecEval and Singapore’s AI Verify as examples of initiatives that can help to support testing and benchmarking efforts.

Liew pointed to the importance of diverse teams in building and testing LLMs. “It’s very important to have people across different disciplines,” he said. “Make sure you have people in the team who understand the domain. You’ll be surprised – the questions they ask are ones cyber security engineers never thought of.”

On whether LLM firewalls will hinder innovation in AI, Chin said with the adoption of emerging technologies such as model context protocol (MCP) – an open standard developed by Anthropic that lets AI agents communicate with other applications and data sources – there’s a chance that AI agents will bypass LLM firewalls and start communicating with other agents. “We have to allow innovation to thrive and continue to build the agility to deal with the challenges,” he added.

Chin said LLM firewalls will continue to evolve, driven by the rise of agentic AI frameworks, and that organisations will need some form of guardrails or firewalls, especially large enterprises and governments. Just as network firewalls have expanded to include web application firewalls and endpoint firewalls, he noted that LLM firewalls could take the form of hardware or software deployed at security control points and endpoints.


By itnews