BSI Turns AI Finance Into An Audit Question
Germany’s most useful AI compliance document for finance is not a manifesto. It is a test catalogue.
The Federal Office for Information Security, the BSI, has put a finance-specific frame around trustworthy AI systems. Its AI page points to the Test Criteria Catalogue for AI Systems in Finance, developed under the AICRIV project. The catalogue is aimed at developers, providers, operators and testing organisations in the financial industry. It is written in English because the intended audience is not only German banks and insurers, but financial and insurance institutions across the EU.
That language choice matters. Germany is using a national cyber authority to produce a working control instrument for a European regulatory problem. The EU AI Act sets obligations. Financial firms still need to translate those obligations into checks, artefacts and governance routines. BSI’s catalogue is one of the clearer attempts to make that translation operational.
The BSI describes the catalogue as covering nearly 100 practical test criteria. The areas listed are not cosmetic: IT security, data quality, model robustness, governance, human oversight and performance. Those are the same control families that decide whether an AI system in a bank is merely well described or actually manageable. In finance, that distinction is the difference between a model inventory and a supervisory file that can survive a challenge.
This is also why the catalogue is more important than another high-level AI ethics note. Banks and insurers already know they need policies for model risk, outsourcing, information security, data protection and consumer treatment. The harder problem is proving that a specific AI system satisfies those policies. A credit model, a fraud-detection tool, an AML monitoring system or an internal coding assistant each creates a different evidence burden. The BSI catalogue pushes the conversation toward testability.
BaFin’s own risk work explains the demand side. In its 2025 assessment of the digitalisation trend, the supervisor says financial firms are increasingly using artificial intelligence, especially generative AI, across the value chain. It also describes many generative AI initiatives as still being in pilot or test phases. The most common large-language-model uses are internal assistants, document preparation and developer support. Customer contact exists as a potential use case, but BaFin says it has so far been relatively limited.
That is a sober picture. The current German supervisory issue is not a sudden wave of autonomous customer-facing bank bots. It is the quieter spread of AI into internal production, risk and operations. Those systems can change how documents are processed, how code is written, how alerts are triaged and how staff make decisions. They may not sit in front of the consumer. They still sit inside the control environment.
BSI’s catalogue lands in that gap. It gives the financial sector a way to ask whether an AI system is secure, whether the data pipeline is fit for purpose, whether robustness has been tested, whether human oversight is real and whether performance claims can be validated. That is the language auditors, model-risk teams and supervisors can use together.
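To make "testability" concrete, here is a minimal Python sketch of how a firm might track evidence per control area for a single AI system. The six areas are the ones the BSI lists. Everything else, including the AISystemFile class, its methods and the artefact names, is a hypothetical illustration and is not taken from the catalogue.

```python
from dataclasses import dataclass, field

# Control areas as named on the BSI page. The structure around them
# is an assumption for illustration, not the catalogue's own schema.
CONTROL_AREAS = (
    "IT security",
    "data quality",
    "model robustness",
    "governance",
    "human oversight",
    "performance",
)


@dataclass
class AISystemFile:
    """Hypothetical evidence file for one AI system."""
    name: str
    evidence: dict = field(default_factory=dict)  # area -> list of artefacts

    def add_evidence(self, area: str, artefact: str) -> None:
        # Reject areas outside the listed control families.
        if area not in CONTROL_AREAS:
            raise ValueError(f"unknown control area: {area}")
        self.evidence.setdefault(area, []).append(artefact)

    def gaps(self) -> list:
        # Areas with no artefact on file, in the order listed above.
        return [a for a in CONTROL_AREAS if not self.evidence.get(a)]


# Usage: an internal coding assistant with only two areas documented.
assistant = AISystemFile("internal coding assistant")
assistant.add_evidence("IT security", "penetration test report (hypothetical)")
assistant.add_evidence("human oversight", "code review policy (hypothetical)")
print(assistant.gaps())
```

The design choice is deliberate: the file records artefacts, not assertions. An area with no artefact shows up as a gap, which is the kind of question an auditor or model-risk team would raise first.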
There is a second German angle. The BSI is a cyber authority, not a conduct regulator. Its entry point is security and trustworthiness. That makes the catalogue especially relevant where AI risk overlaps with operational resilience: poisoned data, adversarial inputs, opaque third-party models, undocumented dependencies, weak monitoring and brittle fallbacks. For banks that are already adapting to DORA, the catalogue points toward the next layer of scrutiny. It is not enough to say the ICT service is resilient if the AI component inside the process cannot be tested.
The catalogue also reduces one common excuse in AI governance. Firms often argue that standards are too unsettled to build detailed controls. That is partly true. The AI Act still needs implementation practice, and sector supervisors will refine expectations. But the absence of final case law does not mean the absence of testable controls. The BSI has listed concrete dimensions and linked the work to existing standards and regulation. A bank can disagree with a criterion. It cannot credibly say there is no starting point.
For international readers, the point is not that Germany has solved AI supervision. It has not. The point is that the German compliance stack is becoming more operational before many firms have moved beyond pilots. BaFin is watching AI use in finance. BSI is offering test criteria for finance-specific AI systems. The EU AI Act supplies the legislative frame. Together, they create a direction of travel: evidence before scale.
That will suit some institutions better than others. Large banks and insurers already have model-risk, information-security and internal-audit teams that can absorb a catalogue and turn it into control mappings. Smaller fintechs and service providers may find the evidence demand heavier. But that burden is also the message. In German financial supervision, AI adoption is not judged only by whether a use case is innovative. It is judged by whether the institution can show how the system behaves, how it fails and who is accountable when it does.
The first real AI compliance race in German finance may therefore be less glamorous than the market expects. It is not a race to deploy the most visible assistant. It is a race to build files that satisfy security, governance and audit questions before the pilot becomes infrastructure.