**Maximum-Severity XXE Vulnerability Discovered in Apache Tika**
Apache Tika, a widely used open-source content analysis toolkit, contains a maximum-severity XML External Entity (XXE) injection vulnerability. Tracked as CVE-2025-66516 with a CVSS score of 10.0, the flaw affects Tika's core module, its PDF parser module, and the legacy tika-parsers module.
According to the advisory, the vulnerability is triggered by a crafted XFA file embedded inside a PDF, which causes Tika to process external XML entities. This opens a path for attackers to access sensitive internal resources. The affected modules are tika-core (1.13-3.2.1), tika-pdf-module (2.0.0-3.2.1), and tika-parsers (1.13-1.28.5), on all platforms.
XXE injection is a type of security vulnerability that occurs when an application parses XML input insecurely, allowing attackers to load external entities that reference files or URLs outside the document.
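To make the XXE mechanism concrete, the following is a minimal Java sketch of how a generic XML parser can be configured to refuse the DOCTYPE declarations that XXE payloads rely on. It uses only the JDK's standard JAXP API and is not Tika's actual fix; the class name `XxeHardeningSketch`, its methods, and the sample payload are hypothetical and exist only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Illustrative only: generic JAXP hardening, not the patch applied in tika-core.
final class XxeHardeningSketch {

    // Hypothetical payload: declares an external entity pointing at a local file.
    static final String XXE_PAYLOAD =
        "<?xml version=\"1.0\"?>"
        + "<!DOCTYPE r [<!ENTITY x SYSTEM \"file:///etc/passwd\">]>"
        + "<r>&x;</r>";

    // Builds a factory that rejects any document containing a DOCTYPE.
    static DocumentBuilderFactory hardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Without a DOCTYPE there is nowhere to declare an external entity.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory;
    }

    // Returns true if the hardened parser refuses the input, false if it parses.
    static boolean rejects(String xml) {
        try {
            hardenedFactory().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return false;
        } catch (SAXException e) {
            return true; // parser refused the document
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("unexpected parser setup failure", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("XXE payload rejected: " + rejects(XXE_PAYLOAD));
        System.out.println("Benign XML rejected:  " + rejects("<r>ok</r>"));
    }
}
```

Rejecting DOCTYPE declarations outright is the bluntest of the common defenses; it is a reasonable default wherever the XML being parsed has no legitimate need for a DTD. The parser's default error handler may also print a diagnostic to standard error when it refuses a document.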
"Critical XXE in Apache Tika tika-core (1.13-3.2.1), tika-pdf-module (2.0.0-3.2.1) and tika-parsers (1.13-1.28.5) modules on all platforms allows an attacker to carry out XML External Entity injection via a crafted XFA file inside of a PDF," the advisory reads.
This CVE covers the same vulnerability as CVE-2025-54988, but it expands the scope of affected packages in two ways:
- While the entrypoint for the vulnerability was the tika-parser-pdf-module as reported in CVE-2025-54988, the vulnerability and its fix were actually in tika-core.
- The original report failed to mention that in the 1.x Tika releases, the PDFParser was in the “org.apache.tika:tika-parsers” module.
Tika extracts text, metadata, and structured information from virtually any type of file. It is widely used in search indexes (e.g., Apache Solr, Elasticsearch), document ingestion pipelines, compliance tools, and content analysis platforms.
The project maintainers recommend installing the latest updates as soon as possible to prevent potential attacks. Users who have upgraded only the PDF module without upgrading tika-core to version 3.2.2 or later remain exposed.
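As a rough aid for auditing dependencies, the sketch below checks a module version against the affected ranges quoted in the advisory. The class `TikaVersionCheckSketch` and its methods are hypothetical, the PDF module is keyed by the artifact name `tika-parser-pdf-module` used in the advisory's scope note, and the comparison assumes purely numeric dotted versions (pre-release qualifiers such as `-BETA` are not handled).

```java
import java.util.Map;

// Illustrative only: inclusive affected ranges as quoted in the advisory.
final class TikaVersionCheckSketch {

    // artifact -> { lowest affected version, highest affected version }
    static final Map<String, String[]> AFFECTED = Map.of(
        "tika-core", new String[] {"1.13", "3.2.1"},
        "tika-parser-pdf-module", new String[] {"2.0.0", "3.2.1"},
        "tika-parsers", new String[] {"1.13", "1.28.5"});

    // Compares dotted numeric versions; missing components count as 0.
    static int compare(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    // True if the version falls inside the affected range for that artifact.
    static boolean isAffected(String artifact, String version) {
        String[] range = AFFECTED.get(artifact);
        return range != null
            && compare(version, range[0]) >= 0
            && compare(version, range[1]) <= 0;
    }

    public static void main(String[] args) {
        // The partial-upgrade case: PDF module patched, tika-core left behind.
        System.out.println("pdf module 3.2.2 affected: "
            + isAffected("tika-parser-pdf-module", "3.2.2"));
        System.out.println("tika-core 3.2.1 affected:  "
            + isAffected("tika-core", "3.2.1"));
    }
}
```

The example in `main` mirrors the partial-upgrade case: a patched PDF module alongside tika-core 3.2.1 still leaves the deployment inside the affected range.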