XML External Entity (XXE) Attack

Jul 17, 2022

In this article, I will write about the XML External Entity attack. For this attack to occur, the application must have logic for parsing XML input.

This injection will happen if there is a weakly configured XML parser. A successful attack would be if the attacker would be able to view files on the application server and interact with the backend. This XXE vulnerability could be used to perform server-side request forgery (SSRF) attacks, denial of service (DoS) Billion Laughs Attack, and many more.

What are XXE types?

There is no strict classification of XXE attacks, but we can divide them into two types: in-band and out-of-band(blind).

· In-band are more common than out-of-band ones. In this case, the attacker will receive an immediate response to the XXE payload.

· Out-of-band or so-called Blind XXE, there is no immediate response. This type involves the creation of an external Document Type Definition. For this type, the XML parser also needs to make an additional request to an attacker-controlled server.

What are the cases when attacker can execute this injection?

· In old applications where the version of SOAP is less than 1.2

· Applications where users are logged in based on their sessions - SAML(single sign-on (SSO) login standard). Chances for this attack to happen in this case can be very high because SAML uses XML for identity assertions

· If there are XML inputs or XML uploads into XML documents that can be added from untrusted data and parsed by an XML processor after that.

· There is a high risk when Document Type Definitions (DTD) is enabled


When would application parse XML?

XML is often used in both: frontend and backend web development.


The Frontend side of the application can request, for example, an XML file from API and create and present a UI form based on the data in XML. Then we can have an option to add a new field into the form and if we would like to save the changes. Afterward, the XML input would be added into the XML document.

From the backend parsing, XML would be used to transfer the data in some standard format. Also, in mobile development, Android applications use it to create layouts and store configurations.

On the OWASP site, you can find more examples of XXE attacks. Portswigger has a nicely explained example of this attack:

For example, suppose a shopping application checks for the stock level of a product by submitting the following XML to the server:

The application performs no particular defenses against XXE attacks, so you can exploit the XXE vulnerability to retrieve the /etc/passwd file by submitting the following XXE payload:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>

This XXE payload defines an external entity &xxe; whose value is the contents of the /etc/passwd file and uses the entity within the productId value. This causes the application's response to include the contents of the file:


Invalid product ID:


List of preventions for XXE

  • Using JSON instead of XML and avoiding serialization of sensitive data

  • As I mentioned before, this attack can happen easily when the application is using SOAP < 1.2, so try to update to the higher version

  • Implement XSD validation in your application ("XML Schemas") for all XML file inputs

  • Patch or upgrade all XML libraries

  • Use SAST tools for checking out if there are XXE vulnerabilities.


How to prevent if you are using SAML?

SAML language is used to construct authorization statements, whose authenticity is protected by the XML digital signature applied over the statements.

Many attacks happen because of wrong assumptions made by developers; for example, the token is always properly formed XML compliant with SAML schema.

The developers can assume that SAML would have just one Assertion tag in the document (the properly formed SAML would have). With that fact, developers can validate just the first element they get when searching for elements by the tag name in the XML document.

To get list of nodes JS "getElementsByTagName" method can be used:

NodeList xmlNodes = doc.getElementsByTagName("saml:Assertion");

To xmlNodes will be assigned the list of matching elements from document with tag Name "saml:Assertion".

As developers can assume that this is the properly formed SAML with one Assertion tag, they will get the first element and validate it after:

let firstElement = (Element)xmlNodes.item(0);

*As you can guess, this is not the proper way to validate the tag because the attacker can also assume that developers used this approach for the validation. In this case, the attacker can catch the first element (tag) and replace it with a malicious assertion before the original one, and it will never be detected.

With the same logic, some developers use "getElementsByTagNameNS" but the result would be the same: easily inserted malicious script in the first element.

Proper prevention would be:

· Parsing the XML document. Using structure validation based on the supplied schema. Never allow automatic download of schemas from the third party but prefer to use local trusted copies. It would also be good if it is possible to inspect schemas and perform schema hardening. This could be used to disable possible wildcard types or relaxed processing statements.

· Digital signature validation, which verifies the authenticity and integrity of the assertion embedded in the SAML document. This prevents forgery.

**Most important when writing schema is to describe the intended document's structure precisely.


How to prevent using XSD validation?

I will explain how to create a C# solution to validate XML data.

The most important reason we want to use XSD (XML Schema Definition) validation is that we want the sender and receiver to have the same "expectations" about the content. Using schemas, we need to describe exactly the data so both parties would be clear about them.


· Add XML file into the code

When adding XML file, you will just see xml tag:

<?xml version="1.0" encoding="utf-8" ?>

I will add object User with properties FirstName, LastName, Address, so xml file would look like this:

· Create XML Schema for this file

You will get XML schema structure like this:

· Modify XSD

Now you can modify the file- add validations for FirstName and Address. In this case, I just show how to add validations for these fields, but they will, of course, not prevent the attack; they will just validate the length and the type of mentioned fields.

· Validate XML using XSD

What am I doing in the code?

  • Getting the local path of Assembly so I can after add XML file name and XSD file name to get their full paths

  • Creating schema using XmlSchemaSet and XmlSeverityType which are from System.Xml.Schema

  • Using XMLReader from System.XML so I can create XDocument imported from System.Xml.Linq

  • When I create document, I want to use validate method that class has and pass schema by which I will validate and the method ValidationEventHandler (I named it like that) which is throwing exception if type is error. In this method you should add all validation logic.

This is just an example on how to create XSD for XML file and which libraries you can use for the validation.

How to prevent with implementation of DTD?

We can also validate XML file using DTD. Here are some differences between XSD and DTD on site.

In this example, I am validating an XML file using a DTD file with DtdProcessing.


  • Setting the validation settings using XmlReaderSettings

  • Creating the XmlReader object so I can parse the file using the method read()

  • Creating ValidationEventHandler method which is throwing an exception if the type is an error. In this method, you should add all validation logic.

List of SAST testing tools

SAST testing tools will help you with static application security testing.

SAST tools can be free, commercial, and open-source tools.

A list of the most popular SAST Tools currently are:

  • Veracode

  • LGTM

  • Checkmarx

  • Klocwork

  • Reshift

  • SpectralOps

  • HCL AppScan

  • Codacy

  • Insider CLI

  • Argon


Why is SOAP version < 1.2 vulnerable to XXE attack and why you should use later versions?


Before version 1.2 external entities were allowed within SOAP messages.

Since version 1.2 some changes were introduced to the envelope and encoding schemas. Both schemas have been updated to be compliant with the XML Schema Recommendation.

You can see the list of recommendations which were used:

· http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/

· http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/

· http://www.w3.org/TR/1999/REC-xml-names-19990114

· http://www.w3.org/TR/2000/REC-xml-20001006

· http://www.w3.org/TR/2000/PR-xlink-20001220/

Also, additional changes occurred in this version, within the names of datatypes in the XML Schema specification, and some datatypes were removed. If you want check out all changes which were made you can go to this site.



This article presented some prevention steps that could help you defend your application from XXE attack.

The OWASP team, which is constantly working to discover new ways the attackers can exploit your application and perform their malicious actions, are always updating their Prevention Cheat Sheet.

The best way to secure your application would be to always be up to date with the new prevention ways: best libraries to use, best detection tools, etc.

In the end, secure code is the cheapest code!    

Cover photo by Joshua Woroniecki

#XXE_attack #XSD #DTD #SAML #vicarius_blog


  • #vicarius_blog


Written by

Jenny R

Recent Posts

  • 1

    Not So Fast: Analyzing the FastCompany Hack

    John Kilhefner September 29, 2022
  • 2

    How to test application with ZAP - Part Two

    Jenny R September 28, 2022
  • 3

    How to test application with ZAP - Part One

    Jenny R September 28, 2022
  • 4

    The World's Worst Hackers Have Flags

    Paul Lighter September 27, 2022
  • 5

    Intro to Windows (Win32) API

    acephale 4w September 26, 2022

Start Closing Security Gaps

  • Risk reduction from Day 1
  • Fast set-up and deployment
  • Unified platform
  • Full-featured 30-day trial