A Python RESTful API framework for online malware analysis and threat intelligence services.
_ _
| | | |
_ __ ___ __ _| |___ _ _| |__ +--+
| '_ ` _ \ / _` | / __| | | | '_ \ +--+|
| | | | | | (_| | \__ \ |_| | |_) | | |+
|_| |_| |_|\__,_|_|___/\__,_|_.__/ +--+
malsub is a Python 3.6.x framework that wraps several web services of online malware and URL analysis sites through their RESTful Application Programming Interfaces (APIs). It supports submitting files or URLs for analysis, retrieving reports by hash values, domains, IPv4 addresses or URLs, downloading samples and other files, making generic searches and getting API quota values. The framework is designed in a modular way so that new services can be added with ease by following the provided template module and using the provided functions to make HTTP GET and POST requests and to pretty-print results. This approach avoids having to write individual and specialized wrappers for each and every API by leveraging what they have in common in their calls and responses. The framework is also multi-threaded and dispatches service API functions across a thread pool for each input argument, meaning that it spawns a pool of threads for each file provided for submission or for each hash value provided for report retrieval, for example.
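The dispatching described above can be pictured with a minimal sketch based on the standard library. The names `dispatch`, `report_a` and `report_b` are hypothetical and only illustrate the idea of one thread pool per input argument; the actual scheduling inside malsub may differ.

```python
from concurrent.futures import ThreadPoolExecutor


def dispatch(functions, arguments, max_workers=4):
    """Run every service function against every input argument.

    One pool is created per input argument, mirroring the description
    above. This is an illustration, not malsub's own code.
    """
    results = {}
    for arg in arguments:
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            # submit one task per service function for this argument
            futures = {name: pool.submit(fn, arg) for name, fn in functions.items()}
            # collect the results once the tasks for this argument finish
            results[arg] = {name: fut.result() for name, fut in futures.items()}
    return results


# hypothetical stand-ins for two service API functions
def report_a(value):
    return f"A:{value}"


def report_b(value):
    return f"B:{value}"


reports = dispatch({"a": report_a, "b": report_b}, ["hash1", "hash2"])
```

Here `reports` maps each input argument to the results returned by each function for that argument.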
The following services are currently included in malsub:
Most of these services require API keys, which are generated after registering an account on their respective websites and need to be specified in the apikey.yaml file according to the given structure. Note that some of the bundled services are limited in supported operations because they were developed with free API keys. API keys associated with paid subscriptions are allowed to make additional calls not open to the public and may not be restricted by a given quota. Nevertheless, malsub can process multiple input arguments and pause between requests as a workaround for cooldown periods.
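The idea of pausing between requests can be sketched as follows. The function `query_with_pause` and its parameters are hypothetical and are not part of malsub; the `sleep` argument is injectable only so that the example runs without actually waiting.

```python
import time


def query_with_pause(call, inputs, pause=0, sleep=time.sleep):
    """Call `call` on each input, waiting `pause` seconds between calls."""
    results = []
    for index, item in enumerate(inputs):
        if index > 0 and pause > 0:
            sleep(pause)  # wait out the cooldown before the next request
        results.append(call(item))
    return results


# usage with a fake request function and a recorded (non-blocking) sleep
waits = []
answers = query_with_pause(str.upper, ["abc", "def", "ghi"], pause=60, sleep=waits.append)
```

With three inputs, the sketch waits twice, once before each request after the first.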
The main goal of malsub is to serve as a one-stop shop for querying multiple online malware analysis services and for aiding investigators. It is thus suitable for incident responders, forensic and malware analysts, and security practitioners alike.
malsub requires a few modules that are specified in requirements.txt. The framework is structured into a package and subpackages. Its folder structure and some key files to be taken into consideration when using it or developing additional service modules are described as follows:

malsub/malsub.py: application entry point;
malsub/data/: miscellaneous data folder;
    apikey.yaml: YAML data file of the API key and username pairs;
malsub/downl/: downloads folder of files and samples;
malsub/malsub/: malsub package;
malsub/malsub/common/: modules that have a common use all throughout;
    out.py: module with output displaying functions according to specific formats and log level (debug, verbose, informational or error);
    frmt.py: module with pretty display functions like dictionary to JSON and tabular formats;
    rw.py: module with read and write functions;
malsub/malsub/core/: core modules of the application;
    web.py: module responsible for handling HTTP requests;
malsub/malsub/service/: services developed as modules parsed during runtime;
    base.py: base template module for service construction.

The supported options are the following:
Usage: malsub [-h] [-a <service>] [-H] [-p <num>] [-R] [-v ...]
              [-d | -f | -q | -r | -s | -t]
              [-i | -o | -l | -u]
              [<input> ...]

Interact with online malware, URL and intelligence analysis services for malware
samples, domain names, IP addresses or URLs.

Options:
  -h, --help                show this help message and exit
  -a, --analysis <service>  character-separated list of services (class or short names) [default: all]
  -H, --servhelp            show help messages about selected services and exit
  -p, --pause <num>         wait an interval in seconds between service requests (rate limit) [default: 0]
  -R, --recursive           recurse on input paths
  -v, --verbose             display verbose and debug messages

API functions:
  -d, --download            download files or malware samples
  -f, --find                search for arbitrary terms (input format irrelevant)
  -q, --quota               retrieve API user quota
  -r, --report              retrieve submission reports for domains, files, hash values, IP addresses or URLs
  -s, --submit              submit malware samples or URLs for analysis
  -t, --test                test API calls by calling each service function as defined with some default values

Input formats (hash values or files are given as default depending on options):
  -i, --ipaddr              input are IPv4 addresses (applies to '-r' only)
  -o, --domain              input are domain names (applies to '-r' only)
  -l, --appl                input are hash values for application lookups (applies to '-r' only)
  -u, --url                 input are URLs (applies to '-r' and '-s' only)

Supported hash values: MD5, SHA1, SHA-256 and SHA-512.
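The four supported hash types produce hexadecimal digests of different lengths (32, 40, 64 and 128 characters), so they can be told apart by length alone. The sketch below illustrates this; `hash_type` is a hypothetical helper and not necessarily how malsub classifies its input.

```python
import re

# hexadecimal digest lengths of the supported algorithms
HASH_LENGTHS = {32: "MD5", 40: "SHA1", 64: "SHA-256", 128: "SHA-512"}


def hash_type(value):
    """Return the algorithm name for a hex digest, or None if unrecognized."""
    value = value.strip()
    if not re.fullmatch(r"[0-9a-fA-F]+", value):
        return None  # not a hexadecimal string
    return HASH_LENGTHS.get(len(value))


md5_kind = hash_type("d41d8cd98f00b204e9800998ecf8427e")
sha256_kind = hash_type("4293c1d8574dc87c58360d6bac3daa182f64f7785c9d41da5e0741d2b1817fc7")
```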
$ python3.6 malsub.py -a avc,ha -q -v
$ python3.6 malsub.py -vva VirusTotal -su <url>
$ python3.6 malsub.py -a mt,qs,virustotal -p 60 -s <file1> <file2>
$ python3.6 malsub.py -a VxStream,vt -rRv <file> <path> <hash>
$ python3.6 malsub.py -or <domain>
$ python3.6 malsub.py -a all,-ThreatCrowd,-mt -or <domain>
$ python3.6 malsub.py -a pe -r <hash>
$ python3.6 malsub.py -a ms -d <hash>
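The '-a' values in the examples above mix class names, short names, 'all' and exclusions prefixed with '-'. One possible way to resolve such a selection is sketched below, assuming a comma separator and case-insensitive matching as the examples suggest. The function `select_services`, the registry contents (apart from the VirusTotal/vt pair) and the short names 'tc' and 'ex' are hypothetical.

```python
def select_services(spec, services):
    """Resolve a comma-separated selection against {full name: short name}.

    Names match case-insensitively, 'all' selects every service and a
    leading '-' excludes one. Unknown names raise KeyError.
    """
    lookup = {}
    for name, sname in services.items():
        lookup[name.lower()] = name
        lookup[sname.lower()] = name

    selected, excluded = [], set()
    for token in spec.split(","):
        token = token.strip().lower()
        if not token:
            continue
        if token == "all":
            selected.extend(services)
        elif token.startswith("-"):
            excluded.add(lookup[token[1:]])
        else:
            selected.append(lookup[token])
    # drop exclusions and duplicates while keeping the order of appearance
    return [name for name in dict.fromkeys(selected) if name not in excluded]


# hypothetical registry; 'tc', 'ExampleService' and 'ex' are made up
registry = {"VirusTotal": "vt", "ThreatCrowd": "tc", "ExampleService": "ex"}

picked = select_services("all,-ThreatCrowd,-ex", registry)
merged = select_services("vt,virustotal", registry)
```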
Modules of services are developed as subclasses of the Service class in malsub/service/base.py. Service is an abstract class that lays out the attributes and functions that must be implemented by subclasses to ensure that service modules have all that is necessary for the main application. The full list of supported API functions is the following:
download_file: download a file or a sample matching a given hash value;
report_file: retrieve an analysis report for a file submission identified by its hash value;
submit_file: submit a file for analysis;
report_app: retrieve a report for a known application given a hash value;
report_dom: retrieve a report for a domain name;
report_ip: retrieve a report for an IPv4 address;
report_url: retrieve a report for a URL;
submit_url: submit a URL for analysis;
search: perform searches of arbitrary terms;
quota: query user quota data.

All of the API functions listed above have specific signatures with regard to their arguments, which must be respected. The framework works with simple custom File and Hash classes to represent files and hash values, respectively, by defining a few attributes such as file name and size, and hash type. Some of the API functions receive such File and Hash objects as arguments, which must be accessed by their attributes in order to build request parameters.
For instance, the report_file function of the VirusTotal service module example below, here included for illustration, accesses the hash value through hash.hash. Moreover, the framework reads API key and username pairs from apikey.yaml and assigns them to their respective services during initialization. These can be retrieved in the form of a dictionary with self.get_apikey(). Note that making HTTP requests requires only passing a well-defined APISpec object to the request function. APIs vary from service to service, particularly in the fields in which data is transmitted, with some using HTTP query parameters, others the body of POST requests or even cookies. The APISpec class is prepared to handle all these variations, which are passed on to requests for HTTP communications.
from malsub.service.base import APISpec, Service
from malsub.core.crypto import Hash
from malsub.core.file import File
from malsub.core.web import request, openurl
from malsub.common import out, frmt


class VirusTotal(Service):
    # full name of the service
    name = "VirusTotal"
    # short name of the service
    sname = "vt"
    # length of the API key as a single integer or a list of integers specifying
    # multiple key lengths
    api_keyl = [32, 48, 64]

    # API specification to download a file or a sample
    api_dowf = APISpec()
    # API specification to retrieve a file report
    api_repf = APISpec("POST", "https://www.virustotal.com", "/vtapi/v2/file/report")
    # ... (other API functions specifications)

    # '@Service.unsupported' marks a function as unsupported by a particular
    # service, being ignored by the main application
    @Service.unsupported
    def download_file(self, hash: Hash):
        # all base functions need to be explicitly declared even if not used by
        # a service
        pass

    def report_file(self, hash: Hash):
        # 'hash' is an object of 'Hash' with a hash value and its type (MD5,
        # SHA1, SHA-256 or SHA-512)

        # fill in the request fields
        # in this case, the 'data' corresponds to the HTTP POST body data
        # 'self.get_apikey()' returns the corresponding API key pair
        self.api_repf.data = {**self.get_apikey(), "resource": hash.hash}
        # make the request
        data, _ = request(self.api_repf)
        # format results as a JSON dictionary and "prune" the tree at depth one
        data = frmt.jsontree(data, depth=1)
        # data = frmt.jsonvert(data["scans"]) # tabular format
        # open a URL in the browser
        openurl(data["permalink"])
        # return the pretty-print formatted data ready for display
        return out.pformat(data)

    # ... (other API functions)
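To make the role of the abstract base class and of the '@Service.unsupported' decorator more concrete, the following is a simplified, self-contained stand-in. It is an assumption about how such a mechanism could work and not the contents of base.py: the `is_unsupported` attribute, the `supported_functions` helper and the `Demo` class are hypothetical, and only two of the API functions are modeled.

```python
from abc import ABC, abstractmethod


class Service(ABC):
    """Simplified stand-in for an abstract service base class."""

    @staticmethod
    def unsupported(func):
        # flag the function so that callers can skip it
        func.is_unsupported = True
        return func

    @abstractmethod
    def download_file(self, hash): ...

    @abstractmethod
    def report_file(self, hash): ...


def supported_functions(service, names=("download_file", "report_file")):
    """List the API functions that a service has not marked as unsupported."""
    return [
        name for name in names
        if not getattr(getattr(service, name), "is_unsupported", False)
    ]


class Demo(Service):
    # every abstract function must be declared, even when unsupported
    @Service.unsupported
    def download_file(self, hash):
        pass

    def report_file(self, hash):
        return {"resource": hash}


available = supported_functions(Demo())
```

Because the base class is abstract, a subclass that omits one of the declared functions cannot be instantiated, which matches the requirement that all base functions be explicitly declared.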
Starting with version 1.3, malsub can be used outside of its own repository as a regular module to make use of its customized modules, classes and functions. An example follows:
import malsub
from malsub.service import serv as services

hashes = malsub.core.crypto.parse_hashl(
    "5f782447ff08ee2c36a5d3988dd4774674233f8ea0e1ad09dd766e842d785ddd",
    "58ffd8200db69cfbc048921e18d79aa1adc894af34db3e1782e53beab3b2d98f")

API_KEY = {'apikey': {'apikey': '...truncated...'}}

vt = services['vt']
vt.set_apikey(API_KEY)

for h in hashes:
    vt.download_file(h)
You can easily run malsub in a container by using the supplied Dockerfile.
First build the container:
$ sudo docker build -t malsub .
Then run it with the proper command:
$ sudo docker run --rm -it -v ${PWD}/downl:/app/downl malsub -a urlhaus -r 4293c1d8574dc87c58360d6bac3daa182f64f7785c9d41da5e0741d2b1817fc7
The command above mounts the downl folder as a volume, so any file downloaded inside the container is also available outside of it.