
CVE-2021-22205: It Was A GitLab Smash

Hosts with potentially vulnerable GitLab instances.


GitLab is an open-source code repository system for software development, primarily used by large organizations to manage DevOps and other related software projects. GitLab’s primary offering is managed and hosted by the GitLab company itself; they also offer a self-hosted version of their product for both communities and enterprises with varying levels of support.

On April 7, 2021, a researcher named “vakzz” (William Bowling) used the HackerOne bug bounty program to report a nasty bug affecting GitLab’s Enterprise and Community editions. GitLab engineers promptly fixed the issue, and the vulnerability seemed to go under the radar for a few months. That is, until a user on Twitter reported that a botnet of thousands of compromised GitLab instances had started a massive DDoS campaign, generating over one terabit of data per second (that’s 75,000 ZD/m (Zip Disks per minute)).

What is the Issue?

According to the original report, images uploaded to the GitLab server are passed through a Perl tool called ExifTool to strip out unneeded metadata. Unfortunately, ExifTool also supports the DjVu file format, and in doing so processes user-defined S-Expressions, a LISP-like evaluation language found in specific sections of a DjVu file. A related bug, CVE-2021-22204, was announced back in April 2021, wherein the same researcher showed that ExifTool’s DjVu handling did not properly sanitize inputs before calling the Perl function eval(), allowing user-controlled, arbitrary Perl code execution (Remote Code Execution). William Bowling wrote an excellent rundown on his blog with more details on the research that led to these findings.
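The class of bug described above can be illustrated with a small Python analogy. This is not ExifTool's code (which is Perl), and the function names are hypothetical; the sketch only shows why handing user-controlled text to an eval-style function executes it, while a literal-only parser rejects anything that is not plain data.

```python
import ast


def parse_with_eval(user_text: str):
    """Unsafe: whatever the user supplied is executed as Python code."""
    return eval(user_text)


def parse_literal_only(user_text: str):
    """Safer: accepts only literals (strings, numbers, lists, ...)."""
    return ast.literal_eval(user_text)


if __name__ == "__main__":
    harmless = '"a quoted string"'
    sneaky = "len('abc')"  # harmless stand-in for attacker-chosen code

    print(parse_with_eval(harmless))     # plain data comes back unchanged
    print(parse_literal_only(harmless))  # same result, without executing code
    print(parse_with_eval(sneaky))       # the function call actually runs
    try:
        parse_literal_only(sneaky)
    except ValueError as exc:
        print("rejected:", exc)
```

The safer variant works because it never executes the input; it only accepts a narrow grammar of literal values.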

At the time of writing, Censys found 20,524 internet-facing hosts running 20,565 services identified as vulnerable versions of the GitLab server software. Version 11.11 was the most common, with over 1,400 occurrences, closely followed by versions 13.1 and 12.1 with around 1,300 hosts.

While the GitLab server did not have any feature that allowed us to determine the specific version of the running software, we came up with an effective method of identification and fingerprinting using (literal) artifacts of GitLab’s build pipeline. 

GitLab automatically kicks off a set of jobs that compile and test the GitLab software whenever a commit is tagged. One part of this build process is a job called “compile-production-assets”, defined in the file “.gitlab/ci/frontend.gitlab-ci.yml”, whose goal is to aggregate and compile the product’s public assets (images and CSS files) into an output folder. Since these files are accessible to the public without authentication and are referenced in the index page of an exposed GitLab website (thus searchable by Censys), they made perfect candidates for a bit of ad-hoc fingerprinting.

Many of these generated filenames hardly ever change (such as the logo), but some do. For example, “public/assets/application.css” seemed to have a different generated filename for every minor release. The generated names include a 64-character hexadecimal string appended to the original filename, giving us the ability to map a filename back to the build job and software version that generated it. So our goal was to download every tagged build (with a vulnerable version) of the GitLab software, record the generated filename with the prefix “public/assets/application-” and the extension “css”, and associate that filename with the tag of the build. Not perfect, but it will do.
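As a rough sketch of the fingerprinting idea, the hash can be pulled out of a page body with a regular expression. The function name and the sample HTML below are hypothetical, and the digest is made up for illustration; only the “/assets/application-” prefix, the 64 hexadecimal characters, and the “.css” extension come from the text above.

```python
import re
from typing import Optional

# Matches the fingerprinted stylesheet path: /assets/application-<64 hex chars>.css
ASSET_RE = re.compile(r"/assets/application-([a-f0-9]{64})\.css")


def extract_asset_hash(html: str) -> Optional[str]:
    """Return the 64-character hex string from the application CSS link, or None."""
    match = ASSET_RE.search(html)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Made-up digest; it does not correspond to any real GitLab build.
    fake_digest = "ab" * 32
    sample = '<link rel="stylesheet" href="/assets/application-' + fake_digest + '.css" />'
    print(extract_asset_hash(sample))
    print(extract_asset_hash("<html>no stylesheet here</html>"))
```

Requiring exactly 64 hexadecimal characters keeps the pattern from matching unrelated stylesheets whose names merely start with “application-”.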

After downloading every vulnerable image from docker.io (all 247 major, minor, and release-candidate tags), we ran a small shell script that generated a table mapping tag names to filenames, which yielded 44 unique filename hashes:

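#!/usr/bin/env bash
# Asset directory inside the image. This path is an assumption (the usual
# omnibus install location); adjust it if your images differ.
assetdir=/opt/gitlab/embedded/service/gitlab-rails/public/assets
tags=$(cat ./vulnerable_tags.txt)
for tag in $tags; do
    # No -it: a TTY would add carriage returns to the captured output.
    filename=$(docker run --rm --entrypoint "" "gitlab/gitlab-ce:$tag" ls "$assetdir" | grep -E '^application-.*\.css' | grep -v '\.gz')
    echo "$tag,$filename"
done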
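The script's “tag,filename” output still has to be collapsed into unique filenames. A minimal sketch of that step follows; the function name, tags, and digests are hypothetical and stand in for the real output, which is not reproduced in this post.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Iterable, List


def group_tags_by_filename(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each stylesheet filename to the list of tags that shipped it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for row in csv.reader(lines):
        # Skip blank lines and tags for which no stylesheet was listed.
        if len(row) != 2 or not row[1].strip():
            continue
        tag, filename = row[0].strip(), row[1].strip()
        groups[filename].append(tag)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical tags and digests, for illustration only.
    sample = io.StringIO(
        "1.0.0-ce.0,application-" + "aa" * 32 + ".css\n"
        "1.0.1-ce.0,application-" + "aa" * 32 + ".css\n"
        "1.1.0-ce.0,application-" + "bb" * 32 + ".css\n"
        "1.2.0-ce.0,\n"
    )
    groups = group_tags_by_filename(sample)
    print(len(groups), "unique filenames")
    for filename, tags in groups.items():
        print(filename, "->", ", ".join(tags))
```

Grouping this way also makes it obvious when several tags share one filename, which is where the manual pruning comes in.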

After a bit of manual data pruning, we started the search by generating a BigQuery SQL statement to find all running GitLab services along with the specific filename we were looking for. The results were stored in a temporary database table for further analysis (a feature that Censys enterprise customers have access to):

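WITH
  results AS (
  SELECT
    DISTINCT host_identifier.ipv4 AS host,
    autonomous_system.asn AS asn,
    autonomous_system.name AS asn_name,
    location.country_code AS cc,
    location.coordinates.latitude AS lat,
    location.coordinates.longitude AS long,
    -- Pull the fingerprinted stylesheet path out of the response body.
    REGEXP_EXTRACT(SAFE_CONVERT_BYTES_TO_STRING(svc.http.response.body),
        ".*(/assets/application-[a-f0-9]{64}.css).*") AS version
  FROM
    -- Placeholder: the table name is not shown in the original query.
    `YOUR_CENSYS_DATASET_TABLE`,
    UNNEST(services) svc,
    UNNEST(svc.http.response.html_tags) x
  WHERE
    DATE(snapshot_date) = "2021-11-03"
    AND SAFE_CONVERT_BYTES_TO_STRING(svc.http.response.body) LIKE '%/assets/application-%.css%'
    AND SAFE_CONVERT_BYTES_TO_STRING(x) LIKE '%/assets/gitlab_logo-%'
  ORDER BY
    version DESC )
SELECT * FROM results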

The next step was to take the data generated by the above query and map the found filenames to their respective version as seen in the docker images, resulting in the following:

Filename | Mapped Version | Host Count
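The mapping step amounts to a lookup and a count. The sketch below assumes the query results have been reduced to (host, filename) pairs and that a filename-to-version dictionary was built from the docker images; the function name, addresses, filenames, and versions are all placeholders, not real findings.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def summarize_hosts(
    host_rows: Iterable[Tuple[str, str]],
    filename_to_version: Dict[str, str],
) -> List[Tuple[str, str, int]]:
    """Return (filename, mapped version, host count) rows, most common first."""
    counts = Counter(filename for _host, filename in host_rows)
    return [
        (filename, filename_to_version.get(filename, "unknown"), count)
        for filename, count in counts.most_common()
    ]


if __name__ == "__main__":
    # Placeholder data: documentation-range IPs and invented filenames/versions.
    versions = {"application-aaaa.css": "X.Y", "application-bbbb.css": "X.Z"}
    rows = [
        ("192.0.2.1", "application-aaaa.css"),
        ("192.0.2.2", "application-aaaa.css"),
        ("192.0.2.3", "application-bbbb.css"),
    ]
    for filename, version, count in summarize_hosts(rows, versions):
        print(f"{filename} | {version} | {count}")
```

Filenames that were not seen in any downloaded image are reported as "unknown" rather than dropped, so gaps in the mapping stay visible.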

We can also construct a somewhat verbose Censys search query using the above hashes to get a better view of the hosts matching these conditions, searching for both the gitlab_logo filename and the list of matching files.

While using these exploited hosts for DDoS is terrible by itself, there have also been discussions of other mass-exploitation attacks where random admin users were found. A bigger worry here is the potential for more advanced attacks. For example, an attacker could introduce backdoors and vulnerable functionality into the source code of projects hosted by these services. If this were to happen, even the most securely written code could become an administrative nightmare.

The attack itself is straightforward, and exploits are readily available, making this a severe issue that should be dealt with immediately. Hopefully, the data presented in this post will assist engineers and administrators in identifying and patching vulnerable GitLab instances.

What do I do about it?

