Lightweight Protocol Scanning
Traditionally, Censys has focused on “deep” scans: scans that fully analyze the service behind a port and break out key values into fields that can be searched later. Some other protocols were captured only by banner grabs of “UNKNOWN” services and were left untagged, which made them searchable, but not easily. For example, IRC servers could be found on Censys Search by searching for “irc” on port 6667. This narrows the results, but it misses the many IRC servers that don’t present “irc” in their banner or don’t run on port 6667. It also yields false positives, since some results are just HTTP servers on port 6667 that happen to include the phrase “irc.”
Many other protocols were missed entirely because they aren’t chatty enough to yield banner data without a specific probe to elicit it. For example, the Printer Job Language protocol (PJL in Censys Search) is valuable to see, but it only responds to a few specific probes, which Censys previously didn’t send to its default port of 9100. The same gap covered every UDP protocol, since we do not collect any sort of banner data on UDP ports. Further, for some TCP-based protocols, Censys didn’t yet scan the port those services generally live on.
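To make the PJL case concrete, here is a minimal Python sketch of a protocol-specific probe. It assumes a printer answers the standard `@PJL INFO ID` request by echoing the command before its model string; the function names, timeout, and matching rule are illustrative assumptions, not the probe Censys actually uses.

```python
import socket

# Assumed probe: the PJL Universal Exit Language sequence followed by an
# INFO ID request. A silent service like this sends nothing until asked.
PJL_PROBE = b"\x1b%-12345X@PJL INFO ID\r\n"


def looks_like_pjl(response: bytes) -> bool:
    """Return True if the reply starts by echoing the INFO ID command."""
    return response.lstrip().upper().startswith(b"@PJL INFO ID")


def probe_pjl(host: str, port: int = 9100, timeout: float = 3.0) -> bool:
    """Send the probe to host:port and report whether the reply looks like PJL."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(PJL_PROBE)
            data = sock.recv(1024)
    except OSError:
        # Refused connections and timeouts both mean "not identified".
        return False
    return looks_like_pjl(data)
```

Only probe hosts you are authorized to scan. A plain banner grab on the same port would typically time out with no data, which is why such services went unseen.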
In June, Censys set out to increase its protocol coverage and began tagging the banner grab data that could be attributed to a particular protocol. We also began discovering services that were not yet present in our data. Most of the protocols we sought to add were very simple to detect compared to the nuance associated with scanning a protocol like DNS. The main hurdle was the infrastructure required to add support for a new protocol. It was designed around the assumption that scans would be complex operations, with many packets sent back and forth to extract the different data fields Censys wants to expose. In this new case, we only wanted to send a very basic probe and pattern match against the response.
To meet this new scenario, Censys implemented a “lightweight” protocol scanning framework on top of our existing scan engine. This framework allowed us to rapidly and easily add support for new protocols without writing any code. We could specify the port and the method to scan with in a config file, add it to the list of existing probes, and have full support for a new protocol within an hour. Adding the same protocol in the existing framework would often take days of developer time.
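The config-driven approach can be sketched roughly as follows. The JSON schema, the field names other than `service_name`, and the two fingerprints are hypothetical, chosen only to illustrate "send a basic probe, pattern match the response"; the post does not describe the real config format.

```python
import json
import re
from dataclasses import dataclass
from typing import List

# Hypothetical config: each entry names a port, a probe payload, and a
# regular expression that the response must match.
CONFIG_JSON = r"""
[
  {"service_name": "PJL", "transport": "tcp", "port": 9100,
   "probe": "\u001b%-12345X@PJL INFO ID\r\n", "match": "^@PJL INFO ID"},
  {"service_name": "IRC", "transport": "tcp", "port": 6667,
   "probe": "", "match": "^:\\S+ NOTICE "}
]
"""


@dataclass(frozen=True)
class LightweightProbe:
    service_name: str
    transport: str
    port: int
    probe: bytes
    pattern: "re.Pattern[bytes]"


def load_probes(text: str) -> List[LightweightProbe]:
    """Turn config entries into probe definitions; no per-protocol code needed."""
    return [
        LightweightProbe(
            service_name=entry["service_name"],
            transport=entry["transport"],
            port=entry["port"],
            probe=entry["probe"].encode("latin-1"),
            pattern=re.compile(entry["match"].encode("latin-1")),
        )
        for entry in json.loads(text)
    ]


def classify(response: bytes, probes: List[LightweightProbe]) -> str:
    """Return the first service whose pattern matches, else UNKNOWN."""
    for probe in probes:
        if probe.pattern.search(response):
            return probe.service_name
    return "UNKNOWN"
```

In this sketch, supporting another protocol means appending one entry to the config. A real scanner would presumably match a response only against the probe that elicited it; checking every pattern here is a simplification.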
Using this new framework, Censys was able to increase the number of protocols supported by our scanner from 40 to 78 in just a two-day hackathon. The majority of the work that went into adding these protocols was not in writing code or wiring things together, but in deciding which protocols would be most valuable to add to the platform and fingerprinting them. We have since returned to this effort to bring the number of protocols from 78 to an even 100.
We narrowed down which protocols we’d add from a large list of possible protocols. We prioritized adding protocols which fit into one of these boxes:
- Protocols which signified an obvious issue. E.g. Kerberos, ARD, ATG, Andromouse
- Database protocols (Cassandra, Zookeeper, Bolt)
- Industrial control protocols (GE_SRTP, PCWORX, FINS…)
- Protocols which were just… cool! (and didn’t require much effort in dissecting the protocol). E.g. Terraria, QOTD, Teamspeak
We expose the data on these new protocols in the same way we would an “unknown” banner grab – the only difference is in the service_name field. This can be seen when viewing the table view for a host in search. Note that the data fields are called banner_grab even though this is a Murmur server.
Previously, we didn’t expose service data in this way. Each service we added got a new dedicated field and format (e.g., services.openvpn), which was then permanent. This made it troublesome to add any sort of “temporary” probe, because we would be committing to supporting a field forever. Because all “lightweight” probes share one dedicated field (banner_grab), we can now add a temporary probe and remove it later if it no longer provides value. Further, this makes it possible to rename services in the data, since that no longer requires changing search path names, only changing the value of the service’s name. We already had to do this during our first protocol-adding sprint: we added a probe for “RDP_UDP” and then decided it would be better to just call it RDP. We renamed the service in our config file, and the problematic name was filtered out and replaced by the more sensible one.
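The rename is cheap because the service name is a value rather than part of a field path. A small Python sketch of the idea follows; the record layout, the `RENAMES` table, and the `normalize_service` helper are assumptions made for illustration, and only the `service_name` and `banner_grab` names come from the description above.

```python
# Hypothetical rename table: old service name -> preferred name.
RENAMES = {"RDP_UDP": "RDP"}


def normalize_service(record: dict) -> dict:
    """Return a copy of the record with any retired service name replaced."""
    updated = dict(record)
    name = record["service_name"]
    updated["service_name"] = RENAMES.get(name, name)
    return updated


# Every lightweight service shares the same shape, so no field path changes.
old_record = {
    "port": 3389,
    "service_name": "RDP_UDP",
    "banner_grab": {"banner": "<raw response bytes>"},
}
new_record = normalize_service(old_record)
```

Queries against banner_grab keep working unchanged after the rename, which would not be the case if each protocol had its own dedicated field.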
Below is a table of all the services Censys now scans for, and how many of each we saw on the internet as of 2021-08-16. The counts can vary widely, but the order of popularity generally stays about the same. In green are the protocols we have added within the last eight weeks using the new scan framework. The table was generated using the report feature on Search 2.0. We’ve manually added Andromouse here, because we don’t see any active services for it at the moment.
“how many services does Censys not recognize?”