Censys Search 2.0 Beta Announcement
A message from Censys co-founder Zakir Durumeric.
When Censys Search launched in 2015, we modeled Internet hosts by stitching together weekly ZMap and ZGrab scans of 11 protocols on 14 ports. Since then, Censys has grown considerably. We’ve spun out of the University of Michigan, raised venture capital from Google Ventures, Greylock, and Decibel, and launched the Censys Attack Surface Management (ASM) Platform. We’ve also dramatically improved our Internet data collection.
Today, we continuously scan over 2,500 ports, perform automatic protocol detection, and scan from multiple Tier-1 ISPs. We also keep confidence in our data high with 24-hour refreshes of known services.
We’re making this data available to the public with the beta launch of Censys Search 2.0, which is directly accessible at https://search.censys.io. Censys Search 2.0, like its predecessor, will offer both a free community tier and a paid professional tier. Below, we describe the most exciting and important changes to our data collection infrastructure and to our new search engine.
Improved Scanning and Data Collection Features
Continuous Host Discovery. Censys 2.0 is powered by a new continuous scan engine that prioritizes popular ports, protocols, and networks over others. We scan the top 2,500 ports across the full IPv4 address space every 10 days and the top 100 ports every day. We also scan popular cloud providers (e.g., AWS and GCP) every day in search of new services. All users will have access to all scan results.
Daily Data Refresh. Internet services change constantly, much faster than our regular service discovery scans can observe them (iterating through 2,500 ports on 3.7 billion addresses takes a little time!). To ensure that results are up to date, we refresh the data about known services daily. With this process, the average observation time for services* is within 16 hours.
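As a rough back-of-the-envelope illustration of the scale involved, the sketch below multiplies the figures quoted above. It assumes exactly one probe per address-port pair and an even spread over each scan cycle; the function and variable names are illustrative, and real probe volume depends on details not described here.

```python
# Back-of-the-envelope probe volume, using only the figures quoted in the text.
# Assumption: exactly one probe per (address, port) pair, spread evenly over time.
# The two sweeps are counted separately, even if their port lists overlap.

ADDRESSES = 3_700_000_000        # IPv4 addresses scanned, per the text
SECONDS_PER_DAY = 24 * 60 * 60


def probes_per_day(ports: int, cycle_days: int) -> float:
    """Average probes per day to cover `ports` ports on every address once per cycle."""
    return ADDRESSES * ports / cycle_days


full_sweep = probes_per_day(ports=2_500, cycle_days=10)  # top 2,500 ports every 10 days
daily_sweep = probes_per_day(ports=100, cycle_days=1)    # top 100 ports every day
total = full_sweep + daily_sweep

print(f"full sweep:  {full_sweep:.3e} probes/day")
print(f"daily sweep: {daily_sweep:.3e} probes/day")
print(f"combined:    {total / SECONDS_PER_DAY:.3e} probes/second")
```

Under these assumptions the combined rate is on the order of ten million probes per second, which is why discovery scans alone cannot keep every record current and a separate daily refresh of known services is useful.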
Automatic Protocol Detection. Recent research has shown that an astounding number of services run on unexpected ports, and that services on unexpected ports tend to have more security problems. Censys detects protocols on unexpected ports and completes a full protocol handshake after identification. Today, the majority of services catalogued by Censys are found on non-standard ports.
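The idea of identifying a protocol from what a service says, rather than from the port it listens on, can be sketched in a few lines. This is a minimal illustration and not Censys's detection logic: the prefix table and the `guess_protocol` helper are hypothetical, and a real scanner also has to send probes, since many protocols (HTTP included) stay silent until the client speaks.

```python
# Illustrative only: guess a protocol from the first bytes a server returns,
# regardless of the port it was found on. Not Censys's actual detection logic.

# (prefix, label) pairs; hypothetical and far from exhaustive.
BANNER_PREFIXES = [
    (b"SSH-", "SSH"),     # SSH identification string, e.g. "SSH-2.0-..."
    (b"HTTP/", "HTTP"),   # HTTP status line, returned after a request is sent
    (b"+OK", "POP3"),     # POP3 greeting
    (b"* OK", "IMAP"),    # IMAP greeting
]


def guess_protocol(banner: bytes) -> str:
    """Return a protocol label for a raw banner, or "UNKNOWN" if nothing matches."""
    for prefix, label in BANNER_PREFIXES:
        if banner.startswith(prefix):
            return label
    return "UNKNOWN"


# Made-up observations: the port number plays no role in the decision.
observations = {
    2222: b"SSH-2.0-OpenSSH_8.2\r\n",
    8081: b"HTTP/1.1 200 OK\r\n",
    9999: b"\x00\x01binary",
}
for port, banner in observations.items():
    print(port, guess_protocol(banner))
```

Once a label is assigned, a scanner can hand the connection to the matching protocol handler to complete the full handshake.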
Multi-Perspective Scanning. Stateless scanning loses data irrecoverably when packets are dropped, and overall visibility depends on the Internet provider. Censys peers with and scans from three Tier-1 ISPs (NTT, Telia, and Hurricane Electric), giving us nearly 99% coverage of listening hosts and protection against packet drop. We’re excited to add a data center in Europe and two more Tier-1 service providers later this year.
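To see why several perspectives help, consider a toy example in which each vantage point misses a different host. The provider names come from the text above; the addresses are made-up values from the documentation ranges, and the numbers say nothing about real coverage.

```python
# Illustrative only: combine the host sets observed from several vantage points.
# Addresses are made-up examples from the RFC 5737 documentation ranges.

seen_by = {
    "NTT": {"192.0.2.1", "192.0.2.2", "198.51.100.7"},
    "Telia": {"192.0.2.1", "198.51.100.7", "203.0.113.9"},
    "Hurricane Electric": {"192.0.2.2", "198.51.100.7", "203.0.113.9"},
}

# A host counts as found if any single perspective observed it.
all_hosts = set().union(*seen_by.values())

for name, hosts in seen_by.items():
    share = len(hosts) / len(all_hosts)
    print(f"{name}: {len(hosts)}/{len(all_hosts)} hosts ({share:.0%})")
```

In this toy data no single perspective sees every host, but the union does, so a probe dropped on one path can still be recovered from another.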
New Search 2.0 Features
The Censys Search 2.0 UI should look familiar, but a few big differences allow greater flexibility in searching and interacting with our data:
Simplified Data Model. We are introducing a new data model for IPv4 hosts that accounts for the fact that any service can appear on any port. The new schema is simpler and hopefully easier to use, but it will require updating your existing searches to query new field names. [more information]
Improved Service and Device Context. We have improved our detection of software, operating systems, and IoT devices. We have also migrated our software, OS, and device labeling to the CPE format for improved interoperability with third-party tools and datasets.
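For readers unfamiliar with CPE, the sketch below splits a CPE 2.3 formatted string into its named attributes. The `parse_cpe23` helper and the example string are illustrative and are not taken from Censys data; the split is a simplification that handles a backslash-escaped colon but not every escaping corner case in the specification.

```python
import re

# Attribute names of a CPE 2.3 formatted string, in order.
CPE_FIELDS = [
    "part", "vendor", "product", "version", "update", "edition",
    "language", "sw_edition", "target_sw", "target_hw", "other",
]


def parse_cpe23(cpe: str) -> dict:
    """Split a CPE 2.3 formatted string into its named attributes."""
    # Split on colons that are not escaped with a backslash (a simplification).
    parts = re.split(r"(?<!\\):", cpe)
    if len(parts) != 13 or parts[0] != "cpe" or parts[1] != "2.3":
        raise ValueError(f"not a CPE 2.3 formatted string: {cpe!r}")
    return dict(zip(CPE_FIELDS, parts[2:]))


# Hypothetical example label: an application ("a") from vendor "openbsd".
example = "cpe:2.3:a:openbsd:openssh:8.2:*:*:*:*:*:*:*"
parsed = parse_cpe23(example)
print(parsed["vendor"], parsed["product"], parsed["version"])
```

Because vendor, product, and version sit in fixed positions, labels in this format can be joined against other datasets that use the same naming scheme.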
Censys Search Language. We’ve developed a new search DSL that allows more flexibility in querying, particularly across multiple ports and protocols. For example, you can search for a banner on any protocol on any port, for TLS data on any port, or for a service configuration on a specific port and protocol. [more information]
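The three kinds of search described above might be composed as follows. This is a hedged sketch: the `term` and `all_of` helpers are hypothetical, and the field names and the `and` operator are assumptions about the new schema that should be checked against the search language documentation. In particular, whether two terms joined this way must match the same service on a host is a question for that documentation.

```python
# Illustrative only: build query strings for the three cases described above.
# All field names below are ASSUMPTIONS about the new schema, not confirmed here.


def term(field: str, value) -> str:
    """Format one field: value term, quoting values that contain whitespace."""
    text = str(value)
    if any(ch.isspace() for ch in text):
        text = '"' + text.replace('"', '\\"') + '"'
    return f"{field}: {text}"


def all_of(*terms: str) -> str:
    """Join terms so that all of them must match (assumed `and` operator)."""
    return " and ".join(terms)


# 1. A banner on any protocol on any port.
banner_query = term("services.banner", "Welcome to FTP")

# 2. TLS data on any port.
tls_query = term("services.tls.certificates.leaf_data.subject.common_name", "example.com")

# 3. A service on a specific port and protocol.
ssh_query = all_of(term("services.port", 22), term("services.service_name", "SSH"))

for query in (banner_query, tls_query, ssh_query):
    print(query)
```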
Host History. We have begun tracking the history of services on individual hosts, which is accessible through the Web UI as well as our REST API. Community users can see the past week of history; full history is available to paid users.
Fast Lookup API. We are introducing a new high-speed lookup API that allows users to look up hosts and certificates en masse. The API is available to all users.
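A single host lookup over a REST API could look roughly like the sketch below, which builds the request without sending it. The base URL, the `/hosts/{ip}` path, and the use of HTTP Basic authentication with an API ID and secret are assumptions to verify against the API documentation; `build_host_lookup` is a hypothetical helper and the credentials are placeholders.

```python
import base64
import urllib.request

# ASSUMPTION: base URL and path layout of the lookup endpoint.
API_BASE = "https://search.censys.io/api/v2"


def build_host_lookup(ip: str, api_id: str, api_secret: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for a single host record."""
    # ASSUMPTION: the API accepts HTTP Basic auth with an ID/secret pair.
    token = base64.b64encode(f"{api_id}:{api_secret}".encode()).decode()
    return urllib.request.Request(
        f"{API_BASE}/hosts/{ip}",
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
        method="GET",
    )


# Placeholder address (RFC 5737 documentation range) and placeholder credentials.
request = build_host_lookup("192.0.2.1", "EXAMPLE_ID", "EXAMPLE_SECRET")
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request), then parse the JSON response body.
```

Looking up many hosts is then a loop over addresses, ideally with rate limiting that respects whatever quota applies to the account.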
Community, Research, and Paid Access
Censys Search 2.0, including access to all of our current scan data, will be available for free to community users. Beyond existing Pro Search features, paid users will also be able to look up the full history of hosts and perform high-speed API lookups. See our website for more information about commercial access as well as downloadable datasets.
We will continue to support non-commercial research, and we are transitioning all researchers to the new Universal Internet Dataset that powers Search 2.0. In addition, we will be providing all students, faculty, and non-commercial researchers with free Censys Pro Search accounts.
Features That Are Going Away
As part of the upgrade, there are a few features we’re removing and want to call your attention to:
- ZTag Library. When we started Censys, we developed an internal service labeling library, ZTag. In parallel, H.D. Moore et al. at Rapid7 developed Recog. We’ve migrated from ZTag to Recog in our new data pipeline and are deprecating ZTag. We will be contributing fingerprints to Recog moving forward, and we encourage the community to do the same.
- Regex Searches. Censys originally supported regular expression searches against scan data. Since the feature was seldom used and had compatibility issues with Elasticsearch indices, we are restricting this feature to commercial users to ensure that we can appropriately serve these computationally expensive requests.
- Fields for TLS Vulnerabilities. When we launched Censys, we were interested in tracking a number of headline TLS vulnerabilities that have been largely rectified in the years since then. Censys no longer performs scans to check for export-grade ciphers or SSLv3. We will likely introduce more comprehensive TLS vulnerability testing in the future.
* The average observation time for services excludes pseudo-services, such as honeypots and middleboxes, that listen on all ports.