Category Archives: Academics

Open Letter to the Cuba Internet Task Force

The following is a response to an invitation to participate in the recently formed Cuba Internet Task Force.

Task Force Representatives:
I will not be joining the Cuba Internet Task Force, or Subcommittees, because I believe the harm done by the existence of these committees outweighs any potential benefit of the recommendations that can come from them.

In recent years, Cuba has increasingly normalized Internet usage, through expansion and cost reduction of WiFi, through the advent of AirBNB as a major source of tourism revenue, and through growing traffic capacity.

In the scope of my work, I have documented the flourishing community wireless networks operating in tandem with official Internet service from ETECSA. These community efforts already address the “last mile” problem, and it is not hard to imagine the future where they are consolidated or integrated to provide Internet-to-the-home for many more Cubans.

These efforts are hindered by the perception by the Cuban government that the Internet and its associated ‘freedom’ are being forced upon them by the United States. In the wake of the creation of this task force, Cuban media has focused on the implied pressure, and private individuals in the Cuban technology sector have come under increased scrutiny.

Instead of attempting to influence the policies of another sovereign nation, I encourage us to reflect more on our internal policies. US government sanctions currently require a wide range of US-based education and reference sites from blocking Cuban traffic. Likewise, limitations preventing Cubans from connecting to US-invested undersea cables are partially responsible for the scarcity and cost of Cuban Internet connections. Reducing these sanctions can allow Cubans to become a market for US companies, and will provide additional incentives for widespread connectivity across the country.

Initial Measurements of the Cuban Street Network

Internet access in Cuba is severely constrained, due to limited availability, slow speeds, and high cost. Within this isolated environment, technology enthusiasts have constructed a disconnected but vibrant IP network that has grown organically to reach tens of thousands of households across Havana. We present the first detailed characterization of this deployment, which is known as the SNET, or Street Network. Working in collaboration with SNET operators, we describe the network’s infrastructure and map its topology, and we measure bandwidth, available services, usage patterns, and user demographics. Qualitatively, we attempt to answer why the SNET exists and what benefits it has afforded its users. We go on to discuss technical challenges the network faces, including scalability, security, and organizational issues. To our knowledge, the SNET is the largest isolated community-driven network in existence, and its structure, successes, and obstacles show fascinating contrasts and similarities to those of the Internet at large.


The Internet in Cuba: A Story of Community Resilience. Chaos Communication Congress. 2017


P Pujol, Eduardo E., Will Scott, Eric Wustrow, and J. Alex Halderman. “Initial measurements of the cuban street network.” In Proceedings of the 2017 Internet Measurement Conference, pp. 318-324. ACM, 2017. Slides


Last week I talked briefly about the state of open internet measurement for network anomalies at IETF 98. This was my first time attending an IETF in-person meeting, and it was very useful in getting a better understanding of how to navigate the standards process, how it’s used by others, and what value can be gained from it.

A couple highlights that I took away from the event:

There’s a concern throughout the IETF about solving the privacy leaks in existing protocols for general web access. There are three major points in the protocol that need to be addressed and are under discussion as part of this: The first is coming up with a successor to DNS that provides confidentiality. This, I think, is going to be the most challenging point. The second is coming up with a SNI equivalent that doesn’t send the requested domain in plain-text. The third is adapting the current public certificate transparency process to provide confidentiality of the specific domains issued certificates, while maintaining the accountability provided by the system.

Confidential DNS

There are two proposals with traction for encrypting DNS that I’m aware of. Neither fully solve the problem, but both provide reasonable ways forward. The first is dnscrypt, a protocol with support from entities like yandex and cloudflare. It maintains a stateless UDP protocol, and encrypts requests and responses against server and client keys. There are working client proxies for most platforms, although installation on mobile is hacky, and a set of running providers. The other alternative, which was represented at IETF and seems to be preferred by the standards community is DNS over TLS. The benefit here that there’s no new protocol, meaning less code that needs to be audited to gain confidence of the security properties for the system. There are some working servers and client proxies available for this, but the community seems more fragmented, unfortunately.

The eventual problem that isn’t yet addressed is that you still need to trust some remote party with your dns query and neither protocol changes the underlying protocol where the work of dns resolution is performed by someone chosen by the local network. Current proxies allow the client to choose who this is instead, but that doesn’t remove the trust issue, and doesn’t work well with captive portals or scale to widespread deployment. It also doesn’t prevent that third party from tracking the chain of dns requests made by the client and getting a pretty good idea about what the client is doing.

Hidden SNI

SNI, or server name identification, is a process that occurs at the beginning of an HTTPS request where the client tells the server which domain it wants to talk to. This is a critical part of the protocol, because it allows a single IP address to host HTTPS servers for multiple domains. Unfortunately, it also allows the network to detect and potentially block requests at a domain, rather than IP granularity.

Proposals for encrypting the SNI have been around for a couple years. Unfortunately, they did not get included in TLS1.3, which means that it will be a while before the next iteration of the standard and the potential to include this update.

The good news was that there seems to be continued interest in figuring out ways to protect the SNI of client requests, though no current proposal I’m aware of.

Certificate Transparency Privacy

Certificate Transparency is an addition to the HTTPS system to enforce additional accountability in to the certificate authority system. It requires authorities (CA)’s to publish a log of all certificates they issue publicly, so that third parties can audit their list and make sure they haven’t secretly mis-issued certificates. While a great feature for accountability and web security, it also opens an additional channel where the list of domains with SSL certificates can be enumerated. This includes internal or private domains that the owner would like to remain obscure.

As google and others have moved to require the CT log from all authorities through requirements on browser certificate validity, this issue is again at the fore. There’s been work on addressing this problem, including a cryptographic proposal and the IETF proposal for domain label redaction which seems to be advancing through the standards process.

There remains a ways to go to migrate to protocols which provide some protection against a malicious network, but there’s willingness and work to get there, which is at least a start.

Another Strike against Domain Fronting

In 2014, Domain Fronting became the newest obfuscation technique for covert, difficult to censor communication. Even today, the Meek Pluggable transport serves ~400GB of Tor traffic each day, at a cost of ~$3000/month.

The basic technique is to make an HTTPS connection to the CDN directly, and then once the encryption has begun, make the HTTP request to the actual backing site instead. Since many CDNs use the same “front-end cache” servers for incoming requests to all of the different sites they host, there is a disconnect between the software handling SSL, and the routing web server proxying requests to where they need to go.

Even as the technique became widely adopted in 2014-2015, its demise was already predicted, with practitioners in the censorship circumvention community focused on how long it could be made to last until the next mechanism was found. This prediction rested on two points:

  1. The CDN companies will find themselves in a difficult position politically, since they are now in the position of supporting circumvention while also maintaining a relationship with the censoring countries.
  2. The technique has security and cost implications that make it not great for either the CDNs, or the practitioners.

We’ve seen both of these predictions mature.

Cloudflare, explicitly doesn’t support this mechanism of circumvention, and coincidentally has major Chinese partnerships and worked to deploy into China. Google also has limited the technique over periods as they have struggled with abuse (although mute in China, since the Google cloud doesn’t work there as a CDN.)

In terms of cost, the most notable incident is the “Great Cannon”, which targeted not only Github as widely reported, but also caused a significant amount of traffic to go to Amazon-hosted pages run by GreatFire, a dissident news organization, and costing them significant amounts of money. GreatFire had been providing a free browser that operated by proxying all traffic through domain-fronting. Due to a separate and less reported Chinese “DDOS” they ended up with a monthly bill for several tens of thousands of dollars and had to turn down the service.

The latest strike against domain fronting is seen in posts by Cobalt Strike and FireEye that the technique is also gaining adoption for Malware C&C. This abuse case will further incentivize CDNs from allowing the practice to continue, since there will now be many legitimate western voices actively calling on them to stop. Enterprises attempting to track threats on their networks, and CDN customers wanting to not be blamed for attacks will both begin putting more pressure on the CDNs to remove the ability for different domains to be intermixed, and we should expect to see a continued drop in the willingness of providers to offer such a service.

Thoughts on IPv6 Measurement

About five years ago two projects, Zmap and Masscan, helped to shift the way that many researchers thought about the Internet. The tools both provide a relatively optimized code path for sending packets and collecting replies, and allow a researcher with moderate resources to attempt connections to every computer on the IPv4 Internet in about an hour.

These techniques are widely applied to monitor the Internet-scale security of services, with prominent examples of,, and For the security community, they have become a first-step for reconnaissance, allowing hackers to find origin IPs masked by CDNs, unadvertised points of presence, and vulnerable hosts within an organization.

While the core of the Internet and the services we actively choose to connect with remain staunchly IPv4, the networks that many end hosts are connected to are more rapidly adopting IPv6, responding to the exhaustion and density of the IPv4 address space.

This fall, a new round of research has focused on what is possible for the enumeration and exploration of the IPv6 address space. ‘You can -J reject but you can’t hide’ was presented at CCC, focusing on spidering DNS records to learn of active IPv6 addresses which are registered within the DNS system. Earlier in the fall, there were several sessions at IMC thinking about IPv6. Most notably, “Entropy/IP – uncovering entropy in IPv6″, which looks at how addresses are allocated in practice as seen by Akamai at the core of the network. In addition, IPv6 was the focus of a couple WIP sessions, expressing thoughts on discovering hosts through progressive ICMP probing, as well the continued exploration of what’s actually happening in the core as seen by Akamai.

Where does this growing understanding of wide-scale IPv6 usage take us?

  • Enumeration of candidate addresses is a new first step that will be needed for anything beyond a single prefix. Even then, scanning within a single organizational prefix can be considered an active brute-force attack, rather than the relatively ‘harmless’ reconnaissance of IPv4 scanning.
  • There are many potential sources to interact with for enumeration, including DNS records, observed network traffic, and default ::1 addresses. The Entropy/IP paper points out that has already been observed adding itself as a member of the NTP pool to harvest candidate IPv6 addresses for scanning.
  • Address generation for many hosts is not fully random, embedding a mac address, IPv4 address, or other non-random information. This can be used to discover a subset of hosts more efficiently, though still not at Internet scale. (for example, 2^32 attempts to look for hosts of a specific brand within a 2^64 network address space.) This would still sends several gigabytes of traffic to an individual network in the process of scanning. Non-random addresses tend to be more often associated with servers and routers than with end-clients.
  • Discovery of network topology is possible by enumerating where error responses to guessed addresses come back from. This doesn’t allow for discovery of individual machines either.

What do we do about it?

There will probably not be a for ipv6 in the same way there is for ipv4. Instead, much of the wide-scale scanning on the IPv6 network will be performed through reflection from hosts discovered through their participation in other active services, for instance bit torrent, NTP, or DNS.

Conversely, the number of vulnerable IPv6 hosts will keep growing, because they can exist for much longer before anyone will find them. This will likewise increase the value that can be obtained through scanning – both to hackers, and to academics looking at Internet dynamics. We can expect to see a marketplace for addresses observed passively by ISPs, the network core, and passive services.

It’s worth also watching the watchers here: which providers are “selling me out” so to speak? It would be worth building the honey-pots to observe which services and servers leak client information and lead top probing and the potential for compromise of end hosts.