Thoughts on Wulim

One of the exciting developments at CCC last month was a talk discussing the copy-protection features of the Wulim tablet produced by the Pyongyang Information Center (PIC). This post is an attempt to reconcile the features they describe with my experience with devices around Pyongyang, and to provide some additional context on the environment the device exists within.

Threat Model

As mentioned in the talk, the Wulim tablet, and most of the devices available in Pyongyang for that matter, do a good job of defending against their primary anticipated threat: casual dissemination of subversive material. To that end, transfer of content between devices is strictly regulated, with watermarking to track how material has been transferred, a screenshot-based verification system for visual inspection, and technical limitations on the ability to run externally created applications.

One of the interesting points of note is that the Wulim, and the earlier Pyongyang phone from PIC, implement much of their security through a system application and kernel process named ‘Red Flag’, which shares an icon and name with the protection system on the Red Star desktop OS. While the code is most likely entirely different (I haven’t actually compared), the interesting point is that these implementations come from two separate labs and entities, indicating potential coordination, or joint compliance with a common set of security requirements.

System Security

The Wulim was difficult for the CCC presenters to gain access to. While there were bugs allowing them to view the file system, there was no easy way to casually circumvent the security systems in place. This indicates general success against the threat model the system was designed for, and shows a significant increase in technical proficiency since the 2013/2014 devices. In the initial generations of Android-based hardware, most devices had an enabled recovery mode, and the security could generally be breached with nothing more than a computer. The alternative start-up mode found at CCC indicates that the labs are still not deeply familiar with all of the intricacies of Android, and that there remain quirks in its operation they haven’t anticipated. This will likely continue, with an attack surface that keeps shrinking as exploits are discovered and make their way back to Pyongyang.

The ‘crown jewel’ for this system, it should be noted, is an exploit the CCC presenters did not claim to have found: the ability to create applications which can be installed on the device without modification. One of the first and most effective security mechanisms employed by the Wulim and previous generations of PIC Android systems is the requirement that applications be signed with a lab-issued key. While it is possible that either the application security check could be bypassed, or information about the private key recovered from a device, this code has likely been checked quite well, and I expect such a major lapse in security to be unlikely.

The presence of this security means that I cannot install an application on your tablet from an SD card, computer, or via Bluetooth transfer if it has not already been pre-approved. This key is potentially shared between KCC and PIC, because the stores offering to install after-market applications around Pyongyang have a single list, and are willing to try adding them to systems produced by either lab.

Connectivity

The Wulim is a 2015-2016 model, and reflects a confidence from the labs that they’ve got software security to a reasonable level, and are now more comfortable opening back up appropriate levels of connectivity between devices. 2013 and 2014 models of tablets and phones were quite limited in connectivity, with Bluetooth a ‘high-end’ option available only on flagship models, and Wi-Fi connectivity removed completely. In contrast, the Wulim has models with both Bluetooth and Wi-Fi, as well as the capability for PPPoE-based connectivity to intranet services broader than a single network.

This connectivity extends in two additional ways of note:

  • First, there continue to be rumors of mobile data services being tested for broader availability within the country, and the Gateway mechanism in the Wulim presents yet another clue towards how this will manifest down the road. While the Wulim tablet does not have 3G connectivity, the same software stack has been seen on phones (for instance the Pyongyang phone series, with the most recent generation, the ‘2610’, released in 2015).
  • Second, the same basic Android system is being used for wider installations, and is on display in the science and technology exhibition center. In that context, a custom deployment of tablets with modified software has been installed in both tablet and desktop configurations (desktop via USB keyboards and mice), connected through a LAN-local Wi-Fi network for searching the library resources on-site.

Tracking

The screenshot ‘trace viewer’ mentioned in the CCC talk is really just a file-system viewer for images taken by the same Red Flag security tool integrated into the system. The notable point here is that screenshots are taken at regular intervals of applications not on a predefined whitelist, so even if new signed applications are created, there is a secondary system where their presence can be detected even after they’ve been uninstalled by the user prior to inspection. It’s worth noting that this is more effective against the transmission of images and videos containing subversive content than against applications. Applications on Android will likely be able to take advantage of screen-security APIs to prevent themselves from appearing in the list. Or, more to the point, once external code is running, the system is typically 2-3 years behind current Android, and one of several root methods can be used to escalate privileges and disable the security measures on the device.

While the CCC talk indicates that images and videos can only be viewed on the device they were created on, this was not what I observed. It was relatively common for citizens to transfer content between devices, including road maps and pictures of family and friends. The watermarking may be able to indicate lineage, but these sorts of transfers were not restricted or prevented.

Releasing

Very little underlying data was released alongside the CCC talk, although the presenters indicated an intention to release some applications and data available on the tablet they have access to. This is unfortunate. The talk, and the general environment, has already signaled to Pyongyang that devices are available externally, and much of their reaction to this reality has already occurred. In particular, devices are no longer regularly sold to foreigners within the country, as they were in 2013/14 – with a couple of exceptions where older hardware with a limited software release (without the protections imposed on locals) can be obtained.

The only remaining risk, then, is the fear of retribution against the individual who brought the device out of the country. The CCC presenters were worried that the device they have may have a serial number tied to an individual. This has not been my experience, and I believe it is highly unlikely. Cellphones with connectivity do need to be attached to a passport at the point of sale, but tablets, as of spring 2015, continued to be sold without registration. The serial numbers observed by the CCC presenters are version numbers common to the image placed on all tablets of that generation.

Thoughts on IPv6 Measurement

About five years ago, two projects, ZMap and Masscan, helped to shift the way many researchers think about the Internet. Both tools provide a relatively optimized code path for sending packets and collecting replies, allowing a researcher with moderate resources to attempt connections to every computer on the IPv4 Internet in about an hour.

These techniques are widely applied to monitor the Internet-scale security of services, with prominent examples including censys.io, scans.io, and shodan.io. For the security community, they have become a first step in reconnaissance, allowing hackers to find origin IPs masked by CDNs, unadvertised points of presence, and vulnerable hosts within an organization.

While the core of the Internet and the services we actively choose to connect with remain staunchly IPv4, the networks that many end hosts are connected to are more rapidly adopting IPv6, responding to the exhaustion and density of the IPv4 address space.

This fall, a new round of research has focused on what is possible for the enumeration and exploration of the IPv6 address space. ‘You can -j REJECT but you can’t hide’ was presented at CCC, focusing on spidering DNS records to learn of active IPv6 addresses registered within the DNS system. Earlier in the fall, there were several sessions at IMC thinking about IPv6. Most notable was ‘Entropy/IP: Uncovering Structure in IPv6 Addresses’, which looks at how addresses are allocated in practice, as seen by Akamai at the core of the network. In addition, IPv6 was the focus of a couple of WIP sessions, with thoughts on discovering hosts through progressive ICMP probing, as well as continued exploration of what’s actually happening in the core as seen by Akamai.

Where does this growing understanding of wide-scale IPv6 usage take us?

  • Enumeration of candidate addresses is a new first step that will be needed for anything beyond a single prefix. Even then, scanning within a single organizational prefix can be considered an active brute-force attack, rather than the relatively ‘harmless’ reconnaissance of IPv4 scanning.
  • There are many potential sources to interact with for enumeration, including DNS records, observed network traffic, and predictable default addresses (for instance, the ::1 host within a prefix). The Entropy/IP paper points out that shodan.io has already been observed adding itself as a member of the NTP pool to harvest candidate IPv6 addresses for scanning.
  • Address generation for many hosts is not fully random, embedding a MAC address, IPv4 address, or other non-random information. This can be used to discover a subset of hosts more efficiently, though still not at Internet scale: for example, on the order of 2^32 attempts to look for hosts of a specific brand within a 2^64 network address space, which would still send several gigabytes of traffic to an individual network in the process of scanning (see the sketch after this list). Non-random addresses tend to be associated more often with servers and routers than with end clients.
  • Discovery of network topology is possible by watching where error responses to guessed addresses come back from. This doesn’t allow for discovery of individual machines either.
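
To make the embedded-MAC case concrete, here is a minimal sketch (my own illustration; the MAC and the 2001:db8:: documentation prefix are arbitrary examples) of how SLAAC builds an address around a modified EUI-64 identifier: the MAC is split with ff:fe and the universal/local bit is flipped. Since the first three bytes of a MAC are a vendor OUI, knowing the vendor shrinks the search within a /64 from 2^64 to the remaining MAC bits.

```go
package main

import (
	"fmt"
	"net"
)

// eui64 expands a 6-byte MAC into the 8-byte interface identifier used
// by SLAAC: insert 0xff,0xfe in the middle, and flip the
// universal/local bit of the first byte.
func eui64(mac net.HardwareAddr) []byte {
	return []byte{mac[0] ^ 0x02, mac[1], mac[2], 0xff, 0xfe, mac[3], mac[4], mac[5]}
}

func main() {
	// Example values only: a made-up MAC and the documentation prefix.
	mac, _ := net.ParseMAC("00:16:3e:12:34:56")
	prefix := net.ParseIP("2001:db8::")

	addr := make(net.IP, 16)
	copy(addr, prefix.To16())
	copy(addr[8:], eui64(mac))
	fmt.Println(addr) // 2001:db8::216:3eff:fe12:3456
}
```

A scanner simply inverts this: for each candidate MAC under a known OUI, compute the address and probe it.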

What do we do about it?

There will probably not be a shodan.io for IPv6 in the same way there is for IPv4. Instead, much of the wide-scale scanning on the IPv6 network will be performed through reflection: probing hosts discovered through their participation in other active services, for instance BitTorrent, NTP, or DNS.

Conversely, the number of vulnerable IPv6 hosts will keep growing, because they can exist for much longer before anyone finds them. This will likewise increase the value that can be obtained through scanning – both to hackers, and to academics looking at Internet dynamics. We can expect to see a marketplace for addresses observed passively by ISPs, the network core, and passive services.

It’s also worth watching the watchers here: which providers are “selling me out”, so to speak? It would be worth building honey-pots to observe which services and servers leak client information and lead to probing, and the potential compromise of end hosts.

Internet Censorship 2016

We have reached the end of 2016, and with it the annual CCC congress in Germany. I had the exciting chance to speak, together with Philipp Winter, on the shifting landscape of Internet censorship in 2016. The talk followed mostly the same format as last year’s, calling out the continuing normalization and ubiquity of censorship around the world.

I left congress once again energized to work on system infrastructure advancing the Internet community in the face of these existential threats.

Slides from the talk are on this site.
A writeup (in German) is on the Netzpolitik blog.

First-party Google Analytics

Third-party analytics services are suffering from the growing prevalence of ad blocking, tracking protection, and the trend of minimizing connections and requests. From a site owner’s perspective, however, receiving usage information remains important for measuring site growth.

My expectation is that we are already on the curve where ads and tracking software will be more tightly integrated into websites, making it significantly more difficult for clients to disambiguate “good” and “bad” scripts – a distinction mostly made today from the URL.

Google already provides the tools needed to relay analytics communication through an intermediate server, and it took under an hour to put together a proof of concept that removes the final third-party requests required when viewing this page. In essence, my server proxies all the requests that would normally go to Google, and adds a couple of extra parameters to identify the real client.

The modified loading script for Google Analytics, and the corresponding nginx configuration to make my server a relay, are here.
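
The nginx configuration linked above is what actually runs here, but the relay idea is simple enough to sketch as a small Go reverse proxy as well. This is an illustration rather than the deployed code: the /collect path and the uip (“IP override”) parameter come from the Universal Analytics measurement protocol, and the listen address is a placeholder.

```go
package main

import (
	"net"
	"net/http"
	"net/http/httputil"
	"net/url"
)

func main() {
	// Analytics requests arrive at this first-party server and are
	// relayed on to Google, so the browser never contacts a third party.
	target, _ := url.Parse("https://www.google-analytics.com")
	proxy := httputil.NewSingleHostReverseProxy(target)

	orig := proxy.Director
	proxy.Director = func(r *http.Request) {
		orig(r)
		r.Host = target.Host
		// The relay is the client from Google's perspective, so pass the
		// real visitor address along via the IP-override parameter.
		if ip, _, err := net.SplitHostPort(r.RemoteAddr); err == nil {
			q := r.URL.Query()
			q.Set("uip", ip)
			r.URL.RawQuery = q.Encode()
		}
		r.Header.Del("Cookie") // don't forward first-party cookies upstream
	}

	http.Handle("/collect", proxy)
	http.ListenAndServe(":8080", nil) // placeholder listen address
}
```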

Thoughts on China’s Updated Cyber-security Regulations

On Monday, China ratified updated cybersecurity legislation that will enter into effect next June. The policy regulates a number of aspects of the Chinese Internet: what data companies need to keep on domestic servers, the interaction between companies and the government, and the interaction between companies and Chinese users.

Notably, when considering the impact on the Internet, the law includes:

  • Network operators are expected to record network security incidents and store logs for at least six months (Article 21)
    Note that the punishment for refusing to keep logs is a fine of up to 10,000 USD for the operator, and of up to 5,000 USD for the responsible person.
  • Services must require real-identity information for network access, telecom service, domain registration, blogging, or IM (Article 24)
    The punishment for failing to require identity is up to 100,000 USD and suspension of operations.
  • Network operators must provide support to the government for national security and crime investigations (Article 28)
  • If a service discovers prohibited user-generated content, it must remove it, save logs, and report to the government (Article 47)
    The punishment for this is up to 100,000 USD and closure of the website.

The concerns from foreign companies seem to center around a couple of things. The first is the fairly vague classification of ‘critical infrastructure’, which explicitly includes power, water, and other infrastructure elements, but also refers to services needed for public welfare and national security. Any such service gets additional monitoring requirements, and needs to keep all data on the mainland. Companies are worried they could be classified as critical services, and that there aren’t clear guidelines about how to avoid or limit the risk of becoming subject to those additional regulations.

The other main concern seems to be the fairly ambiguous obligation to support national security investigations by the government. There’s a concern that there aren’t really any limits on how much the government can request from services, which could include requiring them to build in back doors, or to perform significant technical analysis without compensation.

My impression is that these regulations aren’t much of a surprise within China, and they are unlikely to cause much in the way of change from how smaller companies and individuals experience Internet management already.

Watch your PAC

In the last week at Black Hat / DEF CON, two groups looked deeply at one of the lesser-known implementations of network policy, called Proxy AutoConfig. (In particular, badWPAD by Maxim, and Crippling HTTPS with unholy PAC by SafeBreach.)

Proxy AutoConfig (PAC) is a mechanism used by many organizations to configure an advanced policy for connecting to the Internet. A PAC file is written in JavaScript to provide a dynamic determination of how different connections should be made, and which proxy they should use. In particular, international companies with satellite offices often find the PAC system useful in routing some traffic through a corporate proxy for compliance or geographical reasons while other traffic is routed directly to the Internet.

These two talks both focus on what a malicious individual could do to attack the standard, and each finds an interesting line of attack. The first is that the PAC file is allowed to make DNS requests while determining how to proxy connections, and in many browsers it sees the full URL being accessed rather than only the domain. This means that even when the user is communicating with a remote server over HTTPS, the local network can learn the full URL being visited. The second has to do with where computers look for PAC files on their local network – a file called `wpad.dat`.

While there is certainly the potential for an attacker to target a victim through these technologies, they are more accessible, and arguably more valuable, to an ISP or state-level actor interested in passive surveillance. This explicit policy for connectivity is not inherently more invasive than policies employed by many ISPs already, and could likely be deployed on many networks without consumer push-back as a performance enhancement for better caching. It is also appropriate for targeted surveillance, since vulnerability can be determined passively.

The viability of surveillance through WPAD and PAC is a bit of a mixed bag. Most ISPs already use DHCP and set a “search domain”, which will result in a recognizable request for proxy information from vulnerable clients. While organizations often require all clients to enable discovery, this is not true of many consumer machines. Unfortunately, some versions of Windows have proxy discovery enabled by default.
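
To sketch why the search domain matters (my own illustration of WPAD-style DNS devolution, not code from either talk; the domain is made up): a client configured with search domain office.corp.example.com walks up the DNS hierarchy looking for a host named wpad, so whoever controls a matching name higher up the tree can serve the policy file.

```go
package main

import (
	"fmt"
	"strings"
)

// wpadCandidates mimics WPAD DNS devolution: strip one label at a time
// from the search domain and look for a "wpad" host at each level.
func wpadCandidates(searchDomain string) []string {
	var urls []string
	labels := strings.Split(searchDomain, ".")
	// Stop before the bare TLD; a client that doesn't is exactly the
	// badWPAD risk, since anyone can register wpad.<tld>-style names.
	for i := 0; i < len(labels)-1; i++ {
		domain := strings.Join(labels[i:], ".")
		urls = append(urls, fmt.Sprintf("http://wpad.%s/wpad.dat", domain))
	}
	return urls
}

func main() {
	for _, u := range wpadCandidates("office.corp.example.com") {
		fmt.Println(u)
	}
	// http://wpad.office.corp.example.com/wpad.dat
	// http://wpad.corp.example.com/wpad.dat
	// http://wpad.example.com/wpad.dat
}
```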

Nmap, the network exploration tool often pitched for use by network attackers, already has support for WPAD. In contrast, network status and monitoring tools like Netalyzr and OONI do not yet monitor local proxy status, and won’t provide an indication of malicious behavior.

Stunning

I’ve started to dive once again into the mess of connection establishment. Network address translation (NAT) is a reality today for most Internet users, and poses a significant hurdle in creating user-to-user (or peer-to-peer) connections. NAT is the process used by your router to provide multiple internal (192.168.x.x) addresses that are all visible only as a single external address on the Internet. The challenge caused by this device is that if someone outside wants to connect to your computer, they have to figure out how to get the router to send their traffic on to you, and not just drop it or send it to another computer on your network.

Without configuring your router to add a ‘port forwarding’ rule, it isn’t supposed to do this, so many of the connection establishment procedures are really ways to trick your NAT into forwarding traffic without realizing what’s happening. The sketch below shows this trick in its simplest form.
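
This is my own minimal sketch of UDP hole punching, assuming each peer has already learned the other’s external address and port out of band (the endpoint below is a placeholder from the documentation range): both sides fire UDP datagrams at each other, and the outbound packet opens a mapping in the local NAT that lets the peer’s packets back in.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// punch attempts UDP hole punching: sending to the peer's external
// endpoint creates a mapping in our NAT, through which the peer's own
// packets (sent the same way from their side) can reach us.
func punch(localPort int, peer string) error {
	paddr, err := net.ResolveUDPAddr("udp", peer)
	if err != nil {
		return err
	}
	conn, err := net.ListenUDP("udp", &net.UDPAddr{Port: localPort})
	if err != nil {
		return err
	}
	defer conn.Close()

	// Keep sending: early packets are dropped by the remote NAT until
	// the peer has also sent one and opened its own mapping.
	go func() {
		for i := 0; i < 10; i++ {
			conn.WriteToUDP([]byte("punch"), paddr)
			time.Sleep(500 * time.Millisecond)
		}
	}()

	buf := make([]byte, 1500)
	conn.SetReadDeadline(time.Now().Add(10 * time.Second))
	n, from, err := conn.ReadFromUDP(buf)
	if err != nil {
		return err // some NATs (e.g. symmetric) defeat this entirely
	}
	fmt.Printf("got %q from %v - the hole is open\n", buf[:n], from)
	return nil
}

func main() {
	// Placeholder peer endpoint, learned out of band (e.g. via STUN
	// plus a rendezvous server).
	if err := punch(40000, "203.0.113.7:40000"); err != nil {
		fmt.Println("punch failed:", err)
	}
}
```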

There are two main transport protocols on the Internet today: UDP and TCP. UDP is stateless – each “packet” of data is its own self-contained message. In contrast, TCP represents a longer “stream” of data – many messages sent with an explicit ordering. It is much harder to trick routers into establishing TCP connections, and there has been little work there.

The current generation of p2p systems are led by high-bandwidth applications that want to offload traffic from central servers in order to save on bandwidth costs. Good examples are Google Hangouts and other VoIP (voice over IP) traffic.

These systems establish a channel to send UDP traffic between two computers both behind NAT routers using a system called ICE (Interactive Connectivity Establishment). This is a complex dance with multiple sub-protocols used to try several different ways of establishing connectivity and tricking the routers.

One of the key systems used by ICE is a publicly visible server speaking a protocol called STUN. A STUN server provides a way for a client to open a UDP connection through its router to a server known to be able to receive messages, and then learn what that connection looks like from outside its router. It can then provide that external view of its connectivity to another peer, which may be able to send messages to the same external address and port and have them forwarded back to the client.
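
The wire exchange is small enough to show inline. Here is a rough hand-rolled sketch of it (not my library: IPv4-only, minimal error handling, and the server name is a placeholder), following the RFC 5389 layout of a Binding Request and the XOR-MAPPED-ADDRESS attribute that reveals the external address:

```go
package main

import (
	"crypto/rand"
	"encoding/binary"
	"fmt"
	"net"
)

const magicCookie = 0x2112A442 // fixed value from RFC 5389

func main() {
	conn, err := net.Dial("udp", "stun.example.org:3478") // placeholder server
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// 20-byte header: type 0x0001 (Binding Request), length 0, the magic
	// cookie, and a random 96-bit transaction ID.
	req := make([]byte, 20)
	binary.BigEndian.PutUint16(req[0:], 0x0001)
	binary.BigEndian.PutUint32(req[4:], magicCookie)
	rand.Read(req[8:])
	conn.Write(req)

	resp := make([]byte, 1500)
	n, err := conn.Read(resp)
	if err != nil || n < 20 {
		panic("no STUN response")
	}

	// Walk the attributes looking for XOR-MAPPED-ADDRESS (0x0020), whose
	// port and address are XORed with the magic cookie.
	for off := 20; off+4 <= n; {
		attrType := binary.BigEndian.Uint16(resp[off:])
		attrLen := int(binary.BigEndian.Uint16(resp[off+2:]))
		if attrType == 0x0020 && attrLen >= 8 && off+4+attrLen <= n {
			v := resp[off+4:]
			port := binary.BigEndian.Uint16(v[2:]) ^ uint16(magicCookie>>16)
			ip := make(net.IP, 4) // family byte v[1] == 0x01 means IPv4
			binary.BigEndian.PutUint32(ip, binary.BigEndian.Uint32(v[4:])^magicCookie)
			fmt.Printf("external address: %s:%d\n", ip, port)
			return
		}
		off += 4 + (attrLen+3)/4*4 // attribute values are padded to 4 bytes
	}
}
```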

One of the unfortunate aspects of this situation is that the complexity of these systems has led to very few implementations. That’s a shame, since libraries making it easy to reuse these techniques would allow more p2p systems to keep working on the modern Internet without forcing users to manually configure their routers.

I’ve started work on a standalone Go implementation of the ICE connectivity stack. Over the weekend I reached the first milestone: the library can create a STUN connection, and learn the external appearance of the connection as reported by the STUN server.
