Wednesday, April 13, 2011

Dropbox: Not recommended for business or secure financial or private medical organizations.

Two disturbing issues have been found in the authentication and privacy of Dropbox.

The first from Derek Newton, Titled Dropbox authentication: insecure by design, reads as follows, quoted in its superb entirety
For the past several days I have been focused on understanding the inner workings of several of the popular file synchronization tools with the purpose of finding useful forensics-related artifacts that may be left on a system as a result of using these tools.  Given the prevalence of Dropbox, I decided that it would be one of the first synchronization tools that I would analyze, and while working to better understand it I came across some interesting security related findings.  The basis for this finding has actually been briefly discussed in a number of forum posts in Dropbox’s official forum (here and here), but it doesn’t quite seem that people understand the significance of the way Dropbox is handling authentication.  So, I’m taking a brief break in my forensics-artifacts research, to try to shed some light about what appears to be going on from an authentication standpoint and the significant security implications that the present implementation of Dropbox brings to the table.

To fully understand the security implications, you need to understand how Dropbox works (for those of you that aren’t familiar with what Dropbox is – a brief feature primer can be found on their official website).  Dropbox’s primary feature is the ability to sync files across systems and devices that you own, automatically.  In order to support this syncing process, a client (the Dropbox client) is installed on a system that you wish to participate in this synchronization.  At the end of the installation process the user is prompted to enter their Dropbox credentials (or create a new account) and then the Dropbox folder on your local system syncs up with the Dropbox “cloud.”  The client runs constantly looking for new changes locally in your designated Dropbox folder and/or in the cloud and syncs as required; there are versions that support a number of operating systems (Windows, Mac, and Linux) as well as a number of portable devices (iOS, Android, etc).  However, given my research is focusing on the use of Dropbox on a Windows system, the information I’ll be providing is Windows specific (but should be applicable on any platform).

Under Windows, Dropbox stores configuration data, file/directory listings, hashes, etc in a number of SQLite database files located in %APPDATA%\Dropbox.  We’re going to focus on the primary database relating to the client configuration: config.db.  Opening config.db with your favorite SQLite DB tool will show you that there is only one table contained in the database (config) with a number of rows, which the Dropbox client references to get its settings.  I’m going to focus on the following rows of interest:

email: this is the account holder’s email address.  Surprisingly, this does not appear to be used as part of the authentication process and can be changed to any value (formatted like an email address) without any ill-effects.
dropbox_path: defines where the root of Dropbox’s synchronized folder is on the system that the client is running on.
host_id: assigned to the system after initial authentication is performed, post-install.  Does not appear to change over time.
After some testing (modification of data within the config table, etc) it became clear that the Dropbox client uses only the host_id to authenticate.  Here’s the problem: the config.db file is completely portable and is *not* tied to the system in any way. This means that if you gain access to a person’s config.db file (or just the host_id), you gain complete access to the person’s Dropbox until such time that the person removes the host from the list of linked devices via the Dropbox web interface.  Taking the config.db file, copying it onto another system (you may need to modify the dropbox_path, to a valid path), and then starting the Dropbox client immediately joins that system into the synchronization group without notifying the authorized user, prompting for credentials, or even getting added to the list of linked devices within your Dropbox account (even though the new system has a completely different name) – this appears to be by design.  Additionally, the host_id is still valid even after the user changes their Dropbox password (thus a standard remediation step of changing credentials does not resolve this issue).

Of course, if an attacker has access to the config.db file (assuming that it wasn’t sent by the user as part of social engineering attack), the assumption is that the attacker most likely also has access to all of the files stored in your Dropbox, so what’s the big deal?  Well, there are a few significant security implications that come to mind:

Relatively simple targeted malware could be designed with the specific purpose of exfiltrating the Dropbox config.db files to “interested” parties who then could use the host_id to retrieve files, infect files, etc.
If the attacker/malware is detected in the system post-compromise, normal remediation steps (malware removal, system re-image, credential rotation, etc) will not prevent continued access to the user’s Dropbox.  The user would have to remember to purposefully remove the system from the list of authorized devices on the Dropbox website.  This means that access could be maintained without continued access/compromise of a system.
Transmitting the host_id/config.db file  is most likely much smaller than exfiltrating all data found within a Dropbox folder and thus most likely not set off any detective alarms.  Review/theft/etc of the data contained within the Dropbox could be done at the attackers leisure from an external attacker-owned system.
So, given that Dropbox appears to utilize only the host_id for authentication by design, what can you do to protect yourself and/or your organization?

Don’t use Dropbox and/or allow your users to use Dropbox.  This is the obvious remediating step, but is not always practical – I do think that Dropbox can be useful, if you take steps to protect your data…
Protect your data: use strong encryption to protect sensitive data stored in your Dropbox and protect your passphrase (do not store your passphrase in your Dropbox or on the same system/device).
Be diligent about removing old systems from your list of authorized systems within Dropbox.  Also, monitor the “Last Activity” time listed on the My Computers list within Dropbox.  If you see a system checking in that shouldn’t be, unlink it immediately.
Hopefully, Dropbox will recognize the need for additional security and add in protection mechanisms that will make it less trivial to gain long-term unauthorized access to a user’s Dropbox as well as provide better means to mitigate and detect an exposure.  Until such time, I’m hoping that this writeup helps brings to light how the authentication method used my Dropbox may not be as secure as previously assumed and that, as always, it is important to take steps to protect your data from compromise.

And then we have a legal issue from slight paranoia, Titeled How Dropbox sacrifices user privacy for cost savings, reads as follows, again quoted in its entirety

Dropbox, the popular cloud based backup service deduplicates the files that its users have stored online. This means that if two different users store the same file in their respective accounts, Dropbox will only actually store a single copy of the file on its servers.

The service tells users that it "uses the same secure methods as banks and the military to send and store your data" and that "[a]ll files stored on Dropbox servers are encrypted (AES-256) and are inaccessible without your account password." However, the company does in fact have access to the unencrypted data (if it didn't, it wouldn't be able to detect duplicate data across different accounts).

This bandwidth and disk storage design tweak creates an easily observable side channel through which a single bit of data (whether any particular file is already stored by one or more users) can be observed.

If you value your privacy or are worried about what might happen if Dropbox were compelled by a court order to disclose which of its users have stored a particular file, you should encrypt your data yourself with a tool like truecrypt or switch to one of several cloud based backup services that encrypt data with a key only known to the user.

Introduction

For those of you who haven't heard of it, Dropbox is a popular cloud-based backup service that automatically synchronizes user data. It is really easy to use and the company even offers users 2GB of storage for free, with the option to pay for more space.

The problem is, offering free storage space to users can be quite expensive, at least once you gain millions of users. In what I suspect was a price-motivated design decision, Dropbox deduplicates the data uploaded by its users. What this means is that if two users backup the same file, Dropbox only stores a single copy of it. The file still appears in both users' accounts, but the company doesn't consume storage space nor upload bandwidth on a second copy of the file.

The company's CTO described the deduplication in a note posted in the "Bugs & Troubleshooting" section on the company's web forum last year:
Woah! How did that 750MB file upload so quickly?

Dropbox tries to be very smart about minimizing the amount of bandwidth used. If we detect that a file you're trying to upload has already been uploaded to Dropbox, we don't make you upload it again. Similarly, if you make a change to a file that's already on Dropbox, you'll only have to upload the pieces of the file that changed.

This works across all data on Dropbox, not just your own account. There are no security implications [emphasis added] - your data is still kept logically separated and not affected by changes that other users make to their data.
Ashkan Soltani was able to verify the deduplication for himself a couple weeks ago. It took just a few minutes with a packet sniffer. A new randomly generated 6.8MB file uploaded to dropbox lead to 7.4MB of network traffic, while a 6.4MB file that had been previously uploaded to a different dropbox account lead to just 16KB in network traffic.

Claims of security and privacy

There are long standing privacy and security concerns with storing data in the cloud, and so Dropbox has a helpful page on their website which attempts to address these:
Your files are actually safer while stored in your Dropbox than on your computer in some cases. We use the same secure methods as banks and the military to send and store your data.

Dropbox takes the security of your files and of our software very seriously. We use the best tools and engineering practices available to build our software, and we have smart people making sure that Dropbox remains secure. Your files are backed-up, stored securely, and password-protected.

Dropbox uses modern encryption methods to both transfer and store your data...

All files stored on Dropbox servers are encrypted (AES-256) and are inaccessible without your account password

Reading through this document, it would be easy for anyone but a crypto expert to get the false impression that Dropbox does in fact protect the security and privacy of users' data. Many users and even the technology press will not realize that AES-256 is useless against many attacks if the encryption key isn't kept private.

What is missing from the firm's website is a statement regarding how the company is using encryption, and in particular, what kinds of keys are used and who has access to them.

Encryption and deduplication

Encryption and deduplication are two technologies that generally don't mix well. If the encryption is done correctly, it should not be possible to detect what files a user has stored (or even if they have stored the same file as someone else), and so deduplication will not be possible.

Dropbox is likely calculating hashes of users' files before they are transmitted to the company's servers. While it is not clear if the company is using a single encryption key for all of the files users' have stored with the service, or multiple encryption keys, it doesn't really matter (from a privacy and security standpoint), because Dropbox knows the keys. If the company didn't have access to the encryption keys, it wouldn't be able to detect duplicate files.

While the decision to deduplicate data has probably saved the company quite a bit of storage space and bandwidth, it has significant flaws which are particularly troubling given the statements made by the company on its security and privacy page.

Cloud backup providers do not need to design their products this way. Spideroak and Tarsnap are two competing services that encrypt their users' data with a key only known to that user. These companies have opted to put their users' privacy first, but the side effect is that they require more back-end storage space. If 20 users upload the same file, both companies upload and store 20 copies of that file (and in fact, they have no way of knowing if a user is uploading something that another user has backed up).

Why is this a problem?

As Ashkan Soltani was able to test in just a few minutes, it is possible to determine if any given file is already stored by one or more Dropbox users, simply by observing the amount of data transferred between your own computer and Dropbox's servers. If the file isn't already stored by Dropbox, the entire file will be uploaded. If Dropbox has the file already, just a few kb of communication will occur.

While this doesn't tell you which other users have uploaded this file, presumably Dropbox can figure it out. I doubt they'd do it if asked by a random user, but when presented with a court order, they could be forced to.

What this means, is that from the comfort of their desks, law enforcement agencies or copyright trolls can upload contraband files to Dropbox, watch the amount of bandwidth consumed, and then obtain a court order if the amount of data transferred is smaller than the size of the file.

Last year, the New York Attorney General announced that Facebook, MySpace and IsoHunt had agreed to start comparing every image uploaded by a user to an AG supplied database of more than 8000 hashes of child pornography. It is easy to imagine a similar database of hashes for pirated movies and songs, ebooks stripped of DRM, or leaked US government diplomatic cables.

Responsible Disclosure

On April 1, 2011, Marcia Hofmann at the Electronic Frontier Foundation contacted Dropbox to let them know about the flaw, and that a researcher would be publishing the information on April 12th. There are plenty of horror stories of security researchers getting threatened by companies, and so I hoped that by keeping my identity a secret, and having an EFF attorney notify the company about the flaw, that I would reduce my risk of trouble.

At 6:15PM west coast time on April 11th, an attorney from Fenwick & West retained by Dropbox left Marcia a voicemail message, in which he reveled that: "the company is updating their privacy policy and security overview that is on the website to add further detail."

Marcia spoke with the company's attorney this morning, and was told that the company will be updating its privacy policy and security overview to clarify that if Dropbox receives a warrant, it has the ability to remove its own encryption to provide data to law enforcement.

While I want to praise the company for being willing to clarify the security statements made on its website, I hope this will be a first step on this issue, and not the last.

It is unlikely that the millions of existing Dropbox users will stumble across the new privacy policy in their regular web browsing. As such, the company should send out an email to its users to let them know about this flaw, and advise them of the steps they can take if they are concerned about the privacy of their data.

I also urge the company to abandon its deduplication system design, and embrace strong encryption with a key only known to each user. Other online backup services have done it for some time. This is the only real way that data can be secure in the cloud.

You are encouraged to follow and contribute to the discussion at those locations above. 

No comments: