IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 46, NO. 3, MAY 2000 893

Tracing Traitors

Benny Chor, Amos Fiat, Moni Naor, and Benny Pinkas

Abstract—We give cryptographic schemes that help trace the source of leaks when sensitive or proprietary data is made available to a large set of parties. A very relevant application is in the context of pay television, where only paying customers should be able to view certain programs. In this application, the programs are normally encrypted, and then the sensitive data is the decryption keys that are given to paying customers. If a pirate decoder is found, it is desirable to reveal the source of its decryption keys. We describe fully resilient schemes which can be used against any decoder which decrypts with nonnegligible probability. Since there is typically little demand for decoders which decrypt only a small fraction of the transmissions (even if it is nonnegligible), we further introduce threshold tracing schemes which can only be used against decoders which succeed in decryption with probability greater than some threshold. Threshold schemes are considerably more efficient than fully resilient schemes.

Index Terms—Encryption, tracing, watermarking.

I. INTRODUCTION

If only one person knows some secret, and this next appears on the evening news, then the guilty party is evident. A more complex situation arises if the set of people that have access to the secret is large. The problem of determining guilt or innocence is (mathematically) insurmountable if all people get the exact same data and one of them behaves treacherously and reveals the secret.

Any data that is to be available to some while it should not be available to others can obviously be protected by encryption. The data supplier may give authorized parties cryptographic keys allowing them to decrypt the data.
This does not solve the problem above because it does not prevent one of those authorized to view the message (say, Alice) from transferring the cleartext message to some unauthorized party (say, Bob). Once this is done then there are no (cryptographic) means to trace the source of the leak. We call all such unauthorized access to data piracy. The traitor or traitors are the (set of) authorized user(s) who allow other, nonauthorized parties to obtain the data. These nonauthorized parties are called pirate users.

Manuscript received March 29, 1998; revised November 28, 1999. The work of B. Chor was supported by the Fund for Promotion of Research at the Technion. The work of M. Naor was supported under a grant from the Israel Science Foundation administered by the Israeli Academy of Sciences and by an Alon Fellowship. The work of B. Pinkas was supported by an Eshkol Fellowship from the Israeli Ministry of Science. The material in this paper was presented in part at Crypto’94 and Crypto’98.
B. Chor is with the Department of Computer Science, Technion–Israel Institute of Technology, Haifa 32000, Israel (e-mail: benny@cs.technion.ac.il).
A. Fiat is with the Department of Computer Science, School of Mathematics, Tel-Aviv University, Ramat-Aviv 69978, Israel, and Algorithmic Research Ltd. (e-mail: fiat@math.tau.ac.il).
M. Naor and B. Pinkas are with the Department of Applied Mathematics and Computer Science, Weizmann Institute of Science, Rehovot 76100, Israel (e-mail: {naor, bennyp}@wisdom.weizmann.ac.il).
Communicated by D. Stinson, Associate Editor for Complexity and Cryptography.
Publisher Item Identifier S 0018-9448(00)03089-3.

In many interesting cases piracy is somewhat ineffective if the relevant cleartext messages must be transmitted by the “traitor” to the “enemy.” Typical cases where this is true include
• Pay-per-view or subscription television broadcasts. It is simply too expensive and risky to start a pirate broadcast station.
• Online databases, publicly accessible (say on the Internet), where a charge may be levied for access to all or certain records.
• Distribution of data in an encrypted form where a surcharge is levied for the decryption keys for different parts of the data. The encrypted data is often distributed on a CD-ROM or DVD and it is assumed that cleartext data can only be distributed on a similar storage device whose production involves relatively high setup costs. (This assumption might not currently be justified for CD-ROMs, but it might be reasonable for other types of media, such as DVDs. We use the term CD-ROM in order to have a concrete example and to simplify the presentation.)

In all these cases, transmitting the cleartext from a traitor, Alice, to a pirate user, Bob, is rather expensive compared to the mass distribution channels the legal data supplier uses. It might also be the case, as with online databases or newspapers, that the data is continuously changing and therefore it is very hard for the pirate to keep an updated copy of the data. As piracy in all these cases is a criminal commercial enterprise, the risk/benefit ratio becomes very unattractive. These three examples can be considered generic examples covering a wide range of data services.

In this paper we concentrate on preventing traitors from distributing the keys that enable the decryption of the encrypted content. Consider a ciphertext that may be decrypted by a large set of parties, but each and every party is assigned a different personal key for decrypting the ciphertext. (We use the term personal key rather than private key to avoid confusion with public key terminology.) Should the key used in a pirate decoder be discovered (by examining the pirate decoder or by counter-espionage), it will be linked to a personal key of a traitor and this traitor will be identified.

Clearly, a possible solution is to encrypt the data separately under different personal keys.
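As an illustration only, the naive approach of one ciphertext per personal key might be sketched as follows. The toy cipher here (XOR with a SHA-256 counter-mode keystream) and all names (`naive_broadcast`, `trace`, `user0`, and so on) are assumptions made for this sketch and are not taken from the paper. It is not a secure construction. The sketch shows that tracing a recovered key is a simple lookup, and that the total ciphertext grows with the number of users.

```python
import hashlib
import secrets
from typing import Dict, Optional


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode (illustration only, not secure)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XOR with the keystream (the operation is its own inverse)."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def naive_broadcast(personal_keys: Dict[str, bytes], cleartext: bytes) -> Dict[str, bytes]:
    """Encrypt the same cleartext once per user, under that user's personal key."""
    return {user: xor_cipher(key, cleartext) for user, key in personal_keys.items()}


def trace(personal_keys: Dict[str, bytes], found_key: bytes) -> Optional[str]:
    """Link a key recovered from a pirate decoder to the user it was issued to."""
    for user, key in personal_keys.items():
        if key == found_key:
            return user
    return None


# Hypothetical setup: five authorized users, each with a distinct personal key.
personal_keys = {f"user{i}": secrets.token_bytes(16) for i in range(5)}
cleartext = b"tonight's pay-per-view program"
broadcast = naive_broadcast(personal_keys, cleartext)

# Total ciphertext length divided by cleartext length equals the number of users.
total_ciphertext = sum(len(c) for c in broadcast.values())
overhead = total_ciphertext // len(cleartext)
```

In this sketch `overhead` equals the number of entries in `personal_keys`, and `trace(personal_keys, key)` returns the user a recovered key was issued to, or `None` if the key was never issued.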
This means that the total length of the ciphertext is at least $n$ times the length of the cleartext, where $n$ is the number of authorized parties. Such overhead is certainly impossible in any broadcast environment. It is also very problematic in the context of content distributed on a DVD because this means that every copy must be different. An encrypted online database, publicly accessible as above, must store an indi-