The town of Lindon, Utah stretches only a mile or so along the desert highway south of Salt Lake City, the site of last year's Winter Olympics. Lindon is a "bedroom community"-most of its 8,300 residents head out of town to go to work. But one Lindon business has put the town on the map. SCO (pronounced "sko") Group has become the unlikely epicenter in a major battle over Unix, IBM's AIX and Linux. The fight involves a $3 billion lawsuit filed in the Utah State Courts by SCO against IBM. In pursuing its claim, SCO has attempted to revoke IBM's license to AIX, told some 1,500 companies running Linux that they may be running pirated code, criticized Linus Torvalds for being lax about intellectual property, and incurred the extreme wrath of the open source community.
The path to this unlikely set of circumstances began back in May 2001 when SCO Group, then called Caldera Systems, acquired Santa Cruz Operation's Unix business. At the time, Caldera was a second-tier Linux distributor and the reasons for its acquisition were unclear. Santa Cruz Operation was known for its proprietary Intel-based operating systems: OpenServer, which is based on SCO's Xenix operating system, and UnixWare, which it acquired from Novell. Reporting on the acquisition in the February 2001 issue of Software Design, I wrote: "For the true believers, SCO is a dinosaur, stuck in the old economy, using the old way to develop software, and gravely threatened by Linux. Why, they ask, would a Linux distributor dirty its hands with a company that seemed to be running from Linux, rather than embracing it?"
In the months that followed, nobody at Caldera seemed to have the answer either. The following March, company shareholders approved a 1-for-4 reverse stock split, a move usually made to raise a company's sagging stock price. There were layoffs and more layoffs. Chief Technology Officer Drew Spencer left the company. Then, in June 2002, Caldera co-founder Ransom Love stepped down as CEO. His replacement was Darl McBride, who worked for Novell from 1988 to 1996, a stretch that included Novell's sale of its Unix business to Santa Cruz Operation. One of McBride's first moves was to rename the company, from Caldera International to the SCO Group.
Looking back, the name change was an early clue of things to come. Reverting to the SCO name meant de-emphasizing SCO Linux, Caldera's legacy, and re-focusing on the company's proprietary Unix products. And of those products, UnixWare would become the real focal point. UnixWare is the trade name for Unix System V, which can trace its lineage back to Unix's creation at AT&T Bell Labs. McBride saw in the UnixWare pedigree a hidden revenue source for his struggling company. In October 2002, the company hired Chris Sontag to head its newly created SCOsource division, whose principal mission is to squeeze additional revenue out of SCO's Unix ownership. The company publicly disclosed the existence of SCOsource the following January at LinuxWorld.
"Caldera's focus was on Linux-that was a prior management decision," says Sontag. "This management's focus is more on Unix. We haven't seen too many Linux businesses that have been highly successful. It's hard to make money at something that's free-there are plenty of examples of 'free' not doing well." Sontag maintains that SCO doesn't have a problem with open source, per se, "but we have a problem competing with free products that use portions of our intellectual property that have been stolen from us."
In March, SCO filed suit against IBM, reportedly after negotiations broke down, claiming that IBM had purloined Unix System V code for AIX and dropped it illegally into Linux. In May, the company sent a letter to some 1,500 large Linux installations around the world, contending that the operating system is "an unauthorized derivative of Unix." Because open source does not have the built-in legal safeguards of closed source development, the letter said, "it is not surprising that Linux distributors do not warrant the legal integrity of the Linux code provided to customers. Therefore legal liability that may arise from the Linux development process may also rest with the end user." SCO advised recipients to consult their attorneys.
SCO's written legal complaint filed with the court even took Linus Torvalds to task, saying that his "inability and/
For the open source community, SCO's legal moves and accusations are blasphemous-a violation of the trust and comradeship that make free software possible. But from the perspective of SCO Group and its shareholders, the IBM lawsuit makes some internal sense. In its earlier life as Caldera, the company played a distant second to Red Hat as a Linux distributor. The operating systems it purchased from the old SCO competed directly against Linux on the Intel platform. As Sontag points out, it's hard to compete with "free." By contrast, IBM, Sun and HP-which all provide Unix and Linux servers-make their money primarily on preconfigured hardware and consulting services. Microsoft earns its money the old-fashioned way-by actually charging for its operating system and applications. Red Hat-the best-known Linux distributor-has well-known investors, highly visible bundling deals, a product family targeting companies of different sizes, and consulting and training services. SCO has a little bit of all of this, but perhaps not enough.
McBride's responsibility is to his shareholders. If SCO Group revenues go up through some combination of court verdict, out-of-court settlements, and increased licensing fees-if the company gets sold at a favorable price or stays viable through it all-McBride will have done his job.
The complaint: tainted code
SCO's public confrontation with IBM began with a $1 billion lawsuit, later raised to $3 billion. SCO also put IBM on notice that it had the authority to revoke IBM's AIX license. To put some teeth behind its accusations, SCO hired Boies, Schiller and Flexner-a 163-attorney legal firm headed by David Boies, who gained prominence representing the U.
The complaint says that IBM wanted to bolster Linux because it was moving to a services model-give away the software and charge customers for systems integration. "By undermining and destroying the entire marketplace value of UNIX in the enterprise market, IBM would gain even greater advantage over all its competitors whose revenue model was based on licensing of software rather than sale of services." To accomplish this, IBM "set about to deliberately and improperly destroy the economic value of UNIX and particularly the economic value of UNIX on Intel-based processors."
The alleged means for doing this was to give parts of AIX to the Linux community; in doing so, IBM allegedly gave away Unix source code and libraries, as well as Unix concepts and know-how. Some of IBM's expertise was allegedly gleaned from a joint IBM-SCO development project, called Project Monterey. IBM, not surprisingly, has denied all of this: no misappropriated trade secrets, no unfair competition, no interference with SCO's contracts, no breach of its own contract. On June 16, SCO made good on its threat and revoked IBM's AIX license. IBM immediately responded by contending that "our license is irrevocable, perpetual and cannot be terminated."
Opponents question ownership
As is typical in U.
OSI argues that claiming "ownership" of Unix is a muddy business. "Even during the early days of Unix commercialization, the Unix code base was widely regarded as a commons worked by many hands," Raymond and Landley wrote. "As time went on and Unix evolved, possession of an AT&T source license came to be seen as more a pro-forma gesture in the direction of history than a concession that AT&T's intellectual property still contributed a dominating part of the value. This was especially so after the Berkeley hackers added Internet capability to Unix around 1980.
"Thus, the community of Unix hackers that had grown up around the pre-commercial releases never lost the conviction that, ethically, the Unix code belonged to them - the people who had the ideas and wrote the code - regardless of what the legal paperwork said. The outcome of the USL-vs.-Berkeley lawsuit in 1993, which severed the claims of AT&T and its successor Novell to the BSD source code, was universally regarded in the community as no more than simple and overdue justice."
That lawsuit, in which Unix System Laboratories and Novell sued the University of California, Berkeley and Berkeley Software Design, Inc. (BSDI), was settled out of court, with the results sealed. But the authors contend that the unsealed proceedings would reveal that "there are many sources of code and engineering experience in the Unix design tradition entirely independent of AT&T/
The authors argue that IBM had good reason to walk away from Project Monterey, its joint development project with SCO. Four months before, IBM helped found the Open Source Development Lab, with facilities in Beaverton, Oregon, and Yokohama. (The lab now employs Linus Torvalds.) "When OSDL spun up, IBM gained a choice: work with one small partner that lacks demonstrated expertise or focus on the enterprise market, or join a large consortium of industry heavyweights with man-centuries of relevant experience."
The authors contend that the internal workings of Unix were hardly trade secrets. "For more than thirty years, there has been a flourishing technical literature describing Unix operating system architecture and concepts. UNIX system internals and architecture are routinely taught in university computer science courses. Indeed, old SCO's and SCO/
Another unresolved argument has to do with the General Public License (GPL) that governs Linux distribution. SCO was, of course, a Linux distributor-a legacy from its Caldera days-but stopped selling SCO Linux after it sued IBM. Under the GPL, if you modify the code and distribute it, you must make those modifications available to others. As part of the UnitedLinux alliance, SCO distributed the Linux kernel. Did it inadvertently put proprietary code under the GPL label? Many observers think not-that the GPL requires a voluntary act. Others think the lawsuit would at least be a good test of the GPL.
The missing code
OSI contends that the source code for Unix System V is so diluted with the sweat of outside developers that it belongs to the world. But without further proof, that question can't be answered. In its letter to Linux users, SCO said it had evidence that portions of UNIX System V code had been copied directly into Linux, or modified first "seemingly for the purposes of obfuscating their original source."
Which code is that? At this writing, Chris Sontag has been showing some examples of alleged copying on a non-disclosure basis to journalists, analysts and companies. (In answer to my request, the company said that either I would have to go to Utah or meet Sontag at some unknown point when he visits the San Francisco Bay Area.) He claims SCO has "many, many examples of direct [Unix] System V in different Linux modules. We're talking about significant amounts of code, including human comments and grammatical mistakes." One example, he says, has 80 identical lines of code, though they aren't necessarily consecutive. "In some cases lines have been added before and after, and some lines have been modified, but for the most part, even a person who has never programmed would say: 'those are the same.'"
Sontag says that even more code to Linux came via AIX and Dynix/
The question of whether source code has been plagiarized gets complicated, quickly. If you copy a page out of Harry Potter and call it your own, it doesn't matter which page you've copied-it's still plagiarism. But with software, content and context matter. "The question is: how important is the code to the overall program," says Andy Johnson-Laird, who has testified as an expert witness in several software plagiarism lawsuits. Johnson-Laird is not an attorney, but a forensic software analyst-he looks for evidence of wrongdoing in the source code the way Sherlock Holmes did in a strand of hair or a footprint. A long-time systems analyst, he has trained both lawyers and judges in the intricacies of software.
"If you have 80 lines of misappropriated source code, how important are they? Are they 80 lines of comment? Or a crucial algorithm? Or 39 lines of blanks and the rest left and right braces demarking conditional code? The real question is: is the code substantial? If it's not, then the odds are it's not going to be used by the courts as worthy of copyright protection. On the other hand, one line of code, or even one semicolon, can have a huge impact on the program." He notes that many years ago, a single missing semicolon in the C source code controlling telephone switches caused a major outage on the American East Coast.
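Johnson-Laird's first question, what the matching lines actually consist of, can be illustrated with a short Python sketch. It is only a rough triage under stated assumptions: the function names are hypothetical, the rules assume C-style source, and the comment test is a line-by-line heuristic rather than a real parser. It says nothing about whether the surviving "code" lines are legally substantial; it only separates them from blanks, braces and comments.

```python
def classify_line(line):
    """Roughly label one line of C-style source (heuristic, not a parser)."""
    stripped = line.strip()
    if not stripped:
        return "blank"
    if stripped in {"{", "}", "};"}:
        return "brace"
    # Assumption: comment lines start with a comment marker. This does not
    # track state inside multi-line comments or trailing comments after code.
    if stripped.startswith(("//", "/*", "*/", "* ")) or stripped == "*":
        return "comment"
    return "code"


def triage(matched_lines):
    """Count how many of the allegedly copied lines fall into each category."""
    counts = {"blank": 0, "brace": 0, "comment": 0, "code": 0}
    for line in matched_lines:
        counts[classify_line(line)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical matching lines, invented for illustration only.
    sample = ["", "{", "/* copy the buffer */", "dst[i] = src[i];", "}"]
    print(triage(sample))
```

A tally dominated by blanks and braces would point toward the "39 lines of blanks" scenario; a tally dominated by code lines would only justify a closer look by a human analyst.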
"Sometimes, the code is copied verbatim," he says. "And sometimes, that's just the tip of the iceberg-revealing other copying that's not as obvious. Perhaps the logic is the same or similar. Or someone has copied code from one language into another. But the intent on the part of the programmer is always the same: to save time."
Proving substantial similarity is the first test of source code plagiarism. But a lawyer must also show that the work was not the product of external constraint. Does the accused code resemble the original code because it was copied, or because there are only so many ways you can write it? There are only so many ways you can communicate with a server across the Internet-all of them involving TCP/
Johnson-Laird, who first started doing forensic work back in 1986, often sees copyright-related issues, largely because programmers don't always understand the law. "Many programmers think that if they write a program today and get a new job tomorrow, they can take either the actual source code or what's in their head-and simply develop a competitive product. But that's not necessarily so. The law defies the programmer's point of view-that it's their code and they can re-use it. Depending on their employment contract, they probably can't."
Johnson-Laird said he was approached on the SCO case but declined to participate. "If someone hands you a hand grenade with the pin missing, what are you going to do? I think the Linux effort is an important one in the overall scheme of things. There is something important at work here and it's Eric Raymond's concept of the cathedral and the bazaar. The major companies, like IBM, are starting to recognize the value of Linux. Therefore, we felt that societal overtones made it not in our or society's best interest to work on this case. And that's a decision we have to make on every prospective case."
His hand grenade analogy is apt-and it's unclear at this point where it will explode. SCO appears to have burned too many bridges to be a viable player in the open source business. Would you buy a Linux distribution from this company? Its proprietary products compete against a free operating system, whose adherents treat it with near-religious deference while donating their time to relentlessly improve it. So SCO's intellectual property would appear to be its most promising revenue source. And it has already gained some dividends: In May, Microsoft licensed SCO's Unix patents and source code. On the announcement, SCO's stock rose 38 percent.
As for Linux users, I think it unlikely they will have to pay SCO royalties. More likely, the offending code-if there is any-will be replaced and Linux will continue on. But no matter what happens with the SCO-IBM lawsuit, another layer of innocence has peeled away from open source development. Larger companies, especially, will think twice before using open source code. With every new version, the contributed code is bound to get more scrutiny. People will ask: is this original work? Or was it cut-and-paste? The Linux penguin had better start doing pushups.
Sidebar: "Litigation of intellectual property is just insane...":
A conversation with Intellectual Property Attorney G. Gervaise Davis
G. Gervaise Davis's legal career spans the entire history of intellectual property as it applies to computers and software. He began practicing in 1958 in the mainframe era, and 10 years later made technology law his area of focus. He helped found Digital Research, developer of the CP/
How should Software Design readers evaluate what's going on between SCO and IBM?
They should know that any time companies dispute the ownership of software, it's hard to separate the public relations from the BS. Historically, probably half of one percent of all litigation in this area actually goes to a full trial. Statistically, something like 0.3 percent of these cases actually get a verdict from a judge or jury. The reason is cost. A patent lawsuit can run from $2 million to $5 million, with lawyers and consultants charging from $150 to $500 an hour. Some years ago, I worked on a reverse engineering case in which every meeting was costing the parties about $11,000 an hour in legal and expert consulting fees.
Has there been any pattern in how these cases have been settled?
The most likely scenario is that the parties quietly sign a cross-license agreement. One of the reasons you don't see lawsuits very often between aircraft or car manufacturers is that these cases are generally settled this way. The parties know they have to work with each other again. Many times, I've told people that taking the claim to court makes no sense-that the cost of litigation is not worth it. Eventually, after their anger subsides, the business executives realize that's true. They also realize the jury may well get confused and make the wrong judgment, or may come up with some astronomical punitive judgment. People don't realize that numerous disputes like these are never made public. The most rational companies recognize that litigation of intellectual property is just insane from a cost-benefit standpoint.
In the SCO case, does that mean we should look for the public relations ramifications as much as the legal ones?
Absolutely, because once the dispute becomes public knowledge, it becomes a public relations war-not just for the opinions of you, me and other interested parties, but for the board of directors of IBM. They will keep reading about the case and wondering-do we have a problem here or not? Many of these cases effectively become like blackmail in the sense that somebody decides that it is worth paying the money just to stop all the questions and fuss.
If a case comes to the public's attention, can you assume something has already gone wrong?
That's right. Maybe someone's claim is just too preposterous to consider settling, or it generates a great deal of publicity, or is filed by a desperate company doing everything they can to raise money. Or a company has a claim so serious that the other side can't settle it because it would cost too much and they have to litigate to see if it increases their leverage.
Is this the first potential case involving open source?
Some other claims have come close, like Unisys's patent on the LZW compression algorithm, which is involved whenever you use GIF files. And then there are Rambus's claims over SDRAM memory. Rambus proposed a standard without disclosing that it had an intellectual property claim on it. If you propose a standard, do you have an obligation to reveal that you have some claims to the technology? And in proposing that it become the standard, do you give up your rights?
The SCO case has some similarities. Some people argue that the very use of the GPL makes this code available royalty-free. But it's more complicated than that. I have a client who makes software that allows people to collect and report on human rights violations. The software has been made open source. We still insert a notice of copyright in it, even though we don't claim we're entitled to any royalties. Just because something is distributed under an open source license doesn't mean you are necessarily giving up any copyright claim to it-you're just giving up any right to any royalties to it. The distinction is that you are still the owner of it, and in theory, could stop other people from creating derivatives of it.
Under copyright law, you have the right to claim ownership and get credit for it. You have the right to prevent people from copying it, and to stop people from creating derivatives from it.
And all those could be business advantages even if you didn't derive revenue directly from it?
Correct. There are numerous people who have copyright ownership of something, but they allow people to use it. Java code is a great example. Sun owns the copyright but they clearly state that you have the right to use it freely as long as you give Sun credit as the owner. They still retain some control over the subsequent development of Java. JPEG is another interesting example. The Joint Photographic Experts Group owns the copyright, and yet anybody can use the format.
There's been a lot of confusion about the letter SCO sent to companies using Linux. The letter never told the recipients what actions to take.
There are two reasons for that letter. One is to get publicity over their claim. The other is that, under the intellectual property laws, if you claim copyright infringement, you must pursue the claim-or you may very well lose your rights. Numerous companies, when they think something has been infringed, will be very careful not to send a cease and desist letter. They will just notify people in a general way that they have these rights, and that companies may be infringing on them. In this way, they are not forced to bring a suit right away, but the implication is there.
Does that mean one could look at the letter and say this has more to do with the IBM lawsuit than with the recipient?
Yes. Exactly. Because if they really thought that the other people were infringing, they probably would send them a cease and desist letter. But once you send that letter, you have to file suit within a reasonable time, or you'll lose your rights.
How do you prove theft of source code?
Determining whether source code is stolen or derived from another's work is a bit like enforcing a musical copyright. It's hard to do. It's often a matter of finding bits and pieces from the first work in the second, or vice versa. The problem is, you rarely find whole sections or pages of identical code. The copyright law says that infringement occurs when there is "substantial similarity." That is what you have to prove. You do not have to prove that it is identical.
If the code performs a standard task but is written in a unique manner that resembles the original, then you begin to see substantial similarities. That is why comments become important. You don't have to put comments in code for it to work, and the comments don't have to follow a rigorous syntax. Comments become critical to proving plagiarism because every programmer has a personal style for writing them.
When you claim infringement, you look for similarities that don't have to be there for functional reasons, but are largely there because that's the choice the author of the code made. Coding is like writing an historical novel: part of the material has to be there because of the actual events-the functions the computer must perform-but most of it can be arranged and written in several different ways. If the code is similar for non-functional reasons, it is likely to be stolen or infringed. That is what you look for in an infringement.
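The point about comments can be made concrete with a small Python sketch. It is an illustration only, not a forensic tool: the function names and the two code fragments are invented, it assumes C-style comment syntax, and the regular expression is naive (it would be fooled by comment markers inside string literals). It pulls the comments out of two source texts, normalizes spacing and case, and reports the ones that appear in both.

```python
import re

# Naive pattern for C-style block comments and line comments.
COMMENT_RE = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)


def extract_comments(source):
    """Return the set of normalized comment texts found in a source string."""
    comments = set()
    for match in COMMENT_RE.findall(source):
        # Drop the comment delimiters, then collapse whitespace and case.
        text = re.sub(r"^/\*|\*/$|^//", "", match)
        text = " ".join(text.split()).lower()
        if text:
            comments.add(text)
    return comments


def shared_comments(source_a, source_b):
    """Comments present in both sources, after normalization."""
    return extract_comments(source_a) & extract_comments(source_b)


if __name__ == "__main__":
    # Hypothetical fragments; the misspelling is deliberate.
    original = "/* Recieve the packet */\nint rx(void) { return 0; }\n"
    suspect = "int get(void) { return 0; } /*  recieve   the packet */\n"
    print(shared_comments(original, suspect))
```

A shared comment proves nothing by itself, but a misspelling that turns up in both files is the kind of non-functional similarity that invites a closer look.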