uncertainty: 2016

Thursday, November 3, 2016

Prediction on Bitcoin

I strongly believe one should leave a track record for their own predictions. In this way, they can be evaluated without selection bias. I am going to put down my first prediction.

I believe bitcoin has no future. It is just a bubble.

I will not go into the specific reasons why I think this way (no intention to write a 10-page blog).

I first discussed this two years ago. but today, I even begin to develop the suspicion that this might have been a fraud all alone. Here is what I wrote on quora:

I believe there is no future for bitcoin despite its current growth. It is more like the housing bubble before 2007, and will head for a collapse. It is not a viable currency as it is currently implemented. I think this is a modern form of the Mississippi bubble or South Sea Bubble.
If this is the case, I suspect this might be a fraud altogether. The Satoshi might be aware of the dismal future of bitcoin. He blew the bubble and rode the bubble to make profits (early speculation in bitcoin is very profitable when mining a blockchain creates good money and the competition is not fierce). If the thing do ultimately collapse, having ones’ name associated with this grand fraud is not optimal.
I would love to revisit this post 20 years from now and see how all played out.

Wednesday, October 12, 2016

Harvard Dining service Strike, Google, Terrorism and American Revolution

The Harvard dining service has been on strike, and the dining service has been unable to accommodate all the students. At least for me, the lines are longer and the food is worse. What does this have anything to do with Google and American Revolution?

Collateral Damage

It is purely a conflict between the Harvard administration and the dining service workers. Then they decide to go on strike during the school year, not in summer. And it is extended. This decision is tactical: it brings about collateral damage. If they strike during summer, most of the students are gone, and Harvard has enough backup people to keep things going--the managers (who do not belong to the union) can work as regular employees, not to mention Harvard does employ some workers not from this union. So the disruption will be minimal. Without disruption, it will be a power match between the administration and the union. The union probably understands that the mighty administration might prevail, or at least, they possess limited bargaining power.

So the tactics is to impose some collateral damage. Make the students miserable. As the Harvard students are as vocal as they are pampered, they will sure be a powerful force. They might protest and exert pressure on the administration, if it works out. That is, if they do not direct their anger at the strike itself. This is not trivial, as I definitely heard highly critical complaints among students. The game is to bet on where the anger will ultimately go. With the current atmosphere if political correctness (which the Harvard Administration has itself inflicted) and the left-leaning culture (which the administration has a role promoting), the dining service is betting on the anger, at least ostensibly, will go against the administration.

I should note, that I do not want to make a value judgement on who is right. It is beyond me (actually anyone) to calculate a just wage. My interest is purely to focus on tactics.

It doesn't have to be

The key point here is that there does not have to be a collateral damage. It is all part of a plan. In fact, this key element is present in many events.

Google

Google ostensibly left China because it says it won't offer censored content. There are two types of content, sensitive content and unsensitive content. My bet is the unsensitive content makes up more than 99.9% of the content. By bundling these two contents, Google is imposing a collateral damage on ordinary users (who does not bother to look up sensitive content) and hope this majority will complain loud enough as to make Chinese government budge. To say Google's move is principle driven is naive beyond comprehension, as one only needs to look at how Google tampered with search results to unfairly compete with Yelp and etc.

Terrorism

There are conflicts between two political power A and B. A is too powerful, and B is on the losing side. So B attack A's civilian, so as to inflict collateral damage on them. The hope is to make them complain so that A will pursue a more isolationist policy and leave B alone.

Some might protest that by drawing the analogy between terrorism and dining service, I am painting a negative picture of the strike. No. The tactic is neutral. It can be employed by the good and bad. What is more, let me add more cynically, one has many terms for terrorism. When the victim is US, it is called terrorism. When the attack is directed towards victim in China, it is no longer terrorism (see example 1, example 2, etc.)

American Revolution

To balance out the negative painting (if you insist), let me add something to cheer the readers up. American revolution. Remember what Americans did after the Stamp Act and other acts that replaced Stamp Act? Yes, Boycott. The tactic is to inflict collateral damage on the business people, who like harvard's students, are highly vocal and exerts lots of political power.

Framing

All the events mentioned involve the same tactic: inflicting collateral damage where it does not have to exist. The hope is to make a third party angry, and they direct anger at my opponent. But anger can fly either way: when framed as terrorism, anger is directed at the terrorist; when framed as a principled movement, the anger is directed at the other party. The key is to frame it nicely and conceal the intention of inflicting collateral damage. Tactics is always useful in a world where people are stupid, but that stupidity (receptiveness to manipulation) makes the outcome of tactics unpredictable. It might very well back-fire. That is probably why pundits and people with an agenda likes to make analogy: if you make an analogy with terrorism, it looks pretty bad; if you make an analogy with American Revolution, it looks damn glorious.

Tuesday, October 11, 2016

Random thoughts on US Election

It is quite funny to watch the US election. quite enlightening to ponder how a robust republic, that is capable of emerging from the ashes and debts of the Independence war and the War of 1812, has now turned into a dysfunctional “democracy” and end up with two awful candidates.

What is telling is the the enthusiasm and blindness on both sides. The Clinton supporters, for example, are so against Trump that they are willing to overlook the dark sides of Clinton, which are many. They come up with stories like Putin is manipulating the US election so that Trump can be elected and destroy America. This is the most stupid argument I have ever heard.

First, one should look at the validity of the information leaked by wikileaks, regardless of its sources. If it exposes true problems, address the problem, not just blame and speculate about the sources. The problems are real, and people choose to ignore them. This mentality is awful. With such a partisan mentality of looking at problems exposed, the self-correcting mechanism that is claimed to be so exclusive to democracy, is missing.

Second, it is quite hypocritical. US is not behind anyone in influencing the politics in other countries. Sometimes through money (Marshall Plan being the most famous), sometimes through military support, or sometimes through revealing information (not all of them are true). The truth is, every country does influence the politics in other country, via diplomacy, trade, or military. This is not special, and it is quite insincere to give this any special significance.

Many people on the left view the other side as uneducated, credulous, or downright evil. The truth is far from that. Many of the people I know who is on the right, are well-educated (sometimes with a Harvard PhD in sciences). Many people on the right similarly dismiss people on the other side as thoughtless and arrogant pseudo-intellectual. Not true either. I know some incredibly smart and thoughtful person who support the left while cognizant of the problems with the Clinton. Arrogance is the exclusive asset for either side.

No wonder the politics can sink this low. I guess it will sink lower. The fact that someone with Trump's temperament can be a candidate is a symptom, not the problem. It signals the dysfunction of US politics. The real problem is the dishonesty of the system. Leftists think he is the problem and they never realize their own camp is filled with cancerous cells as well.

Saturday, October 1, 2016

The Elusive Quest for Best Practice

(foreword: I am actually thinking of moving to medium.com. Their comment in line function is quite appealing. I really hope to get more critisisms or comments. so I also published this on medium)

People love to find a best practice. Who can blame them? After all, we are confronted with so many problems, and they give us a headache. Wouldn't it be great if we can follow some best practice and solve all these problems, or at least mostly solve the problems?

Problems

In public policy, we see the problem of corruption and bad policy. Wouldn't it be nice if there is a system that could guarantee the public policy serves the people?

In banking, we see reckless moral hazard. Wouldn't it be nice if we bind all fund managers to some best practice and manage money soundly?

In research, we see the problem of careless or even fraudulent statistical analysis, and the ensuing non-credible results. Wouldn't it be great if we can whip researchers all into some best methodology, and take the 'con' out of everything?

Fake Solutions

In public policy, we were sold ideologies like democracy and Washington Censuns. We point to successful cases and ignore failed attempts, and say democracy can cure every country except those who cannot be cured. (borrowing from doctors before the start of Randomized control trials) Or we say, ahh, those are not real democracies, these and that. But face it, democracy is not a pancea. The Dupont story and Flint Water Scandal should give us pause: in a well-run society, discovering and exposing such deeds should not be so difficult. Or consider this, Ohio regulators rubberstamped the merger between Aetna and Humana (two of the biggest US health insurance company) and kept the decision secret for more than a month.

In banking, we are sold all kinds of ideas: capital requirements, Bessel II, Bessel II, triple A....The 2007 financial crisis speaks to their limits (or absurdity)

In research, we were sold ideas like instrumental variable, natural experiment, propensity score matching. These made us feel like we are in the land of Asymptopia (see Leamer's paper), where we churn out more non-sense, just in more complicated forms.

Problems with Best Practice

I used to think that looking for best practice is a fool's errand. Now, it feel it results more from laziness than stupidity. Intellectual laziness. It is the kind of wishful thinking that one can rest sound and tight, and never need to do anything, if one can find the best practice. But you never will.

Sometimes the best practice does not exist. Though people or countries may have similar symptoms, the underlying disease differ. What jump-started western economies (free market) could leave late-comers in a resource curse, while the opposite (industrial policy) actually enable Japan, South Korea, and others to leap ahead. Just as Dani Rodrik pointed out: Diagnostics before prescription.

More fundamentally, the problem with best practice is that it is a static concept: there is something set in stone. It might work well for handling physical objects, like atoms, or chemicals, but it won't work for people. People find work-arounds and adapt to new circumstances (this idea is at least as old as Lucas critique, but deplorably, economists (including Lucas himself) address this critique with a best practice: rational expectation framework).

Starting with some best practice is great. The problem is most people stop there. It is only a starting point, and is no substitute for real work, like constantly monitoring and continuous problem solving. It is not enough to let people vote, it is paramount to stay vigilant against any forces that stealthily swap the democratic spirit for democratic form.

"Yes, there are all sorts of problems with best practice, but, what is the alternative?" people ask, "Doesn't is prevent the worse from happening?" Maybe, but at a cost. It gives us a false sense of security, complacency or even over-confidence; it encourages us to join a wild party celebrating "the Great Moderation" without realizing the boat is sinking. We might improve a little bit, but also swept the remaining mess under the rug of best practice.

It ain't what you don't know that gets you into trouble. It's what you know for sure that just ain't so.

The lesser evil paves the way for the greater evil.

Running for Life

When I see all those researchers thoughtlessly implementing Inverse Probability Weighting (IPW) or carrying out a structural estimate, and present their results as credible. I cannot help thinking to myself it is another triple-A rated Credit Default Swap being sold. I should scramble for life.

Tuesday, September 13, 2016

What Can You Believe: the worrying rise of academic corruption

NYT published an expose on how the sugar industry paid three Harvard scientist to write a review paper that minimize the link between sugar and heart disease, but instead shifted the blame on saturated fat and heart disease. That paper was published in the prestigious New England Journal of Medicine (NEJM).

Academic Imperialism

I always felt that academic research exerted a huge influence on our daily life. Keynes once remarked:

The ideas of economists and political philosophers, both when they are right and when they are wrong are more powerful than is commonly understood. Indeed, the world is ruled by little else. Practical men, who believe themselves to be quite exempt from any intellectual influences, are usually slaves of some defunct economist.”

In fact, research in many practical disciplines other than economics influence people's life. They shape the policy or even our choices. When we read nutrition research, we are likely to change our lifestyle for the better.

For the better, if research is right. Of course, I admit even sincerely honest research cannot be always right. After all, conclusions rely on noisy observations, and won't be perfect. Hence, the perfectly rational way is to understand the imperfection of research and pool research conclusions together, and sort of "average" them out. Of course, we give more weight to research from prestigious journals or researchers from well-known institutions. That is what Bayes told us.

Academic Corruption

But, here is the rub.

Research is not honest. They are not just noisy. They are biased.

This is perfectly understandable. People conduct research, and when they face perverse incentive, their research is biased. Sadly, academic corruption is all too prevalent today.

Money

This is obvious. I am giving you money and you are expected to deliver certain results. NYT's article is one such example. The documentary Inside Job discusses such corruption in Economics (note: The documentary's depiction is not entirely accurate, but it is on to something.) The Flint water accident also reveals how good science was filtered out by money. The hero Edwards, remarked:

The pressures to get funding are just extraordinary. We’re all on this hedonistic treadmill — pursuing funding, pursuing fame, pursuing h-index — and the idea of science as a public good is being lost.

Career

This might sound strange, but a researcher's career might be in jeopardy if he refuses to cooperate. Consider this, you are a professor/PhD student in a business school. To do research, you need data. You go to a company for data. After much haggling, you get your data. Now you do the research. You found something bad about the company. Will you publish it? Even if the company has signed a contract with you that enable you to publish the research no matter what you find, you might still think twice. After all, you have a "reputation" to maintain. You need more than one paper for your career. Do you want to haggle with another company for data in the future and spend tons of time figuring out the details of that new company? (Empirically, not many people choose to do that. Most researchers focus narrowly in one subfield before getting tenure.) Furthermore, if other company know that you published something negative with the data you get from the previous company, will that company be happy to give you data? These considerations, render the disclosure rule useless.

On a sublter level, one needs to realize the power wielded by established researcher. They sit on grant committee (whether you have money to start research), and they sit on the editor board (whether you can publish your research to get tenure). The industry usually pay such established researcher to rubber stamp a paper (one often wonder how much work they did in the write up), and young researchers will be hesitant to challenge the "old guys".

The Consequences

In the beginning I mentioned how a rational person will pool the research results, and weigh research from more established institutions and journals. The NYT expose sent a bad news. NEJM is a top journal, if not the top journal. It is a journal even I, who does not do research in this field, check often. Harvard is a pretty famous institution. Both combined, cannot withstand the power of money.

People might tell me things are getting better. We have better disclosure rules. Yet, I remain skeptical. I have seen perverse incentives under these disclosure rule (as I mentioned). I hesitate to cast a vote of confidence.

It is ultimately a game we play. The corporations (money), us the citizens, government, and researchers. People get outraged when they discover something awful, and they demand change. As it is true in banking regulations, change in rules can be circumvented. What is more awful, the money learns. They learn to change the game, instead of just playing the game. They learn how to prevent leakages (by making things such complicated and opaque), and they learn to prevent meaningful change (regulatory capture). I still remember one quote from a game theory class Jerry Green taught: " many researchers are studying how players can change the structure of the game to improve what they can get out of the game. It is an interesting research question". I agree, and yet I felt it is also a depressing research question.

Tuesday, August 9, 2016

The Gold Standard for What?

If you know anything about causal inference, you probably know that randomized trail is the gold standard. Period?

That is very incomplete. If you want to answer the question how effective penicillin can fight infection, doing a randomized trial is the gold standard for answering that particular question. Period.

However, things get tricky when we have a different question. How effective is a Fosamax in promoting bone health? For beginners, it seem a no-brainer. Do a randomized trial, and measure the bone density at the end of the trial. Oops, you are changing the question. We were talking about bone health, not bone density. Bone density is a surrogate for bone health. In fact, research discovered that Fosamax might make bones more fragile while at the same time increasing bone density. This phenomenon is called surrogate paradox. Excluding surrogate paradox is not easy, see the paper (full disclusure: my girlfriend is a co-author). What about measuring bone health directly? It is hard. You cannot take someone's bone and apply pressure to it and see what is the maximum pressure under which it won't break. Measuring bone density is the best you can do for randomized trials. Running a randomized trial is the gold standard for answering the question about bone density, it might not be the gold standard for answering the question about bone health.

Randomized trail can only answer a certain set of questions. One should be able to manipulate the treatment in a controlled setting.

The fetish with randomized trials is dangerous. It makes research question subservient to methodology. We move towards research questions that is admissible to randomized trials. We could be missing important questions.

Development is an interesting topic. When I think of it, I always think of industrial policies that make the economy really take off. These are of first order importance. However, lately, research in this area becomes poverty-relief program evaluations. Does giving an egg daily to the poor help them earn more money? People should care about such questions, but ultimately, people should ask is there something we can do so that an African country, could like China really experience industrialization, and lift its people out of poverty en masse and permanently?

Another area is education. "Active learning" is the fad. People like seminars not lectures. Research show them to be more effective. Whatever. When it comes to education effectiveness, the format of class is not independent of many other factors, like the mentality and incentive of the student and the quality of the teacher. They are hard to measure and manipulate in a lab setting. So we take them out. Maybe, with a good teacher and motivated students, a lecture is more effective than a seminar format, while with crappy teacher and unmotivated student, seminar is better? Manipulating teaching format is easy, a low-hanging fruit. Picking low-hanging fruit does not help us find the root cause of our problems (bad teacher and wrong incentive).

Once you choose a question, if randomized trial is an option, do it. That is the gold standard. But if you change the question so that you can do a randomized experiment and pretend to have answered the previous question, I am not sure it is the gold standard anymore.

I wrote about gold standard. But the problem of subordinating research question to methodology is more general. For methodological superiority, we answer question B and pretend we answer A. Gold standard, silver standard, whatever.....

Let me end with a story: I have a servant. I ask him to do things for me. I asked him to do the laundry. He said yes and did it. I asked him to mop the floor, and he told me "actually, I am not good at mopping the floor, let me do something else." ...now, every morning I get up, I call my servant, and ask him politely, "sir, what can you do for me?"

Saturday, April 30, 2016

Why I applaud EU's Anti-trust investigation of Google

Last year, Europe opened an investigation into Google's practice in favoring its own content over its rivals like Yelp, in its attempt to extend its dominance in general search to specialized search. (see WSJ). Recently, Europe started a new investigation into Google's monopolistic behavior. This time, the concern is on Android, and whether Google unfairly uses its dominance in smartphone market to unfairly promote its mobile search business. The practice flagged including bundling the Google Play Store with Google search, Google Map and other Google's own services (See WSJ, NYT, Guardian and Reuters).
From my observation, there has been much hostility towards' Europe's investigation. They felt that Europe is acting aggresively because Google is an American firm. Maybe. Regardless of the motivation, I think Europe did a great thing. Not just for Europe, but for consumers around the world.

What is so unfair about Giving things away for Free

Many people point out that Google is offering Android for free so why complain? When DOJ charged Microsoft of anti-trust behavior, one central concern is Microsoft's unfair competition in the "browser war". Microsoft was giving away Internet Explorer for free as well. Giving things away for free does not mean no unfair competition exists. In fact, software does compete for the limited hardware space on the desktop then, and the tablet today. To the extent that many apps are willing to pay to get their apps pre-installed by OEM or set as the default app, Google's behavior now, and Microsoft's behavior then, hurts consumers in the form of higher hardware cost.

It is another mistake to think that we should be gratified that Google made a great product like Android and give consumers a choice other than iOS. Without Google, there would be other products. Ubuntu (like Android, is also based on Linux), and Microsoft, or even Blackberry could come up with feasible alternatives. One might point out that these products suck now. Yes, that is precisely because Android's dominance. Due to the network effect, the smartphone market could only support a couple of operating systems. Other OS, would not command enough users to attract app developers and would have a hard time taking off. If Google had not made Android available, one of the aforementioned OS would take off, though it might take them a bit longer to figure out the smartphone market.

The Long term concern

I think what really concerns me is the long term trajectory of Google. Everyone thinks of Google as an innovative firm now. There is lots of active PR behind this perception, but there are lots of truth as well. However, we should not take this for granted. AT&T in the 19th century was an equally, if not more innovative firm. (See a list of achievements). Eight Nobel prizes were awarded for works done at Bell Labs, AT&T's subsidiary. It developed a wide range of revolutionary technologies including radio astronomy, the transistor, the laser, information theory, the operating system Unix (to which Android owes), the programming languages C and C++. Its record would make many fund-eating state universities ashamed.

It did not end well. The government was not willing to be aggresive toward the monopolistic behavior of AT&T. Its anti-trust act was relatively weak (see Kingsbury Commitment). But AT&T's (or Bell Ma, as it was called) monopoly needed stronger hand. Finally, AT&T was broken up into seven baby Bells in 1984. Even after the break up, the baby Bells dominated the market, and you can still see them today. Bell Atlantic and NYNEX became part of today's Verizon; Pacific Telesis and Southwestern Bell became part of today's AT&T.

Even today, the lack of competition, the wireless telephone service remain among the lower-scoring categories in terms consumer satisfaction, along with health insurance and airlines, according to ACSI (American Consumer Satisfaction Index). It could be much worse, if AT&T's monopoly was not dealt with aggresively. The worst companies in terms of consumer satisfaction are Time Warner Cable and Comcast, the Internet service providers. The reason is simple: according to ACSI report:

Wireless telephone service, however, is a more competitive industry than Internet service, and most customers can choose from four major national carriers plus at least one smaller regional or prepaid carrier. Indeed, the aggregate of smaller wireless companies shoes the highest customer satisfaction by far with a 1$ uptick to a combined score of 79. [Compared to Verizon's 71 and AT&T's 70]

Going Forward

Google innovated greatly. We should not take that for granted. If we allow its anti-competitive ploys to continue, we may turn Google into today's AT&T or worse, Comcast, the parasites of the society.
Even today, we do not know if Google would be forced to be more innovative if the anti-trust regulators are more forceful and vigilant. Even during the innovative years of Bell Labs, Bell lab suppressed innovations such as answering machines, mobile phones, and FM technology to protect its monopoly.

Saturday, February 20, 2016

A Grand War for A Small Payment

A willingness to pay to use a copyrighted material creates a potential for copyright holder to reap financial benefits. The payments, however, sometimes could be so insignificant (de minimis), that the cost of collecting them dwarfs the payments themselves. Copyright holders have long relinquished such de minimis payments. However, as I will document in the first part of the blog, technology has enabled copyright holders to collect such de minimis payments with lower or negligible cost. In the second part of the blog, I will explain the apparent paradox that copyright holders are willing to wage a grand war for de minimis payments.

De Minimis is obsolete

One source of missing de minimis payments is small scale “infringement”. Some infringements, like distributing a chapter from a book for classroom usage, are allowed under fair use. Additionally, as a practical matter, copyright holders will not choose to pursue de minimis infringement, even if it is illegal---“the recovery might be de minimis, so that no one have any incentive to sue.”[1] Even if a user is willing to pursue a license, there is not enough money for the licensing agent to manage such requests.[2] Technology has made collection of such de minimis payments cheaper. Consider the case of YouTube, where Content ID automatically detects matching between User Generated Content (“UGC”) and copyrighted material, insignificant the copying might be. In the past, copyright holders will never discover small infringements, or even when they do discover, they would not go after such small infringements; now technology has allowed them to monetize, or block such de minimis infringement. (The fact that YouTube is able to take a large share of monetized value is a separate issue: it concerns Google’s unfair monopolistic behavior.)

Another source of missing de minimis payments is the limited commercial life of copyrighted material. Most books, for example, have a very limited commercial life. Though they are still in copyright 60 years after publication, it is most likely that they are out of print. Though there are still sporadic demands for such books, authors were never able to reap such benefits due to the high fixed cost of printing---it is economically infeasible to print a book for a small number of copies. Now consider the settlement of Author’s Guild vs. Google. Books that has reached its end of commercial life, are now infused with a new commercial life: readers can purchase digital access to out-of-print books; institutions can buy access to Databases containing out-of-print books; and advertisers can pay to place ads on Google Book Search. All these payments will be split between Google, publishers, and authors, with the majority (70% of net profit) goes to publishers and authors. In summary, technology has enabled publishers and authors to monetize in ways that were not possible in the past.

Why Fight a War over De Minimis

It is revealing to note how much effort publishers and Authors’ Guild (AG) put into collecting such de minimis payments, every penny of it. First, the whole negotiation took two and a half years. Second, the ultimate settlement went into great lengths to insure that publishers and AG get every penny out of it: From the determination of optimal prices, to running “Google Tests” to choose the best preview modes. The goal is simply to maximize “sales and revenues”. They did leave some consumer surplus on the table, but this is not inconsistent with monopolistic profit maximizing behavior (except in the case of first degree price discrimination, which is only a theoretical curiosity, profit maximizing monopolies will not be able to extract all consumer surplus).

It is misleading to call such payments de minimis. It is not de minimis for big publishers and AG, who will benefit from a huge collection of books: though payment on each individual books is small, it adds up. However, it is de minimis for individual authors. How many digital access can an author of an out-of-print book realistically sell each year? When the advertising fee, after publishers and AG take a cut, is divided among so many copyrighted material, how many cents can an author realistically expect? This potential income is de minimis in another sense: from an ex ante point of view (before the author writes the book), such incomes, arriving so many years later, will be heavily discounted. Assuming a 5% discount rate, payment after 28 years (original term of copyright) will be discounted by 76.3% (one dollar is valued at 0.23). Thus, strengthening copyright protection to enable such de minimis payments to be extracted serves little to ex ante incentivize authors.

This mismatch between the insignificant ex ante incentivization and the whopping eagerness to extract the payments is not unprecedented. It was present when Congress extended the term of the copyright. Time inconsistency played a key role: Revenue from copyrighted material 70 years later might be a negligible 0.01% of the present value of a copyrighted material ex ante; from the perspective of 70 years later, it is a gigantic 100%. Owners of expiring copyrighted material have every incentive to extend their monopoly. Here in Google Book Search case, there is an additional problem. The key players shaping the policy, publishers and AG, not only fail to discount the revenue due to time inconsistency, but also benefit from aggregating de minimis payments. It is wealth from a thousand cuts.

[1] Goldstein, P. (1994). Copyright's highway: From Gutenberg to the celestial jukebox. Stanford: Stanford Law and Politics. P96

[2] http://zoekeating.tumblr.com/post/108898194009/what-should-i-do-about-youtube

Wednesday, February 10, 2016

Copyright Infringement and Type I/II error

“It is more important that innocence be protected than it is that guilt be punished, for guilt and crimes are so frequent in this world that they cannot all be punished.”

--John Adams, defending British Soldiers involved in Boston “so-called Massacre”

Motivation

When one thinks of copyright protection, one thinks of barring certain behaviors that violate copyright, like massively duplicating a movie and offer copies for sale. That view and impression is outdated. Starting with Universal City Studios Inc. v. Sony Corp. of America, copyright holders have increasingly used copyright protection as a vehicle for shaping product design, devices or platform. They exert influences in three ways: litigations (Universal City Studios v. Sony Corp. of America; Cahn v. Sony; Viacom v. YouTube.), regulations (Audio Home Recording Act; Digital Millennium Copyright Act), and agreements formed by private parties (User User Generated Content Principles). This shift in focus has shifted the power of determining and executing copyright disputes from the court to private parties. The result, so far, has been dire. This blog will focus on the agreement that shaped today's User Generated Content services, like YouTube.

User User Generated Content Principles(“UGC services”), agreed to by leading commercial copyright holders and UGC services, set the current framework for UGC services. Within this framework, copyright owners provide reference data (video footage or audio track) to which they “believe” to own copyright. Then UGC services then employ some matching technology to identify content that matches key elements of those reference data. Finally, UGC services will “block such matching content before that content would otherwise be made available”. The rest of the blog will focus on the problems of this framework, exemplified by YouTube’s Content ID system (“the System”).

Contrast this with a court handling a copyright case. The plaintiff has to prove two things: 1) that he has the right to the relevant portion of the work in question and 2) that the defendant has violated one of the “exclusive rights”. After that, 3) only if the defendant fails to come up with an affirmative defense, the court will issue injunction or grant damages. However, on YouTube, the System effectively becomes the “judge, jury, and executioner” of copyright disputes. In the rest of the blog, I will first point out the fallibilities of the System with respect to the three elements and the problems arising from that. Then I will propose a framework for thinking about this issue.1.

Fallibility of the System

Copyright ownership.

This framework swaps the burden of proof with “good faith”.[1] This low standard of “proof” enabled fraudulent copyright claim. Additionally, even when one has a legitimate copyright claim to some content, he does not necessarily own every single part of the content. For example, when media companies broadcast some news, they will use some material in public domain, that is material with no copyright owner. The media companies have copyright to the overall news broadcast or comments, but not to those public domain footage. Everyone, should be free to use those footage. However, the System is not able to distinguish which part of the reference data is copyrightable, and when it finds other videos matches the portion of public domain material, it automatically makes the infringement judgement. This is exemplified by YouTube taking down NASA’s Mars landing clip and Lon Seidman’s video discussion of the landing.

Existence of infringement.

The matching technology is, in the first place, imperfect. As any statistical procedure, it inevitably makes “spurious matching”. For example, a YouTube user called eeplox uploaded a video with only sounds of nature like bird calls. The System matches its audio track to a composition licensed by Rumblefish, a music licensing firm (Wired report).

Affirmative defense.

Whereas the System is fallible in the first two dimensions, it is incapable of respecting fair use. Works involving fair use, such as parody and critical review, usually remixes or resemble copyrighted material. However, the System does not understand fair use, and automatically finds infringement. A most ironic example concerns Prof. Larry Lessig, who posted a lecture video on the cultural importance of remixes. The system muted the lecture, because it uses excerpts of music owned by Warner Music (report on EFF).
Given the inadequacies discussed above, the power yielded by the System is huge. It can and often does automatically and block the content. One can dispute the block, but the accuser gets to block the video for 10 business days for free. This is in stark contrast to the way court handles copyright infringement—unless copyright ownership and infringement is proven, and fair use defense fails, the content would not be blocked for a single second.

A Framework

When making judgements, any system make both Type I errors (falsely classify legitimate material as infringement) and Type II errors (fails to catch infringing material). One trades off these two errors by making the system to be either more stringent (more type I errors and fewer type II errors) or less stringent. A rational decision maker should weigh the costs of making these two types error and adjust the stringency of the system accordingly.
What is the cost of type I error? Stifling creativity and censorship. What about type II error? Possibly loss of revenue to copyright holders. However, as is clear from the US constitution, the purpose of copyright protection is to give copyright holders some rewards to incentivize them to do creative work. Copyright protection is a means to an end (stimulate creativity) not an end in itself. Copyright holders are not categorically entitled to all possible rewards. Type II errors are a cost for copyright holders, but not necessarily to the society at large. The real cost of type II error emerges only if the erosion of reward fails to incentivize creative work. It is not obvious creator are primarily motivated by financial incentives (though intermediaries like media corporations certainly do): In fact, some musicians willingly give up their financial rewards: allow their fans to stream or download their works for free (see Zoe Keating's complaint of YouTube). Furthermore, psychology research has pointed out that financial incentive could demotivate creativity. Thus there are few arguments for tipping the balance in favor of over-vigilance, and creating a “copyright scare”. I will come back to this issue in a later blog.

Concluding thoughts

Can one expect private parties to sort out the issue?

No.

YouTube always adjusted their practice to serve their business priorities, not public interest. In the past, when it was optimal to attract infringing material to mobilize their service, it chose intentional negligence. Now, its business priorities have shifted. It want musicians to make all of their music available on YouTube first. To do this, YouTube bundled this commitment with the privilege of the System (of near perfect copyright protection) (see Zoe Keating's complaint of YouTube).This privilege, comes at society’s cost (over vigilance), but the profit it generated is reaped by YouTube.

Note:

This change in business priorities might seem strange at first sight, so it deserves a careful explanation. In the beginning, when YouTube started, big media companies were unwilling to to put their contents on YouTube or online in general. For one things, they fear increased piracy risks. Second, they are not willing to let some third-party be cut into the distribution business, interrupting their vertical integration (an anti-trust issue, but no time for it now). Third, YouTube was just a small start up then, and there is no way big media companies will want to sign any contract with YouTube. Given the three reasons, it will be impossible for YouTube to get those contents in any legitimate way. The only solution is to let users upload pirated material so that YouTube can attract users. In fact, one of YouTube founders himself uploaded such material.
Now, media companies take it as a given that many media contents will be distributed online by third parties. The business question becomes how to attract them onto my platform. Due to network effects, a platform with the best and most inclusive content will attract the most viewers, and this in turn make the platform more attractive. Google is a big players, but so is its competitors, like Amazon. As a result, Google has adopted carrot (over vigilance on copyright if you join us and have content ID) and stick (negligence if you do not join us) tactic.

[1] Paragraph 3(g) of Principles for User Generated Content Services.