The first critique and its reply are addressed to a general audience, while the second is more technical and abstract (a euphemism for "boring to most people"), in that it connects the problem to the problem of statistical inference.
I sincerely thank my two great friends for their input, because you made me think harder about the question. I recently saw the following statement on a blog:
"I love to argue one on one, and common beliefs are not important for friendship — instead I value honesty and passion." I think it is fitting to put it here. I still cling to my belief, and really appreciate that you are there to challenge me.
I will address two critiques:
Critique 1:
So would you say Rawls's argument would be favored by someone who is extremely risk averse? If the axioms of rationality don't hold, then your argument wouldn't hold. The stock example that you use has such a small loss that it makes the rationality argument seem more reasonable than it really is. What if it is like this: stock A: 0 all the time; B: 90% chance of 100, 10% chance of -800. Or even: A: 0 all the time; B: 10% chance of 900, 90% chance of -9. The axioms of rationality surely don't describe how we actually make decisions. But I think it isn't even necessarily the way that we SHOULD make decisions, because in this case there is reason to be extremely risk averse; because of the human beings that we are, equity could be an intrinsic good, such that the huge disparity in outcomes would discount the expectation that you take. But the point you made about ex ante and ex post is very illuminating for me.
Reply:
The von Neumann framework does accommodate risk aversion. Indeed, it was developed to accommodate risk aversion when people tried to analyze decision making in gambling and found the concept of expected value incapable of dealing with it. The stock examples you give can be analyzed in this framework. Depending on the parameters, we will get different prescriptions, but I think this is a strength, in that it shows the framework is rich enough to allow different degrees of risk aversion. Of course, some degrees of risk aversion are reasonable and others are not, which can be revealed by the decisions people make.
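To make this concrete, here is a minimal sketch of that analysis (my own illustration, not part of the original exchange), assuming CARA utility $u(x) = -e^{-ax}$ with a hypothetical risk-aversion parameter $a$, applied to the first stock example above:

```python
import math

def cara_utility(x, a):
    """CARA (exponential) utility; a > 0 is the coefficient of absolute risk aversion."""
    return -math.exp(-a * x)

def expected_utility(lottery, a):
    """Expected utility of a lottery given as (probability, payoff) pairs."""
    return sum(p * cara_utility(x, a) for p, x in lottery)

stock_a = [(1.0, 0.0)]                    # 0 all the time
stock_b = [(0.9, 100.0), (0.1, -800.0)]   # 90% chance of 100, 10% chance of -800

for a in (0.0001, 0.001, 0.01):
    better = "B" if expected_utility(stock_b, a) > expected_utility(stock_a, a) else "A"
    print(f"risk aversion a = {a}: prefer stock {better}")
```

A nearly risk-neutral agent prefers B, whose expected value is $0.9 \cdot 100 + 0.1 \cdot (-800) = 10 > 0$, but even mild risk aversion flips the choice to the safe stock A; the framework accommodates both.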
Over the years, this framework has been examined with various further restrictions---for example, that utility should be bounded so that we do not run into some version of the St. Petersburg paradox. This is an example where our intuition about risk aversion helps us refine the framework, in this case by putting a lower bound on risk aversion.
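To recall why boundedness matters: a gamble that pays $2^n$ with probability $2^{-n}$ has infinite expected value,
\[
E[X] = \sum_{n=1}^{\infty} 2^{-n} \cdot 2^{n} = \sum_{n=1}^{\infty} 1 = \infty,
\]
so an agent with unbounded (e.g., linear) utility would pay any finite price to play; bounding utility rules such conclusions out.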
As for the axioms of rationality: if you look at them, they are very primitive---so primitive that it is shocking to find empirical evidence that people violate them when making decisions; we cannot even pass such primitive requirements of rationality. But as I commented, that evidence points to the irrationality of human beings, not against the logical coherence of the framework. It is a perfect model for how decisions should be made, though a crappy model for how decisions are actually made.
Critique 2:
I was trying to argue that the veil of ignorance extends to the distribution with which one takes the expected value to get expected utility. Analogizing to Bayes, when you don't have an actual prior, then you should do minimaxity, and I think minimaxity will lead to Rawls's conclusion. When you say "aggregate welfare ex post", do you mean maximize over $\theta$ given the observed distribution of types in the world? Say that I am of type $i^*$, but this is unknown to me, so I assume that $i^*$ follows the observed distribution of types in the world (e.g., just tally how many rich/poor people there are in the world). I choose $\theta$ to maximize my expected utility. Is this what you mean by maximizing aggregate welfare ex post? But I think Rawls is considering the following situation, which is slightly different: I am of type $i^*$ (again, I don't get to observe this), but this time there is a veil of ignorance so that I do not get to observe the distribution of types (pretend someone deleted the data on how many rich/poor are in the world). I can't do expected utility maximization in this case. Depending on whether data on rich/poor is available, this could lead to drastically different outcomes. Suppose that I am told most people in the world are rich; then, given this observed distribution, I'd choose $\theta$ to benefit rich people, since most likely that will benefit me. But if I need to make a decision without knowing the relative numbers of rich and poor, I'd make sure that everyone is somewhat well off under my decision rule.
Reply:
Despite its resemblance to minimax (in fact, Wikipedia cites the Rawlsian idea as an application of minimax), I will argue that the resemblance is only superficial and that the logic does not carry through. So I will formalize the problems of inference and justice in such a way as to make their structures parallel.
Inference:
We have a type $i|\alpha\sim F_\alpha$. That is, $i$ is the unknown parameter, and its distribution comes from a family of distributions (priors) indexed by $\alpha$. But we do not know which prior it is; that is, we do not know $\alpha$.
We are looking for a rule $\theta$ according to which we can do inference. Every time we act according to the rule, we incur some loss, captured by the performance function $U(\cdot)$ (the negative of the loss). Since the performance of the rule also depends on $i$ itself, $U$ is a function not only of the rule $\theta$ but also of the type $i$.
As remarked, $i$ is a random variable, so how $\theta$ performs in expectation depends on $\alpha$; that is, the quantity we care about is
\[
E[U(\theta,i)|\alpha]
\]
We would like to maximize this quantity, but we do not know $\alpha$. So we choose the rule that gives us the best result in its worst-case scenario. In essence, we are applying the minimax principle to $\alpha$, the parameter that determines the distribution, not to $i$, the type we are interested in.
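In symbols, the rule chosen this way is
\[
\theta^{\text{minimax}} = \arg\max_{\theta}\; \min_{\alpha}\; E[U(\theta, i) \mid \alpha].
\]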
I have taken some time to set up the problem of statistical inference slowly, because then I do not need to repeat the setup for social choice. We are facing the very same problem: maximizing
\[
E[U(\theta,i)|\alpha]
\]
We can apply the minimax principle to $\alpha$, but not to $i$. What Rawls argues is that we should apply the minimax principle to the type $i$, maximizing the welfare of the worst-off individual, rather than to $\alpha$, maximizing general welfare in the worst kind of society. This is, in my view, unjustified. While I acknowledge the logic of the minimax estimator in statistical inference, that logic does not justify Rawls's argument, which I believe is flawed.
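In the same notation, Rawls's prescription takes the worst case over individuals rather than over priors:
\[
\theta^{\text{Rawls}} = \arg\max_{\theta}\; \min_{i}\; U(\theta, i).
\]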
Where does my utilitarian principle lie? It still fits this framework, and it has a deeper connection to statistical inference than I thought. Since this is not the focus, I will be brief; it suffices to say that a utilitarian rule is always admissible (in the sense of Bayesian statistical inference).
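For reference, a rule $\theta^*$ is admissible here if no other rule weakly dominates it: there is no $\theta'$ such that
\[
E[U(\theta', i) \mid \alpha] \;\ge\; E[U(\theta^*, i) \mid \alpha] \quad \text{for all } \alpha,
\]
with strict inequality for some $\alpha$. This is the standard sense in which Bayes rules are admissible.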
Can we still reach the optimal utilitarian rule under this extension of the veil of ignorance? The answer is yes.
First, I state the obvious: if people have the right prior (they know $\alpha$), they should choose the rule according to the utilitarian view, that is
\[
\theta^* = \arg\max_{\theta}\; E[U(\theta, i) \mid \alpha]
\]
Now suppose they do not have the prior. They would like to choose the same rule, but they cannot. However, they observe that they do not have to choose "the rule" now: they can postpone choosing "the rule" until after the veil is lifted. All they need to do now is choose a rule for choosing "the rule" once the veil is lifted. After the veil is lifted, they will observe $\alpha$. They stipulate that the rule chosen after the veil is lifted will be the same rule they would choose now if they had the actual prior. In essence, they stipulate that the rule chosen after the veil is lifted will be in accordance with the utilitarian principle. Put another way, they choose a family of (utilitarian) rules $\theta_\alpha$, and commit to implementing the rule $\theta_{\alpha_0}$ when they observe $\alpha=\alpha_0$ after the lifting of the veil.
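To illustrate the two-stage scheme, here is a toy numerical sketch; all functional forms, numbers, and names (`utility`, `utilitarian_rule`) are my own hypothetical choices. Take $\theta$ to be a redistribution level and $\alpha$ the fraction of rich people; the meta-rule maps each observed $\alpha$ to the utilitarian rule $\theta_\alpha$.

```python
import numpy as np

# Toy model (all functional forms and numbers are hypothetical):
# theta in [0, 1] is a redistribution level; alpha is the fraction of
# rich people, observed only after the veil is lifted.
thetas = np.linspace(0.0, 1.0, 101)

def utility(theta, is_rich):
    # The rich lose from redistribution; the poor gain, with diminishing returns.
    return 1.0 - theta if is_rich else np.sqrt(theta)

def utilitarian_rule(alpha):
    """The rule theta_alpha: maximize expected utility under the prior alpha."""
    eu = alpha * utility(thetas, True) + (1 - alpha) * utility(thetas, False)
    return thetas[np.argmax(eu)]

# Behind the veil, agents commit to the meta-rule alpha -> theta_alpha;
# once alpha is observed, the corresponding utilitarian rule is implemented.
for alpha in (0.2, 0.5, 0.9):
    print(f"alpha = {alpha}: implement theta = {utilitarian_rule(alpha):.2f}")
```

In this toy model, a society with few rich people ($\alpha=0.2$) redistributes fully, a mixed one partially, and a predominantly rich one barely at all, which connects to the point about different societies below.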
Finally, from my point of view, I am not in favor of extending the veil of ignorance to the distribution of types as a thought experiment. To me, Rawls was more concerned that knowledge of our own relative position in the type distribution will bias us (the rich do not want too much welfare state). Thus the veil of ignorance is meant to avoid bias between the relatively well off and the relatively worse off within a society, not to avoid bias resulting from knowledge of differences across societies. I now feel I should argue this point more forcefully, because it has implications. If we think the veil of ignorance should extend over the distribution of types as well, then all societies should have one and the same just rule. But if we only extend the veil of ignorance over types, then we allow rules to differ across societies. For example, in a society where everyone is already so rich that they live comfortably, there should be less redistribution; while in a society where only some live very comfortably and others live miserably, we might need to be more aggressive about redistribution.