Thursday, March 28, 2013

A few thoughts from a school visit

First, I need to make clear that these thoughts grew out of my current visit to the Harvard Stats department, but they have nothing to do with the school choice I am about to make. This post contains a lot of mere speculation that occurred to me and that I found interesting. I am jotting it down not because I think I am onto something, but because I want to keep these thoughts (naive as they are) and look back at them when I have time.

When I was talking to Joe, he mentioned that he is skeptical of the claims machine learning people are making these days. They tend to claim that big data will solve everything and that their algorithms make no assumptions. If that were true, it would be almost a miracle. He commented that there always seems to be a trade-off.

I have been a skeptic as well. Coming from an econ background, I am always suspicious of claims like this, which sound too good to be true. The problem is that I know too little: many machine learning algorithms are quite opaque to me, and I have minimal intuition for them. It would be nice if one day I could understand them and make precise what kinds of implicit assumptions they make. To me, one CANNOT make any inference without making some assumption (this is not an original idea). There is always an efficiency-robustness trade-off. That machine learning could somehow break it seems implausible.
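
A textbook illustration of the efficiency-robustness trade-off is the sample mean versus the sample median. The sketch below is my own toy simulation, not anything from the conversation: the function names, sample size, replication count, and the 10% contamination model are all assumptions chosen for illustration. Under clean normal data the mean should have the smaller sampling variance; once a fraction of wild observations is mixed in, the median should win.

```python
import random
import statistics


def estimator_variances(draw, n=50, reps=2000, seed=0):
    """Monte Carlo sampling variance of the mean and the median.

    `draw` is a function taking a random.Random and returning one observation.
    Returns (variance of sample means, variance of sample medians).
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    means, medians = [], []
    for _ in range(reps):
        sample = [draw(rng) for _ in range(n)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return statistics.pvariance(means), statistics.pvariance(medians)


def clean(rng):
    """Standard normal data: the model assumption holds exactly."""
    return rng.gauss(0.0, 1.0)


def contaminated(rng):
    """Hypothetical contamination: 10% of draws come from a normal with sd 10."""
    scale = 10.0 if rng.random() < 0.1 else 1.0
    return rng.gauss(0.0, scale)


if __name__ == "__main__":
    for name, draw in [("clean", clean), ("contaminated", contaminated)]:
        var_mean, var_median = estimator_variances(draw)
        print(f"{name:>12}: var(mean)={var_mean:.4f}  var(median)={var_median:.4f}")
```

Neither estimator is assumption-free: the mean bets on the normal model being right, and the median pays for its insurance with extra variance when it is.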

For me, inference is an iterative process. A naive and simplistic description would be: we start with a problem. We think about the problem. We use our judgement to decide which assumptions to make about it. We start with these assumptions (models) and look at how well the data fit. When there is enough evidence that the models are insufficient, we think hard, switch to a different set of models, and do the inference again, until we think we have done a good job. There is a wonderful, wonderful quote from a paper by Blyth:

Dempster (1998) contrasted the ‘logicist’ and ‘proceduralist’ approaches to statistical modelling and inference. The logicist paradigm is concerned with reasoning about a specific situation under analysis, as opposed to the rote application and simple reporting of defined procedures...The logicist approach requires choices – necessarily subjective, driven by the judgement of the investigator – about which objective features to include or omit in the modelling formulation. Thus, careful assessment is required to determine which features should be included in any formal [] model.

My hunch is that the logicist's judgement cannot be proceduralized. (This is of course a very philosophical question, and I do not yet have solid backing for it, but that would be my guess if you pointed a gun at me and demanded an answer.)

A related thing we do is look for a "non-informative" prior in Bayesian analysis. I think this is both hopeless and pointless. The difficulty is well known to students of statistics: reparametrization messes things up, because a prior that is flat in one parametrization is generally not flat in another (with some thought, this is actually not that surprising). I mean, if you are a Bayesian, why shy away from the prior? A more fruitful question is how sensitive the results are to the prior; actually trying to find a good prior is helpful too.
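
To make the reparametrization point concrete, here is a minimal sketch in Python. Everything in it is an assumption of mine for illustration: the Bernoulli setting, the function names, the three candidate priors, and the made-up data of 3 successes in 10 trials. The first function shows that a Uniform(0, 1) prior on a probability p is far from flat on the log-odds scale; the second does a crude sensitivity check using the standard Beta-binomial posterior mean.

```python
import math


def induced_logit_density(phi: float) -> float:
    """Density on phi = log(p / (1 - p)) induced by a Uniform(0, 1) prior on p.

    Change of variables: f_phi(phi) = f_p(p) * |dp/dphi| = 1 * p * (1 - p),
    which is the standard logistic density, peaked at phi = 0.
    """
    p = 1.0 / (1.0 + math.exp(-phi))
    return p * (1.0 - p)


def beta_posterior_mean(a: float, b: float, successes: int, trials: int) -> float:
    """Posterior mean of p under a Beta(a, b) prior and binomial data."""
    return (a + successes) / (a + b + trials)


# Hypothetical candidate priors, chosen only to span "vague" to "opinionated".
PRIORS = {
    "flat Beta(1, 1)": (1.0, 1.0),
    "Jeffreys Beta(0.5, 0.5)": (0.5, 0.5),
    "opinionated Beta(10, 10)": (10.0, 10.0),
}


def sensitivity_table(successes: int, trials: int) -> dict:
    """Posterior mean of p under each candidate prior."""
    return {
        name: beta_posterior_mean(a, b, successes, trials)
        for name, (a, b) in PRIORS.items()
    }


if __name__ == "__main__":
    # "Flat" on p is not flat on the log-odds scale.
    for phi in (-3.0, 0.0, 3.0):
        print(f"phi={phi:+.1f}  induced density={induced_logit_density(phi):.4f}")

    # How much does the answer move with the prior? (made-up data)
    for name, mean in sensitivity_table(successes=3, trials=10).items():
        print(f"{name:<26} posterior mean={mean:.4f}")
```

With only ten observations the posterior mean moves noticeably between the two "vague" priors and the opinionated one, which is the kind of thing a sensitivity analysis is meant to expose.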

The thing is that, philosophically, there is no such state as holding a non-informative prior, then observing data and updating it. Priors have to be developed from ingredients other than the data, and our minds are very good at doing that. This cannot be proceduralized (or is at least very hard to).

On a certain level, the process of making higher-order assumptions is like developing priors. The difference is that we have more confidence in those assumptions, and for simplicity we act as if they are the truth, so there are fewer layers in our inference structure. I think our minds and hunches have a comparative advantage in doing that. (I am actually interested in reading up on Gödel's incompleteness theorems to see whether there are some insights there.)

Two loosely related comments:
1. I do not think highly of using computability and computational complexity from CS to study bounded rationality, though it is in fashion. The complexity of the same problem is different for the brain and for the computer. If this approach is to work, the best chance comes from psychology, where insights from its studies could be used to define a different kind of complexity.

2. A never-changing financial model is doomed. A lot of judgement goes into each model, and those judgements have to be re-evaluated and tinkered with constantly. For some history, see More Money Than God. I thus think physics and philosophy (at the other extreme) are much better preparation than CS for those who work in finance.
