## I was right!

The idea for this post popped into my head through a series of connecting memories; IOW, I was doing something that triggered a memory and I quickly connected that to another idea/memory and then again. It’s amazing how fast one’s mind can jump across time (history) and ideas, starting at one place and ending up somewhere very different.

What the F are you talking about, you say! Well, I’ll get there and maybe it’s even interesting.

I was working on my vocabulary app and encountered a situation where normal statistical analysis didn’t seem to make sense since my dataset was changing as I was interacting with it (and extracting some statistics). I don’t know of any “theory” that handles that case. But I realized, as I often do, I could address this with simulation, i.e. a Monte Carlo type approach. And eventually I’ll get back to how this is the point of this post.

For instance I recently wrote a little simulation of a problem that I believe the theory (abstract math, lots of scribbling strange equations) couldn’t solve. It comes from a simple trivia game (like, but not, Trivial Pursuit). There are six categories of questions, selected by a roll of the die, and you have to get three correct answers to complete each category and you “win” when you complete all categories (IOW, 18 correct answers).

Now let’s say, on average, I know the right answer to 50% of the questions. Does that mean that, on average, it will take me 36 questions to get the 18 right answers? Well, not exactly – remember that roll of the die to select the category. Sometimes the die will require me to answer a question in a category where I already have all three answers, IOW, an extra question not getting me closer to the final result (a correct answer does get me an extra turn, but that relates to how fast I might win, not how many questions I have to answer).

Now in case this isn’t clear, think about a simpler case: two right answers in each of two categories with 100% knowledge. Got to get four correct answers, so only four tries – right! No, wrong. If I get two correct answers for one of the categories, but then need to complete the other category, the random selection of category could send me back to having to answer a question for the category I’ve already completed. IOW, calling the categories A and B (now use the die: 1,2,3 -> A; 4,5,6 -> B), I could get a sequence in four rolls of ABAB or AABB, which would be nice, but here are all the possibilities: AAAA, AAAB, AABA, AABB, ABAA, ABAB, ABBA, ABBB, BAAA, BAAB, BABA, BABB, BBAA, BBAB, BBBA, BBBB, each sequence equally likely. But I’m only “done” (given I always get the answer right) in six of the cases; for the other 10 I have to take at least one more turn (and for AAAA and BBBB I’ll have to take at least two more turns). So my average, over many trials, is going to be based on: p=6/16 I need only 4 turns, p=8/16 I need at least 5, and it gets messy after this, i.e. in the 8 cases where I need at least one more turn, 50% of the time that fifth roll lands on the finished category and I’ll need yet another turn (plus for the AAAA and BBBB cases, I’ll definitely need two more turns). IOW, my average number of turns will definitely be > 4, but how much?

Now maybe some great statistics theory person can figure out the exact answer purely through abstract math, but I can figure out approximately the right answer through simulation (as an exercise I hope I’ll do just that in this simplified case). The simulation answer will NOT be exactly right. The more trials I do (now using statistical theory), the closer my average answer will be to the “truth” (and by luck it could be exactly right), but short of an infinite number of trials (plus a really good random number generator) I wouldn’t get the precise answer. That’s the real dislike of simulation: how good is your result vs the cost of running the simulation longer, AND, can you actually know how good your result is?
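
Here is a minimal sketch of that simplified case (two categories, two right answers each, 100% knowledge). The function names and the seed are just my illustrative choices. For a case this small a short recursion also gives the exact average, which makes a handy check on the simulation; I wouldn’t count on that trick staying simple for the messier full game.

```python
import random
from fractions import Fraction
from functools import lru_cache


def turns_to_finish(rng, categories=2, needed=2):
    """Roll until every category has `needed` correct answers (100% knowledge)."""
    correct = [0] * categories
    turns = 0
    while min(correct) < needed:
        correct[rng.randrange(categories)] += 1  # a roll on a finished category is wasted
        turns += 1
    return turns


def simulated_mean(trials, seed=1):
    """Average number of turns over `trials` simulated games."""
    rng = random.Random(seed)
    return sum(turns_to_finish(rng) for _ in range(trials)) / trials


@lru_cache(maxsize=None)
def exact_mean(a=2, b=2):
    """Expected further turns when A still needs `a` answers and B still needs `b`."""
    if a == 0 or b == 0:
        # Only one category left: each roll hits it half the time, so 2 turns per answer.
        return Fraction(2 * (a + b))
    return 1 + (exact_mean(a - 1, b) + exact_mean(a, b - 1)) / 2


for trials in (100, 10_000, 100_000):
    print(trials, simulated_mean(trials))
print("exact:", exact_mean(2, 2))
```

For this toy case the recursion works out to 11/2, i.e. 5.5 turns on average, and the simulated means wander around that value, generally getting closer as the number of trials goes up.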

So back to the more complicated case of the game, I wrote the simulation and chose a few values of my “average” knowledge of the questions and got some interesting outcomes:

I ran 2400 iterations of my simulation. Each iteration randomly picks a value for what % of the questions I’ll get right (column A shows the specific values I used). Then it simulates a random roll of the die to pick a category, then a random right/wrong on the question, and continues until I’ve got 3 correct answers in all six categories. The graph on the right (and the numbers in column C) shows the mean of all the simulations for that level of accuracy on the questions. Not surprisingly the more accurate you are at answering questions the fewer total questions are required, BUT it might be surprising that, on average, it takes 32.3 questions to finally get 3×6 right answers, nearly twice the 18 actually needed. That is the effect of randomly picking a category where you already have three right answers, i.e. a “wasted” question. Just in case you’re wondering, column B shows the counts of how many simulations I did at the corresponding accuracy – simplistically 2400 iterations / 8 accuracy levels = 300 tries at each level. But of course those aren’t the actual numbers (column B and the bar graph underneath this). This shows another issue with simulation: the statistically expected number of tries per accuracy level doesn’t exactly happen in any given simulation (and my simulation took about 3 minutes to run). So I’d either have to: a) run many more iterations than 2400 so the counts are closer to the expected value, or, b) run many iterations of the 2400 iteration simulation and average the counts. Even with days of compute time there will still be some deviation. IOW, if my simulation is truly random we can, from “theory”, predict the outcome of one measurement, yet in this simple case we deviated from that and even with a longer simulation we would still deviate. That drives abstract math guys nuts.

OTOH, I’ll assert we have a “good enough” answer and that the real useful result of this simulation is the graph on the right (even though: a) it’s not precisely accurate, and, b) it’s based only on 8 values of question accuracy). The math guys want it to be precisely accurate for all possible values of question accuracy.
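
For anyone who wants to try this, a rough sketch of the game simulation might look like the following. It is not my original code: the eight accuracy levels (30% to 100%), the seed and the function names are assumptions made up for illustration, so the numbers it prints won’t match my table.

```python
import random
from collections import defaultdict

# Hypothetical accuracy levels; the eight values actually used are not listed here.
LEVELS = (0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0)


def questions_to_win(accuracy, rng, categories=6, needed=3):
    """Count questions asked until every category has `needed` correct answers."""
    correct = [0] * categories
    questions = 0
    while min(correct) < needed:
        cat = rng.randrange(categories)  # roll of the die picks the category
        questions += 1                   # a question is asked even if the category is done
        if rng.random() < accuracy and correct[cat] < needed:
            correct[cat] += 1
    return questions


def run(iterations=2400, levels=LEVELS, seed=7):
    """Pick a random accuracy level per iteration; return {level: (count, mean questions)}."""
    rng = random.Random(seed)
    results = defaultdict(list)
    for _ in range(iterations):
        accuracy = rng.choice(levels)
        results[accuracy].append(questions_to_win(accuracy, rng))
    return {acc: (len(v), sum(v) / len(v)) for acc, v in sorted(results.items())}


for accuracy, (count, mean) in run().items():
    print(f"accuracy {accuracy:.0%}: {count} games, mean {mean:.1f} questions")
```

The count printed for each level plays the role of column B: it hovers around 300 but rarely lands exactly on it, and changing the seed shuffles both the counts and the means a little.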

And every now and then they can crank their abstract math (usually for idealized and simplified cases) and get equations that are “proven” and “right for all possible values”. To a math person that’s an answer, none of this numerical crap of simulations.

Whew, long explanation before I even get to my second thought in this chain.

Last night I finished the book In Pursuit of the Unknown: 17 Equations That Changed the World by Ian Stewart. The last chapter was: “The Midas Formula, Black-Scholes Equation”. Of course this equation is familiar to me, having concentrated in finance for my MBA at the MIT Sloan School (one of the early “rocket science” finance departments). The book writes this slightly differently but here’s what Wikipedia has for this:
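
In the usual textbook notation (V is the price of the option, S the price of the underlying asset, σ its volatility, r the risk-free interest rate and t time), the Black-Scholes equation reads:

$$
\frac{\partial V}{\partial t} + \frac{1}{2}\sigma^{2} S^{2} \frac{\partial^{2} V}{\partial S^{2}} + r S \frac{\partial V}{\partial S} - r V = 0
$$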

Gobbledygook – right? I’ll save you, Dear Reader, the agony of my attempt to explain this (either the book or Wikipedia does a better job than I could anyway), but I will (briefly) talk about two things about the equation: a) it radically changed finance and arguably is responsible for the 2008 crash, and, b) it’s wrong (bodacious for me to say as its authors got the Nobel Prize, but fortunately lots of other people say it’s wrong too).

Well it took a couple of centuries before inventing derivatives and for a while derivatives were still connected to REAL goods and mostly derivatives did work to stabilize markets, i.e. had a social value, not just gambling. But in the second half of the 20th century things really took off. Computers happened and: a) made complex calculations possible, and, b) provided infrastructure for rapid trading in markets. AND financial theory, as epitomized by Black-Scholes, happened. With Black-Scholes, in theory, the farmer can determine exactly what the value of selling a futures contract is today for wheat delivered in the future. If the market price is about that price then he knows it’s OK to sell (and if the market price deviates from the calculated price the farmer quickly turns into a speculator and buys or sells the contracts and to hell with actually growing wheat).

So using this formula and computers some clever con artists decided we could have derivatives of things that aren’t even real; after all, a derivative is just a bet about some number (that might or might not represent a real thing). And since it does take a fair amount of education + math skill + audacity (aka willingness to steal from innocents), the more complicated the derivative, especially when it’s not in standard form on some exchange, the more money the crooks (AKA Wall Street) can steal from the rest of us. The trouble was, in 2008, greed never has a limit and the crooks not only conned the innocent, but they conned themselves, and so had absurdly worthless derivatives actually treated as some real asset. Of course when the whole house of cards collapsed many of these totally fake and made-up “assets” turned out to be nearly worthless, thus requiring the taxpayers to come pay all the bad debts the crooks had created.

That’s one side of Black-Scholes, how, like any discovery, knowledge can be turned into a weapon.

But is Black-Scholes even right, even if it could be used for some social value (not just to enrich crooks, i.e. bankers)? Lots of people say NO and that will, eventually, be the point of this post.

There are two simple (and lots of subtle) flaws in Black-Scholes. First, see all those squiggly symbols – those represent continuous variables or functions or operators. BUT, the real world, esp. finance, consists of many discrete transactions. Yes, there are many of them and it may be fair to treat them as though they were continuous, but often they’re not. And this becomes especially true when the fundamental underlying assumption of the Black-Scholes equation is violated, that is, that price changes are random, when in fact, in the real world, the price changes are rigged, esp. by expert crooks like Goldman Sachs. You can’t build math on assumptions and say it still works when those assumptions are violated. So AT BEST Black-Scholes merely represents an idealized statement about an idealized market, so applying it to a real, dirty, discrete, and dishonest market can be reasonably determined to be a really bad idea (as events proved).

The second thing is that to say abstract math represents the real world means the real world needs to have lots and lots of data. Because the real world is a specific instance of an idealized case. But with lots and lots of data the deviation of the real world from the idealized model is small. So, in the horrible (and IMO criminal) misuse of Black-Scholes, i.e. things like CDOs and CDSs, was there enough data from anyone to crank into this equation and pop out a meaningful price? Absolutely NOT. And in fact this shows one of the challenges of everyone’s latest fad, BIG DATA. We now know the ratings agencies, who applied formulas like this to toxic pieces of crap Wall Street invented, got it wrong; either they lied (after all they were paid by the crooks) or they were just dumbshits (a popular claim that is very arrogant, i.e. Goldman Sachs hires the smartest people and S&P gets the GS rejects, so naturally the GS people can con the S&P people). That’s a favorite trope but only a bit correct (it justifies the notion the GS people got paid so much more, but not much else). The real flaw was data – S&P simply did not have enough data AND WORSE the data they did have was false.

Now Ian Stewart does a fine job of explaining “black swans”, i.e. “fat tails” and other ways statistics can be wrong, but I’ll deal with this other silly issue, that the data they used (wrongly) was itself inherently useless. The S&P people had to evaluate the insane liar-loan and reverse-accruing and variable-rate mortgages forced onto people who could neither afford them nor understand them (sub-prime) by the worst scumbag crooks, the mortgage brokers. But even though the brainy math guys knew some of the loans were crap and would never be repaid, they just cranked that through their models. But what historical data did they use (all models have to have data to mean anything)? Well, mostly nowhere near enough, i.e. less than two decades (thus not counting any major financial crashes, i.e. the Great Depression) AND typically the data was for PRIME and fixed-rate mortgages with substantial down payments. Give me a break – anyone past high school should know that dataset was completely inapplicable to subprime and non-traditional mortgages. But given it was the only data they had they misused it and that was the con. The ratings agencies should have spoken the truth – IT IS NOT POSSIBLE TO RATE THESE SECURITIES, not that they weren’t AAA (who knows, maybe they were) or junk bonds. THEY DIDN’T KNOW. Their models and their equations were useless because their data was junk.

But everyone was caught up in the mystique of this abstract math (and the fact that the authors got Nobel Prizes, despite the fact they also caused the collapse of LTCM a decade before the bigger 2008 crash; like duh, the man on the street has enough common sense and doesn’t need a finance MBA/PhD to know that’s bullshit). And what it really was, was that the whole system (including the watchdogs) knew it was just a giant scam but they could wave the Black-Scholes equation and Nobel Prizes around and claim it wasn’t (under the euphemism of “financial engineering” – I’d never fly in a plane built by aeronautical engineers as wrong and dishonest as financial engineers, who, unfortunately, frequently are from the same college and department I attended).

And that gets me to my endpoint in the quick burst of thoughts I had (it took me seconds, if you’ve actually read this, it takes you 100x longer).

I was at Sloan in the early 70s. Data was scarce. Computer time was scarce and expensive (we’re talking IBM 360/65, millions for a computer that wouldn’t even play the tune in throw-away greeting cards). Grad students didn’t get much access to either data or computer time. But I had the delusion, concentrating in finance, that I could use computers to predict markets and make a quick fortune. Through another source I had computer time. But I didn’t have any data. Now there was a dataset, created by Merrill Lynch, that was hugely expensive (but available to research universities) and so it was very carefully controlled. In order to get this data (and run it through my own models) I had to do a real project. And I got my chance. The big issue then was how to calculate beta (simple idea, won’t bore you with it). And back then, pre-Reagan, we believed in regulation, at least to the point of transparency and telling the truth. So the SEC (maybe FTC, I forget) wanted mutual funds to publish their beta AND all of them would have to use the same formula (algorithm, actually) to compute it so you could honestly compare one mutual fund to another.

So as is standard in research universities the professors needed grunts to do the work, so I volunteered, in full disclosure with the hidden agenda of just getting access to the restricted data (which, btw, I never actually used for my own purposes, but that’s because I’d gotten bored while doing my real project). Almost immediately, after writing some simple Fortran, I realized the same thing I showed in my simulation above or the same thing as Ian Stewart criticizes about using Black-Scholes and that is that instead of nice pretty numbers popping out, the numbers were too discrete, based on too few values, to be meaningful.

For instance, if I calculated beta over a month (for a particular mutual fund) and then repeated that calculation over many months I got beta values all over the map (but not enough of them to build a nice histogram and see, even just by eyeball, whether it looked Gaussian). If, for the same mutual fund, I used two months (or some other interval; the longer the interval, of course, the fewer calculated values, since we typically only had a few years of data), those values significantly differed from the monthly values.

IOW, after diddling some, my answer was there was no “accurate” way to compute beta and all the various variations on the algorithm were irrelevant so just pick one (consistency still might be relevant). Plus the heterogeneity of the data was an obvious problem. Say your algorithm is to compute beta over a three month interval, then average those results over the lifetime of the mutual fund – what about a fund that had been in business for 10 years vs one only in business for 3 years – are you comparing apples to oranges? And mutual funds change. They get new managers. They split, they merge. They have a little money (so make fewer investments), they get lots of money (so, just to meet rules, have to have many more investments). In short mutual funds are apples and oranges and applying a single measure, no matter how it was computed, to compare them was futile.

IOW, I said what I claim S&P should have said – not pick some other rating than AAA, but pick NO RATING because any attempt to rate was meaningless.

This shows why I never had a career in finance. You don’t get paid for telling people it makes no sense to do something that the enterprise attempting to do it (and its top managers and their bonuses) will get paid a fortune to do anyway, even though it is meaningless. Plus in general, in finance, you don’t tell anyone what they don’t want to hear, unless you’re one of the few rare fund managers that get to be contrarian.

But, back to why I claim I was right – in this case the argument I really had with the prof was over simulation vs abstract math. He was disappointed my preliminary paper was all just computer code with no symbols scrawled all over and no proofs and derivations. Now: a) while I wasn’t too bad at math, the abstract math used in statistics never sank in, so I didn’t use it because I couldn’t, and, b) what I did know was that to make the math work you had to make silly assumptions that struck me as wrong (i.e. use a Poisson distribution, rather than a Gaussian, because the math works in closed form – SO WHAT, is the Poisson distribution the right answer?).

So I wrote a simple program. In the program I created “god’s truth” (just a cliche for saying the absolutely real answer, not that god had anything to do with truth). IOW, I started from a known beta. Then I generated, with some randomness, the actual daily price (share valuation) of the mutual fund, tied to a randomly generated market index (took a while to get a function that approximated the chaotic behavior of the entire market – beta is computed relative to a market index, like the Dow (a terrible index), and all that is a different digression).

So having generated 10 years of daily price data (from a known index and a known beta, only with truly random variation) I could throw all the algorithms for computing beta at this. And I could do this for as many iterations as my computer budget allowed.
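
A rough modern sketch of that idea, not the original Fortran: start from a known beta, generate about 10 years of daily returns (assuming 2520 trading days), then estimate beta over windows of different lengths and see how much the estimates scatter. The Gaussian daily returns, the volatility numbers, the true beta of 1.2 and all the names here are simplifying assumptions for illustration; my original market function was more elaborate.

```python
import random
import statistics


def simulate_returns(true_beta, days, rng, market_vol=0.01, noise_vol=0.01):
    """Daily market returns, plus fund returns built from a known beta and independent noise."""
    market = [rng.gauss(0.0003, market_vol) for _ in range(days)]
    fund = [true_beta * m + rng.gauss(0.0, noise_vol) for m in market]
    return market, fund


def estimate_beta(market, fund):
    """Usual estimate: covariance(fund, market) divided by variance(market)."""
    mean_m = statistics.fmean(market)
    mean_f = statistics.fmean(fund)
    cov = sum((m - mean_m) * (f - mean_f) for m, f in zip(market, fund))
    var = sum((m - mean_m) ** 2 for m in market)
    return cov / var


def windowed_betas(market, fund, window):
    """Estimate beta separately on each consecutive, non-overlapping window of days."""
    return [estimate_beta(market[i:i + window], fund[i:i + window])
            for i in range(0, len(market) - window + 1, window)]


rng = random.Random(42)
market, fund = simulate_returns(true_beta=1.2, days=2520, rng=rng)
for label, window in (("monthly", 21), ("quarterly", 63), ("yearly", 252)):
    betas = windowed_betas(market, fund, window)
    print(f"{label}: {len(betas)} estimates, min {min(betas):.2f}, max {max(betas):.2f}")
```

Even with a perfectly stable “true” beta baked in, the short-window estimates spread over a wide range, and the longer windows leave only a handful of values to look at.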

And guess what, my intuitive notion that computing beta was silly was borne out by the simulation data. IOW, I was right.

But no I wasn’t, according to the professor (and so, despite doing far more work than any of my classmates, and frankly some fairly clever stuff, I only got a B on my paper). The trouble was I had never used abstract math and “proofs” to make my claim. I only used a computer and simulation. He claimed that, since I couldn’t do an infinite number of iterations, even a correct simulation would not deliver the same result (as a certainty) as cranking the math.

Now the Black-Scholes equation was just being formulated at that time and all the profs, esp. at MIT, UChicago and Stanford, were in love with that stuff BUT they wanted the academic standard of nice fancy looking formulas with new fancy looking proofs, not computer simulations. So what I’m saying is that early in the development of the “financial engineering” that then inevitably became the 2008 crash they simply ignored the data problem(s): a) not enough, b) real data is discrete not continuous, c) datasets are often not comparable. So in their quest for beautiful math (which does win Nobels) they invented an imaginary result that doesn’t actually apply to the real world. And all the rocket scientists and thieves of Wall Street knew this, they just used math and Nobel Prizes for legitimacy to cover their tracks on what were obviously crappy investments guaranteed to lose money. As I’ve pointed out, in finance, you don’t get paid to disagree. (And if you did and were vaguely credible Goldman Sachs would hire you to shut you up.)

So in a flash of a few seconds, thinking about one thing, connecting it to something I just read, and connecting that to my history, I realized it’s all come full circle. I still use simulation and almost never abstract math, and history has proved that is right and that the math, while not “wrong” per se, is simply useless in the real world and will, sooner or later, be abused to just steal.

old fat (but now getting trim and fit) guy, who used to create software in Silicon Valley (almost before it was called that), who used to go backpacking and bicycling and cross-country skiing and now geodashes, drives AWD in Wyoming, takes pictures, and writes long blog posts and does xizquvjyk.