Monday, January 20, 2020

Chapter 1: Decision conundrum

The journey to greatness starts with a simple step. Just turn the envelope and write on its back.

The professor of machine learning asked the student, “Luke let’s begin with learning some simple ML [Machine Learning] techniques. What’s the probability of throwing a 5 with two dice?”

Luke grinned. “OK professor, trick question. Hmmm…let me guess…it’s gonna be like, 2 over 36?” he blurted out with a smirk.

“OK smartass. How did you calculate that?” went Professor Stan.

“Professor, I’m a genius. Of course you knew that right? It’s the number of ways you can get a five divided by the total number of combinations of numbers 1 through 6…” he trailed off and realized how wrong he had been. A wave of ash swept over his face as he whitened.

“It’s alright Luke; everybody makes the same mistake. I used to forget one die altogether and say 1 over 36,” Stan grinned. The class erupted in laughter, though not everyone got the joke.
Stan looked at the class and smiled his satisfied smile. He knew not everyone got it and wasn’t going to let go that easily today.

“So, class, we’re going to get to the bottom of this, OK? I think we all know probability is the chance of getting your favorable outcomes out of the total number of possible outcomes. I usually remember this through ‘FOTO’, that is, the ratio of the number of Favorable Outcomes Over the Total possible Outcomes. In this case, the full set of outcomes for the two dice is as follows:
{1, 1}, {1, 2}, {1, 3}, {1, 4}, {1, 5}, {1, 6},
{2, 1}, {2, 2}, {2, 3}, {2, 4}, {2, 5}, {2, 6},
{3, 1}, {3, 2}, {3, 3}, {3, 4}, {3, 5}, {3, 6},
{4, 1}, {4, 2}, {4, 3}, {4, 4}, {4, 5}, {4, 6},
{5, 1}, {5, 2}, {5, 3}, {5, 4}, {5, 5}, {5, 6},
{6, 1}, {6, 2}, {6, 3}, {6, 4}, {6, 5}, {6, 6}
Each pair reads {1st die, 2nd die}. Even though the two dice are identical, {1, 4} and {4, 1} are separate outcomes, because the 2nd die can show any of the numbers 1 through 6 whatever the 1st die shows. Six faces times six faces covers every possible outcome, and anything else would just be a repeat.
Therefore TO = 36.
Now what is the number of favorable outcomes (FO)? In how many ways can we throw a total of 5? FO = {1, 4}, {2, 3}, {3, 2}, {4, 1} → 4.
Therefore the probability of throwing a total of 5 → FO/TO → 4/36 → a 1 in 9 chance.”
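For anyone who would rather let a machine do the writing-out, here is a minimal sketch of the same FOTO count in Python. The variable names are illustrative only, not part of Stan's lecture. The code lists all ordered pairs for two six-sided dice and keeps the ones that total 5.

```python
from fractions import Fraction
from itertools import product

# Every ordered pair (first die, second die) for two six-sided dice.
faces = range(1, 7)
outcomes = list(product(faces, repeat=2))

# Favorable outcomes: the pairs whose total is 5.
favorable = [pair for pair in outcomes if sum(pair) == 5]

# FOTO: Favorable Outcomes over Total Outcomes, kept as an exact fraction.
probability = Fraction(len(favorable), len(outcomes))

print("TO =", len(outcomes))
print("FO =", favorable)
print("P(total is 5) =", probability)
```

Using Fraction keeps the answer exact (4/36 reduces to 1/9) rather than a rounded decimal.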

“Now class, did that make sense?” He saw a majority of heads nodding, while some still had their eyes wide open like deer. He was going to let that go, but made eye contact with one of the deer. “Sometimes you just need to sit alone and write it all out. I guarantee you’ll get it,” he said, and smiled his wide comforting smile. The deer looked up and closed its eyes in agreement.

Stan turned around and looked at the board, “OK now, what if I wanted my machine to figure this out on its own?” He turned back with a quick swish, his eyebrows perked up with a slight frown of curiosity.

“While some of it is rocket science, it’s actually not that hard. I’ve got one word for you; actually, make it two – pattern recognition. Machines learn by recognizing patterns. And who makes them learn patterns?” He was now pointing to Heart with his sharp index finger. Heart looked up, her dark brown hair flowing around her sides, wave after wave sloshing against the brownish-yellow skin of her high cheeks.

“You mean us? Humans?” Her eyes were suddenly large concentric balls of white with glittering black. Heart Soledad was Chief Marketing Officer at Milky Comp, a global leader with the slogan “offering solutions for creation” – primarily scientific and mathematical software for industries like semiconductors, automotive, food and beverage, packaging, aerospace, agriculture, real estate and others. She was forty-five and had graduated from Harvard ten years ago, majoring in public administration in international development with grounding in basic economics. Her favorite subject, though, was behavioral influences in economics, and she was a powerhouse executor of this thinking.

Professor Stan looked at Heart with a jubilant look “Yes! Thanks Heart. My heartfelt thanks” he smiled and continued. Heart smiled back, her bright teeth a beautiful contrast to her Afro-Cuban complexion; she liked Stan and felt a slight attraction and quickly shrugged off the feeling.

“So, ladies and gentlemen, we can make computers learn, but WE need to make them recognize patterns. Let’s see how this works with the dice” he was looking at Hex now firmly. Hex nodded. Henry Excelsior aka Hex was Chief Strategy Officer at Milky Comp, a Chicago graduate in economics and finance, with a mechanical engineering major. He was a hardcore finance strategist but had an unbounded curiosity about the world, which had thrust him into this role when he had just turned fifty earlier in the year.

“Hex, I want you to be my guinea pig. What would you do if you wanted to make a machine learn how to calculate these probabilities?”

“I knew you’d pick on me Dr. Stan” Hex grinned and continued; Stanford and Chicago were healthy rivals for economics and finance.

“I would make the computer generate two random whole numbers between 1 and 6, about 100 times over. Then take the number of times the total came out to 5 and divide by the number of times the pair was generated,” he stopped for a breath. “This ratio should then technically be close to the fraction 1/9, or approximately 11%, like you calculated”.

“Bingo! Nice job Hex. Class, do you see the logic behind this and trying to generate the relative frequency of occurrences of the 5 on the machine? The same logic can now be used to generate various patterns and calculate various probabilities, right?” Stan’s eyebrows were raised again.

“So ultimately it’s all about teaching the machine to recognize patterns in the data that we’re feeding it. Questions, bouquets, brickbats?” he lisped and smiled. There were several heads nodding and a few just stared off into space, expressionless.

“Prof. Stan – can you show us how this would work exactly, like you did with looking at all the combinations of dice throwing possible?” this was Luke, chiming in with feigned curiosity.

“Aha!” went Stan. “I suggest you use a random number generator in Excel and try it out for yourself! Sometimes it’s better to roll up your sleeves and get into the weeds.” He wasn’t going to spoon-feed every bit of this thing.
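For readers without Excel to hand, here is a minimal sketch of Hex's experiment in Python instead. The function name, the seed and the trial counts are assumptions made for illustration; the story itself only specifies two random numbers between 1 and 6 and roughly 100 repetitions.

```python
import random


def estimate_probability(target=5, trials=100, seed=None):
    """Relative frequency of two dice totalling `target` over `trials` throws."""
    rng = random.Random(seed)  # seeded generator so a run can be repeated
    hits = 0
    for _ in range(trials):
        first = rng.randint(1, 6)   # randint includes both end points
        second = rng.randint(1, 6)
        if first + second == target:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Hex's 100 throws, then larger runs to watch the estimate settle near 1/9.
    for n in (100, 10_000, 100_000):
        print(n, "throws:", estimate_probability(trials=n, seed=42))
    print("exact value:", 1 / 9)
```

With only 100 throws the ratio can land a few percentage points away from 11%; it tightens as the number of throws grows, which is the relative-frequency idea Stan is pointing at.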

“Now that’s machine learning folks. You’ve just learnt the first lesson: a machine learns differently. And you can’t dismiss this simple fact. Pattern recognition is not learning. Human learning is an inherent, imbibed trait that has a multi-dimensional aspect to it. It’s not a linear, look at a pattern and you know it type of thing. Learning happens with training and observing and listening and repeating. And failing. And correcting. And training. And on and on. It’s a continuous process with ups and downs in space and time”. 

Stan stopped and looked at the silent class gaping at him with an unexplained eerie quiet. He then smiled back at Luke and said “thanks for being the best setup in an ML class!” everyone burst into giggles and a few audible laughs, with some exclaiming “come on!”

Stan plodded on, “with the advent of calculators, laptops and the quintessential smartphone we’ve lost the ability to do ‘farmer math’ as my manager likes to call it. And farmers themselves have lost this ability, especially the ease with which to churn a few numbers mentally and do a DIMS[1] test. We’ll go through this journey by telling a few stories. Our first story is something more real than a silly dice throwing problem for Luke. It’s about crop yield, say in a corn field”.

The backstory and summary: Once a farmer wanted to be able to quickly calculate his crop yield, given that his people had just given him some data, like how many acres they covered that day, the number of bushels of corn they were able to harvest, the total number of acres that were waiting to be harvested, and some area that had been cordoned off for another crop. With this information the farmer figured out what his annual yield that year could be, and then thought through the market conditions that would prevail later in the year. With all this data he then quickly figured out his profitability with corn for the year, and then went on to repeat this calculation for his other crops. And finally he figured this for his entire portfolio of crops, and with some constraints on costs and selling prices, and accounting for variations in these, he was ultimately able to see what his annual performance was going to look like. This then made him realize he might need to hire a couple of extra hands to support the back half of the year.

And all this was done in about 5 minutes, on the back of an envelope (sometimes it just rests in the cranium and never gets to the envelope).

The basic point of such an exercise is to get that initial feel for something, dig in, and have the wherewithal to calculate basic numbers – this in itself is a big differentiator between an average person who does what he’s told and a person that stands out and is “bold” enough to say “Does It Make Sense? Does this pass my DIMS test? I’m gonna find out for myself”.

The DIMS test has a basic VÉTUDE[2] framework embedded in it – this is something to teach kids to get to a goal quickly by doing quick calculations. This is NOT only for kids, it’s for everyone!


V – Visualize (on paper or in the mind – whichever you’re comfortable with)
É – Extract all info – write it down so you can see it!
T – Think thru approaches
U – Understand (or get to an Understanding)
D – Double check or the quintessential sanity recheck, and
E – Expound (explain) with a flourish!

So what math did the farmer do? The first thing he did was to visualize the problem in his mind.

V – Imagine his field was a rectangle with an area of, say, 1000 acres[3]. Sketching something like this would take a minute, and go a long way to understanding what we’re calculating.

É – let’s say they harvested about 7% of his field that day – that’s 70 acres.
If the yield is 98 bushels per acre[4] then his total harvest from that day would be
= 98 bushels/acre * 70 acres = 6,860 bushels.
Now there’s 93% of his 1000 acres of field left for harvest. Assuming 70% of his field is corn, that’s 700 acres of corn, of which 70 acres are already in and 630 acres (63% of the field) are still standing. The day’s 70 acres are one tenth of the corn, so his total corn yield that year would be = 6,860 * 10 = 68,600 bushels (assuming, of course, that the remaining 63% yields corn at the same rate as the first 7% harvested!).
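The same extraction step can be written down as a few lines of Python, as a sketch only. The variable names are made up for illustration, and the figures are the story's assumed ones (1000 acres, 70% corn, 70 acres harvested, 98 bushels per acre).

```python
# Figures assumed in the story; percentages kept as whole numbers to stay exact.
field_acres = 1000
corn_share_pct = 70          # share of the field planted with corn
harvested_today_acres = 70   # 7% of the field, all of it corn
yield_bu_per_acre = 98

# Today's harvest in bushels.
todays_harvest_bu = yield_bu_per_acre * harvested_today_acres

# Corn acreage, and what is still standing after today.
corn_acres = field_acres * corn_share_pct // 100
corn_left_acres = corn_acres - harvested_today_acres

# Projection: the rest of the corn yields at the same rate as today's acres.
projected_corn_bu = yield_bu_per_acre * corn_acres

print(f"Today's harvest: {todays_harvest_bu:,} bushels")
print(f"Corn still standing: {corn_left_acres} acres")
print(f"Projected corn for the year: {projected_corn_bu:,} bushels")
```

Multiplying the yield by all 700 corn acres gives the same number as scaling the day's 6,860 bushels by ten, which is a small double-check in its own right.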

T – thinking through what he could do with all this yield brings us to the crux of the calculation – show me the money![5] Assume the price of corn (determined, say, by a mercantile exchange like the CME[6]) is $3.75 per bushel. Then the farmer’s sales revenue (“income”) for that year from corn alone would be = 68,600 bushels/yr * $3.75/bushel = $257,250/yr (from corn alone, from 700 acres of harvest). And then let’s say that what he’s got to pay for his farm hands, farm boys, maintenance for the field, tractors, combines, backhoes and other equipment, storage (barn) etc. totals ~$175,000. So his profit from corn would be ~$82K for the year – this is the money that “he makes”. A similar calc for his other 300 acres (with, say, soybean) might get him another $50K of profit (assumed) from a total sale of ~$100K. Then his total earnings (profits) for the year would be ~$132K. Not a bad chunk of change, but let’s not forget the intellectual and physical capacities that are expended from childhood to adulthood to create and build all this – respect!
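A sketch of the corn part of this money calculation in Python follows. The names are illustrative, and the price and cost figures are the assumed ones from the story. The soybean side is left out because its numbers are only rough assumptions in the text.

```python
# Assumed inputs from the story.
corn_bushels = 68_600        # projected corn harvest for the year
price_per_bushel = 3.75      # assumed exchange price, dollars per bushel
corn_costs = 175_000         # labor, maintenance, equipment, storage (approximate)

# Revenue is what comes in; profit is what is left after costs.
corn_revenue = corn_bushels * price_per_bushel
corn_profit = corn_revenue - corn_costs

print(f"Corn revenue: ${corn_revenue:,.0f}")
print(f"Corn profit:  ${corn_profit:,.0f}")
```

On these inputs the corn revenue is $257,250 and the corn profit works out to $82,250.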

U – the understanding arising out of all this could be several-fold, viz. what it takes to maintain and run a farming operation for corn and soy, and what the profit margins in such a business could be (which is, incidentally = $132K/$357K = ~37%). While the margin looks good on paper, there’s this thing called volatility: the exchange prices of corn can fluctuate, and pretty wildly, to put it mildly. A quick Google on this shows corn prices shooting up from $2/bushel to over $8/bushel from ~2007 due to various factors. When prices go up it’s great for margins and life in general is good. When prices crash due to a collapse in demand, margins can suffer a lot and we might be in a loss-making operation very quickly, especially as our fixed costs are in general, well, fixed! Take land maintenance, tilling etc. plus equipment that’s used (planters, combine harvesters, skid steers and what-have-you). So it’s critical to build this fundamental understanding of the situation not just at one point in time, but over time and space. It’s understanding at a deeper level that creates the richness for further innovation and thinking of other ways to skin the cat.
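To get a feel for that volatility, here is a rough sensitivity sketch in Python. It treats the whole ~$175,000 corn cost as fixed, which is a simplifying assumption made here, and reuses the 68,600 bushel projection. The function name and the price points are illustrative.

```python
# Assumed inputs: projected corn harvest and corn costs, all treated as fixed.
corn_bushels = 68_600
fixed_costs = 175_000


def corn_profit(price_per_bushel):
    """Corn profit in dollars at a given price, with costs held fixed."""
    return corn_bushels * price_per_bushel - fixed_costs


# The price at which corn revenue exactly covers the fixed costs.
break_even_price = fixed_costs / corn_bushels

for price in (2.00, 3.75, 8.00):
    print(f"${price:.2f}/bushel -> profit ${corn_profit(price):,.0f}")
print(f"Break-even price: ${break_even_price:.2f}/bushel")
```

Under these assumptions the corn breaks even at about $2.55 per bushel, so a slide to $2 turns the year into a loss of roughly $38K, while $8 would bring in about $374K.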

D – These are basic double-checks of calculations to make sure we’re using the right data, triangulation of information by using 2 or 3 different sources etc. Once he had the double checking done, the farmer felt confident and he could plan out his crop portfolio for next year.

E – Expound or explain the calculations and the overall farmer math to someone who can be the “listener” (e.g. your manager or a peer or anyone who reports to you). This is the best way for us to understand a concept deeply; explain it to someone for crystal clear clarity, answer their questions and clarify. The best way to learn is to help others learn. It worked with Richard Feynman and it will with you as well!

“That, ladies and gentlemen, is how a BOTE’s done”, went Prof. Stan. “Basic farmer math with deep understanding of the problem at hand built in to the process. ML won’t do this for you, so this creates the foundation to feed into an ML algo.” Silence. The class erupted to a standing O.

Hex and Heart looked at each other, and thought of their problem at hand. The DIMS test, the VÉTUDE framework and farmer math all had great relevance for their days ahead.


[1] DIMS: Does It Make Sense?
[2] VÉTUDE: VÉTUDE is a play on the French word étude, meaning to study.
V – Visualize, É – Extract, T – Think, U – Understand & Calculate, D – Double check, and E – Expound (explain)
[3] One acre is ~43,560 sq. ft., (~4047 square meters, 4840 square yards); the ~ or tilde means “approximately” or “about”
[4] A bushel is ~56 lbs of shelled corn (MAY mean different weights for different crops!). An acre of corn is assumed here to yield ~100 bushels
[5] Yes the allusion to Cuba Gooding Jr. in Jerry Maguire is implicit :)
[6] CME: Chicago Mercantile Exchange
