Niels Hoven

Stop letting the data decide

“Everybody gets so much information all day long that they lose their common sense.” – Gertrude Stein

My first job as a product manager was in games. I worked at Playdom, Zynga’s primary competitor during the social gaming boom of 2009. The sophistication of our data analysis techniques and the platform supporting them played a large role in our eventual $700 million acquisition by Disney.

For most companies at the time, “analytics” just meant counting pageviews. If you were really fancy, you could track the order in which users viewed certain pages and assemble a funnel chart to quantify dropoff. Gartner’s report on the state of web analytics in 2009 describes a range of key challenges, like “how to obtain a sustainable return on investment” and “how to choose a vendor”.

web analytics challenges 2009

“Why would we need an analyst to tell us what our hitcounter is saying?”

In contrast, social gaming powerhouses like Zynga and Playdom were custom building their own event-based analytics systems from the ground up. They tracked almost every action that players took in a game, allowing them to deeply understand their users’ needs and build features to fulfill them, rather than simply taking their best guesses.

For me, it was incredibly exciting to be on the cutting edge of analytics. For the first time, we could get real insights into players’ actions, aspirations, and motivations. Games are tremendously complex software products with huge design spaces, and even now it blows my mind that for most of the industry’s history, development decisions were made purely on gut instinct.

The power of these new data analysis techniques seemed limitless. Zynga went from zero to a billion-dollar valuation in under 3 years. And while gaming companies were the first to really showcase the potential of event-based metrics, they certainly weren’t the only ones. There was a digital gold rush as startups popped up left and right to bring the power of quantitative data insights to every industry imaginable.

Perhaps the most famous example of putting data-driven design on a pedestal is Marissa Mayer’s test of 41 shades of blue. (It’s an absurd test for many reasons, not the least of which is that with so many different variations, you’re basically guaranteed to discover a false positive outlier simply due to random noise.)
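
To put a rough number on that false-positive risk, here is a minimal sketch. It assumes each shade is compared against a control independently at a 5% significance level, which gives 40 comparisons for 41 shades. The function name and both of those figures are illustrative assumptions, not details of the actual test.

```python
def prob_any_false_positive(num_comparisons: int, alpha: float = 0.05) -> float:
    """Chance that at least one of several independent comparisons looks
    significant purely by chance, when no variation truly differs."""
    return 1.0 - (1.0 - alpha) ** num_comparisons


# Assumed setup: 41 shades means 40 comparisons against a single control.
for k in (1, 5, 20, 40):
    print(f"{k:>2} comparisons: {prob_any_false_positive(k):.0%} chance of a spurious winner")
```

Under these assumptions, 40 comparisons produce at least one spurious "winner" roughly 87% of the time.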

In this brave new world, metrics were king. Why would you need a designer? Everything could be tracked, measured, tested. A good PM was one who could “move the metrics”. MBAs and management consultants were hired by the boatload. One friend told me about the time he had to talk his CEO out of firing all the game designers in the company and replacing them with analysts.

A quick note about the game industry

As an aside, the game development industry has interesting market dynamics because of how many people dream of working in it. In some unglamorous industries (e.g. Alaskan crab fishing, logistics, B2B startups), demand for labor vastly exceeds supply. In games, it’s the opposite – many people stay in games out of passion, even when the money doesn’t justify it, leading to a market that is oversaturated and extremely competitive.

The evolutionary pressures of this absurdly competitive market mean that the pace of product innovation is extremely quick. The quality bar constantly increases, production costs go up, advertising prices rise, margins disappear, and mediocre products fail.

The gaming market’s competitiveness forces rapid innovation just to keep up, and when better tactics emerge, they are quickly adopted and rapidly bubble up to dominate the top of the market. As a result, the gaming market can be a bellwether of trends in the larger tech market, such as the power of the freemium model, microtransactions, sophisticated performance marketing, and strong product visions.

The competitive advantage of a strong product vision became undeniable in early 2012. At that time, Zynga had been around for about 5 years, with a peak market cap over $10 billion, and the company’s success had been repeated on a smaller scale by other strongly “data-driven” gaming companies on Facebook and on mobile.

However, an interesting trend was beginning to occur, with new games like Dragonvale and Hay Day dominating the mobile charts with innovative mechanics supported by a single, unified product vision.

Purely metric-driven iteration with no vision or direction could bring a product to a local maximum, which was good enough in the very early days of mass-market casual gaming. But as the market matured and competition intensified, a local maximum wasn’t good enough. Derivative products and products developed by only metric-driven iteration were vastly inferior to products driven by a strong creative vision from their inception, like Supercell’s Clash of Clans or Pocket Gems’ Episode. That vision was a necessary prerequisite to create a product strong enough to land at the top of the charts.

apple top grossing

Fortnite was announced in 2011 and launched in 2018. Gaming is a tough industry.

And being at the top of the charts is critical – revenue on the Top Grossing Charts follows a power law, with the handful of apps at the very top of the charts making more money than all the rest of the apps put together. As Zynga’s apps slipped down the charts, their inability to adapt to this new world became apparent and their stock price fell 80%.

Data-driven design had failed, as had intuition-driven design before it. The industry needed a more fundamental shift in perspective. Good teams now design for the long term, guided by intuition but informed by data.

Personally, I like to emphasize the difference between data-driven design (relying on data to make decisions because we have no user empathy) and data-informed design (using data to understand our users, then building features to delight them).

Data-driven design

When I say “data-driven design”, I’m referring to the mentality of “letting the data decide”. In this paradigm, PMs and designers surrender to the fallibility of their intuition, and thus they elect to remain agnostic, using A/B testing to continuously improve their products.

A number of companies I’ve talked to have bragged about the fact that they’ve removed intuition from the decision-making process. It’s comforting to be able to say “We don’t have to depend on intuition because the data tells us what to do!”

Of course, everyone knows that data is noisy, so companies use large test groups and increased rigor to mitigate those concerns. But the real problem isn’t tests giving the wrong answer, so much as it is the assumption that the infinite degrees of freedom of creating a compelling product can be distilled to a limited number of axes of measurement.

With the exception of straightforward changes like pricing, most design changes have complex effects on the overall user experience. As a result, treating metrics as end goals (rather than simply as indicators of good product direction) results in unintended consequences and a degraded user experience. Testing isn’t a magic bullet either. Sometimes this degradation occurs in an unexpected part of the user experience, and sometimes it occurs on a different timescale than the test.

Split tests typically gather data for a period of days or weeks. User lifetimes are typically months or years. If you’re only looking at the data you’ve gathered, it’s easy to unintentionally trade off difficult-to-measure metrics like long term product health in exchange for easy-to-measure short-term metrics like revenue.

Example: Aggressive paywalls

Zoosk is a dating app that built a huge userbase as a Facebook app during the heyday of data-driven design. They’re extremely aggressive with their monetization, with misleading buttons designed to constantly surprise the user with paywalls.

Oh boy, a message!

Gotcha! Paywall!

A company naively focusing on revenue will naturally iterate their way to this point, experimenting with increasingly early and aggressive paywalls and discovering that the spammier the app becomes, the more money they make.

However, while an aggressive approach can be very profitable in the short run, it quickly drives away non-payers and makes it difficult to engage new users. In the dating space, this results in a user experience that becomes worse every month for subscribers.

Sure enough, judging from AppAnnie/SensorTower estimates, Zoosk’s revenue has probably fallen about 50% since their 2014 high of $200 million.

Example: Searches per user

One of my favorite stories is from a friend who worked on improving the search feature at a major tech company. Their target metric was to increase the number of searches per user, and the most efficient way to do that was to make search results worse. My friend likes to think that his team resisted that temptation, but you can never be totally sure of these things.

Example: Brand tradeoffs

If you start a free trial with Netflix, you’ll get an email a few days before the end of the free trial reminding you that your credit card is about to be charged. I’m sure that Netflix has tested this, and I’m sure that they know that this reminder costs them money. However, they’ve presumably decided to keep the reminder email because of its non-quantifiable positive effect on the Netflix brand (or more precisely, to avoid the negative effect of people complaining about forgetting to cancel their free trial).

Netflix email

Short term revenue loss, long term brand gain

Notably, Netflix only reminds you before billing your card for the first time, and not for subsequent charges. At some point, a decision was made that “before the first charge but not before subsequent ones” was the correct place to draw the line on the completely unquantifiable tradeoff between short term revenue loss and long term brand benefits.

Example: Tutorial completion

A standard way to measure the quality of an onboarding experience is to measure what percent of users who start a tutorial actually finish it. However, since there will always be a natural drop off between sessions or over days, one obvious way to increase tutorial throughput is to build a tutorial that attempts to teach all the features in a single session.

Sure enough, tutorial throughput goes up, but now users are getting overwhelmed and confused by the pace of exposure to new menus and features. How to help them find their way? Maybe some arrows! Big, blinking arrows telling the user exactly which button to tap, directing them into submenus 7 levels deep and then back out.

You’ll be able to do this on your own next time, right?

Arrows everywhere can boost tutorial throughput, but all the users will be tapping through on autopilot, defeating the point of having the tutorial in the first place! Excessive handholding of users increases tutorial completion (an easy-to-measure metric), but decreases learning and feelings of accomplishment (difficult-to-measure but very important metrics).

Example: Intentionally uninformative communication

“You’ve been invited to a thing! I could tell you where and when it is in the body of this email, but I’d rather force you to visit my website to spam you with ads. Oh, look at how high our DAUs are! Thanks for using Evite!”

email from evite

If this email were helpful, Evite would have to find a different way to make money

Equally frustrating to users: Push notifications that purposely leave out information to force users to open the app. Users will flee to the first viable alternative that actually values the user experience.

Example: User experience

In a purely data-driven culture, justifying investment in user experience is a constant uphill battle.

Generally, fixing a minor UI issue or adding some extra juice to a button press won’t affect the metrics in any kind of a measurable way. User experience is one of those “death by 1000 cuts” things where the benefits don’t become visible until after a significant amount of work has already been put in.

As a result, it’s easy to constantly deprioritize improvements to the user experience under the argument of “why would we fix that issue if it’s not going to move the needle?”

To create great UX requires a leap of faith, a belief that the time you’re investing is worthwhile despite what the metrics say right now.

Hearthstone is a great example. Besides being a great game, it’s full of moments of polish and delight like the finding opponent animation and interactive backgrounds that are completely unnecessary from a minimum viable product perspective, but absolutely critical for creating a product that feels best-in-class.

Example: Sales popups

When I was at Playdom, we would show popups when an app was first opened. They’d do things like encourage users to send invites, or buy an item on sale, like this popup from Candy Crush does.

candy crush sale popup

Do you want revenue now or a userbase in the future?

I hate these. They degrade the user experience, frustrate the user, hurt the brand, and generally make interacting with your product a less delightful experience.

On the other hand, though, they WORK. And you can stack them: the more sales popups you push users through, the more money you make – right up until the point where all of your metrics fall off a cliff because your users are sick of the crappy experience and have moved on.

It always gave me a bit of schadenfreude to open a competitor’s game and see a sale popup for the first time, because the same pattern always repeated itself: As the weeks went by, more and more aggressive and intrusive popups would invade the user experience, right up until the game disappeared from the charts because all the users churned out.

Even retention isn’t foolproof

As a final note, while most of the examples above involve some variation on accidentally degrading retention, even optimizing for retention doesn’t prevent these mistakes from occurring if you’re optimizing for retention over the wrong timescale or for the wrong audience of users.

Typically, companies will look at metrics like 1-day, 7-day, and 30-day retention because those numbers tend to correlate highly with user lifetimes. But focusing on cohort retention runs the risk of over-optimizing your product for the new users that you’re measuring, perhaps by over-simplifying your product, or neglecting the features loved by your elder users, or creating features that benefit new users at the expense of your existing audience.
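
For readers who haven’t computed these numbers before, here is a minimal sketch of N-day cohort retention. It assumes one common definition (a user counts as retained on day N if they were active exactly N days after installing); the function name, data layout, and the four sample users are all hypothetical.

```python
from datetime import date, timedelta


def day_n_retention(installs: dict[str, date],
                    activity: dict[str, set[date]],
                    n: int) -> float:
    """Fraction of the install cohort that was active exactly n days after installing."""
    if not installs:
        return 0.0
    retained = sum(
        1
        for user, installed in installs.items()
        if installed + timedelta(days=n) in activity.get(user, set())
    )
    return retained / len(installs)


# Hypothetical cohort: four users who all installed on the same day.
start = date(2024, 1, 1)
installs = {user: start for user in ("a", "b", "c", "d")}
activity = {
    "a": {date(2024, 1, 2), date(2024, 1, 8), date(2024, 1, 31)},
    "b": {date(2024, 1, 2), date(2024, 1, 8)},
    "c": {date(2024, 1, 2)},
    # user "d" never came back
}

for n in (1, 7, 30):
    print(f"Day-{n} retention: {day_n_retention(installs, activity, n):.0%}")
```

Note that every number here describes only the newest cohort, which is exactly why optimizing it says little about your elder users.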

Data-informed design

In contrast to “data-driven design”, which relies on data to drive decisions, in “data-informed design” data is used to understand your users and inform your intuition. This informed intuition then guides decisions, designs, and product direction. And your intuition improves over time as you run more tests, gather more data, and speak to more users.

When I’m making the case for the benefits of introducing intuition back into the decision-making process, there are two benefits that I keep coming back to: leaps of faith, and consistency.

Leaps of faith

Purely data-driven product improvement breaks down when a product needs to get worse in order to get better. (If you’re the sort of person who likes calculus metaphors, continuous improvement gets you to a local maximum, but not to a global maximum.) Major product shifts and innovations frequently require a leap of faith, committing to a product direction with the knowledge that initial metrics may be negative for an extended period of time until the new direction gets dialed in and begins to mature.

When Facebook introduced its newsfeed, hundreds of thousands of users revolted in protest, calling for boycotts and petitioning for removal of the feature. Now we can’t imagine Facebook without it.

Consistency

When products are built iteratively, with decisions made primarily through testing and iteration, there’s no guarantee of a consistent vision. Some teams take pride in the fact that their roadmaps only extend a week into the future. “Our tests will tell us what direction to go next!”

Data-informed design helps your product tell a consistent story. This is the power of a cohesive product vision.

It can be hard to explain exactly WHY a cohesive product vision translates to a better product, and also why it’s so hard to get there purely by data-driven iteration. Perhaps an extremely contrived example can help illustrate my point.

Let’s say you’re designing a new experience. You’re committed to good testing practices, and so over the next several months, you run tests on all 20 features you release. Each test is conclusive at the 5% significance level, and sure enough, users respond very positively to the overall experience that your tests have led you to.

Now, even with rigorous testing at a 5% significance level, you should expect about 1 out of 20 tests to be wrong. Interestingly enough, 19 of the tests are consistent with the belief that your users are primarily young women, while 1 of them conclusively indicates that your users are middle-aged men.

Allowing your decision-making to be informed by data rather than dictated by it allows the team to say “Let’s just ignore the data from this particular test. Everything else we’ve learned makes us quite confident that we have a userbase of young women, and we believe our product will be better if all our features reflect that assumption.”

Obviously, if more tests come back indicating that your users are middle-aged men, your entire product vision will be thrown into question, but that’s ok. It’s preferable to ignore data in order to build a great product reflecting a unified vision that you’re 95% confident in, rather than creating a Frankenstein with 95% confidence on each individual feature.
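
The contrived example above can be made concrete with a little arithmetic. The sketch below assumes the 20 tests are independent and that each has a 5% chance of giving a misleading result (a simplification of what a significance level actually means); the function name is hypothetical.

```python
from math import comb


def prob_wrong_tests(k: int, num_tests: int = 20, error_rate: float = 0.05) -> float:
    """Binomial probability that exactly k of num_tests independent tests are wrong."""
    return comb(num_tests, k) * error_rate**k * (1 - error_rate) ** (num_tests - k)


none_wrong = prob_wrong_tests(0)
one_wrong = prob_wrong_tests(1)
two_or_more = 1 - none_wrong - one_wrong

print(f"Expected number of wrong tests: {20 * 0.05:.1f}")
print(f"No test wrong:      {none_wrong:.1%}")
print(f"Exactly one wrong:  {one_wrong:.1%}")
print(f"Two or more wrong:  {two_or_more:.1%}")
```

Under these assumptions, one wrong test is the average, but a run with none (about 36%) or with two or more (about 26%) is also quite likely, which is one more reason to weigh each result against everything else you know.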

The role of data in data-informed design

I believe that saying “just let the data decide” isn’t good product management, it’s an abdication of responsibility. As a PM or a designer, your job is to develop empathy for your users. Use data to understand them, their aspirations, and their motivations, and then take a position on what direction the product needs to move to best serve them.

Sometimes this means knowing your users better than they know themselves, as in the Facebook newsfeed example. More commonly, it means having enough faith in your product vision to recognize early false negatives for what they are, and being willing to grind through the trough of sorrow to realize your product’s potential.

Eric Ries gives an example of a registration flow that he worked on that performed poorly after launch. But based on earlier conversations with users, the team still believed in the design, and chose to continue working on it despite the data. Sure enough, it turned out that there was just one relatively minor design flaw, and once that was discovered, the new flow performed much better.

In this case, it was a relatively small feature with a relatively small flaw. But the same pattern holds on a larger scale as well – as visions become more innovative and ambitious, sometimes it requires commitment to a product vision over an extended period of time to see a product achieve its potential.

When to stop

I’m often asked, “If you know you’re just going to keep building no matter what the data says, then what’s the point in having data at all? How will we know when to kill the project?”

That’s a great question, since it’s often difficult to tell the difference between a false negative and a true negative. But there are two clear red flags to watch for: when a team loses faith in the project, and when a project stops improving. Ed Catmull cites the same criteria in Creativity, Inc. for knowing when one of Pixar’s movies is in trouble. Recognizing when a product is stuck is a challenge for any company committed to creativity and innovation, regardless of medium.

In data-informed design, learning is a continuous and parallel process. Rather than trying to design a rigorous enough test to validate/invalidate a direction at a particular moment in time, data is consistently gathered over time to measure a trajectory. If the team understands their users well, their work should show a general trend of improvement. If the product isn’t improving, or even if the product IS improving, but the metrics aren’t, then that’s a sign that a change is needed.

Some rules of thumb for data-informed design

It can be hard to know how to strike the right balance between data and intuition, but I do have a few rules of thumb:

Protect the user experience

Peter Drucker famously wrote: “What gets measured gets managed.” That’s true, but in my experience, “What gets measured gets manipulated, especially if you are being evaluated on that metric.”

The challenge in product development is recognizing when we’re “teaching to the test”, regardless of whether it’s intentional or not. For anything that we’re measuring, I like to ask “is there a way I could move this metric in a positive way that would actually be really bad for our product long-term?” Then I ask, “is the feature I’m thinking about doing some flavor of that accidentally?”

A few examples of good intentions with potential for unintended consequences:

Metric | Tactic | Result
Tutorial completion | Shorten the tutorial | Users learn less
Conversion | Create misleading sales page | Buyer’s remorse
Revenue | Run frequent sales | Users trained to only buy at a discount

Have a “North Star” vision

I always advocate for having a “North Star” vision. This is a product vision months or years away that you believe your users will love, based on your current best understanding of them.

Since products take a lot of iterations to get good, early product development is full of false negatives on the way to that North Star. People love to talk about the idea of “failing fast” or “invalidating an idea early”, but a lot of times that just isn’t possible. The threshold for viability in a minimum viable product isn’t always obvious, and sometimes it does just take a little more polish or a few extra features to turn the corner.

The best way to get a more trustworthy signal is to just keep building and shipping. A North Star lets you maintain your momentum during the inevitable periods of uncertainty. Over time, small sample sizes accumulate, and noise averages out. Evidence about the product direction will build with time.
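
One way to see why noise averages out: the standard error of a measured rate shrinks with the square root of the sample size. A minimal sketch, assuming a simple binomial metric such as a conversion or retention rate; the 20% rate and the sample sizes are made-up illustration values.

```python
import math


def standard_error(rate: float, sample_size: int) -> float:
    """Standard error of a proportion estimated from sample_size users."""
    return math.sqrt(rate * (1 - rate) / sample_size)


for n in (100, 1_000, 10_000):
    margin = 1.96 * standard_error(0.20, n)  # approximate 95% margin of error
    print(f"n={n:>6}: 20% +/- {margin:.1%}")
```

A hundred times more users buys roughly ten times less noise, so a signal that is invisible in one small test can become clear as results accumulate.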

Treat metrics as indicators/hints, not goals

It’s important to remember that metrics are leading indicators, not end goals. Similar to how taking a test prep class to improve your SAT score doesn’t actually increase your odds of college success, features that overfocus on moving metrics may not actually improve the underlying product.

The most important question that data can answer is “does the team understand the users?” If so, features will resonate and metrics will improve over time. To validate/invalidate a product direction, look at the trajectory of the metrics, not the result of any individual test.

The right time to kill a project is when the trajectory of improvement flattens out at an unacceptably low level. Generally this means that a few features have shipped and flopped, which is an indicator that there’s some kind of critical gap in the team’s understanding of their users.
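
As a rough illustration of “watch the trajectory, not the individual test”, the sketch below fits a least-squares slope to a metric’s recent history and flags a project whose improvement has flattened below an acceptable level. The function names, the four-period window, and both thresholds are hypothetical; the right values depend on your product, and the judgment call still belongs to the team.

```python
def trend_slope(values: list[float]) -> float:
    """Least-squares slope of values against their index (change per period)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    covariance = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(values))
    variance = sum((i - mean_x) ** 2 for i in range(n))
    return covariance / variance


def is_stalled(values: list[float], window: int = 4,
               min_slope: float = 0.001, acceptable_level: float = 0.30) -> bool:
    """True if the recent trend is flat AND the metric is still below target."""
    recent = values[-window:]
    return trend_slope(recent) < min_slope and recent[-1] < acceptable_level


# Hypothetical weekly day-7 retention for two projects.
improving = [0.10, 0.13, 0.17, 0.20, 0.24, 0.27]
flattened = [0.10, 0.14, 0.16, 0.17, 0.17, 0.17, 0.17]

print("improving stalled?", is_stalled(improving))
print("flattened stalled?", is_stalled(flattened))
```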

This also means that it can be difficult to walk away from innovative product or feature ideas quickly. This can be an unpopular opinion in circles that are dogmatic about testing, but the fact of the matter is that I have never seen the “spray and pray” approach work well when it comes to product vision.

I’ve tried a wide range of nutrition hacks over the years (high carb, low carb, paleo, fasting, etc) but the only habit that has actually stuck with me is my spinach smoothies.

For over 10 years now, I’ve had a mostly-vegetable smoothie nearly every day. They’re delicious and nutritious and the only way I know to get 3 or 4 servings of fruits and veggies in about 5 minutes.

Most people know that they should be eating more fruits and vegetables (“5-a-day” was the hot campaign a while ago). Most people aren’t even close to that, and even those who try don’t realize that:

  • an entire bag of fresh salad from the supermarket is only about 1.5 servings of vegetables
  • you should really be eating closer to 9 servings a day

That’s a lot of veggies! I actually tried to do that at one point and after a few weeks of eating absurd quantities of salad, I just got tired of chewing. It was time to find a better way.

After a few false starts, my roommate and I discovered that orange juice was the secret to hiding the taste of spinach in a smoothie. Seriously, you can put an absurd amount of greens in a smoothie and not even taste them if you have an orange juice base. Hence, this recipe:

Super Spinach Smoothie

  • 2 servings frozen spinach/kale (1/3 bag)
  • Orange juice (try half OJ, half water if you want it less sweet)
  • 1 serving frozen berries or pineapple or banana or mango etc

Blend and drink.

Fills two glasses with a little left over. Whether this serves one or two is up to you.

A medium-rare steak is 135 degrees Fahrenheit in the center. For thousands of years, the best way to accomplish this was to put the steak on a really hot grill and attempt to pull it off at just the right time. This is silly. Fortunately, technology has found a better way.

Take your steak, vacuum seal it in a plastic bag, and then lower it into a water bath whose temperature is carefully maintained at exactly 135 degrees. Let the steak come up to the temperature of the surrounding water, then pull it out, sear it with a blowtorch or a hot pan, and you’re ready to serve!

This method of cooking (known as “sous-vide”, French for “under-vacuum”) has several advantages.

1) No clean-up. Just open the bag, torch the steak, and you’re ready to serve!
2) No overcooking. Overcooking means accidentally bringing food above your target temperature. With sous-vide, the water bath maintains your food at the exact desired temperature, so overcooking is impossible.
3) No food safety concerns. Want a super-rare hamburger, but worried about E. coli? Pasteurization is a function of both temperature and time, so you can pasteurize your meat at a relatively low temperature by just holding it there for a couple hours.

Fish takes about 20 minutes and is perfectly cooked every single time. Chicken is moist and tender in a way I’ve never had it before. Sous-vide duck is amazing. After 2 hours, it’s deep red and juicy, unlike the dry grey stuff I had at Chinese restaurants growing up. Flank steak is one of the most flavorful cuts, but is usually one of the toughest. After 2 days in the sous vide, it’s as tender as filet mignon.

I have no desire to eat out anymore, because the food I make at home is faster and tastier. “Making dinner” now consists of taking a piece of meat (still in its original vacuum-sealed packaging from the supermarket) and dropping it into my sous-vide. For vegetables, I blend a spinach smoothie, or if I’m feeling fancy, I’ll put a tray of broccoli in the oven to roast. I can prepare an entire dinner in less than 60 seconds.

I’m writing this blog post because I’ve had a number of friends ask me how I put together my sous-vide setup. This is the email I’ve been forwarding them:

You can buy a countertop sous vide machine for $450. Alternatively, you can build your own for about $75.

I did neither, and bought a temperature controller that I can plug a rice cooker into. It’s cheaper and more flexible than a dedicated sous-vide machine. It has lower risk of electrocuting me than a DIY solution. And finally, if I ever want to sous vide something larger (say, an entire animal), I can just swap out my rice cooker for a larger heating element and I am good to go.

So without further ado, here’s my sous vide set-up (booze is optional, but recommended):

sous-vide-magic

Temperature Controller, $170
(This is the HD version, which is $10 extra, but get it in case you want to power a bigger heater later.)
Update: I’ve been informed that there are cheaper alternatives.

Perforated plate, $15
(You need something to keep the temperature probe away from the food. I use a small metal cheese grater, which works fine.)

Non-digital rice cooker, $30
(This is big enough to do 2 flank steaks, a small roast, or a rack of ribs)

(optional, but fun) Cooking torch, $35
(Get a butane refill from your local smoke shop, or just sear your meat in a pan after cooking it. Caveat – some people believe that butane can flavor the meat, and recommend just getting a blowtorch.)

I get my meat from Trader Joe’s already vacuum sealed. You might eventually want a vacuum sealer or a water bath that can handle larger items, but the above setup has been working great for me.

The definitive guide to sous-vide cooking times and temperatures can be found on Douglas Baldwin’s website. If I can’t find the info I need there, a quick Google search usually turns up good suggestions. But to get you started, here are some times and temperatures that have been working well for me:

Food | Temperature (°F) | Time | Notes
Duck breast | 135 | 2 hours | Crisp skin-side down in a pan before serving
Flank steak | 131 | 2 days |
Pork loin | 137 | 2 hours |
Soy ginger cod | 132 | 20 mins | Find it in Trader Joe’s frozen foods aisle. Thaw first.
Salmon | 126 | 30 mins | Add some slices of lemon if you vacuum seal it yourself
Pork shoulder | 140 | 2 days |
Pork chops | 138 | 4 hours |
Pork belly | 155 | 2 days | Leave it under the broiler afterwards to get super-crispy, and don’t forget plenty of salt!
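
If your temperature controller reads in Celsius, a few lines of Python will convert the table above. This is just a convenience sketch: it assumes the temperatures listed are in Fahrenheit, and the dictionary and function names are made up for illustration.

```python
def fahrenheit_to_celsius(deg_f: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32) * 5 / 9


# Temperatures (assumed Fahrenheit) and times copied from the table above.
COOKING_GUIDE = {
    "Duck breast": (135, "2 hours"),
    "Flank steak": (131, "2 days"),
    "Pork loin": (137, "2 hours"),
    "Soy ginger cod": (132, "20 mins"),
    "Salmon": (126, "30 mins"),
    "Pork shoulder": (140, "2 days"),
    "Pork chops": (138, "4 hours"),
    "Pork belly": (155, "2 days"),
}

for food, (temp_f, cook_time) in COOKING_GUIDE.items():
    temp_c = fahrenheit_to_celsius(temp_f)
    print(f"{food}: {temp_f}°F ({temp_c:.1f}°C) for {cook_time}")
```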

I recently reallocated my retirement investments. This was my first major rebalancing in nearly a decade, and I thought it would be interesting to write up a guide to how I think about investing, which has been strongly influenced by the excellent posts on Bogleheads and indexfunds.com.

First of all, one core assumption:

Won’t need this money for decades. I’m fortunate enough to have a pretty stable job. Major expenses in the distant future (kids, college) will be funded by my future earnings, not current savings. For possible expenses and emergencies in the near future (house? unemployment? piano accident?), I set aside cash in a liquid and low-risk savings account. The remainder of it, I invest as follows.

I have three goals from my investment strategy:

High returns. The vast majority of professionally managed mutual funds underperform. I want to be confident that I have maximized my returns for whatever level of risk I decide to take on.

Low risk of going broke. I am willing to accept large swings (e.g. stock market fluctuations). I am not willing to accept a chance of losing everything (e.g. no lottery tickets or angel investments). I reduce my risk through diversification. Those looking for more sophistication can read about Kelly Betting.

Minimal time commitment. This is the big one. Tim Ferriss defines investing as “allocating resources to improve quality of life.” Every hour that I spend thinking about my investments is a lost hour that I could have spent doing something way more fun. I don’t want to have to think about when to buy and when to sell, when to short and when to long. I want to set my investment machine running and forget about it until I’m ready to retire.

On a related note, at this point in my career, the value of my potential future earnings far exceeds what I’ve accumulated in savings. Even if money were my main concern, I’m better off thinking about how to earn more (ship some product, contract on the side, ask for a raise, advise a new company, found a startup) than figuring out how to squeeze an extra few tenths of a percent out of my investments.

This rules out any sort of active investment strategy. I want to passively invest my money and have it grow at a high rate of return. Most people who claim to have done this are trying to sell you a video or brochure. I’m just going to try to sell you on efficient market theory. (Warning: Controversy Ahead…)

INVESTING IN AN EFFICIENT MARKET

Efficient market theory roughly states that a stock’s market price reflects its true value. In other words, there are millions of other people out there who are also looking for an underpriced stock and you’re not going to find anything they missed.

The case for efficient market theory has been severely damaged by bubbles and financial crises during the past few years, but it still turns out to be a decent rule of thumb.

Another way to think about it is that while there are inefficiencies in the market, it takes time to find them. Since I’m not willing to spend that time, I may as well just assume that the market is efficient and leave it up to the billion-dollar hedge funds to go looking for tiny 0.001% arbitrage opportunities.

So if the market is efficient, can you beat it? The answer is yes – sort of.

RISK VS REWARD

Even in an efficient market, there is a tradeoff between risk and reward. For example, stocks are riskier than bonds, startup companies are riskier than big companies, and businesses in emerging markets are riskier than US businesses. All of these riskier bets tend to have higher returns to compensate their investors for the higher risk. (If they didn’t, investors would just put their money someplace else that had the same return but lower risk.)

You can think of “risk” as a measure of how wildly a stock’s price swings, i.e. its variance. More risk/variance = a higher average rate of return (probably). The relationship looks a little something like this:

[Figure: Risk vs Reward]
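
To make “risk as variance” a bit more concrete, here is a minimal Python sketch. The two return series are hypothetical numbers invented for illustration, not real market data, and `summarize` is just a helper name I made up. It reports the average return alongside the standard deviation (the square root of the variance) as a rough measure of how wildly each series swings.

```python
import statistics

# Hypothetical annual returns, invented purely for illustration.
steady_returns = [0.04, 0.05, 0.06, 0.05]
bumpy_returns = [0.30, -0.15, 0.25, -0.08]


def summarize(returns):
    """Return (average return, standard deviation) for a list of annual returns."""
    return statistics.mean(returns), statistics.pstdev(returns)


for name, returns in [("steady", steady_returns), ("bumpy", bumpy_returns)]:
    average, risk = summarize(returns)
    print(f"{name}: average {average:.1%}, swing (std dev) {risk:.1%}")
```

In this made-up example the bumpier series has both the bigger swings and the higher average, which is the general shape of the tradeoff, not a guarantee.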

Stocks are generally riskier than bonds, so you can adjust the riskiness of your portfolio by allocating more or less of it to stocks. A typical guideline for risk exposure is to “allocate your age in bonds”, i.e. if you’re 40 years old, 40% of your portfolio should be in bonds.

Now, the value of an index like the Dow or the S&P 500 is largely determined by enormous companies like Microsoft, General Electric, and ExxonMobil. As companies go, these are pretty stable.

So if I want to “beat the market”, meaning some market index, I just have to find some riskier corners of the market in which to stick my money. By looking at historical data, you can approximate both the historical variance and rates of return of different asset classes. (The numbers below come from Richard Ferri’s All About Asset Allocation and assume diversification within the asset class – see next section.)

DIVERSIFICATION

Why is diversification so important? Imagine that I had two stocks, A and B. Let’s say they are equally risky, and so no matter which of them I invest in, I will average a 5% annual return. However, every time stock A goes up, stock B goes down (maybe A makes umbrellas and B makes sunglasses).

So what happens if I put half my money in A and the other half in B? Well, I’ll still earn my 5% annual return, but since every time A goes down B goes up, and vice versa, I’ll see much smaller swings in the value of my portfolio. Smaller swings = less risk.
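
Here is a small Python sketch of the umbrella/sunglasses idea. The yearly returns are hypothetical, picked so that each stock averages 5% while moving in exactly opposite directions, and the 50/50 portfolio is assumed to be rebalanced every year.

```python
import statistics

# Hypothetical yearly returns: each stock averages 5%,
# but they always move in opposite directions.
stock_a = [0.15, -0.05, 0.15, -0.05]  # umbrellas
stock_b = [-0.05, 0.15, -0.05, 0.15]  # sunglasses

# A 50/50 portfolio (rebalanced yearly) earns the average of the two returns.
blended = [0.5 * a + 0.5 * b for a, b in zip(stock_a, stock_b)]

for name, returns in [("A", stock_a), ("B", stock_b), ("50/50", blended)]:
    average = statistics.mean(returns)
    swing = statistics.pstdev(returns)
    print(f"{name}: average {average:.1%}, std dev {swing:.1%}")
```

Real stocks are never perfectly opposed like this, so the swings shrink rather than vanish, but the direction of the effect is the same.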

I get this diversification bonus not only within an asset class, but between asset classes. For example, stocks (higher-risk, higher-return) and bonds (lower-risk, lower-return) typically move in opposition. This actually shifts the efficient frontier of the risk-reward curve upwards for blended portfolios:

[Figure: diversification]
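
For the curious, the effect of blending can be sketched with the standard two-asset portfolio variance formula. Everything numeric below is a hypothetical input chosen for illustration (these are not the figures from Ferri’s book), and `blend` is a helper name I made up.

```python
import math


def blend(weight_stocks, stock_return, stock_vol, bond_return, bond_vol, correlation):
    """Expected return and volatility of a two-asset stock/bond portfolio."""
    w_s, w_b = weight_stocks, 1.0 - weight_stocks
    expected_return = w_s * stock_return + w_b * bond_return
    variance = (
        (w_s * stock_vol) ** 2
        + (w_b * bond_vol) ** 2
        + 2 * w_s * w_b * correlation * stock_vol * bond_vol
    )
    # Guard against tiny negative values caused by floating-point rounding.
    return expected_return, math.sqrt(max(variance, 0.0))


# Hypothetical inputs, for illustration only.
STOCK_RETURN, STOCK_VOL = 0.08, 0.16
BOND_RETURN, BOND_VOL = 0.03, 0.05
CORRELATION = -0.2

for pct in (0, 20, 40, 60, 80, 100):
    ret, vol = blend(pct / 100, STOCK_RETURN, STOCK_VOL, BOND_RETURN, BOND_VOL, CORRELATION)
    print(f"{pct:3d}% stocks: expected return {ret:.1%}, volatility {vol:.1%}")
```

With these made-up inputs, the 20% stock blend comes out slightly less volatile than bonds alone while earning more, which is the kind of upward shift the chart is getting at. Different assumptions about correlation will move the numbers around.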

So diversification is important, and the more stocks I can buy, the better. This sounds like it could be a pain, but fortunately, some smart people have already done that and packaged a whole bunch of stocks into a single bundle, which we now call a mutual fund.

INDEX FUNDS AND ETFS

The specific mutual funds we’re interested in are called “index funds”. An index fund is just a mutual fund that is composed of the stocks in a particular index (like the Dow, or the S&P 500) and doesn’t attempt to do anything other than mirror that index’s performance. Because of this simple strategy, index funds typically have extremely low fees – often a tenth of what actively managed funds charge. Index funds also simplify more complicated investing tactics like tax loss harvesting.

Since all the funds tracking the same index should have the same performance, we want to buy the funds with the lowest expense ratio. Vanguard is known for its low expense ratio index funds, but there are other options. You can buy index funds as either mutual funds or ETFs (“exchange-traded funds”); they’re just different ways to purchase the same thing. I like ETFs because they’re slightly more tax efficient.
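
To get a feel for why the expense ratio matters so much, here is a rough Python sketch of fee drag over time. The starting balance, the 7% gross return, the 30-year horizon, and the 0.1% vs 1.0% expense ratios are all hypothetical inputs, and the model is simplified: it assumes a constant return and deducts the fee from assets once a year.

```python
def final_value(principal, gross_return, expense_ratio, years):
    """Grow principal for `years`, deducting the expense ratio from assets each year."""
    value = principal
    for _ in range(years):
        value *= (1 + gross_return) * (1 - expense_ratio)
    return value


# Hypothetical inputs, for illustration only.
PRINCIPAL = 10_000
GROSS_RETURN = 0.07
YEARS = 30

index_fund = final_value(PRINCIPAL, GROSS_RETURN, 0.001, YEARS)   # 0.1% expense ratio
active_fund = final_value(PRINCIPAL, GROSS_RETURN, 0.010, YEARS)  # 1.0% expense ratio

print(f"Index fund:  ${index_fund:,.0f}")
print(f"Active fund: ${active_fund:,.0f}")
print(f"Lost to higher fees: ${index_fund - active_fund:,.0f}")
```

Both funds earn the same gross return here, so the entire gap comes from fees compounding year after year.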

ASSET ALLOCATION

At this point, execution is straightforward. Decide how much risk you feel comfortable with, spread your portfolio across a mix of asset classes so that you end up at an appropriate point on the risk-reward curve, and then go out and buy index funds in each of those asset classes.

If you’re really, really lazy, there are services like Wealthfront that will do this for you for a small fee. But you can set up a brokerage account and buy three ETFs yourself in less time than it took you to read this blog post.

Example 1: Three-Fund Portfolio. For a simple, risk-adjusted portfolio, allocate your age in bonds (30 years old? 30% bonds), split the rest between domestic and international stocks, and call it a day. For example, 30% bonds (BND), 35% US stocks (VTI), 35% international stocks (VXUS).
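
The “age in bonds” rule from Example 1 is simple enough to write down as a few lines of Python. This is just a sketch of the rule of thumb: `three_fund_weights` is a hypothetical helper name, the even split between US and international stocks follows the example above, and the clamp to 0-100 is my own assumption for out-of-range ages.

```python
def three_fund_weights(age):
    """Percent weights for a three-fund portfolio using the "age in bonds" rule."""
    bonds = float(min(max(age, 0), 100))  # clamp to a sane 0-100% range
    stocks_each = (100.0 - bonds) / 2     # split the rest evenly between US and international
    return {"BND": bonds, "VTI": stocks_each, "VXUS": stocks_each}


print(three_fund_weights(30))  # {'BND': 30.0, 'VTI': 35.0, 'VXUS': 35.0}
```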

Example 2: Wealthfront’s “Very Aggressive” Portfolio. For a 30-year-old looking for a high-risk, high-return allocation: 35% US Stocks (VTI), 22% Foreign Stocks (VEA), 28% Emerging Markets (VWO), 5% Dividend Stocks (VIG), 5% Natural Resources (DJP), 5% Municipal Bonds (MUB).
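
Once you have target weights, turning them into buy orders is just arithmetic. Here is a minimal Python sketch using the weights from Example 2; the $10,000 amount and the `dollars_per_fund` helper are hypothetical, and it ignores share prices, trading costs, and rounding to whole shares.

```python
# Target weights (in percent) copied from Example 2 above.
VERY_AGGRESSIVE = {"VTI": 35, "VEA": 22, "VWO": 28, "VIG": 5, "DJP": 5, "MUB": 5}


def dollars_per_fund(weights_pct, amount):
    """Split `amount` across funds according to percent weights that sum to 100."""
    total = sum(weights_pct.values())
    if total != 100:
        raise ValueError(f"weights sum to {total}%, expected 100%")
    return {ticker: amount * pct / 100 for ticker, pct in weights_pct.items()}


# Hypothetical $10,000 to invest.
for ticker, dollars in dollars_per_fund(VERY_AGGRESSIVE, 10_000).items():
    print(f"{ticker}: ${dollars:,.2f}")
```

The check that the weights add up to 100% is there because it is an easy mistake to make when tweaking an allocation by hand.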

If you’re tolerant of wild swings (i.e. you can stomach a big loss without panicking and pulling your money out of the market at the worst possible time), you could increase your ratio of stocks to bonds, or even tilt your asset allocation toward some riskier but higher-reward asset classes like small-cap value. My portfolio is heavily weighted toward small-cap value and emerging markets.

Finally, when purchasing ETFs, do your best to place the least tax-efficient funds (i.e. your bond funds) into tax-sheltered accounts (IRAs, 401(k)s, etc.).

SUMMARY

In short, it only takes a few hours a year to beat the stock market. If you’re as forgetful as I am, you can probably reduce that time to a few hours a decade and not suffer major consequences.

1) Decide on your tolerance to risk
2) Allocate your portfolio across asset classes
3) Buy index funds with low expense ratios

Fine print: This post assumes a fixed amount to invest. Leverage makes things more complicated, and organizations that can borrow money cheaply enough can get returns above the efficient frontier. Also, take the risk and return estimates with a grain of salt; past performance is not necessarily indicative of future results. I am not a financial advisor; I’m paid to make games. I just find this stuff interesting, so please don’t sue me. Here’s a disclaimer that says all of this could totally be made up so don’t listen to me and talk to a qualified professional before you do anything, which you agree would be at your own risk anyway. Finally, I’m sure there are errors in this post. If you find one, leave a comment or just tweet at me and I’ll fix it.