As Tony Yiu cautions,

The investment opinions expressed in this article are my own. Please do your own due diligence before investing.

With that out of the way, let’s dive in.

When the Efficient Market Hypothesis (EMH) entered mainstream thinking in the 1960s and 1970s, financial economists found themselves on opposing sides: *markets were either efficient, or they were not*.

We now know that it’s not a question of *if* the markets are efficient but to what degree they are *inefficient*.

The question is especially significant for active traders. If markets are mostly efficient, the average investor cannot hope for…

This isn’t investment advice. Always do your own research before investing.

I was inspired to write this article after reading Tony Yiu’s The Investments I Like And Why I Like Them.

It made me realize that financial writers should be transparent about their investments. If you’re telling people which stocks (or houses or ETFs or mutual funds) to purchase, then you had better have purchased those assets yourself.

Since I started contributing to *Alpha Beta Blog*, I’ve written two articles on the benefits of index funds over actively managed funds. …

*(Not investment advice. Do your own research before investing).*

In my previous article for *Alpha Beta Blog*, I argued, first, that market efficiency in the Canadian stock market changes over time, and, second, that it’s more efficient in the long run than it is in the short run.

I then discovered, through other research, that anywhere between 65% and 85% of actively managed Canadian mutual funds fail to beat their benchmark over a ten-year period.

I advised retail investors to invest in index funds, as it was unlikely that they would choose a mutual fund that could beat the market.

…

By nineteen, I had the first of what would soon become many existential crises.

Boredom caused many of these crises–the brain’s favorite pastime is conflict, after all–but this crisis came about differently: it came from speaking with a friend. We were discussing our plans after graduation. Without thinking, I told him that I wanted to study law.

“Why?” he asked.

It was an excellent question. When I couldn’t think of a reply, he asked me if I really wanted to go to law school, or if I felt like I had to go to law school, given that I was…

With all the fanfare surrounding the normal distribution, newcomers to data science can make the mistake of thinking that data scientists care *only* about the normal distribution.

While the normal distribution is arguably the most important (and most perversely used) probability distribution, it is not always the best tool, nor the best assumption to make, in solving business problems.

Say that you work at Dunder Mifflin Paper Company. Assistant (to the) Regional Manager Dwight K. Schrute gives you files on one hundred clients. Mr. Schrute wants to know the following:

- Of the one-hundred clients, what is the probability that 5 of…

Logistic regression is the first binary response model that undergraduate statistics students learn to use. It is also the first classification algorithm that MOOCs teach to aspiring data scientists.

There are a few reasons why logistic regression is so popular:

- It is easy to understand.
- It only requires a few lines of code.
- It is a great introduction to binary response models.

In this article, I will explain the math behind logistic regression, including how to interpret the coefficients of the model, and explain the advantages of logistic regression over a more *naive* method.
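As a rough illustration of what interpreting a coefficient means, here is a minimal sketch using only the Python standard library. The coefficients `b0` and `b1` are made-up values, not fitted to any data, and the helper names are hypothetical. It shows the standard result that a one-unit increase in `x` multiplies the odds of the outcome by `exp(b1)`.

```python
import math


def predicted_probability(x, b0, b1):
    """Logistic model: P(y = 1 | x) = 1 / (1 + exp(-(b0 + b1 * x)))."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def odds(p):
    """Convert a probability into odds, p / (1 - p)."""
    return p / (1.0 - p)


# Hypothetical coefficients, chosen only for illustration.
b0, b1 = -1.5, 0.8

p_at_2 = predicted_probability(2.0, b0, b1)
p_at_3 = predicted_probability(3.0, b0, b1)

# Ratio of the odds after a one-unit increase in x.
odds_ratio = odds(p_at_3) / odds(p_at_2)

print(f"P(y=1 | x=2) = {p_at_2:.3f}")
print(f"P(y=1 | x=3) = {p_at_3:.3f}")
print(f"odds ratio = {odds_ratio:.3f}, exp(b1) = {math.exp(b1):.3f}")
```

The probability itself does not change by a fixed amount per unit of `x`; only the odds change by a constant factor, which is why coefficients are usually read on the odds scale.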

Before delving into…

Forecasting is the goal for most data science projects.

While this has led to some interesting work, it has also led to a disincentive to:

- Understand the mathematics behind the machine learning models;
- Fully appreciate all aspects of the data analysis process; and
- Build understandable models.

The purpose of this article is to encourage more aspiring (and current) data scientists to finish projects that have **causality** as their stated goal.

Why?

Because persuading the reader that there is a causal relationship between X and Y is oftentimes more difficult than persuading the reader that a certain ML model has a…

The p-value is an important concept in frequentist statistics, and it is usually taught in introductory statistics courses.

Unfortunately, many of these courses either do a poor job of explaining what the p-value can (and cannot) do *or blatantly promote false propaganda relating to the role of the p-value in causal inference.*

This has led many undergraduate students, and even academics who should know better, to make incorrect claims in their research, all because they found a p-value of less than 0.05.

The goal of this article is to clear up the myths surrounding the p-value…

I started tutoring math to high school students in December 2018. I had zero tutoring experience — outside of helping a friend with their calculus homework — but the tutoring center desperately needed a math tutor, and I desperately needed to pay my tuition.

I remember my first day on the job. I wore a dress shirt (no tie, thankfully), black pants, and formal footwear. Meanwhile, everyone else (including the manager) wore jeans and a sweater.

At the time, I thought that my first tutoring session went great. …

Abraham Maslow writes, “I suppose it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail”.

This is the situation that aspiring data scientists find themselves in when analyzing time series data. The seasonal_decompose function from Python’s statsmodels library is the hammer, and every time series is just another nail.

Decomposing our time series is an important step in improving forecast accuracy and creating causal insights.

The seasonal_decompose function is okay for time series decomposition, but there are other approaches that are great. …
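To make "decomposing a time series" concrete, here is a minimal pure-Python sketch of classical additive decomposition with a centered moving average, the general technique that statsmodels documents for seasonal_decompose. It is not the library's implementation. The function names and the toy series (a straight-line trend plus a repeating four-step pattern) are assumptions made up for illustration.

```python
from statistics import mean


def centered_moving_average(series, period):
    """Estimate the trend; entries are None where the window does not fit."""
    n = len(series)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half : t + half + 1]
        total = sum(window)
        if period % 2 == 0:
            # Even period: the two end points get half weight (a 2 x period MA).
            total -= 0.5 * (window[0] + window[-1])
        trend[t] = total / period
    return trend


def additive_decompose(series, period):
    """Split series into trend + seasonal + residual (additive model)."""
    trend = centered_moving_average(series, period)
    detrended = [y - m if m is not None else None for y, m in zip(series, trend)]

    # Average the detrended values at each position within the cycle.
    raw_cycle = []
    for k in range(period):
        values = [d for i, d in enumerate(detrended) if d is not None and i % period == k]
        raw_cycle.append(mean(values))

    # Center the seasonal effects so they sum to zero over one cycle.
    centre = mean(raw_cycle)
    cycle = [s - centre for s in raw_cycle]
    seasonal = [cycle[i % period] for i in range(len(series))]

    resid = [
        y - m - s if m is not None else None
        for y, m, s in zip(series, trend, seasonal)
    ]
    return trend, seasonal, resid


# Toy data (made up): linear trend plus a zero-sum seasonal pattern of period 4.
pattern = [3.0, -1.0, -2.0, 0.0]
series = [10.0 + 0.5 * t + pattern[t % 4] for t in range(24)]

trend, seasonal, resid = additive_decompose(series, 4)
print("seasonal cycle:", [round(s, 6) for s in seasonal[:4]])
```

On this noise-free toy series the recovered seasonal cycle matches the pattern that was built in. Real data will leave a non-trivial residual, and the moving average leaves the trend undefined at both ends of the series.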

I'm a research assistant in the Bank of Canada's financial stability department. You can connect with me at https://www.linkedin.com/in/andrew--plummer/