Cracking the Causal Code: Insights from Econometrics to Machine Learning
Week 35-2023 [Issue 6.0]
Welcome to The Repo! 🚀
Great work deserves to be shared. Especially in data science.
Each week, I'll curate three gems from the data science community:
🗄️ Re: Remarkable Repository
💻 P: Prolific Programmer
🏢 O: Outstanding Organization
I hope you find them as valuable and insightful as I do.
You can find all recommendations in the GitHub repository at finnoh/repo!
py-why/EconML
🗄️ Repository | ML-Based Heterogeneous Treatment Effects Estimation
EconML is a Python package developed at Microsoft Research that combines advanced machine learning techniques with econometrics to automate complex causal inference tasks. EconML is part of this week’s organization, PyWhy (see below).
The package models treatment effect heterogeneity flexibly, using techniques like random forests, boosting, lasso, and neural networks. Despite this flexibility, the learned models can retain their causal interpretation and often provide valid confidence intervals. The main focus is on estimating the causal effects of interventions on outcomes while accounting for features and their interactions.
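To make "treatment effect heterogeneity" a bit more concrete, here is a minimal sketch in plain Python. It does not use EconML or any of its API. It simulates a randomized experiment in which the effect of a treatment depends on a binary feature, then estimates the effect within each group as a simple difference in means with a normal-approximation 95% confidence interval. The data-generating process, the effect sizes (1.0 and 3.0), and the function names `simulate` and `cate_by_group` are all made up for illustration.

```python
import random
import statistics


def simulate(n=20000, seed=0):
    """Simulate a randomized experiment whose effect depends on a binary feature x.

    Everything here is hypothetical: the true effect is 1.0 when x == 0
    and 3.0 when x == 1.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.randint(0, 1)             # binary feature
        t = rng.randint(0, 1)             # treatment assigned by coin flip
        tau = 1.0 if x == 0 else 3.0      # heterogeneous treatment effect
        y = 1.0 + 0.5 * x + tau * t + rng.gauss(0.0, 1.0)
        rows.append((x, t, y))
    return rows


def cate_by_group(rows):
    """Estimate the treatment effect within each value of x.

    Returns {x: (estimate, ci_low, ci_high)}, using a difference in means
    and a normal-approximation 95% confidence interval.
    """
    result = {}
    for x in sorted({row[0] for row in rows}):
        treated = [y for (xi, t, y) in rows if xi == x and t == 1]
        control = [y for (xi, t, y) in rows if xi == x and t == 0]
        diff = statistics.fmean(treated) - statistics.fmean(control)
        se = (
            statistics.variance(treated) / len(treated)
            + statistics.variance(control) / len(control)
        ) ** 0.5
        result[x] = (diff, diff - 1.96 * se, diff + 1.96 * se)
    return result


rows = simulate()
estimates = cate_by_group(rows)
for x, (effect, low, high) in estimates.items():
    print(f"x={x}: effect={effect:.2f}, 95% CI=({low:.2f}, {high:.2f})")
```

This only works so neatly because the treatment is randomized and there is a single binary feature. As I understand it, packages like EconML aim at the harder case: observational data, many features, and effects that vary smoothly across them.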
Nick C. Huntington-Klein
💻 Programmer | An Econometrician, with loads of resources on causal inference.
Nick Huntington-Klein is an assistant professor of economics at Seattle University with a focus on econometrics, causal inference, and higher education policy. He is renowned for creating widely shared educational resources on econometrics. I first heard about him through his lecture on Causality, whose slides are available on his GitHub page.
He has authored an introductory textbook on causal inference and research design, aptly titled "The Effect: An Introduction to Research Design and Causality", which is also available to read online. He also writes a Substack titled "Data, on Average".
Highly recommend.
PyWhy
🏢 Organization | Open source ecosystem for causal machine learning
PyWhy is an organization that creates an open-source ecosystem for advancing the field of causal machine learning and making these advancements accessible to practitioners and researchers.
They offer tools like DoWhy, a Python library for explicit modeling and testing of causal assumptions; EconML (see above), along with an introduction to causal inference with EconML; and causal-learn, a package for causal discovery methods. The organization also provides learning opportunities and case studies, for example on A/B testing and customer segmentation.
The Repo grows through support by The Sample.