Hanna van der Vlis
Hanna is a creative and passionate data scientist with experience in energy, agriculture, and credit risk. She has 3+ years of experience in data science and machine learning, and proven skills in MLOps. She currently works at Apollo Agriculture, helping Kenyan smallholder farmers run more profitable businesses.
Sessions
At Apollo Agriculture, a Kenya-based agro-tech startup, one of the challenging problems we face is predicting the yields of Kenyan maize farmers. Like almost all datasets, this one has a hierarchical structure: farmers within the same region aren’t independent. If we ignore this fact, a model could predict yields entirely from a farmer’s region and fail to find any other meaningful insights, and we might not even realize it. However, if we “overcorrected” by treating each region as completely separate, each individual analysis could be underpowered. Enter the hero of our story: Bayesian hierarchical modeling. Using a practical example in PyMC3, we’ll follow this hero as they identify and overcome the challenges of clustered datasets.
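To make the trade-off in the abstract concrete, here is a minimal sketch of the three options it describes: ignoring regions (complete pooling), treating each region separately (no pooling), and the hierarchical compromise (partial pooling). It is not the PyMC3 example from the talk. It swaps in the closed-form shrinkage estimate of a normal-normal model with known variances, plugging in the grand mean for the population mean. The yields, the region names, and the values of `SIGMA` and `TAU` are all made-up assumptions for illustration, not Apollo data.

```python
from statistics import mean

# Hypothetical toy yields (arbitrary units) per region; not real Apollo data.
yields_by_region = {
    "region_a": [2.1, 2.4, 1.9, 2.2, 2.0, 2.3],
    "region_b": [3.0, 3.4],
    "region_c": [1.2],
}

SIGMA = 0.5  # assumed within-region standard deviation (farmer-level noise)
TAU = 0.5    # assumed between-region standard deviation


def pooled_estimates(data, sigma=SIGMA, tau=TAU):
    """Return complete-, no-, and partial-pooling estimates per region."""
    all_obs = [y for ys in data.values() for y in ys]
    grand_mean = mean(all_obs)  # complete pooling: one mean for everyone

    out = {}
    for region, ys in data.items():
        n = len(ys)
        region_mean = mean(ys)  # no pooling: each region on its own
        # Precision weight: regions with more farmers trust their own mean more.
        w = (n / sigma**2) / (n / sigma**2 + 1 / tau**2)
        out[region] = {
            "complete_pooling": grand_mean,
            "no_pooling": region_mean,
            # Partial pooling: region mean shrunk toward the grand mean.
            "partial_pooling": w * region_mean + (1 - w) * grand_mean,
            "weight_on_region": w,
        }
    return out


estimates = pooled_estimates(yields_by_region)
for region, est in estimates.items():
    print(region, {k: round(v, 2) for k, v in est.items()})
```

In this sketch the region with a single farmer is pulled halfway toward the overall mean, while the region with six farmers keeps most of its own estimate. A full hierarchical model would go further and infer the within-region and between-region variances from the data instead of fixing them.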