One of the best features of Random Forests is that they have built-in Feature Selection. Explicability is one of the things we often lose when we go from traditional statistics to Machine Learning, but Random
Tune the min_samples_leaf parameter for a Random Forest classifier in scikit-learn in Python
Code snippet corner is back! Tune the max_depth parameter for a Random Forest classifier in scikit-learn in Python
Tune the n_estimators parameter for a Random Forest classifier in scikit-learn in Python
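For a taste of what tuning these three parameters can look like, here is a minimal sketch, not the code from the posts themselves. It assumes synthetic data from `make_classification`, and the helper name `sweep_parameter` and the candidate values are made up for illustration. The idea is simply to vary one parameter at a time and compare mean cross-validated accuracy.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score


def sweep_parameter(name, values, X, y, cv=3):
    """Return {value: mean cross-validated accuracy} for one hyperparameter.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    scores = {}
    for value in values:
        # Defaults keep the sweep quick. If `name` is "n_estimators",
        # the swept value overrides the default of 25.
        params = {"n_estimators": 25, "random_state": 0, name: value}
        model = RandomForestClassifier(**params)
        scores[value] = cross_val_score(model, X, y, cv=cv).mean()
    return scores


# Synthetic data stands in for a real dataset (an assumption of this sketch).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

leaf_scores = sweep_parameter("min_samples_leaf", [1, 5, 20], X, y)
depth_scores = sweep_parameter("max_depth", [2, 5, None], X, y)
tree_scores = sweep_parameter("n_estimators", [5, 25, 50], X, y)

for label, result in [
    ("min_samples_leaf", leaf_scores),
    ("max_depth", depth_scores),
    ("n_estimators", tree_scores),
]:
    best = max(result, key=result.get)
    print(f"{label}: best value {best!r}, mean accuracy {result[best]:.3f}")
```

Sweeping one parameter at a time ignores interactions between parameters, so treat the results as a rough guide to sensible ranges rather than a final answer.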
Ah, hyperparameter tuning. Time- and compute-intensive. Frequently full of weird non-linearities in how changing a parameter affects the score and/or the time it takes to train the model.
RandomizedSearchCV goes noticeably faster than a full GridSearchCV, but it still takes
Let pandas do the heavy lifting for you when turning JSON into a DataFrame.
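As a rough illustration of that heavy lifting, here is a small sketch using `pd.json_normalize`. The records are invented for the example, and the post itself may take a different route.

```python
import pandas as pd

# Hypothetical records, shaped like a typical nested API response.
records = [
    {"id": 1, "user": {"name": "Ada", "city": "London"}, "tags": ["a", "b"]},
    {"id": 2, "user": {"name": "Grace", "city": "New York"}, "tags": []},
]

# json_normalize flattens nested dicts into dotted column names
# such as "user.name"; lists like "tags" are left as-is in one column.
df = pd.json_normalize(records)

print(df)
```

Nested dictionaries become flat columns, so the result is ready for the usual DataFrame filtering and grouping without hand-written loops over the JSON.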
(And a way to clean it up with SQLAlchemy)
Let's say you were continuing our task from last week: Taking a bunch of inconsistent Excel files and CSVs, and putting them into a database.
Let's say you've been given a new CSV that conflicts with some rows you've already
A compelling case for robot overlords.
A decade has passed since I stumbled into technical product development. Looking back, I've spent that time almost exclusively in the niche of data-driven products and engineering. While it seems obvious now, I realized in the 2000s that you could