One of the best features of Random Forests is that it has built-in Feature Selection. Explicability is one of the things we often lose when we go from traditional statistics to Machine Learning, but Random
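For a rough idea of what that looks like in practice, here is a minimal sketch using scikit-learn's `feature_importances_` attribute. The synthetic dataset and every parameter value are assumptions chosen for illustration, not taken from the original post. The scores are impurity-based and give a ranking you can use to pick features, not a final verdict on them.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data (an assumption for illustration): 8 features, 3 informative.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=3, n_redundant=0, random_state=0
)

forest = RandomForestClassifier(n_estimators=50, random_state=0)
forest.fit(X, y)

# One impurity-based score per feature; the scores are normalized to sum to 1.
importances = forest.feature_importances_

# Rank features from most to least important.
ranked = sorted(enumerate(importances), key=lambda pair: pair[1], reverse=True)
for index, score in ranked:
    print(f"feature {index}: {score:.3f}")
```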
Code Snippet Corner
Tune the min_samples_leaf parameter for a Random Forests classifier in scikit-learn in Python
Code snippet corner is back! Tune the max_depth parameter for a Random Forests classifier in scikit-learn in Python
Tune the n_estimators parameter for a Random Forests classifier in scikit-learn in Python
Ah, hyperparameter tuning. Time & compute-intensive. Frequently containing weird non-linearities in how changing a parameter changes the score and/or the time it takes to train the model.
RandomizedSearchCV goes noticeably faster than a full GridSearchCV but it still takes
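To make the comparison concrete, here is a minimal sketch of a randomized search over the three parameters mentioned above. The dataset, the candidate values, and `n_iter=5` are all assumptions for illustration. The grid below has 3 × 4 × 4 = 48 combinations, which a full GridSearchCV would fit one by one, while the randomized search samples only 5 of them.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Small synthetic dataset (an assumption) so the search finishes quickly.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Candidate values are illustrative guesses, not recommendations.
param_distributions = {
    "n_estimators": [10, 25, 50],
    "max_depth": [2, 4, 8, None],
    "min_samples_leaf": [1, 2, 5, 10],
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,        # sample 5 of the 48 possible combinations
    cv=3,            # 3-fold cross-validation per sampled combination
    random_state=0,  # make the sampling reproducible
)
search.fit(X, y)

print(search.best_params_)
print(search.best_score_)
```

Raising `n_iter` trades more compute for a better chance of landing near the best combination.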
Pandas and Excel Pt. 2
What if, like during my data import task a few months back, the dates & times are in separate columns? This gives us a few new issues. Let's import that Excel file!
import pandas as pd
import xlrd
import datetime
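As a rough sketch of the separate-columns problem, the snippet below builds a small DataFrame by hand instead of reading a real Excel file. The column names `date` and `time` and the sample values are hypothetical. It assumes the date column arrived as pandas Timestamps and the time column as `datetime.time` objects, which is one common outcome of an Excel import but not the only one.

```python
import datetime

import pandas as pd

# Hypothetical stand-in for an imported sheet: dates and times in separate columns.
df = pd.DataFrame(
    {
        "date": pd.to_datetime(["2018-03-01", "2018-03-02"]),
        "time": [datetime.time(9, 30), datetime.time(17, 45, 10)],
    }
)

# Pair each date with its time and merge them into a single datetime value.
df["timestamp"] = [
    datetime.datetime.combine(d.date(), t) for d, t in zip(df["date"], df["time"])
]

print(df)
```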
Pandas & Excel, Part 1
Different file formats are different! For all kinds of reasons!
A few months back, I had to import some Excel files into a database. In this process I learned so much about the delightfully unique way Excel stores dates &
Picking Low-Hanging Fruit With Dask
Ah, laziness. You love it, I love it, everyone agrees it's just better.
Flesh-and-blood pandas are famously lazy. Pandas the package, however, uses Eager Evaluation. What's Eager Evaluation, you ask? Is Pandas really judgey, hanging out on the street corner and
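The eager-versus-lazy distinction can be sketched in plain Python, with no Dask or Pandas involved. This is only an analogy: a list comprehension stands in for eager evaluation and a generator for lazy evaluation, and `slow_square` is a hypothetical helper that records each time real work happens.

```python
calls = []


def slow_square(x):
    """Pretend-expensive function that logs every time it actually runs."""
    calls.append(x)
    return x * x


# Eager: the list comprehension does all the work immediately.
eager = [slow_square(x) for x in range(5)]
eager_calls = len(calls)

calls.clear()

# Lazy: the generator does nothing until a value is requested.
lazy = (slow_square(x) for x in range(5))
lazy_calls_before = len(calls)

first = next(lazy)  # only now is a single value computed
lazy_calls_after = len(calls)

print(eager_calls, lazy_calls_before, lazy_calls_after)
```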
Reshaping Pandas dataframes with a real-life example, and graphing it with Altair
Last few Code Snippet Corners were about using Pandas as an easy way to handle input and output between files & databases. Let's shift gears a little bit! Among other reasons, because earlier today I discovered a package that exclusively
Code Snippet Corner
This isn't really a tutorial on cron in general; people who are better at Linux have written way better ones than I could. Here's one: http://mediatemple.net/blog/news/complete-beginners-guide-cron-part-1/ This is more of a code journaling exercise for a