Tools of the week:
- SuperSQLite: A feature-packed Python package for utilizing SQLite in Python, by Plasticity.
- ggstance: implements horizontal versions of common ggplot2 Geoms, Stats, and Positions.
- TFX (TensorFlow): TFX components interact with an ML Metadata backend that keeps a record of component runs, input and output artifacts, and runtime configuration.
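A quick sketch of working with SQLite from Python. This uses the standard library's sqlite3 module; SuperSQLite is positioned as a faster, feature-packed alternative with a similar interface, but the exact SuperSQLite import path and API are not shown here, so treat this as a plain-sqlite3 illustration.

```python
import sqlite3

# In-memory database, purely for illustration
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tools (name TEXT, author TEXT)")
# Parameter substitution (?) avoids SQL injection
conn.execute("INSERT INTO tools VALUES (?, ?)", ("SuperSQLite", "Plasticity"))
conn.commit()

rows = conn.execute("SELECT name, author FROM tools").fetchall()
print(rows)
conn.close()
```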
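To make the metadata idea concrete, here is an illustrative sketch (not the real ML Metadata API) that models the three things the TFX backend records: component runs, input/output artifacts, and runtime configuration. The "Trainer" component, paths, and config keys below are all hypothetical.

```python
import sqlite3

# Toy model of an ML metadata store -- NOT the actual MLMD schema or API.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE artifacts (id INTEGER PRIMARY KEY, uri TEXT, type TEXT);
CREATE TABLE runs (id INTEGER PRIMARY KEY, component TEXT, config TEXT);
CREATE TABLE events (run_id INTEGER, artifact_id INTEGER, direction TEXT);
""")

# A hypothetical 'Trainer' run that reads one artifact and writes another
db.execute("INSERT INTO artifacts VALUES (1, '/data/examples', 'Examples')")
db.execute("INSERT INTO artifacts VALUES (2, '/models/v1', 'Model')")
db.execute("INSERT INTO runs VALUES (1, 'Trainer', '{\"num_steps\": 1000}')")
db.execute("INSERT INTO events VALUES (1, 1, 'input')")
db.execute("INSERT INTO events VALUES (1, 2, 'output')")

# Lineage query: which artifacts did run 1 produce?
outputs = db.execute(
    "SELECT a.uri FROM artifacts a JOIN events e ON a.id = e.artifact_id "
    "WHERE e.run_id = 1 AND e.direction = 'output'").fetchall()
print(outputs)
```

Recording runs and artifacts this way is what enables lineage queries like "which model came from which data," which is the point of keeping a metadata backend at all.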
Reads of the week:
- Series A investing in Europe, 2014–2018. Yoram does a great job publishing his Dealroom findings. This report highlights a few interesting points: (1) no US-based firm sits in the top 20 most active lead investors; (2) the total number of investments has doubled in the last four years (though it's still a fraction of the nominal amount in the US); and (3) unsurprisingly, London is the capital for deals.
- From Forbes, covering a recent study, a troubling find: “An AI trawled 3.5M books and found fundamental differences in the written language we use to describe men and women.” The article also notes, regarding the algorithms we use day in and day out for voice, text, HR hiring, driving, and more: “If any of these patterns refer to biased language, the result will also be biased. The systems adopt, so to speak, the language that we people use, and thus, our gender stereotypes and prejudices.”
- This is a neural network, with a great visual description, especially for those visual learners like myself. What we’re looking at are a few things: (1) the input layer reads the MNIST data set, around 60,000 handwritten digits used for training plus an additional 10,000 for testing (it’s a great data set to play with for pattern recognition); (2) the feed-forward neural network itself, which currently has three convolutional layers and two dense layers. The convolutional layers condense the images and pick out features of the digits; the dense layers then classify the image. The parameters in the layers are randomly initialized, the network is run on the digits, and the resulting output is compared to the actual label. Calculus is used to update the parameters and minimize the errors, and the process repeats over the training data. Pay attention to what each layer is doing, and also to how the output layer reads the input, with its prediction confidence shown below.
- Code here
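The training loop described above (random parameters, forward pass, compare to the actual label, use calculus to update, repeat) can be sketched with a tiny NumPy network. This is a toy two-dense-layer example on made-up data, not the visualization's actual code or MNIST itself, and it skips the convolutional layers for brevity.

```python
import numpy as np

# Toy stand-in for MNIST: 200 "images" with 4 features, binary labels
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# Parameters are randomly set at the start, as the description says
W1 = rng.normal(scale=0.5, size=(4, 8))   # dense layer 1
W2 = rng.normal(scale=0.5, size=(8, 1))   # dense layer 2

lr = 0.5
losses = []
for step in range(200):
    # Forward pass: hidden layer, then a sigmoid "confidence" output
    h = np.tanh(X @ W1)
    p = 1 / (1 + np.exp(-(h @ W2)))
    # Compare the output to the actual labels (mean squared error)
    loss = np.mean((p[:, 0] - y) ** 2)
    losses.append(loss)
    # Calculus (chain rule) gives the gradients used to update parameters
    dp = 2 * (p - y[:, None]) / len(X)
    dz2 = dp * p * (1 - p)
    grad_W2 = h.T @ dz2
    dh = dz2 @ W2.T * (1 - h ** 2)
    grad_W1 = X.T @ dh
    W2 -= lr * grad_W2
    W1 -= lr * grad_W1

print(losses[0], losses[-1])  # error shrinks as training repeats
```

The loop is exactly the cycle the visualization animates: forward pass, compare, update, repeat; the loss printed at the end is smaller than at the start.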