Introduction Post
Hello, there! I'm Princewill Monday, and I'm so excited to be here. I'm new here and I wish to... Tagged with webdev, beginners, programming, javascript.
Square Every Digit
Instructions: Welcome. In this kata, you are asked to square every digit of a number and concatenate... Tagged with javascript, challenge, codewars, 7kyu.
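A minimal Python sketch of the task as the teaser describes it, assuming the input is a non-negative integer. The function name `square_digits` is an illustrative choice, not something the kata text above specifies.

```python
def square_digits(n: int) -> int:
    """Square each decimal digit of n and concatenate the results."""
    # Assumption: n is a non-negative integer.
    squared = (str(int(digit) ** 2) for digit in str(n))
    return int("".join(squared))
```

For example, 9119 becomes 811181, because 9 squared is 81 and 1 squared is 1.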
How to Become a 9k Developer
In software development, reaching "9k" isn't just about being good. It's about being world-class... Tagged with webdev, programming, ai.
Day 1144 : Chill
liner notes: Professional : Pretty chill day. I think a bunch of folks took the day off so not a... Tagged with hiphop, code, coding, lifelongdev.
Trump goes to war with the Fed
Donald Trump's simmering discontent with the US Federal Reserve boiled over this week, with the president threatening to take the unprecedented step of
The Crunchbase Tech Layoffs Tracker
Tech layoffs: At least 95,000 workers at U.S.-based tech companies were laid off in mass job cuts in 2024 and the cuts have continued into 2025.
Topological Abelian Groups
This post will venture further into abstract mathematics than most of my posts. If this isn't what you're looking for, you might try browsing here for more concrete articles. Incid...
AutoRABIT Launches Guard
Traditional security tools don't account for Salesforce's unique architecture. Guard provides comprehensive security posture management specifically designed for these complex envir...
Millionth powers
Find a number whose millionth power begins with four distinct digits. Puzzle question by Richard Stanley. Stanley's solution explained, and a smaller solution.
Repost: R 4.5.0 and Bioconductor 3.21
Reposted from the original at https://blog.stephenturner.us/p/r-450-bioconductor-321. Faster package installation, import only the functions you want with use(), built-in Palmer pen...
Introducing Gemini 2.5 Flash
Gemini 2.5 Flash is now in preview, offering improved reasoning while prioritizing speed and cost efficiency for developers.
#465 – Robert Rodriguez: Sin City, Desperado, El Mariachi, Alita, and Filmmaking
Robert Rodriguez is a legendary filmmaker and creator of Sin City, El Mariachi, Desperado, Spy Kids, Machete, From Dusk Till Dawn, Alita: Battle Angel, The Faculty, and his newest ...
Can Quantum Gravity Be Created in the Lab?
Quantum gravity could help physicists unite the currently incompatible worlds of quantum mechanics and gravity. In this episode, Monika Schleier-Smith discusses her pioneering expe...
Eye On AI: Startups See More And Bigger Deals
Dealmaking involving venture-backed startups is up slightly year to year overall, and part of that uptick is thanks to artificial intelligence, with 81 deals involving AI startups ...
Mr. Bell and Bell numbers
How Eric Temple Bell discovered what we now call the Bell numbers. Bell's triangle, analogous to Pascal's triangle, was discovered the year Bell was born.
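As an illustration of the Bell triangle mentioned above, here is a short Python sketch, assuming the usual construction: each row starts with the last entry of the previous row, and each further entry is the sum of its left neighbour and the entry above that neighbour. The function name `bell_numbers` is hypothetical.

```python
def bell_numbers(count: int) -> list[int]:
    """Return the first `count` Bell numbers (B_0, B_1, ...) via the Bell triangle."""
    if count <= 0:
        return []
    row = [1]        # first row of the triangle
    bells = [1]      # B_0 is the first entry of the first row
    for _ in range(count - 1):
        new_row = [row[-1]]              # start with the last entry of the previous row
        for above in row:
            new_row.append(new_row[-1] + above)  # left neighbour + entry above it
        row = new_row
        bells.append(row[0])             # the leftmost entry of each row is a Bell number
    return bells
```

Calling `bell_numbers(6)` gives `[1, 1, 2, 5, 15, 52]`.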
Host concurrent LLMs with LoRAX
In this post, we explore how Low-Rank Adaptation (LoRA) can be used to address these challenges effectively. Specifically, we discuss using LoRA serving with LoRA eXchange (LoRAX) ...
Welcome, Ashley Tarasiewicz
Hello pharmaverse community! I'm thrilled to announce that Ashley Tarasiewicz will be taking over the Atorus seat from me on the Pharmaverse council. Ashley has been a contributor ...
Running R on Windows on ARM on GitHub Actions
Introduction GitHub has recently announced that Windows ARM64 runners are now available under the windows-11-arm label. I help maintain an R package, TwoSampleMR, which has quite a...
How to Become a Prompt Engineer
Organizations of all sizes are searching for qualified prompt engineers. Is this exploding career choice the one for you? Constantly Updated – The download
1000 most common words
If you're reading text that consists of the 1000 most common words, how much text would you expect to read before seeing every word at least once?
Sport data from Runalyze
Runners – CC-BY by ~jar{} I already explored how to get your activities from Strava. Maybe you use Runalyze instead? In that case you can do some web scraping to export your data. ...
Choroplethr v4.0.0 is now on CRAN
choroplethr version 4.0.0 is now on CRAN. You can install it like this: With this version, I have transferred the maintenance of choroplethr to Zhaochen He, an economics professor ...
Train Predict Evaluate Basics
Goal Our goal for this exercise sheet is to learn the basics of mlr3 for supervised learning by training a first simple model on training data and by evaluating its performance on ...
Getting My Feet Wet With `Plumber` and JavaScript
Tried out plumber and a bit of JavaScript to build a simple local API for logging migraine events 🧠💻. Just a quick tap on my phone now records the time to a CSV, pretty handy! 📱✅
Mo...
Integration testing in Epiverse-TRACE
In Epiverse-TRACE we develop a suite of R packages that tackle predictable tasks in infectious disease outbreak response. One of the guiding software design principles we have work...
R Version 4.5.0 is Out!
Some windows. Olympus XA, Portra 800. Photo by Nicholas Tierney The new R version 4.5.0 is out, and you should get it! I've read through the NEWS file, which details every change -...
AI-generated code comes with security risks
More and more students are using AI-generated code in their studies, without necessarily understanding the security risks that this entails. This has consequences for users such as...
Erdős-Mordell triangle theorem
An inequality regarding triangles that was conjectured by Paul Erdős and first proved by Louis Mordell. A weighted generalization of the same theorem.
The apply() Family of Functions in R
The apply() family of functions in R is a powerful tool for applying operations to data structures like matrices, data frames, and lists. These functions help you write concise and...
Logarithmic sawtooth
Here's a strange integral I ran across recently [1]. It's a little surprising that the integral even exists, and more surprising that its value has a simple expression. Here's a pl...
When Benford's law is exact
Sometimes Benford's law seems too good to be true, such as predicting how often the leading digit of 2^n will be a 1.
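The claim about powers of 2 is easy to check numerically. The sketch below is an illustration, not the post's own code: it counts how often 2^n starts with a 1 and compares that with the Benford prediction log10(2). The function name and the exponent range of 1 to 1000 are arbitrary choices.

```python
import math

def leading_one_fraction(max_exponent: int) -> float:
    """Fraction of 2**n, for n = 1..max_exponent, whose leading decimal digit is 1."""
    count = sum(1 for n in range(1, max_exponent + 1) if str(2 ** n)[0] == "1")
    return count / max_exponent

observed = leading_one_fraction(1000)   # exact count using Python's big integers
predicted = math.log10(2)               # Benford's law: P(leading digit = 1) = log10(2)
print(f"observed={observed:.4f} predicted={predicted:.4f}")
```

The observed fraction agrees with log10(2) to within one part in a thousand over this range.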
Exploring a 3-D Synthetic Dataset
Exploring the HistData package Over on BlueSky, I have been working through a few challenges. For the months of February and March, I participated in the DuBois Challenge, where yo...
[R] data.table's frank()
Zhenguo Zhang's Blog /2025/04/12/r-data-table-s-frank/ - knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE) library(knitr) library(data.table) One can use data.ta...
Lots of flat sides
Today's exponential sum has an unusual number of sides parallel to the coordinate axes.
Hopping gives this tiny robot a leg up
A hopping, insect-sized robot can jump over gaps or obstacles, traverse rough, slippery, or slanted surfaces, and perform aerial acrobatic maneuvers, while using a fraction of the ...
Research Focus: Week of April 7, 2025
In this issue: We introduce a new dataset designed to assist renewable energy infrastructure planners, a new method for denoising MRI imagery, and an AI tool for analyzing distant ...
The enterprise path to agentic AI
Feeling the pressure to adopt agentic AI? Learn how to scale safely through the evolving stages and avoid costly mistakes along the way.
Top 100 radio
Suppose you are listening to a radio station that plays the top 100 songs in some genre. How long until you've heard all 100 songs? Assume Zipf's law.
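One way to get a feel for the question is a small Monte Carlo simulation. This sketch assumes the song of rank k is played with probability proportional to 1/k (the simplest form of Zipf's law) and that plays are independent. The function names, trial count, and seed are illustrative, and this is not necessarily how the post itself approaches the problem.

```python
import random
from itertools import accumulate

def plays_until_all_heard(n_songs: int, rng: random.Random) -> int:
    """Simulate one listener; return how many plays it takes to hear every song."""
    # Assumption: the song of rank k has weight 1/k (Zipf with exponent 1).
    cum_weights = list(accumulate(1.0 / rank for rank in range(1, n_songs + 1)))
    songs = range(n_songs)
    heard = set()
    plays = 0
    while len(heard) < n_songs:
        heard.add(rng.choices(songs, cum_weights=cum_weights)[0])
        plays += 1
    return plays

def mean_plays(n_songs: int = 100, trials: int = 200, seed: int = 0) -> float:
    """Average number of plays over several simulated listeners."""
    rng = random.Random(seed)
    total = sum(plays_until_all_heard(n_songs, rng) for _ in range(trials))
    return total / trials
```

For comparison, if all 100 songs were equally likely, the classical coupon-collector result gives an expected 100 × (1 + 1/2 + ... + 1/100) plays, roughly 519. The Zipf-weighted average comes out larger, because the rarely played songs take longer to turn up.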
Linear KdV dispersion
The linear KdV equation provides an interesting contrast to the full (nonlinear) KdV equation. It is also useful in applications in its own right.
Fundamental solution
Fundamental solutions to PDEs. An aside on rigor, and an example of a logarithm yearning to be free.
No matter how dubious
Using dubious methods to find a solution is fine, and often necessary, as long as you can verify by more reliable methods that you have indeed found a solution.
Superhyperbola
The superhyperbola is the lesser-known sister of the superellipse.
The glass disk game
Removing all the mathematical content of the five lemma and friends, visualizing the lemmas as a game with glass disks.
Research Focus: Week of March 24, 2025
In this issue, we examine a new conversation segmentation method that delivers more coherent and personalized agent conversation, and we review efforts to improve MLLMs' understand...
MIT Maritime Consortium sets sail
MIT Maritime Consortium brings together MIT and maritime industry leaders to develop new technologies for all aspects of the global shipping industry, from nuclear propulsion to on...
Moving To Substack
I'm freezing this blog and starting to post on my Substack instead. The authoring experience is much more convenient for me there. Please follow me there, and check out The Illustr...
Voyager's slingshot maneuvers
The discovery of gravitational assist maneuvers made the Voyager tours possible. Graph of how Voyager 2 gained, or lost, speed passing each planet.
ChatGPT: The Great Equalizer
New research from Stanford University reveals that ChatGPT and similar AI writers are surprisingly popular among the less-educated.
The reality of generative AI in the clinic
In Episode 1 of the podcast "The AI Revolution in Medicine, Revisited," host Peter Lee, president of Microsoft Research, & guests Dr. Christopher Longhurst & Dr. Sara Murray discus...
Building Resilient Data Infrastructure
We know it's possible to build new data infrastructure, but we urgently need to figure out how to do so sustainably and ethically. Join us at CNG Conference and at Fed Geo Day as w...
At the core of problem-solving
Stuart Levine, director of MIT's BioMicro Center, keeps departmental researchers throughout the Institute at the forefront of systems biology.
Staying Sane in an Insane World
Staying sane is a daily practice. Discover seven practical strategies to stay sane, reclaim your clarity, and strengthen your resilience amid uncertainty.
Introducing Gemma 3
Today, we're introducing Gemma 3, our most capable, portable and responsible open model yet.
When to Use GRUs Over LSTMs?
Choosing between LSTMs and GRUs for your NLP or time series project? This guide breaks down their differences in efficiency, memory & more.
Artificial muscles for tremor suppression
Scientists have developed a biorobotic arm that can mirror human tremors, such as those experienced by individuals that live with Parkinson's disease. Artificial muscles on either ...
A leg up for STEM majors
MIT undergrads Erin Hovendon and Kevin Guo share an understanding that their public policy and political science minors provide crucial perspectives on their research on the enviro...
What misbehaving AI can cost you
AI security gaps drive up costs and risk. Learn how AI governance and risk-proofing strategies help control spend and strengthen security.
Puzzling out climate change
MIT Accenture Fellow Shreyaa Raghavan explores ways to reduce transportation emissions with the help of machine learning.
Too Good to Be True?
Increasing numbers of organizations are banning the AI chatbot DeepSeek -- China's head-turning answer to ChatGPT.
Creating a common language
MIT Associate Professor Kaiming He discusses the role of AI in interdisciplinary collaborations, connecting basic science to artificial intelligence, machine learning, and neural n...
Injecting Domain Expertise Into Your AI System
Domain experts can help you connect the dots between the technicalities of an AI system and its real-life usage and value. Thus, they should be key stakeholders and co-creators of ...
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters
Dylan Patel is the founder of SemiAnalysis, a research & analysis company specializing in semiconductors, GPUs, CPUs, and AI hardware. Nathan Lambert is a research scientist at the...
Some Lessons on Reviews and Rebuttals
Writing and responding to reviews is the bread and butter of any academic and especially in AI research, PhD students are confronted with both rather early compared to other displi...
Minecraft with object impermanence
I generally am uninterested in generative AI that's too close to the real thing. But every once in a while there's a modern AI thing that's so glitchy and broken that it's strangel...
What is Mixture of Experts?
Explore the performance differences in Mixture of Experts (MoE) models and how they impact output predictions across various tasks. Read Now!
AI Safety Index Released
The Future of Life Institute has released its first safety scorecard of leading AI companies, finding many are not addressing safety concerns while some have taken small initial st...
Trip Report - PyData Global 2024
I attended PyData Global 2024 last week. It's a virtual conference, so I was able to attend it from the comfort of my home, although presenta...
Reward Hacking in Reinforcement Learning
Reward hacking occurs when a reinforcement learning (RL) agent exploits flaws or ambiguities in the reward function to achieve high rewards, without genuinely learning or completin...
Foundations of diffusion networks
Diffusion networks As there's a lot of recent developments around image generation and diffusion models in general, I took a deep dive in the fundamentals of...
Build Your Own OCR Engine for Wingdings
Discover how OCR technology transforms text recognition, from handwritten notes to custom fonts like Wingdings. Learn about cutting-edge models and create tailored OCR solutions fo...
Max Tegmark on AGI Manhattan Project
A new report for Congress recommends that the US start a "Manhattan Project" to build Artificial General Intelligence. To do so would be a suicide race.
The Complete Guide to NetSuite SuiteScript
NetSuite's flexibility comes from its powerful customization tools, and SuiteScript is at the heart of this. If you're looking to customize your NetSuite instance beyond just the p...
Thinking About Research Ideas vs. Technology
In this article, I want to share some thoughts on the difference between research ideas and technology, particularly in machine learning. This distinction is one I have been contemplatin...
Introducing mall for R...and Python
We are proud to introduce the {mall} package. With {mall}, you can use a local LLM to run NLP operations across a data frame. (sentiment, summarization, translation, etc). {mall}...
#450 – Bernie Sanders Interview
Bernie Sanders is a US Senator from Vermont and a two-time presidential candidate. Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep450-sc See be...
Paris AI Safety Breakfast #3: Yoshua Bengio
The third of our 'AI Safety Breakfasts' event series, featuring Yoshua Bengio on the evolution of AI capabilities, loss-of-control scenarios, and proactive vs reactive defense.
Botober 2024
Back by popular demand, here are some AI-generated drawing prompts to use in this, the spooky month of October!
Longtime AI Weirdness readers may recognize some of these. That's b...
Panda vs. Eagle
FLI's Director of Policy on why the U.S. national interest is much better served by a cooperative than an adversarial strategy towards China.
Join the Most-Awaited Chatbot Conference
The conference features a range of events designed to enrich attendees' understanding of the chatbot industry: With upcoming events already lined up, now is the time to get involve...
Demystifying AI in the Water Industry
Water industry professionals explored the intersection of artificial intelligence (AI) and machine learning (ML) during a pre-conference workshop in Ocean City, Maryland yesterday,...
Can AI agents learn to be good?
AI agents are different from AI assistants because they can initiate actions independently. Here we discuss the safety concerns involved with AI agents and what we are doing to mit...
Understanding the cp Command in Bash
The cp command in Bash is used to copy files and directories from one location to another. "Understanding the cp Command in Bash" is published by Javascript Jeep🚙💨 in Becoming Hum...
Save up to $400 on Your Conference Tickets!
Whether you're a returning attendee or new to our community, this is the perfect chance to experience the future of AI and chatbot technology at a discounted rate. Simply go to the...
Experiments with Prompt Compression
I recently came across Prompt Compression (in the context of Prompt Engineering on Large Language Models) on this short course on Prompt Com...
Extrinsic Hallucinations in LLMs
Hallucination in large language models usually refers to the model generating unfaithful, fabricated, inconsistent, or nonsensical content. As a term, hallucination has been somewh...
Book Report: Pandas Workout
Unlike many Data Scientists, I didn't automatically reach for Pandas when I needed to analyze data. I came upon this discipline (Data Scien...
An exercise in frustration
There's an anonymous facebook posting that's been making the rounds, in which a studio art director tried to hire AI prompters to make art, only to discover that they were complete...
Introducing Keras 3 for R
We are thrilled to introduce {keras3}, the next version of the Keras R package. {keras3} is a ground-up rebuild of {keras}, maintaining the beloved features of the original while r...
Finetuning RAGAS Metrics using DSPy
Last month, I decided to sign-up for the Google AI Hackathon, where Google provided access to their Gemini Large Language Model (LLM) and ...
FAQ for our Monte Carlo Conformal Prediction
Over the past months, I have given several talks about Monte Carlo conformal prediction and the problem of calibrating with uncertain ground truth, for example, stemming from annot...
KGC/HCLS 2024 Trip Report
I was at KGC (Knowledge Graph Conference) 2024, which is happening May 6-10 at Cornell Tech. I was presenting (virtually) at their Health ...
Hidden 3D Pictures
Do you know those autostereograms with the hidden 3D pictures? Images like the Magic Eye pictures from the 1990s that look like noisy repeating patterns until you defocus your eyes...
Hidden sheep
AI Weirdness: the strange side of machine learning
On NeurIPS' High School Paper Track
The decision to have a separate High School Project Track at NeurIPS 2024 has sparked quite some controversy, with many prominent AI researchers debating pros and cons and personal...
Diffusion Models for Video Generation
Diffusion models have demonstrated strong results on image synthesis in past years. Now the research community has started working on a harder task: using it for video generation. T...
Chat with AI in RStudio
Interact with GitHub Copilot and OpenAI's GPT (ChatGPT) models directly in RStudio. The `chattr` Shiny add-in makes it easy for you to interact with these and other Large Language...
Shaped like information
Hey look, it's a guide to basic shapes!
Not only does it have the basic shapes like circle, tringle, hectanbie, and sqale, it also has some of the more advanced shapes like renstq...
Bonus: More shape shaped shapes
The image I shared in my main post isn't one of the more incorrect examples of DALL-E3 generated guides - it's actually one of the more correct ones.
Here's another generated imag...
Learn your farm animals with AI!
Hey kids! What sound does a woolly horse-sheep make?
The image above is what you get when you ask dalle-3 (via chatgpt) for some basic educational material: "Please generate an il...
It's Ethics. Not Tech Ethics.
There is no such thing as tech ethics, venture capital ethics, startup ethics, blockchain ethics, IPO ethics etc. There is only ethics and bullshit ethics.
DALL-E3 generates candy hearts
I've experimented a couple of times with generating candy heart messages using various kinds of machine learning algorithms. Originally, short messages were just about all the orig...
Q&A with Mala Kumar, Our Newest Board Member
We are pleased to welcome Mala Kumar to our Board of Directors. In this Q&A profile, we talk with Mala about her career journey, joining our Board, and the intersections between te...
Thinking about High-Quality Human Data
[Special thank you to Ian Kivlichan for many useful pointers (E.g. the 100+ year old Nature paper "Vox populi") and nice feedback. 🙏 ]
High-quality data is the fuel for modern data...
Chocolates, labeled
So much of current AI-generated stuff is derivative sludge that I'm enjoying the pockets of weirdness where I find them. One of my favorite things right now: DALL-E3's attempts to ...
Unicorns, Show Ponies, and Gazelles
A few four-legged animal metaphors that explain how we've been building global data infrastructure to date and how we might do better in the future.
PyData Global 2023: Trip Report
I had the opportunity to present at PyData Global this year. It is a virtual conference that ran over 3 days in multiple tracks from Decemb...
Adversarial Attacks on LLMs
The use of large language models in the real world has been strongly accelerated by the launch of ChatGPT. We (including my team at OpenAI, shoutout to them) have invested a lot of effo...
Hugging Face Integrations
Hugging Face rapidly became a very popular platform to build, share and collaborate on deep learning applications. We have worked on integrating the torch for R ecosystem with Hug...
STAC API 1.0.0 Released
The STAC API specification reached its 1.0.0 version. With this release, the spec is fully aligned with the OGC API - Features Version 1.0 standard.
LLM Powered Autonomous Agents
Building agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concept demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspi...
Understanding LoRA with a minimal example
LoRA (Low Rank Adaptation) is a new technique for fine-tuning deep learning models that works by reducing the number of trainable parameters and enables efficient task switching. I...
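To make the reduction in trainable parameters concrete, here is a back-of-the-envelope Python sketch. It assumes the standard LoRA formulation, in which a frozen weight matrix of shape d_out × d_in is adapted by adding the product of two trainable factors of shapes d_out × r and r × d_in. The function name and the example sizes are illustrative and are not taken from the post.

```python
def lora_parameter_counts(d_out: int, d_in: int, rank: int) -> tuple[int, int]:
    """Return (full fine-tuning parameters, LoRA trainable parameters) for one weight matrix."""
    full = d_out * d_in                 # every entry of the weight matrix is trainable
    low_rank = rank * (d_out + d_in)    # entries of the two low-rank factors only
    return full, low_rank

full, low_rank = lora_parameter_counts(4096, 4096, 8)
print(f"full: {full:,}  LoRA: {low_rank:,}  ratio: {low_rank / full:.2%}")
```

With these hypothetical sizes the adapter trains 65,536 parameters instead of 16,777,216, about 0.39% of the full matrix. Because the frozen matrix is shared, switching tasks only requires swapping the small factors.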
GPT-2 from scratch with torch
Implementing a language model from scratch is, arguably, the best way to develop an accurate idea of how its engine works. Here, we use torch to code GPT-2, the immediate successor...
Future Shock Keynote Rap
Baba Brinkman performs an impromptu Future Shock Rap based on Nikola Danaylov's titular keynote speech for The Event in Gatineau, QC.
My Faculty Application Experience
I spent roughly a year preparing, and then interviewing, for tenure-track faculty positions. My job search is finally done, and I am joining the University of ...
safetensors 0.1.0
Announcing safetensors, a new R package allowing for reading and writing files in the safetensors format.
We may finally crack Maths. But should we?
Automating mathematical theorem proving has been a long standing goal of artificial intelligence and indeed computer science. It's one of the areas I became very interested in rece...
torch 0.11.0
torch v0.11.0 is now on CRAN. This release features much-enhanced support for executing JIT operations. We also amended loading of model parameters, and added a few quality-of-life...
Say Hello to Radiant Earth
Announcing some changes to the Radiant Earth brand and our plans to serve our community moving forward.
Generative AI and AI Product Moats
Here are eight observations I've shared recently on the Cohere blog and videos that go over them:
Article: What's the big deal with Generative AI? Is it the future or th...
Group-equivariant neural networks with escnn
Escnn, built on PyTorch, is a library that, in the spirit of Geometric Deep Learning, provides a high-level interface to designing and training group-equivariant neural networks. T...
luz 0.4.0
luz v0.4.0 is now on CRAN. This release adds support for training models on ARM Mac GPUs, reduces the overhead of using luz, and makes it easier to checkpoint and resume failed run...
torch 0.10.0
torch v0.10.0 is now on CRAN. This version upgraded the underlying LibTorch to 1.13.1, and added support for Automatic Mixed Precision. As an experimental feature, we now also sup...
De-noising Diffusion with torch
Currently, in generative deep learning, no other approach seems to outperform the family of diffusion models. Would you like to try for yourself? If so, our torch implementation of...
Prompt Engineering
Prompt Engineering, also known as In-Context Prompting, refers to methods for how to communicate with LLM to steer its behavior for desired outcomes without updating the model weig...
The Transformer Family Version 2.0
Many new Transformer architecture improvements have been proposed since my last post on "The Transformer Family" about three years ago. Here I did a big refactoring and enrichment ...
AO, NAO, ENSO: A wavelet analysis example
El Niño-Southern Oscillation (ENSO), North Atlantic Oscillation (NAO), and Arctic Oscillation (AO) are atmospheric phenomena of global impact that strongly affect people's lives. E...
Large Transformer Model Inference Optimization
[Updated on 2023-01-24: add a small section on Distillation.]
Large transformer models are mainstream nowadays, creating SoTA results for a variety of tasks. They are powerful but ...
Books Read in 2022
At the end of every year I have a tradition where I write summaries of the books that I read throughout the year. Unfortunately this year was exceptionally bus...
Wavelet Transform - with torch
torch does not have built-in functionality to do wavelet analysis. But we can efficiently implement what we need, making use of the Fast Fourier Transform (FFT). This post is a ver...
The Illustrated Stable Diffusion
Translations: Chinese, Vietnamese.
(V2 Nov 2022: Updated images for more precise description of forward diffusion. A few more images in this version)
AI image generation is the ...
Some Math behind Neural Tangent Kernel
Neural networks are well known to be over-parameterized and can often easily fit data with near-zero training loss with decent generalization performance on test dataset. Although ...
A Plea to End Harassment
Scott Aaronson is a professor of computer science at UT Austin, where his research area is in theoretical computer science. However, he may be more well known ...
Generalized Visual Language Models
Processing images to generate text, such as image captioning and visual question-answering, has been studied for years. Traditionally such systems rely on an object detection netwo...
The Illustrated Retrieval Transformer
Discussion: Discussion Thread for comments, corrections, or any feedback.
Translations: Korean, Russian
Summary: The latest batch of language models can be much smaller yet ac...
Books Read in 2021
At the end of every year I have a tradition where I write summaries of the books that I read throughout the year. Here's the following post with the rough set ...
How to Train Really Large Models on Many GPUs?
[Updated on 2022-03-13: add expert choice routing.]
[Updated on 2022-06-10]: Greg and I wrote a shortened and upgraded version of this post, published on OpenAI Blog: "Techniques fo...
What are Diffusion Models?
[Updated on 2021-09-19: Highly recommend this blog post on score-based generative modeling by Yang Song (author of several key papers in the references)].
[Updated on 2022-08-27: ...
Contrastive Representation Learning
The goal of contrastive representation learning is to learn such an embedding space in which similar sample pairs stay close to each other while dissimilar ones are far apart. Con...
Explainable AI Cheat Sheet
Introducing the Explainable AI Cheat Sheet, your high-level guide to the set of tools and methods that helps humans understand AI/ML models and their predictions.
I introduce t...
On Information Theoretic Bounds for SGD
A few days ago we had a talk by Gergely Neu, who presented his recent work:
* Gergely Neu Information-Theoretic Generalization Bounds for Stochastic
Gradient Descent [https://ar...
Weight Banding
Weights in the final layer of common visual models appear as horizontal bands. We investigate how and why.
Branch Specialization
When a neural network layer is divided into multiple branches, neurons self-organize into coherent groupings.
Reducing Toxicity in Language Models
Large pretrained language models are trained over a sizable collection of online data. They unavoidably acquire certain toxic behavior and biases from the Internet. Pretrained lan...
Visualizing Weights
We present techniques for visualizing, contextualizing, and understanding neural network weights.
Curve Circuits
Reverse engineering the curve detection algorithm from InceptionV1 and reimplementing it from scratch.
Controllable Neural Text Generation
[Updated on 2021-02-01: Updated to version 2.0 with several work added and many typos fixed.]
[Updated on 2021-05-26: Add P-tuning and Prompt Tuning in the "prompt design" sectio...
Newsletter #087
Weekly newsletter dedicated to sharing resources for building and operating production machine learning systems.
Newsletter #086
Weekly newsletter dedicated to sharing resources for building and operating production machine learning systems.
Some Intuition on the Neural Tangent Kernel
Neural tangent kernels are a useful tool for understanding neural network
training and implicit regularization in gradient descent. But it's not the
easiest concept to wrap your he...
Understanding RL Vision
With diverse environments, we can analyze, diagnose and edit deep reinforcement learning models using attribution.
Newsletter #085
Weekly newsletter dedicated to sharing resources for building and operating production machine learning systems.
Notes on Causally Correct Partial Models
I recently encountered this cool paper in a reading group presentation:
* Rezende et al (2020) Rezende Causally Correct Partial Models for
Reinforcement Learning [https://arxi...
Newsletter #084
Weekly newsletter dedicated to sharing resources for building and operating production machine learning systems.
Newsletter #083
Weekly newsletter dedicated to sharing resources for building and operating production machine learning systems.
Newsletter #082
Weekly newsletter dedicated to sharing resources for building and operating production machine learning systems.
So long, and thanks for all the fish
All good things must come to an end, including this podcast. This is the last episode we plan to release, and it doesn't cover data science; it's mostly reminiscing, thanking our wo...
A Data Science Take on Open Policing Data
A few weeks ago, we put out a call for data scientists interested in issues of race and racism, or people studying how those topics can be studied with data science methods, should...
The Data Science Open Source Ecosystem
Open source software is ubiquitous throughout data science, and enables the work of nearly every data scientist in some way or another. Open source projects, however, are dispropor...
Rock the ROC Curve
This is a re-release of an episode that first ran on January 29, 2017. This week: everybody's favorite WWII-era classifier metric! But it's not just for winning wars, it's...
Curve Detectors
Part one of a three part deep dive into the curve neuron family.
Criminology and data science
This episode features Zach Drake, a working data scientist and PhD candidate in the Criminology, Law and Society program at George Mason University. Zach specializes in bringing da...
Convolutional neural networks
This is a re-release of an episode that originally aired on April 1, 2018. If you've done image recognition or computer vision tasks with a neural network, you've probably used a...
Stein's Paradox
This is a re-release of an episode that was originally released on February 26, 2017. When you're estimating something about some object that's a member of a larger group of simi...
Causal Trees
What do you get when you combine the causal inference needs of econometrics with the data-driven methodology of machine learning? Usually these two don't go well together (deriving...
The Grammar of Graphics
You may not realize it consciously, but beautiful visualizations have rules. The rules are often implicit and manifest themselves as expectations about how the data is summarized, p...
Gaussian Processes
It's pretty common to fit a function to a dataset when you're a data scientist. But in many cases, it's not clear what kind of function might be most appropriate: linear? quadratic?...
Putting machine learning into a database
Most data scientists bounce back and forth regularly between doing analysis in databases using SQL and building and deploying machine learning pipelines in R or python. But if we t...
The work-from-home episode
Many of us have the privilege of working from home right now, in an effort to keep ourselves and our family safe and slow the transmission of covid-19. But working from home is an ...
Software 2.0
I sometimes see people refer to neural networks as just "another tool in your machine learning toolbox". They have some pros and cons, they work here or there, and sometimes you ca...
AlphaGo, in context
Update Oct 18, 2017: AlphaGo Zero was announced. This post refers to the previous version. 95% of it still applies. I had a chance to talk to several people about the recent AlphaG...
ICML accepted papers institution stats
The accepted papers at ICML have been published. ICML is a top Machine Learning conference, and one of the most relevant to Deep Learning, although NIPS has a longer DL tradition a...
A Peek at Trends in Machine Learning
Have you looked at Google Trends? It's pretty cool – you enter some keywords and see how Google Searches of that term vary through time. I thought – hey, I happen to have this arxi...
ICLR 2017 vs arxiv-sanity
I thought it would be fun to cross-reference the ICLR 2017 (a popular Deep Learning conference) decisions (which fall into 4 categories: oral, poster, workshop, reject) with the nu...
Virtual Reality: still not quite there, again.
The first time I tried out Virtual Reality was a while ago – somewhere in the late 1990's. I was quite young so my memory is a bit hazy, but I remember a research-lab-like room ful...
Yes you should understand backprop
When we offered CS231n (Deep Learning class) at Stanford, we intentionally designed the programming assignments to include explicit calculations involved in backpropagation on the ...
CS183c Assignment #3
The last few weeks we heard from several excellent guests, including Selina Tobaccowala from Survey Monkey, Patrick Collison from Stripe, Nirav Tolia from Nextdoor, Shishir Mehrotr...