Ascom appoints David Hale as new CEO
David Hale, an executive leader with a proven track record and over two decades of technology and industrial expertise, has been appointed by the Board of Directors of Ascom Holdin...
How to debug STM32F103 using ST-LINK?
You debug an STM32F103 with ST-LINK by doing three things: Wire ST-LINK to the chip Install drivers... Tagged with stm32f103, debug, stlink, stm32f103c8t6.
Neovim's recent updates
Neovim's recent updates indeed show the characteristics of "one stable and one new", each targeting...
Thrive Launches New Client Portal
Revamped client portal provides a holistic view of services provided by leading NextGen 3.0 Service Provider Thrive, a global technology outsourcing provider for cybersecurity, clo...
DEEPX Launches DX-H1 V-NPU
Recently recognized with a CES 2026 Innovation Award, the solution combines video decoding, AI inference, and encoding on a single chip, offering 80% hardware cost savings compared...
2025 at Google
Learn more about Google's launches, milestones and more from 2025.
Decoding omics via representation learning
A framework called AUTOENCODIX benchmarks diverse autoencoder architectures in biological molecular profiling data, enabling insights from complex, multi-layered data.
Fourier transform of a Fourier series
The previous post showed how we can take the Fourier transform of functions that don't have a Fourier transform in the classical sense. The classical definition of the Fourier tran...
Fourier transform of a flat line
What is the Fourier transform of a constant function? What does it even mean? Two hand-wavy derivations and a rigorous formulation.
What is a medoid?
In univariate data analysis, the median is often used as an alternative to the mean because the mean is sensitive to outliers in the data, whereas the median is a robust statistic.
AI Image Generation: On Genius
A new piece in The New York confirms that AI-generated writing -- along with similar AI creation tools -- is now the 'it' app.
The virtual cell
Virtual cells based on artificial intelligence models are on the horizon
Predicting RNA structures
Predicting the folded structures of RNA molecules poses greater challenges than proteins, but steady progress continues.
Uncertainty quantification for connectomics
As connectomics datasets grow in size and quantity, future reconstruction methods will have to work with minimal or no human supervision. For that, we will need methods that can qu...
What is a Pedersen commitment?
What are Pedersen commitments? How are they used? Why do they not require a trusted setup? What do they have to do with homomorphic encryption?
AIAI Toronto, 2025
Stream every session from AIAI Toronto, with sessions from OpenAI, Nvidia, BMO Financial Group, Meta and more.
Haskell IS a Great Language for Data Science
I've been learning Haskell for a few years now and I am really liking a lot of the features, not least the strong typing and functional approach. I thought it was lacking some of t...
Solving spherical triangles
Solving spherical triangles. When is a combination of sides and angles enough to uniquely specify a solution and when are there two solutions?
The Navigational Triangle
The navigational triangle has one vertex at your position, one at the North Pole, and one at the geographic position of a star. Solving this spherical triangle.
Line of position (LOP)
The line of position is a fundamental concept in celestial navigation, the circle of positions from which a star appears at the same altitude.
Embedding a Quarto presentation in a blog post
This post is a simple demo of how to embed a Quarto presentation in your website. Quarto is a pretty nice open-source scientific publishing system which enables you to make present...
We are joining OpenAI
I'm excited to share that we've entered into a definitive agreement to be acquired by OpenAI, subject to closing conditions. We are thrilled to join the OpenAI team and help their ...
The Crunchbase Tech Layoffs Tracker
Tech layoffs: At least 95,000 workers at U.S.-based tech companies were laid off in mass job cuts in 2024 and the cuts have continued into 2025.
What Are Lie Groups?
By combining the language of groups with that of geometry and linear algebra, Marius Sophus Lie created one of math's most powerful tools.
Driving American battery innovation forward
At the MIT Energy Initiative's Fall Colloquium, Kurt Kelty, vice president of battery, propulsion, and sustainability at General Motors, emphasized how affordability, accessibility...
When GPT-5 thinks like a scientist
GPT-5 is transforming research with novel insights, deep literature search, and human-AI collaboration that accelerates scientific breakthroughs.
Case Study: Loveable
Loveable, the Stockholm-based "vibe coding" platform, is demonstrating that Europe is still a prime incubator for global AI unicorns.
Gaussian Processes Again
One nice aspect of blogging about the things you've just learned is that when, inevitably, you forget those things, reading back your old posts can bring you up to speed pretty qui...
Oil spill
Oiled Bird - Black Sea Oil Spill 11/12/07 - CC BY by Igor GOLUBENKOV (Marine Photobank) Day 28 of 30DayMapChallenge: « Black » (previously). There is a global oil spill dataset fro...
Monero subaddresses
Monero subaddresses are analogous to hierarchical addresses in Bitcoin wallets. How subaddresses are generated and used.
A circle in the hyperbolic plane
If you draw a circle using the hyperbolic metric, then look at the curve from a Euclidean perspective, it's still a circle, but the radius and center change.
The Ultimate Black Friday Deal Is Here
Black Friday is leveling up. Get ready to score one of the biggest deals of the season: 50% off the first three months of a new GeForce NOW Ultimate membership. That's GeForce RTX...
Equal things that don't look equal
There are a surprising number of expressions for the hyperbolic metric, specifically the Poincare half plane, and none of them look equivalent.
Hyperbolic metric
The cross ratio can be used to define a metric on the upper half plane or disk model of the hyperbolic plane. Mobius transformations are isometries.
Case study: OpenAI
OpenAI's UK partnership accelerates sovereign AI, infrastructure, and enterprise adoption while shaping the global shift toward agentic AI.
AlphaFold: Five years of impact
Explore five years of AlphaFold's impact on biology. Learn how this Nobel Prize-winning AI is accelerating scientific discovery globally.
Case study: GitLab
GitLab's AI-driven DevSecOps platform unifies code, security and compliance to help UK enterprises ship software faster and safer.
Preparing Data for BERT Training
BERT is an encoder-only transformer model pretrained on the masked language model (MLM) and next sentence prediction (NSP) tasks before being fine-tuned for various NLP tasks. Pret...
BERT Models and Its Variants
BERT is a transformer-based model for NLP tasks that was released by Google in 2018. It is found to be useful for a wide range of NLP tasks. In this article, we will overview the a...
Case study: Synthesia
Synthesia is the UK's $4B AI video leader, transforming enterprise communication with secure, scalable, human-realistic AI video technology.
Solving H_n = 100
Finding the value of n such that the nth harmonic number is closest to 100.
The cost of thinking
MIT McGovern Institute researchers find a surprising parallel in the ways humans and new AI models solve complex problems.
Case study: Wayve
How Wayve is driving the UK's autonomous revolution, scaling mapless AI and competing with global AV leaders in London's 2026 robotaxi race.
Training a Tokenizer for BERT Models
BERT is an early transformer-based model for NLP tasks that's small and fast enough to train on a home computer. Like all deep learning models, it requires a tokenizer to convert t...
Start building with Gemini 3
Gemini 3 is introducing advanced agentic coding capabilities, plus Google Antigravity, a new agentic development platform.
Hello, Old Friend?
A new piece in The New York confirms that AI-generated writing -- along with similar AI creation tools -- is now the 'it' app.
AIAI Boston 2025
Stream every session from AIAI Boston, with sessions from DeepSeek, Anthropic, Waymo, NVIDIA, Prudential and more.
Synthetic Data for LLM Training
Learn about synthetic data as a widely used method to train foundation models when data is scarce, sensitive, or costly to collect.
Great Data Products
The language we use to talk about data is keeping us from realizing its full potential.
Teaching robots to map large environments
MIT researchers developed a powerful system that could help robots safely navigate unpredictable environments using only images captured from their onboard cameras.
Gone Fishin'
RobotWritersAI.com is playing hooky. We'll be back May 5, 2025 with fresh news and analysis on the latest in AI-generated writing.
Creating AI that matters
The MIT-IBM Watson AI Lab bridges research and deployment in AI through advances like smaller, efficient foundation models, vision and multimodal systems, and causal discovery.
Halloween costumes by tiny neural net
I've recently been experimenting with one of my favorite old-school neural networks, a tiny program that runs on my laptop and knows only about the data I give it. Without internet...
Deep Cognition at WESCCON 2025
Deep Cognition showcases PaperEntry AI at WESCCON 2025. Discover how AI-driven customs automation elevates brokers. Visit us at Booths 4 & 5.
New Claude Sonnet 4.5:
A new piece in The New York confirms that AI-generated writing -- along with similar AI creation tools -- is now the 'it' app.
These little robots literally walk on water
HydroSpread, a breakthrough fabrication method, lets scientists build ultrathin soft robots directly on water. These tiny, insect-inspired machines could transform robotics, health...
mall 0.2.0
The mall 0.2.0 update for R and Python introduces support for external LLM providers like OpenAI and Gemini. This version also features parallel processing for R users, the abili...
ChatGPT will apologize for anything
ChatGPT will apologize for anything - even advice it definitely didn't give, and stuff it definitely didn't do. It very much regrets its recommendation that we hire a giraffe as CE...
Book Review: Essential Graph RAG
Coming from a background of Knowledge Graph (KG) backed Medical Search, I don't need to be convinced about the importance of manually curate...
Internet Power
A talk given at the Yale Jackson School of Global Affairs arguing that 21st century institutions need to understand how to accrue and wield their own power on the Internet.
A Kernel of Truth
A haunting modern parable about truth, distortion, and destruction: how one kernel of truth can grow into a fire that burns the world.
Why We Think
Special thanks to John Schulman for a lot of super valuable feedback and direct edits on this post.
Test time compute (Graves et al. 2016, Ling, et al. 2017, Cobbe et al. 2021) and...
Moving To Substack
I'm freezing this blog and starting to post on my Substack instead. The authoring experience is much more convenient for me there. Please follow me there, and check out The Illustr...
Building Resilient Data Infrastructure
We know it's possible to build new data infrastructure, but we urgently need to figure out how to do so sustainably and ethically. Join us at CNG Conference and at Fed Geo Day as w...
Staying Sane in an Insane World
Staying sane is a daily practice. Discover seven practical strategies to stay sane, reclaim your clarity, and strengthen your resilience amid uncertainty.
Some Lessons on Reviews and Rebuttals
Writing and responding to reviews is the bread and butter of any academic and especially in AI research, PhD students are confronted with both rather early compared to other displi...
Minecraft with object impermanence
I generally am uninterested in generative AI that's too close to the real thing. But every once in a while there's a modern AI thing that's so glitchy and broken that it's strangel...
AI Safety Index Released
The Future of Life Institute has released its first safety scorecard of leading AI companies, finding many are not addressing safety concerns while some have taken small initial st...
Trip Report - PyData Global 2024
I attended PyData Global 2024 last week. It's a virtual conference, so I was able to attend it from the comfort of my home, although presenta...
Reward Hacking in Reinforcement Learning
Reward hacking occurs when a reinforcement learning (RL) agent exploits flaws or ambiguities in the reward function to achieve high rewards, without genuinely learning or completin...
Foundations of diffusion networks
Diffusion networks As there's a lot of recent developments around image generation and diffusion models in general, I took a deep dive in the fundamentals of...
Max Tegmark on AGI Manhattan Project
A new report for Congress recommends that the US start a "Manhattan Project" to build Artificial General Intelligence. To do so would be a suicide race.
Thinking About Research Ideas vs. Technology
In this article, I want to share some thoughts on the difference between research ideas and technology, particularly in machine learning. This distinction is something I have been contemplatin...
Introducing mall for R...and Python
We are proud to introduce the {mall} package. With {mall}, you can use a local LLM to run NLP operations across a data frame. (sentiment, summarization, translation, etc). {mall}...
Paris AI Safety Breakfast #3: Yoshua Bengio
The third of our 'AI Safety Breakfasts' event series, featuring Yoshua Bengio on the evolution of AI capabilities, loss-of-control scenarios, and proactive vs reactive defense.
Botober 2024
Back by popular demand, here are some AI-generated drawing prompts to use in this, the spooky month of October!
Longtime AI Weirdness readers may recognize some of these. That's b...
Panda vs. Eagle
FLI's Director of Policy on why the U.S. national interest is much better served by a cooperative than an adversarial strategy towards China.
Experiments with Prompt Compression
I recently came across Prompt Compression (in the context of Prompt Engineering on Large Language Models) on this short course on Prompt Com...
Extrinsic Hallucinations in LLMs
Hallucination in large language models usually refers to the model generating unfaithful, fabricated, inconsistent, or nonsensical content. As a term, hallucination has been somewh...
Book Report: Pandas Workout
Unlike many Data Scientists, I didn't automatically reach for Pandas when I needed to analyze data. I came upon this discipline (Data Scien...
An exercise in frustration
There's an anonymous facebook posting that's been making the rounds, in which a studio art director tried to hire AI prompters to make art, only to discover that they were complete...
Introducing Keras 3 for R
We are thrilled to introduce {keras3}, the next version of the Keras R package. {keras3} is a ground-up rebuild of {keras}, maintaining the beloved features of the original while r...
Finetuning RAGAS Metrics using DSPy
Last month, I decided to sign-up for the Google AI Hackathon , where Google provided access to their Gemini Large Language Model (LLM) and ...
FAQ for our Monte Carlo Conformal Prediction
Over the past months, I have given several talks about Monte Carlo conformal prediction and the problem of calibrating with uncertain ground truth, for example, stemming from annot...
KGC/HCLS 2024 Trip Report
I was at KGC (Knowledge Graph Conference) 2024 , which is happening May 6-10 at Cornell Tech . I was presenting (virtually) at their Health ...
Hidden 3D Pictures
Do you know those autostereograms with the hidden 3D pictures? Images like the Magic Eye pictures from the 1990s that look like noisy repeating patterns until you defocus your eyes...
On NeurIPS' High School Paper Track
The decision to have a separate High School Project Track at NeurIPS 2024 has sparked quite some controversy, with many prominent AI researchers debating pros and cons and personal...
Diffusion Models for Video Generation
Diffusion models have demonstrated strong results on image synthesis in past years. Now the research community has started working on a harder task: using it for video generation. T...
Chat with AI in RStudio
Interact with Github Copilot and OpenAI's GPT (ChatGPT) models directly in RStudio. The `chattr` Shiny add-in makes it easy for you to interact with these and other Large Language...
Q&A with Mala Kumar, Our Newest Board Member
We are pleased to welcome Mala Kumar to our Board of Directors. In this Q&A profile, we talk with Mala about her career journey, joining our Board, and the intersections between te...
Thinking about High-Quality Human Data
[Special thank you to Ian Kivlichan for many useful pointers (E.g. the 100+ year old Nature paper "Vox populi") and nice feedback.]
High-quality data is the fuel for modern data...
Unicorns, Show Ponies, and Gazelles
A few four-legged animal metaphors that explain how we've been building global data infrastructure to date and how we might do better in the future.
PyData Global 2023: Trip Report
I had the opportunity to present at PyData Global this year. It is a virtual conference that ran over 3 days in multiple tracks from Decemb...
Adversarial Attacks on LLMs
The use of large language models in the real world has been strongly accelerated by the launch of ChatGPT. We (including my team at OpenAI, shoutout to them) have invested a lot of effo...
Hugging Face Integrations
Hugging Face rapidly became a very popular platform to build, share and collaborate on deep learning applications. We have worked on integrating the torch for R ecosystem with Hug...
STAC API 1.0.0 Released
The STAC API specification reached its 1.0.0 version. With this release, the spec is fully aligned with the OGC API - Features Version 1.0 standard.
LLM Powered Autonomous Agents
Building agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concept demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspi...
Understanding LoRA with a minimal example
LoRA (Low Rank Adaptation) is a new technique for fine-tuning deep learning models that works by reducing the number of trainable parameters and enables efficient task switching. I...
GPT-2 from scratch with torch
Implementing a language model from scratch is, arguably, the best way to develop an accurate idea of how its engine works. Here, we use torch to code GPT-2, the immediate successor...
My Faculty Application Experience
I spent roughly a year preparing, and then interviewing, for tenure-track faculty positions. My job search is finally done, and I am joining the University of ...
safetensors 0.1.0
Announcing safetensors, a new R package allowing for reading and writing files in the safetensors format.
We may finally crack Maths. But should we?
Automating mathematical theorem proving has been a long standing goal of artificial intelligence and indeed computer science. It's one of the areas I became very interested in rece...
torch 0.11.0
torch v0.11.0 is now on CRAN. This release features much-enhanced support for executing JIT operations. We also amended loading of model parameters, and added a few quality-of-life...
Group-equivariant neural networks with escnn
Escnn, built on PyTorch, is a library that, in the spirit of Geometric Deep Learning, provides a high-level interface to designing and training group-equivariant neural networks. T...
Generative AI and AI Product Moats
Here are eight observations I've shared recently on the Cohere blog and videos that go over them:
Article: What's the big deal with Generative AI? Is it the future or th...
luz 0.4.0
luz v0.4.0 is now on CRAN. This release adds support for training models on ARM Mac GPUs, reduces the overhead of using luz, and makes it easier to checkpoint and resume failed run...
torch 0.10.0
torch v0.10.0 is now on CRAN. This version upgraded the underlying LibTorch to 1.13.1, and added support for Automatic Mixed Precision. As an experimental feature, we now also sup...
De-noising Diffusion with torch
Currently, in generative deep learning, no other approach seems to outperform the family of diffusion models. Would you like to try for yourself? If so, our torch implementation of...
Prompt Engineering
Prompt Engineering, also known as In-Context Prompting, refers to methods for how to communicate with LLM to steer its behavior for desired outcomes without updating the model weig...
The Transformer Family Version 2.0
Many new Transformer architecture improvements have been proposed since my last post on "The Transformer Family" about three years ago. Here I did a big refactoring and enrichment ...
AO, NAO, ENSO: A wavelet analysis example
El Niño-Southern Oscillation (ENSO), North Atlantic Oscillation (NAO), and Arctic Oscillation (AO) are atmospheric phenomena of global impact that strongly affect people's lives. E...
Large Transformer Model Inference Optimization
[Updated on 2023-01-24: add a small section on Distillation.]
Large transformer models are mainstream nowadays, creating SoTA results for a variety of tasks. They are powerful but ...
Books Read in 2022
At the end of every year I have a tradition where I write summaries of the books that I read throughout the year. Unfortunately this year was exceptionally bus...
The Illustrated Stable Diffusion
Translations: Chinese, Vietnamese.
(V2 Nov 2022: Updated images for more precise description of forward diffusion. A few more images in this version)
AI image generation is the ...
Some Math behind Neural Tangent Kernel
Neural networks are well known to be over-parameterized and can often easily fit data with near-zero training loss with decent generalization performance on test dataset. Although ...
A Plea to End Harassment
Scott Aaronson is a professor of computer science at UT Austin, where his research area is in theoretical computer science. However, he may be more well known ...
Generalized Visual Language Models
Processing images to generate text, such as image captioning and visual question-answering, has been studied for years. Traditionally such systems rely on an object detection netwo...
The Illustrated Retrieval Transformer
Discussion: Discussion Thread for comments, corrections, or any feedback.
Translations: Korean, Russian
Summary: The latest batch of language models can be much smaller yet ac...
Books Read in 2021
At the end of every year I have a tradition where I write summaries of the books that I read throughout the year. Here's the following post with the rough set ...
How to Train Really Large Models on Many GPUs?
[Updated on 2022-03-13: add expert choice routing.]
[Updated on 2022-06-10]: Greg and I wrote a shortened and upgraded version of this post, published on OpenAI Blog: "Techniques fo...
What are Diffusion Models?
[Updated on 2021-09-19: Highly recommend this blog post on score-based generative modeling by Yang Song (author of several key papers in the references)].
[Updated on 2022-08-27: ...
Contrastive Representation Learning
The goal of contrastive representation learning is to learn such an embedding space in which similar sample pairs stay close to each other while dissimilar ones are far apart. Con...
Explainable AI Cheat Sheet
Introducing the Explainable AI Cheat Sheet, your high-level guide to the set of tools and methods that helps humans understand AI/ML models and their predictions.
I introduce t...
On Information Theoretic Bounds for SGD
A few days ago we had a talk by Gergely Neu, who presented his recent work:
* Gergely Neu Information-Theoretic Generalization Bounds for Stochastic
Gradient Descent [https://ar...
Weight Banding
Weights in the final layer of common visual models appear as horizontal bands. We investigate how and why.
Branch Specialization
When a neural network layer is divided into multiple branches, neurons self-organize into coherent groupings.
Reducing Toxicity in Language Models
Large pretrained language models are trained over a sizable collection of online data. They unavoidably acquire certain toxic behavior and biases from the Internet. Pretrained lan...
Visualizing Weights
We present techniques for visualizing, contextualizing, and understanding neural network weights.
Curve Circuits
Reverse engineering the curve detection algorithm from InceptionV1 and reimplementing it from scratch.
Controllable Neural Text Generation
[Updated on 2021-02-01: Updated to version 2.0 with several work added and many typos fixed.]
[Updated on 2021-05-26: Add P-tuning and Prompt Tuning in the "prompt design" sectio...
Newsletter #087
Weekly newsletter dedicated to sharing resources for building and operating production machine learning systems.
Newsletter #086
Weekly newsletter dedicated to sharing resources for building and operating production machine learning systems.
Some Intuition on the Neural Tangent Kernel
Neural tangent kernels are a useful tool for understanding neural network
training and implicit regularization in gradient descent. But it's not the
easiest concept to wrap your he...
Understanding RL Vision
With diverse environments, we can analyze, diagnose and edit deep reinforcement learning models using attribution.
Newsletter #085
Weekly newsletter dedicated to sharing resources for building and operating production machine learning systems.
Notes on Causally Correct Partial Models
I recently encountered this cool paper in a reading group presentation:
* Rezende et al (2020) Rezende Causally Correct Partial Models for
Reinforcement Learning [https://arxi...
Newsletter #084
Weekly newsletter dedicated to sharing resources for building and operating production machine learning systems.
Newsletter #083
Weekly newsletter dedicated to sharing resources for building and operating production machine learning systems.
Newsletter #082
Weekly newsletter dedicated to sharing resources for building and operating production machine learning systems.
So long, and thanks for all the fish
All good things must come to an end, including this podcast. This is the last episode we plan to release, and it doesn't cover data science; it's mostly reminiscing, thanking our wo...
A Data Science Take on Open Policing Data
A few weeks ago, we put out a call for data scientists interested in issues of race and racism, or people studying how those topics can be studied with data science methods, should...
The Data Science Open Source Ecosystem
Open source software is ubiquitous throughout data science, and enables the work of nearly every data scientist in some way or another. Open source projects, however, are dispropor...
Rock the ROC Curve
This is a re-release of an episode that first ran on January 29, 2017. This week: everybody's favorite WWII-era classifier metric! But it's not just for winning wars, it's...
Curve Detectors
Part one of a three part deep dive into the curve neuron family.
Criminology and data science
This episode features Zach Drake, a working data scientist and PhD candidate in the Criminology, Law and Society program at George Mason University. Zach specializes in bringing da...
Convolutional neural networks
This is a re-release of an episode that originally aired on April 1, 2018. If you've done image recognition or computer vision tasks with a neural network, you've probably used a...
Stein's Paradox
This is a re-release of an episode that was originally released on February 26, 2017. When you're estimating something about some object that's a member of a larger group of simi...
Causal Trees
What do you get when you combine the causal inference needs of econometrics with the data-driven methodology of machine learning? Usually these two don't go well together (deriving...
The Grammar of Graphics
You may not realize it consciously, but beautiful visualizations have rules. The rules are often implicit and manifest themselves as expectations about how the data is summarized, p...
Gaussian Processes
It's pretty common to fit a function to a dataset when you're a data scientist. But in many cases, it's not clear what kind of function might be most appropriate: linear? quadratic?...
Putting machine learning into a database
Most data scientists bounce back and forth regularly between doing analysis in databases using SQL and building and deploying machine learning pipelines in R or python. But if we t...
The work-from-home episode
Many of us have the privilege of working from home right now, in an effort to keep ourselves and our family safe and slow the transmission of covid-19. But working from home is an ...