
Foundations for an Artificial Scientist

Below are ideas without evidence; just trust me bro.

Evolution provides a very noisy signal. That is why it took millions of years for us to reach the level of complexity we experience now, with robust intelligent agents (humans) operating in a large network (society/economy). One might argue that if the signal were not noisy, complexity would not have emerged.

TODO: consider if this theory requires time to be included in the model, or at least movement along some dimension.

Learning a World

Current SOTA LLMs are trained through unsupervised learning on a large corpus of text scraped from the Internet. This can be compared to an organism evolving in a world, or environment, consisting solely of text. At each timestep during training, organisms that fail to predict the next word in their minibatch are killed, while those that do slightly better live on. With vision-enhanced LLMs like LLaVA, a vision encoder is introduced and the model is fine-tuned to incorporate information from images as well. Still, the agent's environment is too constrained by its textual origins - it is unable to ground images (correctly discuss the absolute and relative placement of objects in an image) without further training.
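
A minimal sketch of this next-token objective in PyTorch (the tiny embedding-plus-linear "model" is a stand-in for a real transformer, not anyone's actual training code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab = 1000
# Stand-in for a causal LM: embedding + linear head (effectively a bigram model).
model = nn.Sequential(nn.Embedding(vocab, 64), nn.Linear(64, vocab))

tokens = torch.randint(0, vocab, (8, 128))       # a minibatch of token ids
inputs, targets = tokens[:, :-1], tokens[:, 1:]  # predict token t+1 from token t
logits = model(inputs)                           # (batch, seq-1, vocab)
loss = F.cross_entropy(logits.reshape(-1, vocab), targets.reshape(-1))
loss.backward()  # parameter settings ("organisms") with lower loss survive the update
```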

If text is the wrong world to evolve our models in, what is better? It's latent space. Consider SOTA diffusion models like Stable Diffusion, which generates in latent space and then decodes the resulting vector into an image. Or consider the KV-cache LLMs use to attend to past information generated from the input text. Modern AI models already operate in compressed, difficult-to-interpret spaces across their many layers and auto-regressive steps. Later we cover how to build an agent that operates in latent space (compression, organs, transplants, adaptation, etc.).
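
A toy sketch of the encode, operate-in-latent, decode pattern (the encoder and decoder below are made-up stand-ins, e.g. for the two halves of a VAE - the real Stable Diffusion pipeline involves a scheduler, a U-Net, and more):

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 256))  # pixels -> latent
decoder = nn.Sequential(nn.Linear(256, 3 * 64 * 64), nn.Unflatten(1, (3, 64, 64)))

image = torch.randn(1, 3, 64, 64)
z = encoder(image)                 # compress the image into latent space
z = z + 0.1 * torch.randn_like(z)  # "operate" in latent space (here: a perturbation)
reconstruction = decoder(z)        # decode the latent back into pixel space
```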

Reduced-bias Learning

An effective way of injecting a large amount of experience into a model is unsupervised pre-training. This technique improves sample efficiency on downstream tasks because it acts as a form of regularization. Intuitively, if a model is trained by optimizing an objective over a small dataset, it is likely to overfit by learning details specific to that particular sample rather than the entire population. Pre-training reduces the chance of this occurring by learning features from a much larger sample.
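
A hedged sketch of that sample-efficiency argument: freeze a (stand-in) pretrained encoder and fit only a small head on the tiny downstream dataset, so there are far fewer free parameters available to overfit with:

```python
import torch
import torch.nn as nn

pretrained_encoder = nn.Linear(128, 64)  # placeholder for features learned on a large corpus
for p in pretrained_encoder.parameters():
    p.requires_grad = False              # freeze: keep the pre-trained features fixed

head = nn.Linear(64, 2)                  # the only trainable part
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# A tiny labeled downstream dataset (random here, for illustration).
x_small, y_small = torch.randn(32, 128), torch.randint(0, 2, (32,))
for _ in range(100):
    optimizer.zero_grad()
    loss = loss_fn(head(pretrained_encoder(x_small)), y_small)
    loss.backward()
    optimizer.step()
```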

Eye-like organs act as encoders that take electromagnetic waves as input and extract features for further processing. Some types of worms developed eyes and evolved the organ over millions of years. At such a time scale, we can treat the evolutionary signal as equivalent to a compression objective: organs that extract more useful features from the same raw input tend to survive.

Taming an Agent

After pre-training, models are fine-tuned to match human preferences or other narrow objectives. An analogy for this is the taming of wild wolves into dogs - helpful and harmless assistants, as OpenAI calls ChatGPT.
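
One concrete ingredient of this preference fine-tuning is a reward model trained on pairwise human comparisons. Below is a minimal sketch of the Bradley-Terry-style loss used in RLHF pipelines; the linear reward model and the random "response embeddings" are placeholders:

```python
import torch
import torch.nn.functional as F

reward_model = torch.nn.Linear(256, 1)  # maps a response embedding to a scalar score

chosen, rejected = torch.randn(8, 256), torch.randn(8, 256)  # embedded response pairs
r_chosen, r_rejected = reward_model(chosen), reward_model(rejected)

# Maximize the probability that the human-preferred response scores higher.
loss = -F.logsigmoid(r_chosen - r_rejected).mean()
loss.backward()
```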

Alignment

Machine learning is mainly a study of objectives. With enough data and basic inductive biases (e.g. self-attention, deep architectures), it comes down to defining an objective functional and gathering sufficient samples (a dataset) to fit the function.
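
To make "an objective functional plus a dataset" concrete, here is the standard empirical risk minimization formulation (standard notation, not specific to any model in this post):

```latex
% Pick a loss \ell, then fit parameters \theta over the dataset D = \{(x_i, y_i)\}_{i=1}^N:
\theta^{\star} = \arg\min_{\theta} \; \frac{1}{N} \sum_{i=1}^{N} \ell\big(f_{\theta}(x_i),\, y_i\big)
```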

There is a lot of discussion around AI Alignment, which is essentially the study of objectives. As we give agents more power over the environment they operate in, how do we stop them from interfering with the objectives humanity operates under? Humans have tamed their environment and in turn been domesticated by society itself, so AI can be viewed as a potential rogue agent. How does one prevent an AI revolution? These are big questions, but let’s take a step back.

The problem with focusing on objectives is that it is easy to overthink them. As with Bayesian priors, we need to learn how to trust the data. Over-inflated egos lead to incorrect assumptions and biases that cause greater damage than leaving the model to its own devices would.
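
A small illustration of trusting the data, using a conjugate Bayesian update: however overconfident the prior (the "ego"), enough observations wash it out:

```python
# Beta prior on a coin's heads-probability, updated with observed flips.
prior_a, prior_b = 50.0, 2.0   # overconfident prior: "surely biased toward heads"
heads, tails = 30, 70          # the data disagree (empirical rate 0.30)

post_a, post_b = prior_a + heads, prior_b + tails
print(f"prior mean:     {prior_a / (prior_a + prior_b):.2f}")  # 0.96
print(f"posterior mean: {post_a / (post_a + post_b):.2f}")     # ~0.53, pulled toward the data
```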

One could argue that the more resources humanity allocates to alignment research, the more misaligned our models will become. Maybe we can follow the ethos of “lead by example” instead?

What is Science?

Humanity is a result of millions of years of trial and error; in fact, one can view Darwinian evolution as exactly that. Nature's trial and error is not encoded in a format easily decoded by humans - it took decades of expeditions and analysis by Darwin and evolutionary biologists to uncover the explanations behind phylogenetic trees, and even longer to identify DNA and genes as the encoding of that information.

Modern science continues this trend of trial and error, but more efficiently. Rather than regenerating similar species several times over millions of years (e.g. dogs became whales, then became dogs again), intelligence offers a more robust method of evolution: the underlying mechanism (something like the brain) stays the same and adapts in real time to changing conditions. We invent air conditioning during heat waves rather than dying out and waiting to evolve a biological cooling mechanism.

Inventions like air conditioning went through many iterations (engineering cycles), and without encoding the intention, variables, and results of those experiments, they would not have been possible. It is this ability to encode experience (think folk tales, stories, and myths passed down generations) that has expedited human progress - efficient trial and error through horizontal scaling!