Interesting references

Here I'll give an incomplete but hopefully interesting list of work that is related to mine or has inspired me. At the risk of misportraying great pieces, I briefly describe them in three loose sections.

Graph generalizations

Patch Graph Rewriting shows a novel method for graph rewriting. It introduces a language for talking about subgraph matches, enabling intuitive yet formal rule specification.
The hypergraph physics post showcasing directed hypergraphs and confluent rewriting made headlines. It demonstrates the formalism's power and comes with a library of experiments that may provide a good entry point to the topic.

Stepping up the formalism and rewriting game, we have the Atomspace. This framework embeds many types of graphs and allows for complex pattern matching over large datasets. With a diverse set of links connecting concepts (and other links), you can build intricate (probabilistic) knowledge graphs.

Recursion schemes beautifully formalize common recursive operations (more on those below), but making them work on graphs is challenging. One approach is to apply them top-down instead of at the base level; I hope to share my breadth-first solution soon.
Conceptually building on top of recursion schemes and the Atomspace is Folding and Unfolding on metagraphs. However, you don't necessarily need to fold over hierarchical edges like in metagraphs to benefit from recursion schemes. Algebraic graphs encode regular graphs as overlay, connect, and vertex expressions over which you can fold.
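The algebraic-graphs encoding fits in a few lines. Below is a minimal Python sketch of the idea (the original library is Haskell); all names here are my own illustration, not the library's API:

```python
from dataclasses import dataclass

# A graph is an expression over four constructors; queries are folds.
class Graph: pass

@dataclass
class Empty(Graph): pass

@dataclass
class Vertex(Graph):
    label: object

@dataclass
class Overlay(Graph):   # union of both operands' vertices and edges
    left: Graph
    right: Graph

@dataclass
class Connect(Graph):   # overlay, plus an edge from every left vertex to every right vertex
    left: Graph
    right: Graph

def fold(g, empty, vertex, overlay, connect):
    """Catamorphism over the graph expression."""
    if isinstance(g, Empty):
        return empty
    if isinstance(g, Vertex):
        return vertex(g.label)
    op = overlay if isinstance(g, Overlay) else connect
    return op(fold(g.left, empty, vertex, overlay, connect),
              fold(g.right, empty, vertex, overlay, connect))

def vertices(g):
    return fold(g, set(), lambda v: {v}, set.union, set.union)

def edges(g):
    # Carry (vertex set, edge set) pairs up the fold.
    def connect(a, b):
        return (a[0] | b[0], a[1] | b[1] | {(x, y) for x in a[0] for y in b[0]})
    return fold(g, (set(), set()),
                lambda v: ({v}, set()),
                lambda a, b: (a[0] | b[0], a[1] | b[1]),
                connect)[1]

# The path 1 -> 2 -> 3 expressed algebraically:
path = Overlay(Connect(Vertex(1), Vertex(2)), Connect(Vertex(2), Vertex(3)))
```

Because every query is a fold over the same four constructors, new graph algorithms need no traversal logic of their own.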

A generalization that doesn't just loosen constraints is that of sheaves (pdf). Mirroring its topological meaning, a sheaf here consists of subgraphs that do not contain their border nodes. When adding types, these subgraphs look like jigsaw puzzle pieces: they only connect to other structures with matching borders. Porting over more definitions from topology allows for clustering of similar subgraphs (when querying data, for example) and retrieving structurally similar concepts.

AI and reasoning

Often, machine learning systems are trying to solve a more challenging problem than necessary because of our setup and the data we feed them. The probabilistic programming framework Gen allows you to leverage existing software as proxies and increase performance and sample efficiency. This talk about Gen gives a great introduction.

Much of my research deals with evolutionary methods and meta-learning; I'll mention some projects that inspired me.
I can't find a reference to the first paper; please get in touch with me if you know the original. I believe it's called MAP, and the idea is simple. You define a few axes along which your solutions can differ. For walkers in a physics environment, these may be weight and height; for programs, execution speed, expression size, and generalizability. Next, you roughly divide up each axis and set up whatever reproduction method suits your problem space. Instead of keeping only the top individuals or selecting them via some heuristic, you place each individual into its cell of the resulting hyperrectangle. An individual only gets replaced when superseded within its exact cell, which keeps the pool diverse and aids robustness.
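The scheme described above (it closely matches what is published as MAP-Elites) can be sketched in a few lines. The toy fitness, feature axes, and parameters below are my own illustration, not from any particular paper:

```python
import random

def features(x):
    # Two behavioral axes for a real-valued genome: mean and spread.
    return (sum(x) / len(x), max(x) - min(x))

def fitness(x):
    return -sum(v * v for v in x)   # maximize: best at the origin

def bin_of(feats, lo=-1.0, hi=1.0, bins=10):
    # Discretize each feature axis into a fixed number of bins.
    return tuple(min(bins - 1, max(0, int((f - lo) / (hi - lo) * bins)))
                 for f in feats)

def map_elites(iterations=2000, dim=3, seed=0):
    rng = random.Random(seed)
    archive = {}   # cell -> (fitness, genome)
    for _ in range(iterations):
        if archive and rng.random() < 0.9:
            # Mutate a random existing elite.
            _, parent = rng.choice(list(archive.values()))
            child = [v + rng.gauss(0, 0.1) for v in parent]
        else:
            child = [rng.uniform(-1, 1) for _ in range(dim)]
        cell = bin_of(features(child))
        f = fitness(child)
        # Replace only when superseded within this exact cell.
        if cell not in archive or archive[cell][0] < f:
            archive[cell] = (f, child)
    return archive
```

The archive itself is the output: a grid of diverse, locally-best solutions rather than a single champion.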
NEAT (pdf) is an influential paper in this space, and for good reason: it shows a principled approach to neuroevolution that balances performance, diversity, and exploration.
If your evolution algorithm is not open-ended, there's a maximum difference between the initial and final candidate solution. POET enables continued complexification by transferring well-suited agents to newly evolved environments.
One way to make the evolution more tractable is by drawing components for new agents from a library of proven concepts, as done in the AI physicist paper. If your library supports proving equivalences between concepts, you can simplify, merge, or abstract them in the background.

Iterated amplification and distillation is the concept of using the current "base" model with more time, evaluations, or other resources to get a more accurate result against which to improve the base model. Running this procedure can raise the model's performance beyond what available training examples would provide. For example, a code inference engine likely uses heuristics to narrow down the search space. Applying iterated amplification, you can tune the heuristics by spending more time searching the space and propagating the result metrics into the heuristic. Expert iteration - known from the independently developed AlphaGo Zero - is another example of this, using neural networks as the base model and tree search for amplification.
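The amplify-then-distill loop can be sketched abstractly. In this toy, the "base model" is a lookup table, amplification is a short lookahead search (over a shortest-path-to-zero game with steps of 1 or 2) that bottoms out in the base model, and distillation simply copies the amplified values back. Everything here is my own illustration:

```python
def amplify(base, state, depth):
    """Evaluate a state by limited lookahead, bottoming out in the base model."""
    if state == 0:
        return 0                        # goal reached: zero cost
    if depth == 0:
        return base(state)              # out of budget: trust the base model
    children = [max(0, state - 1), max(0, state - 2)]
    return 1 + min(amplify(base, c, depth - 1) for c in children)

def iterate(states, rounds=5, depth=3):
    table = {}                          # the "base model": a lookup table
    base = lambda s: table.get(s, 0)
    for _ in range(rounds):
        # Amplify: spend extra compute per state using the current base model.
        targets = {s: amplify(base, s, depth) for s in states}
        # Distill: pull the base model toward the amplified answers.
        table.update(targets)
    return table
```

After a few rounds, values propagate further than any single lookahead could reach: states outside the search horizon inherit accurate estimates from earlier distillations.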
There are many execution schemes related to iterated amplification as an alignment technique, like recursive reward modeling and Humans Consulting Humans. Still, I want to highlight AI (safety via) debate, which is promising and perhaps underapplied.
The debate setup is a turn-based two-player game, where each player takes a stance on some issue and continues arguing for it. When the players are AI agents, a human can let them explore the pros and cons and decide which argument they find more convincing. You can probably apply similar transformations to debate as to ATL (pdf), letting an arbitrary number of agents argue in parallel. For example, agents with heterogeneous architectures employed in detecting art forgeries can refute claims using different detection methods. By the nature of the game, an agent picks the arguments where its techniques are best suited.

AGENT formalizes four behavioral properties you'll want your machine learning system to have. When designing or reflecting on a system, it's helpful to check whether it has the internal mechanisms to achieve these properties and doesn't make assumptions that counteract them.
The Abstraction and Reasoning Corpus introduced in On the Measure of Intelligence provides a benchmark for few-shot learning that doesn't require domain knowledge. Given a few abstract grids and the results of applying some transformation to them, your job is to apply the same transformation to a new grid. Except for the 2D spatial layout and non-temporal nature, many kinds of program synthesizers should be on equal footing recreating these 400 transformations.

For many excellent articles on interpreting deep learning techniques, I recommend Distill. It also hosts this neat self-organizing system classifying hand-written digits.

The alignment of powerful AI systems is a critical and challenging task that likely requires continued effort for the foreseeable future. For broad and nuanced updates in the field, the alignment forum is an accessible place to start. Even if you're not focused on safety yet, by definition, alignment is a property you'll want your system to have to be of use to people. I've found the forum's library and Rob Miles' coverage to be valuable sources to provide to people who are new to the topic.

The Topos Institute examines interesting mathematical formalisms applied to scientific and engineering problems and makes them accessible to the world. The Polynomial Functors course provides a model for composable interfaces, and the Programming with Categories course provides an easy-to-digest grounding for recursion schemes and monads. If you're new to the field, I highly recommend this invitation to applied category theory, authored by some of the same people.

Graphs, vector spaces, and custom (often complex) data stores strive to provide interactive machine learning systems with a medium to do computation. We aim to avoid the natural language mess, though this brings either encoding/decoding loss or difficulty expressing concepts clearly. Luckily, people are working on new ways to express concepts, namely in constructed languages. We can take many lessons from their work, from representing counterfactuals to building complex relationships between entities. Learning about non-linear writing systems like UNLWS has been a fascinating journey, and I recommend exploring it with an open mind.

HCI and programming

The initial substrate for computers still heavily influences programming paradigms, and ultimately, user interaction. We're building abstractions upon abstractions to model our domain and express our solutions easily, while it may be beneficial to reshape low-level constructs instead.
Chemlambda performs local graph rewrites in an unsynchronized fashion to do computation. The rewrites can grow, shrink, and transform your data step by step by exposing different connections to the reagents. If you're unfamiliar with the Chemlambda project, start with a basic but mesmerizing example.
Another unsynchronized computational model is that of propagators (pdf), of which this talk gives a great introduction. The propagation networks consist of cells that hold possible values and propagators that act like functions between them. When executing propagators in any order in such a network, the overall ambiguity decreases, effectively increasing the usefulness of the output.
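The cell-and-propagator structure can be made concrete: below, cells hold sets of still-possible values and a propagator monotonically prunes them until a fixpoint. The `prune_sum` constraint and all names are my own illustration, not the paper's API:

```python
class Cell:
    """A cell holds the set of values still considered possible."""
    def __init__(self, domain):
        self.values = set(domain)

def prune_sum(a, b, c):
    """Propagator enforcing a + b == c by discarding impossible values.

    Returns True when it narrowed any cell, so the scheduler knows
    whether more propagation might be useful.
    """
    changed = False
    for cell, keep in (
        (a, {x for x in a.values if any(x + y in c.values for y in b.values)}),
        (b, {y for y in b.values if any(x + y in c.values for x in a.values)}),
        (c, {z for z in c.values if any(z - x in b.values for x in a.values)}),
    ):
        if keep != cell.values:
            cell.values = keep
            changed = True
    return changed

def run(propagators):
    """Fire propagators in any order until no cell changes (a fixpoint)."""
    while any(p() for p in propagators):
        pass

a, b, c = Cell(range(10)), Cell(range(10)), Cell({7})
run([lambda: prune_sum(a, b, c)])
```

Because each propagator only ever removes possibilities, the order of execution doesn't affect the final answer, which is what makes the model safe to run unsynchronized.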
On the topic of incremental computation, salsa-rs lets you define your programs as a series of transformations, recomputing them and their sub-transformations only if their inputs change.
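The underlying idea can be sketched without Rust: derived queries record which inputs they read and are recomputed only when one of those inputs has changed since the last run. This toy is my own reading of the pattern, not salsa-rs's actual API:

```python
class Db:
    def __init__(self):
        self.inputs = {}      # key -> (value, revision when last set)
        self.revision = 0     # global revision counter
        self.cache = {}       # query name -> (value, {key: revision read})

    def set(self, key, value):
        self.revision += 1
        self.inputs[key] = (value, self.revision)

    def _get(self, key, reads):
        value, rev = self.inputs[key]
        reads[key] = rev      # record the dependency and its revision
        return value

    def query(self, name, fn):
        if name in self.cache:
            value, reads = self.cache[name]
            if all(self.inputs[k][1] == r for k, r in reads.items()):
                return value  # every recorded dependency unchanged: reuse
        reads = {}
        value = fn(lambda k: self._get(k, reads))
        self.cache[name] = (value, reads)
        return value

db = Db()
db.set('a', 1)
db.set('b', 2)
total = db.query('sum', lambda get: get('a') + get('b'))
```

A real system also tracks dependencies between derived queries, so an unchanged intermediate result can cut off recomputation higher up; this sketch only tracks raw inputs.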

Functional polyglot languages like Morloc, Yona, and Enso allow for - what feels like - abstraction-first programming. They trade advanced language features for purity, making them well suited for type and program inference and structured editing.
Enso has a complete graph-based visual editor with interactive components and context-aware suggestions. Their wide variety of methods and easy composition on top of this editor make for a modern and productive web, data science, and prototyping workflow.
Fructure focuses on structured editing in Racket with a keyboard-driven tree-traversal interaction.
Lamdu is similar to Fructure but adds more advanced features in an interactive-first Haskell-like language.

Generalizing both the modality and purpose, Keykapp provides a predictive virtual keyboard that works by applying, composing, and sharing functions for your workflow. Keykapp is accessible to more people by making few assumptions about the nature of your input device. It relies solely on n-ary tree traversal, mapping any number of actuators to any number of actions (which, of course, includes your regular characters).
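As I read it, the traversal works like this: with k actuators you can reach any number of actions by walking a k-ary tree, one press per level. The sketch below is my own illustration, not Keykapp's implementation:

```python
import math

def build_tree(actions, k):
    """Pack actions into the leaves of a k-ary tree."""
    if len(actions) <= k:
        return list(actions)
    size = math.ceil(len(actions) / k)      # split into at most k chunks
    return [build_tree(actions[i:i + size], k)
            for i in range(0, len(actions), size)]

def select(tree, presses):
    """Follow one actuator press per tree level down to an action."""
    node = tree
    for p in presses:
        node = node[p]
    return node

actions = list("abcdefgh")
tree = build_tree(actions, k=2)   # two actuators: each press is a binary choice
```

With two actuators and eight actions, any action is three presses away; more actuators flatten the tree, and a predictive layer can reorder it so frequent actions sit nearer the root.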

Even from within existing languages and computational architectures, one can find clever solutions and elegant paradigms.
The Gen Julia framework enriches your functions for use in probabilistic programming. One example is enhancing deep learning by simulating the input data using predicted properties.
Scala 3 refines many features, including generic extensions, which allow adding methods to objects based on their implicit properties. For example, you can define a toBytes extension on all (potentially nested) containers of serializable elements.
Recursion schemes provide a template for classically error-prone recursive functions. Inspired by category theory, countless different schemes enable the abstraction of intricate folding and unfolding. I recommend Jared Tobin's practical recursion-schemes as an intro, as well as his overview of promorphisms and time-traveling recursion schemes. While there is still much experimental work to be done, Droste is one example of a practical library.
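To make the pattern concrete outside Haskell, here is a minimal cata/ana sketch in Python: the recursion lives once in the scheme, and the individual algebras stay non-recursive. The list-functor representation and all names are my own illustration:

```python
# A list-functor node is either ('nil',) or ('cons', head, tail_result).

def cata(alg, xs):
    """Catamorphism: tear down a list with a non-recursive algebra."""
    acc = alg(('nil',))
    for x in reversed(xs):
        acc = alg(('cons', x, acc))
    return acc

def ana(coalg, seed):
    """Anamorphism: build up a list from a seed with a non-recursive coalgebra."""
    out = []
    node = coalg(seed)
    while node[0] == 'cons':
        out.append(node[1])
        node = coalg(node[2])
    return out

# Algebra: sum the elements.  Coalgebra: count down from n.
total = cata(lambda n: 0 if n[0] == 'nil' else n[1] + n[2], [1, 2, 3, 4])
countdown = ana(lambda n: ('nil',) if n == 0 else ('cons', n, n - 1), 4)
```

The payoff is the same as in Haskell: once the traversal is written and trusted, each new fold or unfold is a small, obviously-terminating function.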