<aside>
**Post V of the Causal Discovery Series by https://diksha-shrivastava13.github.io/**
</aside>
[Conducted on 2nd October, 2024]
Following my work on implementing hybrid vector-graph databases and surveying industry use-cases, I wanted to test how good current LLMs are at Similar Link Prediction. Now, why the focus on “Similar Link Prediction”?
To explain that, it’s important to first take a look at the ARC-AGI problem and François Chollet’s paper *On the Measure of Intelligence* (**https://arxiv.org/abs/1911.01547**), which introduced the ARC benchmark in 2019. Chollet talks in great detail about defining intelligence, generalisation and adaptability. For our purposes here, I will only cover a select few points, although I do strongly recommend reading the whole paper.
Generalisation can be categorised in two ways:
- **System-centric generalisation**: the ability of a system to handle situations it has not itself encountered before.
- **Developer-aware generalisation**: the ability of a system to handle situations that neither the system nor its developer has encountered before.
Now, the degree of generalisation can be broadly defined as:
- **Absence of generalisation**: the system only handles the exact situations it was programmed or trained for.
- **Local generalisation (robustness)**: the system handles new points from a known distribution for a single task.
- **Broad generalisation (flexibility)**: the system handles a broad category of related tasks and environments without further human intervention.
- **Extreme generalisation (general intelligence)**: the system handles entirely new tasks that share only abstract commonalities with previously encountered situations.
For my purpose of handling reasoning in holistic systems, I wanted broad generalisation, which would enable the system to handle a vast number of ML problems by first understanding the hidden relationships and implications between entities of different subsystems.
Now, let’s take a quick look at an ARC-AGI example before we dive into the “Arc” World.
The ARC Challenge provides a few example input-output pairs, from which you must predict the output corresponding to the input of the test pair. The ARC Prize 2024 competition ended with a SOTA score of 55% on the public leaderboard and generated significant interest in the field of Test-Time Training (TTT).
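To make the task format concrete, here is a minimal sketch of reading one task in the publicly documented ARC-AGI JSON layout, where each task holds a `train` list of demonstration pairs and a `test` list whose outputs must be predicted. The file path below is purely illustrative, not a real task ID.

```python
import json

# A minimal sketch of the public ARC-AGI task format: each task file is a
# JSON object with a "train" list of demonstration pairs and a "test" list.
# Every grid is a list of rows, each cell an integer colour code from 0-9.
# The file path below is illustrative only.
with open("data/training/example_task.json") as f:
    task = json.load(f)

# Inspect the demonstration (input -> output) pairs.
for i, pair in enumerate(task["train"]):
    in_grid, out_grid = pair["input"], pair["output"]
    print(f"demo {i}: {len(in_grid)}x{len(in_grid[0])} grid "
          f"-> {len(out_grid)}x{len(out_grid[0])} grid")

# At evaluation time, only the test input is given; the solver must
# predict the corresponding output grid from the demonstrations above.
test_input = task["test"][0]["input"]
print(f"test input: {len(test_input)}x{len(test_input[0])} grid")
```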
My aim here is to achieve comparably good results that are much cheaper to run in production. Just as Retrieval-Augmented Generation (RAG) allows anyone to leverage the power of LLMs for a large number of general use-cases without having to fine-tune and host their own models, there can be a data representation which allows organisations to leverage reasoning and generality without having to scale compute at inference time.
The question is, what should this data representation look like? (see next section)