qMind is a research platform for building a more individualized model of intelligence and mental fitness via language. The core concept is that ideas are defined in relation to one another; words can only be defined in terms of other words, so our understanding can be represented by a directed graph of word meanings. By asking people to define different phrases in their own terms, we build a weighted network of meaning and word association.
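As a sketch of that idea (the names and shapes here are illustrative, not the platform's actual data model), the weighted association network can be built up from definitions like so:

```javascript
// Hypothetical sketch: a weighted, directed graph of word associations.
// createGraph/addAssociation are illustrative names, not qMind's API.
function createGraph() {
  return new Map(); // word -> Map(relatedWord -> weight)
}

function addAssociation(graph, from, to, weight = 1) {
  if (!graph.has(from)) graph.set(from, new Map());
  const edges = graph.get(from);
  // Repeated associations accumulate into a heavier edge
  edges.set(to, (edges.get(to) || 0) + weight);
}

// A participant defining "mind" in terms of other words strengthens
// the edges from "mind" to each word used in the definition.
const graph = createGraph();
for (const word of ["thought", "memory", "attention"]) {
  addAssociation(graph, "mind", word);
}
addAssociation(graph, "mind", "memory"); // mentioned twice
console.log(graph.get("mind").get("memory")); // → 2
```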
Current language models are based on statistical prediction: vast amounts of text are analyzed in advance to learn, in effect, a weighted graph of how likely words are to appear near one another in a sentence. These transformers have an incredible capacity to imitate complex and intelligent behavior.
Ours is based on self-reported definitions and associations provided by real people in their own words, and in the moment.
One of the requirements was that the portal be not only a means of collecting data, but also an exploration tool with which both participants and researchers could look deeply into the data. This meant interactive data visualizations; searching, sorting, filtering, and transforming the underlying data; and real-time analysis on top of it.
I implemented algorithms and related visualizations for common graph analysis tools: breadth- and depth-first search, and shortest path (first Dijkstra's, then A*). The unweighted case reduces to a breadth-first search:
function shortestPath(source, target) {
  // BFS over nodeMap, an adjacency list mapping each word to an
  // array of neighbors. Returns the shortest path from source to
  // target, or [] if none exists.
  if (!source || !target) return [];
  if (source === target) return [source];
  const queue = [source];
  const visited = { [source]: true };
  const predecessor = {};
  let tail = 0;
  while (tail < queue.length) {
    // Dequeue the next vertex
    let last = queue[tail++];
    let neighbors = nodeMap[last];
    if (!neighbors) continue;
    for (let neighbor of neighbors) {
      if (visited[neighbor]) continue;
      visited[neighbor] = true;
      if (neighbor === target) {
        // Path is complete; walk the predecessor chain back to source
        const path = [neighbor];
        while (last !== source) {
          path.push(last);
          last = predecessor[last];
        }
        path.push(last);
        path.reverse();
        return path;
      }
      predecessor[neighbor] = last;
      queue.push(neighbor);
    }
  }
  return []; // queue exhausted: no path exists
}
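Breadth-first search only handles the unweighted case; for weighted association edges, Dijkstra's algorithm (mentioned above) generalizes it. A minimal sketch, using a simple linear scan rather than a priority queue, and with names and data shapes of my own choosing:

```javascript
function dijkstra(graph, source, target) {
  // graph: plain object mapping node -> { neighbor: edgeWeight, ... }
  // Returns the cheapest path and its total cost, or an empty path.
  const dist = { [source]: 0 };
  const prev = {};
  const unvisited = new Set(Object.keys(graph));
  while (unvisited.size > 0) {
    // Pick the unvisited node with the smallest known distance
    let u = null;
    for (const node of unvisited) {
      if (dist[node] !== undefined && (u === null || dist[node] < dist[u])) {
        u = node;
      }
    }
    if (u === null || u === target) break; // unreachable, or done
    unvisited.delete(u);
    // Relax every edge leaving u
    for (const [v, w] of Object.entries(graph[u] || {})) {
      const alt = dist[u] + w;
      if (dist[v] === undefined || alt < dist[v]) {
        dist[v] = alt;
        prev[v] = u;
      }
    }
  }
  if (dist[target] === undefined) return { path: [], cost: Infinity };
  // Walk the predecessor chain back to the source
  const path = [target];
  while (path[path.length - 1] !== source) {
    path.push(prev[path[path.length - 1]]);
  }
  path.reverse();
  return { path, cost: dist[target] };
}

// Toy weighted graph: the direct edge is more expensive than the detour.
const toy = { mind: { thought: 1, memory: 4 }, thought: { memory: 2 }, memory: {} };
console.log(dijkstra(toy, "mind", "memory")); // → { path: ["mind", "thought", "memory"], cost: 3 }
```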
Providing these slices of information on the fly is a tough target to hit. The datasets generated by a single session are relatively small, but the analysis on them can be massive. Some tasks lent themselves strongly to client-side, per-user calculation; others to batch or cron jobs over tables or entire databases. That's why we went with a GraphQL-based communication layer and a hybrid mode of analysis, where work was split between server and client depending on its scope and significance.
Finding path lengths between specific nodes, for example, could be done on demand by the client, since the result is relevant only in that moment. Calculating the graph's eigenvector centralities, on the other hand, would be precomputed by batch processes.
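Eigenvector centrality is a good example of a batch-friendly computation: it can be approximated with power iteration over the whole graph. A minimal sketch (illustrative, not the platform's actual batch job):

```javascript
function eigenvectorCentrality(graph, iterations = 50) {
  // graph: plain object mapping node -> array of neighbor names.
  const nodes = Object.keys(graph);
  let scores = Object.fromEntries(nodes.map((n) => [n, 1]));
  for (let i = 0; i < iterations; i++) {
    // Multiply by (A + I): each node keeps its own score and receives
    // its neighbors' scores. The identity shift prevents the iteration
    // from oscillating on bipartite graphs.
    const next = Object.fromEntries(nodes.map((n) => [n, scores[n]]));
    for (const n of nodes) {
      for (const m of graph[n] || []) {
        next[m] += scores[n];
      }
    }
    // Renormalize so the scores don't grow without bound
    const norm = Math.sqrt(nodes.reduce((s, n) => s + next[n] ** 2, 0)) || 1;
    for (const n of nodes) scores[n] = next[n] / norm;
  }
  return scores;
}

// A small hub-and-spoke graph: "mind" sits at the center,
// so it ends up with the highest centrality.
const hub = { mind: ["thought", "memory"], thought: ["mind"], memory: ["mind"] };
const centrality = eigenvectorCentrality(hub);
console.log(centrality.mind > centrality.thought); // → true
```

Precomputing these scores server-side means the client can simply fetch them alongside the rest of the graph data.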
The platform will use participant datasets to build language models that are both generalizable and tunable. Eventually, the plan is to use the model to help identify cognitive deficiencies in the young and the elderly, to recommend areas of focus for school-aged children, and potentially to serve as an early-warning indicator for neurological disease.