Five days exploring the frontiers of artificial intelligence
A deep technical dive into agentic AI architectures — tool use, memory systems, planning under uncertainty, and the open problems between today's demos and reliable autonomous deployment.
Advances in RAG pipeline design — dense vs. sparse retrieval, late interaction models, reranking, and how to evaluate retrieval quality end-to-end rather than in isolation.
How MoE architectures achieve GPT-4 class quality at a fraction of the active parameters — routing mechanisms, load balancing, expert collapse, and what the next generation of sparse models looks like.
How models are being extended to million-token contexts through RoPE scaling, sliding window attention, and hybrid architectures, and the open questions around retrieval quality at long range.
How synthetic data generation is being used to augment real training data — quality filtering, model collapse risks, and domains where synthetic data already outperforms scraping.
Model versioning, experiment tracking, deployment pipelines, and monitoring strategies that scale from a solo researcher to a 50-person ML platform team.
Choosing between Pinecone, Weaviate, Chroma, and pgvector — indexing strategies, query performance benchmarks, and architecture patterns for embedding-heavy AI applications.
Economic modelling of AI's labour market impact — which roles are most exposed, where augmentation is winning, and what policy interventions have shown early evidence of working.
Why chasing perfect prompts is the wrong abstraction, and how to think about AI product quality through the lens of system design, evaluation pipelines, and feedback loops instead.
Building robust evals that catch regressions before they reach users — LLM-as-judge patterns, human eval sampling, and the metrics that actually correlate with user satisfaction.
Edge AI, powered by low-power TinyML, is revolutionising intelligent systems by bringing computation closer to data sources. The presentation explores strategies for designing efficient, scalable, and sustainable systems with social applications.
We present SignTalk, a dual-module system enabling bidirectional translation between sign language videos and textual sentences within a hospital domain.
A framework for detecting plant diseases using Edge TinyML is presented, focusing on sustainable precision agriculture. The system, which operates without batteries, combines on-device image inference, RF energy harvesting, and backscatter communication.
This article examines the role of artificial intelligence in criminal defence within Africa's legal system, highlighting benefits like improved legal analysis, precedent search, and decision prediction, while addressing risks to traditional proceedings, as well as bias and insufficient regulatory oversight.
How diffusion architectures have been adapted for non-image modalities, unique challenges of temporal coherence in video and audio, and state-of-the-art results in 3D generation.
Latest advances in mechanistic interpretability — superposition, circuit analysis, and sparse autoencoders — and what they're revealing about how transformers actually represent and process information.
Batching strategies, KV-cache management, speculative decoding, and hardware co-design decisions that reduce inference cost by 10x without degrading output quality.
An honest comparison of H100 clusters vs. TPU v5e pods for large-scale training workloads, covering TCO, throughput, fault tolerance, and ecosystem maturity.
Pipeline, tensor, and data parallelism strategies — when to combine them, failure modes to expect, and checkpoint strategies that don't cost you days of compute.
Systematic auditing methodologies for detecting representational harm in large models, with case studies from hiring, healthcare, and legal AI deployments.
A practical breakdown of compliance obligations under the EU AI Act, expected US federal frameworks, and how AI teams should structure governance processes today.
UX research findings on how users form trust with AI systems — what builds confidence, what destroys it, and the interaction patterns that make AI feel like a collaborator rather than a black box.
How to scope an AI product initiative, define success metrics, choose the right model tier, and ship a v1 without over-engineering — lessons from a dozen product launches.
A critical examination of current AGI progress claims — which benchmarks measure generalisation, where today's models fall short, and what milestones would constitute meaningful progress.
Hybrid systems that integrate neural networks with symbolic reasoning modules — recent results, remaining challenges, and where this approach outperforms purely connectionist alternatives.
State-space models, Mamba, and hybrid approaches challenging transformer dominance at sub-10B parameter scale — benchmarks, deployment results, and architectural trade-offs.
Practical model compression techniques that get 70B-parameter capabilities onto consumer hardware — trade-offs, tooling, and where quality collapses under aggressive compression.
How safety concepts like interpretability, robustness, and corrigibility translate into concrete engineering practices at AI labs shipping products to millions of users today.
A data-driven look at the AI product monetisation strategies showing strong retention and unit economics (subscriptions, usage-based pricing, embedded vs. standalone), with real-world conversion benchmarks.
How foundation models trained on scientific data are accelerating research cycles in biology, chemistry, and materials science — real results from the lab, not just benchmarks.
A technical deep dive into why chain-of-thought prompting works, how o1-style reasoning models allocate test-time compute, and the research frontier on learned reasoning strategies.
When to fine-tune vs. prompt, how to set up an efficient fine-tuning pipeline with parameter-efficient methods, and quality vs. cost trade-offs across different adaptation strategies.
Real-world deployments of AI in clinical settings — diagnostic tools, treatment planning, patient communication — with honest reporting on failure modes and the governance structures that caught them.
Adoption data and outcome metrics from enterprise AI copilot rollouts across legal, finance, and engineering — what drives sustained usage, what causes abandonment, and integration patterns that matter.
A forward-looking keynote on the open research problems that matter most — scalable oversight, robustness, interpretability, and alignment — and how the community should prioritise them.
A frank assessment of where RLAIF and constitutional approaches have succeeded and fallen short, with lessons from deployment at scale and the research directions that show the most promise.
A systematic comparison of open-weight and proprietary models across coding, reasoning, instruction-following, and safety benchmarks — and what the trend lines suggest about the future of the field.
A candid discussion between five senior researchers on problems the field is systematically underinvesting in — and what it would take to make real progress on each of them.
Practical strategies for managing compute costs at each stage of an AI startup — when to use APIs vs. host your own, spot instance strategies, and the infra decisions that don't age well.
The current state of synthetic media detection, watermarking standards, and platform-level interventions — what's working, what's failing, and what policymakers need to understand about the arms race.
Qualitative and quantitative research on how people think about and interact with AI products — mental models, frustration points, and the feature requests that surface again and again across demographics.