Edge-First LLM Semantic Routing on a 4GB Jetson Nano
People: David Pickett
Idea: Testing whether a 4GB NVIDIA Jetson Nano can act as an autonomous routing brain: it classifies each incoming query with a local embedding model, then decides whether to answer locally or escalate to more powerful servers across four compute tiers.
Details:
* The Jetson Nano can run llama.cpp
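
A minimal sketch of the routing decision described in the idea, under loud assumptions. The note does not say what the four tiers are, which embedding model is used, or how embeddings are compared. Here the tiers are hypothetical and numbered 0 to 3, with tier 0 taken to be the Nano itself; `embed` is a toy bag-of-words stand-in for a real local embedding model; and `TIER_EXAMPLES`, `MIN_SIMILARITY`, and `FALLBACK_TIER` are illustrative names and values, not part of the project.

```python
import math
import re
from collections import Counter

# Hypothetical example queries per tier; the source names four tiers
# but does not define them. Tier 0 is assumed to be the Nano itself.
TIER_EXAMPLES = {
    0: ["turn on the light", "what time is it"],
    1: ["summarize this paragraph", "translate this sentence"],
    2: ["write a python function", "debug this stack trace"],
    3: ["prove this theorem", "plan a multi step research project"],
}
FALLBACK_TIER = 3      # assumption: unfamiliar queries go to the strongest tier
MIN_SIMILARITY = 0.5   # assumption: arbitrary confidence cut-off


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """Toy stand-in for a local embedding model: a unit-length
    bag-of-words vector stored as a sparse {token: weight} dict."""
    counts = Counter(tokenize(text))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors."""
    return sum(w * b.get(tok, 0.0) for tok, w in a.items())


def route(query):
    """Return (tier, similarity) for the closest example query,
    falling back to FALLBACK_TIER when nothing is similar enough."""
    q = embed(query)
    best_tier, best_sim = FALLBACK_TIER, 0.0
    for tier, examples in TIER_EXAMPLES.items():
        for example in examples:
            sim = cosine(q, embed(example))
            if sim > best_sim:
                best_tier, best_sim = tier, sim
    if best_sim < MIN_SIMILARITY:
        return FALLBACK_TIER, best_sim
    return best_tier, best_sim


if __name__ == "__main__":
    for query in ["turn on the light", "please write a python function", "zzz qqq"]:
        print(query, "->", route(query))
```

In this sketch a query that matches nothing is escalated rather than answered locally, which trades extra server load for fewer weak local answers; the opposite default would be equally easy to express. Swapping `embed` for a real model would leave `route` unchanged as long as the replacement returns unit-length vectors in a form `cosine` can compare.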