Machine Intelligence Engineer
TAHO, Inc. is a remote-first software company with employees across the US on a shared mission to unlock computing hardware performance. We develop and sell TAHO (Trusted Autonomous Hybrid Operation), a software mesh for deployment and runtime architectures. We are building category-defining software that helps organizations improve performance, efficiency, and adaptability. As a fast-growing startup, we look for team members who thrive in high-impact roles and embrace challenges.
As of August 2025, we have closed our seed round and are pursuing additional customer traction. Our targets are $1 million in annual contract value by year-end 2025 and a $1 billion valuation within three years. We are a first mover and category creator with pragmatic near-term goals. Join us for a once-in-a-lifetime opportunity to unlock true computing efficiency for the entire world.
We are hiring a Machine Intelligence Engineer to build the systems that define the future of machine learning, from the core algorithms to the code that runs them and the systems that deliver them to customers. This is a role for an engineer with data science experience who thrives on creation: someone who believes that today's training frameworks and runtime methods can be reimagined from first principles. If you have strong opinions on how to get more out of modern training systems, and the drive to prove them in code, this is your opportunity. The role extends beyond TAHO's own distributed runtime: you will also build and evolve training systems for a diverse range of customers, from AI research labs to high-performance computing teams and enterprise partners. Each environment presents new architectural and optimization challenges, giving you the opportunity to shape how machine learning systems are trained and scaled in the real world.
The mission of this role is to design and implement intelligent systems that make TAHO’s distributed runtime adaptive, efficient, and self-optimizing, enabling computing that is radically faster, more affordable, and universally accessible. This work requires precision, creativity, and a systems-level mindset that uses AI to continuously improve how compute is orchestrated, scaled, and recovered. The Machine Intelligence Engineer is not a model tuner but a systems innovator, embedding intelligence directly into how TAHO operates and learns from the workloads it runs.
This role requires an applied research mindset: bridging theoretical understanding of AI with practical engineering that drives measurable performance gains. The Machine Intelligence Engineer must fluidly switch between:
- Applied intelligence engineering: designing and integrating machine learning models that enhance runtime scheduling, resource prediction, and system resilience across heterogeneous compute environments.
- Systems optimization and automation: using reinforcement learning, feedback loops, and predictive analytics to enable autonomous decision-making across TAHO’s runtime fabric.
- Cross-functional collaboration: working with product, infrastructure, and research teams to ensure AI-driven capabilities directly amplify TAHO’s category-defining mission of unlocking true computing efficiency at scale.
TAHO is a cutting-edge technology company, and staying ahead in a rapidly evolving environment demands adaptability, continuous learning, and innovative thinking. Team members need strong problem-solving skills and a customer-centric mindset, identifying issues effectively while keeping customer needs front and center. Collaboration and teamwork are crucial: working well with others and contributing to a team-oriented culture drives success. While the required technical proficiency varies by role, a basic understanding of our technology benefits everyone, and deeper expertise is essential for technical positions.
Responsibilities:
Intelligent Systems Architecture & Development
- Design and implement machine learning–driven components that make TAHO’s distributed runtime adaptive, predictive, and self-optimizing.
- Build intelligent workload schedulers that dynamically allocate compute resources based on live telemetry and historical performance data.
- Develop reinforcement learning and optimization models that improve runtime resilience, recovery, and throughput in real-world environments.
- Integrate AI into runtime decision loops to automate tuning, caching, and orchestration across multi-cloud and edge deployments.
- Build and iterate fast. Prototype, test, and refine training and runtime systems that challenge industry assumptions.
- Reimagine how training efficiency, distributed learning, and runtime intelligence can coexist in a single framework.
- Take ideas from research and turn them into production-grade engineering breakthroughs.
- Build and deliver next-generation training systems used both internally at TAHO and by external customers across multiple domains.
- Contribute to the evolution of TAHO’s intelligence framework, ensuring every system is efficient, interpretable, and production-ready.
Applied Compute Intelligence & Scalability
- Build and experiment with new architectures for distributed training and inference that push the limits of real-time decision-making in TAHO’s runtime.
- Design self-adaptive models and control systems that learn from utilization, latency, and workload behavior to continuously improve runtime performance.
- Prototype training and orchestration frameworks that turn raw telemetry into active intelligence, enabling the system to predict, adapt, and recover faster.
- Partner with product and infrastructure engineers to translate bold architectural ideas into production-grade features that redefine performance and scalability.
- Invent, test, and refine distributed training workflows for large-scale models operating in diverse, high-throughput environments.
Performance Optimization & System Learning
- Build intelligent feedback systems that allow TAHO’s runtime to learn from its own performance data, continuously improving accuracy, efficiency, and resilience.
- Design and implement new optimization architectures that go beyond monitoring: systems that diagnose, adapt, and self-correct in real time.
- Develop and experiment with AI-driven methods that reduce latency, balance utilization, and extend compute reach across multi-tenant and heterogeneous environments.
- Create next-generation observability tools that not only surface insights but act on them, closing the loop between measurement and optimization.
- Prototype, benchmark, and ship mechanisms that transform performance improvements into permanent system capabilities.
Integration, Deployment & Collaboration
- Lead the productionization of intelligence systems, ensuring TAHO’s learning and optimization capabilities are embedded seamlessly into the core runtime.
- Build automation frameworks that unify continuous training, validation, and deployment, enabling intelligence to evolve safely and reliably across environments.
- Design internal tools and developer frameworks that expand TAHO’s AI capabilities for both internal teams and external partners, accelerating adoption and experimentation.
- Collaborate across infrastructure, research, and product teams to translate bold ideas into deployable, measurable systems that drive real customer and business outcomes.
- Operate as a technical integrator and builder, not just aligning efforts but creating the processes and systems that make collaboration itself scalable.
Qualifications:
- 7+ years of experience in applied machine learning, systems engineering, or large-scale distributed infrastructure.
- Proven ability to design, train, and deploy ML models that directly enhance system performance, scheduling, or automation.
- Strong proficiency in one or more programming languages such as Rust, Python, or C++, with an emphasis on performance, safety, and integration into production environments.
- Deep understanding of distributed systems, runtime orchestration, and parallel computing architectures (multi-cloud, edge, or HPC).
- Experience applying reinforcement learning, online learning, or predictive modeling to optimize dynamic, real-time systems.
- Hands-on experience with AI-driven control systems, autonomous optimization loops, or resource-aware inference pipelines.
- Strong foundation in data processing, feature engineering, and telemetry analysis for system intelligence applications.
- Familiarity with WebAssembly (WASM), container orchestration (Kubernetes), or decentralized compute networks is highly valuable.
- Track record of implementing AI or ML pipelines in production, including model versioning, validation, and monitoring.
- Excellent communication skills, able to explain complex technical ideas clearly across engineering, research, and product teams.
- Comfort operating in ambiguous, high-velocity startup environments, with a bias for experimentation, iteration, and measurable impact.
What to expect in the first 30 days:
- Deep dive into TAHO’s runtime architecture, telemetry systems, and distributed compute fabric to understand how intelligence integrates with distributed execution.
- Audit existing performance data and identify high-impact opportunities where intelligence models can improve scheduling, efficiency, or recovery.
- Reproduce and analyze prior AI-driven optimization experiments to establish baselines for new approaches.
- Prototype one small-scale learning loop (for example, predictive scheduling or anomaly detection) using simulated or historical system data.
- Partner with the CTO and senior engineers to define one bold experiment that tests a new approach to distributed training or runtime learning, along with clear success metrics for its impact.
What to expect in the first 60 days:
- Design, build, and deploy the first production-grade intelligence component within TAHO’s runtime, demonstrating adaptive scheduling or predictive scaling that measurably improves performance.
- Create the foundational data and telemetry systems that enable ongoing model training, evaluation, and feedback loops across distributed environments.
- Build the first generation of automation hooks that let the runtime adjust its own behavior based on real-time intelligence output.
- Drive early architecture and design sessions, shaping how TAHO embeds learning and adaptation as native behaviors in its runtime.
- Codify key learnings from early prototypes into internal technical guidance and shared frameworks that will accelerate future intelligence development.
What to expect in the first 90 days:
- Deliver a fully functioning intelligence subsystem into staging and early production, demonstrating self-learning and adaptive behavior across TAHO’s infrastructure and selected customer environments.
- Prove measurable impact through live performance gains such as faster cold-start recovery, higher resource efficiency, and greater runtime stability in real workloads.
- Design and implement continuous integration and deployment pipelines for model retraining, ensuring that new intelligence can be rolled out safely and repeatably across TAHO and customer systems.
- Build real-time feedback loops that allow deployed runtimes to learn from new telemetry automatically, closing the gap between experimentation and production.
- Define long-term research and architectural priorities for autonomous optimization, distributed reinforcement learning, and cross-customer scalability, setting the foundation for TAHO’s next wave of intelligent systems.
What to expect in the first 6 months:
- Build, launch, and operate at least one fully autonomous optimization loop that continuously adjusts runtime performance across TAHO’s infrastructure and active customer deployments.
- Demonstrate significant, measurable improvements in compute efficiency, throughput, or recovery time, showing tangible impact for both internal systems and customer environments.
- Create visualization and control frameworks that make model-driven intelligence transparent, auditable, and actionable for engineering, operations, and customer teams.
- Lead architectural design and scaling initiatives that extend TAHO’s intelligent runtime across multi-region, multi-tenant, and customer-specific deployments.
- Deliver reusable frameworks and core libraries that empower other engineers and customer integration teams to build ML-enabled features on top of TAHO’s platform.
What to expect in the first year:
- Build and fully operationalize TAHO’s intelligence layer in production, enabling self-optimizing behavior across global runtime deployments and customer environments.
- Design and publish the foundational systems and frameworks that define how TAHO, and the organizations we serve, build, train, and evolve intelligent infrastructure.
- Create scalable pipelines and processes for continuous learning, training, and deployment of new intelligence models across multi-tenant, multi-cloud architectures.
- Lead and mentor engineers, guiding the growth of a builder culture that fuses systems engineering with applied intelligence.
- Author and share technical papers, case studies, or open frameworks that position TAHO as the reference standard for intelligent compute systems.
- Deliver transformative performance outcomes that validate TAHO’s mission to make computing radically faster, more affordable, and universally accessible through intelligent automation.
Perks:
- Freedom & Flexibility: fit your work around your life.
- Home Office Setup: we want you to be comfortable while working.
- Training Stipend: for conferences, courses, and classes.
- Equity: as a growing startup, we want all employees to have a stake in the company’s success.
- Cutting-Edge Work: shape the future of AI and infrastructure software.
- TAHO Swag.
- Medical benefits and holidays.
