Just Horizons Alliance (Amsterdam)
Senior Developer, AI Evaluation & Cloud Infrastructure | Just Horizons Alliance
Join us to build the technical foundation for AI accountability.
The Role
Just Horizons Alliance is an 18-year-old applied research lab focused on ethics and technology. Our current focus is the AI Ethics Index, a measurement framework for evaluating AI systems on ethics, safety, and societal impact.
We need a senior engineer to own the technical infrastructure end-to-end: learn what exists, close critical gaps, and build something that lasts.
The evaluation methodology is validated and in use. We're now at the stage where the systems need to mature alongside the research. This is the first dedicated infrastructure hire for this work, and you'll shape how it scales.
What You’ll Do
Months 1–3: Learn the System
Map the current architecture with Sophia Zitman (AIEI Team Lead). Understand the evaluation methodology, the data flows, and the infrastructure that supports them. Identify what needs to evolve for multi-domain benchmarking: reproducibility, security posture, test coverage, and the deployment pipeline. Begin implementing the highest-priority improvements.
Months 4–6: Build for Scale
Architect the infrastructure to support the next phase of the Index. CI/CD that maintains stability as the system grows. IAM and secret management built for a production environment. Experiment tracking that makes every evaluation run auditable. Documentation that enables the research team to work independently.
Months 7–12: Expand
Multi-domain benchmarking across education, healthcare, finance, and other sectors. Reproducibility standards that meet external scientific scrutiny. A system the research team can extend without engineering support for every change. At this point, the infrastructure should be stable enough that you're focused on capability, not maintenance.
Why This Role Is Difficult
This is infrastructure for a scientific standard, not a product feature.
Correctness and delivery both matter. A bug in the evaluation engine doesn't break a feature; it invalidates a measurement. A flawed pipeline doesn't just slow things down; it compromises the credibility of the research. At the same time, methodology that never runs in production has no impact. The role requires both rigor and momentum.
You're translating between disciplines. Your stakeholders are researchers, ethicists, and governance specialists. You'll need to take concepts like "operationalizing an ethical construct" and turn them into data models and pipelines. This is a translation problem as much as an engineering problem.
The work is novel. There's no existing system to reference. The AI Ethics Index is defining what rigorous AI evaluation looks like. You'll be making architectural decisions in areas where best practices don't yet exist.
You'll have full ownership. This is not a role where you're executing someone else's technical vision. You're setting the direction. That means autonomy, but it also means accountability.
You're probably the right person if
✅ You've built evaluation systems or data pipelines that other people depended on for correctness, not just uptime
✅ You're comfortable with GCP and have deployed containerized workloads in a real production context
✅ You've worked with LLM APIs and understand their reliability and reproducibility characteristics
✅ You can read a paper about measurement methodology and turn it into a working data structure
✅ You build for durability. Your code is still running 18 months later because you thought about the next person
✅ You've worked in organizations of somewhere between 5 and 50 people and you're comfortable being the person who figures things out without a playbook
✅ You find working on AI ethics infrastructure more interesting than building another e-commerce checkout flow
You're probably not the right fit if
❌ Enterprise environments make up most of your experience. This is not a large-team context
❌ You need clearly defined requirements before you can start. The requirements here evolve through conversation with ethicists
❌ You're based on the West Coast US or expect West Coast US working hours
❌ You mainly build user-facing APIs and features — this is systems and infrastructure work
❌ You're looking for a high-growth startup where shipping speed is everything. This is a scientific organization. Correctness is everything.
Hard Skills
These are the technical capabilities you need going in — or need to be able to build up fast using an AI coding agent. We're not looking for someone who ticks every box. We're looking for someone who closes gaps quickly and knows how to learn.
What you get
The role: You'll work directly with Sophia Zitman (AIEI Team Lead) as the technical backbone of the AI Ethics Index. Full engineering ownership of the evaluation engine.
The comp: Base salary of $110,000 on a remote contractor contract.
The team: Small, split between ethicists and engineers. You will interview with Janet Kang (Executive Director) and Sophia Zitman (AIEI Team Lead).
The environment: Boston-based non-profit (501(c)(3)). East Coast US or Western Europe time zones. Collaborative but autonomous — Sophia won't micromanage, but she will hold you to a high standard of systems thinking.
The upside: You'll have built the technical foundation of what may become the globally referenced standard for AI system evaluation. That's a meaningful line on any CV — and a genuinely hard thing to have done.

