Research Scientist
Dataiku
Headquartered in New York City, Dataiku was founded in Paris in 2013 and achieved unicorn status in 2019. Now, more than 1,000 employees work across the globe in our offices and remotely. Backed by a renowned set of investors and partners including CapitalG, Tiger Global, and ICONIQ Growth, we’ve set out to build the future of AI.
We're looking for a Research Engineer to join the applied research team at Dataiku. You would join our effort to define and develop LLM-based prototypes to recast the data analytics experience in DSS.
You would join the research team, helping embed Dataiku into the machine learning research community while also working hand-in-hand with Dataiku R&D to solve enterprise data challenges. Straddling these two worlds is no easy feat.
Dataiku’s mission is big: to enable all people throughout companies around the world to use data by removing friction surrounding data access, cleaning, modeling, deployment, and more. But it’s not just about technology and processes; at Dataiku, we also believe that people (including our people!) are a critical piece of the equation.
At Dataiku, we develop innovative AI products at the edge of academic and scientific advances to address our clients’ biggest challenges. We work on a wide range of topics, from drift detection to AutoML and causal ML… and of course LLMs.
This position requires a strong understanding of state-of-the-art machine learning and LLMs, as well as statistics, experiment design, and software engineering.
What you will do:
Lead research to distill ML into new product applications.
Help develop Dataiku’s new LLM-powered features.
Design, code, and test pipelines on real-life problems.
Manage scientific programs and activities at Dataiku AI Lab.
Contribute to the applied research machine learning community (seminars, conferences, publications).
You might be a good fit if you believe that 'transformers' are more than just robots in disguise and that 'BERT' isn’t just someone's uncle from Sesame Street, and you have…
MSc/PhD degree in Statistics, Machine Learning, or a related field, or equivalent practical experience.
5+ years of relevant work experience in a similar position.
Practical experience with LLMs.
And of course, strong coding skills in Python.
Who could it be?
A data scientist with strong coding skills and proven scientific curiosity (LLMs and beyond).
An ML software engineer with proven scientific curiosity (LLMs and beyond).
A research scientist with strong coding skills and business acumen.
So someone reasonable and practical, with good coding skills and solid scientific experience.
Deal-breaker question in the application form:
Do you have experience (playing) with LLMs (beyond ChatGPT)?
Bonus point for: fine-tuning an LLM, deploying an LLM application.
Do you have experience with NLP in general?