Software Engineer Intern - Fuzzy Distinct
Dataiku
This job is no longer accepting applications
At Dataiku, we're not just adapting to the AI revolution, we're leading it. Since our beginning in Paris in 2013, we've been pioneering the future of AI with a platform that makes data actionable and accessible. With over 1,000 teammates across 25 countries and backed by a renowned set of investors, we're the architects of Everyday AI, enabling data experts and domain experts to work together to build AI into their daily operations, from advanced analytics to Generative AI.
Internship goal
Augment Dataiku data preparation by improving features on data records
Detailed description
Today, Dataiku has a robust data preparation framework that processes vast amounts of data and helps users keep clean databases with the right data (and only the right data) inside them. However, we believe that with your help, we can take it a step further!
In a world where databases can be filled by real humans, data is not always clean. Errors can happen, typos can be made, and sometimes, you want to merge two database tables containing the same information, but not quite in the same format. “Dataiku”, “dataiku”, “data\niku” refer to the same company, but will be considered different entries in your database.
The goal of this internship is to improve the capabilities of our “distinct” processor to support fuzzy matching (that is, matching data that looks almost the same). You will help our customers clean up their databases, detect duplicated information, and reduce it to a single line.
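To make the idea of a fuzzy "distinct" concrete, here is a minimal sketch in Python. It is not Dataiku's implementation: the function names `normalize` and `fuzzy_distinct`, the 0.85 similarity threshold, and the greedy clustering strategy are all assumptions chosen for illustration, and the similarity score comes from the standard library's `difflib.SequenceMatcher`.

```python
import re
from difflib import SequenceMatcher


def normalize(value: str) -> str:
    """Lowercase and drop all whitespace so formatting differences vanish."""
    return re.sub(r"\s+", "", value).lower()


def fuzzy_distinct(values, threshold=0.85):
    """Greedily group values whose normalized forms are similar enough.

    Hypothetical helper: each value is compared against the first member
    (the representative) of every existing cluster and joins the first
    cluster whose similarity ratio reaches the threshold.
    """
    clusters = []  # list of (normalized representative, [original values])
    for value in values:
        key = normalize(value)
        for rep_key, members in clusters:
            if SequenceMatcher(None, key, rep_key).ratio() >= threshold:
                members.append(value)
                break
        else:
            # No existing cluster is close enough: start a new one.
            clusters.append((key, [value]))
    return [members for _, members in clusters]


records = ["Dataiku", "dataiku", "data\niku", "Dataikuu", "Acme"]
clusters = fuzzy_distinct(records)
distinct = [members[0] for members in clusters]  # one line per cluster
```

With this sample input, the three spellings from the example above and the typo "Dataikuu" fall into one cluster, while "Acme" stays separate. A greedy pass like this depends on input order and compares every value against every cluster representative, so a real component would likely need a more scalable strategy.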
Why Engineering at Dataiku?
Dataiku’s on-premise, cloud, or SaaS-deployed platform connects many data science technologies, and our technology stack reflects our commitment to quality and innovation. We integrate the best of data and AI tech, selecting tools that truly enhance our product. From the latest LLMs to our dedication to open source communities, you'll work with a dynamic range of technologies and contribute to the collective knowledge of global tech innovators. You can find out even more about working in Engineering at Dataiku by taking a look here.
How you'll make an impact
- Get familiar with Dataiku, its data preparation recipes, and database schemas.
- Help design a new component able to detect duplicate data.
- Develop the user interface that helps users understand the clusters of data.
- Help our users reduce their data overload!
Stack
- Python and Java for the backend
- JavaScript/Angular for the frontend
#LI-Onsite