·7+ years of industry experience focused on NLP, data science, AI/ML/LLM engineering, computer science, semantic engineering or a related discipline
·OR PhD in data science, AI/ML/LLM engineering, computer science, semantic engineering or a related discipline
Minimum Requirements:
·5+ years of experience with Natural Language Processing, Generative AI or related techniques for machine understanding of natural language (e.g., written text, omics data, or similar)
·7+ years of experience with Python, Spark, or related frameworks in AI, machine learning, data science, data engineering or a similar context
Mandatory Skills:
·Fluency in Python programming, version control and collaboration with Git, environment management (e.g., Poetry, conda, Docker), standard Python packages (e.g., pandas, numpy, matplotlib), and at least one ML framework (e.g., PyTorch, TensorFlow, fairseq)
·Experience with scalable data engineering frameworks such as Apache Spark and orchestration frameworks such as Airflow, and/or experience with semantic search and retrieval frameworks (e.g., development and benchmarking of embedding models and retrieval approaches in the context of Retrieval Augmented Generation, RAG)
·Experience with ML model deployment and operations (e.g., DevOps, MLOps, LLMOps), including CI/CD workflows and tooling (e.g., GitHub Actions)
·Experience with standard operations on non-relational databases (e.g., Elasticsearch/OpenSearch, MongoDB, Neptune), relational databases (e.g., PostgreSQL), and vector databases (e.g., pgvector, Elasticsearch dense vectors), as well as deployment of APIs and web applications (e.g., Flask, FastAPI, Django, or Dash)
·Working knowledge of statistical learning, such as supervised, unsupervised, and weakly supervised learning, particularly in NLP contexts
·Working knowledge of NLP and/or Generative AI tools and libraries (e.g., regular expressions, spaCy, LangChain), text annotation tools, and/or semantic frameworks (e.g., RDF triplestores, property graphs, ontology management)
·A demonstrated ability to engage cross-functional teams and stakeholders, including an eagerness to acquire working domain knowledge
·Excellent communication, teamwork, didactic, and leadership skills, including skills for scientific communication (authoring scientific articles and presenting) and guidance and mentorship of junior employees and less experienced collaborators
...