Simpson Thacher & Bartlett LLP
Full-time
Remote
United States
$145,000 - $165,000 USD yearly
Data Science and Analytics

Simpson Thacher’s Data Scientist will play a key role in delivering insights to the Firm’s leadership, legal practices, Knowledge Department and other administrative functions. To do so, this expert will use a mix of statistics, machine learning, deep learning and natural language processing methods to derive meaning, build models and create systems that capitalize on both structured and unstructured data. This position will play a crucial role in pushing the Firm’s artificial intelligence (AI) efforts forward, working on some of the most advanced projects in the legal industry.

The Data Scientist position sits within the Firm’s Applied Analytics Group, which is a part of the Firm’s Knowledge Department. This position will work closely with lawyers and legal support professionals across practices, as well as other technical resources within the Firm. This role requires creative problem solving, analytical rigor, technical skill and an appreciation for the business and practice of law.

Essential Job Duties & Responsibilities

  • Support legal teams and relevant operational staff in delivering on opportunities to use data to drive decision-making and improve the efficiency and effectiveness of the Firm’s client representations
  • Collaborate with Firm functional departments (e.g., Finance, Talent, Business Development, IT) to analyze data and develop solutions to support operational objectives
  • Develop regression and classification models using established and emerging data science methodologies
  • Chain, fine-tune and deploy pre-trained language models (e.g., BERT, Llama 4, Qwen2.5) to optimize performance on a range of NLP tasks, including text classification, named entity recognition, and generative tasks such as summarization, clause and document generation, and question answering
  • Design and deploy document segmentation and embedding approaches to facilitate information retrieval and retrieval-augmented generation (RAG)
  • Conduct advanced quantitative research, using machine learning (ML) and natural language processing (NLP) techniques to understand patterns in large volumes of data, identify relationships, detect data anomalies and classify data
  • Configure practice-specific AI workflows and language technologies, which may require complex pipelines, prompt engineering, prompt chaining, and text operations
  • Design and deploy highly visual reports and interactive user interfaces that surface quantitative insights in forms that are fit-for-purpose, modern and easily accessible to non-technical business professionals
  • Stay current with the latest advancements in LLM, NLP, deep learning and ML research, implementing cutting-edge techniques and incorporating them into production models as appropriate
  • Document development processes, codebase, and best practices to facilitate knowledge sharing and maintain a well-organized, reproducible environment
  • Partner with other technical resources to refine data pipelines for recurring classes of analysis and data-driven solutions
  • Handle projects on request under the direction of the CKIO, Director of Data Analytics and other executive staff

Education

  • A bachelor’s degree required, preferably in data science, mathematics, statistics, computer science, engineering, finance or a related field
  • Master’s degree in data science, computer science, statistics, computational linguistics or engineering preferred
  • Prior coursework in deep learning, natural language processing, or information retrieval a significant plus

Skills and Experience

  • 2+ years of experience in a data science, machine learning engineering, artificial intelligence or equivalent role, or a PhD in a related field
  • Highly proficient with statistical programming (e.g., Python, R) and databases (e.g., SQL, Pinecone)
  • Proven experience developing and validating linear and non-linear regression and classification models
  • Expertise in data transformation, data science and visualization libraries (e.g., pandas, scikit-learn, matplotlib, Snorkel, seaborn)
  • Experience with natural language processing and related libraries (e.g., Hugging Face’s Transformers, spaCy, NLTK) preferred
  • Ability to design and develop object-oriented machine learning systems beyond Jupyter notebooks a plus
  • Solid understanding of deep learning frameworks such as TensorFlow or PyTorch a plus
  • Proficiency with version control systems such as Git or equivalent tools for code management and collaboration
  • Able to translate business problems to technical logic and practical solutions
  • Able to communicate complex results clearly to a non-technical audience
  • Proactively develops and maintains technical knowledge in emerging data science areas
  • Experience in the legal field is a significant plus

Salary Information

The estimated base salary range for this position is $145,000 to $165,000 at the time of posting.

The actual salary offered will depend on a variety of factors, including, without limitation, the qualifications of the individual applicant for the position, years of relevant experience, level of education attained, certifications or other professional licenses held, and, if applicable, the location in which the applicant lives and/or from which they will be performing the job. This role is exempt, meaning it is not eligible for overtime pay.

Privacy Notice

For information about how Simpson Thacher & Bartlett LLP collects and processes your personal information, please refer to our Privacy Notice available at https://www.stblaw.com/other/privacy-notice.

Simpson Thacher & Bartlett is committed to a collegial work environment in which all individuals are treated with respect and dignity. The Firm prohibits discrimination or harassment based upon race, color, religion, gender, gender identity or expression, age, national origin, citizenship status, disability, marital or partnership status, sexual orientation, veteran’s status or any other legally protected status. This Policy pertains to every aspect of an individual’s relationship with the Firm, including but not limited to recruitment, hiring, compensation, benefits, training and development, promotion, transfer, discipline, termination, and all other privileges, terms and conditions of employment.
