STAR 1678-02 – Data Scientist
Skill Level: Expert
Location: Chantilly (fully on-site, no remote option)
**MUST HAVE A POLY CLEARANCE TO APPLY**
Day to day responsibilities include:
The Candidate shall perform end-to-end quality assurance of data feeds and data sets.
The Candidate shall provide support for data triage and assessment at the site.
The Candidate shall identify and document areas for improvement in workflows or systems.
The Candidate shall attend regular stand-up meetings.
The Candidate shall provide input to code reviews.
The Candidate shall cross-train on existing collection tools.
The Candidate shall support building, monitoring, alerting, and reporting (e.g., dashboards).
The Candidate shall support new use cases.
The Candidate shall research and document options for collecting or aggregating data from a variety of web-based and internal platforms.
The Candidate shall evaluate web-based platforms' ability to detect or deny access.
The Candidate shall make recommendations on approaches to acquire information.
The Candidate shall use appropriate tools and computer programming languages, such as Python scripts, to collect and process data from a variety of sources.
The Candidate shall use network APIs to programmatically access data.
The Candidate shall create, maintain, and enhance systems in support of data exploitation.
The Candidate shall create or improve custom collection scripts written in Python.
The Candidate shall create or improve scripts leveraging APIs for collection needs.
The Candidate shall automate data clean-up and conditioning of collected data.
The Candidate shall automate data management and dissemination steps.
REQUIRED SKILLS AND DEMONSTRATED EXPERIENCE
Demonstrated experience programming in Python.
Demonstrated experience working with data in a variety of structured and unstructured formats, including MS Excel.
Demonstrated experience with a variety of database tools, such as SQL and Presto, and data lakes/S3 data.
Demonstrated experience building and demonstrating data visualization tools and dashboards, especially Elasticsearch, Kibana, Tableau, Apache Superset, and Power BI.
Demonstrated ability to translate complex, technical findings into an easily understood narrative in graphical, verbal, or written form.
Demonstrated experience analyzing questions, formulating requirements, determining suitable analytic approaches, evaluating results, and communicating findings to partners and stakeholders.
Demonstrated experience translating complex technical findings into an easily understood narrative in graphical, verbal, or written form.
Demonstrated openness to feedback and ability to collaborate well with colleagues to develop client-tailored products.
HIGHLY DESIRED SKILLS AND DEMONSTRATED EXPERIENCE
Demonstrated experience with AI/ML, such as natural language processing, in a production environment.
Demonstrated experience developing and refining statistical models, inferential statistics, and hypothesis testing.
Demonstrated experience with data management tools, such as Hadoop, MapReduce, or similar.
Demonstrated experience conducting data science using the Apache Zeppelin and Jupyter notebook platforms and Spark/PySpark.
Demonstrated experience developing MS Visual Basic for Applications (VBA) code.
Demonstrated experience supporting traditional and commercial targeting efforts.
Demonstrated experience programming in the R language.