Note: This position is available on our W2 contract, and this position is located in New York City, NY.
Candidates are required to travel to the client's offices.
Job Overview:
This is a heavily hands-on role responsible for both data engineering and data science work. On the data engineering side, it involves handling big data from a variety of sources using Google Cloud Platform, building data pipelines, and developing ETL processes. On the data science side, the person in this role should be able to propose and build data-related products and tools, leveraging machine learning, cloud machine learning APIs, or deep learning, to meet business needs or improve the efficiency of business processes.
Responsibilities:
Process large and complex raw data sets to meet business needs.
Build the workflows required for optimal extraction, transformation, and loading of data from a wide variety of data sources into Google BigQuery.
Work with BI team to create Looker reports that utilize the data pipeline to provide actionable insights into key business performance metrics.
Reframe newsroom and business objectives as machine learning tasks that can deliver actionable insights, accurate predictions, and effective optimization.
Implement and execute machine learning research with reliability and reproducibility.
Turn models into data products, collaborate with other teams, and integrate them into processes throughout Hearst Newspapers.
Qualifications:
MS, PhD, or 4+ years' experience in computer science, applied mathematics, or another quantitative/computational discipline.
Familiarity with different types of machine learning algorithms
Experience building and deploying a machine learning system into production
Data engineering experience, including SQL and manipulating large raw data sets for analysis
Experience building and optimizing data pipelines and architectures
Experience building processes supporting data extraction, transformation, and loading into big data systems
Preferred:
Experience with relational SQL and NoSQL databases, including Postgres and SQL Server
Experience working with unstructured data and Natural Language Processing
Experience with data pipeline and workflow management tools: Airflow, Luigi, etc.
Experience with Google Cloud or AWS cloud services
Experience with object-oriented/object function scripting languages: Python, Java, Scala, etc.
--
RiVi Group