VPAL Research Data Engineer
383057 IT Specialist
Duties & Responsibilities
As a member of the Vice Provost for Advancements in Learning (VPAL) research group, the VPAL Research Data Engineer will support our research program in pedagogy and education by managing the data obtained from online and mixed online and residential courses. The person filling this position will be responsible for designing, implementing, and maintaining scalable software that transforms and integrates large, complex datasets into a reliable data infrastructure capable of producing high-quality research data for faculty, staff researchers, post-doctoral fellows, and course developers at HarvardX. The research data engineer will work closely with technology partners, including Harvard University Information Technology, EdX, and potentially others, to define data requirements, verify data integrity, establish secure workflows for the transfer and storage of this data, and define, prepare, and present datasets extracted from the raw data collected. This person will also serve as a liaison to, and member of, a technical community that will define and implement the next generation of tools for teaching and learning, both online and on campus. She or he will be expected to act as a voice for educational research interests within that community.
Please note: This is a two-year term position with the possibility of renewal.
Candidates MUST meet the following basic qualifications in order to be considered for this role:
Bachelor’s degree in Computer Science or a related field, or equivalent work experience.
Minimum of 2 years of experience in technical projects with database components.
Proficiency in data integration and data quality development.
Proficiency with relational database systems and/or NoSQL databases.
Strong programming experience, including Python, Java, Scala, SQL, and shell programming.
Familiarity with Python frameworks like Django or Flask.
Excellent problem-solving skills with a proven track record.
Good oral and written communication skills as well as strong organizational and time management skills, with a demonstrated ability to manage multiple projects and tasks concurrently.
Additional Qualifications
Advanced degree in Computer Science or a related field.
Experience performing data analysis with very large data sets.
Experience with AWS cloud services: EC2, EMR, RDS, Redshift.
Experience with stream-processing systems: Storm, Spark Streaming, etc.
Experience with data pipeline and workflow management tools: Airflow, Luigi, etc.
Experience with Kubernetes and Docker is desirable.
Experience with Big Data ML toolkits, such as Mahout, SparkML, or H2O.
Please note: Harvard University requires pre-employment reference and background screening.
The VPAL Research Group is unable to provide work authorization and/or visa sponsorship.
Term will end 2 years from the date of hire.
USA – MA – Cambridge
00 – Non Union, Exempt or Temporary
Full time. Monday through Friday. 35 hours per week.
We are an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability status, protected veteran status, gender identity, sexual orientation, pregnancy and pregnancy-related conditions, or any other characteristic protected by law.