|
We are seeking a motivated Student Programmer to assist with data collection, cleaning, and management tasks for an active research project focused on Knowledge Graphs (KGs) of scientific activity. The goal of this project is to model the scientific ecosystem (including datasets, research papers, authors, institutions, and funding sources) as a large-scale knowledge graph to facilitate new kinds of data-driven analysis and discovery. You will be working directly with faculty and other student researchers.

The Team Responsibilities: The programmer will primarily focus on tasks essential for building and maintaining the knowledge graph.

API Integration & Data Collection: Develop and maintain Python scripts to interact with various APIs (e.g., scholarly databases, institutional repositories) to collect structured and semi-structured scientific metadata.

Data Cleaning and Transformation: Implement data processing routines in Python to clean, normalize, and transform raw data into a format suitable for graph modeling.

Graph Database Management: Load, update, and manage data within the Neo4j graph database. This includes writing and optimizing Cypher queries for data manipulation and ensuring data integrity.

Documentation: Maintain clear and thorough documentation of data sources, collection scripts, and the data schema.
|