Genetic Sequence Database Product Owner and Data Wrangler
Bethesda, MD
Full Time
Experienced
Computercraft is looking for a Genetic Sequence Database Product Owner and Data Wrangler to support our work for the National Center for Biotechnology Information (NCBI), part of the National Library of Medicine (NLM) at the National Institutes of Health (NIH).
NCBI, one of the 400 most-visited sites in the world, is the premier biomedical center, hosting over four million daily users in search of clinical, genetic, and other information. NCBI’s wide range of applications (e.g., PubMed, ClinicalTrials.gov), platforms, and environments (e.g., big data [petabytes], machine learning, multiple clouds) serve more users with more data than any other U.S. Government agency. Working on NCBI products, you can help to accelerate the development of cures for diseases like cancer.
The Sequence Archives and Submissions (SeqArch) program needs a Product Owner and Data Wrangler for the GenBank sequence database, a unique scientific resource of human health and genetic data at NCBI. This person will be responsible for coordinating data exchange with the International Nucleotide Sequence Database Collaboration, generating downloadable data for external users, and coordinating targeted updates to the database based on systematic changes in taxonomic information.
In this position you will help manage GenBank’s data-access-related products, tools, and protocols. You will make decisions about the direction of the product and prioritize tasks. You will also work to define development tasks, establish delivery schedules, and ensure compliance with the organization’s policies and procedures.
Job Responsibilities
- Develop product vision, goals, and strategic roadmaps
- Lead data-gathering efforts through market research, data analysis, and user research to make balanced, objective decisions and provide clear guidance to delivery teams to create incremental value in an Agile environment
- Synthesize data-gathering efforts into a logical organization of epics and user stories for the development team
- Collaborate with users and lead cross-functional teams to define and optimize user workflows to improve user experience
- Understand customer segments and identify targeted solutions that meet or exceed their needs
- Lead teams through a complete product lifecycle of discovery to delivery
- Nurture partnerships with various stakeholders who wish to participate in the sharing of genomic data for research in cloud and conventional environments, using secure cross-agency protocols
- Participate in external collaborations and work with senior stakeholders
- Analyze incoming genetic sequence data for trends
- Prioritize the actions of the product team
- Critically evaluate datasets and functional annotations to assess quality
- Monitor automated dataflows for loading data to production databases
- Provide critical expertise to NCBI in biological data curation of genetic sequences
- Analyze log files, error files, or test-case “diffs” that can total hundreds of megabytes using tools such as sed, grep, awk, and Perl to confirm known/expected outcomes and identify outlier/problematic outcomes
Required Skills/Experience
- B.S. in bioinformatics, molecular biology, data science, computer science, information technology, or a similar field
- Excellent verbal and written communication skills
- Genomics/bioinformatics experience
- Strong understanding of molecular biology concepts
- Scientific ETL data model experience/skills
- The ability to troubleshoot technical and staffing roadblocks and mitigate resource risks
- Experience managing large and cross-functional projects in a complex, policy-driven environment
- Strong customer engagement, networking, presentation, and collaboration skills
- Ability to incorporate and diplomatically resolve conflicting priorities from multiple user groups and technical stakeholders
- Data processing experience in a Linux environment (5+ years)
- Experience coaching team members and eliminating knowledge silos
- Experience working with GenBank or other sequence databases at NIH or other organizations
- Experience with data interoperability and sharing standards and policies
- Experience working with cloud data storage and processing platforms (e.g., AWS, GCP)
- Proficiency in at least one scripting language (e.g., Bash, Python)
- Experience working with large SQL databases involving many tables and billions of data rows
- Experience with CI/CD pipelines, unit tests, integration, and regression testing
- Expertise in bioinformatic sequence analysis and its tools, including BLAST and multiple sequence aligners
- Solid understanding of key molecular biology concepts, such as the central dogma that describes the flow of genetic information from gene (DNA) to mRNA to protein
- Experience working in Product Owner or Product Manager positions in an Agile environment (e.g., developing vision, strategic plan, roadmap, requirements; applying user testing methodologies; prioritizing features based on value and effort)
The compensation for this position will be based on the experience of the successful candidate. The expected pay range for this position is $110,000 to $150,000.
Computercraft offers an excellent benefits package that includes health, dental, vision, disability, and life insurance; a 401(k) plan with matching; paid leave starting at 128 hours/year for the first 3 years of employment; and 11 paid holidays. We also offer the opportunity for a positive work–life balance with a standard 40-hour work week and the chance to work alongside a team of highly accomplished professionals.
To learn about other Computercraft job opportunities, please visit the Careers section of our website: https://www.computercraft-usa.com/.
EEO Employer – Disability/Veteran/Race/Color/Religion/Sex/National Origin/Genetic Information