[Remote] Research Intern (LLM)

Remote Full-time
Note: The job is remote and open to candidates in the USA.

Abaka AI is focused on advancing artificial intelligence research and is seeking a Research Intern to contribute to the development of challenging QA datasets and to evaluate large language models. The role involves collaboration with global researchers and requires strong analytical and execution skills.

Responsibilities
- Design and construct high-quality, sufficiently challenging QA datasets (graduate/PhD level) inspired by the GPQA, HLE, and AI4Sci families, collaborating with a global network of talented researchers
- Evaluate large language models on reasoning, factuality, and problem-solving benchmarks
- Develop review pipelines and quality-control criteria for expert-level question generation
- Analyze model outputs, conduct error taxonomy studies, and summarize insights for internal reports and research papers
- Collaborate with the 2077AI Foundation's open-source benchmark teams on public dataset releases

Skills
- Strong background in computer science, data engineering, artificial intelligence, or related fields, with hands-on experience in large-scale data systems
- 1+ years of experience with LLMs, prompt engineering, and evaluation frameworks (e.g., LM Eval Harness, OpenCompass)
- Excellent written and verbal English skills and analytical reasoning
- Strong execution and team management skills: able to translate high-level objectives into actionable plans and drive team outcomes
- Experience with formal methods, chain-of-thought evaluation, or curriculum generation
- Relevant publications in top conferences

Company Overview
Abaka AI is a leading AI company committed to becoming the data partner of the artificial intelligence industry. It was founded in 2021 and is headquartered in Palo Alto, California, USA, with a workforce of 51-200 employees.

Company H1B Sponsorship
Abaka AI has a track record of offering H1B sponsorships, with 2 in 2025. Please note that this does not guarantee sponsorship for this specific role.
