Ardy

Hi! I'm Ardy. I currently work on evaluating AI capabilities and risk propensity, with a focus on building better evaluation tools and environments. I earned my MS in Computer Science from the University of Southern California and my BS in Informatics from Institut Teknologi Bandung, Indonesia.

Before my master's, I worked as an AI Engineer for three years at Bukalapak, one of Indonesia's largest e-commerce companies. I worked on their agentic LLM chatbot, recommendation system, and fraud detection platform.

Awards

2nd Place · AI Manipulation Hackathon

Apart Research

Solo team. Empirical evaluation of frontier LLMs' susceptibility to adversarial web content injection in product recommendation. Designed a benchmark across 10 product categories and quantified deception rates.

2nd Place · AI Safety Startup Hackathon

Apart Research

Solo team. Proposed a startup to scale training and alignment of agentic AI in cyber-physical systems through domain-specific simulations and human feedback pipelines.

Fulbright Scholarship

U.S. Department of State

Sponsored by the U.S. Department of State for an M.S. in Computer Science at the University of Southern California.

Education

M.S. Computer Science · University of Southern California

Aug 2024 – Dec 2025

GPA: 3.75. Fulbright Scholar. Coursework: Trustworthy Large Foundation Models, Advanced NLP, Probabilistic and Generative Models, Machine Learning.

B.S. Informatics · Institut Teknologi Bandung

Aug 2017 – Oct 2021

GPA: 3.73. Thesis: Domain Adaptation in Aspect Based Sentiment Classification using BERT Model.

Work Experience

AI Research Engineer · Bukalapak

Jun 2021 – May 2024

Built the company's first agentic LLM RAG chatbot for customer service. Owned 50+ data pipelines for recommendation systems. Reduced infrastructure costs by more than 30%. Developed fraud detection models that cut takedown time for forbidden products by 50%.

LLM Agents · Recommendation · Fraud Detection

AI Engineer · Supertype

Sep 2020 – Feb 2021

Built a web app for sentiment analysis of Google Play products. Handled design, development, and deployment end to end using Python, Flask, and Altair.

NLP · Flask · Sentiment Analysis

Software Engineer Intern · Shopee

May – Aug 2020

Developed the mobile frontend for Shopee's gamification campaign in React, deployed across 10+ countries.

React · Mobile · Frontend

Research Projects

Social Media Narrative Monitoring Through Knowledge Graphs

Feb – Apr 2025

SPAR Project. Under Kellin Pelrine at Mila / Complex Data Lab. Built a platform to extract narratives and public figure mentions from social media, converting them into a queryable knowledge graph for analysts.

Knowledge Graphs · NLP · SPAR

Indonesia's AI Roadmap Risk Assessment

Jul – Sep 2025

AISES course project. Developed an LLM-powered literature review pipeline to analyze the gap between Indonesia's National AI Roadmap and the MIT Risk Repository. Found strong coverage of cyber/misuse risks but near-zero mention of AI control and system safety.

AI Policy · Risk Analysis · AISES

Eliciting Ranking Bias and Deception on Generative Search Engines

2024 – 2025

Evaluated LLM vulnerabilities to adversarial prompts in RAG platforms for product recommendation. Demonstrated that attacks transfer across models (GPT-4.1, GPT-5.1) and consumer platforms (ChatGPT). Extended to study deception in factual claims.

Adversarial ML · RAG · LLM Security

Compact CoT Exploration with Multi-token Forward Passes

2024 – 2025

Proposed fused-token forward passes (FTFP), an alternative to next-token embedding that uses a probability-weighted combination of the top-k token embeddings. Replicated Meta's Coconut paper as a baseline decoding method.

Chain-of-Thought · Decoding · LLMs
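
A minimal sketch of the fused-token idea in the Compact CoT entry above, in plain Python. It is an illustration under stated assumptions, not the project's code: the names `fused_token_embedding`, `embedding_table`, and `k`, the toy values, and the choice to renormalize the top-k probabilities so they sum to 1 are all hypothetical.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fused_token_embedding(logits, embedding_table, k=2):
    """Return a probability-weighted mix of the top-k token embeddings.

    Assumption: the top-k probabilities are renormalized to sum to 1
    before mixing. The original project may weight them differently.
    """
    probs = softmax(logits)
    # Indices of the k most probable tokens.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in top)
    dim = len(embedding_table[0])
    fused = [0.0] * dim
    for i in top:
        weight = probs[i] / mass
        for d in range(dim):
            fused[d] += weight * embedding_table[i][d]
    return fused


# Toy vocabulary of 3 tokens with 2-dimensional embeddings (made-up values).
embedding_table = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]

# Two equally likely tokens and one very unlikely token.
fused = fused_token_embedding([0.0, 0.0, -50.0], embedding_table, k=2)

# With k=1 the result reduces to the ordinary single next-token embedding.
greedy = fused_token_embedding([2.0, 0.0, 0.0], embedding_table, k=1)
```

In this toy case `fused` sits halfway between the first two embeddings, while `greedy` is just the embedding of the most probable token, which is what standard next-token decoding would feed into the next forward pass.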

Stock Movement Prediction from Social Media & Company Correlations

2024

Reproducibility project. Lead contributor. Used graph attention networks with multimodal data (prices, tweets, inter-stock graphs). Improved results by adding residual connections and LayerNorm to combined embeddings.

Graph Neural Networks · Multimodal · Reproducibility
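
A rough sketch of what adding a residual connection and LayerNorm to a combined embedding can look like, in plain Python. The function names, the split into `base` and `update` vectors, and the toy values are hypothetical, and the normalization omits the learnable scale and shift that a trained LayerNorm would have. It does not reproduce the project's actual architecture.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and (approximately) unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def combine_with_residual(base, update):
    """Add the update to the base embedding (residual), then normalize."""
    summed = [b + u for b, u in zip(base, update)]
    return layer_norm(summed)


# Made-up vectors standing in for, e.g., a price embedding (base) and a
# fused text/graph embedding (update). Which signal plays which role is
# an assumption made for illustration.
base = [1.0, 2.0, 3.0]
update = [0.5, 0.5, 0.5]
combined = combine_with_residual(base, update)
```

The residual sum keeps the base signal available to later layers, and the normalization keeps the combined vector on a consistent scale regardless of how large the individual embeddings are.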

Activation Engineering Library (Patchscope)

Jan 2024

AI Safety Camp. Participated in designing a modular activation engineering library based on Google PAIR's Patchscope paper, providing APIs for logit lens and future lens interactions with model internals.

Interpretability · AI Safety Camp

Measuring Unlearning in LMs Under Few-shot Learning

Jun – Aug 2022

ML Safety Scholars. Evaluated unlearning capabilities in language models and uncovered inverse scaling phenomena. Measured training iterations needed to learn new word meanings while forgetting prior harmful information. Compared GPT-2 and InstructGPT.

Unlearning · Inverse Scaling · ML Safety