Tomomi’s AI and Machine Learning Portfolio

AI systems do not operate in isolation: they shape how people decide, trust, behave, and interact.

This portfolio explores machine learning, user behavior analytics, AI safety, and generative AI through practical, real-world applications.

I’m Tomomi Tanaka, an economist and technical program leader working at the intersection of AI, behavioral science, safety, and large-scale decision systems. My work spans machine learning, experimentation, user behavior analytics, AI safety, and generative AI evaluation across technology platforms, digital products, and public-sector systems.

This portfolio documents my ongoing work in:

Machine Learning with Python
User Behavior Analytics
AI Safety and Generative AI Safety
Behavioral AI and Decision Systems
LLM Evaluation and AI-Assisted Analysis
Real-World Applications of AI and Data Science

Through practical projects, research, and technical implementations, I aim to explore how AI systems can be designed to be more effective, trustworthy, and aligned with human behavior.

Watch the video of my talk at the UN Headquarters

Topics Covered

Generative AI for Data Analysis

This section explores how large language models (LLMs) such as GPT-4, Claude, and Gemini are transforming modern data analysis and research workflows. Beyond conversational applications, generative AI models are increasingly being used as powerful analytical tools for classification, evaluation, pattern detection, and insight generation across complex datasets.

Through practical case studies and research-driven projects, this series examines how generative AI can support scalable, data-driven analysis while also revealing the strengths, limitations, and risks of current AI systems.

GenAI vs Crypto Scammers: Which LLM Wins

Topics Covered

LLM-Based Data Analysis and Classification
Comparative Evaluation of Multiple LLMs
Prompt Engineering and Evaluation Design
Multilingual and Cross-Cultural Analysis
AI-Assisted Text Classification
Behavioral Pattern Detection
Generative AI Benchmarking
Research Methodology and Validation
Ethical Considerations in AI-Assisted Analysis
Real-World Applications of Generative AI in Research

Areas of Focus

Readers will learn how to:

Design evaluation frameworks for comparing LLM performance
Build secure and unbiased AI testing pipelines
Use generative AI models for large-scale text analysis
Analyze multilingual datasets and cultural communication patterns
Evaluate model strengths, weaknesses, and failure cases
Apply rigorous research methodologies to AI-assisted analysis
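
As a rough illustration of what a cross-model evaluation framework can look like, the sketch below scores several models against the same human-labeled examples. It is a minimal example under stated assumptions, not the method used in the projects themselves: the model names (`model_a`, `model_b`), the `scam`/`legit` labels, and the tiny dataset are all made up for illustration.

```python
from collections import namedtuple

Scores = namedtuple("Scores", "accuracy precision recall f1")


def score_predictions(y_true, y_pred, positive="scam"):
    """Compute accuracy, precision, recall, and F1 for one model's labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return Scores(accuracy, precision, recall, f1)


def compare_models(y_true, predictions_by_model, positive="scam"):
    """Score every model against the same human-labeled examples."""
    return {
        name: score_predictions(y_true, preds, positive)
        for name, preds in predictions_by_model.items()
    }


# Hypothetical labels and model outputs, for illustration only.
human_labels = ["scam", "scam", "legit", "legit", "scam", "legit"]
model_outputs = {
    "model_a": ["scam", "scam", "legit", "scam", "scam", "legit"],
    "model_b": ["scam", "legit", "legit", "legit", "legit", "legit"],
}

results = compare_models(human_labels, model_outputs)
# Rank models by F1 so the trade-off between precision and recall is visible.
for name, s in sorted(results.items(), key=lambda kv: kv[1].f1, reverse=True):
    print(f"{name}: accuracy={s.accuracy:.2f} precision={s.precision:.2f} "
          f"recall={s.recall:.2f} f1={s.f1:.2f}")
```

Reporting precision and recall alongside accuracy matters here because a model that rarely flags anything can look accurate on an imbalanced dataset while missing most of the cases of interest.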

Real-World Applications

This section includes practical projects and case studies demonstrating how generative AI can be applied to real-world analytical challenges.

Examples include:

Scam and fraud detection using LLMs
Behavioral pattern analysis in online conversations
Cross-model benchmarking and evaluation
Multilingual classification and cultural analysis
AI-assisted research workflows and experimentation

Each project combines technical implementation, empirical evaluation, and real-world context to demonstrate how generative AI can support modern data science while maintaining scientific rigor, transparency, and responsible AI practices.

Generative AI Safety

In this section, I explore the critical realm of Generative AI safety, addressing the challenges and opportunities presented by this revolutionary technology. From ChatGPT’s conversational abilities to DALL-E’s artistic creations, generative AI is reshaping our world. This series aims to equip you with the knowledge to navigate the complex landscape of generative AI safety.

Readers will learn how to:

Understand key risks associated with generative AI systems
Evaluate safety challenges in language and image generation models
Analyze misinformation, manipulation, and abuse scenarios
Explore alignment and oversight approaches for advanced AI systems
Design safer human-AI interaction workflows
Examine the trade-offs between innovation, usability, and safety

The series combines technical concepts, practical implementation strategies, and real-world examples to provide a comprehensive introduction to the rapidly evolving field of generative AI safety.

AI System Safety

This section focuses on the safety, reliability, robustness, and responsible deployment of machine learning systems. As AI systems become increasingly integrated into high-impact products and decision-making environments, understanding how to evaluate and mitigate risks is becoming essential for both technical and non-technical practitioners.

Through practical Python implementations and real-world examples, this series explores key techniques for improving the fairness, transparency, robustness, and governance of machine learning models.

Topics include:

Fairness, Bias Detection and Mitigation
- AIF360 library
- Reweighing
- Disparate Impact Analysis
Model Explainability and Interpretability
- SHAP
- LIME
- Partial Dependence Plots (PDP)
Reliability and Robustness
- Adversarial training
- Robust Model Evaluation
Ethical Considerations
- Calibrated Equalized Odds
- Mitigation
Adversarial Robustness
- PGD attack
- Adversarial Training with PGD
- TRADES
- Randomized Smoothing
Privacy-Preserving Machine Learning
Scalable Oversight of AI Systems
- Recursive Reward Modeling
- Debate and Amplification Techniques
- Factored Cognition Approaches
- Human-AI Interaction Protocols
Ethical AI Development
- AI Development Lifecycle

This series emphasizes both the technical implementation and the broader real-world implications of machine learning safety, helping readers better understand how to build and evaluate AI systems that are more trustworthy, transparent, and resilient.
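
To make one of the fairness topics above concrete, here is a minimal sketch of a disparate impact calculation in plain Python. It does not use AIF360 and is not taken from the series; the group labels (`a`, `b`), the predictions, and the function names are hypothetical. The ratio compares the rate of favorable outcomes for an unprivileged group to the rate for a privileged group, and values below about 0.8 are often treated as a warning sign (the "four-fifths" rule of thumb).

```python
def selection_rate(outcomes):
    """Share of favorable outcomes (1 = favorable, 0 = unfavorable)."""
    if not outcomes:
        raise ValueError("group has no records")
    return sum(outcomes) / len(outcomes)


def disparate_impact(outcomes, groups, unprivileged, privileged):
    """Ratio of favorable-outcome rates: unprivileged / privileged."""
    unpriv = [o for o, g in zip(outcomes, groups) if g == unprivileged]
    priv = [o for o, g in zip(outcomes, groups) if g == privileged]
    priv_rate = selection_rate(priv)
    if priv_rate == 0:
        raise ZeroDivisionError("privileged group has no favorable outcomes")
    return selection_rate(unpriv) / priv_rate


# Hypothetical model predictions for two groups, for illustration only.
predictions = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
group_labels = ["a"] * 5 + ["b"] * 5

ratio = disparate_impact(predictions, group_labels,
                         unprivileged="b", privileged="a")
print(f"disparate impact: {ratio:.2f}")
if ratio < 0.8:
    print("below the 0.8 rule-of-thumb threshold; worth investigating")
```

A ratio near 1.0 means both groups receive favorable outcomes at similar rates. Mitigation techniques such as reweighing aim to move the ratio closer to 1.0 before or during training.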

User Behavior Analytics with BigQuery ML

This series explores how to analyze user behavior and build predictive models using SQL and BigQuery ML. Designed for data analysts, marketers, product teams, and business intelligence professionals, the series focuses on practical approaches to transforming large-scale behavioral data into actionable business insights.

Using BigQuery’s scalable analytics environment, readers will learn how to analyze customer interactions, measure engagement, and develop machine learning models directly within SQL workflows.

What You’ll Learn

The series covers a wide range of user behavior analytics and predictive modeling techniques commonly used in digital products and e-commerce platforms.

Topics include:

Analyzing User Behavior on an E-commerce Site
Deep Dive into User Engagement Analysis
Sales Prediction
- Logistic Regression
- Random Forest
- XGBoost
- Deep Neural Network (DNN)
Revenue Prediction
- Linear Regression
- Ridge Regression
- Lasso Regression
- Random Forest
Identifying High-Value Customers
- Logistic Regression
- K-Means Clustering
- Random Forest
Customer Segmentation
- K-Means Clustering
- PCA + K-Means Clustering
Predicting User Conversion
- Logistic Regression Model
- Random Forest Model
- XGBoost Model
Churn Prediction
- Logistic Regression Model
- Random Forest Model
- XGBoost Model
Recommendation and Personalization
- Matrix Factorization Model
Optimizing Marketing Campaigns
- Logistic Regression
- Random Forest
- XGBoost
- Deep Neural Networks (DNN)

Each post combines practical SQL implementations, machine learning workflows, and real-world business use cases to demonstrate how behavioral analytics can support product, marketing, and strategic decision-making.

User Behavior Analytics with Python

This series explores how to analyze user behavior and build predictive models using Python.

The series combines practical Python implementations with real-world analytics use cases to demonstrate how machine learning can support product, marketing, and decision-making strategies.

Price Prediction with Python

This series explores practical machine learning techniques for predicting house prices using Python and the popular Kaggle dataset, House Prices – Advanced Regression Techniques.

Rather than focusing solely on prediction accuracy, this series emphasizes building models that are interpretable, explainable, and reliable in real-world settings. Through hands-on examples, readers will learn how to design end-to-end machine learning workflows while understanding the trade-offs between model performance, complexity, and transparency.

What You’ll Learn

The series walks through the full machine learning pipeline, from raw data preparation to advanced ensemble modeling and model evaluation.

Each post includes practical Python implementations, detailed explanations of core concepts, and links to full GitHub repositories to support hands-on learning and experimentation.
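
For readers new to the dataset, the sketch below shows one common way to evaluate house price predictions: root mean squared error on the logarithm of prices, which is, to my understanding, how the Kaggle competition scores submissions. The function name `rmse_log` and the example prices are illustrative assumptions, not code from the series.

```python
import math


def rmse_log(actual, predicted):
    """Root mean squared error between log(predicted) and log(actual)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [
        (math.log(p) - math.log(a)) ** 2
        for a, p in zip(actual, predicted)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical sale prices and model predictions, for illustration only.
actual_prices = [120_000.0, 250_000.0, 480_000.0]
predicted_prices = [132_000.0, 240_000.0, 500_000.0]

print(f"RMSE (log prices): {rmse_log(actual_prices, predicted_prices):.4f}")
```

Working on the log scale means the metric depends on relative rather than absolute error, so overestimating a cheap house by 10% costs the same as overestimating an expensive one by 10%.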

Let’s Connect

I’m always interested in discussions and collaborations related to AI, behavioral science, machine learning, digital safety, and decision systems.

Whether you’re exploring new ideas, building AI-driven products, conducting research, or interested in potential collaborations, feel free to reach out.

Contact