
Explain the challenges in training LLMs. How will you set up your team to achieve training objectives given the challenges you listed, on time and within budget?

Asked at Google
340 views
Answers (1)

Gold PM
  1. Clarifying the question and stating assumptions
    • Does the model need to address a specific domain? Any domain.
    • Does it need to address a specific stage of the training process, or any or all stages? Your choice.
    • Is the objective a preview audience or GA readiness? It could be an MVP or private preview, but with a quick GA timeline.
  2. Describing the product
    • An in-app development experience that converts Apps Script code to TypeScript, easing the transition for business users who want to reuse existing Apps Script code on other platforms and applications
  3. Listing out the LLM training attributes
    • Data collection and preprocessing
      • Sourcing data from different locations, including code repos available in the public domain, curated databases, personal repos, websites, and books
      • Segmenting the data into smaller units (tokens) that form the basis of the model vocabulary
    • Model configuration
      • Defining the number of layers, transformer blocks, attention heads, and other parameters that define the model
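The model configuration step above can be sketched as a small config object. This is a minimal illustration, not the team's actual setup: the class name `TransformerConfig`, every default value, and the parameter-count approximation (which ignores biases, layer norms, and position embeddings) are assumptions added here to show how these choices feed into a rough size and budget estimate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransformerConfig:
    """Hypothetical hyperparameters; none of these values come from the answer."""

    vocab_size: int = 32_000
    d_model: int = 512
    n_layers: int = 8
    n_heads: int = 8
    ffn_multiplier: int = 4  # feed-forward width as a multiple of d_model

    def __post_init__(self) -> None:
        # Each attention head gets an equal slice of the model dimension.
        if self.d_model % self.n_heads != 0:
            raise ValueError("d_model must be divisible by n_heads")

    @property
    def head_dim(self) -> int:
        return self.d_model // self.n_heads

    def approx_params(self) -> int:
        """Rough weight count, ignoring biases, layer norms and position embeddings."""
        attention = 4 * self.d_model ** 2  # Q, K, V and output projections
        feed_forward = 2 * self.ffn_multiplier * self.d_model ** 2  # up and down projections
        embeddings = self.vocab_size * self.d_model
        return self.n_layers * (attention + feed_forward) + embeddings


config = TransformerConfig()
print(config.head_dim, f"{config.approx_params():,}")
```

Keeping the estimate next to the hyperparameters lets the team compare candidate model sizes against the compute budget before committing to a training run.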
  4. Listing out the LLM challenges for the training attributes
    • Data generation and validation (data collection): Collecting data from various sources, cleaning it, and organizing it into training, validation, and test sets
    • Optimizing reasoning capabilities: While an LLM can process information and reason with near-human logic by recognizing patterns, combining multiple facts to generate TypeScript code that reproduces the behavior of the original Apps Script business logic requires additional reasoning capability.
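One way to organize collected samples into training, validation, and test sets is a deterministic, hash-based split, sketched below. The function name `assign_split`, the 80/10/10 proportions, and the example file IDs are assumptions for illustration; the answer does not say how the split was done.

```python
import hashlib


def assign_split(sample_id: str, val_pct: int = 10, test_pct: int = 10) -> str:
    """Map a sample to 'train', 'validation' or 'test' from a hash of its ID."""
    digest = hashlib.sha256(sample_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in [0, 100)
    if bucket < test_pct:
        return "test"
    if bucket < test_pct + val_pct:
        return "validation"
    return "train"


# Hypothetical sample IDs standing in for collected Apps Script files.
sample_ids = [f"repo-{i}/Code.gs" for i in range(1000)]
splits = {}
for sample_id in sample_ids:
    splits.setdefault(assign_split(sample_id), []).append(sample_id)

print({name: len(ids) for name, ids in splits.items()})
```

Because the assignment depends only on the ID, a sample stays in the same set when new data is added later, which helps keep test examples out of the training data.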
  5. Solution: addressing the challenges within time and budget
    • Data generation and validation:
      • Asked the team to create a back-translated subset for each dataset, so that result accuracy is close to 100%
      • Open-sourcing parts of the model to get expert opinions and data contributions
    • Optimization: We took a modular approach that combines knowledge, computation, and natural language expression, and applied RLHF (reinforcement learning from human feedback) to address this challenge
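The back-translation idea above can be sketched as a round-trip check: translate Apps Script to TypeScript, translate it back, and measure how much of the original survives. This is only an illustration under stated assumptions. The two translator functions are toy stand-ins for real models, and text similarity from `difflib` is a rough proxy that does not prove the two programs behave the same.

```python
from difflib import SequenceMatcher
from typing import Callable

Translator = Callable[[str], str]


def normalize(code: str) -> str:
    """Collapse whitespace so formatting differences do not count as errors."""
    return " ".join(code.split())


def round_trip_score(source: str, forward: Translator, backward: Translator) -> float:
    """Similarity in [0, 1] between the source and its back-translation."""
    restored = backward(forward(source))
    return SequenceMatcher(None, normalize(source), normalize(restored)).ratio()


# Toy stand-ins for the real translation models (hypothetical).
def to_typescript(apps_script: str) -> str:
    return apps_script.replace("var ", "let ")


def to_apps_script(typescript: str) -> str:
    return typescript.replace("let ", "var ")


snippet = "var total = 0;\nvar rows = sheet.getDataRange().getValues();"
score = round_trip_score(snippet, to_typescript, to_apps_script)
print(f"round-trip similarity: {score:.2f}")
```

Samples that score below a chosen threshold can be routed to human review, which keeps manual validation effort focused where the translation is least reliable.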
  6. Trade offs
    • We had to accept the biased assumption that TypeScript could be as effective as Apps Script for all business scenarios
    • Quality control metrics and benchmarks were based on existing working patterns and were not designed for the transpilation work this product was trying to address.