
Design a photo app for the blind.

Asked at Google
3.7k views
Answers (3)
Platinum PM

Clarification & Understanding

What is the greater context and our motivation for doing this? Is there some org-wide strategic initiative we need to align with, or a specific goal? Let's just focus on providing a really good experience to our users.

Is this a standalone offering or an extension of something Google currently offers? It can be either. What do we mean when we say app? Is there a specific type of app we should focus on: desktop, mobile, web? If it's up to us, let's approach this as platform-agnostic, though I could see us leaning towards a mobile app since a lot of photo capturing, viewing, and sharing happens on mobile devices.

 

User Segments

This problem is still pretty ambiguous so let's try to break it down further by looking at some of the different user segments this could apply to. It is worth noting that there are varying levels of blindness:

  1. Completely Blind - Can't see anything at all; effectively pitch black
  2. Partially Blind - Can still see shapes and colors
  3. Nearsighted - Functionally blind to objects far away, but may still be able to read text up close or with magnification
  4. Not Blind - While the core users we are designing for have some level of blindness, non-blind users, particularly friends or family, may still use the app to interact with those blind users.

Out of the above user segments, I'm going to suggest we focus on the completely blind. This is the most common form of blindness, and if our solution works for someone who is completely blind, it will also work for someone who is partially blind, whereas the reverse isn't necessarily true.

 

User Needs / Pain Points

  1. Blind people have no way to know whether they successfully took a clear photo of what they were trying to capture
  2. Blind people need to know what is in the photo they are looking at
  3. Blind people need to know if their photo has been shared successfully
  4. Blind people can't thumb or scroll through a photo album, so finding a specific photo is difficult

Out of the above pain points, I'm going to suggest we focus on pain point #2. This pain point in particular is really central to the experience of a photo app. If users of our photo app can't understand what the photo is of then we fundamentally aren't providing a good experience.

 

Solutions

Now that we have a better understanding of some of the issues the user faces, let's go ahead and brainstorm some solutions to help blind users understand what a photo is of:

  1. Description Prompt - When users share or send photos to a blind user of Google Photos, we could prompt them to type a short description of the image. When the visually impaired user receives the photo or views it later, the description would be read aloud.
  2. Audio Companion Files - When a user takes a photo, we could allow them to record an audio description of what they are taking a picture of, which would automatically be attached to the photo.
  3. AI Descriptions - We could use machine learning algorithms to automatically analyze the contents of a photo and generate a verbal description that would then be read aloud to the user.

 

Prioritization

I'm assuming we won't have the bandwidth necessary to build out all three solutions in parallel so let's go ahead and choose one to prioritize. Remember our goal here is to provide a really good experience to our users. To help us do this I'm going to use the comparison matrix below:

Solution                 | Ease of Implementation | User Satisfaction
1. Description Prompt    | A                      | B
2. Audio Companion Files | B                      | B-
3. AI Descriptions       | C+                     | A

I'm going to suggest we prioritize building out solution #3 first. While this is the hardest solution to implement, I think it is the one users will find most useful. The first two solutions are a little too niche in the sense that they apply to individual scenarios: either someone sending the blind user a photo, or the blind user taking a photo themselves. They both require someone to manually describe the contents of the photo, and that is going to be difficult to scale. The AI Descriptions would not be constrained by requiring users to do anything manually.

The AI Description of photos could of course be leveraged by any photo-specific application, but I think it can be useful to our users beyond that. If incorporated as a general utility into audio-based screen readers, visually impaired users would be able to understand the contents of photos they encounter elsewhere online, for example an attachment in an email or an image embedded in a news article they're reading.

While all of the above sounds great, let's not lose sight of the fact that this is going to be quite difficult to implement. A very rudimentary version could read aloud information like the capture time and location from the image's EXIF data, as well as detect standard objects present like "person smiling". Future versions of the model could be both more specific and tailored to the individual user. For example, instead of detecting "person smiling" it could detect "Your granddaughter Sophie smirking while eating Halloween candy". Facebook's auto-tagging capability is proof that this is doable, but then again they have a very large and rich data set of photos.
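To make that rudimentary version concrete, here is a minimal sketch, assuming Pillow for reading EXIF metadata; `label_image` and `speak` are hypothetical placeholders for whatever vision model and text-to-speech engine the product would actually use.

```python
# Minimal sketch: combine EXIF capture metadata with generic object labels
# into a single sentence a screen reader / TTS engine could speak.
from PIL import Image, ExifTags


def label_image(img: Image.Image) -> list[str]:
    """Placeholder for an image-labeling model (e.g. returns 'person smiling')."""
    return ["person smiling"]


def speak(text: str) -> None:
    """Placeholder for the hand-off to a text-to-speech engine / screen reader."""
    print(f"[TTS] {text}")


def describe_photo(path: str) -> str:
    img = Image.open(path)

    # Map numeric EXIF tag ids to readable names and pull out the capture time.
    named = {ExifTags.TAGS.get(tag_id, tag_id): value for tag_id, value in img.getexif().items()}
    captured = named.get("DateTime", "an unknown time")

    labels = label_image(img)
    return f"Photo taken {captured}. It appears to show: {', '.join(labels)}."


if __name__ == "__main__":
    speak(describe_photo("example.jpg"))
```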

 

Summary

In order to provide a really good product experience to users who are completely blind, we are going to build an AI-based application that will automatically analyze the contents of a photo and read aloud a description of its contents.

 

Metrics

Right off the top of my head, I think there are two points that are important to measure: 

1. Does our product accurately describe the contents of a photo?

2. Do our users find our descriptions of the photos useful?

 

Let's focus on picking a metric that accurately reflects our ability to do #2. I think #2 encapsulates #1 in the sense that if our product can't accurately describe the contents of photos, then our users aren't going to find the descriptions useful, so let's focus on question #2.

To assess whether or not our users find our descriptions of photos useful, we should focus on monitoring our number of daily active users. This is the type of application that is meant to be used day-to-day so focusing on daily active users makes more sense than monthly active users. If users aren't finding our product useful then they won't use it. 

One potential complication here is that if our AI Descriptions are leveraged by something like a general audio-based screen reader as a standard offering, then users could be de facto using our AI Descriptions even if they didn't find them useful. In that situation, offering a small subset of users the ability to skip the reading of our descriptions, and monitoring whether they do, could serve as a proxy for how useful they find the AI Descriptions.
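As a sketch of how that skip-rate proxy could be monitored, assuming an event log with one row per auto-description playback (the DataFrame and column names below are illustrative, not a real schema):

```python
# Skip-rate proxy: how often users cut off the AI description before it finishes.
import pandas as pd

events = pd.DataFrame(
    {
        "user_id": ["u1", "u1", "u2", "u3", "u3", "u3"],
        "skipped": [False, True, False, True, True, False],
    }
)

# Per-user skip rate.
per_user_skip_rate = events.groupby("user_id")["skipped"].mean()

# Share of users who skip more than half their descriptions -- a rough signal
# that the descriptions are not felt to be useful.
high_skippers = (per_user_skip_rate > 0.5).mean()

print(per_user_skip_rate)
print(f"Fraction of users skipping >50% of descriptions: {high_skippers:.0%}")
```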

Gold PM

Clarify

  • Google or another company? [Google]
  • Can you share insights driving the need for the product? [large and growing potential population of users that we are not serving today]
  • Platform? [up to me - choose mobile]
  • Geography - OK to focus on the US due to market size? [yes]
  • Any resource constraints? [no]
Goal
  • Potential: A, A, R, M - we know Google doesn't try to monetize these apps initially. Suggest focusing on driving engagement and retention initially. If the product is engaging, we can recruit additional users to the platform by partnering with marketing teams.
  • Focus - Engagement/retention
Users
  • Users are visually impaired - that implies other senses may be more finely tuned. Assume that verbal navigation is something they are comfortable with.
  • Need for photos - stronger desire, and an opportunity to enable deeper customer engagement, since they may be restricted from this today
Typical user journey for someone taking photos - will highlight pain points (PP) for the visually impaired
  • take a picture on your phone
  • store the photo
    • PP: so many photos, rarely end up going to retrieve them
      • Med
  • view the photo
    • PP: difficult to navigate a list of photos if you cannot see them
      • High
  • share the photo
    • PP: difficult to share if you cannot see them
      • High
  • create an album
    • PP: need to rely on others to create an album
      • High
Focus on viewing, sharing, and album creation as the key pain points.
 
Sound OK?
 
Solutions
  • assistant search - "find photos of grandpa" - "there are 100 pictures of grandpa I have found" (see the sketch after this list)
    • helps with the pain point of searching for a photo
  • assistant describe - "tell me about this photo" - date, who is in the photo, and location of the photo; sentiment analysis - happy/sad/etc.; Google Lens to identify details - in the back there is a palm tree, etc.
    • helps with the PP of viewing photos
  • living photo book - create a "memory" of grouped photos, such as of a person, and the assistant can collect photos and add them to a live album - Nia growing up - here is her first step, first word, first day of school, etc. This book can be played back and grow over time - similar to a playlist on Spotify, this is a curated and narrated group
  • smart photo book - enables a cut of a group of photos with audio details
  • (moonshot) - way to incorporate smell into photos - photo of a rose - to bring it to life in a new dimension?
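To illustrate the assistant search idea referenced above, here is a hedged sketch that assumes photos have already been tagged with people/object labels by an upstream recognition step; the catalog, labels, and assistant_search helper are made up for illustration.

```python
# Sketch of "assistant search": answer "find photos of grandpa" over a catalog
# of photos that already carry labels from face/object recognition.
from dataclasses import dataclass


@dataclass
class Photo:
    photo_id: str
    labels: set[str]  # people and objects detected in the photo (assumed upstream)


CATALOG = [
    Photo("p1", {"grandpa", "birthday cake"}),
    Photo("p2", {"beach", "sunset"}),
    Photo("p3", {"grandpa", "nia"}),
]


def assistant_search(query_label: str, catalog: list[Photo]) -> str:
    """Return a spoken-style answer for a label query, as the assistant would."""
    matches = [p for p in catalog if query_label.lower() in p.labels]
    return f"I have found {len(matches)} photos of {query_label}."


print(assistant_search("grandpa", CATALOG))  # -> "I have found 2 photos of grandpa."
```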
Rating
  • Assistant search
    • Impact - H
    • Effort - L
    • Overall - Must have
  • Assistant describe
    • Impact - H
    • Effort - H
    • Overall - Must have
  • Living photo book
    • Impact - M
    • Effort - H
    • Overall - Nice to have
  • Smart photo book
    • Impact - L
    • Effort - L
    • Overall - Nice to have
  • Addition of Smell dimension 
    • Impact - H
    • Effort - H
    • Overall - Should have (nice to have, but would highly differentiate the product and delight users)
Prioritized features - assistant search, assistant describe, and the addition of smell (likely a research area to start) due to the combination of impact and effort
 
Metrics
  • Primary - Usage - DAU/WAU/MAU (a rough calculation is sketched after this list)
  • Secondary - # of pictures stored, engagement with new assistant features, # of downloads, interval between logins, churn
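A rough sketch of how the primary usage metrics could be computed from an activity log; the activity DataFrame, column names, and window sizes below are assumptions rather than the app's real schema.

```python
# DAU/WAU/MAU from a simple activity log, plus a DAU/MAU "stickiness" ratio.
import pandas as pd

activity = pd.DataFrame(
    {
        "user_id": ["u1", "u2", "u1", "u3", "u2"],
        "timestamp": pd.to_datetime(
            ["2024-05-01", "2024-05-01", "2024-05-03", "2024-05-10", "2024-05-20"]
        ),
    }
)

as_of = pd.Timestamp("2024-05-20")


def active_users(window_days: int) -> int:
    """Count distinct users active in the trailing window ending at `as_of`."""
    recent = activity[activity["timestamp"] > as_of - pd.Timedelta(days=window_days)]
    return recent["user_id"].nunique()


dau, wau, mau = active_users(1), active_users(7), active_users(30)
print(f"DAU={dau}, WAU={wau}, MAU={mau}, stickiness={dau / mau:.0%}")
```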
Summary
- Designed a photo app for the visually impaired, described the user journey, rated pain points, and identified 5 potential solutions including a moonshot. Prioritized top features and identified key success metrics.
 
 

Clarifying questions: 1. What is the photo app used for? Is it for a specific use case or in general?

2. Is the photo app specifically for blind people, or can it be used by anyone?

3. Are there any technical constraints to be taken into consideration?

4. Would also want to understand whether by blind we mean both full blindness and low vision?

5. Is there a specific language the app should be in, or can it adapt to multiple languages?

6. Is there a specific timeline that we are looking at?

7. Is the photo app for static images or dynamic images / videos?

So now the question becomes: design a photo app that helps blind people go about their day-to-day activities, has no particular constraints, ships a first MVP in English within 6 months, and can be used by all (fully blind, low vision, and sighted people) via both static and dynamic images.

Personas:

The personas are already defined: fully blind people and people with low vision.

User Journey and the associated pain points:

User Journey | Pain Points
Visually impaired person gets up | Understanding of the time
Completes morning activities and would like to read | Finds the braille section and reads
Gets ready and would want to start walking | Needs regular identification of the location and GPS
Reaches office, completes work | There could have been an obstacle on the way
While returning from office, feels the cool breeze | No way to send the same experience to friends, or query them
Reaches home, in need of some entertainment | Finding books or videos, but not in a mood to read

Solutions : The solutions can be categorized under different buckets

  • Community : Click a "static" photo and send it to friends (VIP and sighted) to share the experience, have any doubts clarified, and/or ask for help from volunteers
  • Travel & GPS : A "static" or dynamic photo app that identifies the current location, lets the user know the distance and time to reach the destination
  • Experience & entertainment : A photo app or a camera app that
    • Reads normal text and braille aloud 
    • Movie dialogues could be read aloud or described
    • Headphones to give the immersive experience
    • Recommendations
  • Obstacle finder : The photo app senses obstacles and sends vibration or haptic feedback; this also aids the immersive experience.
  • General identifier : A photo app that can identify anything being pointed at, reads the name aloud, and describes the object as needed. It could identify currency, a blanket in front of the user, or a vase on the table.
  • Low vision :
    • People with low vision could be assisted by a magnifier lens (to enlarge text)
    • People with low vision for colors could be assisted by a color sharpener (a rough sketch follows this list)
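As a rough sketch of the low-vision assists in the last bullets, here is what a simple contrast and color boost could look like with Pillow; the enhancement factors are illustrative, not tuned values.

```python
# Boost contrast and color saturation so text edges and color boundaries stand
# out more for low-vision users.
from PIL import Image, ImageEnhance


def enhance_for_low_vision(path: str, contrast: float = 1.8, color: float = 1.5) -> Image.Image:
    img = Image.open(path)
    img = ImageEnhance.Contrast(img).enhance(contrast)  # sharpen light/dark edges
    img = ImageEnhance.Color(img).enhance(color)        # make colors more distinct
    return img


if __name__ == "__main__":
    enhance_for_low_vision("photo.jpg").show()
```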
Since the time to build is 6 months, I would prefer the solutions that are easy to build and have the highest impact on the community. For the short-term MVP, I would pick Travel & GPS and Community. For long-term feature development, I would look at the General identifier, which could cover almost 80% of use cases.
 
Success metrics : 
# of app downloads 
# of sign ups 
Average # of queries
Volunteer ratings
NPS 
# recommendations 
 
Caveats: A photo app could be expanded to cover almost anything by pointing the camera at it; however, that blurs the line between taking a photo and understanding it further.
 
In summary, my photo app for the blind, to be released in 6 months, would begin with features for community, social networking, and mobility, thereby reducing the gap between blind people and the sighted and helping them enjoy the world.
 
 

 

 

 

1 Feedback
Gold PM

I think it's well structured and easy to read. The part that might need to be revisited is the user journey; I couldn't find any element of the journey intersecting with the need for a camera or a photo app. One suggestion would be to start with the user journey of someone taking a picture or looking at a photo, and interweave the pain points that arise due to the lack of vision.
