You're the CTO of a company. Design a program that provides unique IDs upon requests from a client. This program will be used by Facebook and Google and needs to scale.
1. First, clarify why such a program is needed. Is it to map a block of data, a user, or a URL to a unique ID? Is there an expiration date for the ID, and if so, can it be set manually? Say yes. Should the ID always be random, or can it be user provided? Say no.
Let's assume a user enters some data and the system returns a unique ID. Since this needs to be scalable, the system must account for the amount of data stored and the time taken to return the generated value.
2. Requirements:
- Functional:
  - Need to return a unique ID
  - Need to be able to set the expiration date
  - Need to be able to map the input data to the ID when needed
- Non-functional:
  - The unique ID and its mapping must always be available
  - Minimal latency in returning the unique ID
- Extended:
  - Ability to run analytics on how many times and where the unique ID is used
- Input : api_dev, "data", user_name, expiry_date
- Output : api_dev, unique_id
- Range-based partitioning: assign each partition a key range, say Aa-Jj, with the next partitions defined in a similar fashion.
- Hash-based partitioning: create a hash of the text entered and store the record with the unique ID as the key.
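The two partitioning options can be sketched as follows. This is a minimal illustration, not a definitive design: the shard boundaries beyond Aa-Jj, the shard count, the use of SHA-256, and the function names `range_partition` and `hash_partition` are all assumptions made for the example.

```python
import bisect
import hashlib

# Hypothetical range boundaries: each entry is the first prefix of a shard.
# Shard 0 covers Aa-Jj, shard 1 covers Jk-Rz, shard 2 covers Sa onward.
RANGE_STARTS = ["Aa", "Jk", "Sa"]


def range_partition(key: str) -> int:
    """Return the shard whose range contains the key's two-character prefix."""
    index = bisect.bisect_right(RANGE_STARTS, key[:2]) - 1
    return max(index, 0)  # keys sorting before "Aa" fall into the first shard


def hash_partition(text: str, num_shards: int = 4) -> int:
    """Return a shard chosen from a stable hash of the input text."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Range partitioning keeps neighbouring keys together but can create hot shards if keys cluster; hash partitioning spreads load more evenly at the cost of losing key ordering.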
Clarifying questions:
1. Are we creating this program so that big tech companies like FB/Google can outsource all login/credential logic and access it plug-and-play via APIs? Yes
2. Are there regional issues, such as login failures, that explain why we are tasked with this rather than them handling it within their own architecture? You can make that assumption.
3. For applications like these, which are accessed by billions of users, high availability, high scalability, and low latency seem like the most important non-functional requirements. Does that sound right? Yes
Defining Functional Reqs & Non Functional Reqs:
FR:
- Ability to generate unique user IDs
- Need to be able to map data quickly in case the user needs to retrieve a forgotten user ID
NFR:
- High Availability
- Low Latency
- Ability to run analytics (to understand the volume of users, peak times, to make decisions on scalability)
Customer Journey Flow & Architecture Flow (I would draw this system design on whiteboard or Google docs if I can)
1 | The user enters a unique ID on any client | (Mobile, desktop) |
2 | The load balancer distributes requests across the available servers based on its configured data structure | LB Layer |
3 | The app server, which handles requests like getID, checks the cache (we can use Redis here) to see whether this ID has already been created, so it can throw an error back to the user and protect the DB from being queried heavily | App Server |
4 | If not, the key generation service, which handles CreateID functions, is invoked and uses hash key logic to create a unique ID | KeyGen API |
5 | The record is stored in the database. We will use Cassandra, a wide-column NoSQL store that suits a finite set of queries and ever-increasing data (for apps like FB/Google) | Database Layer |
6 | We will add a Redis cache to keep track of already generated user IDs, maintaining uniqueness and protecting the DB from repeated queries | Caching Layer |
7 | We can also have an email service API and an analytics service API. The email service will notify users on success/failure of ID creation. The analytics service will push data via Kafka to the Hadoop cluster to perform analytics as needed. | Analytics Tracking Layer |
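Steps 3 to 6 of the flow can be sketched in a few lines. This is a toy model under stated assumptions: plain dictionaries stand in for Redis and Cassandra, a counter stands in for the key generation service, and the class and method names (`IdService`, `create_id`, `get_id`) are hypothetical.

```python
import itertools


class IdService:
    """Toy model of the flow: cache check, then key generation, then storage."""

    def __init__(self):
        self.cache = {}     # stand-in for the Redis cache
        self.database = {}  # stand-in for the Cassandra table
        self._counter = itertools.count(1)  # stand-in for the KeyGen service

    def create_id(self, requested_name: str) -> str:
        # Step 3: reject names that already exist, checking the cache first.
        if requested_name in self.cache or requested_name in self.database:
            raise ValueError(f"{requested_name!r} already exists")
        # Step 4: generate a new key (a zero-padded counter in this sketch).
        unique_id = f"{next(self._counter):07d}"
        # Steps 5 and 6: persist the mapping, then record it in the cache.
        self.database[requested_name] = unique_id
        self.cache[requested_name] = unique_id
        return unique_id

    def get_id(self, requested_name: str):
        # Serve from the cache first; fall back to the database on a miss.
        if requested_name in self.cache:
            return self.cache[requested_name]
        unique_id = self.database.get(requested_name)
        if unique_id is not None:
            self.cache[requested_name] = unique_id
        return unique_id
```

A real deployment would also need an atomic check-and-insert so that two concurrent requests for the same name cannot both succeed; the sketch ignores concurrency.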
Hashkey Generator Logic:
We can assign a 7-character unique hashkey code to make sure uniqueness is maintained and the scheme is scalable:
We can pick a 7-character code that increments each time an ID is created and stored. Assuming a base-62 alphabet (a-z, A-Z, 0-9), this enables up to 62^7, or about 3.5 trillion, unique IDs.
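One way to realize an incrementing 7-character code is to encode a counter in base 62. The sketch below assumes a base-62 alphabet (digits, lowercase, uppercase), which is not stated in the answer but is consistent with the roughly 3.5 trillion figure, since 62^7 = 3,521,614,606,208. The function name `encode_base62` is hypothetical.

```python
import string

# Assumed alphabet: 10 digits + 26 lowercase + 26 uppercase = 62 symbols.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
CODE_LENGTH = 7
CAPACITY = len(ALPHABET) ** CODE_LENGTH  # 62**7 distinct codes


def encode_base62(counter: int, length: int = CODE_LENGTH) -> str:
    """Encode a non-negative counter as a fixed-length base-62 string."""
    if not 0 <= counter < len(ALPHABET) ** length:
        raise ValueError("counter out of range for the code length")
    chars = []
    for _ in range(length):
        counter, remainder = divmod(counter, len(ALPHABET))
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))  # most significant symbol first
```

Because every counter value maps to exactly one code, uniqueness reduces to handing out each counter value once, for example by giving each key generation server its own counter range.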
Our Database will have tables like (sample)
id | uuid (hashkey) | expiry date |
query122 | abcdef1 | 10/10/2033 & timestamp |
name123 | abcdef2 | 1/12/2032 & timestamp |
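A row of the sample table could be modelled as a small record type. This is only a sketch: the class name `IdRecord`, the field names, and the use of a timezone-aware timestamp for the expiry are assumptions, and the example row uses the first sample entry.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class IdRecord:
    id: str               # value from the "id" column, e.g. "query122"
    hashkey: str          # generated 7-character code
    expires_at: datetime  # expiry date plus timestamp

    def is_expired(self, now: datetime) -> bool:
        """Return True once the expiry moment has been reached."""
        return now >= self.expires_at


# First sample row: query122 | abcdef1 | 10/10/2033
record = IdRecord("query122", "abcdef1", datetime(2033, 10, 10, tzinfo=timezone.utc))
```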
Scalability:
How many unique ID requests can we get per second? (The assumption is that we keep each ID for 10 years.)
Making an assumption that we have around 400 million users a day:
X = 400M / (60 x 60 x 24)
X ≈ 4,600 requests per second
Storage of these users for 10 years = 400M x 365 x 10 x (storage unit of one unique ID = 100 bytes)
= 400M x 365 x 10 x 100 bytes
= 1.46 x 10^14 bytes, so about 146 TB (0.146 PB) would be needed for us to run this model.
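The back-of-the-envelope numbers can be rechecked with a short script. The inputs are the assumptions stated above (400 million requests a day, 10 years of retention, 100 bytes per ID); the use of decimal units (1 TB = 10^12 bytes) and the variable names are choices made for this sketch.

```python
SECONDS_PER_DAY = 60 * 60 * 24
DAILY_REQUESTS = 400_000_000  # assumed: one ID request per daily user
RETENTION_YEARS = 10
BYTES_PER_ID = 100            # assumed storage per unique ID record

requests_per_second = DAILY_REQUESTS / SECONDS_PER_DAY
total_ids = DAILY_REQUESTS * 365 * RETENTION_YEARS
total_bytes = total_ids * BYTES_PER_ID
total_terabytes = total_bytes / 10**12  # decimal terabytes

print(f"{requests_per_second:,.0f} requests per second")
print(f"{total_ids:,} IDs over {RETENTION_YEARS} years")
print(f"{total_terabytes:,.0f} TB of storage")
```

The total ID count over 10 years also gives a quick check that the 7-character code space of roughly 3.5 trillion IDs is large enough for this load.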