Your new feature boosts Amazon Search by 10%, adds 2s to load time. What do you do?

Asked at Amazon over a year ago

36.1k views

Problem Solving category

Liked by PM Exercises

PM Team

Platinum PM

Jan 2, 2020

Start answering this Amazon problem solving question by asking clarifying questions to fully understand the data.

Was this feature launched globally? Ans: Yes
Is this performance data for the global launch? Ans: Yes
Is the 10% increase per user, or in the overall number of searches? Ans: Overall number of searches
Is this 2s increase impacting any consumer/business metric? Ans: Yes, search exits have increased and people are leaving the site more often
In terms of percentage increase, what does 2 seconds represent? Ans: Almost 100%
Is this data for the app, the website, or both? Ans: It is for the website

Understand the goal of the feature

The goal of the feature was to help people get to the product they want faster than before and hence increase the average order size per customer.

Let's look at the user funnel

User lands on the site --x%--> Performs search --y%--> Lands on SRP (search results page) --z%--> Clicks on a result

So, as per my understanding:

x has increased by 10% (which is good)

y has decreased because of degradation in performance

z has increased because of improvement in search relevance

So we will focus on figuring out and fixing why 'y' has decreased.

Analysis

If x*y*z has decreased, meaning a smaller percentage of people are clicking on search results than before, then I would roll back the feature, since fewer people are progressing through the funnel. The assumption here is that nothing else is impacted by the rollback.
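
The roll-back rule can be sketched in a few lines. This is a minimal illustration with hypothetical funnel rates (the question gives no actual values for x, y, or z); `funnel_throughput` and `should_roll_back` are illustrative names, not real tooling.

```python
def funnel_throughput(x: float, y: float, z: float) -> float:
    """Share of site visitors who end up clicking a search result (x * y * z)."""
    return x * y * z


def should_roll_back(before: float, after: float) -> bool:
    """Roll back only if fewer visitors reach a result click than before."""
    return after < before


# Hypothetical rates, for illustration only (not taken from the question).
before = funnel_throughput(x=0.50, y=0.95, z=0.40)
after = funnel_throughput(x=0.55, y=0.85, z=0.44)

print(f"before={before:.4f} after={after:.4f} "
      f"roll_back={should_roll_back(before, after)}")
```

With these assumed numbers the drop in y is outweighed by the gains in x and z, so the rule says to keep the feature and investigate the latency instead.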

But to continue the analysis, I am assuming that more people are progressing through the funnel, since x and z have increased.

Let's focus on why performance has degraded, leading to a decrease in 'y'.

First, let's develop some hypotheses, and then we will gather more data points to analyze the issue.

Search Time = Time for the request to reach back end search server + Server time to perform search + Time for response to reach client + Time required by the client to display the SRP

Each of the above is measurable with tools or internal logging, so I would measure each of them and compare it with the previous implementation.
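
As a rough sketch of that comparison, the snippet below breaks search time into the four components and reports which one regressed the most. All timings are hypothetical, chosen only so that the total goes from 2s to 4s (a 2s, roughly 100% increase); the class and function names are assumptions for illustration.

```python
from dataclasses import dataclass

STAGES = ("request_ms", "server_ms", "response_ms", "render_ms")


@dataclass
class SearchTiming:
    request_ms: float   # time for the request to reach the back end
    server_ms: float    # server time to perform the search
    response_ms: float  # time for the response to reach the client
    render_ms: float    # time for the client to display the SRP

    def total(self) -> float:
        return sum(getattr(self, stage) for stage in STAGES)


def regression_by_stage(old: SearchTiming, new: SearchTiming) -> dict:
    """Per-stage increase in milliseconds (positive means slower)."""
    return {stage: getattr(new, stage) - getattr(old, stage) for stage in STAGES}


def worst_stage(old: SearchTiming, new: SearchTiming) -> str:
    deltas = regression_by_stage(old, new)
    return max(deltas, key=deltas.get)


# Hypothetical measurements, not real Amazon data.
old = SearchTiming(request_ms=100, server_ms=1200, response_ms=200, render_ms=500)
new = SearchTiming(request_ms=100, server_ms=2900, response_ms=300, render_ms=700)

print(regression_by_stage(old, new), "worst:", worst_stage(old, new))
```

The stage with the largest delta is where the hypotheses that follow should be tested first.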

Time for the request to reach back end search server - After comparison, assume we find that this time has increased, i.e. the metric has degraded. Possible reasons are:
1. The deployment strategy of the new feature was changed, which increased the time. Action item: Evaluate if we can use the same deployment strategy as before. Deployment strategy here means the underlying hardware, infrastructure, routes (both internal and external), etc. for the search service.
2. Check if search queries from a region where performance is not up to the mark have increased, skewing the average time upward. Action item: Evaluate if this change in search queries is permanent and, if yes, plan a deployment catering to the region.
3. Check if the performance of a certain region has degraded significantly. Action item: Plan a deployment catering to the region.
4. Check if underlying infrastructure, such as an undersea cable, has failed, forcing requests onto alternate routes that take longer. The probability of this is low, since the impact appeared after we made our changes, which points to something internal. I wouldn't spend too much time on this analysis.
Server time to perform a search - This is the time required by the backend to actually perform the search
1. Check if the number of items to be searched increased significantly. (Assumption: No)
2. Check if the throughput of the new search algorithm has decreased. Action item: If yes, we will evaluate optimizations in the algorithm as well as adding more hardware, if needed, to improve throughput.
Time for response to reach client - Same analysis as for the time for the request to reach the back end search server.
Time required by the client to display the SRP
1. Has the UI of the SRP changed? Action item: If yes, evaluate the front-end page size and other front-end parameters like page load time, and optimize the front-end code. The Chrome Audit tool can be used to evaluate this. If the UI has not changed, then this analysis is not needed, but we need to make sure that we are returning the same number of results as before; otherwise, reduce the number of search results and implement pagination if not done already.

52 likes | 1 feedback

1 Feedback

Tina Patel

PM

Mar 30, 2020

1

Norge Hund · Answer 1 · 2020-04-28T01:39:47+0000

First, I’d like to clarify what this data reflects. Is this overall data for the platform or is it specific to a certain channel / geography / browser / device / etc.?

Assuming it is generally reflective of the overall platform, next I’d like to clarify what the goal was in releasing the new item search feature. If we only cared about increasing the number of searches, then the analysis could stop here. But if I were the PM for this offering, the metrics I would care more about are average order size and search conversion rate. Increasing the number of searches is a means to achieve these revenue-related goals.

Next, I’d move onto a hypothesis regarding the slower load time. My initial reaction is one of concern, as I believe this degraded performance could cause users to grow frustrated and leave the site or close out of the search. Let’s think about this in the context of the user journey:

A) User lands on homepage

B) User performs search

C) User lands on search results

D) User navigates to product page

E) User adds to cart

F) User purchases item

From my prompt we know we’ve increased B, but we also potentially decreased C due to degraded performance. Let’s check some metrics to see if the change has impacted C, D, E, and F by looking at:

% search results exited before vs. after the new feature.

% search product pages viewed before vs after the new feature.

% search results resulting in add to cart before vs. after the new feature.

% search results resulting in purchase before vs. after the new feature.

If we notice declines in these metrics, then we know we’ve got a problem with our new feature, as it is negatively impacting users. We can begin to diagnose the cause of the degraded performance with our engineers.
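
A small sketch of this before/after check follows. The metric values are hypothetical placeholders, and each metric records whether a higher value is better, since a rising exit rate is bad while a rising purchase rate is good.

```python
# Hypothetical before/after values; not real data from the prompt.
METRICS = {
    "search_results_exit_rate": {"before": 0.20, "after": 0.27, "higher_is_better": False},
    "product_page_view_rate": {"before": 0.45, "after": 0.41, "higher_is_better": True},
    "add_to_cart_rate": {"before": 0.12, "after": 0.12, "higher_is_better": True},
    "purchase_rate": {"before": 0.05, "after": 0.045, "higher_is_better": True},
}


def worsened(metrics: dict) -> list:
    """Return the names of metrics that moved in the wrong direction."""
    flagged = []
    for name, m in metrics.items():
        delta = m["after"] - m["before"]
        got_worse = delta < 0 if m["higher_is_better"] else delta > 0
        if got_worse:
            flagged.append(name)
    return flagged


print(worsened(METRICS))
```

Any metric returned by `worsened` is a candidate for deeper diagnosis with engineering.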

stashbag · Answer 2 · 2021-01-20T00:47:46+0000

Approach

Understand the feature, its goals, the changes to the feature and their impact
Understand tradeoffs, i.e. is the change net positive to the goals? Determine whether the benefit of increased searches outweighs the 2-second increase in search load time by evaluating parameters such as customer satisfaction, order value increase, order frequency changes, and average checkout times, among others.
Based on the data and our business goals determine the right course of action

Clarification

Is the increase in searches a good thing or a bad thing, i.e. are customers searching more because they are searching for new products, or because they need multiple queries for the same product?
Which product did this impact (web or mobile)? -> Across all platforms

Product Description and Goals

Amazon search is the first point of interaction for most customers in the Amazon app when they are looking for things (known items or browsing). It is key in helping users find the right product in the shortest amount of time based on the search query. To that end, the product goals would be:

User Goals

Provide accurate and relevant search results across the Amazon catalog
Minimize the effort customers need to make to find the right product (number of search queries, time spent searching, modification to search query)

Business goals

Make Amazon the de facto leader in product search, i.e. customers originate product searches on Amazon and not Google
Grow marketplace transactions

Tradeoffs

I would start by looking into the impact of the feature on the following metrics pre/post the feature launch so that I can get a sense of the impact.

For that I would collect the following metrics:

Number of users joining Amazon -> Did searches increase because of new users or because of changes to the product? (If the increase is due to new users, that is bad: we are seeing slower load times without any material impact on search queries from existing users.)
Number of searches per user for each product -> Has the number of searches per user gone up? (More searches are good if accompanied by more products)
Search to cart conversion rate -> What is the change in add to carts? (More items added to cart are good)
Search to checkout conversion rate -> What is the change in single-click checkouts? (More checkouts are good, fewer checkouts are bad)
Avg number of orders per customer -> Has the frequency of orders being placed gone up / down? (An increase in orders is good)
Number of sessions terminating in search -> Are users not taking any actions post search, i.e. abandoning their journey? (This is bad, as users are unable to find what they are looking for)
Customer reviews of the Amazon app -> See how the Amazon app reviews have changed post feature. (A drop in ratings might mean the feature has a negative impact)

Based on which direction the above metrics are trending pre and post the search enhancement, we can decide whether to persist with the feature, i.e. do nothing, or whether to address the page load times.

Summary

If the increase in search queries is leading to more marketplace transactions and more frequent product searches and checkouts, the feature is a net positive and we should not do anything. If the additional search queries are not improving the metrics that point to more transactions and a better customer experience, we should work on addressing the slower load times.

Swanidhi Singh · Answer 3 · 2020-09-05T13:34:32+0000

Clarifying questions -

Increase in searches may not imply that users are finding it relevant and useful - it could be the opposite. What was the goal of the feature that was implemented? Was it to reduce search time or increase the total number of orders? - Total number of orders
Has the CTR for search results been affected? No
Have there been any marketing campaigns recently that could add to increased search? No
Was the feature rolled out across both Web/App platforms? Yes
Was the feature rolled out globally or across specific geographies? Globally
Has the increase in searches considerably increased in specific countries? No.

Does the 2-sec increase in page load negatively impact business goals? Yes, user drop rate has increased and total orders have decreased.
Is the drop rate consistent across both the platforms? Yes
Is there any specific demographic for which the drop rates have increased? No

Let's consider the user journey -

A% - User Lands on the Search Page
B% - Performs the Search
C% - Clicks the search result
D% - Adds the item to cart and places the order

Business goal is to maximize A% x B% x C% x D%, increasing revenue for the company. For the users who wait for the search to complete, CTR for search results has not changed, so the 10% increase in search (B%) seems to be a positive indicator.

Hypothesis - Resurrecting the dropped users by minimizing search time will lead to more conversions.

Search time is the amount of time that the user waits after performing search action before the results are displayed on the screen.

Typical request journey => 1. Request from client to server => 2. Server to process the request => 3. Response from server to client => 4. Response is parsed on the FE.

To identify the specific issue - I'd want to work with the engineering team to measure how performance across each of these steps has changed before and after the feature was released.
1. Request from client to server
I would validate if the request object size is too high.
Is there a new layer that has been introduced between frontend and backend? E.g. a middleware or third-party API service
Is the issue specific to a browser or platform? Is the code standards-compliant and optimal?

2. Server to process the request
Is the algorithm complexity acceptable?
More third party integrations? Can those be optimised?

3. Response from server to client (Same as 1)

4. Response is parsed on the FE
Are the loop iterations optimal?
Is FE waiting for a large response? Can it be lazy-loaded?
Is there room to optimise how information is delivered to the user?

ashishkatariya2 · Answer 4 · 2020-01-15T12:43:35+0000

* First of all, this should have been found while testing this functionality.
* Second, is my customer being impacted by this delayed response? I will check this by looking at the customer data.
* I mean, 2 seconds is not too much time (you just blink your eye and 2 seconds are over), so this can be ignored.
* If my new feature is not affecting any customer, then there is no need to do anything.
* There might also be some connection issue with the DB, as there might be multiple hits on this feature on the server at the same time, so the page will take time to respond.
* I will optimise my server load here and will also put a load balancer in place to handle the requests.

Kaushik · Answer 5 · 2021-03-01T16:28:48+0000

Clarifying Questions

When we are talking about the search functionality, this is the main search of the Amazon website, right?
Before we deep dive, I want to clarify what we mean by search - user enters a query, hits search and gets the result. A search is supposed to be completed when the results are retrieved, is my understanding correct? - Yes
There could be multiple page load times across the entire Amazon website. Are we talking about an increase in the general page load time when someone opens the website amazon.com, or are we talking about the page load time when a user queries something? - This is the page load time after the user queries something
To narrow this further, I want to understand a few more specifics about the search feature.
1. Over what time has this change been noticed - sudden, week on week, month on month - month on month
2. In which geography has this change been noticed, since the teams at amazon operate pretty independently, it could be possible that these changes were rolled out only in a specific geography - India
3. Were these changes noticed on any specific platform or device - Web, mobile web, ios, android app, alexa etc. - the changes have been noticed across all the platforms

Quickly summarizing the findings - Due to a recently introduced change, month-on-month searches in India increased by 10% and page load times post search increased by 2 seconds.

Evaluation of the Situation:

The search experience is a crucial part of any user's journey on Amazon. It is fair to assume that most high-intent purchases start with search. A poor search experience can hurt the entire business and adversely affect the rest of the metrics. For example, poor searches could mean not being able to find products, fewer products added to cart, fewer checkouts, etc. So it is crucial to identify and fix this problem.

For an ideal experience for the user, the user should be able to find the products as quickly as possible and go through the results also quickly. A few quick questions:

Increase in Searches: When we say the searches increased by 10%, is it in proportion to the number of new users added, rather than users doing more searches than before making a purchase or adding a product to the cart? - Yes, this is normal. Then we can probably assume that the search itself is functioning properly and that there is no cause for concern about the searches themselves.
Page Load Times: The increase in the page load time is definitely a cause for concern and interferes with the ideal shopping experience for a user on the product. There could be multiple reasons why the page is taking longer to load. I will break down the search into various parts and try to diagnose possible issues. Is that okay?
1. Loaded page before the search - Sometimes some common search results are pre-cached to ensure the results are served quickly
  1. Are our server caches working properly, so that they can serve the results correctly? - Yes
2. Running the Query
  1. Is the query taking longer to run now than earlier?
3. Loading the page - Page loads generally take longer when more data is requested from the server. This could happen for various reasons.
  1. Have we changed the layout of the products and categories in the main search results page? - Of the products
  2. Are we loading more images for a product than before?
  3. Are we loading more products on the page in comparison to earlier?

Solutions:

Improve caches in multiple places to pre-load the results to the most common searches
Improve the indexing on the search to reduce the query load time
Reduce the results size
1. by scaling images
2. by progressively loading the images on desktop, with the top few results loading immediately
3. loading fewer products at first
4. scaling the images down to ensure overall results download is small

Implementation:

Once we have an understanding of which of the above issues is causing the page load times to increase, I will work with the engineering teams to improve the situation. Further, I will also track other metrics - such as searches, cart additions, checkouts etc. to see how it has improved the situation.

qwerty_abcd · Answer 6 · 2024-05-14T05:41:25+0000

Clarifying questions -

Boosts Amazon search? What is the meaning of this? - It increases searches by 10%, i.e. if 100 users used to perform a search in a day, after the feature release 110 users perform a search.
Basically, that means search has become more comprehensive and granular.
2s load time for the search page? - Yes, for the page that results after the user enters a search query.
It is for all search - not any particular category search. - Yes

User journey -

User opens the app

Searches for the item from search textbox

Search page loads

User browses the product and selects the product by going to product description page

User adds to cart

User goes to cart

User completes the payment and buys the product.

Let us understand the cause and effect first -

I will try to understand if the increase in 2s PLT is attributable to the change in search. A few questions related to that -

Have we checked health of our servers? - Health seems to be ok, no fluctuations observed off late
Are we sure all the servers are responsive? - Yes.
This release was done across all the demography and geography. - Yes
And the observation is not for any region or demography - Yes
Deviation is not platform specific - Yes
There is no other change that has gone live which may have impacted the PLT? - No

Ok. It looks like the causation is correct.

Now, first of all, I want to see the impact of this on our main goal, i.e. number of transactions.

Is there any decrease in the percentage of users who finally transact? - No

Is there any deviation from goal? Were we expecting more conversions after search enhancement? - Yes, our success criteria for search enhancement is % increase in conversion rate.

That means page load delay may be causing user churn after search. Let us see -

Understand the side effect
How big is the increase in page load time? What was the average page load time earlier, and how big is the impact? - Average page load time earlier was 1.7 seconds. Now it is 3.7 seconds.
This is a considerable increase in page load time, and in today's time of instant gratification, users are not OK with this increase in PLT; hence they leave the platform if they have to repeat the search operation again and again to find the desired item.
What was the category-wise variation in PLT earlier? Was there a big variation earlier? And kindly help me with the variation now as well. - Earlier, the lowest PLT was 0.7 seconds for category A and the highest was 4.5 seconds for category B. Now the variation has decreased. For almost all the categories, PLT is about 3.7 seconds (variation: 1-4 seconds)
OK, we are seeing improvement in a few categories where PLT was high earlier. But because of the implementation, we may have compromised the lower PLTs of some other categories.
Cool,
Now I have a hypothesis -
Transactions for the categories where PLT has increased must have decreased, and transactions for the categories where PLT has improved must have increased, balancing the conversion rates. Could we have that insight please? - Yes, the data is in accordance with this hypothesis.
There is one last area that I would like to explore -
What was the purpose of the search enhancement? - To make search more granular and decrease PLT (as fewer and more relevant products will be loaded).
Isn't it resulting in people doing more searches now and increasing the burden on servers? - Server health has been consistent. As already told, we have increased keywords in search. That is encouraging users to search for the exact products with specific keywords in some categories.
Ok, now that we have understood the problem let us see what can be done.
1. What is the contribution of Category B searches overall? ~15%
2. That is considerable. What is the change in search? - It has become more granular, and we have added more keywords in search options.
3. Those changes have been added to overall search engine - Yes
Solution -
We will roll back the last changes first and limit the search optimization changes to the selected categories where PLT is improving.
For the rest of the categories, we will continue with the earlier search algorithm, and we will assess the scope of enhancement for these searches again.
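
The selective rollout can be sketched as a per-category comparison. The figures below reuse the approximate numbers quoted in this answer (category A at 0.7s and category B at 4.5s before, roughly 3.7s for both after); treating 3.7s as the exact post-launch value for each category is an assumption, and `categories_to_keep` is an illustrative name.

```python
# Page load time (seconds) per category, before and after the release.
plt_before = {"A": 0.7, "B": 4.5}
plt_after = {"A": 3.7, "B": 3.7}  # assumed: "about 3.7" for almost all categories


def categories_to_keep(before: dict, after: dict) -> list:
    """Categories where the new search lowered page load time."""
    return sorted(c for c in before if after[c] < before[c])


def categories_to_roll_back(before: dict, after: dict) -> list:
    """Categories where page load time got worse or stayed the same."""
    return sorted(c for c in before if after[c] >= before[c])


print("keep new search for:", categories_to_keep(plt_before, plt_after))
print("roll back for:", categories_to_roll_back(plt_before, plt_after))
```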

N23 · Answer 7 · 2020-07-01T12:23:44+0000

To determine next steps, I would start with what the new feature attempted to do.

Assumption:

Data we have is from a clean A/B test and is not tainted by other factors external to the feature.
The feature's goal was to make it easier for users to search for the products they wish to buy. Thus the success metrics could be:

Primary metric: "Average $ value of checkout per user session"
- User session is defined as a period of time with continuous user activity, i.e. no user break greater than ~15 minutes.
Secondary metric: Search performance: ratio of number of searches to clicks on the SRP.
- As response time goes up, this metric would get worse because of bounce / abandonment of search.

Based on how the feature performed on the primary and secondary metrics, my actions would be as under:

Primary metric (Avg $ per user session)	Secondary metric (Bounce rate)	Action
Pass	Pass (Unlikely, based on info we are given)	Roll out. No further action needed
Pass	Fail (Most likely, based on info we are given)	1. Roll out. 2. Root-cause the source of the delay and try to fix it. 3. If we have an improvement, make an additional release with the fix and measure again.
Fail	Pass (Unlikely)	1. Roll back the feature. 2. Restrategise how to improve search.
Fail	Fail	1. Roll back the feature. 2. Root-cause the source of the delay and try to fix it. 3. Run the experiment again if we have a fix.
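
The decision table above can also be written as a small lookup function. This is only a sketch: `next_steps` is an illustrative name, and what counts as "pass" for each metric is left to the experiment's own success criteria.

```python
def next_steps(primary_pass: bool, secondary_pass: bool) -> list:
    """Map the (primary, secondary) metric outcome to the actions in the table."""
    if primary_pass and secondary_pass:
        return ["Roll out"]
    if primary_pass:
        return [
            "Roll out",
            "Root-cause the source of the delay and try to fix it",
            "Release the fix and measure again",
        ]
    if secondary_pass:
        return ["Roll back the feature", "Restrategise how to improve search"]
    return [
        "Roll back the feature",
        "Root-cause the source of the delay and try to fix it",
        "Run the experiment again if we have a fix",
    ]


# Most likely case per the answer: revenue metric passes, bounce metric fails.
for step in next_steps(primary_pass=True, secondary_pass=False):
    print(step)
```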

anshu gupta · Answer 8 · 2025-02-05T01:48:33+0000

Some clarifying questions to answer this issue:

1. What do we mean by "boosts Amazon search by 10%"? Does it mean the search results have increased by 10%?

2. By adding 2s, what is the impact in terms of %? - Don't know. And which metric is it affecting? Is it causing users to abandon the search? - Yes.

3. So that means it is effectively resulting in fewer purchases, bringing the average order value down, and also in a bad experience for the user.

4. Is this feature done on both web and mobile? - yes

So typically, in a user journey, A% of users use Amazon search > B% then check the search results > C% click on the results, and further down some of them (D%) make a purchase.

Ideally we would like to keep B% and C% the same.

So, I would like to rephrase my problem statement as -

I want to decrease the search result time, so that users do not abandon the search midway and do get the results.

We need to optimize the query to get the results. Now, I am sure Amazon has very good search mechanisms, but it can be checked whether applying different search mechanisms, like Elasticsearch, or maybe indexing in the database, could improve the result time.

Now, let's say that the return time of the query cannot be improved in any case. In that case, we need to analyse the benefit of increasing the search results, because the ultimate goal here is to make a sale.

So, if without the new feature the overall sales (A*B*C*D) remain higher, then I think it makes sense to roll back the feature.

Mohit Khatwani · Answer 9 · 2024-04-12T12:51:33+0000

Quick Clarifying Questions

When you say search is boosted by 10%, is this for a specific category or overall? Answer - Overall, as this feature is launched for all categories
Is this spike in a specific geo? Answer - All geos
Have we seen this spike on a specific platform or overall? Answer - Overall
Have we seen this spike in a specific cohort of users? Answer - No
When you say search has boosted by 10%, what change did we make to the search functionality (i.e. the new feature)?
- Answer - Let us say our service has introduced search via AI (LLM), e.g. "I want to attend a party today and I would like you to suggest not-so-casual but not very formal clothes and shoes for me"
Do we know what the load time was earlier?
- Earlier: 1.5 seconds
- Now: 3.5 seconds
Is the analytics tool we are using to measure this firing the events correctly, and is the database recording them correctly? Answer - Yes

Internal Factors (Currently I am not taking external factors into consideration, as this seems to be more of a performance issue. This is an assumption; please correct me.) Answer -

User journey - User lands on the homepage of the app/web > Taps on the search icon > Starts typing the keyword > After typing the sentence, user hits the search button?
1. Is this correct? - Answer: Yes
2. Also, there is no auto-suggest because this is a long-tail keyword. Am I right? - Yes
Backend process/architecture - When the user hits the search button
1. Step 1 - Decode the LLM-based sentence into the desired keywords
  1. How much time does this middleware take to decode? - Answer is 1.5 seconds
2. Step 2 - Post decoding, the search keywords are passed on to the search API
  1. Do we know whether the search API is able to handle the number of requests properly, without any delay in response time?
    1. Answer - The API request is taking about 0.5 seconds longer than earlier
  2. So do we know whether the instance/server deployed for this API is able to serve the requests, or is there a delay in response?
    1. Answer - Yes, the server is taking some time, as multiple keywords are being sent to the API due to LLM decoding; in this new structure, multiple keywords are searched at a single time
  3. CDN - The data is being fetched from the server, but the images are being fetched from the CDN
    1. Do you see any latency there? - Answer - No

Solution

I see 2 possible reasons contributing to the delay in search: 1) response time on the server side, as multiple keywords are being sent in a single API request, and 2) decoding of keywords.
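
As a quick check that these two causes account for the whole regression, the arithmetic can be written out using the figures given in the answers above (1.5s baseline, 1.5s for decoding, 0.5s extra API time). The variable names are illustrative only.

```python
baseline_s = 1.5  # load time before the feature, per the answers above

# Added latency per cause, in seconds, per the answers above.
added_s = {"llm_decoding": 1.5, "search_api": 0.5}

total_added_s = sum(added_s.values())
new_load_time_s = baseline_s + total_added_s

# Share of the regression attributable to each cause.
shares = {cause: delay / total_added_s for cause, delay in added_s.items()}

print(f"new load time: {new_load_time_s}s")
for cause, share in shares.items():
    print(f"{cause}: {share:.0%} of the added delay")
```

The sum matches the reported 3.5s, and decoding accounts for three quarters of the added delay, which suggests addressing it first.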

Sol 1 - For LLM searches, extract the list of keywords searched in the last 6 months and send the keywords to Azure AI or any other AI solution to generate the possible search prompts for each keyword
1. E.g. ask the AI to populate prompts for the keyword "shoes"
  1. "Best running shoes for high arches 2024"
  2. "How to clean suede shoes without damaging them"
  3. "Fashionable shoe brands for men in spring 2024"
  4. "Comparison of Nike vs. Adidas basketball shoes"
  5. "Tips for choosing comfortable work shoes for standing all day"
  6. Train the model on the top 20 prompts on a daily basis to make the model more robust and essentially help the AI learn better about these keywords
  7. Measure the decoding performance to see if the time has reduced by 1.5 seconds or by some other amount
Sol 2 -
1. Increase the server capacity by one more instance, keep monitoring performance, and check if the response time, which was inflated by 0.5 seconds, has gone back to its original value

This is a continuous process, and it may be a good idea not to expose 100% of the audience to a feature like this, as the model is not yet well trained and the compute cost of AI will also be very high

Deploy one solution at a time to measure its effectiveness correctly

Deploy one solution at a time to measure its right effectiveness

Success Metric

No of seconds taken to populate search
1. Before vs after - North star metric

Other Metrics

Measure time taken for decoding
1. Before vs after
API response time
1. Before vs after

Nishant Kumar · Answer 10 · 2024-03-15T17:46:32+0000

CQ:

What does boosts mean -> searches appearing within the top 3 list
Feature sensitivity to age group, type of search, geography - NA

Approach - Run an A/B test and measure below metrics to take the call

Engagement:

No of searches made per session
% sessions with add to cart
% sessions with successful purchase

Retention:

User complaints wrt search/load time
7 day, 15 day, 30 day return rate
CSAT of both user bases

Top 2 metrics to be measured are:

% sessions with successful purchase -> measures customer & business outcome
7 day, 15 day, 30 day return rate -> indicator for impact on the user base
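
To make the A/B call on "% sessions with successful purchase", one common option is a two-proportion z-test. The sketch below uses only the standard library; the session counts are hypothetical, and the answer itself does not prescribe any particular statistical test.

```python
import math


def two_proportion_z(success_a: int, n_a: int, success_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference between two proportions."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: control (A) vs. new search feature (B).
z, p = two_proportion_z(success_a=500, n_a=10_000, success_b=450, n_b=10_000)
print(f"z={z:.2f}, p={p:.3f}")
```

A negative z here means the treatment group converted less often; whether the difference is acted on depends on the significance threshold chosen before the test.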

gaurav minhas · Answer 11 · 2024-03-07T21:46:06+0000

Clarifying Questions:

1:Is this related to any particular geography?-- No

2:Is this related to any particular platform?--No

3:Is this seasonal change ?--No

4: When we say boost search , have we started loading more items/images per page as per the search?-- Yes

5: Does boosted search mean providing more granular information about the thing that was searched for, or that the user has to use multiple keywords to search for one thing? -- Yes

6: Does this affect users adding things to cart? -- Yes

7: Did it affect users going through the checkout process? -- Yes

8: How are we measuring the boost in search? Is it only the user entering the search keywords and landing on the requested page, or is it end to end: searching, adding to cart, and making transactions? -- We are checking end to end.

9: Was any A/B test conducted before launching the feature? -- No

10: % users landing on the cart page, before and after -- Decreased

11: % users landing on the checkout page, before and after -- Decreased

From the above, the hypotheses we can generate are:

1: There are now more items / more information being loaded about the items being searched.

2: There was no A/B testing done before launching the feature.

3: There is an overall decline in transactions.

Potential solution:

1: Do the A/B testing for this and check whether the more detailed information is actually required by the user on the first page, or whether we can shift it to subsequent pages.

2: Measure the % users before and after landing on the Cart page

3: Measure the % users before and after landing on the checkout page

Aditi Bhalla · Answer 12 · 2023-10-20T17:58:32+0000

Clarifying Questions:

Q: Is this search increase global or limited to a certain region? Globally

Q: Is this related to a particular device? It is reflected across all.

Q: Is the BI tool providing you with correct information? Yes

Internal Factors:

Q: Are metrics other than search and load time getting affected? Yes: product selection, adding to cart, and daily average order

Q: Have all the metrics, including # of searches, results load time, # of users landing on the product screen, product selection, adding to cart, and daily average order value, changed after the feature launch compared to before?

A: Yes

Q: How much is the change?

A: Almost 10% in every metric

Analysis: There is a problem with the search functionality you recently launched; it is not performing well, and you need to optimize the search feature.

Solution: Roll back the recent changes and serve the previous version to users so that you can retain them. Meanwhile, ask your tech team to work on optimizing the search feature, and test it with internal users and then a small set of end users before launching it globally.

sahil kapoor · Answer 13 · 2023-02-14T15:54:37+0000

First, I would want to understand the problem statement.

What is this new search feature? E.g., Advanced Search
Can you further explain what you mean by load time? Page load time
Is this for all items users search for or some specific ones?
How are the new search functionality and the search feature connected? (This could also be a root cause; I'll mention it in the interview.)
Is this happening to any particular user group or location?
Is this mobile-specific, web, or both?

Hypothesis and Root Cause

First, as PM, I'll try to understand if/how the new search functionality is related to the page load time.
This will give me some direction on whether this 2-second load time is related to the new search functionality or is independent.
If it's independent, then I'll investigate the load time increase:

Was there any new release affecting page loading that might have introduced a bug?

4. I'll also try to understand the effect of the increased load time on users. For example, is it affecting user retention? If there is no effect, then we can live with a 2-second increase in load time.

5. Understand whether any other events, campaigns, etc. might have increased the number of users and/or searches for the item(s), which in turn is affecting the load time.

6. Is there seasonality? For example, Prime Day > more users > hence an increase in load time.

7. I'll look at releases that happened around the same time that might affect this.

8. I'll try to understand whether this is item-specific or general.

If I can get direction from the interviewer on the specific root cause, I'll move toward a solution. For example: influx of new users > increase server capacity, or maybe cap the number of results displayed per page.

Surbhis · Answer 14 · 2022-06-07T18:02:30+0000

I will approach this question briefly and succinctly.

Any e-commerce or customer-facing portal has a standard threshold defined for page load time. As per Google, it should be <3s or, in other words, between 0 and 2s. The API (external and internal) response time should not be greater than 300ms.

To start with, I will ask some clarifying questions.

1) Is it the page load time that has increased by 2 seconds? What was the load time earlier? Expected answer:

2) Is it only page load or API response time as well? Answer: Only page load time has increased.

Approach Type 1 :

Let's say the interviewer answers the 1st clarifying question by saying the page load time was previously within 2s. That means the load time has now reached up to 4s, which is not recommended for any e-commerce or customer-facing portal. It degrades the customer experience and increases the bounce rate.

Analysis:

The efficiency of the search feature might have increased by 10%, which is good, but page load time cannot be traded off against this 10% improvement in search functionality. Why?

Let's consider:

A simple, ideal e-commerce customer journey is

login --> search for the product --> add to cart --> checkout --> make payment and done.

Searching for the product is at the top of the funnel (awareness), and the later part of the journey up to conversion is largely attributed to good user experience and page load time.

Solution: I would quickly involve the dev team to check the backend, database, API calls, code quality, and other possible root causes, and bring the page load time back to 2s or less, as this is the industry standard.

Approach Type 2:

At times the interviewer checks whether Tech PMs understand dev monitoring alerts related to code coverage, page load time, etc.

Let's assume the interviewer's answer to the 1st clarifying question is 0s. Then adding 2s is still acceptable, as the total is still less than 3s.

Solution:

Google's recommended page load time is between 0 and 2s, or at most less than 3s. Here the page load time is still within that range even after increasing by 2s, so there is no need to panic. However, the dev team should set up a monitoring alert for a couple of days and do enough root cause analysis to rule out any code or DB related issues.
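
The two approaches above reduce to a simple budget check. This is a minimal sketch, assuming the 3s budget quoted in this answer; the function name is hypothetical, and the real threshold would come from the team's own performance targets.

```python
def check_load_time(baseline_s, added_s, budget_s=3.0):
    """Compare the new page load time against a budget.

    budget_s defaults to the <3s figure quoted in this answer (an assumption,
    not a verified standard).
    """
    new_time = baseline_s + added_s
    within = new_time < budget_s
    return new_time, within

# Approach Type 1: baseline of 2s, so the page now takes 4s (over budget)
print(check_load_time(2.0, 2.0))

# Approach Type 2: baseline near 0s, so the page now takes 2s (within budget)
print(check_load_time(0.0, 2.0))
```

Even when the check passes, the answer's advice still applies: keep a monitoring alert on the metric so a further regression is caught early.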

shrutc · Answer 15 · 2021-05-01T10:16:28+0000

The search functionality on Amazon is used by customers to find products that they want to buy. The functionality helps users narrow down the search by applying relevant filters. The search needs to be accurate so that users are able to get what they want. The results page is also displayed based on the filters applied, and the user can see any relevant information on the search landing page itself.

For the above problem, we first need to determine whether the two are related and then decide on the next steps. Also, since the 2-second increase in load time is undesirable, I will start by finding its root cause.

List of questions that I would ask to narrow this down

1. Is the 2-second increase in load time limited to a particular Amazon geography? - yes (in the US)

2. Was it for a particular category of products? – Yes for some categories not all

3. Was there any infrastructure change that led to longer response time? – No

4. Was there any new algorithm added for searching better? – No

5. What was the new feature that helped increase search by 10%? – a new filtering criterion was added for all searches

6. Does the filtering criterion have a new algorithm at the backend? – yes

7. Has this algorithm, in order to provide better results, increased the load time? – yes possibly

If it’s the new algorithm that has led to the increased load time, then we need to check whether the 2-second increase in load time has affected any other metrics, such as the number of purchases, or whether it has led to a decrease in revenue from a particular category, especially the ones that load slower.

If the answer to the above is yes, then Amazon should rethink whether it wants to increase searches at the cost of reduced purchases. If the answer is no, then trading increased load time for better and more frequent searches is justifiable.

avleenk · Answer 16 · 2020-10-27T15:55:28+0000

Amazon is a platform where customers can shop online for anything, and page load time is crucial. I would like to know more about what could have caused this. I have a few questions that will help me narrow down the problem.

Q1: Can you tell me what was this new feature that we released?

Q2: Searches increased by 10%? Can you tell me what baseline they increased from?

Q3: Do we know whether any other release was planned during the same time? If another feature that uses the search functionality internally launched at the same time, it might have affected our feature.

Q4: Has any change been identified in the number of users? Does it peak during a certain period of time?

Q5: Are users searching for the same item or for random items?

Q6: Can we rule out that this is Prime Day, Cyber Monday, or a Black Friday sale?

Let's say, for example, that the answer to Q3 is "Yes". Assume the number of users has significantly increased.

Main goal for search functionality: To get the search results faster

Problem scenario: Searches increased by 10% but response time has increased by 2 seconds

Factors that could have caused this problem:

Server load: Given that the number of users has increased, the load on the server is higher, which could have resulted in poor performance. We need more statistics on server performance before and after the launch of the feature.

Performance of the queries/logic: The new search functionality has logic that does not deliver the expected performance once the number of users reaches a certain threshold. This needs to be analyzed with the development team before coming to any conclusion.

Based on the analysis with the development team, we can see whether the increase in page load time is due to the new feature. If so, we can then calculate some user metrics. For example: if usage of the feature is minimal, we can simply roll back the feature. If usage is high, then the priority is for the development team to see whether the page response time can be improved.

Lokesh Kumar R · Answer 17 · 2020-07-17T18:18:26+0000

Before answering the question, I would like to share the approach:

Comprehend the given situation
User Journey
Pros and Cons of the feature
Decide whether the lagging is actually because of this new feature.

Comprehend the situation:

Are we talking about a website or an app? App
Which demographic user are we talking about? US
For how long was the data collected? WoW

User Journey:

Opens the App
Enters the Search keyword
Results get shown
Clicks through a result

Pros:

The feature has helped in improving the user experience with respect to Search. This is pretty clearly visible from the % increase in the number of searches done. With personalized search, there is a possibility of a reduction in bounce rate from the Result Page. This might also help us in increasing the revenue by increasing the # of Successful Conversions.

Cons:

There has been a 2-second lag in showing the result page. (This could be because of the search result or because of some other internal or external issue.) The user might get confused by the many options on the screen.

Decision:

As a PM of Amazon,

I would check whether the lag in showing the result page is only in the US or global. If it is only in the US, the problem might not be due to the change in the search algorithm. Instead, there could be server-side problems, or a change in government policy that made ISPs slow down network traffic to Amazon. If it is global, there is an increased chance that the problem is associated with the search improvement. Let's assume it is global to continue with the answer.
Next, I'll check whether there is any seasonality in the drop. If yes, then the problem is most likely not due to the algorithm change. However, if the answer is no, then the problem should be attributed to the algorithm change. Let's assume it is no.
In that case, I'll next look at the bounce rate from the Result Page. If it has increased, I'll roll back the change for now and try to optimize the search algorithm as soon as possible. However, if the bounce rate has not increased, I'll look at the number of successful transactions and the click-through rate. If either one has increased, I'll keep my change in place.
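
The decision steps above can be sketched as a small function. This is only an illustration of the flow described in this answer: the function name, parameter names, and returned strings are hypothetical, and the final fallback branch is an assumption, since the answer does not say what to do when neither transactions nor click-through rate improve.

```python
def decide(is_global, is_seasonal, bounce_rate_up, transactions_up, ctr_up):
    """Walk through the checks in order and return a recommended action."""
    if not is_global:
        # Lag limited to one region: likely server-side or network causes
        return "investigate regional causes"
    if is_seasonal:
        # Seasonal pattern: probably not caused by the algorithm change
        return "attribute to seasonality and monitor"
    if bounce_rate_up:
        return "roll back and optimize the search algorithm"
    if transactions_up or ctr_up:
        return "keep the change"
    # Assumption: not covered by the answer above
    return "gather more data"

# Scenario assumed in the answer: global, not seasonal
print(decide(True, False, bounce_rate_up=True, transactions_up=False, ctr_up=False))
print(decide(True, False, bounce_rate_up=False, transactions_up=True, ctr_up=False))
```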

Kay · Answer 18 · 2020-06-14T03:42:20+0000

It appears that you're assuming increase in searches by 10% is positive. I think it could also mean that it is more difficult for customers to find what they're looking for.
First of all, I would confirm that the new feature is the primary reason for the increase in searches. The increase could also be the result of a marketing campaign or of any Political, Economic, Sociological, Technological, Legal, or Environmental factor. I would also compare with the numbers from the same period in previous years, since it could be a seasonal spike.
If I am able to confirm that it is a result of the new feature, I would measure the page load time of other e-commerce platforms such as eBay, AliExpress, and others to know whether we are within the same page load time threshold.
The final step would be to monitor user behavior over the next few weeks using heat maps and other tools. Are they dropping off? Are they purchasing more items? Are they copying more item links and sharing them?
If the result is negative on all counts, we will go back to the drawing board and re-strategize based on the results.