How to Solve an RCA?

Question - As a PM responsible for X, you have noticed a phenomenon Y. Investigate the reason for Y.

Sample Question - As a PM working with Swiggy, you have noticed a spike in people abandoning carts.


Step 1:

Define the scope of Y. Break Y down into clear metrics that are measured over time and across geographical areas. If it is unclear which part of the user journey Y is observed in, construct a simple sample journey and take the interviewer through it to identify where Y happens.
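
To make the "walk through the journey" idea concrete, here is a minimal sketch that finds the step with the biggest drop-off. The step names and user counts are made up for illustration, and `step_drop_off` is a hypothetical helper, not part of any real analytics tool.

```python
def step_drop_off(funnel):
    """Return the share of users lost between each pair of consecutive steps.

    funnel: ordered list of (step_name, users_reaching_step) tuples.
    """
    rates = []
    for (prev_name, prev_n), (name, n) in zip(funnel, funnel[1:]):
        rates.append((f"{prev_name} -> {name}", 1 - n / prev_n))
    return rates


# Made-up counts for a hypothetical food-delivery journey.
funnel = [
    ("open app", 1000),
    ("add to cart", 600),
    ("view cart", 480),
    ("payment", 192),
    ("order placed", 180),
]

rates = step_drop_off(funnel)
# The transition that loses the largest share of users is where Y most likely lives.
worst_step, worst_rate = max(rates, key=lambda r: r[1])
print(worst_step, round(worst_rate, 2))
```

In an interview you would do this verbally, but the logic is the same: compare each step against the one before it, not against the top of the funnel.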

Questions to Ask:

  1. Since when have we observed Y? Is it sudden or gradual?
  2. How do we define Y in metrics? (Suggest a few metrics that make sense.) → Any change in usage patterns can be described in terms of metrics; we just need to choose the right one. Most product metrics measure similar things - users installing, logging in, looking at stuff, clicking stuff, dropping off after a point, becoming inactive/uninstalling, and complaining through support/App Store reviews → avg. user session duration, % of users dropping off after a certain point in the user journey, % uninstalls, click-through rates, daily active users, App Store reviews, customer complaints, etc.
  3. Is Y observed uniformly across all the geographic markets we operate in? → Break it down by countries, states, urban/rural, different language-speaking regions, different demographics (age, gender, etc.) and different engagement levels (new users / committed old users / moderately engaged users / inactive installs).
  4. Is Y observed across all our supported platforms - iOS, Android, Web?
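
Questions 3 and 4 boil down to slicing the same metric by different dimensions and checking whether it is uniform. A minimal sketch, assuming each cart session is a record with hypothetical `platform`, `city` and `checked_out` fields (the data below is invented):

```python
from collections import defaultdict


def abandonment_by(sessions, key):
    """Return {segment: cart abandonment rate} for the given dimension."""
    totals = defaultdict(int)
    abandoned = defaultdict(int)
    for s in sessions:
        totals[s[key]] += 1
        if not s["checked_out"]:
            abandoned[s[key]] += 1
    return {seg: abandoned[seg] / totals[seg] for seg in totals}


# Invented sample data: one record per cart session.
sessions = [
    {"platform": "Android", "city": "Bengaluru", "checked_out": False},
    {"platform": "Android", "city": "Bengaluru", "checked_out": False},
    {"platform": "Android", "city": "Mumbai", "checked_out": False},
    {"platform": "Android", "city": "Mumbai", "checked_out": True},
    {"platform": "iOS", "city": "Bengaluru", "checked_out": True},
    {"platform": "iOS", "city": "Mumbai", "checked_out": True},
    {"platform": "iOS", "city": "Mumbai", "checked_out": False},
    {"platform": "Web", "city": "Bengaluru", "checked_out": True},
]

by_platform = abandonment_by(sessions, "platform")
by_city = abandonment_by(sessions, "city")
print(by_platform)
print(by_city)
```

In this toy data the rate is the same in both cities but much higher on Android, which is the kind of pattern that would point you towards a platform-specific cause.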

Step 2:

Once you have the answers to the questions above, prioritise your approach. Divide your investigation into external and internal factors. If it is a gradual trend that is uniform across geographies and devices, we are probably being affected by the market, so start with external. If it is a sudden change, perhaps limited to one platform, it is more likely due to some internal misstep. Any combination of factors can make you lean towards an internal-first or external-first approach, and it doesn't matter which one you choose as long as you have a clear reason.

Explicitly state your chosen approach and that you will be coming back to the other part later. Mention which factors are a part of your approach and which ones you'll be leaving for later.
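
The rule of thumb above can be written down as a tiny decision function. This is only a sketch of the heuristic as described here; `first_approach` and its arguments are hypothetical names, and real cases will need judgement rather than a lookup.

```python
def first_approach(onset, uniform_across_markets, uniform_across_platforms):
    """Suggest which set of factors to investigate first.

    onset: "sudden" or "gradual".
    Returns "external", "internal" or "either".
    """
    if onset == "gradual" and uniform_across_markets and uniform_across_platforms:
        # Slow, uniform drift: the market is the more likely culprit.
        return "external"
    if onset == "sudden" or not uniform_across_platforms:
        # Abrupt or platform-specific change: suspect something we shipped or broke.
        return "internal"
    # Mixed signals: pick one, and say out loud why you picked it.
    return "either"


print(first_approach("gradual", True, True))
print(first_approach("sudden", True, False))
print(first_approach("gradual", False, True))
```

Whatever the function returns, the point from the text still holds: state the choice, state the reason, and promise to come back to the other half.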

Internal Factors -

There are three main internal factors to consider:

  1. Technical Factors
    1. How confident are we in the numbers shown in our dashboard?
    2. Have we made any changes to the data pipeline, the formula for calculating our metrics, or anything else that might have a material effect on our numbers?
    3. Have we noticed any performance issues with the app - forced closes, crashes, login/payment API timeouts, location errors, etc.?
      1. Bonus Sub-Question - Have there been any platform updates that affect the performance of our app - like location restrictions in iOS, app-switching restrictions preventing us from opening payment apps, etc.?
      2. Things to look at: changes to avg. user session length, error logs, customer service calls, etc.
    4. Did we experience any service downtime due to things like AWS/GCloud being down, the CDN being down, our own server issues, etc.?
  2. UI/UX / User Journey / Usability Factors - If a sample user journey looks like this: User launches the app → Performs action 1 → Performs action 2 → Performs action 3 → Gets the result, which part of the user journey is the issue observed in?
    1. Have we made any changes to any of the steps mentioned above? The changes could be →
      1. How the steps are ordered.
      2. Changes in the representation of results from each step → like the way search results are shown, the visual weight of each element on screen, colors used, etc.
      3. Changes in the information shown in the results of each step → like showing delivery time in search results, showing price with tax, no longer showing applicable discounts, changes in the search/recommendation algorithm showing different results based on local preferences rather than global ones, addition of a direct checkout button alongside add to cart, any other relevant changes, etc.
      4. Addition of new steps → like removing the option of direct checkout (hence adding extra steps to the process), hiding the like button under the options menu (making the act of liking something a two-step process), showing categories in search results instead of the actual item, adding a cancellation timer that delays payment processing.
    2. Try to attribute the phenomenon Y to any of the changes discovered above and determine whether the change is a net positive.
      1. Look at the metrics pre and post release to see the impact.
      2. Conduct a randomised A/B test with a large enough sample size, with and without the change, to see the impact.
      3. Analyse other related metrics that may have moved the other way and change the net result → like a drop in the no. of orders but an increase in avg. order value and total revenue, a drop in avg. user session length but an increase in the number of sessions and daily engagement numbers, or a drop in delivery time but higher delivery costs negatively impacting profitability.
  3. Miscellaneous factors unique to the product in question and issue faced - X and Y. This needs to be specific to the question. A few standard questions can be reused, as most businesses are some version of e-commerce / content streaming / financial services (also a form of e-commerce, as you are buying financial goods online) / social media - talk about a lack of originality 😛
    1. Change in service delivery due to operational constraints → like a delivery force strike increasing delivery times/surge pricing, lapse of streaming rights leading to removal of popular content, removing local language support due to a perception of poor adoption, etc.
    2. Pricing change → hike in prices due to the end of a promotional period, end of the free trial period for a massive proportion of the user base, change in strategy to focus on higher-priced but more restrictive plans for a more dedicated user base.
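
For the A/B test mentioned under the UI/UX factors, one common way to check whether a difference in conversion is more than noise is a two-proportion z-test. The sketch below swaps that in as an illustration; it is not the only valid test, and the counts are invented. It uses only the standard library.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates.

    conv_a / n_a: conversions and users in the control group (without the change).
    conv_b / n_b: conversions and users in the variant group (with the change).
    Returns (z, p_value).
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF, via math.erf.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value


# Invented numbers: 300 of 1000 control users check out vs 250 of 1000 variant users.
z, p_value = two_proportion_z_test(300, 1000, 250, 1000)
print(round(z, 2), round(p_value, 3))
```

A negative `z` with a small `p_value` suggests the change really did hurt checkout. Whether it is still a net positive depends on the related metrics (order value, revenue, etc.) discussed above.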

External Factors -

There are three main external factors to consider:

  1. Competitor Analysis
    1. Has our competitor engaged in an aggressive sales strategy, giving users a better price for a similar product - free delivery, free trial periods, extended trial periods, referral programs, lower pricing plans, etc.?
    2. Has our competitor engaged in an aggressive marketing strategy, increasing their mindshare and users' willingness to try them out - an innovative social media strategy, viral ads, support for social causes, better treatment of workforce, etc.?
    3. Has our competitor leapfrogged us and added functionality that makes their product more appealing - better content to stream (blockbuster shows, etc.), local language support, better UI/UX, faster delivery times, a better recommendation algorithm, newer content formats, variety of options (content, food, vehicles)?
      1. Another similar factor could be a more established player adding features that were our main differentiating factor - like IG adding stories and reels, YouTube adding shorts, etc.
    4. Has our competitor poached our delivery workforce, drivers, etc.?
  2. Regulatory and Social Factors
    1. Has the government passed some law that impedes our operation in some way → labour law preventing the delivery force from operating, copyright law preventing content from being shown on our platform, COVID restrictions preventing operations, blocking apps from a certain country, etc.?
    2. Has the company been the centre of some negative publicity → like the delivery force behaving inappropriately, an executive tweeting something offensive, or randomly being caught in the backlash to something else (Snapdeal getting -ve reviews due to Snapchat getting in trouble)?
    3. Festivals and other events that can cause a shift in usage patterns - like Diwali causing sales to spike, Ramadan causing food orders to drop, Earth Day reducing consumption of internet and associated services, an internet outage or being blocked in some area, natural calamities, seasonality of demand, etc.
    4. Changes in user preferences over time - users moving to vertical short-form video from horizontal long-form video, users eating out instead of ordering, users listening to podcasts instead of music, users moving to online payment instead of offline payment.
  3. Miscellaneous external factors unique to the product in question and issue faced - X and Y
    1. Cost of services used to access the product → like internet cost, fuel cost for cabs, food ingredient cost for food delivery, access to mobile phones or cost of mobile phones.
    2. Things I can't think of 🙂

Once you have enquired about all the internal and external factors, there will be one or more hypotheses you can advance about the root cause of Y. These hypotheses can be verified by looking at metrics pre and post Y for sharp changes in trend, or by conducting A/B tests. You can further explore whether Y is a net +ve or -ve for X.
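
The "pre and post" check can be as simple as comparing the average of a daily metric before and after the suspected change. A minimal sketch, assuming you already know the day the change happened; `pre_post_change` is a hypothetical helper and the daily abandonment rates are invented.

```python
from statistics import mean


def pre_post_change(daily_values, change_index):
    """Compare the mean of a daily metric before and after a change.

    daily_values: metric values in chronological order.
    change_index: index of the first day after the suspected change.
    Returns (pre_mean, post_mean, relative_change).
    """
    pre_mean = mean(daily_values[:change_index])
    post_mean = mean(daily_values[change_index:])
    return pre_mean, post_mean, (post_mean - pre_mean) / pre_mean


# Invented daily cart abandonment rates; the suspected change shipped on day 5.
daily_abandonment = [0.30, 0.31, 0.29, 0.30, 0.38, 0.40, 0.39, 0.39]
pre, post, rel = pre_post_change(daily_abandonment, 4)
print(round(pre, 2), round(post, 2), round(rel, 2))
```

A plain before/after comparison cannot separate the change from seasonality or anything else that happened at the same time, which is why the A/B test is the stronger check when you can run one.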

This is how I would solve an RCA.