Tips from Wayfair Data Science PhDs – Wayfair Tech Blog


Data science panelists Rudi Natarajan, Stephanie Sorenson, Hussain Karimi, and Robert Yi (moderated by Licurgo De Almeida)

 

So, you graduated college, spent half a decade in a quantitative PhD program, maybe took a postdoctoral fellowship... and now what? You may want to stay the course and pursue a tenure-track faculty position at a university, or you may not. While the pressure to remain within the academy post-PhD can at times feel overwhelming, if you’re in the latter group and are considering making the switch from academia to industry, know that you’re not alone.

In fact, more than 40% of the Data Science team at Wayfair hold PhDs and once made this same transition.


*Departmental statistics as of 10/31/18

These PhDs came to Wayfair with various academic backgrounds (from cognitive science to economics to nuclear engineering) and had different reasons for pursuing a career outside the academy (a desire for job security, better funding, and a faster tempo of development being among the most common). But once they made the decision to leave academia, all faced the same questions: How can I jump-start my new career in data science? What skills or experience do I need to succeed?

Wayfair Data Science recently hosted a PhD Networking Event to help current PhD students and postdocs in the same position. Over 100 local PhD students registered for the event (still feel alone?), which featured lightning talks by Zephy McKanna (Data Scientist, Wayfair) and guest speaker Amy Winecoff (Data Scientist, True Fit Corp), as well as a panel of Wayfair PhDs who shared their experiences making the transition.

After the event, we asked our panelists for their top pieces of advice for PhDs/postdocs wanting to break into data science. Check them out below!

  


Rudi Natarajan

Data Science Manager
PhD Computational Biology
Duke University 

During his PhD in computational biology at Duke, Rudi worked on understanding the intricacies of gene networks in embryonic mouse gonad development. He then migrated north to Boston where he studied embryonic mouse liver development. At Wayfair, Rudi traded in mouse gonads for furniture and uses ML and statistical techniques to identify the best products in the Wayfair catalog. Rudi is also a coffee snob and a theoretical jazz guitarist.

 


Top Tips:

  • Get hands-on experience: Once you have some experience writing code, getting hands-on experience applying machine learning, statistics, and/or causal inference to real-world data sets is the best way to understand what data science is like. There are several common data sets available (MNIST, cats vs. dogs, IMDB reviews, etc.), but instead of just picking one, try to form a project around a question that you care about, then work out how ML algorithms could help answer it. Not only will it be more interesting for you, but the project will also carry more weight in discussions with future employers.
  • Determine what sets you apart: If you are close to interviewing for data science positions, it also helps to understand what skills you are bringing and how they might set you apart from the rest of the competition. One book I recommend for anyone making a career transition (even though it’s geared towards people applying for MBA programs) is Avi Gordon’s MBA Admissions Strategy. In particular, I’ve found that adapting his profile-building approach really helps to identify what skills you bring to the table as a future data scientist. Once you have these outlined, it’s a matter of demonstrating them in the interview.

Additional Resources:

  • fast.ai: For getting an understanding of how to use ML methods, I think fast.ai is an invaluable resource. Their machine learning for coders and deep learning for coders courses are fantastic. Even a year after making the transition to data science, I regularly go back to fast.ai materials and find things that I can directly apply to my work at Wayfair.

 


Stephanie Sorenson

 

Data Scientist
PhD Neuroscience
Stanford University 

Stephanie earned a BS in cognitive science from Dartmouth before completing her PhD in psychology/cognitive neuroscience at Stanford University. As a data scientist at Wayfair, she leverages data to optimize email marketing. In her free time she enjoys sailing, biking on the Minuteman bike path, and baking pumpkin-themed treats. She is originally from Westerly, Rhode Island.

 


Top Tips:

  • Do an internship/immersion program: If you’re unsure you want to make the jump into data science, getting hands-on experience is a great way to find out if it’s a good fit. Many companies offer internships, where you can spend a couple of months actually doing data science. Wayfair also has an immersion program, where PhDs/postdocs can spend a week learning more about the day-to-day life of a data scientist.
  • Use your network: Talk to friends, colleagues, and friends of friends who are data scientists. Having informal “informational interviews” is an excellent way to learn more about what types of problems data scientists work on, what tools they use, culture at different work places, etc. This can help you figure out what types of data science roles might be a good fit for your background and/or interests. Meetup groups can also be a great way to get involved in local data science communities (e.g., https://www.meetup.com/socialdatascience).
  • Do a bootcamp/training fellowship: Programs like Insight Data Science and other incubators/bootcamps can be good resources for making the transition from academia to data science. You can also create your own study group with other students/postdocs interested in data science careers, and practice whiteboarding and giving each other mock interviews.
  • Take/audit classes: If you have time to attend classes at your university, this is a great way to pick up new skills (or refresh old ones). There are several online courses too (e.g., Coursera, Khan Academy). Statistics (e.g., basic stats, probability, data mining), Computer Science (e.g., intro to CS, machine learning), and Math (e.g., linear algebra) are a few useful areas.
  • Try using machine learning in your research: Is there any question in your research that could be tackled with a machine learning approach? Using machine learning in the “wild” is a great way to gain practical experience... and even better if you can apply it to your own work!
  • Do a side project: Kaggle is a good place to start. Beyond that, generating your own idea, and then getting/cleaning the data, doing some EDA, modeling, and thinking through the results / impact of your model is a super valuable experience. Doing a project is also a great opportunity to practice using common data science tools (e.g., pandas, sklearn, GitHub). Are there any interesting questions you can ask with publicly available data (e.g., https://data.boston.gov/)? Any “data science for good” volunteer projects you can get involved with?
  • Spend a few hours a week doing SQL tutorials/practice questions: Many PhDs/postdocs haven’t used SQL, but it’s a critical skill for a data scientist. If you’ve used pandas in Python, or dplyr/tidyr in R, a lot of the concepts are similar; putting in just a little bit of time going through tutorials and practice problems can help you prepare.
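
To make the SQL tip above concrete, here is a minimal sketch of how one SQL query lines up with the pandas operations you may already know. It uses Python's built-in sqlite3 module; the `orders` table, its columns, and its values are made up for illustration, and the pandas calls in the comments are the usual rough equivalents rather than an exact translation.

```python
import sqlite3

# Hypothetical toy table: names and values are invented for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, category TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("ana", "sofa", 500.0),
        ("ana", "lamp", 40.0),
        ("ben", "sofa", 650.0),
        ("ben", "rug", 120.0),
        ("cy", "lamp", 35.0),
    ],
)

# Each SQL clause is annotated with the pandas idea it resembles.
query = """
SELECT customer,                -- pandas: the groupby key
       COUNT(*)    AS n_orders, -- pandas: .agg(n_orders=("amount", "size"))
       SUM(amount) AS total     -- pandas: .agg(total=("amount", "sum"))
FROM orders
WHERE amount >= 40              -- pandas: df[df["amount"] >= 40]
GROUP BY customer               -- pandas: .groupby("customer")
ORDER BY total DESC             -- pandas: .sort_values("total", ascending=False)
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

If you can read the query as "filter, group, aggregate, sort", you already know most of what everyday analytical SQL asks of you; joins and window functions are natural next topics for practice.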



Hussain Karimi

 

Data Scientist
PhD Computational Engineering
MIT

Growing up in Atlanta, Georgia, Hussain developed a natural affinity towards sweet tea (or simply “tea” as we call it).  In college he left behind the country fried steak of the south and headed to the beaches of San Diego. Then he moved to Boston for graduate school at MIT, where he studied heat transfer in the ocean.  Since joining Wayfair, Hussain has been working to implement bidding algorithms in real-time for online advertising auctions.

 


Top Tips:

  • Learn Python: The most important thing you can do to prepare for the transition into data science is to switch over to Python if you haven’t already. To get started with machine learning, I recommend Python Machine Learning (2nd edition) by Sebastian Raschka.
  • Play with real data: Once you’ve gone through those fundamentals, it’s best to play with real data sets and do mini projects to get familiar with the modeling pipeline. Kaggle is a good place to find data sets, along with shared notebooks that have quality ratings. The top-rated notebooks, in my experience, usually have clever approaches to data cleaning and modeling.
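
For readers who have not seen a modeling pipeline end to end, here is a rough sketch of the usual stages (clean, split, fit, evaluate) using only the Python standard library. The data is synthetic and generated on the spot, the straight-line model is deliberately the simplest possible choice, and a real project would more likely use libraries such as pandas and sklearn on an actual data set.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Synthetic "raw" data (made up): y is roughly 3*x + 2 plus noise,
# and about 10% of the targets are missing.
raw = []
for _ in range(200):
    x = random.uniform(0, 10)
    y = 3.0 * x + 2.0 + random.gauss(0, 1)
    raw.append((x, y if random.random() > 0.1 else None))

# 1. Clean: drop rows whose target is missing.
clean = [(x, y) for x, y in raw if y is not None]

# 2. Split: hold out 20% of the rows for evaluation.
random.shuffle(clean)
cut = int(0.8 * len(clean))
train, test = clean[:cut], clean[cut:]

# 3. Fit: ordinary least squares for a single feature, computed by hand.
mean_x = statistics.fmean(x for x, _ in train)
mean_y = statistics.fmean(y for _, y in train)
slope = sum((x - mean_x) * (y - mean_y) for x, y in train) / sum(
    (x - mean_x) ** 2 for x, _ in train
)
intercept = mean_y - slope * mean_x

# 4. Evaluate: compare held-out error against a "predict the mean" baseline.
mse = statistics.fmean((y - (slope * x + intercept)) ** 2 for x, y in test)
baseline_mse = statistics.fmean((y - mean_y) ** 2 for _, y in test)

print(f"slope={slope:.2f} intercept={intercept:.2f}")
print(f"test MSE={mse:.2f} vs baseline MSE={baseline_mse:.2f}")
```

The habit worth building is the comparison in the last step: a model is only interesting relative to a simple baseline measured on data it has not seen.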


 


Robert Yi

Senior Data Scientist
PhD Geophysics
MIT

Robert received a PhD in geophysics from MIT, where he studied pattern formation in river networks. Now he uses the skills he learned to develop and apply new uplift modeling methods for ad retargeting. When not at work, Robert spends most of his spare time thinking about problems that his mind finds during its frequent random walks, though he reserves some time for snowboarding, music, making his ideas useful, spending time with his lovely wife, and torturing plants. He is originally from Foster City, California.

 


Top Tips:

  • I used to give (and get) the advice: “Try to read through as much of The Elements of Statistical Learning as you can,” but in retrospect that’s generally bad advice.
    • To be completely honest, machine learning algorithms are not conceptually difficult, but the rigor of many textbooks makes the learning curve feel very steep. If you spend 2 hours reading about neural networks, for example, and do not understand that they are simply functions of linear combinations of functions of linear combinations (and so on) of some variables, whatever you’re reading is not doing a good job. There’s no need to know about “neurons” and “activation functions” at first: you just need to know what the core algorithm is, then, later, the details of its implementation and how it is adapted to solve problems in different domains.
  • My revised advice: Get a list of algorithms/tools/concepts in machine learning that you need to know, then try to explain out loud how each thing works. This will bring up a number of questions you don’t know the answer to. Google the questions (even YouTube videos can often be quite good!), try to work out the math yourself, and/or refer to a textbook like ESL (or my personal favorite, Pattern Recognition and Machine Learning by Christopher Bishop). Then repeat (this is known as the Feynman technique). This will not only give you deep understanding fairly quickly, but it will also prepare you for interviews by forcing you to sequentially ask and answer difficult questions.
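
As a small illustration of the "functions of linear combinations of functions of linear combinations" description above, here is a sketch of a tiny two-layer network written as plain nested functions. The weights, the input, and the choice of tanh as the nonlinear function are all arbitrary assumptions for the example; nothing here is trained.

```python
import math


def linear_combination(weights, bias, inputs):
    """Weighted sum of the inputs plus a constant."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def layer(weight_rows, biases, inputs, f):
    """Apply f to several linear combinations of the same inputs."""
    return [f(linear_combination(w, b, inputs)) for w, b in zip(weight_rows, biases)]


# Hypothetical, hand-picked weights (a real network would learn these).
W1 = [[1.0, -1.0], [0.5, 0.5]]  # first layer: two combinations of two inputs
b1 = [0.0, -1.0]
W2 = [[1.0, 2.0]]               # second layer: one combination of the two results
b2 = [0.5]


def network(x):
    hidden = layer(W1, b1, x, math.tanh)           # functions of linear combinations
    return layer(W2, b2, hidden, lambda z: z)[0]   # a linear combination of those


x = [2.0, 1.0]
out = network(x)

# The same computation written out as one nested expression.
by_hand = (
    1.0 * math.tanh(1.0 * 2.0 - 1.0 * 1.0 + 0.0)
    + 2.0 * math.tanh(0.5 * 2.0 + 0.5 * 1.0 - 1.0)
    + 0.5
)
print(out, by_hand)
```

Stacking more calls to `layer` gives a deeper network, and training is the separate question of how to choose the weights; both are good prompts for the explain-it-out-loud exercise.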

 


These are just a few ideas to get you started. Stay tuned for future events from Wayfair Data Science, or check out our jobs page for open positions!

 


