(VIDEO) Data Science Community Interview - Erin Williams
"Choose a job you love, and you will never have to work a day in your life." Embedded in that old saying is the idea that loving your job makes you even more dedicated to it, which can sometimes push you toward perfectionism.
Erin Williams is a living example of that. Every report we get from her shows how passionate she is about being a Data Scientist: she always strives to produce not only the best possible solution to a problem, but also the best-looking and most comprehensive report on her approach.
We were intrigued about what makes her so unique, so we had to call her to find that out.
Hi Erin. Can you tell us a little bit about yourself?
Having worked in the health care industry for a good while now, can you share an example of a project you worked on? What challenges do you see data analytics helping this industry with?
Which one was your favorite CrowdANALYTIX contest? Why?
Any parting thoughts?
The full interview is transcribed below:
CrowdANALYTIX: Hi Erin, can you tell us a bit about yourself?
Erin Williams: My name is Erin Williams and I’m an independent data scientist living and working in Houston, Texas.
When I’m not working I like to go exploring, and I love to travel – last year I lived in Tokyo for 3 months and spent a lot of that time exploring the city on foot. Here in Houston I like to take long walks or bike rides around the city, looking for things I’ve never seen before, seeking out and enjoying nature. I also like learning languages and practicing yoga.
CAX: What took you to Tokyo for three months?
EW: Tokyo was the adventure that I went on in between leaving my last job and starting my new company, so it was an adventure for the sake of adventure and I was there going to an immersive Japanese language school, which maybe wouldn’t be for everyone, but for me it was paradise. I loved the city so much. It was my third time there and I can’t wait to go back.
CAX: What other languages do you speak?
EW: I wouldn’t say that I can speak any of them fluently, but I can say I studied them. Besides Japanese, the one I studied the longest is French. Besides that, I spent some time with Spanish and German and there have been slight dalliances with Italian, Russian, and Arabic.
CAX: Outside of CrowdANALYTIX, how are you involved with data science?
EW: I would say I have a passion for data science which has led me to redesign my professional life - my whole life, really - so that I could follow this passion exclusively. I started my own company in 2014 – called Aruku Analysis – through which I work directly with a variety of clients creating data models, performing research, and consulting on topics in business, healthcare, and of course data science and analytics. Launching a business is a challenge and comes with a lot of risk, but it has been so worth it to be able to do exactly the kind of work I love.
I’ve been working with data for over 10 years now, with most of that time in the healthcare sector. Before starting my company, I worked for 6 and a half years at MD Anderson Cancer Center here in Houston.
I have an MBA from the University of Houston where I was first exposed to and trained in statistics and advanced analytical techniques. That experience - the way I’ve described it before - is that it was like finally learning the grammar for a language I’d been trying to speak for years.
And that shift to being able to fully express an idea - a solution - in terms of the language of analytics? That’s what really drives me. That’s why I’m here.
CAX: What do you consider to be your biggest strength?
EW: I think there’s value in an ability to find the connections between things that may not have obvious outward similarities. There are patterns everywhere, in everything. I like to think I’m sometimes good at recognizing them. Thinking in symbols and analogies is a useful way to open yourself up to these connections, and these can help when you’re trying to design a model to represent some phenomenon in the real world.
CAX: Can you share an example of how you worked with data at MD Anderson Cancer Center?
EW: The team I was a part of at MD Anderson was focused on providing consulting and establishing relationships with other hospitals in the United States and around the world. Part of that work often involved market analysis, research, and forecasting. For one of the forecasts that I worked on, we were working with a system of oncology treatment centers in Brazil that wanted to look at market forecast scenarios over a 10-year time horizon. The challenge there was that a lot of the variables for that area were unavailable, unknown or unreliable. So I built a model that was completely driven by simulation, where each input was represented by a probability distribution of likely values, and the output was also a distribution of forecasted patient volumes. Those volumes could then be used for the financial models, staffing models, facility planning models and all the rest.
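To make the idea of a simulation-driven forecast concrete, here is a minimal Python sketch of the general technique (a Monte Carlo simulation). It is not Erin's actual model: the input names (`population`, `incidence_rate`, `market_share`, `growth`), the choice of distributions, and every number are hypothetical placeholders chosen only for illustration. Each trial draws one value per uncertain input, and the collected results form a distribution of patient volumes for each forecast year.

```python
import random
import statistics


def simulate_patient_volumes(n_trials=10_000, years=10, seed=42):
    """Return one list of simulated patient volumes per forecast year.

    All inputs and their distributions are illustrative assumptions,
    not figures from the interview.
    """
    rng = random.Random(seed)
    volumes_by_year = [[] for _ in range(years)]
    for _ in range(n_trials):
        # Each uncertain input is drawn from a distribution instead of
        # being fixed at a single "best guess" value.
        population = rng.triangular(800_000, 1_200_000, 1_000_000)  # low, high, mode
        incidence_rate = rng.uniform(0.002, 0.004)  # new cases per person per year
        market_share = rng.betavariate(2, 8)        # share of cases treated
        growth = rng.gauss(0.02, 0.01)              # annual population growth
        for year in range(years):
            population_t = population * (1 + growth) ** year
            volumes_by_year[year].append(
                population_t * incidence_rate * market_share
            )
    return volumes_by_year


def summarize(volumes):
    """Describe a simulated distribution by its 10th, 50th and 90th percentiles."""
    deciles = statistics.quantiles(volumes, n=10)  # nine cut points
    return {"p10": deciles[0], "p50": deciles[4], "p90": deciles[8]}


if __name__ == "__main__":
    for year, volumes in enumerate(simulate_patient_volumes(), start=1):
        print(year, summarize(volumes))
```

The output for each year is a range rather than a single number, which is what lets downstream financial or staffing models plan for low, middle and high scenarios.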
CAX: Having worked in health care for a good while now, what challenges do you see data analytics helping the healthcare industry with?
EW: Well, I think data analytics can help any industry, and healthcare is certainly not unique. I think the healthcare industry has been using analytics and data science for a long time on the medical side: in epidemiological studies, genomics, and biological computing. But healthcare is a delivery system. It’s of course in the throes of huge reforms right now and I think it can utilize a lot of the capabilities of advanced analytical techniques to improve the delivery model, to look at outpatient operations, making them more efficient, cost efficient and also improving outcomes. There are all kinds of ways that analytics can be applied to improve operations in healthcare that benefit not only the providers but the patients and the community as a whole.
CAX: What would you advise companies to start doing as far as establishing and growing their analytics capabilities?
EW: I would say that my advice to companies that are moving into the analytics space would be to start small. There are so many options out there right now as far as business intelligence tools, software platforms, big data and machine learning. . . the list goes on. The great thing about the data economy right now is that everything is completely scalable. So, my advice would be for companies to start small. Establish proof of concept by designing and implementing a targeted analytics project to address one particular, well-defined problem, or set of problems, using data that you have now. Remember that the question is sometimes just as important as the answer. And that a so-called ‘small’ data project can have as much impact as a larger one, depending on how it’s designed and how the solution is implemented.
CAX: Which one was your favorite CrowdANALYTIX contest? Why?
EW: My favorite CrowdANALYTIX contest – well, I guess I have two. The first was a contest for the Olive Garden restaurant chain. The goal was to find out why certain restaurant locations were performing better than others. We were given a set of performance metrics and variables, and a set of hypotheses to test. I was also given around 43,000 online Yelp and Google reviews tied to individual restaurant locations. Using these, I was able to combine several methods – multiple linear regression, confirmatory data analytical techniques, and text mining – in order to confirm or reject each of the hypotheses. The most challenging part of this one was figuring out a methodology to mine structured data out of the unstructured online reviews and use those for analysis.
My other favorite contest was more recent - it was this year - and we had to create a forecast for the number and value of four different types of digital wallet transactions for the next 5 years, through 2020. Digital wallet technology is so new right now that there are no starting numbers or information to base a forecast on. So the challenge here was to anchor the forecast to real numbers, and I did that by carefully analyzing reports from the Federal Reserve, the Bureau of Labor Statistics and some other places and then creating all the necessary translation steps along the way, tying in all the various drivers that would affect the forecast from year to year. The final model pulled all of these together on top of a simulation engine so that every input variable had flexibility based on a defined probability. The solution though was really driven as much by the analysis as it was by mathematical formulas.
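The interview does not describe the methodology Erin used to turn reviews into structured data. As one simple illustration of the general idea, the Python sketch below counts how often hypothetical topics are mentioned in each location's reviews. The `TOPICS` keyword lists, the `review_features` function and the sample reviews are all assumptions made up for this example. The resulting per-location rates are the kind of structured variables that could then feed a regression.

```python
import re
from collections import Counter, defaultdict

# Hypothetical topic keywords, chosen only for illustration.
TOPICS = {
    "service": {"service", "server", "waiter", "waitress", "staff"},
    "wait_time": {"wait", "waited", "slow", "line"},
    "food": {"food", "pasta", "salad", "soup", "breadsticks"},
}


def review_features(reviews):
    """Turn (location_id, text) pairs into per-location topic mention rates.

    A rate is the fraction of a location's reviews that mention at least
    one keyword for that topic.
    """
    mentions = defaultdict(Counter)
    totals = Counter()
    for location_id, text in reviews:
        words = set(re.findall(r"[a-z']+", text.lower()))
        totals[location_id] += 1
        for topic, keywords in TOPICS.items():
            if words & keywords:
                mentions[location_id][topic] += 1
    return {
        location_id: {
            topic: mentions[location_id][topic] / totals[location_id]
            for topic in TOPICS
        }
        for location_id in totals
    }


if __name__ == "__main__":
    sample = [
        ("A", "The service was slow and we waited forever."),
        ("A", "Great pasta!"),
        ("B", "Friendly staff."),
    ]
    print(review_features(sample))
```

Keyword matching is the crudest possible approach. It is used here only because it is short and easy to follow, not because it reflects what was done in the contest.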
CAX: Any parting thoughts?
EW: I would say that I’m grateful to have found CrowdANALYTIX because it’s a great way for someone like me - an independent data scientist - to get experience working with not only different kinds of data sets but with different problems in different industries. With a more traditional career or job, you usually get a very deep understanding of a certain kind of data, so this is a great way to expand into all different kinds of data. This will bring out those patterns and connections that you can see across different problems. The market forecast in Brazil and the Digital Wallet Transaction Forecast Model, two models that I’ve done, were based on a similar concept, even though they’re completely different problems. That’s what I love about this kind of work.
Domo arigato, Erin!
If you're a solver on our Data Science Community, get in touch with us through firstname.lastname@example.org. We'd love to chat with you too!