One weekend, I was looking at hiking apps to see if they could recommend a hike based on what I felt like doing or on an image of the type of scenery I wanted to see. As an avid hiker, data enthusiast, and constant innovator, I started to wonder – what if we could leverage machine learning to recommend a hike? What if I could translate the typical questions I receive into an ML-powered hiking application:
“Hey Arjun – I feel like seeing lots of trees and birds. What hike should I go on?”
Or “I have family visiting this weekend. What trail should I take them to? I recently did this hike (shows an image of a hike on their phone...)”
I quickly realized that building an AI application like this would be a major undertaking – gathering data from actual hikes, tagging them with reviews, building complex Natural Language Processing (NLP) and computer vision models, operationalizing the model, and finally making recommendations accessible through an end user interface – but I really wanted to try it out.
The How – Data, Modeling, And Predictions
Here’s how I went from reviewing some sample images on my phone, to building and testing various cutting-edge machine learning techniques, to developing an AI app that recommends hiking trails, all in less than a week!
Data
I had more than 200 scenic images from my favorite hikes and trails over my last four years in the Bay Area! Some of my favorites are Gray Whale Cove Trail in Montara, Golden Gate Park in San Francisco, or just going for a walk by the Embarcadero. Additionally, I tagged the trips with reviews: “I liked this trail a lot, lots of trees and birds to see.” Each review added a piece of information that would help users make a wise choice (without my help in the future).
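To make the data step concrete, here is a minimal sketch of how photos and reviews could be combined into one training table, with one row per photo. The column names (`image`, `review`, `trail`), the base64 encoding of each image, and the helper `build_training_csv` are assumptions for illustration, not the layout the actual project used.

```python
import base64
import csv
import io


def build_training_csv(records):
    """Return CSV text with one row per photo.

    `records` is an iterable of (image_bytes, review_text, trail_name).
    Column names are hypothetical; adapt them to your modeling tool.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["image", "review", "trail"])
    for image_bytes, review, trail in records:
        # Base64 lets binary image data travel inside a text column.
        encoded = base64.b64encode(image_bytes).decode("ascii")
        writer.writerow([encoded, review, trail])
    return buffer.getvalue()


# Placeholder bytes stand in for real photo files, and the second
# review is made up for this sketch.
sample = [
    (
        b"fake-jpeg-bytes-1",
        "I liked this trail a lot, lots of trees and birds to see.",
        "Gray Whale Cove Trail",
    ),
    (b"fake-jpeg-bytes-2", "Easy walk with wide open views.", "Embarcadero"),
]
training_csv = build_training_csv(sample)
```

The trail name acts as the target the model learns to predict, while the image and the review are the two inputs.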
Modeling
As a data scientist, I had extensive experience fitting and training the more common classification and regression algorithms on tabular data, but I had never developed a reliable, accurate model that integrated computer vision and natural language processing – simply because of the time required to iterate on such approaches, the amount of data required to get started, and the skills required to accomplish such a task! So how did I test and iterate on hundreds of the most advanced techniques to develop a reliable model in a few hours? DataRobot AutoML!
Quick note if you are new to ML: In this case, machine learning is helping us learn patterns from historical hiking trips to accurately and reliably recommend which hike would be a good fit!
Deployment
Deployment can often be the hardest stage in a project. As a data scientist, you simply expect that once you build some reliable experiments and share insights, your work is done – now your IT or MLOps friends can deploy the code (or operationalize the model). However, since this was a weekend project, I had to be my own MLOps team and get my model live and ready to serve real-time predictions for my users (friends).
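Serving real-time predictions usually comes down to sending the deployed model an HTTP request with the user's inputs. The sketch below only builds such a request and does not send it. The URL, the bearer-token header, and the `image` and `review` field names are placeholders, not the project's actual values, so check your own deployment's documentation for the real ones.

```python
import base64
import json
import urllib.request


def build_prediction_request(url, api_token, image_bytes, review):
    """Build (but do not send) a JSON POST request for one prediction row."""
    row = {
        # Images are sent as base64 text so they fit in a JSON body.
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "review": review,
    }
    body = json.dumps([row]).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",
        },
    )


request = build_prediction_request(
    "https://example.com/predictions",  # placeholder URL
    "MY_API_TOKEN",  # placeholder credential
    b"fake-jpeg-bytes",  # stands in for a real photo
    "I feel like seeing lots of trees and birds.",
)
# To actually send it: urllib.request.urlopen(request)
```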
The What – Making Recommendations and Building Apps
Why does the end-to-end process involve so many stages? As a data scientist, you already have to prepare the data, build and validate models, and educate different audiences on what you are trying to solve, and now you are supposed to build a front-end AI app too? Fortunately, with some open source Streamlit app examples and snippets, I could easily integrate the prediction API code for my deployed model and test my idea and solution vision!
For fellow coders – feel free to check out the app and modeling code here: GitHub Repository
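Whatever front end is used, the app has to turn the model's output into a short list a person can act on. Here is a minimal sketch, assuming the prediction response has already been parsed into a dictionary that maps trail names to probabilities. The function name `top_trails` and the example scores are hypothetical.

```python
def top_trails(probabilities, k=3):
    """Return the k most likely trails as (name, probability) pairs, best first."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Made-up scores for illustration only; real values come from the model.
example_scores = {
    "Gray Whale Cove Trail": 0.62,
    "Golden Gate Park": 0.27,
    "Embarcadero": 0.11,
}
recommendations = top_trails(example_scores, k=2)
```

In a Streamlit app, a list like `recommendations` could then be displayed with a call such as `st.table` or `st.write`.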
For now, I’ve built my model for nine hiking trails. But the project is really motivating me to revisit old hikes to take more pictures (and generate more data). I recently got my hands on the book Best Hikes in the Bay Area, and there are many more hiking trails that I can’t wait to explore and take pictures of! And the best part of this project is that the model can be retrained, updated, and made fully functional within a couple of hours on a weekend. I’ll be continuing to share my hiking passion with my friends and family through my stories and AI solutions!
If you have feedback or questions about the data, process, or my favorite hiking trails – feel free to reach out! I’m always looking to make incremental improvements! Comment if you would like to see a system set up for adding data or would like to contribute to the app!
I’m thankful to my fellow hiking friend and colleague, Austin Chou, for helping with data collection and modeling activities!