Vettd.ai college interns are gaining a unique perspective on the difference between classroom learning and real-world experience. They are being challenged to use open-source and low-cost software components to create prototype applications that complement or compete with alternative approaches to natural language processing (NLP). One project in particular focused on scanning resumes for keywords, using NLP techniques that help reveal the terminology professionals most frequently use to describe their work history.
Two Vettd.ai interns, Logan Knapp and Kelly Chan, are just beginning their college careers at the University of Washington and Yale, respectively. Logan developed the user interface and back end. Kelly and another intern, Kriti Garg, who will be a sophomore at the University of California, Irvine, gathered and prepared resume data.
Kriti and Kelly are using open-source, pretrained models for NLP that separate significant and insignificant words from one another.
“Currently, we have been cleaning resume data by removing any special symbols or common, unnecessary words (e.g., a, the, and, of),” Kriti said.
“The application helps find trends in people they hire,” she said. “For example, based on the resume data uploaded, you can discover a common skill or keyword valued in hired employees and then search for that skill or keyword on the resumes of potential employees.”
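The cleaning and keyword-search steps Kriti describes can be illustrated with a minimal Python sketch. It is not Vettd's or the interns' actual code: the stop-word list, the function names and the sample resumes are all assumptions made for illustration, and the interns' use of pretrained models is replaced here by a simple hand-written word list.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the article names only "a, the, and, of".
STOP_WORDS = {"a", "an", "the", "and", "of", "in", "to", "for", "with", "on"}


def clean_resume(text):
    """Lowercase the text, replace special symbols with spaces, drop stop words."""
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return [word for word in text.split() if word not in STOP_WORDS]


def common_keywords(resumes, top_n=5):
    """Return the keywords that appear in the most resumes, with their counts."""
    counts = Counter()
    for text in resumes:
        # Use a set so each resume counts a keyword at most once.
        counts.update(set(clean_resume(text)))
    return counts.most_common(top_n)


def resumes_mentioning(keyword, resumes):
    """Return the indexes of the resumes that contain the keyword."""
    return [i for i, text in enumerate(resumes)
            if keyword.lower() in clean_resume(text)]


# Made-up sample data, for illustration only.
hired = ["Sales lead", "Enterprise sales", "Engineer"]
candidates = ["Managed the sales team & closed deals!", "Backend engineer"]

print(common_keywords(hired, top_n=1))
print(resumes_mentioning("sales", candidates))
```

Counting each keyword once per resume, rather than once per occurrence, keeps a single long resume from dominating the result.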
The test application is being used to demonstrate the limitations of NLP that evaluates only the keywords commonly associated with various types of jobs. The problem is that most NLP software can’t distinguish, for example, between sales positions such as enterprise and midmarket, because the same keywords are used by many types of sales professionals.
In contrast, Vettd’s NLP technology learns the roles that words play in a sentence, a paragraph and a document such as a resume. It understands how a group of people express themselves and how this group is similar to or different from other groups.
As Vettd’s NLP technology makes determinations, it can be optimized by expert human decision makers.
In addition to giving Vettd a convincing sales tool, said Vettd CEO and co-founder Andrew Buhrmann, the project is helping the interns understand how different academic exercises are from commercial ones.
All three interns will continue their work on the project as they start the college year. Kelly and Kriti, who both attended Interlake High School in Bellevue, Washington, are learning remotely, while Logan just started his freshman year on campus. They all agree that their work on the project has been invaluable.
“In school, the goal is to solve a problem in as few lines of code as possible or as efficiently as possible,” Logan said. “This often means doing the bare minimum so that you don’t unnecessarily use resources.
“In the real world, it’s often better to start wide. There have been a number of times that I decided ‘you know, it may be a good idea to save this piece of data or implement something, even though I don’t need it now,’ and that’s more often than not been a good decision.
“In short, know that your projects’ stakeholders (that includes you) may have different expectations over time, and it’s better to be prepared for that than to have the most efficient solution of what is expected now.”
Several times, additional functionality was requested, which put the NLP work on hold. Kriti said that her professors’ emphasis on writing clean, well-documented code played an important role in allowing her and Kelly to get back up to speed quickly.
“Something that often came in handy was my prior experience with documentation because it helped us return to code after long breaks and understand the code’s purpose quickly,” Kriti said. “Many of my professors heavily emphasized the ability to write clean code and documentation so I often try to write code that makes sense while also including documentation so anybody can understand my code.”
In contrast, Logan, who attended Skyline High School in Sammamish, Washington, learned a lot through trial and error. His first challenge was figuring out how to transfer the data, which could not be sent all at once because each request had to be processed within a 30-second window. He finally realized that the only way to do it was to send each of the 2,000 to 3,000 files separately.
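The one-file-per-request approach can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Logan's implementation: the `send` callable stands in for whatever performs a single upload request, and the retry count is an invented detail the article does not mention.

```python
def upload_one_at_a_time(paths, send, max_attempts=3):
    """Send each file in its own request; return the paths that never succeeded.

    `send` is a hypothetical callable that uploads one file and raises
    OSError (which includes TimeoutError) when the request fails.
    """
    failed = []
    for path in paths:
        for _ in range(max_attempts):
            try:
                send(path)  # one small request per file
                break
            except OSError:
                continue  # try the same file again
        else:
            # Runs only when no attempt succeeded for this file.
            failed.append(path)
    return failed
```

Because every request carries a single file, a slow or failed upload affects only that file, and the batch can be resumed from the list of failures instead of starting over.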
“Outside of what I learned as a developer and software engineer in general, this project gave us a whole new understanding of what it means to work with data,” Logan said. “Typically, when you think data, you think numbers. But this project has definitely convinced me that general and business applications of data are moving into a mostly uncharted realm of language data, and it was really exciting to work on the leading edge of this.”
Kelly said the project has been a valuable work experience she doesn’t expect to get at Yale.
“I think when first learning about AI, it was interesting to see how AI was completely different from the science-fiction type perception that I had of it,” she said. “I was also initially surprised by the amount of industries AI can help in as well as its amount of uses.
“Working with real customer data and communicating with a team in an established business is experience that can’t be fully learned until one actually goes through it.”