Job Qualification Matching and Lexical Analysis

Since the focus of this article is on lexical analysis, I’ll just quickly go through how I got the role qualifications.

My plan was to take each qualification from each job posting and compare it to the qualifications of every other job posting I pulled. I wanted to see which qualifications were unique to each role and which overlapped with other roles.

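I used two measures. Levenshtein distance counts the single-character edits needed to turn one string into another, and Jaccard distance is 1 minus the share of elements that two sets have in common. For both, numbers closer to 0 mean the two sentences are more closely related. I also filtered out the most common words (stopwords) so that keywords make up a larger share of each comparison.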
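As a rough illustration of the two measures, here is a minimal sketch in plain Python. The stopword list, the `tokenize` helper, and the choice to compute Jaccard over word sets are all assumptions made for this example; the scores quoted in this article may have been computed differently (for instance over characters), so this sketch is not expected to reproduce them.

```python
import re

# A tiny illustrative stopword list (assumption); a real run would use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "to", "of", "for", "in", "on", "with", "is", "are"}


def tokenize(sentence):
    """Lowercase the sentence, split it into words, and drop stopwords."""
    words = re.findall(r"[a-z0-9']+", sentence.lower())
    return [word for word in words if word not in STOPWORDS]


def levenshtein(a, b):
    """Number of insertions, deletions, and substitutions to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def jaccard_distance(tokens_a, tokens_b):
    """1 - |A intersect B| / |A union B| over the two token sets."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    union = set_a | set_b
    if not union:
        return 0.0
    return 1.0 - len(set_a & set_b) / len(union)
```

Two sentences could then be scored with `levenshtein(" ".join(tokenize(s1)), " ".join(tokenize(s2)))` and `jaccard_distance(tokenize(s1), tokenize(s2))`.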

The idea was to produce a result like this:

Finally, I iterated through each qualification in each role and compared them with all qualifications in every other role:
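The loop described above can be sketched as follows. The role names (`Role A` and so on), the `compare_roles` helper, and the word-level Jaccard used as the default distance are hypothetical stand-ins for this example, not the original code; the qualification strings are borrowed from the sample output in this article.

```python
import re
from itertools import combinations


def word_jaccard_distance(a, b):
    """Jaccard distance over lowercase word sets (an assumption for this sketch)."""
    set_a = set(re.findall(r"[a-z0-9]+", a.lower()))
    set_b = set(re.findall(r"[a-z0-9]+", b.lower()))
    union = set_a | set_b
    if not union:
        return 0.0
    return 1.0 - len(set_a & set_b) / len(union)


def compare_roles(roles, distance=word_jaccard_distance):
    """Yield (role_a, qual_a, role_b, qual_b, score) for every cross-role pair."""
    # combinations() visits each pair of roles once, so no role is compared
    # with itself and no pair is compared twice.
    for (role_a, quals_a), (role_b, quals_b) in combinations(roles.items(), 2):
        for qual_a in quals_a:
            for qual_b in quals_b:
                yield role_a, qual_a, role_b, qual_b, distance(qual_a, qual_b)


# Hypothetical role names mapped to qualifications.
sample_roles = {
    "Role A": [
        "You have working understanding of project management tools and methodologies.",
        "A keen ability to filter and distill substantial information for the right audience.",
    ],
    "Role B": [
        "Ability to filter and distill meaningful information for the right audience",
    ],
    "Role C": [
        "Ability to participate in and facilitate requirements brainstorming sessions.",
    ],
}

for role_a, qual_a, role_b, qual_b, score in compare_roles(sample_roles):
    print(f"Sentence 1: {qual_a}\nSentence 2: {qual_b}\nJaccard: {score:.3f}\n")
```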

Sentence 1: You have working understanding of project management tools and methodologies.
Sentence 2: Solid experience in both product management and product marketing. Education apps experience helpful.
Levenshtein: 64, Jaccard: 0.269

Sentence 1: A keen ability to filter and distill substantial information for the right audience.
Sentence 2: Ability to participate in and facilitate requirements brainstorming sessions.
Levenshtein: 53, Jaccard: 0.208

Okay, I guess this is getting closer.

Sentence 1: A keen ability to filter and distill substantial information for the right audience.
Sentence 2: Ability to filter and distill meaningful information for the right audience
Levenshtein: 18, Jaccard: 0.0909 👍

Here we go! Two qualifications from different job postings that are not exactly the same, but incredibly similar.

Rather than looking through the entire 2500-line output, I set the comparisons to only print if the Levenshtein score was below 30 or the Jaccard score was below 0.15.
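A filter along these lines might look like the sketch below. The function and constant names are hypothetical. It assumes that passing either threshold is enough to print a pair, because the sample output that follows includes a pair with a Levenshtein score of 122.

```python
LEVENSHTEIN_MAX = 30  # print pairs whose Levenshtein score is below this
JACCARD_MAX = 0.15    # ...or whose Jaccard score is below this


def is_worth_printing(levenshtein_score, jaccard_score):
    """Return True when a pair clears at least one of the two thresholds."""
    return levenshtein_score < LEVENSHTEIN_MAX or jaccard_score < JACCARD_MAX


# Scores taken from the sample output in this article.
scored_pairs = [(64, 0.269), (53, 0.208), (18, 0.0909), (122, 0.142), (2, 0.037)]
kept = [pair for pair in scored_pairs if is_worth_printing(*pair)]
print(kept)
```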

Sentence 1: Ability to communicate thoughtfully, leveraging problem-solving skills and a learning mindset to build long-term relationships
Sentence 2: Goal oriented, highly motivated and able to work under minimal supervision in a cross-functional environment at detailed levels whilst taking account of interdependencies at higher levels.
Levenshtein: 122, Jaccard: 0.142

Sentence 1: Goal oriented, highly motivated and able to work under minimal supervision in a cross-functional environment at detailed levels whilst taking account of interdependencies at higher levels
Sentence 2: Goal oriented, highly motivated and able to work under minimal supervision in a cross-functional environment at detailed levels whilst taking account of interdependencies at higher levels.
Levenshtein: 2, Jaccard: 0.037 👍

So it appears that anything with a Levenshtein distance < 20 or a Jaccard distance < 0.1 is similar enough to be the same qualification. With that in mind, let’s graph it.

I figured this would be pretty simple to plug into a library that can spit out a nice-looking network graph. However, it wasn’t quite so simple.

A simple graph with two nodes (A, B) connected by an edge (A-B) can be created like this:
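The graphing library is not named in this text, so the sketch below uses a small hand-rolled `Graph` class as a stand-in. The class and its method names are assumptions for illustration; most graph libraries expose similar "add node" and "add edge" calls.

```python
from collections import defaultdict


class Graph:
    """A minimal undirected graph: nodes with attributes plus an adjacency map."""

    def __init__(self):
        self.nodes = {}                # node id -> attribute dict
        self.edges = defaultdict(set)  # node id -> set of neighbor ids

    def add_node(self, node_id, **attrs):
        self.nodes.setdefault(node_id, {}).update(attrs)

    def add_edge(self, a, b):
        # Make sure both endpoints exist, then record the link in both directions.
        self.add_node(a)
        self.add_node(b)
        self.edges[a].add(b)
        self.edges[b].add(a)

    def has_edge(self, a, b):
        return b in self.edges.get(a, set())


graph = Graph()
graph.add_node("A")
graph.add_node("B")
graph.add_edge("A", "B")
```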

So, I represented roles and qualifications as nodes, and the connections between them as edges. Each role would have a few qualifications connected to it:

Now, if qualifications from different roles were exactly the same or similar (had a Jaccard distance < 0.1), I’d want to link them together:
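One way to express both kinds of edges (role to qualification, and qualification to similar qualification) is sketched below. The `build_graph` helper, the role names, and the word-level Jaccard are assumptions for this example; only the 0.1 threshold comes from the text. The qualification text doubles as the node ID here, so identical qualifications in two roles collapse into one shared node.

```python
import re


def word_jaccard_distance(a, b):
    """Jaccard distance over lowercase word sets (an assumption for this sketch)."""
    set_a = set(re.findall(r"[a-z0-9]+", a.lower()))
    set_b = set(re.findall(r"[a-z0-9]+", b.lower()))
    union = set_a | set_b
    if not union:
        return 0.0
    return 1.0 - len(set_a & set_b) / len(union)


def build_graph(roles, threshold=0.1):
    """Return (nodes, edges); edges are frozensets so A-B equals B-A."""
    nodes, edges = set(), set()
    # Role -> qualification edges.
    for role, quals in roles.items():
        nodes.add(role)
        for qual in quals:
            nodes.add(qual)
            edges.add(frozenset((role, qual)))
    # Qualification <-> qualification edges across different roles.
    role_names = list(roles)
    for i, role_a in enumerate(role_names):
        for role_b in role_names[i + 1:]:
            for qual_a in roles[role_a]:
                for qual_b in roles[role_b]:
                    # Identical text is already a single shared node.
                    if qual_a != qual_b and word_jaccard_distance(qual_a, qual_b) < threshold:
                        edges.add(frozenset((qual_a, qual_b)))
    return nodes, edges


# Hypothetical roles and qualifications.
demo_roles = {
    "Role A": ["Ability to filter and distill information", "Experience with education apps"],
    "Role B": ["Ability to filter and distill information.", "Project management tools"],
}
demo_nodes, demo_edges = build_graph(demo_roles)
```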

To do the actual comparison:

Now, you may be wondering why I added nodes as I did the comparison instead of generating all nodes (roles, qualifications) and edges (role𝑥←→qualification𝑦) first, then comparing and linking the similar qualification nodes. The issue was that if two or more qualifications from different roles were exactly the same, and similar qualifications were found, I’d only be able to add edges to the first one in the node list unless I also added unique IDs to the nodes. I implemented it this way out of laziness, since I wanted to see results quickly:
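For comparison, the unique-ID alternative mentioned above could be sketched like this. It is not what was implemented here; the ID scheme (`role:<name>/qual:<index>`) and the helper name are hypothetical. Each qualification gets its own node ID and keeps its text as a label, so identical text in two roles no longer collapses into one node.

```python
def build_nodes_with_ids(roles):
    """Give every qualification its own node ID, even when the text repeats."""
    nodes = {}  # node id -> label shown to the reader
    edges = []  # (role id, qualification id) pairs
    for role, quals in roles.items():
        role_id = f"role:{role}"
        nodes[role_id] = role
        for index, text in enumerate(quals):
            qual_id = f"{role_id}/qual:{index}"
            nodes[qual_id] = text
            edges.append((role_id, qual_id))
    return nodes, edges


# Two hypothetical roles that share an identical qualification.
shared = "Goal oriented, highly motivated and able to work under minimal supervision"
id_nodes, id_edges = build_nodes_with_ids({"Role A": [shared], "Role B": [shared]})
```

With text-as-ID, the two roles above would share a single qualification node; with unique IDs there are two, and a later pass can link them explicitly.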

As expected, similar qualifications are linked together and it’s easy to see the overlap between roles. Here’s a sample of a triangle of roles with overlapping qualifications. Mousing over each node reveals the full qualification:

Some roles have even more overlap of qualifications. This is a bit more difficult to read since I haven’t figured out a way to group the nodes, or to prevent lines from crossing each other (using this type of graph):

Of course, the graph’s legibility could still be improved to make the connections easier to follow. But it’s easy to see that some roles have more overlap than others. At the same time, some qualifications say so much that they might as well be describing every role, like this: “Goal oriented, highly motivated and able to work under minimal supervision in a cross-functional environment at detailed levels whilst taking account of interdependencies at higher levels”.

Still, it’s useful to see if some roles more heavily overlap with others. One could apply to groups of similar roles without modifying their resume and/or cover letter, increasing application efficiency.

There are more important things one can do to improve their chances of landing a role. However, this was still an interesting learning experience.
