In our last article we learned that we need to gather data for our AI to process. We looked through Teo's and Mark's profiles to get an idea of where our data could come from, and we found several useful sections with professional information: About, Articles & activity, and Skills & Endorsements.
Now, let's start writing some simple code. We'll be using the Python library BeautifulSoup. First, we need to install it and import it into our project.
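Installation is a one-liner with pip. Here's a minimal sketch of parsing HTML into a soup object; the HTML string below is just a stand-in for a real page:

```python
# Install the library first (from your terminal):
#   pip install beautifulsoup4

from bs4 import BeautifulSoup

# A tiny stand-in for real page HTML, just to show the flow.
html = "<html><body><h1>Hello, soup!</h1></body></html>"

# Parse the raw HTML into a searchable tree.
soup = BeautifulSoup(html, "html.parser")

print(soup.h1.text)  # -> Hello, soup!
```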
Now let's go to Teo's LinkedIn profile page, right-click it, and choose Inspect.
In the Developer Tools we can see the page's full HTML. Let's copy it into a file named index.html in our project. Hitting Cmd+F (Ctrl+F on Windows/Linux) lets us search for keywords: searching for 'Teo Deleanu' shows us exactly where that text lives in the code.
Now let's use some basic Python to write our scraper. We'll rely on BeautifulSoup's ability to pull data out by class name.
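A minimal sketch of that lookup, wrapped in a try-except. The class name `pv-top-card-section__name` is only an example of the kind of thing you'd copy out of Dev Tools, not a guaranteed LinkedIn class, and the HTML string stands in for the saved index.html:

```python
from bs4 import BeautifulSoup

# Stand-in for the saved index.html; the class name is whatever you
# copied from Dev Tools -- this one is just an illustrative example.
html = '<h1 class="pv-top-card-section__name">Teo Deleanu</h1>'
soup = BeautifulSoup(html, "html.parser")

try:
    # .find() returns None when the class is absent, and calling
    # .get_text() on None raises AttributeError -- hence the try-except.
    name = soup.find(class_="pv-top-card-section__name").get_text(strip=True)
except AttributeError:
    name = "Could not find the name section"

print(name)  # -> Teo Deleanu
```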
This tiny piece of code prints Teo's name to the console. Note the try-except block: if a section is missing or empty, the lookup fails and the script would crash, so instead we fall back to a simple message. We can do the same for the other sections.
For the Occupation section:
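The same pattern with a different class name (again, `pv-top-card-section__headline` is a hypothetical example; use whatever you find in your own Inspect session):

```python
from bs4 import BeautifulSoup

# Stand-in HTML; the headline class is an illustrative guess.
html = '<h2 class="pv-top-card-section__headline">Software Engineer</h2>'
soup = BeautifulSoup(html, "html.parser")

try:
    occupation = soup.find(class_="pv-top-card-section__headline").get_text(strip=True)
except AttributeError:
    occupation = "Could not find the occupation section"

print(occupation)  # -> Software Engineer
```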
For the About section:
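Again the same idea; `pv-about__summary-text` is a placeholder class name for whatever the About section uses in your copy of the page:

```python
from bs4 import BeautifulSoup

# Stand-in HTML; the About class name is an illustrative guess.
html = '<p class="pv-about__summary-text">Building things with Python.</p>'
soup = BeautifulSoup(html, "html.parser")

try:
    about = soup.find(class_="pv-about__summary-text").get_text(strip=True)
except AttributeError:
    about = "Could not find the about section"

print(about)  # -> Building things with Python.
```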
You get the picture. Now the last two sections:
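Articles & activity works like the sections above, while Skills & Endorsements holds many entries, so `find_all` fits better there. All class names below are illustrative stand-ins, as before:

```python
from bs4 import BeautifulSoup

# Stand-in HTML with hypothetical class names for the two sections.
html = """
<div class="pv-recent-activity-section__title">Scraping 101</div>
<span class="pv-skill-category-entity__name">Python</span>
<span class="pv-skill-category-entity__name">Machine Learning</span>
"""
soup = BeautifulSoup(html, "html.parser")

try:
    activity = soup.find(class_="pv-recent-activity-section__title").get_text(strip=True)
except AttributeError:
    activity = "Could not find the activity section"

# find_all returns an empty list (not None) when nothing matches,
# so a simple fallback check replaces the try-except here.
skills = [s.get_text(strip=True)
          for s in soup.find_all(class_="pv-skill-category-entity__name")]
if not skills:
    skills = ["Could not find the skills section"]

print(activity)  # -> Scraping 101
print(skills)    # -> ['Python', 'Machine Learning']
```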
That's enough information. Now when we run the script we get this in the console:
I know, it's a lot of tangled text, but we got the information our bot needs to read through. With that done, what's the next step?
Well, when we go back to his LinkedIn profile page and open up Dev Tools again, we see this:
The class names have changed. LinkedIn is doing a great job of protecting its users' profiles.
The best way forward is to build an ML algorithm for gathering the data. This is what will power the Skell.io engine.