How web-scraping can streamline your business processes

Photo by Ilya Pavlov / Unsplash

Web scraping can be a good alternative when your IT systems lack a good API or when upgrading a legacy system is too costly.


Web scraping is mainly known for extracting data from websites and running test suites for developers. But if you are a bit creative, you can also use it to automate business processes.

Using an API (Application Programming Interface) is by far the best choice if one is available to you. But in some cases, this is not an option.

As a business owner, you might find yourself in a situation where you want to streamline a particular business process, but something stands in the way: the system has no API, getting access to the API costs too much, or upgrading to a newer version of the software is too expensive.

Legacy IT systems can be a pain in the ass sometimes. But it's the reality and something that the majority of businesses have to deal with.

Even greenfield startup companies frequently have to deal with some third-party legacy system integration. Or rather, the lack of proper integration.

What about RPA - Robotic process automation?

Now, let's keep this short. RPA is a good alternative if you have money to throw at the problem. We'll discuss RPA and our experiences in a later blog post.

For now, note that RPA can be an excellent alternative to scraping and, in some cases, far better suited and more powerful.

But this does not mean that you should always reach for RPA whenever you lack an API. That's about as stupid as always reaching for the hammer in your toolbelt when what you needed was a saw.

Short case study

Recently we helped a company shave 7 - 10 minutes off manually punching every new customer into their IT system. To put this in context, they had several hundred new customers every month.

The numbers may seem small at a glance, but they compound fast. We saved this company about 450 hours a year. This equals a labor cost of NOK 141 000 ($16 442).

Automating this process took us a few days, and the company's first year's savings were approximately NOK 100 000 ($11 660).
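
To see how quickly the minutes add up, here is a small back-of-the-envelope sketch in Ruby. The customer count (300 per month) and the time saved per customer (7.5 minutes) are assumptions picked from within the ranges mentioned above, not the company's exact figures. Only the NOK 141 000 labor cost comes from the case itself.

```ruby
# Back-of-the-envelope estimate of the yearly time savings.
# ASSUMPTION: 300 new customers per month ("several hundred" in the case).
# ASSUMPTION: 7.5 minutes saved per customer (the case says 7 - 10 minutes).
CUSTOMERS_PER_MONTH = 300
MINUTES_SAVED_PER_CUSTOMER = 7.5
LABOR_COST_NOK = 141_000 # yearly labor cost quoted in the case study

# Converts a monthly customer count and minutes saved per customer
# into hours saved per year.
def hours_saved_per_year(customers_per_month, minutes_per_customer)
  customers_per_month * 12 * minutes_per_customer / 60.0
end

hours = hours_saved_per_year(CUSTOMERS_PER_MONTH, MINUTES_SAVED_PER_CUSTOMER)
hourly_rate = LABOR_COST_NOK / hours

puts "Hours saved per year: #{hours.round}"
puts "Implied hourly labor cost: NOK #{hourly_rate.round}"
```

With these assumed inputs, the estimate lands on roughly the 450 hours quoted above. Swap in your own numbers to get a feel for what a similar automation could be worth to you.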

Alright - I'm ready to start automating.

Before we dive headfirst into some example code, you'll need to figure out if web scraping will be able to solve your problems in the first place.

In general, if the software you deal with is accessed through a browser, you'll most likely be fine. This could be CRM software or other IT systems.

And to be clear, web scraping can easily be combined with APIs to gain more flexibility.

If you find yourself stuck, contact us. We'll either point you in the right direction or offer you a review.

Code examples of web scraping

Alright, time to show you some quick code examples. You can use whatever coding language you prefer; in my case, it's Ruby. You can also use JavaScript, Python, or almost any other popular programming language.

Now there is a reason for picking Ruby in this example; it almost reads like regular English. I'll explain each line of code.

Example code:

require 'watir'

browser = Watir::Browser.new

browser.goto 'watir.com'
browser.link(text: 'Guides').click

puts browser.title
# => 'Guides – Watir Project'
browser.close

require 'watir'

This first line of code shows that we use a web scraping library called Watir. This open-source Ruby library is primarily designed for automated testing, but I have used it for a vast variety of automation tasks through the years.

Watir interacts with a browser the same way people do: clicking links, filling out forms and validating text.

browser = Watir::Browser.new

The next part of the code tells the browser to start. Think of it like opening Chrome, Safari, or Firefox yourself. You can keep the browser window visible while the script runs, which is both fun and valuable in many cases.

With the window visible, you can watch every action as it happens.


browser.goto 'watir.com'

This part tells the browser to navigate to the URL "watir.com", just like if you entered this URL into the address field yourself.

Now, imagine the URL we entered is pointing to some internal IT system that you can access via your browser.


browser.link(text: 'Guides').click

This line finds a link on the webpage that we just navigated to containing the text 'Guides' and then clicks it.

This works much like you visually scanning a page, looking for a link containing the word 'Guides', and then clicking on it if you find it.


The rest of the script is just printing out the page's title and then closing the browser when it's finished.
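
The same building blocks carry over to tasks like the customer registration from the case study. Below is a minimal sketch of what filling in a form could look like. The field names ('name', 'email'), the 'Save' button label, and the URL in the usage comment are all hypothetical, so you would need to replace them with whatever your own system uses. It illustrates the idea and is not the code we wrote for that customer.

```ruby
# Fills in and submits a "new customer" form.
# `browser` is expected to be a Watir::Browser already on the form page.
# ASSUMPTION: the form has text fields named 'name' and 'email' and a
# button labelled 'Save'. These are hypothetical; adjust to your system.
def register_customer(browser, customer)
  browser.text_field(name: 'name').set(customer[:name])
  browser.text_field(name: 'email').set(customer[:email])
  browser.button(text: 'Save').click
end

# Usage with a real browser (needs Ruby, Watir, and a browser installed):
#
#   require 'watir'
#
#   browser = Watir::Browser.new
#   browser.goto 'crm.example.com/customers/new' # hypothetical URL
#   register_customer(browser, { name: 'Ada Lovelace', email: 'ada@example.com' })
#   browser.close
```

Passing the browser in as an argument keeps the method small and lets you reuse one open browser session for a whole list of customers.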

To run a script like this, you'll need a couple of things: both Ruby and Watir have to be installed on your computer.

You'll find more information on what's needed here.

Conclusion

Utilizing web scraping to automate business processes is absolutely an alternative to more expensive solutions like RPA.

Web scraping libraries are open source and free to use. So your investment is mainly limited to the development time.

If you want to discuss the possibilities for automating a process in your company, please contact us.