Unlimited Sheets

How to scrape any site with Google Sheets

Google Sheets is a great tool that runs in your web browser and lets you do much of what you would do with Microsoft Excel. It offers many commands and functions that simplify common tasks. Sometimes those functions are not enough, and that's where our add-on comes in.

Unlimited Sheets adds many functions to Google Sheets to make it easier to use in different situations. In this post we'll look at two of the premium functions that are perfect for web scraping: first the basic concepts you need to know, then how to run each function.

What’s “web scraping”?

Web scraping is, put simply, the use of bots to find and extract information, content and data from a website. A scraper reads the page's underlying HTML code and, through it, the data that the site stores in a database and shows on the page. It sounds kind of complicated, and it can be, but you don't need to think too much about it.

Unlimited Sheets has a couple of functions that do something similar to what we just described. You can use these premium functions to scrape parts of a website by selecting elements in one of two ways: with a CSS path or with an XPath.

Executing the function scrapeByCSSPath

The scrapeByCSSPath function lets you scrape any part of the webpage you choose by using its CSS path. As this is a premium function, you will need the credits you get when you subscribe to our service, and every time you use the command you'll spend one credit. To use scrapeByCSSPath properly, follow these steps.

  • Select the cell in which you will write the function.
  • Type an equals sign (=) before anything else, or you won't be able to execute the function.
  • Write the name of the function: scrapeByCSSPath.
  • Add an opening parenthesis.
  • Add the URL that you want to scrape between quotation marks and add a comma.
  • Write “TRUE” to scrape several elements or “FALSE” for just one, and add a comma.
  • Add the CSS path that selects the elements you want to scrape between quotation marks, and add a comma.
  • Add the method used to get the text from the element between quotation marks (the example below uses “textContent”).
  • Add a closing parenthesis.
  • The command should look something like this: =scrapeByCssPath("https://www.seohacks.es", TRUE, "h2", "textContent")
  • To execute the command, press Enter on your keyboard.
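
To make the arguments above more concrete, here is a minimal Python sketch of what a call like the example conceptually does: it finds every h2 element in a page and returns its text, either all matches (TRUE) or only the first (FALSE). This is not the add-on's actual implementation. The function name scrape_by_tag, the class TagTextCollector and the sample HTML are made up for illustration, the sketch works on an inline string instead of a live URL, and it only handles a bare tag name rather than a full CSS path.

```python
from html.parser import HTMLParser

# Hypothetical sample page; a real call would fetch the HTML from a URL.
SAMPLE_HTML = """
<html><body>
  <h1>Blog</h1>
  <h2><a href="/one">First post</a></h2>
  <p>Intro text</p>
  <h2><a href="/two">Second post</a></h2>
</body></html>
"""


class TagTextCollector(HTMLParser):
    """Collect the text content of every element with a given tag name."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.depth = 0      # greater than 0 while inside a matching element
        self.results = []   # one string per matching element

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            if self.depth == 0:
                self.results.append("")
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Text inside nested tags (such as <a>) is part of the text content.
        if self.depth > 0:
            self.results[-1] += data


def scrape_by_tag(html, multiple, tag):
    """Return the text of all matching elements, or only the first one."""
    parser = TagTextCollector(tag)
    parser.feed(html)
    parser.close()
    texts = [text.strip() for text in parser.results]
    return texts if multiple else texts[:1]


print(scrape_by_tag(SAMPLE_HTML, True, "h2"))
print(scrape_by_tag(SAMPLE_HTML, False, "h2"))
```

With the second argument set to True the sketch returns the text of both headings; with False it returns only the first one, which mirrors the TRUE/FALSE choice in the steps above.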

Executing the function scrapeByXpath

Now let's talk about scrapeByXpath, which allows you to scrape any part of a website using its XPath. This is also a premium function, so you will need the credits you get by subscribing to our service. To use scrapeByXpath properly, follow these steps.

  • Choose the cell in the spreadsheet where you will write the function.
  • Type an equals sign (=) first. It is needed to execute the function.
  • Write the name of the function, in this case scrapeByXpath.
  • Add an opening parenthesis.
  • Add the URL of the website you want to scrape between quotation marks, and add a comma.
  • Write “TRUE” to scrape several elements or “FALSE” for just the first one, and add a comma.
  • Type the XPath you want to scrape between quotation marks, and add a comma.
  • Add the method used to get the text from the element between quotation marks.
  • Write the closing parenthesis.
  • The command should look something like this: =scrapeByXPath("https://www.seohacks.es", TRUE, "//h2/a", "getText")
  • Press Enter on your keyboard to execute the command.
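
To see what an XPath such as //h2/a selects, here is a minimal Python sketch that uses the standard library's xml.etree.ElementTree, which supports a limited subset of XPath. It is only an illustration and not how the add-on works internally. The function name scrape_by_xpath and the sample markup are assumptions, and the sketch parses an inline, well-formed snippet rather than a live page.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup; it must be well-formed XML for ElementTree.
SAMPLE_XHTML = """<html><body>
  <h2><a href="/one">First post</a></h2>
  <p><a href="/about">About</a></p>
  <h2><a href="/two">Second post</a></h2>
</body></html>"""


def scrape_by_xpath(markup, multiple, xpath):
    """Return the text of elements matching the XPath, all or only the first."""
    root = ET.fromstring(markup)
    # ElementTree rejects absolute paths on an element, so "//h2/a"
    # is rewritten as the relative path ".//h2/a".
    path = "." + xpath if xpath.startswith("//") else xpath
    matches = root.findall(path)
    # Join all text nodes inside each match, similar to reading its text.
    texts = ["".join(element.itertext()).strip() for element in matches]
    return texts if multiple else texts[:1]


print(scrape_by_xpath(SAMPLE_XHTML, True, "//h2/a"))
print(scrape_by_xpath(SAMPLE_XHTML, False, "//h2/a"))
```

Note that the link inside the paragraph is not returned, because //h2/a only matches a elements that are direct children of an h2.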


About the author

Nacho Mascort


My name is Nacho Mascort and I'm an SEO Manager (SEO + Product + Dev) doing some cool stuff at Softonic International.
