I want to scrape data from https://csfloat.com/search?def_index=4727 and I'm using Puppeteer.
const puppeteer = require("puppeteer");

(async () => {
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();
  await page.goto("https://csfloat.com/search?def_index=4727");
  // Collect the text of every price element inside the page context
  const arr = await page.evaluate(() => {
    const text = document.getElementsByClassName("price ng-star-inserted");
    const array = [];
    for (let i = 0; i < text.length; i++) {
      array.push(text[i].innerText);
    }
    // Return the data to Node; a console.log here would only print in the browser's console
    return array;
  });
  console.log(arr);
  await browser.close();
})();
The problem is that when I run this script, it opens its own browser and loads the page in a session where I am not logged in, so I can't scrape the data. Even if I type in my login and password, I still have to confirm the login in Steam. So, how can I do this from my own browser, where I am already logged in, or how else can I fix this problem? Maybe with another library?
You can always use your browser's developer tools: open them, select the Console tab, and write your own script there.
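As a rough sketch of what such a Console script could look like, the snippet below reuses the class names from the question. The helper name collectPrices is made up for illustration, and the selector is only assumed to still match the page:

```javascript
// Hypothetical helper: returns the text of every element carrying both
// classes used in the question ("price" and "ng-star-inserted").
function collectPrices(root) {
  const nodes = root.getElementsByClassName("price ng-star-inserted");
  return Array.from(nodes, (el) => el.innerText.trim());
}

// In the DevTools Console `document` exists, so this prints the prices
// from the tab you are already logged in to.
if (typeof document !== "undefined") {
  console.log(collectPrices(document));
}
```

Because it runs in your normal browser tab, it uses your existing login and needs no Puppeteer at all.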
Or you can use the Recorder tab when you want to automate a routine task that runs daily or hourly. You can access it by selecting the double chevron arrow on the tab bar.
There you can automate clicks, scrolling, and even waiting for an element to exist and be visible. You can then export the recording to a Puppeteer script if you like.
I hope this helps.
Edi gives some good suggestions, but to supplement those, here are a few other approaches. There's no silver bullet in web scraping, so you'll need to experiment to see what works for a particular site (I don't have a Steam account).
Launch Puppeteer with the userDataDir flag, then run it once headfully with a long timeout or REPL and log in manually. The session should be saved, so on subsequent runs, you'll be pre-authorized.
Copy the session cookies from your logged-in browser by hand and attach them to a fetch call. A simple example of this strategy is here, and a full-fledged tool based on manual cookie copying is replit-exporter. The same caveats as above apply.
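The userDataDir approach might look roughly like the sketch below. The profile directory name, the launchOptions and scrapePrices helpers, and the --run switch are all made up for illustration, and the selector is taken from the question without being verified against the live site:

```javascript
const path = require("path");

// Hypothetical profile location; any writable directory should work.
const PROFILE_DIR = path.resolve("puppeteer-profile");

// Launch options that make Chromium keep its cookies and storage on disk.
function launchOptions(profileDir) {
  return { headless: false, userDataDir: profileDir };
}

async function scrapePrices(url, profileDir = PROFILE_DIR) {
  const puppeteer = require("puppeteer"); // loaded lazily, only when scraping
  const browser = await puppeteer.launch(launchOptions(profileDir));
  try {
    const page = await browser.newPage();
    await page.goto(url);
    // On the first run, log in by hand in the opened window.
    // timeout: 0 disables the time limit, so the script waits for you.
    await page.waitForSelector(".price.ng-star-inserted", { timeout: 0 });
    return await page.$$eval(".price.ng-star-inserted", (els) =>
      els.map((el) => el.innerText.trim())
    );
  } finally {
    await browser.close();
  }
}

// Only start the browser when explicitly asked: node scrape.js --run
if (process.argv.includes("--run")) {
  scrapePrices("https://csfloat.com/search?def_index=4727")
    .then((prices) => console.log(prices))
    .catch((err) => console.error(err));
}
```

If the site keeps its session in the profile, the second run should reach the prices without a manual login. Whether it does depends on how the site and Steam handle sessions, so treat this as something to try rather than a guarantee.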
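For the manual cookie copying strategy, a minimal sketch could look like this. It assumes Node 18+ for the global fetch. The helper names and the cookie name in the example are placeholders: the real names and values have to be copied from your own logged-in browser (DevTools, Application tab, Cookies).

```javascript
// Turn { name: value } pairs copied from DevTools into a Cookie header value.
function cookieHeader(cookies) {
  return Object.entries(cookies)
    .map(([name, value]) => `${name}=${value}`)
    .join("; ");
}

// Request a URL as the logged-in user by sending the copied cookies along.
async function fetchWithCookies(url, cookies) {
  const response = await fetch(url, {
    headers: { cookie: cookieHeader(cookies) },
  });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.text();
}

// Example call with a placeholder cookie; replace it with real values.
// fetchWithCookies("https://csfloat.com/search?def_index=4727", {
//   session: "PASTE_VALUE_HERE",
// }).then((body) => console.log(body.length));
```

Since the class names in the question suggest the page is rendered client-side, the raw HTML may not contain the prices. In that case, it is usually more useful to point this at whatever data request the page itself makes, which you can find in the Network tab.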