Get price from webpage using Jsoup
I'm trying to get the price of a product from a webpage.
Specifically, I need it from the following HTML. I don't know much about CSS selectors, but this is my attempt so far.
<div class="pd-price grid-100">
<!-- Selling Price -->
<div class="met-product-price v-spacing-small" data-met-type="regular">
<span class="primary-font jumbo strong art-pd-price">
<sup class="dollar-symbol" itemprop="PriceCurrency" content="USD">$</sup>
399.00</span>
<span itemprop="price" content="399.00"></span>
</div>
</div>
This obviously sits deeper inside the full page, but here is the Java code I've tried running against it.
String url ="https://www.lowes.com/pd/GE-700-sq-ft-Window-Air-Conditioner-115-Volt-14000-BTU-ENERGY-STAR/1000380463";
Document document = Jsoup.connect(url).timeout(0).get();
String price = document.select("div.pd-price").text();
String title = document.title(); //Get title
System.out.println(" Title: " + title); //Print title.
System.out.println(price);
2 Answers
Element priceDiv = document.select("div.pd-price").first();
String price = priceDiv.select("span").last().attr("content");
If you need the currency symbol too:
String currencySymbol = priceDiv.select("sup").text();
I haven't run these, but they should work.
For more detail, see the Jsoup API reference.
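As a way to check the selector without hitting the live site, here is a minimal sketch that parses the HTML fragment from the question directly. The class name, the HTML constant, and the priceFrom helper are made up for illustration. It uses an attribute selector instead of relying on the price span being the last one, and it returns null when nothing matches.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class PriceAttributeExample {
    // The HTML fragment from the question, inlined so no network request is needed.
    static final String HTML =
        "<div class=\"pd-price grid-100\">"
      + "<div class=\"met-product-price v-spacing-small\" data-met-type=\"regular\">"
      + "<span class=\"primary-font jumbo strong art-pd-price\">"
      + "<sup class=\"dollar-symbol\" itemprop=\"PriceCurrency\" content=\"USD\">$</sup>\n"
      + "399.00</span>"
      + "<span itemprop=\"price\" content=\"399.00\"></span>"
      + "</div></div>";

    // Hypothetical helper: returns the value of the content attribute,
    // or null when the price span is not present in the document.
    static String priceFrom(Document document) {
        Element priceSpan = document.selectFirst("div.pd-price span[itemprop=price]");
        return priceSpan == null ? null : priceSpan.attr("content");
    }

    public static void main(String[] args) {
        Document document = Jsoup.parse(HTML);
        System.out.println(priceFrom(document)); // prints 399.00
    }
}
```

If this prints the price for the fragment but not for the real page, the selector is fine and the difference is in the HTML the server returns.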
Quick update: I found the reason this isn't working. For some reason Lowe's won't give you the proper page source if you don't access it through a browser.
– user2769894
9 hours ago
How did you get that HTML? I've visited the page in your url variable, but I can't find it.
– Frighi
9 hours ago
I inspected the element and also used View Page Source; it shows up in both. I'm using Firefox, which might make a difference?
– user2769894
9 hours ago
I just investigated: the first time you select a store based on the ZIP code you provide, the site saves a cookie for it and reads that cookie on subsequent requests. I don't think you can scrape it in that simple way.
– Frighi
9 hours ago
First, you should familiarize yourself with CSS selectors. W3Schools has some resources to get you started.
In this case, the thing you need resides inside the div with the pd-price class, so div.pd-price is already correct.
You need to get the element first.
Element outerDiv = document.selectFirst("div.pd-price");
And then get the child div with another selector
Element innerDiv = outerDiv.selectFirst("div.met-product-price");
And then get the span element inside it
Element spanElement = innerDiv.selectFirst("span.art-pd-price");
At this point you could get the <sup> element, but in this case you can just call the text() method to get the text:
System.out.println(spanElement.text());
This will print:
$ 399.00
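The three lookups above can also be written as one descendant selector. The sketch below is only an illustration (the class and method names are invented), and it uses ownText() rather than text() so that the dollar sign inside the <sup> child is left out.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;

public class DisplayedPriceExample {
    // Hypothetical helper: returns the visible price without the currency symbol,
    // or null when the span cannot be found in the given HTML.
    static String displayedPrice(String html) {
        Element span = Jsoup.parse(html)
                .selectFirst("div.pd-price div.met-product-price span.art-pd-price");
        // ownText() only includes text directly inside the span,
        // so the text of the <sup>$</sup> child is skipped.
        return span == null ? null : span.ownText().trim();
    }

    public static void main(String[] args) {
        String html =
            "<div class=\"pd-price grid-100\">"
          + "<div class=\"met-product-price v-spacing-small\">"
          + "<span class=\"primary-font jumbo strong art-pd-price\">"
          + "<sup class=\"dollar-symbol\">$</sup>\n399.00</span>"
          + "</div></div>";
        System.out.println(displayedPrice(html)); // prints 399.00
    }
}
```

Chaining selectFirst calls works as well, but each step can return null, so a single selector leaves only one place to check.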
Edit:
After seeing the comments on the other answer:
You can copy the cookie from your browser and send it with Jsoup to get past the ZIP code requirement.
Document document = Jsoup.connect("https://www.lowes.com/pd/GE-700-sq-ft-Window-Air-Conditioner-115-Volt-14000-BTU-ENERGY-STAR/1000380463")
.header("Cookie", "<Your Cookie here>")
.get();
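A slightly fuller sketch of the same idea, under some assumptions: the cookie name and value below are placeholders (copy the real ones from your browser), the user agent string is only an example, and I have not verified that the site accepts this request. It also checks for null before reading the price, since selectFirst returns null when the served page does not contain the element, which is probably where a null pointer on the full page would come from.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.io.IOException;
import java.util.Map;

public class CookieFetchExample {
    // Fetches the page with a browser-like user agent and the given cookies.
    static Document fetch(String url, Map<String, String> cookies) throws IOException {
        return Jsoup.connect(url)
                .userAgent("Mozilla/5.0") // example value, not a requirement of the site
                .cookies(cookies)
                .timeout(10_000)          // 10 seconds instead of waiting forever
                .get();
    }

    // Hypothetical helper: returns the price text, or a message when the
    // element is missing from the page the server actually sent.
    static String priceOrMessage(Document document) {
        Element span = document.selectFirst("span.art-pd-price");
        return span == null ? "price element not found in the returned page" : span.text();
    }

    public static void main(String[] args) throws IOException {
        // Placeholder cookie: replace the name and value with what your browser sends.
        Map<String, String> cookies = Map.of("cookieName", "cookieValue");
        String url = "https://www.lowes.com/pd/GE-700-sq-ft-Window-Air-Conditioner-115-Volt-14000-BTU-ENERGY-STAR/1000380463";
        System.out.println(priceOrMessage(fetch(url, cookies)));
    }
}
```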
@user2769894 see the edit
– Zendy Lim
9 hours ago
This seems to work if I parse only the posted HTML, but not the full link. Any ideas why this might be happening? I'm getting a NullPointerException.
– user2769894
9 hours ago