Get &lt;title&gt; from &lt;head&gt; of a URL in R</a></h1> </div> <div class="grid fw-wrap pb8 mb16 bb bc-black-075"> <div class="grid--cell ws-nowrap mr16 mb8" title="2023-06-22 10:03:06Z"> <span class="fc-light mr2">Asked</span> <time itemprop="dateCreated" datetime="2023-06-22T10:03:06.290" class="fromnow">Jun 22 '23 at 10:03</time> </div> <div class="grid--cell ws-nowrap mr16 mb8"> <span class="fc-light mr2">Active</span> <time class="fromnow" title="2023-06-22T10:15:33.543" datetime="2023-06-22T10:15:33.543">Jun 22 '23 at 10:15</time> </div> <div class="grid--cell ws-nowrap mb8" title="Viewed 42 times"> <span class="fc-light mr2">Viewed</span> 42 times </div> </div> <div id="mainbar" role="main" aria-label="questions and answers"> <div id="question" class="question" data-questionid="76530749" data-ownerid="10483692" data-score="2"> <div class="post-layout"> <div class="votecell post-layout--left"> <div class="js-voting-container grid jc-center fd-column ai-stretch gs4 fc-black-200" data-post-id="76530749"> <button class="js-vote-up-btn grid--cell s-btn s-btn__unset c-pointer"><svg aria-hidden="true" class="m0 svg-icon iconArrowUpLg" width="36" height="36" viewBox="0 0 36 36"><path d="M2 26h32L18 10 2 26z"></path></svg></button> <div class="js-vote-count grid--cell fc-black-500 fs-title grid fd-column ai-center" itemprop="upvoteCount" data-value="2">2</div> <button class="js-bookmark-btn s-btn s-btn__unset c-pointer py4"> <svg aria-hidden="true" class="svg-icon iconBookmark" width="18" height="18" viewBox="0 0 18 18"><path d="M6 1a2 2 0 00-2 2v14l5-4 5 4V3a2 2 0 00-2-2H6zm3.9 3.83h2.9l-2.35 1.7.9 2.77L9 7.59l-2.35 1.7.9-2.76-2.35-1.7h2.9L9 2.06l.9 2.77z"></path></svg> <div class="js-bookmark-count mt4" data-value=""></div> </button> </div> </div> <div class="postcell post-layout--right"> <div class="s-prose js-post-body" itemprop="text"><p>From any URL, I would like to get the text inside the <code>&lt;title&gt;</code> tag in its header. 
For example, in the screenshot below, the text "javascript - Getting the title of a web page given the URL - Stack Overflow" is what I want to extract.</p> <p><a class="external-link" href="https://i.stack.imgur.com/jTVtv.png" rel="nofollow noreferrer"><img alt="enter image description here" src="../../images/3857452248.webp"/></a></p> <p>I have been trying to get the header with <code>httr</code>, but it does not seem to have the title:</p> <pre><code>library(httr)
url_head &lt;- HEAD(url = "https://stackoverflow.com/questions/10940241/getting-the-title-of-a-web-page-given-the-url")
url_head
</code></pre> <p>gives</p> <pre><code>Response [https://stackoverflow.com/questions/10940241/getting-the-title-of-a-web-page-given-the-url]
  Date: 2023-06-22 10:01
  Status: 200
  Content-Type: text/html; charset=utf-8
&lt;EMPTY BODY&gt;
</code></pre> <p>I also tried</p> <pre><code>headers(url_head)
</code></pre> <p>but the title is not there either.</p></div> <div class="mt24 mb12"> <div class="post-taglist grid gs4 gsy fd-column"> <div class="grid ps-relative"> <a href="../../questions/tagged/r" class="post-tag js-gps-track" title="show questions tagged 'r'" rel="tag">r</a> <a href="../../questions/tagged/httr" class="post-tag js-gps-track" title="show questions tagged 'httr'" rel="tag">httr</a> </div> </div> </div> <div class="mb0"> <div class="mt16 grid gs8 gsy fw-wrap jc-end ai-start pt4 mb16"> <div class="grid--cell mr16 fl1 w96"></div> <div class="post-signature owner grid--cell"> <div class="s-user-card s-user-card"> <time class="s-user-card--time" datetime="asked Jun 22 '23 at 10:03">asked Jun 22 '23 at 10:03</time> <a href="../../users/10483692/broti" class="s-avatar s-avatar__32 s-user-card--avatar"> <img class="s-avatar--image" src="../../users/profiles/10483692.webp" data-jdenticon-width="32" data-jdenticon-height="32" data-jdenticon-value="broti" /> </a> <div class="s-user-card--info"> <a href="../../users/10483692/broti" class="s-user-card--link">broti</a> <ul 
class="s-user-card--awards"> <li class="s-user-card--rep" title="reputation score">1,338</li> <li class="s-award-bling s-award-bling__silver" title="8 silver badges">8</li> <li class="s-award-bling s-award-bling__bronze" title="29 bronze badges">29</li> </ul> </div> </div> </div> </div> </div> </div> <div class="post-layout--right js-post-comments-component"> </div> </div> </div> <div id="answers"> <a name="tab-top"></a> <div id="answers-header"> <div class="answers-subheader grid ai-center mb8"> <div class="grid--cell fl1"> <h2 class="mb0" data-answercount="2">2 Answers<span style="display:none;" itemprop="answerCount">2</span></h2> </div> </div> </div> <a name="76530809"></a> <div id="answer-76530809" class="answer " data-answerid="76530809" data-ownerid="9349302" data-score="2" itemprop="suggestedAnswer" itemscope="" itemtype="https://schema.org/Answer"> <div class="post-layout"> <div class="votecell post-layout--left"> <div class="js-voting-container grid jc-center fd-column ai-stretch gs4 fc-black-200" data-post-id="76530809"> <button class="js-vote-up-btn grid--cell s-btn s-btn__unset c-pointer"><svg aria-hidden="true" class="m0 svg-icon iconArrowUpLg" width="36" height="36" viewBox="0 0 36 36"><path d="M2 26h32L18 10 2 26z"></path></svg></button> <div class="js-vote-count grid--cell fc-black-500 fs-title grid fd-column ai-center" itemprop="upvoteCount" data-value="2">2</div> </div> </div> <div class="postcell post-layout--right"> <div class="s-prose js-post-body" itemprop="text"><p>I would use the {rvest} package for this.</p> <p>We read in the page at the URL, select the element matching the CSS selector <code>head > title</code> (that is, the <code>&lt;title&gt;</code> tag that is a direct child of the <code>&lt;head&gt;</code> tag), and then use <code>html_text()</code> to extract its text.</p> <pre class="lang-r prettyprint-override"><code>library(rvest)

url &lt;- "https://stackoverflow.com/questions/10940241/getting-the-title-of-a-web-page-given-the-url"

so_page &lt;- read_html(url)

so_page |>
  html_element("head > title") |>
  html_text()
#> [1] "javascript - Getting the title of a web page given the URL - Stack Overflow"
</code></pre> <p><sup>Created on 2023-06-22 with <a class="external-link" href="https://reprex.tidyverse.org" rel="nofollow noreferrer">reprex v2.0.2</a></sup></p></div> <div class="mb0"> <div class="mt16 grid gs8 gsy fw-wrap jc-end ai-start pt4 mb16"> <div class="grid--cell mr16 fl1 w96"></div> <div class="post-signature grid--cell"> <div class="s-user-card s-user-card"> <time class="s-user-card--time" datetime="answered Jun 22 '23 at 10:10">answered Jun 22 '23 at 10:10</time> <a href="../../users/9349302/timteafan" class="s-avatar s-avatar__32 s-user-card--avatar"> <img class="s-avatar--image" src="../../users/profiles/9349302.webp" data-jdenticon-width="32" data-jdenticon-height="32" data-jdenticon-value="TimTeaFan" /> </a> <div class="s-user-card--info"> <a href="../../users/9349302/timteafan" class="s-user-card--link">TimTeaFan</a> <ul class="s-user-card--awards"> <li class="s-user-card--rep" title="reputation score">17,549</li> <li class="s-award-bling s-award-bling__gold" title="4 gold badges">4</li> <li class="s-award-bling s-award-bling__silver" title="18 silver badges">18</li> <li class="s-award-bling s-award-bling__bronze" title="39 bronze badges">39</li> </ul> </div> </div> </div> </div> </div> </div> <div class="post-layout--right js-post-comments-component"> </div> </div> </div> <a name="76530847"></a> <div id="answer-76530847" class="answer accepted-answer" data-answerid="76530847" data-ownerid="12009071" data-score="1" itemprop="acceptedAnswer" itemscope="" itemtype="https://schema.org/Answer"> <div class="post-layout"> <div class="votecell post-layout--left"> <div class="js-voting-container grid jc-center fd-column ai-stretch gs4 fc-black-200" data-post-id="76530847"> <button class="js-vote-up-btn grid--cell s-btn s-btn__unset c-pointer"><svg aria-hidden="true" class="m0 svg-icon iconArrowUpLg" width="36" height="36" viewBox="0 0 36 36"><path d="M2 26h32L18 10 2 
26z"></path></svg></button> <div class="js-vote-count grid--cell fc-black-500 fs-title grid fd-column ai-center" itemprop="upvoteCount" data-value="1">1</div> <div class="js-accepted-answer-indicator grid--cell fc-green-500 py6 mtn8"><div class="ta-center"><svg aria-hidden="true" class="svg-icon iconCheckmarkLg" width="36" height="36" viewBox="0 0 36 36"><path d="m6 14 8 8L30 6v8L14 30l-8-8v-8z"></path></svg></div></div> </div> </div> <div class="postcell post-layout--right"> <div class="s-prose js-post-body" itemprop="text"><p>You can use the <code>rvest</code> package, which provides tools for reading and querying HTML pages:</p> <pre><code>library(rvest)

url &lt;- "https://example.com"  # Replace with your desired URL

# Download and parse the page
html &lt;- read_html(url)

# Select the title element inside head and extract its text
title &lt;- html %>%
  html_node("head title") %>%
  html_text()
</code></pre> <p>Make sure the <code>rvest</code> package is installed first, for example with <code>install.packages("rvest")</code>.</p></div> <div class="mb0"> <div class="mt16 grid gs8 gsy fw-wrap jc-end ai-start pt4 mb16"> <div class="grid--cell mr16 fl1 w96"></div> <div class="post-signature grid--cell"> <div class="s-user-card s-user-card"> <time class="s-user-card--time" datetime="answered Jun 22 '23 at 10:15">answered Jun 22 '23 at 10:15</time> <a href="../../users/12009071/daniel-kamel" class="s-avatar s-avatar__32 s-user-card--avatar"> <img class="s-avatar--image" src="../../users/profiles/12009071.webp" data-jdenticon-width="32" data-jdenticon-height="32" data-jdenticon-value="Daniel_Kamel" /> </a> <div class="s-user-card--info"> <a href="../../users/12009071/daniel-kamel" class="s-user-card--link">Daniel_Kamel</a> <ul class="s-user-card--awards"> <li class="s-user-card--rep" title="reputation score">610</li> <li class="s-award-bling s-award-bling__silver" title="8 silver badges">8</li> <li class="s-award-bling s-award-bling__bronze" title="29 bronze badges">29</li> </ul> </div> </div> </div> </div> </div> </div> <div class="post-layout--right js-post-comments-component"> </div> </div> </div> </div> </div> </div> </div> <script 
src="../../static/js/stack-icons.js"></script> <script src="../../static/js/fromnow.js"></script> </body> </html>