
I'm trying to make a tool that takes all the comments from a video and retrieves every user who posted a comment, to make giveaways easier.

Right now I'm using the YouTube API with:

gdata.youtube.com/feeds/api/videos/VIDEO_ID/comments?v=2&alt=json&start-index=1

I just run a simple loop that increments start-index each time (1, 22, 44, 66, etc.) to get all the comments.
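The loop described above can be sketched roughly as follows. This is only an illustration of stepping `start-index` through a feed: the function name `comment_page_urls`, the `https://` scheme, the `max-results` parameter, and the page size of 25 are assumptions rather than anything confirmed in the question, and the snippet only builds the URLs without fetching them.

```python
from urllib.parse import urlencode

# Feed URL taken from the question; the https:// scheme is an assumption.
BASE = "https://gdata.youtube.com/feeds/api/videos/{video_id}/comments"


def comment_page_urls(video_id, total_comments, page_size=25):
    """Yield one feed URL per page of comments.

    start-index is 1-based, so the pages start at 1, 1 + page_size, and so on.
    page_size=25 is an assumed value, not one stated in the question.
    """
    for start in range(1, total_comments + 1, page_size):
        params = {
            "v": 2,
            "alt": "json",
            "start-index": start,
            "max-results": page_size,
        }
        yield BASE.format(video_id=video_id) + "?" + urlencode(params)


if __name__ == "__main__":
    for url in comment_page_urls("VIDEO_ID", total_comments=60):
        print(url)
```

Stepping by the same value that is sent as `max-results` keeps the pages from overlapping or skipping comments.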

The issue is that I now have over 1000 comments on a video, and the request fails past that point. For example, this doesn't work: gdata.youtube.com/feeds/api/videos/VIDEO_ID/comments?v=2&alt=json&start-index=1100

http://pastebin.com/ypLrFmTG

Is there a way to get all the users who posted a comment on a YouTube video? I spent a few hours learning how the YouTube API works, but this limit makes the whole thing useless.

Should I instead use cURL or some other way to fetch the content of these pages: youtube.com/all_comments?threaded=1&v=VIDEO_ID&page=x

user2880435
  • There's a typo in your query parameters, `max-result` should be `max-results` - anyway, according to https://developers.google.com/youtube/2.0/reference?hl=en#Paging_through_Results it does look like there is a hard limit of 1000. "For any given query, you will not be able to retrieve more than 1,000 results even if there are more than that." – frostmatthew Oct 14 '13 at 23:31
  • How does this website get all the comments? http://www.sandracires.com/en/client/youtube/random.htm I can't get comments using cURL, Simple DOM Parser, or anything else on http://www.youtube.com/all_comments?v=ID&page=2 – user2880435 Oct 15 '13 at 01:34
  • **Update** : I found a way to do it using Zend_Dom_Query, Curl, getElementsByTagName, etc. It took me a while to figure it out. It works, but takes 2 minutes to go through 20 000 comments – user2880435 Oct 15 '13 at 03:35
  • Try my answer and see if it goes faster. – Millie Smith Oct 15 '13 at 03:43
  • @user2880435 Hi, I'm actually doing something like this. Can you please explain in detail or point me somewhere? I need to get more than 1000 comments for a particular YouTube video. – samsamara Nov 02 '15 at 04:24

1 Answer


Well, there are a couple of ways. The first is the sandracires link you posted. If you read their JavaScript (or just watch the traffic in Fiddler), you'll see that it requests a comments.php page on its own site and passes in page numbers. The URL format is: http://www.sandracires.com/en/client/youtube/comments.php?v=videoID&page=1. However, I'm not sure of the legality of that, so I don't recommend it.

I used Fiddler on YouTube itself, and here's what I came up with:

http://youtube.com/watch_ajax?action_get_comments=1&v=videoID&p=4&commentthreshold=-5&commenttype=everything&last_comment_id=teKFzQ8cbHNiI0ouIIqSS7lHeH2TZ8eWGlW-0D0Fx5U&page_size=500&source=w

  • v = videoID
  • p = page number
  • commentthreshold = ???
  • commenttype = comment type (everything is the only enum value I know of)
  • last_comment_id = comment right before the ones to load
  • page_size = amount of comments to return
  • source = ??? (maybe w for web)

You might be able to remove the source parameter and others.

I'm not entirely sure if these are all correct. I think p is page number, which would allow you to pull comments without the last_comment_id parameter (it's working for me like this). I did also get the last_comment_id parameter working (where p stays constant) by parsing the resulting XML and finding ?lc=LASTCOMMENTIDHERE.
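As a rough sketch of how the parameters listed above fit together, the snippet below assembles the request URL for a given page. It only builds the string and sends nothing. The helper name `build_comments_url` is hypothetical, the defaults are copied from the captured URL, and the endpoint itself is an undocumented internal one, so none of this should be read as a supported API.

```python
from urllib.parse import urlencode

# Undocumented endpoint observed in Fiddler; it may change or stop working.
WATCH_AJAX = "http://youtube.com/watch_ajax"


def build_comments_url(video_id, page, page_size=500, last_comment_id=None):
    """Build a comment-page URL from the parameters seen in the captured request."""
    params = {
        "action_get_comments": 1,
        "v": video_id,
        "p": page,                    # page number (the answer's best guess)
        "commentthreshold": -5,       # meaning unknown; value copied as captured
        "commenttype": "everything",
        "page_size": page_size,
        "source": "w",                # meaning unknown; possibly "web"
    }
    # Optional cursor-style paging: the ID of the comment just before
    # the ones to load.
    if last_comment_id is not None:
        params["last_comment_id"] = last_comment_id
    return WATCH_AJAX + "?" + urlencode(params)


if __name__ == "__main__":
    for page in range(1, 4):
        print(build_comments_url("videoID", page))
```

Paging by `p` alone is the simpler of the two approaches, since it needs no state carried over from the previous response.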

There seems to be a maximum of 500 comments per request (yes, I've tried 501). As I've noted, the data comes back in XML form. Each comment looks like this:

<div class="content clearfix">
  <p class="metadata">
    <span class="author ">
      <a href="/user/mindmonkey00" class="g-hovercard yt-uix-sessionlink yt-user-name " data-sessionlink="ei=-LFcUvCPNsn-sAf7jIGgAg" dir="ltr" data-ytid="UCAufDxGRQh_LlF5tD6StNtw" data-name="">mindmonkey00</a>
    </span>
      <span class="time" dir="ltr">
        <a dir="ltr" href="http://www.youtube.com/comment?lc=teKFzQ8cbHNkP8a89kiIEtWqiTRiAkKtSnvEHB_hXG4">
          3 weeks ago
        </a>
      </span>
  </p>


  <div class="comment-text" dir="ltr">
    <p>You didn&#39;t answer my question?</p>

  </div>

  <div class="comment-actions">
    <button onclick=";return false;" type="button" class="start comment-action create-channel-lightbox yt-uix-button yt-uix-button-link yt-uix-button-size-default" data-upsell="comment" role="button"><span class="yt-uix-button-content">Reply </span></button>
    <span class="separator">&middot;</span>


    <span ><button title="Vote Up" onclick=";return false;" type="button" class="start comment-action-vote-up comment-action yt-uix-button yt-uix-button-link yt-uix-button-size-default yt-uix-button-has-icon yt-uix-tooltip yt-uix-button-empty" data-tooltip-show-delay="300" data-action="vote-up" role="button"><span class="yt-uix-button-icon-wrapper"><img class="yt-uix-button-icon yt-uix-button-icon-watch-comment-vote-up" src="//s.ytimg.com/yts/img/pixel-vfl3z5WfW.gif" alt="Vote Up" title=""></span></button></span><span ><button title="Vote Down" onclick=";return false;" type="button" class="end comment-action-vote-down comment-action yt-uix-button yt-uix-button-link yt-uix-button-size-default yt-uix-button-has-icon yt-uix-tooltip yt-uix-button-empty" data-tooltip-show-delay="300" data-action="vote-down" role="button"><span class="yt-uix-button-icon-wrapper"><img class="yt-uix-button-icon yt-uix-button-icon-watch-comment-vote-down" src="//s.ytimg.com/yts/img/pixel-vfl3z5WfW.gif" alt="Vote Down" title=""></span></button></span>
  </div>

</div>
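Given markup like the sample above, a minimal sketch for pulling out the author names and the `lc` comment IDs could look like this. It uses only Python's standard library, and it assumes every comment follows the same structure as the sample (an `<a>` inside `span.author`, and a permalink carrying `?lc=`). The names `CommentParser` and `extract_users` are illustrative, not part of any YouTube API.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs, urlparse


class CommentParser(HTMLParser):
    """Collect author names and comment IDs from comment markup."""

    def __init__(self):
        super().__init__()
        self.authors = []
        self.comment_ids = []
        self._in_author_span = False
        self._in_author_link = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "author" in classes:
            self._in_author_span = True
        elif tag == "a" and self._in_author_span:
            self._in_author_link = True
        elif tag == "a":
            # Permalinks look like .../comment?lc=<comment id>
            query = parse_qs(urlparse(attrs.get("href") or "").query)
            if "lc" in query:
                self.comment_ids.append(query["lc"][0])

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_author_link = False
        elif tag == "span":
            self._in_author_span = False

    def handle_data(self, data):
        if self._in_author_link and data.strip():
            self.authors.append(data.strip())


def extract_users(html_fragment):
    """Return (authors, comment_ids) found in the given markup."""
    parser = CommentParser()
    parser.feed(html_fragment)
    parser.close()
    return parser.authors, parser.comment_ids
```

For a giveaway, the author list would probably need de-duplicating (for example with `sorted(set(authors))`) so that someone who commented several times is only counted once. The last entry of `comment_ids` is the value that would go into `last_comment_id` for the next request.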

Keep in mind that because this circumvents YouTube's API rules, you will probably have to redo this process every once in a while: they are likely to change the URL.

Millie Smith
  • I might take a look at this approach, but right now my solution works fine. It takes a while to load when there are a lot of comments, but that's not really an issue, as it's just a personal tool that won't be released. This is my code, which probably looks really bad: http://puu.sh/4QALb.png It takes 13 seconds to process 1200 comments (3 pages). – user2880435 Oct 15 '13 at 08:03
  • The puu.sh seems to have expired. Also, the sandracires way, if this is just for personal use, is much faster and returns nicely formatted JSON. – Millie Smith Oct 15 '13 at 16:16
  • @Millie Smith Could you please explain how you came up with the link above using Fiddler? Thanks! – Thoth Aug 15 '14 at 14:59
  • I just opened Fiddler and then went to a YouTube video. The link popped up in Fiddler, and I played with the parameters. It looks like the link above is broken now. They probably changed the API, which is the problem with using internal APIs. – Millie Smith Aug 15 '14 at 17:28
  • @Thoth forgot to tag you – Millie Smith Aug 15 '14 at 17:28
  • @Millie Smith First of all, thanks for the response. I did what you said, but no link like yours appears. Most of the URLs that appear are of the form `alink.com:anumber`. Does this mean that this clever way of getting the entire comment corpus is blocked? Could you please comment on how to overcome this difficulty? (PS: [fiddler Built for .NET 4](http://www.telerik.com/download/fiddler)) – Thoth Aug 15 '14 at 17:58
  • @Thoth I actually don't know. I'm not getting great results with Fiddler right now either. Use the developer tools in Chrome to find the relevant div and script. I found the div, but it's very tedious work. The class is "R4 b2 Xha". – Millie Smith Aug 15 '14 at 18:57
  • Hello @Millie Smith, I tried to do the same thing you did with Fiddler, using the Chrome developer tools. I was on the all_comments page with the dev tools open. When I pressed the "Show more" button, the following link appeared: `https://www.youtube.com/comment_ajax?action_load_comments=1&filter=o9BqrSAHbTc&order_by_time=false` This URL returns the message {"forbidden": "1"}. I'm including this in case someone can help. – Thoth Aug 16 '14 at 19:27