Everyone’s looking for large datasets these days, and Google is here to help with its recent release of YouTube-8M, which comprises 8 million videos tagged with over 4,800 visual labels (I haven’t looked, but surely there are tags for that perennial genre of viral video involving inter-species animal friendships). Let the video analysis begin: this trove holds over 500,000 hours of video! According to Google, all of the selected videos are public and have over 1,000 views.
There are large-scale image datasets out there (such as ImageNet), but YouTube-8M is the first of its kind for video. The precursor to this newly minted dataset is Sports-1M, which contains over a million video URLs tagged with 487 labels. (Sports-1M is actually included in YouTube-8M.) You can learn more about this new open-access resource from the recent Google Research Blog announcement, or just dive right into the dataset itself here.
Speaking of YouTube research, check out these titles:
The Impact of YouTube on U.S. Politics, by LaChrystal D. Ricke (Lexington Books, 2014).
Unruly Media: YouTube, Music Video, and the New Digital Cinema, by Carol Vernallis (Oxford, 2013).
Out Online: Trans Self-Representation and Community Building on YouTube, by Tobias Raun (Routledge, 2016).
The YouTube Reader, edited by Pelle Snickars and Patrick Vonderau (National Library of Sweden, 2009).