Sorting YouTube playlists from the command line
A while ago, I published pl-sort, a web app that lets users sort YouTube playlists by metrics such as view, like, and comment counts. Building a tool was not my original goal; I simply wanted to sort a playlist. Here is the method I used: a single bash one-liner built from standard command line tools.
Suppose we want the videos in this playlist sorted by their view counts in this format:
[
{
"title": "The essence of calculus",
"views": 7340027
},
{
"title": "The other way to visualize derivatives | Chapter 12, Essence of calculus",
"views": 3339209
},
...
]
We can do this using tools that are readily available on most Unix-like systems:
curl
: A command line HTTP client used to send requests to endpoints.
jq
: A JSON processor that lets you transform JSON data into a desired format with a powerful syntax. We will use jq to transform the response into the output format we want and sort the output by various keys. If your system doesn't have jq installed, see jq's installation guide.
xargs
: Commands like grep or awk can take input from command line arguments as well as stdin (so output from one command can be piped into them as input), but curl expects its URL as an argument. We will use xargs to take each video ID from the playlist, construct an endpoint URL that we can send requests to, and pass it as an argument to curl.
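To make the stdin-versus-arguments distinction concrete, here is a minimal sketch that uses echo as a stand-in for curl (echo also ignores whatever is piped into it). No request is sent; the video ID is just sample text.

```shell
# echo ignores stdin, so the piped text is lost and only a blank line comes out
ignored=$(printf 'WUvTyaaNkzM\n' | echo)

# xargs reads stdin and hands each item to the command as an argument
passed=$(printf 'WUvTyaaNkzM\n' | xargs echo)

printf 'without xargs: [%s]\n' "$ignored"
printf 'with xargs:    [%s]\n' "$passed"
```

The first line prints empty brackets and the second prints the ID, which is the behaviour we rely on when feeding IDs to curl.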
Because of how YouTube's Data API v3 works, we cannot get all the information about every video in a playlist in a single request. First, we need to hit the playlistItems endpoint to get the relevant video IDs, then call the videos endpoint for each ID for details.
Before you begin, obtain an API key for the YouTube Data API from the Google Developer Console. Then set it as an environment variable by running the following commands:
# Don't add any spaces around the = sign
$ export YT_KEY="your api key"
# Add these as well in order to shorten our command
$ export YT_API="https://www.googleapis.com/youtube/v3"
$ export YT_LISTID="PLZHQObOWTQDMsr9K-rj53DwVRMYO3t5Yr" # Playlist ID
By doing this, we don't have to type these frequently needed values every time we run the command. We can refer to them by name when we need to, like this:
# -s flag makes curl silent
$ curl -s "$YT_API/playlistItems?part=contentDetails&playlistId=$YT_LISTID&maxResults=50&key=$YT_KEY"
{
"kind": "youtube#playlistItemListResponse",
"etag": "jhS4gzCu5OcX8oftAWInXrLHoZs",
"items": [
{
"kind": "youtube#playlistItem",
"etag": "ppImKV3lyokspBcbXkImZ7cmdEk",
"id": "UExaSFFPYk9XVFFETXNyOUstcmo1M0R3VlJNWU8zdDVZci41NkI0NEY2RDEwNTU3Q0M2",
"contentDetails": {
"videoId": "WUvTyaaNkzM",
"videoPublishedAt": "2017-04-28T15:58:48Z"
}
},
{
"kind": "youtube#playlistItem",
"etag": "a29dMgyS-DwQYMokKGzUXy4H26U",
"id": "UExaSFFPYk9XVFFETXNyOUstcmo1M0R3VlJNWU8zdDVZci4yODlGNEE0NkRGMEEzMEQy",
"contentDetails": {
"videoId": "9vKqVkMQHKk",
"videoPublishedAt": "2017-04-29T16:24:03Z"
}
},
...
"pageInfo": {
"totalResults": 12,
"resultsPerPage": 50
}
}
This gets us a response with all the video IDs of interest, along with a bunch of information that we don't need. To strip this out and keep just the information we care about, we can pipe the response into jq and use its filter syntax:
# Here, ... stands for the previous command (press the up arrow to recall it)
$ ... | jq '.items[].contentDetails.videoId'
"WUvTyaaNkzM"
"9vKqVkMQHKk"
"S0_qX4VJhMQ"
...
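jq prints strings JSON-encoded by default, which is why the IDs above are wrapped in quotes. If you would rather get bare IDs, jq's -r (--raw-output) flag is one option. A small offline sketch, using a trimmed-down sample response instead of a live API call:

```shell
# Trimmed sample of a playlistItems response (only the fields we read)
sample='{"items":[{"contentDetails":{"videoId":"WUvTyaaNkzM"}},{"contentDetails":{"videoId":"9vKqVkMQHKk"}}]}'

# Default output: JSON strings, quotes included
quoted=$(printf '%s' "$sample" | jq '.items[].contentDetails.videoId')

# -r prints raw strings, without the quotes
raw=$(printf '%s' "$sample" | jq -r '.items[].contentDetails.videoId')

printf '%s\n' "$quoted" "$raw"
```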
Now we need to loop through each line in this output and call curl requesting the /videos endpoint on each record:
# `-I {}` means replace the string `{}` in the URL with the current record (video id)
$ ... | xargs -I {} curl -s "$YT_API/videos?part=snippet,statistics&id={}&key=$YT_KEY"
{
"kind": "youtube#videoListResponse",
"etag": "LWiYRbWgpdH_mL3I4BdCu-yKxio",
"items": [
{
...
"snippet": {
"publishedAt": "2017-04-28T15:58:48Z",
"channelId": "UCYO_jab_esuFRV4b17AJtAw",
"title": "The essence of calculus",
...
},
"statistics": {
"viewCount": "7363949",
"likeCount": "207221",
"favoriteCount": "0",
"commentCount": "6365"
}
...
}
],
...
}
{
"kind": "youtube#videoListResponse",
"etag": "tY3Oi8Po8BhCGIRaZNJR5NfkVWA",
"items": [
{
...
"statistics": {
"viewCount": "2986293",
"likeCount": "76291"
}
...
}
],
...
}
...
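Before spending API quota, it can help to preview what xargs will actually run. One way, sketched here with echo in place of curl and a placeholder base URL (an assumption standing in for $YT_API and the key), is to print the constructed URLs. Note that xargs removes the double quotes that jq left around each ID.

```shell
# Two quoted IDs, in the form jq printed them
ids='"WUvTyaaNkzM"
"9vKqVkMQHKk"'

# Placeholder base URL; nothing is sent over the network
base="https://example.invalid/videos"

# echo instead of curl: print each URL that would be requested
urls=$(printf '%s\n' "$ids" | xargs -I {} echo "$base?id={}")
printf '%s\n' "$urls"
```

Once the printed URLs look right, swapping echo back for curl -s gives the real command.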
To strip the output of unnecessary info, we can use jq's object construction syntax. Notice the |tonumber after viewCount, which casts the string value into a number:
$ ... | jq '{title: .items[0].snippet.title, views: .items[0].statistics.viewCount|tonumber}'
{
"title": "The essence of calculus",
"views": 7363968
}
{
"title": "The paradox of the derivative | Chapter 2, Essence of calculus",
"views": 2986300
}
...
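The cast matters once we sort: the API returns counts as strings, and jq orders strings character by character rather than by numeric value. A quick offline illustration with made-up values:

```shell
# As strings, "10" sorts before "2" because the character "1" comes before "2"
as_strings=$(jq -c -n '["2", "10", "5"] | sort')

# After tonumber, the values sort numerically
as_numbers=$(jq -c -n '["2", "10", "5"] | map(tonumber) | sort')

printf '%s\n%s\n' "$as_strings" "$as_numbers"
```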
Finally, we can sort this output using the views key and reverse it to get the results in descending order:
# With --slurp/-s, we read the entire stream of JSON objects above into a large array
# and then run the filter on the array instead of running it on each object
$ ... | jq --slurp '.|=sort_by(.views)|reverse'
[
{
"title": "The essence of calculus",
"views": 7364088
},
{
"title": "The other way to visualize derivatives | Chapter 12, Essence of calculus",
"views": 3344817
},
...
]
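To see what --slurp changes, here is a small offline sketch with made-up objects. Without the flag, jq runs the filter once per object; with it, the objects are first collected into a single array that sort_by can order.

```shell
# A stream of three separate JSON objects (made-up data)
stream='{"title":"a","views":2}
{"title":"b","views":10}
{"title":"c","views":5}'

# Without --slurp: the filter runs once per object, in input order
per_object=$(printf '%s\n' "$stream" | jq -c '.views')

# With --slurp: one array, sorted by views, then reversed into descending order
sorted=$(printf '%s\n' "$stream" | jq -c --slurp 'sort_by(.views) | reverse')

printf '%s\n%s\n' "$per_object" "$sorted"
```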
tl;dr
The final command comes out to this:
$ export YT_KEY="your api key"
$ export YT_API="https://www.googleapis.com/youtube/v3"
$ export YT_LISTID="PLZHQObOWTQDMsr9K-rj53DwVRMYO3t5Yr" # Playlist ID
$ curl -s "$YT_API/playlistItems?part=contentDetails&playlistId=$YT_LISTID&maxResults=50&key=$YT_KEY" | jq '.items[].contentDetails.videoId' | xargs -I {} curl -s "$YT_API/videos?part=snippet,statistics&id={}&key=$YT_KEY" | jq '{title: .items[0].snippet.title, views: .items[0].statistics.viewCount|tonumber}' | jq --slurp '.|=sort_by(.views)|reverse'
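One possible variation, not the method used above: the videos endpoint's id parameter accepts a comma-separated list of IDs, so the per-video requests could be collapsed into a single call. The sketch below is only that, a sketch. The helper names ids_csv and sorted_views are hypothetical, and the runnable part works on a trimmed sample response rather than a live request.

```shell
# Hypothetical helpers, not part of the original one-liner

# Join all video IDs from a playlistItems response into one comma-separated string
ids_csv() {
  jq -r '[.items[].contentDetails.videoId] | join(",")'
}

# Turn a single videos response (many items) into the sorted title/views array
sorted_views() {
  jq '[.items[] | {title: .snippet.title, views: (.statistics.viewCount | tonumber)}]
      | sort_by(.views) | reverse'
}

# Offline check with a trimmed sample playlistItems response
sample='{"items":[{"contentDetails":{"videoId":"WUvTyaaNkzM"}},{"contentDetails":{"videoId":"9vKqVkMQHKk"}}]}'
csv=$(printf '%s' "$sample" | ids_csv)
printf '%s\n' "$csv"

# Live usage (needs network and the variables exported above):
#   curl -s "$YT_API/playlistItems?part=contentDetails&playlistId=$YT_LISTID&maxResults=50&key=$YT_KEY" \
#     | ids_csv \
#     | xargs -I {} curl -s "$YT_API/videos?part=snippet,statistics&id={}&key=$YT_KEY" \
#     | sorted_views
```

Because all videos arrive in one response, the sorting happens inside a single jq call and --slurp is no longer needed.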