How to download UC short videos with Python

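Let's get straight to downloading UC videos. The process is not complicated: if you only want the result, skip to the code listings further down. If you want to learn how the analysis was done, read through every step. Leave a comment below if anything is unclear.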

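Prerequisites:

  • Basic programming experience
  • Ability to install a Python environment and its packages, and to read and run Python code
  • Familiarity with debugging in Google Chrome (DevTools)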

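Step 1: get the link

Open UC Browser on your phone and find a video you are interested in. I like the videos of Li Ziqi (李子柒), for example, so I went to her profile page.

Tap the share button in the top-right corner, then tap "Copy link" at the bottom right. The copied link looks like this: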

http://a.mp.uc.cn/media.html?mid=5ebdd3d3aada4f3fb5b5a33b3669a0c6&client=ucweb&uc_param_str=frdnsnpfvecpntnwprdsssnikt
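The `mid` query parameter in this link is the author ID that the batch-download code later in the article reads. A minimal sketch of extracting it with the standard library follows; the function name `author_id_from_share_link` is hypothetical and only used for illustration.

```python
from urllib.parse import urlparse, parse_qs

def author_id_from_share_link(share_url):
    """Return the value of the `mid` query parameter, or None if it is absent."""
    query = parse_qs(urlparse(share_url).query)
    values = query.get("mid")
    return values[0] if values else None

# The share link copied in step 1
share_url = ("http://a.mp.uc.cn/media.html?mid=5ebdd3d3aada4f3fb5b5a33b3669a0c6"
             "&client=ucweb&uc_param_str=frdnsnpfvecpntnwprdsssnikt")
print(author_id_from_share_link(share_url))
```

Returning None instead of raising makes it easy to reject links that were copied from a different kind of page.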

 

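Step 2: open the video in a desktop browser

Open the link in Google Chrome on a computer; the author's page with its list of videos is displayed.

Press F12 to open DevTools, then click the first video and watch which requests are sent and what they return. A very large number of requests go out, so the next task is to work out what each API does. One of them must return the video title, the cover image and the video source; with some patience you will find the real video address.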


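Step 3: find the real video address

The real video link is easy to spot. The one below is the link for this video; opening it confirms that it plays the right clip. You could of course just right-click the video and choose "Save video as", but the point of this article is how to fetch the video with Python. (The link below has probably expired; a current one can be found in the DevTools Network panel.)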


http://dy-frontend.video.ums.uc.cn/video/wemedia/5ebdd3d3aada4f3fb5b5a33b3669a0c6/c29d246285c69c57f40e37493d51c23c-3578638567-2-0-3-h264.mp4?auth_key=1596341775-84b818dbb7824f8288492456bfafe749-0-08bcc97bc1128a75928621fc9d7e6686
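The note about the link expiring can be made concrete by looking at the `auth_key` parameter. The sketch below assumes that the first dash-separated field of `auth_key` is a Unix timestamp, which is a common layout for signed CDN links but is not confirmed anywhere in this article; whether it marks the exact expiry time is likewise an assumption. The function name `auth_key_timestamp` is hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import urlparse, parse_qs

def auth_key_timestamp(video_url):
    """Return the leading integer of auth_key as a UTC datetime, or None.

    Assumption: the first field of auth_key is a Unix timestamp in seconds.
    """
    values = parse_qs(urlparse(video_url).query).get("auth_key")
    if not values:
        return None
    first_field = values[0].split("-", 1)[0]
    if not first_field.isdigit():
        return None
    return datetime.fromtimestamp(int(first_field), tz=timezone.utc)

# The video link shown above
video_url = ("http://dy-frontend.video.ums.uc.cn/video/wemedia/"
             "5ebdd3d3aada4f3fb5b5a33b3669a0c6/"
             "c29d246285c69c57f40e37493d51c23c-3578638567-2-0-3-h264.mp4"
             "?auth_key=1596341775-84b818dbb7824f8288492456bfafe749-0-"
             "08bcc97bc1128a75928621fc9d7e6686")
print(auth_key_timestamp(video_url))
```

If the assumption holds, comparing this value with the current time tells you whether a saved link is still worth trying before you request a fresh one.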

 

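Step 4: work backwards through the requests

Now that we have this link, we can work backwards and see which requests produced it.

The request that returns the link needs a token, a ums_id, a wm_cid, a wm_id and a quality setting, resolution. The token is the one held by your current browser session; the other parameters are obtained through the steps below, starting from the video page address: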

https://mparticle.uc.cn/video.html?uc_param_str=frdnsnpfvecpntnwprdssskt&wm_aid=ce1b8ba53f9e4ffa8b609cd7955e5cad&wm_id=5ebdd3d3aada4f3fb5b5a33b3669a0c6

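Both wm_aid and wm_id can be read from the link above. Next, call the API below with your wm_aid value in place of <wm_aid>. Its response contains wm_cid and ums_id; the content_id field is the wm_cid.
https://ff.dayu.com/contents/origin/<wm_aid>?biz_id=1002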
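To show where the two values sit in the response, here is a small sketch that pulls them out of an already-decoded JSON payload. The field layout (`data.content_id`, `data.title`, `data.body.videos[0].ums_id`) is taken from the code listing later in this article; the sample values are placeholders, not real IDs, and the function name `extract_video_ids` is hypothetical.

```python
def extract_video_ids(payload):
    """Return wm_cid, ums_id and title from a decoded response, or None."""
    data = payload.get("data") or {}
    videos = (data.get("body") or {}).get("videos") or []
    if not data.get("content_id") or not videos:
        return None
    return {
        "wm_cid": data["content_id"],        # content_id is used as wm_cid
        "ums_id": videos[0].get("ums_id"),   # first video attached to the post
        "title": data.get("title"),
    }

# Placeholder payload with the same shape as the real response
sample = {
    "data": {
        "content_id": "example-content-id",
        "title": "example title",
        "body": {"videos": [{"ums_id": "example-ums-id"}]},
    }
}
print(extract_video_ids(sample))
```

Checking for missing keys up front avoids a KeyError when a post has no video attached.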

That gives us every parameter. Finally, substitute the values into the following URL and request it:

https://mparticle.uc.cn/api/vps?token=<token>&ums_id=<ums_id>&wm_cid=<wm_cid>&wm_id=<wm_id>&resolution=high
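Building this URL by string concatenation works as long as no value contains special characters. A slightly safer sketch uses `urllib.parse.urlencode`, which escapes each value; the function name `build_vps_url` is hypothetical, and "high" is the only resolution value this article has seen in use.

```python
from urllib.parse import urlencode

VPS_ENDPOINT = "https://mparticle.uc.cn/api/vps"

def build_vps_url(token, ums_id, wm_cid, wm_id, resolution="high"):
    """Assemble the video-address request URL with properly escaped values."""
    params = {
        "token": token,
        "ums_id": ums_id,
        "wm_cid": wm_cid,
        "wm_id": wm_id,
        "resolution": resolution,
    }
    return VPS_ENDPOINT + "?" + urlencode(params)

# Placeholder values, for illustration only
print(build_vps_url("my-token", "my-ums-id", "my-cid", "my-wm-id"))
```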

The complete code for getting the video address from the video page address is as follows:

import json
import os
import urllib.request
import requests
from urllib.parse import urlparse, parse_qs

headers = {
 'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36',
 'Cookie':'MUID=0DA4D4BD498D6CFB3AA9D9BE4D8D6820; _SS=SID=00; videoCookiesLastCategory=en-ca=animals; _cb_ls=1; _cb=DsAPZCJmzJ0BiCB2c; _chartbeat2=.1550504320201.1550509083431.11.uAsG96ZMUeDgWcawC2JWWNCmna0Z.1; ANON=A=E43533FD3C93526D33F4F5C4FFFFFFFF&E=164d&W=1; NAP=V=1.9&E=15f3&C=gBm5NGQ6hq9_2JLke_9M_uUX-nhzUVJp3UwliRTchLXC4pE05Iv2DA&W=1; vidvol=10; adoptout={"msaOptOut":0,"adIdOptOut":0}; videoerrorcount=0; trg=0%7C0%7C0; ecasession=v2_9a22cfec7b49fc3893239e7b074a63fd_ba74fcd1-eff2-48d8-b6af-082423d58358-tuct35eea8e_1550666026_1550666984_CNawjgYQqLw-GP6e37rd9J7zBiACKAYwMDjK_QdA_qAQSI7OHlCJxAlYAGAC'
}

def downloadVideo(url, path):
    if (not os.path.exists(path)):
        urllib.request.urlretrieve(url, path)

if __name__ == "__main__":
    initURL = "https://mparticle.uc.cn/video.html?uc_param_str=frdnsnpfvecpntnwprdssskt&wm_aid=ce1b8ba53f9e4ffa8b609cd7955e5cad&wm_id=5ebdd3d3aada4f3fb5b5a33b3669a0c6"
    # Load the video page; the token is read from the vpstoken cookie it sets
    res = requests.get(initURL, headers = headers)
    token = res.cookies.get('vpstoken') if res.status_code == 200 else None
    if token is None:
        raise SystemExit("Could not obtain the vpstoken cookie from " + initURL)
    print(token)
    # wm_aid and wm_id are query parameters of the page URL
    query = parse_qs(urlparse(initURL).query)
    wmAid = query['wm_aid'][0]
    wmId = query['wm_id'][0]
    print(wmAid)
    print(wmId)
    # Look up the content metadata to get content_id (wm_cid) and ums_id
    umsIDAPI = "https://ff.dayu.com/contents/origin/"+wmAid+"?biz_id=1002"
    res = requests.get(umsIDAPI, headers = headers)
    if res.status_code == 200:
        res.encoding = 'utf8'
        videoInfo = json.loads(res.text)
        cid = videoInfo['data']["content_id"]
        title = videoInfo['data']['title']
        coverURL = videoInfo['data']['cover_url']
        ums_id = videoInfo['data']['body']['videos'][0]['ums_id']
        print(title)
        print(coverURL)
        print(ums_id)
        # Ask the vps API for the real video address
        videoAPI = 'https://mparticle.uc.cn/api/vps?token='+token+'&ums_id='+ums_id+'&wm_cid='+cid+'&wm_id='+wmId+'&resolution=high'
        print(videoAPI)
        res = requests.get(videoAPI, headers = headers)
        if res.status_code == 200:
            res.encoding = 'utf8'
            videoURLInfo = json.loads(res.text)
            print(videoURLInfo['data']['url'])
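The listing above only prints the video address, and its `downloadVideo` helper writes straight to the final path, so an interrupted download leaves a partial file that the `os.path.exists` check will later treat as complete. One possible alternative, sketched here with a hypothetical helper named `save_chunks`, is to write to a temporary file and rename it only after every chunk has arrived.

```python
import os
import tempfile

def save_chunks(chunks, path):
    """Write an iterable of byte chunks to `path` via a temporary file.

    The final file only appears once all chunks were written successfully.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as out:
            for chunk in chunks:
                if chunk:  # skip empty keep-alive chunks
                    out.write(chunk)
        os.replace(tmp_path, path)  # rename within the same directory
    except BaseException:
        # Remove the partial file so it is never mistaken for a finished one
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return path

# Possible use with requests (not executed here):
#   resp = requests.get(video_url, headers=headers, stream=True)
#   save_chunks(resp.iter_content(chunk_size=1 << 20), "video.mp4")
```

Taking a plain iterable of bytes keeps the helper independent of the HTTP library, which also makes it easy to try out without network access.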

 

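How to download all of an author's videos in bulk:

What if you want the links for every video on Li Ziqi's account? That is easy too: the link below returns her video list, and by varying the _size and _page parameters you can page through all of it.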
https://ff.dayu.com/contents/author/5ebdd3d3aada4f3fb5b5a33b3669a0c6?biz_id=1002&_size=8&_page=1&_order_type=published_at&status=1&_fetch=1&uc_param_str=frdnsnpfvecpntnwprdsssnikt&_=1596342684957
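As a sketch of how the paging parameters fit together, the helper below rebuilds this URL for any page. The function name `build_list_url` is hypothetical; the assumption that the trailing `_` parameter is simply the current time in milliseconds comes from the way the code later in this article generates it.

```python
import time
from urllib.parse import urlencode

def build_list_url(author_id, page, size=8,
                   uc_param_str="frdnsnpfvecpntnwprdsssnikt", now_ms=None):
    """Build the author video-list URL for one page of results."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)  # assumed to be a millisecond timestamp
    params = {
        "biz_id": 1002,
        "_size": size,
        "_page": page,
        "_order_type": "published_at",
        "status": 1,
        "_fetch": 1,
        "uc_param_str": uc_param_str,
        "_": now_ms,
    }
    return ("https://ff.dayu.com/contents/author/" + author_id
            + "?" + urlencode(params))

# Second page of eight videos for the author used in this article
print(build_list_url("5ebdd3d3aada4f3fb5b5a33b3669a0c6", page=2))
```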

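The response contains the content_id and ums_id of every video, so you can loop over the list and download them all.

The complete code below downloads Li Ziqi's videos into the "D:\\李子柒" directory. As written, it fetches only the eight videos on the first page.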

import json
import os
import re
import time
import urllib.request
import requests
from urllib.parse import urlparse, parse_qs

headers = {
 'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36',
 'Cookie':'MUID=0DA4D4BD498D6CFB3AA9D9BE4D8D6820; _SS=SID=00; videoCookiesLastCategory=en-ca=animals; _cb_ls=1; _cb=DsAPZCJmzJ0BiCB2c; _chartbeat2=.1550504320201.1550509083431.11.uAsG96ZMUeDgWcawC2JWWNCmna0Z.1; ANON=A=E43533FD3C93526D33F4F5C4FFFFFFFF&E=164d&W=1; NAP=V=1.9&E=15f3&C=gBm5NGQ6hq9_2JLke_9M_uUX-nhzUVJp3UwliRTchLXC4pE05Iv2DA&W=1; vidvol=10; adoptout={"msaOptOut":0,"adIdOptOut":0}; videoerrorcount=0; trg=0%7C0%7C0; ecasession=v2_9a22cfec7b49fc3893239e7b074a63fd_ba74fcd1-eff2-48d8-b6af-082423d58358-tuct35eea8e_1550666026_1550666984_CNawjgYQqLw-GP6e37rd9J7zBiACKAYwMDjK_QdA_qAQSI7OHlCJxAlYAGAC'
}

def downloadVideo(url, path):
    if (not os.path.exists(path)):
        urllib.request.urlretrieve(url, path)

def analysizeAndDownload(url, page, size, directory):
    # Load the author's page; the token is read from the vpstoken cookie it sets
    res = requests.get(url, headers = headers)
    token = res.cookies.get('vpstoken') if res.status_code == 200 else None
    if token is None:
        raise SystemExit("Could not obtain the vpstoken cookie from " + url)
    print(token)
    # The mid parameter of the share link is the author's ID
    query = parse_qs(urlparse(url).query)
    wmAid = query['mid'][0]
    uc_param_str = query['uc_param_str'][0]
    # The trailing _ parameter is the current time in milliseconds
    listAPI = 'https://ff.dayu.com/contents/author/'+wmAid+'?biz_id=1002&_size='+str(size)+'&_page='+str(page)+'&_order_type=published_at&status=1&_fetch=1&uc_param_str='+uc_param_str+'&_='+str(int(time.time() * 1000))
    res = requests.get(listAPI, headers = headers)
    if res.status_code == 200:
        res.encoding = 'utf8'
        listData = json.loads(res.text)
        for one in listData['data']:
            wmId = wmAid
            cid = one["content_id"]
            title = one['title']
            ums_id = one['body']['videos'][0]['ums_id']
            # Ask the vps API for the real address of this video
            videoAPI = 'https://mparticle.uc.cn/api/vps?token='+token+'&ums_id='+ums_id+'&wm_cid='+cid+'&wm_id='+wmId+'&resolution=high'
            res = requests.get(videoAPI, headers = headers)
            if res.status_code == 200:
                res.encoding = 'utf8'
                videoURLInfo = json.loads(res.text)
                videoURL = videoURLInfo['data']['url']
                # Replace characters that Windows does not allow in file names
                safeTitle = re.sub(r'[\\/:*?"<>|]', '_', title)
                print("downloading: " + videoURL)
                downloadVideo(videoURL, os.path.join(directory, safeTitle + ".mp4"))
                print("Finished!")
            


if __name__ == "__main__":
    initURL = "http://a.mp.uc.cn/media.html?mid=5ebdd3d3aada4f3fb5b5a33b3669a0c6&client=ucweb&uc_param_str=frdnsnpfvecpntnwprdsssnikt"
    directory="D:\\李子柒"
    if (not os.path.exists(directory)):
        os.makedirs(directory)
    analysizeAndDownload(initURL, 1, 8, directory) # download the eight videos on page 1; change the page argument in a loop to fetch page 2, page 3, and so on
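To go beyond the first page, the page number can be advanced in a loop until the list runs out. The sketch below is one way to structure that loop. It assumes, without confirmation from the API, that the last page returns fewer than `size` items or an empty list. `iter_all_items` and `fetch_page` are hypothetical names; `fetch_page(page, size)` stands for any function that requests one page of the list API and returns the items in its `data` field.

```python
def iter_all_items(fetch_page, size=8, max_pages=100):
    """Yield items page by page until a page comes back short or empty.

    max_pages is a safety limit in case the API never returns a short page.
    """
    for page in range(1, max_pages + 1):
        items = fetch_page(page, size)
        if not items:
            break
        yield from items
        if len(items) < size:
            break  # a short page is assumed to be the last one

# Offline demonstration with a fake list of 19 items
fake_items = ["video-%d" % i for i in range(19)]

def fake_fetch(page, size):
    start = (page - 1) * size
    return fake_items[start:start + size]

print(len(list(iter_all_items(fake_fetch))))
```

Passing the fetch function in as an argument keeps the paging logic separate from the network code, so the loop can be checked with fake data before it is pointed at the real API.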

 

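This article is for learning purposes only. If you use it for commercial purposes, you bear the consequences yourself!

Please credit the source when reposting: http://52sbl.cn/article/8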
