
韶华の

GF  2020-08-22 12:12

[All-ages, SFW] Python question: scraping a manga site. I know the image URLs, but the downloads won't go through.

Manga URL: https://raws.mangazuki.co/manga/lucky-g/14/1
Source code attached:
If you can't fix it, at least tell me where exactly the error is. Thanks!

import os
import time
import requests

header = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36',
    'referer': 'https://raws.mangazuki.co/manga/lucky-g',
    'upgrade-insecure-requests': '1',
}

# ##cookies## (copied from the browser; the cf_clearance value is short-lived)
cookies = "__cfduid=dcd4c0fd7cef0f359f5b2738d7621dc581598058098; cf_chl_1=c9e22b71250c4f4; cf_chl_prog=a37; cf_clearance=fd182b2c30ecaae31b407c4e28a6408449abb3fd-1598062999-0-1z76666e6bz595cd7c8z95c6774b-150; XSRF-TOKEN=eyJpdiI6IkpZK2t0T3JJelo4cUlFVnFPdkMyZkE9PSIsInZhbHVlIjoiYmF1eFNxZUNYZW0wQklJS1YzTGlvWCtkK2FOWXZ3UlRaSFQyZlpUaE1seHJpYWp1OFh5SHhSVXBKMGMydzUxVmc4R2YrRlhRK1lPbm9wbmN0eWt4M1E9PSIsIm1hYyI6IjRkNjBiYzY5YzA3NjVkM2VmYWNkOTMzODU0ZjQ3YTZiNDdmN2I5NTUxMmQwNDVkYjgxNTYyZDlkODQ5MjY1NmIifQ%3D%3D; laravel_session=eyJpdiI6IjBRNmdiYTZKNk9uSjZ6aEZaajQ2VWc9PSIsInZhbHVlIjoiM1duYkZsTUp5eXRRUGRyWnN6aEtRRGRSXC9zQ1Juc2RMNG5KQ3RKMGVwdzZJZ3hTRWg3MEhQRGUxYU1BakJtS1d2dkYrMTlEbXRVeit0N04xXC82K3RIQT09IiwibWFjIjoiYzM5MTUyOTBlOTI3YTg5OTUyMGIzNDBiZGVhYmRkNzIzZjkxOTE5YmFkODkwYzhjMDkxYjA3Njc1ODI4MjBjYSJ9; _ga=GA1.2.1069158752.1598058099; _gid=GA1.2.267406105.1598058099; AdskeeperStorage=%7B%220%22%3A%7B%22svspr%22%3A%22%22%2C%22svsds%22%3A31%2C%22TejndEEDj%22%3A%22c6ojt7ZDZ%22%7D%2C%22C989026%22%3A%7B%22page%22%3A20%2C%22time%22%3A1598063581039%7D%2C%22C989029%22%3A%7B%22page%22%3A12%2C%22time%22%3A1598063580948%7D%7D"
cookie = {}
for line in cookies.split(';'):
    name, value = line.strip().split('=', 1)
    cookie[name] = value

for chapter in range(1, 70):
    for page in range(1, 30):
        p = '%02d' % page
        # Bug in the original: str.replace() returns a new string, so
        # urls.replace(" ", "") had no effect. Format the URL directly.
        url = 'https://raws.mangazuki.co/uploads/manga/lucky-g/chapters/%d/%s.jpg' % (chapter, p)
        print(url)

        # Download the image
        path = 'C:/mangzukli/%d/' % chapter
        name = path + p + '.jpg'
        try:
            # os.makedirs() creates missing parents too; os.mkdir() would
            # raise FileNotFoundError if C:/mangzukli/ did not exist yet.
            os.makedirs(path, exist_ok=True)
            if not os.path.exists(name):
                r = requests.get(url, headers=header, cookies=cookie)
                r.raise_for_status()
                time.sleep(10)
                with open(name, 'wb') as f:
                    f.write(r.content)
                print('ok:', name)
            else:
                print('already saved:', name)
        except Exception as e:
            # Print the real error instead of swallowing it: a 403 or 503
            # here means Cloudflare rejected the request.
            print('failed:', e)
The bounty on this thread has ended.
Best answer: 20 SP coins
Best answer by: 0bd5a4b8


Kusher

Re: 4F (韶华の)

The site added Cloudflare, so you can't scrape it directly; you need to get past the challenge first. Try cloudflare-scrape?
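
A minimal sketch of that idea, assuming the cloudflare-scrape package is installed (pip install cfscrape; it needs Node.js on the system, and whether it beats the current challenge version is untested here). create_scraper() returns a requests-compatible session that solves the JS challenge before the real request, so the rest of the download loop can stay unchanged:

# Minimal sketch, assuming cfscrape (cloudflare-scrape) is installed.
import cfscrape

scraper = cfscrape.create_scraper()  # drop-in replacement for requests.Session
url = 'https://raws.mangazuki.co/uploads/manga/lucky-g/chapters/14/01.jpg'
r = scraper.get(url)  # the scraper solves the Cloudflare challenge first
r.raise_for_status()
with open('01.jpg', 'wb') as f:
    f.write(r.content)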

If you just want the images: open the link you posted in Chrome, wait for all the pictures to finish loading, then press Ctrl+S to save the page. Then look in the save directory for a folder named "Lucky Guy Chapter 14 - Page 1_files"; the images should all be in there.
Best answer reward: (+20) SP coins


Kusher

The .jpg files can be accessed directly; just use a regular expression to filter the elements out of the page and GET them.

https://zhuanlan.zhihu.com/p/63679720
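
For instance, a rough sketch: the pattern assumes the image URLs appear as absolute .jpg links in the chapter page's HTML, which you'd want to verify against the page source, and it reuses a cfscrape session since the page itself sits behind Cloudflare:

# Rough sketch: fetch the chapter page and regex out every absolute .jpg URL.
# The pattern is an assumption about the markup; inspect the HTML and adjust.
import re
import cfscrape

scraper = cfscrape.create_scraper()
html = scraper.get('https://raws.mangazuki.co/manga/lucky-g/14/1').text
img_urls = re.findall(r'https?://[^\s"\']+\.jpg', html)
for u in img_urls:
    print(u)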
Helpful-member reward: (+1) SP coins


Kusher

Hold on, this site is behind Cloudflare.


Kusher

Tried it. OP, just right-click the page and "Save as" it locally; all the images get saved.


韶华の

B5F  2020-08-22 12:22

Re: 3F (Kusher)

All of them get saved?

How did you do that?

Can you explain in more detail?

Could you also run my code and see where the problem is?