Kkit.mder
This is a multithreaded m3u8 download module. It downloads the segments listed in an m3u8 file and merges them into a single mp4 file. Interrupted downloads can be resumed.
Example:
#test.py
from Kkit import mder

downloader = mder.m3u8_downloader(m3u8_file_path='./test.m3u8', temp_file_path='./', mp4_path='./test.mp4', num_of_threads=10)
# parameters
# 1. m3u8_file_path
#    default : no default (type : str)
# 2. url_prefix
#    default : None (type : str)
# 3. temp_file_path
#    default : '.' (type : str)
# 4. mp4_path
#    default : './test.mp4' (type : str)
# 5. num_of_threads
#    default : 10 (type : int)

downloader.start()
# parameters
# 1. mod
#    default : 0 (type : int)
#    mod 0: delete the TS folder and the m3u8 file after a complete download
#    mod 1: delete only the m3u8 file after a complete download
#    mod 2: delete only the TS folder after a complete download
#    mod 3: keep the TS folder and the m3u8 file after a complete download
# 2. time_out
#    default : 60 (type : int) (units : seconds)
#    time_out is passed as the timeout in requests.get(timeout=)
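The num_of_threads parameter decides how the segment URLs are shared among worker threads. The sketch below mirrors the round-robin split that the module source performs in its constructor. The helper name `split_round_robin` and the sample segment names are hypothetical, and `ValueError` stands in for the module's own `thread_num_ERROR`.

```python
def split_round_robin(urls, num_of_threads):
    """Distribute urls over num_of_threads buckets, one URL at a time."""
    if num_of_threads <= 0:
        # The module raises thread_num_ERROR here; ValueError is a stand-in.
        raise ValueError("num_of_threads must be greater than 0")
    buckets = [[] for _ in range(num_of_threads)]
    for index, url in enumerate(urls):
        buckets[index % num_of_threads].append(url)
    return buckets


# Hypothetical segment names, only for illustration.
segments = [f"seg_{n:04d}.ts" for n in range(7)]
buckets = split_round_robin(segments, 3)
for i, bucket in enumerate(buckets):
    print(f"thread{i}", bucket)
```

Each thread receives every n-th segment, so the workload stays balanced even when the playlist length is not a multiple of the thread count.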
Before download
The structure of ./ is:
.
├── test.m3u8
└── test.py
While downloading
The structure of ./ is:
.
├── TS
│ ├── qzCFnDUZE9_720_5308_0000.ts
│ ├── qzCFnDUZE9_720_5308_0001.ts
│ ├── qzCFnDUZE9_720_5308_0002.ts
│ ├── qzCFnDUZE9_720_5308_0003.ts
│ ├── qzCFnDUZE9_720_5308_0004.ts
│ ├── qzCFnDUZE9_720_5308_0005.ts
│ ├── qzCFnDUZE9_720_5308_0006.ts
│ ├── qzCFnDUZE9_720_5308_0007.ts
│ ├── qzCFnDUZE9_720_5308_0008.ts
│ ├── qzCFnDUZE9_720_5308_0009.ts
│ └── qzCFnDUZE9_720_5308_0010.ts
├── test.m3u8
└── test.py
Progress bar: <<*>> 29% 500/1752 [01:33<04:02] <<*>>
TS is the temporary folder that holds all downloaded .ts files. Its path is %temp_file_path%/TS; in this example, that is ./TS. If the download does not complete, the m3u8 file and the TS folder are kept. To resume, create a new downloader with the same m3u8 file and the same temp_file_path, then call start(); segments already present in the TS folder are skipped and the download continues with the rest.
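As a rough illustration of the resume behaviour, the sketch below compares the segment names from a playlist with the files already in the TS folder and returns what is still missing. The helper `pending_segments` and the sample file names are hypothetical. Like the module source, this check looks only at file names, so a file left half-written by an interrupted run would still count as downloaded.

```python
import os
import tempfile


def pending_segments(names, ts_dir):
    """Return the names from the playlist that are not yet in ts_dir."""
    already = set(os.listdir(ts_dir)) if os.path.isdir(ts_dir) else set()
    return [name for name in names if name not in already]


with tempfile.TemporaryDirectory() as temp_file_path:
    ts_dir = os.path.join(temp_file_path, "TS")
    os.mkdir(ts_dir)
    # Pretend an earlier, interrupted run saved the first two segments.
    for done in ("seg_0000.ts", "seg_0001.ts"):
        with open(os.path.join(ts_dir, done), "wb") as f:
            f.write(b"data")
    playlist_names = ["seg_0000.ts", "seg_0001.ts", "seg_0002.ts", "seg_0003.ts"]
    remaining = pending_segments(playlist_names, ts_dir)

print(remaining)
```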
After a successful download
The structure of ./ is:
.
├── test.mp4
└── test.py
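The mp4 file is produced by writing the .ts segments one after another, in playlist order, into one output file. The sketch below shows that concatenation step on dummy data. The helper `merge_segments` and the file contents are hypothetical; note that the module source opens the output in append mode ('ab'), whereas this sketch overwrites it ('wb').

```python
import os
import tempfile


def merge_segments(names, ts_dir, out_path):
    """Concatenate the segment files, in the order of `names`, into out_path."""
    with open(out_path, "wb") as out:
        for name in names:
            with open(os.path.join(ts_dir, name), "rb") as ts:
                out.write(ts.read())


with tempfile.TemporaryDirectory() as work_dir:
    ts_dir = os.path.join(work_dir, "TS")
    os.mkdir(ts_dir)
    names = ["seg_0000.ts", "seg_0001.ts", "seg_0002.ts"]
    # Dummy payloads stand in for real transport-stream data.
    for name, payload in zip(names, (b"AAA", b"BBB", b"CCC")):
        with open(os.path.join(ts_dir, name), "wb") as f:
            f.write(payload)
    out_path = os.path.join(work_dir, "test.mp4")
    merge_segments(names, ts_dir, out_path)
    with open(out_path, "rb") as f:
        merged = f.read()

print(merged)
```

This is a plain byte-level concatenation; no re-encoding or container conversion takes place.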
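If a .ts segment fails to download, the module retries it and prints the outcome to the command line. In the source, each segment gets up to five attempts, although the retry messages are labeled 'Retry n/3'.
Near the end of a run, the command line looks like this: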
<<*>> 99% 1737/1752 [05:35<00:22] <<*>>
thread0 Time out ERROR qzCFnDUZE9_720_5308_1710.ts
thread2 Time out ERROR qzCFnDUZE9_720_5308_1722.ts
thread0 redownload successfully qzCFnDUZE9_720_5308_1710.ts
<<*>> 99% 1738/1752 [06:20<03:19] <<*>>
thread2 redownload successfully qzCFnDUZE9_720_5308_1722.ts
<<*>> 100% 1752/1752 [06:26<00:00] <<*>>
downloading finished 100.00%
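The retry behaviour in the log above can be sketched without any network access. Here `fetch_with_retries`, `flaky_fetch`, and `always_fail` are hypothetical helpers; the default of five attempts mirrors `range(0, 5)` in the module source, and the stand-in fetchers replace the real `requests.get` call.

```python
def fetch_with_retries(fetch, url, max_attempts=5):
    """Call fetch(url) until it succeeds or max_attempts is used up.

    Returns (data, attempts_used); data is None if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch(url), attempt
        except Exception as exc:
            print(f"attempt {attempt}/{max_attempts} failed for {url}: {exc!r}")
    return None, max_attempts


calls = {"count": 0}


def flaky_fetch(url):
    """Stand-in fetcher: times out twice, then returns some bytes."""
    calls["count"] += 1
    if calls["count"] <= 2:
        raise TimeoutError("timed out")
    return b"segment-bytes"


def always_fail(url):
    """Stand-in fetcher that never succeeds."""
    raise TimeoutError("timed out")


data, attempts = fetch_with_retries(flaky_fetch, "http://www.example.com/video/1.ts")
failed, failed_attempts = fetch_with_retries(always_fail, "http://www.example.com/video/2.ts")
print(data, attempts, failed, failed_attempts)
```

A segment whose attempts are all used up is the kind of entry the module reports under 'downloading fail:' at the end of an incomplete run.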
Restart
To resume an incomplete download, you only need the corresponding TS folder and .m3u8 file.
# A multithreaded m3u8 download module; the number of threads is configurable.
# author: walkureHHH
# last modify: 2020/06/17
import requests
from urllib.parse import urljoin
from threading import Thread
from threading import Lock
import os
import shutil
from tqdm import tqdm


class thread_num_ERROR(Exception):
    """
    Thread number error.
    Raised when the number of threads is less than or equal to 0.
    """
    pass

class mod_ERROR(Exception):
    """
    Mod error.
    Raised when the mod is not in [0,1,2,3].
    """
    pass

class m3u8_downloader:
    """
    M3u8 downloader.
    """
    temp_file_path = ''
    """@private"""
    mp4_path = ''
    """@private"""
    num_of_threads = ''
    """@private"""
    m3u8_file_path = ''
    """@private"""
    urls = []
    """@private"""
    names = []
    """@private"""
    has_download_name = []
    """@private"""
    cant_dow = []
    """@private"""
    total = 0
    """@private"""
    lock = Lock()
    """@private"""
    def __init__(self,m3u8_file_path, url_prefix=None,temp_file_path='.',mp4_path='./test.mp4',num_of_threads=10):
        """
        Initialize the m3u8 downloader.

        Parameters
        ----------
        m3u8_file_path : str
            The path of the m3u8 file.

        url_prefix : str
            The prefix of the url. Default is None.
            Some m3u8 files do not contain full urls, so you can add a prefix to each url.
            For example, the url is '/video/1.ts', and the prefix is 'http://www.example.com'.

        temp_file_path : str
            The path of the temporary folder (stores *.ts files). Default is '.'.

        mp4_path : str
            The path of the resulting mp4 file. Default is './test.mp4'.

        num_of_threads : int
            The number of threads. Default is 10.

        """
        if num_of_threads <= 0:
            raise thread_num_ERROR('the number of threads can\'t smaller than 0')
        self.mp4_path = mp4_path
        self.temp_file_path = temp_file_path
        self.num_of_threads = num_of_threads
        self.m3u8_file_path = m3u8_file_path
        if os.path.exists(self.temp_file_path+'/TS'):
            print("""warning: the temporary folder has exited\n
please comfirm the temporary folder included the fragment video you need""")
            self.has_download_name = os.listdir(self.temp_file_path+'/TS')
        else:
            os.mkdir(self.temp_file_path+'/TS')
            self.has_download_name = []
        with open(self.m3u8_file_path,'r') as m3u8:
            temp_url = [m3u8_lines.replace('\n','') for m3u8_lines in m3u8.readlines() if m3u8_lines.startswith('#')==False]
        if url_prefix != None:
            temp_url = [urljoin(url_prefix, i) for i in temp_url]
        self.total = len(temp_url)
        self.names = [i.split('/')[-1].split('?')[0] for i in temp_url]
        self.urls = [[] for j in range(0, self.num_of_threads)]
        for index, el in enumerate(temp_url):
            self.urls[index%self.num_of_threads].append(el)
        return

    def start(self,mod = 0, time_out = 60):
        """
        Start download.

        Parameters
        ----------
        mod : int
            The mod of the download. Default is 0.
            0: delete the m3u8 file and the temporary folder.
            1: delete the m3u8 file.
            2: delete the temporary folder.
            3: do nothing.

        time_out : int
            The timeout of each download request. Default is 60s.
        """
        if mod not in [0,1,2,3]:
            raise mod_ERROR('Only have mod 0 , 1 , 2 or 3')
        with tqdm(total=self.total,bar_format='<<*>> {percentage:3.0f}% {n_fmt}/{total_fmt} [{elapsed}<{remaining}] <<*>> ') as jdt:
            Threads = []
            for i in range(self.num_of_threads):
                thread = Thread(target=self.__download, args=(self.urls[i],'thread'+str(i),jdt,time_out))
                Threads.append(thread)
            for threads in Threads:
                threads.start()
            for threads in Threads:
                threads.join()
        percent = '%.02f%%'%((len(self.has_download_name)/len(self.names))*100)
        if len(self.has_download_name)==len(self.names):
            print('downloading finished',percent)
            for names in self.names:
                ts = open(self.temp_file_path+'/TS/'+names,'rb')
                with open(self.mp4_path,'ab') as mp4:
                    mp4.write(ts.read())
                ts.close()
            if mod == 0 or mod == 1:
                os.remove(self.m3u8_file_path)
            if mod == 0 or mod == 2:
                shutil.rmtree(self.temp_file_path+'/TS')
        else:
            print('----------------------------------------------------------------')
            for cantdow_urls in self.cant_dow:
                print('downloading fail:',cantdow_urls)
            print('incomplete downloading',percent)

    def __download(self, download_list, thread_name, jdt, time_out):
        for urls in download_list:
            if urls.split('/')[-1].split('?')[0] not in self.has_download_name:
                for i in range(0,5):
                    try:
                        conn = requests.get(urls,timeout=time_out)
                        if conn.status_code == 200:
                            with open(self.temp_file_path+'/TS/'+urls.split('/')[-1].split('?')[0],'wb') as ts:
                                ts.write(conn.content)
                            with self.lock:
                                if i != 0:
                                    print('\n'+thread_name,'redownload successfully',urls.split('/')[-1].split('?')[0])
                                self.has_download_name.append(urls.split('/')[-1].split('?')[0])
                                jdt.update(1)
                            break
                        else:
                            with self.lock:
                                if i == 0:
                                    print('\n'+thread_name,conn.status_code,urls.split('/')[-1].split('?')[0],'begin retry 1')
                                else:
                                    print('\n'+thread_name,conn.status_code,urls.split('/')[-1].split('?')[0],'Retry '+ str(i) +'/3')
                                if i == 4:
                                    self.cant_dow.append(urls)
                    except:
                        with self.lock:
                            if i == 0:
                                print('\n'+thread_name,'Time out ERROR',urls.split('/')[-1].split('?')[0],'begin retry 1')
                            else:
                                print('\n'+thread_name,'Time out ERROR',urls.split('/')[-1].split('?')[0],'Retry '+ str(i) +'/3')
                            if i == 4:
                                self.cant_dow.append(urls)
            else:
                with self.lock:
                    jdt.update(1)

if __name__ == "__main__":
    a = m3u8_downloader('/mnt/c/Users/kylis/Downloads/r.m3u8',temp_file_path='.',mp4_path='./1.mp4', num_of_threads=17)
    a.start()
class thread_num_ERROR(Exception):
Thread number error. Raised when the number of threads is less than or equal to 0.
class mod_ERROR(Exception):
Mod error. Raised when mod is not one of 0, 1, 2, or 3.
class m3u8_downloader:
M3u8 downloader.
def __init__(self, m3u8_file_path, url_prefix=None, temp_file_path='.', mp4_path='./test.mp4', num_of_threads=10):
Initialize the m3u8 downloader.
Parameters
m3u8_file_path : str
    The path of the m3u8 file.
url_prefix : str
    The prefix to join with each segment URL. Default is None. Some m3u8 files list relative URLs instead of full ones; pass a prefix so they can be resolved. For example, if a URL in the file is '/video/1.ts', the prefix could be 'http://www.example.com'.
temp_file_path : str
    The path of the folder in which the temporary TS folder (holding the *.ts files) is created. Default is '.'.
mp4_path : str
    The path of the resulting mp4 file. Default is './test.mp4'.
num_of_threads : int
    The number of download threads. Must be greater than 0. Default is 10.
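To make the role of url_prefix concrete, the sketch below reads the non-comment lines of a small playlist, joins them with a prefix using `urllib.parse.urljoin`, and derives the local file names by dropping the path and the query string, as the module source does. The helper `parse_playlist` and the playlist content are hypothetical; skipping blank lines is an addition that the module source does not make.

```python
from urllib.parse import urljoin


def parse_playlist(text, url_prefix=None):
    """Return (urls, names) for the segment lines of an m3u8 playlist."""
    lines = [line.strip() for line in text.splitlines()]
    # Lines starting with '#' are tags or comments, not segment URLs.
    urls = [line for line in lines if line and not line.startswith("#")]
    if url_prefix is not None:
        urls = [urljoin(url_prefix, u) for u in urls]
    # File name = last path component without the query string.
    names = [u.split("/")[-1].split("?")[0] for u in urls]
    return urls, names


# Hypothetical playlist content, only for illustration.
playlist = """#EXTM3U
#EXTINF:10.0,
/video/1.ts?token=abc
#EXTINF:10.0,
/video/2.ts?token=abc
#EXT-X-ENDLIST
"""

urls, names = parse_playlist(playlist, url_prefix="http://www.example.com")
print(urls)
print(names)
```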
def start(self, mod=0, time_out=60):
Start download.
Parameters
mod : int
    The cleanup mode applied after a complete download. Default is 0.
    0: delete the m3u8 file and the temporary folder.
    1: delete the m3u8 file only.
    2: delete the temporary folder only.
    3: delete nothing.
time_out : int
    The timeout, in seconds, of each segment request. Default is 60.
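The four mod values can be read as two independent switches: whether to delete the m3u8 file and whether to delete the TS folder. The sketch below expresses that mapping as a lookup table. The names `CLEANUP` and `cleanup_plan` are hypothetical, and `ValueError` stands in for the module's own `mod_ERROR`.

```python
# mod -> (delete the m3u8 file, delete the TS folder)
CLEANUP = {
    0: (True, True),
    1: (True, False),
    2: (False, True),
    3: (False, False),
}


def cleanup_plan(mod):
    """Describe what is deleted after a complete download for a given mod."""
    if mod not in CLEANUP:
        # The module raises mod_ERROR here; ValueError is a stand-in.
        raise ValueError("mod must be 0, 1, 2 or 3")
    delete_m3u8, delete_ts = CLEANUP[mod]
    return {"delete_m3u8": delete_m3u8, "delete_ts_folder": delete_ts}


for mod in range(4):
    print(mod, cleanup_plan(mod))
```

Cleanup only happens when every segment was downloaded; after an incomplete run both the m3u8 file and the TS folder are kept regardless of mod, so the download can be resumed.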