Downloader

class parfive.Downloader(max_conn=5, progress=True, file_progress=True, loop=None, notebook=None, overwrite=False, headers=None, use_aiofiles=False)

Bases: object

Download files in parallel.

Parameters
  • max_conn (int, optional) – The number of parallel download slots.

  • progress (bool, optional) – If True, show a main progress bar tracking how many of the total files have been downloaded. If False, no progress bars are shown at all.

  • file_progress (bool, optional) – If True and progress is True, show max_conn progress bars detailing the progress of each individual file being downloaded.

  • loop (asyncio.AbstractEventLoop, optional) – No longer used, and will be removed in a future release.

  • notebook (bool, optional) – If True, tqdm will be used in notebook mode. If None, an attempt will be made to detect the notebook and guess which progress bar to use.

  • overwrite (bool or str, optional) – Determines what happens when a file with the same name already exists. If False, the download is skipped and the path to the existing file is returned. If True, the file is downloaded and the existing file is overwritten. If 'unique', the filename is modified to be unique.

  • headers (dict, optional) – Request headers to be passed to the server. If not passed explicitly, User-Agent information about parfive, aiohttp and Python is added.
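As a rough illustration of the constructor arguments above, the following sketch builds a Downloader with a few non-default options. The option values, the EXAMPLE_OPTIONS name, the make_downloader helper and the User-Agent string are hypothetical choices made for this example, not recommendations from parfive.

```python
# Hypothetical option set for illustration; every key is a constructor
# parameter documented above.
EXAMPLE_OPTIONS = {
    "max_conn": 3,            # three parallel download slots
    "progress": True,         # show the main progress bar
    "file_progress": False,   # hide the per-file progress bars
    "overwrite": "unique",    # rename instead of skipping or overwriting
    "headers": {"User-Agent": "my-tool/0.1"},  # assumed header value
}


def make_downloader(options=EXAMPLE_OPTIONS):
    """Return a parfive.Downloader configured from a dict of options."""
    # Imported lazily so the options can be inspected without parfive installed.
    from parfive import Downloader

    return Downloader(**options)
```

Because headers is passed explicitly here, the default User-Agent information described above would presumably not be added.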

Attributes Summary

default_chunk_size

aiofiles requires a different default chunk size

queued_downloads

The total number of files already queued for download.

use_aiofiles

aiofiles will be used if installed and must be explicitly enabled

Methods Summary

download([timeouts])

Download all files in the queue.

enqueue_file(url[, path, filename, overwrite])

Add a file to the download queue.

retry(results)

Retry any failed downloads in a results object.

run_download([timeouts])

Download all files in the queue.

simple_download(urls, *[, path, overwrite])

Download a series of URLs to a single destination.

Attributes Documentation

default_chunk_size

aiofiles requires a different default chunk size

queued_downloads

The total number of files already queued for download.

use_aiofiles

aiofiles will be used if installed and must be explicitly enabled

The PARFIVE_OVERWRITE_ENABLE_AIOFILES environment variable takes precedence if present. aiofiles will not be used if it is not installed. Finally, the Downloader’s constructor argument is considered.

Methods Documentation

download(timeouts=None)

Download all files in the queue.

Parameters

timeouts (dict, optional) – Overrides for the default timeouts for HTTP downloads. Supported keys are any accepted by the aiohttp.ClientTimeout class. Defaults to no total session timeout (overriding the aiohttp 5-minute default) and a 90-second socket read timeout.

Returns

parfive.Results – A list of files downloaded.

Notes

This is a synchronous version of run_download. An asyncio event loop will be created to run the download (in its own thread if a loop is already running).

The defaults for the 'total' and 'sock_read' timeouts can be overridden by the environment variables PARFIVE_TOTAL_TIMEOUT and PARFIVE_SOCK_READ_TIMEOUT, respectively.
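The two ways of adjusting timeouts described above might be sketched as follows. The helper names and the numeric values are hypothetical, and treating the environment variable values as seconds written as strings is an assumption; downloader stands for an existing Downloader with files already queued.

```python
import os


def download_with_timeouts(downloader, total=600, sock_read=120):
    """Run a blocking download with per-call timeout overrides."""
    # 'total' and 'sock_read' are keys accepted by aiohttp.ClientTimeout.
    timeouts = {"total": total, "sock_read": sock_read}
    return downloader.download(timeouts=timeouts)


def set_default_timeouts(total, sock_read, environ=os.environ):
    """Change the defaults through the environment instead.

    Assumption: values are given in seconds and stored as strings.
    """
    environ["PARFIVE_TOTAL_TIMEOUT"] = str(total)
    environ["PARFIVE_SOCK_READ_TIMEOUT"] = str(sock_read)
```

The environment variables would presumably need to be set before the download starts; the timeouts argument applies only to the call it is passed to.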

enqueue_file(url, path=None, filename=None, overwrite=None, **kwargs)

Add a file to the download queue.

Parameters
  • url (str) – The URL to retrieve.

  • path (str, optional) – The directory to retrieve the file into. If None, defaults to the current directory.

  • filename (str or callable, optional) – The filename to save the file as. Can also be a callable which takes two arguments, the URL and the response object from opening that URL, and returns the filename. (Note: for FTP downloads the response will be None.) If None, the HTTP headers will be read for the filename, or the last segment of the URL will be used.

  • overwrite (bool or str, optional) – Determines what happens when a file with the same name already exists. If False, the download is skipped and the path to the existing file is returned. If True, the file is downloaded and the existing file is overwritten. If 'unique', the filename is modified to be unique. If None, the value set when constructing the Downloader object is used.

  • kwargs (dict) – Extra keyword arguments are passed to aiohttp.ClientSession.get or aioftp.Client.context depending on the protocol.

Notes

The proxy URL is read from the environment variable HTTP_PROXY or HTTPS_PROXY, depending on the protocol of the URL passed. Proxy authentication (proxy_auth) should be passed as an aiohttp.BasicAuth object. Proxy headers (proxy_headers) should be passed as a dict.
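A minimal sketch of a filename callable combined with per-file overwrite handling is shown below. The names prefixed_name and queue_with_custom_names, the "mirror_" prefix and the default directory are hypothetical. Whether the value returned by the callable is combined with path is not stated above, so that behaviour should be checked against the installed parfive version.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def prefixed_name(url, resp):
    """Filename callable: receives the URL and the response (None for FTP)."""
    # Use the last segment of the URL path, with a fallback for bare hosts.
    last_segment = PurePosixPath(urlparse(str(url)).path).name or "download"
    return f"mirror_{last_segment}"


def queue_with_custom_names(downloader, urls, directory="./downloads"):
    """Enqueue each URL with a computed filename and 'unique' overwrite handling."""
    for url in urls:
        downloader.enqueue_file(
            url,
            path=directory,
            filename=prefixed_name,
            overwrite="unique",  # overrides the value given to the constructor
        )
    return downloader.queued_downloads
```

The response argument is ignored here, which keeps the callable usable for FTP downloads, where it is None.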

retry(results)

Retry any failed downloads in a results object.

Note

This will start a new event loop.

Parameters

results (parfive.Results) – A previous results object. Its .errors property is read and the failed downloads are retried.

Returns

parfive.Results – A modified version of the input results with all the errors from this download attempt and any new files appended to the list of file paths.
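One possible pattern for using retry is a bounded loop that stops once .errors is empty. The helper name download_with_retry and the max_retries limit are hypothetical and not part of parfive; downloader stands for a Downloader with files already queued.

```python
def download_with_retry(downloader, max_retries=3):
    """Download the queue, then retry failed files up to max_retries times."""
    results = downloader.download()
    attempts = 0
    # results.errors holds the failed downloads; stop when it is empty
    # or when the retry budget is used up.
    while results.errors and attempts < max_retries:
        results = downloader.retry(results)
        attempts += 1
    return results
```

A caller can still inspect results.errors afterwards to see which downloads failed on every attempt.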

async run_download(timeouts=None)

Download all files in the queue.

Parameters

timeouts (dict, optional) – Overrides for the default timeouts for HTTP downloads. Supported keys are any accepted by the aiohttp.ClientTimeout class. Defaults to no total session timeout (overriding the aiohttp 5-minute default) and a 90-second socket read timeout.

Returns

parfive.Results – A list of files downloaded.

Notes

The defaults for the 'total' and 'sock_read' timeouts can be overridden by the environment variables PARFIVE_TOTAL_TIMEOUT and PARFIVE_SOCK_READ_TIMEOUT, respectively.
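Since run_download is a coroutine, it can be awaited from code that already runs inside an event loop. The wrapper below is a hypothetical sketch; downloader stands for a Downloader with files already queued.

```python
import asyncio


async def download_inside_coroutine(downloader, timeouts=None):
    """Await the queued downloads on the event loop that is already running."""
    results = await downloader.run_download(timeouts=timeouts)
    return results
```

From synchronous code this could be driven with asyncio.run(download_inside_coroutine(dl)), although download() already covers that case.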

classmethod simple_download(urls, *, path='./', overwrite=None)

Download a series of URLs to a single destination.

Parameters
  • urls (iterable) – A sequence of URLs to download.

  • path (pathlib.Path, optional) – The destination directory for the downloaded files. Defaults to the current directory.

  • overwrite (bool, optional) – Overwrite the files at the destination directory. If False the URL will not be downloaded if a file with the corresponding filename already exists.

Returns

parfive.Results – A list of files downloaded.
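For a one-off batch, the classmethod can be called without constructing a Downloader first. In this sketch the URLs point at example.com as placeholders, and the helper names and destination directory are hypothetical.

```python
from pathlib import Path


def example_urls(base="https://example.com/data", count=3):
    """Build a few placeholder URLs (example.com is not a real data server)."""
    return [f"{base}/file{i}.txt" for i in range(1, count + 1)]


def fetch_to_directory(urls, destination="./downloads"):
    """Download every URL into one directory using the classmethod."""
    # Imported lazily so example_urls can be used without parfive installed.
    from parfive import Downloader

    # path and overwrite are keyword-only in the signature above.
    return Downloader.simple_download(urls, path=Path(destination), overwrite=False)
```

With overwrite=False, URLs whose target filename already exists in the destination are not downloaded again.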