document.javascript Module

class wpull.document.javascript.JavaScriptReader

Bases: wpull.document.base.BaseDocumentDetector, wpull.document.base.BaseTextStreamReader

JavaScript Document Reader.

BUFFER_SIZE = 1048576
STREAM_REWIND = 4096
URL_PATTERN = '(\\\\{0,8}[\'"])(https?://[^\'"]{1,500}|[^\\s\'"]{1,500})(?:\\1)'
URL_REGEX = re.compile('(\\\\{0,8}[\'"])(https?://[^\'"]{1,500}|[^\\s\'"]{1,500})(?:\\1)')
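The pattern above can be tried on its own with the standard `re` module. The following is a minimal sketch, not the reader's actual implementation: the helper name `find_quoted_candidates` and the sample string are hypothetical, and only the pattern itself is taken from the listing. Group 1 captures the opening quote (with up to eight preceding backslashes), group 2 captures the quoted text, and the trailing back-reference requires the same quote sequence to close it.

```python
import re

# Pattern copied verbatim from the URL_PATTERN attribute listed above.
URL_PATTERN = '(\\\\{0,8}[\'"])(https?://[^\'"]{1,500}|[^\\s\'"]{1,500})(?:\\1)'
URL_REGEX = re.compile(URL_PATTERN)


def find_quoted_candidates(text):
    """Return group 2 of every match in text (hypothetical helper, not part of wpull)."""
    return [match.group(2) for match in URL_REGEX.finditer(text)]


# Hypothetical JavaScript snippet with one absolute and one relative reference.
sample = 'var a = "http://example.com/a.js"; var b = \'img/logo.png\';'
print(find_quoted_candidates(sample))
```

Note that the second alternative accepts any quoted run of non-whitespace characters, so matches are only candidates; deciding which ones are really URLs is left to the caller.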
classmethod is_file(file)

Return whether the file is likely to be JS.

classmethod is_request(request)

Return whether the document is likely to be JS.

classmethod is_response(response)

Return whether the document is likely to be JS.

classmethod is_url(url_info)

Return whether the document is likely to be JS.

iter_text(file, encoding=None)

Return an iterator of links found in the document.

Parameters:
  • file – A file object containing the document.
  • encoding (str) – The encoding of the document.
Returns: str
Return type: iterable