module xmlhelper.html_parser_json
Short summary
module pyrsslocal.xmlhelper.html_parser_json
Parses HTML and converts it into JSON.
Classes
| class | truncated documentation |
|---|---|
| HTMLtoJSONParser | Parses HTML and outputs a JSON structure. |
Functions
| function | truncated documentation |
|---|---|
| iterate_on_json | Iterates over every field contained in the JSON structure. |
Properties
| property | truncated documentation |
|---|---|
| json | Returns the JSON structure. |
Static Methods
| staticmethod | truncated documentation |
|---|---|
| iterate | Iterates over every field contained in the JSON structure. |
| to_json | |
Methods
| method | truncated documentation |
|---|---|
| clean | Cleans a dictionary of values. |
| handle_data | What to do with data. |
| handle_endtag | What to do for the end of a tag. |
| handle_starttag | What to do for a new tag. |
Documentation
- class pyrsslocal.xmlhelper.html_parser_json.HTMLtoJSONParser(raise_exception=True)
Bases: HTMLParser
Parses HTML and outputs a JSON structure. Example:

    file = ...
    with open(file, "r", encoding="utf8") as f:
        content = f.read()
    parser = HTMLtoJSONParser()
    parser.feed(content)
    js = parser.json

Or:

    js = HTMLtoJSONParser.to_json(content)

To iterate over the paths:

    pairs = [(k, v) for k, v in HTMLtoJSONParser.iterate(js)]
- Parameters:
raise_exception – if True, raises an exception when the HTML is malformed; otherwise the parser does what it can
- __init__(raise_exception=True)
- Parameters:
raise_exception – if True, raises an exception when the HTML is malformed; otherwise the parser does what it can
- clean(values)
Cleans a dictionary of values.
- handle_data(data)
What to do with data.
- handle_endtag(tag)
What to do for the end of a tag.
- handle_starttag(tag, attrs)
What to do for a new tag.
- static iterate(json_structure, prefix='', keep_dictionaries=False, skip=['__parent__'])
Iterates over every field contained in the JSON structure.
- Parameters:
json_structure – json structure
prefix – prefix to add
keep_dictionaries – if True, also yields (k, v) pairs where v is a JSON dictionary
skip – tags (keys) that the iteration does not descend into
- Returns:
iterator of (path, value)
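The handle_starttag, handle_data and handle_endtag callbacks above are the standard hooks of Python's html.parser.HTMLParser. As a rough illustration of how such hooks can build a nested structure, here is a minimal sketch using only the standard library. The class name TinyHTMLToDict, the "tag"/"attrs"/"children" layout and the exact effect of raise_exception are assumptions made for the example; they are not taken from the pyrsslocal source, whose JSON layout may differ.

```python
from html.parser import HTMLParser


class TinyHTMLToDict(HTMLParser):
    """Hypothetical sketch, not the pyrsslocal implementation."""

    def __init__(self, raise_exception=True):
        super().__init__()
        self.raise_exception = raise_exception
        # Assumed layout: every element is a dict with tag, attrs, children.
        self.root = {"tag": "root", "attrs": {}, "children": []}
        self.stack = [self.root]  # currently open elements, innermost last

    def handle_starttag(self, tag, attrs):
        # A new tag becomes a child of the innermost open element.
        node = {"tag": tag, "attrs": dict(attrs), "children": []}
        self.stack[-1]["children"].append(node)
        self.stack.append(node)

    def handle_data(self, data):
        # Text is attached to the innermost open element; blank runs are dropped.
        text = data.strip()
        if text:
            self.stack[-1]["children"].append(text)

    def handle_endtag(self, tag):
        # A closing tag pops the innermost element when the names match.
        if len(self.stack) > 1 and self.stack[-1]["tag"] == tag:
            self.stack.pop()
        elif self.raise_exception:
            raise ValueError(f"unexpected closing tag </{tag}>")
        # With raise_exception=False the stray closing tag is ignored.


parser = TinyHTMLToDict()
parser.feed('<div class="a"><p>hello</p><p>world</p></div>')
tree = parser.root
```

This sketch keeps void elements such as a bare `<br>` open until a matching closing tag appears, so a real parser would need extra handling for them.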
- pyrsslocal.xmlhelper.html_parser_json.iterate_on_json(json_structure, prefix='', keep_dictionaries=False, skip=['__parent__'])
Iterates over every field contained in the JSON structure.
- Parameters:
json_structure – json structure
prefix – prefix to add
keep_dictionaries – if True, also yields (k, v) pairs where v is a JSON dictionary
skip – tags (keys) that the iteration does not descend into
- Returns:
iterator of (path, value)
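The parameters above describe a recursive walk that yields (path, value) pairs. The following is a minimal sketch of that idea, not the library's code: the function name iterate_paths, the "/" path separator and the handling of lists by index are all assumptions made for illustration.

```python
def iterate_paths(structure, prefix="", keep_dictionaries=False,
                  skip=("__parent__",)):
    """Hypothetical sketch: yield (path, value) for each field of a nested structure."""
    if isinstance(structure, dict):
        if keep_dictionaries:
            # Optionally report the dictionary itself before its fields.
            yield prefix, structure
        for key, value in structure.items():
            if key in skip:
                continue  # do not descend into skipped keys
            yield from iterate_paths(value, f"{prefix}/{key}",
                                     keep_dictionaries, skip)
    elif isinstance(structure, list):
        # Assumption: list items are addressed by their index.
        for index, value in enumerate(structure):
            yield from iterate_paths(value, f"{prefix}/{index}",
                                     keep_dictionaries, skip)
    else:
        # Leaf value: report it with the path that leads to it.
        yield prefix, structure


doc = {"html": {"body": {"p": ["hello", "world"]}, "__parent__": None}}
pairs = list(iterate_paths(doc))
# pairs == [("/html/body/p/0", "hello"), ("/html/body/p/1", "world")]
```

The sketch uses a tuple as the default for skip so the default is never mutated between calls; the documented signature uses a list.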