module rss.rss_stream

Short summary

module pyrsslocal.rss.rss_stream

Description of an RSS stream.

source on GitHub

Classes

  • StreamRSS – Requires feedparser. Description of an RSS stream.

Properties

  • asdict – Returns all members as a dictionary.

  • asrow – Returns all the values as a row (following the schema given by schema_database()).

  • index – Defines the column to use as an index.

  • schema_database – Returns all member names and types as a dictionary.

  • stat_nb – Returns the statistics nb: self.stat.get("nb", 0).

Static Methods

  • enumerate_post_from_rsslist – Enumerates all posts found in all rss_streams given as a list.

  • enumerate_stream_from_google_list – Retrieves the list of RSS streams from a dump made with Google Reader.

  • fill_table – Fills a table of a database; if the table does not exist, it is created.

  • schema_database_read – Returns all member names and types as a dictionary.

Methods

  • __init__

  • __lt__ – Comparison operator.

  • __str__ – Usual string conversion.

  • enumerate_post – Parses an RSS stream.

  • html – Displays the blogs in HTML format.

Documentation

Description of an RSS stream.

class pyrsslocal.rss.rss_stream.StreamRSS(titleb, type, xmlUrl, htmlUrl, keywordsb, id=-1, nb=None)

Bases: object

Requires feedparser. Description of an RSS stream.

<outline text="Freakonometrics" title="Freakonometrics"
     type="rss"
     xmlUrl="http://freakonometrics.hypotheses.org/feed"
     htmlUrl="http://freakonometrics.hypotheses.org" />

Attributes:
  • titleb – title of the stream

  • type – type

  • xmlUrl – url of the rss stream

  • htmlUrl – main page of the blog

  • keywordsb – list of keywords

Parameters:
  • titleb – title of the stream

  • type – type

  • xmlUrl – url of the rss stream

  • htmlUrl – main page of the blog

  • keywordsb – keywords

  • id – an id

  • nb – not included in the database; part of the statistics, which can be added if they are not None

__init__(titleb, type, xmlUrl, htmlUrl, keywordsb, id=-1, nb=None)
Parameters:
  • titleb – title of the stream

  • type – type

  • xmlUrl – url of the rss stream

  • htmlUrl – main page of the blog

  • keywordsb – keywords

  • id – an id

  • nb – not included in the database; part of the statistics, which can be added if they are not None

__lt__(o)

Comparison operator (less than).

__str__()

Usual string conversion.

property asdict

Returns all members as a dictionary.

Returns:

dictionary

property asrow

Returns all the values as a row (following the schema given by schema_database()).

Returns:

list of values
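
The link between schema_database, asdict and asrow can be pictured with a small stand-in class. This is only a sketch under stated assumptions: MiniStream, its SCHEMA layout (column position mapped to a name and a type) and the column order are hypothetical and are not taken from the library.

```python
class MiniStream:
    """Hypothetical stand-in for StreamRSS, used only for illustration."""

    # Assumed schema layout: column position -> (column name, Python type).
    SCHEMA = {
        0: ("id", int),
        1: ("titleb", str),
        2: ("type", str),
        3: ("xmlUrl", str),
        4: ("htmlUrl", str),
        5: ("keywordsb", str),  # keywords kept as one string (assumption)
    }

    def __init__(self, titleb, type, xmlUrl, htmlUrl, keywordsb, id=-1):
        self.titleb = titleb
        self.type = type
        self.xmlUrl = xmlUrl
        self.htmlUrl = htmlUrl
        self.keywordsb = keywordsb
        self.id = id

    @property
    def asdict(self):
        # All members as a dictionary, keyed by column name.
        return {name: getattr(self, name) for name, _ in self.SCHEMA.values()}

    @property
    def asrow(self):
        # Values listed in the column order given by the schema.
        return [getattr(self, self.SCHEMA[i][0]) for i in sorted(self.SCHEMA)]


stream = MiniStream("Freakonometrics", "rss",
                    "http://freakonometrics.hypotheses.org/feed",
                    "http://freakonometrics.hypotheses.org",
                    "statistics", id=1)
row = stream.asrow
```

Keeping the column order in one place means the row and the table definition cannot drift apart, which is presumably why asrow is documented as following schema_database().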

enumerate_post(path=None, fLOG=None)

Parses an RSS stream.

Parameters:
  • path – if None, uses self.xmlUrl; otherwise uses this path (URL or local file)

  • fLOG – logging function

Returns:

list of BlogPost

We expect the format to be:

{'summary_detail':
     {'base': '',
      'value': "<p> J'ai encore perdu des ... </p>",
      'language': None,
      'type': 'text/html'},
 'title_detail':
     {'base': '',
      'value': 'Installer pip pour Python',
      'language': None,
      'type': 'text/plain'},
 'published': '2013-06-24 00:00:00',
 'published_parsed': time.struct_time(tm_year=2013, tm_mon=6, tm_mday=24,
                                      tm_hour=0, tm_min=0, tm_sec=0,
                                      tm_wday=0, tm_yday=175, tm_isdst=0),
 'link': 'http://www.xavierdupre.fr/blog/xd_blog.html?date=2013-06-24',
 'summary': "<p> J'ai encore perdu de... </p>",
 'guidislink': False,
 'title': 'Installer pip pour Python',
 'links': [{'href': 'http://www.xavierdupre.fr/blog/xd_blog.html?date=2013-06-24',
            'rel': 'alternate', 'type': 'text/html'}],
 'id': 'http://www.xavierdupre.fr/blog/xd_blog.html?date=2013-06-24'}

If an entry has no date, the function uses today's date (assuming you fetch posts from this blog every day). If the id is not present, the guid is the URL; otherwise, it is the id.
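
The two fallback rules can be sketched on a plain dictionary shaped like the entry shown earlier. The helper name normalize_entry, its today parameter and the returned keys are assumptions made for this illustration; they are not part of the module.

```python
import datetime


def normalize_entry(entry, today=None):
    """Hypothetical helper: apply the date and guid fallbacks to one entry."""
    if today is None:
        today = datetime.datetime.now()
    # Fall back to today's date when the entry carries no publication date.
    published = entry.get("published") or today.strftime("%Y-%m-%d %H:%M:%S")
    # Use the id as guid when present, otherwise the URL of the post.
    guid = entry.get("id") or entry.get("link")
    return {
        "title": entry.get("title", ""),
        "link": entry.get("link"),
        "published": published,
        "guid": guid,
    }


entry = {
    "title": "Installer pip pour Python",
    "link": "http://www.xavierdupre.fr/blog/xd_blog.html?date=2013-06-24",
}
post = normalize_entry(entry, today=datetime.datetime(2013, 6, 25))
```

Passing today explicitly keeps the sketch deterministic; the documented method presumably reads the current date itself.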

static enumerate_post_from_rsslist(list_rss_stream, fLOG=None)

Enumerates all posts found in all rss_streams given as a list.

Parameters:
  • list_rss_stream – list of rss streams

  • fLOG – logging function

Returns:

enumeration of blog posts
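
Enumerating posts over a list of streams amounts to chaining the per-stream enumerations. The sketch below shows that idea with a generator; enumerate_posts and FakeStream are hypothetical names, and the real method may log or handle errors differently.

```python
def enumerate_posts(streams, fLOG=None):
    """Hypothetical sketch: yield the posts of every stream in turn."""
    for stream in streams:
        if fLOG is not None:
            fLOG("reading", stream.xmlUrl)
        # Each stream is expected to expose enumerate_post(), as StreamRSS does.
        for post in stream.enumerate_post():
            yield post


class FakeStream:
    """Stand-in that returns canned posts instead of fetching a feed."""

    def __init__(self, xmlUrl, posts):
        self.xmlUrl = xmlUrl
        self.posts = posts

    def enumerate_post(self, path=None, fLOG=None):
        return list(self.posts)


logged = []
streams = [FakeStream("feed-a", ["a1", "a2"]), FakeStream("feed-b", ["b1"])]
posts = list(enumerate_posts(streams, fLOG=lambda *args: logged.append(args)))
```

Because the function is a generator, nothing is read until the caller iterates, which suits long lists of feeds.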

static enumerate_stream_from_google_list(file, encoding='utf8', fLOG=None)

Retrieves the list of RSS streams from a dump made with Google Reader.

Parameters:
  • file – filename

  • encoding – encoding

  • fLOG – logging function

Returns:

list of StreamRSS

The format is the following:

An entry in the XML config file

<outline text="Freakonometrics"
     title="Freakonometrics"
     type="rss"
     xmlUrl="http://freakonometrics.hypotheses.org/feed"
     htmlUrl="http://freakonometrics.hypotheses.org" />
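
Reading such entries can be sketched with the standard library. In this illustration the surrounding opml and body elements are an assumption about the layout of the dump, and read_outlines is a hypothetical helper; only the outline entry itself comes from the documentation above.

```python
import xml.etree.ElementTree as ET

# Assumed wrapper elements around the documented outline entry.
OPML = """<opml version="1.0">
  <body>
    <outline text="Freakonometrics"
         title="Freakonometrics"
         type="rss"
         xmlUrl="http://freakonometrics.hypotheses.org/feed"
         htmlUrl="http://freakonometrics.hypotheses.org" />
  </body>
</opml>"""


def read_outlines(xml_text):
    """Yield the attributes of every outline element that points to a feed."""
    root = ET.fromstring(xml_text)
    for node in root.iter("outline"):
        # Outlines without xmlUrl are folders, not feeds; skip them.
        if node.get("xmlUrl"):
            yield {
                "titleb": node.get("title"),
                "type": node.get("type"),
                "xmlUrl": node.get("xmlUrl"),
                "htmlUrl": node.get("htmlUrl"),
            }


feeds = list(read_outlines(OPML))
```

Each dictionary carries the values that the StreamRSS constructor expects for titleb, type, xmlUrl and htmlUrl.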

static fill_table(db, tablename, iterator_on)

Fills a table of a database; if the table does not exist, it is created.

Parameters:
  • db – database object (Database)

  • tablename – name of a table (created if it does not exist)

  • iterator_on – iterator on StreamRSS objects

Example:

# file: path to the Google Reader dump; db: an existing Database object
res = list(StreamRSS.enumerate_stream_from_google_list(file))
StreamRSS.fill_table(db, "blogs", res)
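
The "create the table if missing, then insert" behaviour can be sketched with sqlite3 from the standard library. This is not the Database object used by the module: the function below, the column list and the use of plain tuples instead of StreamRSS objects are all assumptions made for the illustration.

```python
import sqlite3


def fill_table(conn, tablename, rows):
    """Hypothetical sketch: create the table if needed, then insert rows."""
    # tablename is interpolated directly, so it must come from trusted code.
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {tablename} "
        "(id INTEGER, titleb TEXT, type TEXT, xmlUrl TEXT, htmlUrl TEXT)"
    )
    conn.executemany(
        f"INSERT INTO {tablename} VALUES (?, ?, ?, ?, ?)", rows
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
rows = [(1, "Freakonometrics", "rss",
         "http://freakonometrics.hypotheses.org/feed",
         "http://freakonometrics.hypotheses.org")]
fill_table(conn, "blogs", rows)
fill_table(conn, "blogs", rows)  # second call reuses the existing table
count = conn.execute("SELECT COUNT(*) FROM blogs").fetchone()[0]
```

Calling the function twice shows that the table is created only once; the second call simply appends rows.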

html(template=None, action='{0.htmlUrl}', style='blogtitle', addlog=True)

Displays the blogs in HTML format. The template contains two kinds of information:

  • {0.member} – this string will be replaced by the member

Parameters:
  • template – HTML template; if not None, it can also be the name of one of the default templates: default, default_stat

  • action – url to use when clicking on a blog

  • style – style of the paragraph containing the url

  • addlog – if True, the URL is prefixed with /logs/click/ so that clicks are logged

Returns:

html string

If the template is None, it is replaced by a default value (see the code and the variable template).
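
The {0.member} placeholders follow the attribute-access syntax of Python's str.format. The sketch below shows how such a template could be filled; the Blog class, the render function, the url and style keyword fields and the exact way the /logs/click/ prefix is joined are assumptions of this example, not the module's code.

```python
class Blog:
    """Hypothetical stand-in exposing the members used by the template."""

    def __init__(self, titleb, htmlUrl, id):
        self.titleb = titleb
        self.htmlUrl = htmlUrl
        self.id = id


def render(blog, template, action="{0.htmlUrl}", style="blogtitle", addlog=True):
    # Resolve the action first: it is itself a format string such as {0.htmlUrl}.
    url = action.format(blog)
    if addlog:
        # Assumed URL layout; only the /logs/click/ prefix is documented.
        url = "/logs/click/" + url
    # The extra keyword fields (url, style) are an assumption of this sketch.
    return template.format(blog, url=url, style=style)


template = '<p class="{style}"><a href="{url}">{0.titleb}</a></p>'
blog = Blog("Freakonometrics", "http://freakonometrics.hypotheses.org", 1)
page = render(blog, template, addlog=False)
logged_page = render(blog, template)
```

Since the placeholders are resolved by attribute lookup, any member of the object can appear in a custom template without changing the rendering code.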

property index

Defines the column to use as an index.

property schema_database

Returns all member names and types as a dictionary.

Returns:

dictionary

static schema_database_read()

Returns all member names and types as a dictionary.

Returns:

dictionary

property stat_nb

Returns the statistics nb: self.stat.get("nb", 0).

Returns:

number
