Wordgraph Top Level¶
The high level interface for Wordgraph is the wordgraph.describe function.
- wordgraph.describe(data, source=None, language='English', demographic='summary')¶
Describe the supplied graph object, together with a hint about the source of that object.
Returns: None if no description text was generated for the graph.
- Supported sources include:
- – Raw data (graphite-style)
- Unsupported sources include:
- – graphite full integration
- – matplotlib (under development)
- – dot, networkx?
- – json, text?
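A minimal usage sketch of the top-level call. The payload shape below imitates graphite's JSON render output; the exact field names are an assumption for illustration, not a confirmed schema.

```python
# Hypothetical call to wordgraph.describe with graphite-style data.
# The import is guarded so the sketch is runnable even where the
# wordgraph package is not installed.
try:
    import wordgraph
    HAVE_WORDGRAPH = True
except ImportError:
    HAVE_WORDGRAPH = False

# Assumed graphite-style payload: a list of series, each with a name
# and (value, timestamp) datapoints.
graphite_data = {
    "targets": [
        {
            "target": "server.cpu.load",
            "datapoints": [[0.5, 1400000000], [0.7, 1400000060]],
        }
    ],
}

if HAVE_WORDGRAPH:
    text = wordgraph.describe(graphite_data, source="graphite")
    if text is None:
        print("No description could be generated for this graph.")
    else:
        print(text)
```

Note that describe returns None when no description text could be generated, so callers should handle that case explicitly.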
This module contains the code for extracting the values needed for text generation out of the supplied data package.
The Graph class defines the basic interface, which is simply to be able to get a dictionary of descriptor variables back. It is extended by AutoGraph which is also abstract, and defines the interface for automated data ingestion.
AutoGraph is extended by GraphiteGraph, which implements ingestion for graphite-formatted data. This Graph class makes various non-general assumptions about the incoming data, consistent with the data being generated by the “graphite” web graphing application.
Useful extensions to this module would include a broader selection of specific graph types, supporting a wider array of input data use cases.
- class wordgraph.grapher.AutoGraph¶
Base class for automated data ingestion types. Defines the call to data ingestion.
Perform the processing on raw data and prepare the dictionary of values needed for the realisers
- class wordgraph.grapher.Graph¶
Base class for all other graph types, defining the data access method which will be used by the Realiser classes to fetch descriptor data.
Return a dictionary of values for use by the realiser classes.
- class wordgraph.grapher.GraphiteGraph¶
Expects data as produced by the “graphite” web application.
Stores the raw data in self.raw_data, stores structured data in self.processed_data, and creates the response for as_dict as self.result.
Analysers are used to process a single series, and produce structured output describing the series.
As a user of this module, you will usually simply invoke get_analysis(points).
Analysers do not care about things like axis labels; they only need to find the best way of representing the data in the graph.
The analysers’ results are language agnostic, and will be translated into natural language elsewhere.
To add a new analyser, subclass FixedIntervalAnalyser and implement the two methods: get_validity() is used to assess which analyser is most suitable for describing the data, while get_result() performs the actual analysis. Then add the new class to the list of analysers.
Note that the values passed into the analysers are just the y-values; you should not need to know the x-values, and you can assume the x distance between consecutive points is constant.
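The subclassing pattern above can be sketched as follows. The FixedIntervalAnalyser here is a simplified stand-in so the example is self-contained; the real base class lives in wordgraph.analysers, and ConstantAnalyser is a hypothetical example, not part of the library.

```python
class FixedIntervalAnalyser:
    """Simplified stand-in for wordgraph.analysers.FixedIntervalAnalyser."""

    def __init__(self, points):
        # points are y-values only; the x spacing is assumed constant
        self.points = points

    def get_validity(self):
        raise NotImplementedError

    def get_result(self):
        raise NotImplementedError


class ConstantAnalyser(FixedIntervalAnalyser):
    """Hypothetical analyser that scores highly when the series barely varies."""

    name = 'constant'

    def get_validity(self):
        spread = max(self.points) - min(self.points)
        scale = max(abs(max(self.points)), abs(min(self.points)), 1e-9)
        # 1.0 for a perfectly flat line, falling towards 0 as spread grows
        return max(0.0, 1.0 - spread / scale)

    def get_result(self):
        # jsonable summary of the series for the realisers
        return {'name': self.name, 'value': sum(self.points) / len(self.points)}
```

For a flat series such as `[3.0, 3.0, 3.0]`, `get_validity()` returns 1.0 and `get_result()` reports the mean value.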
- class wordgraph.analysers.FixedIntervalAnalyser(points)¶
Given a series of y-values, associated with fixed x-axis increments, provide analysis of it.
Returns a (jsonable) representation of the values using this analyser.
Returns a float representing how well this analyser can describe these values. 1 represents a perfect fit, 0 represents no fit.
- class wordgraph.analysers.LinearDistribution(points)¶
- name = 'linear'¶
- class wordgraph.analysers.NormalDistribution(points)¶
Using the original x values of the input data, return a new set of points which are based on the assumed mean and standard deviation.
(hacky) approach: assume that we do have a normal distribution, then compare the theoretical cumulative normal distribution to the actual distribution at various points, and base the overall score on the overall deviation.
We will assume the X values start at 0 and increment by 1 for each point in the series.
- name = 'normal'¶
What is the x value which places this proportion of the graph to the left of this place?
- class wordgraph.analysers.RandomDistribution(points)¶
No meaningful pattern in the data.
- name = 'random'¶
Instantiate a bunch of analysers and return the one which suits this data best.
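The selection step can be sketched like this. The two analyser classes and the analyser list are illustrative stand-ins, not the module's actual implementation: the real get_analysis works against the FixedIntervalAnalyser subclasses described above.

```python
# Illustrative analysers: one that only fits a flat series, and one
# weak fallback that always applies (like RandomDistribution).
class FlatAnalyser:
    name = 'flat'

    def __init__(self, points):
        self.points = points

    def get_validity(self):
        return 1.0 if len(set(self.points)) == 1 else 0.0

    def get_result(self):
        return {'name': self.name, 'value': self.points[0]}


class FallbackAnalyser:
    name = 'random'

    def __init__(self, points):
        self.points = points

    def get_validity(self):
        return 0.1  # weak fit: available as a last resort

    def get_result(self):
        return {'name': self.name}


ANALYSERS = [FlatAnalyser, FallbackAnalyser]


def get_analysis(points):
    """Instantiate each analyser and return the best-fitting result."""
    candidates = [cls(points) for cls in ANALYSERS]
    best = max(candidates, key=lambda analyser: analyser.get_validity())
    return best.get_result()
```

A flat series is claimed by FlatAnalyser, while anything else falls through to the fallback.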
Cumulative distribution function for the standard normal distribution.
Note – this expects the standard deviation normalisation to have already been done outside of this function.
Taken from the python math docs. Using to avoid a scipy dependency for now – scipy is very hard to install via pip due to O/S package dependencies. We want to support a simple install process if at all possible.
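The scipy-free CDF described above follows the erf-based formula given in the Python math module documentation; a sketch, assuming the input has already been normalised to zero mean and unit standard deviation:

```python
from math import erf, sqrt


def normal_cdf(x):
    # Cumulative distribution function for the standard normal
    # distribution, adapted from the Python math docs; avoids a scipy
    # dependency. Expects x already normalised (zero mean, unit stddev).
    return (1.0 + erf(x / sqrt(2.0))) / 2.0
```

For example, `normal_cdf(0.0)` is 0.5, and `normal_cdf(1.96)` is approximately 0.975.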