imhr.data

@purpose: Module designed for accessing and visualizing data.
@date: Created on Sat May 1 15:12:38 2019
@author: Semeon Risom

Classes

Download([isLibrary]) Download raw data from Apache, Box, or REDCap servers.
Plot([isLibrary]) Hub for creating data visualizations using pandas, seaborn, and bokeh.
class imhr.data.Download(isLibrary=False)

Bases: imhr.data.download.Download

Download raw data from Apache, Box, or REDCap servers.

Parameters:
isLibrary : bool

Check if required libraries are available.

Methods

REDCap(path, token, url, content[, payload]) Download data from a Research Electronic Data Capture (REDCap) server.
SFTP(source, destination, hostname, …) Connect to a remote server using a Secure File Transfer Protocol (SFTP).
SQL(driver, destination, hostname, username, …) Connect to a MySQL or MSSQL server.
classmethod REDCap(path, token, url, content, payload=None, **kwargs)

Download data from a Research Electronic Data Capture (REDCap) server.

Parameters:
path : str

Path to save data. For example:

>>> path = '/Users/mdl-admin/Desktop/r33/redcap'
token : str

The API token specific to your REDCap project and username. This is usually found on the Applications > API page. For example:

>>> token = 'D19859823032SFDMR24395298'
url : str

The URL of the REDCap project. For example:

>>> url = 'https://redcap.prc.utexas.edu/redcap/api/'
content : str {report, file, raw, arm, instrument, returnFormat, metadata, project, surveyLink, user, participantList}

Type of export. Examples include exporting a report (report), file (file), or project info (project).

payload : dict or None, optional

Manually submit parameters for exporting or importing data. Can be entered within the function for convenience.

**kwargs : str or None, optional

Additional properties, relevant for specific content types. Here’s a list of available properties:

Property Description
report_id : str (report, record) The report ID number provided next to the report name on the report list page.
cformat : str {csv, json, xml, odm} Format of the returned data: csv, json, xml, or odm. Default is json.
ctype : str Shape of data. Default is flat.
rawOrLabel : str {raw, label} (report, record) Export the raw coded values or the labels for the options of multiple choice fields.
rawOrLabelHeaders : str (report, record) Export the variable/field names (raw) or the field labels (label).
exportCheckboxLabel : str Specifies the format of checkbox field values when exporting the data as labels (i.e., when rawOrLabel=label).
returnFormat : str Format to return errors. Default is json.
Returns:
log : pandas.DataFrame or None

Pandas dataframe of each download request.

content : pandas.DataFrame or None

Pandas dataframe of all data downloaded.

start, end : str

Timestamp (ISO format) and name of most recent (end) and first (start) file created in folder.

now : str

Current timestamp in ISO format.
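
The keyword properties above resemble the fields of a REDCap API export request. The following is a minimal standard-library sketch of how such a request body might be assembled. `build_payload` is a hypothetical helper, not part of imhr, and the mapping of `cformat` to `format` and `ctype` to `type` is an assumption about how the keywords are forwarded. The token and report ID are placeholders.

```python
def build_payload(token, content, **kwargs):
    """Assemble the body of a REDCap export request (hypothetical helper)."""
    payload = {
        "token": token,
        "content": content,
        # Assumed mapping: cformat -> "format", ctype -> "type".
        "format": kwargs.get("cformat", "json"),
        "type": kwargs.get("ctype", "flat"),
        "returnFormat": kwargs.get("returnFormat", "json"),
    }
    # Optional fields are forwarded only when the caller supplies them.
    for key in ("report_id", "rawOrLabel", "rawOrLabelHeaders", "exportCheckboxLabel"):
        if key in kwargs:
            payload[key] = str(kwargs[key])
    return payload


payload = build_payload(
    token="<your-api-token>",  # placeholder, not a real token
    content="report",
    report_id="<report-id>",   # placeholder
    rawOrLabel="label",
)
```

A dictionary like this would typically be sent as the form body of an HTTP POST to the project's API `url`.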

classmethod SFTP(source, destination, hostname, username, password, **kwargs)

Connect to a remote server using a Secure File Transfer Protocol (SFTP).

Parameters:
source : str

The directory path to retrieve participant data.

destination : str

The directory path to save participant data.

hostname : str

SFTP hostname.

username : str

SFTP username.

password : str

SFTP password.

**kwargs : str or None, optional

Additional properties, relevant for specific content types. Here’s a list of available properties:

Property Description
filetype : str or None Filetype to download. Default is csv.
Returns:
log : pandas.DataFrame or None

Pandas dataframe of each download request.

content : pandas.DataFrame or None

Pandas dataframe of all files downloaded.

start, end : str

Timestamp (ISO format) and name of most recent (end) and first (start) file created in folder.

now : str

Current timestamp in ISO format.

Examples

>>> name='r33'; s='/home/utweb/utw1211/public_html/r33'; d='/Users/mdl/Desktop/r33/'; un='utw1211'; pwd='43#!9amZ?K$'
>>> hostname = '<sftp-hostname>'  # placeholder: the hostname is not given in the original example
>>> log, start, end, now = download.SFTP(source=s, destination=d, hostname=hostname, username=un, password=pwd)
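
The `start` and `end` return values describe the first and most recent file in the destination folder. The sketch below shows one way such values could be derived with the standard library. `first_and_last` is a hypothetical helper, not part of imhr, and it uses the modification time as a portable stand-in for the creation time.

```python
import os
import tempfile
from datetime import datetime
from pathlib import Path


def first_and_last(folder, filetype="csv"):
    """Return (start, end): ISO timestamp and name of the oldest and newest file."""
    files = sorted(Path(folder).glob(f"*.{filetype}"), key=lambda p: p.stat().st_mtime)
    if not files:
        return None, None

    def describe(p):
        stamp = datetime.fromtimestamp(p.stat().st_mtime).isoformat()
        return f"{stamp} {p.name}"

    return describe(files[0]), describe(files[-1])


# Usage with a throwaway folder and explicit modification times.
with tempfile.TemporaryDirectory() as tmp:
    for name, mtime in [("a.csv", 1_000_000_000), ("b.csv", 1_500_000_000), ("c.txt", 1_700_000_000)]:
        path = Path(tmp) / name
        path.write_text("x")
        os.utime(path, (mtime, mtime))  # set access and modification times
    start, end = first_and_last(tmp)
```
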
classmethod SQL(driver, destination, hostname, username, password, database, table, **kwargs)

Connect to a MySQL or MSSQL server.

Parameters:
driver : str {'MySQL', 'MSSQL'}

The type of SQL server. Either MySQL or MSSQL.

hostname : str

The host name or IP address of the SQL server.

username : str

The user name used to authenticate with the SQL server.

password : str

The password to authenticate the user with the SQL server.

database : str

The database name to use when connecting with the SQL server.

table : str

The table name to use when connecting with the SQL server.

**kwargs : str or None, optional

Additional properties, relevant for specific content types. Here’s a list of available properties:

Property Description
port : int The port used to connect to the SQL server.
Returns:
connection : mysql.connector.connect (https://dev.mysql.com/doc/connector-python/en/connector-python-api-mysql-connector-connect.html)

MySQL connector instance.
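
As an illustration of how the parameters above might be gathered before opening a connection, here is a minimal sketch. `connection_args` is a hypothetical helper, not part of imhr. The keyword names `host`, `user`, `password`, `database`, and `port` are ones that `mysql.connector.connect` accepts. Falling back to the conventional default ports (3306 for MySQL, 1433 for MSSQL) is an assumption.

```python
# Conventional default ports for each server type.
DEFAULT_PORTS = {"MySQL": 3306, "MSSQL": 1433}


def connection_args(driver, hostname, username, password, database, port=None):
    """Collect connection keyword arguments (hypothetical helper)."""
    if driver not in DEFAULT_PORTS:
        raise ValueError(f"driver must be one of {sorted(DEFAULT_PORTS)}, got {driver!r}")
    return {
        "host": hostname,
        "user": username,
        "password": password,
        "database": database,
        # Assumed behavior: use the conventional port unless one is given.
        "port": DEFAULT_PORTS[driver] if port is None else int(port),
    }


# All values below are placeholders.
args = connection_args("MySQL", "<hostname>", "<username>", "<password>", "<database>")
```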

class imhr.data.Plot(isLibrary=False)

Bases: imhr.data.plot.Plot

Hub for creating data visualizations using pandas, seaborn, and bokeh.

Parameters:
isLibrary : bool

Check if required libraries are available.

Methods

bokeh_calibration(config, df, cxy, event[, …]) Create calibration matrix, using pandas and bokeh.
bokeh_trial(config, df, stim_bounds, …) Create single subject trial bokeh plots.
boxplot(config, df[, path, x, y, title, …]) Creates boxplot using seaborn and pandas.
cooks_plot(config, y, model, path, df) Create Cook's distance plot using seaborn, pandas, and statsmodels.
corr_matrix(config, df, path, title, method) Create correlation matrix using bokeh and pandas.
density_plot(config, df, title) Create density plot (draws kernel density estimate), using seaborn and pandas.
html([destination, df, raw_data, name, …]) Create HTML output.
logit_plot(config, df, path, param) Create logistic regression plot using seaborn and pandas.
onset_diff_plot(config, df, meta, drop, y[, …]) Plot onset differences using pandas and seaborn.
qq_plot(config, y, residuals, path) Create probability plot using seaborn, pandas, and rpy2.
residual_plot(config, y, residuals, path) Create residual plot using seaborn, pandas, and rpy2.
single_subject(config, df, path) Create single subject scatterplot using seaborn and pandas.
classmethod bokeh_calibration(config, df, cxy, event, monitorSize=[1920, 1080])

Create calibration matrix, using pandas and bokeh.

Parameters:
config : dict

Configuration data.

df : pandas.DataFrame

Pandas dataframe of raw data.

cxy : pandas.DataFrame

Pandas dataframe of calibration points.

event : str

Event type, either calibration or validation.

monitorSize : list

Monitor size, in pixels.

classmethod bokeh_trial(config, df, stim_bounds, roi_bounds, flt)

Create single subject trial bokeh plots.

Parameters:
df : pandas.DataFrame

Pandas dataframe of participant sample data.

stim_bounds : dict

Stimulus bounds on screen.

roi_bounds : dict

ROI bounds on screen.

flt : str

Filter type.

classmethod boxplot(config, df, path=None, x=None, y=None, title=None, plots=None, cat='analysis')

Creates boxplot using seaborn and pandas.

Parameters:
config : dict

Configuration data.

df : pandas.DataFrame

Pandas dataframe of raw data.

path : str

Path to save data.

x : str

X-axis.

y : str

Y-axis.

title : str

Plot title.

plots : dict

Dictionary of plots metadata.

cat : str

Type of plot.
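
For readers unfamiliar with what a boxplot summarizes, the sketch below computes the usual quantities (quartiles, whiskers under the conventional 1.5 × IQR rule, and outliers) with the standard library. `box_stats` is a hypothetical helper for illustration and is not how imhr or seaborn is implemented.

```python
from statistics import quantiles


def box_stats(values):
    """Return the quantities a boxplot typically draws (hypothetical helper)."""
    data = sorted(values)
    # "inclusive" interpolates linearly between the observed data points.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],    # smallest value within the fences
        "whisker_high": inside[-1],  # largest value within the fences
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
```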

classmethod cooks_plot(config, y, model, path, df)

Create Cook's distance plot using seaborn, pandas, and statsmodels.

Parameters:
config : dict

Configuration data.

df : pandas.DataFrame

Pandas dataframe of raw data.

path : str

The directory path to save the seaborn plot.

y : str

The predictor variable.

model : dict

statsmodels model.

Returns:
res

seaborn plot.
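
To make the plotted quantity concrete, the sketch below computes Cook's distance for a simple linear regression with one predictor, using D_i = e_i² / (p · MSE) · h_ii / (1 − h_ii)², where e_i is the residual, h_ii the leverage, and p the number of fitted coefficients. It is a standard-library illustration under that one-predictor assumption, not the statsmodels computation the method relies on, and `cooks_distance` is a hypothetical name.

```python
def cooks_distance(x, y):
    """Cook's distance for each point of a one-predictor least-squares fit."""
    n = len(x)
    p = 2  # fitted coefficients: intercept and slope
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - p)
    # Leverage of each observation in simple linear regression.
    leverage = [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]
    return [
        (e ** 2 / (p * mse)) * (h / (1 - h) ** 2)
        for e, h in zip(residuals, leverage)
    ]


x = [1, 2, 3, 4, 5, 6]
y = [1.1, 2.0, 2.9, 4.2, 5.0, 12.0]  # the last point departs from the trend
distances = cooks_distance(x, y)
```

In this made-up data the last observation receives the largest distance, which is the kind of point a Cook's distance plot is meant to highlight.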

classmethod corr_matrix(config, df, path, title, method, footnote=None)

Create correlation matrix using bokeh and pandas.

Parameters:
config : dict

Configuration data.

df : pandas.DataFrame

Pandas dataframe of raw data.

path : str

The directory path to save the bokeh plot.

method : str

Spearman or Pearson correlation coefficient.

title : str

Chart title.

footnote : str

Chart footnote.

Returns:
cm

Bokeh plot.
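
The difference between the two `method` options can be sketched without any plotting. The example below is a standard-library illustration with hypothetical helper names. It is not the imhr implementation, which works on a pandas dataframe (pandas offers `DataFrame.corr(method=...)` for the same coefficients). Spearman is computed here as the Pearson coefficient of the ranks.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def corr_matrix(columns, method="pearson"):
    """Nested dict of pairwise correlations between named columns."""
    func = pearson if method == "pearson" else spearman
    names = list(columns)
    return {a: {b: func(columns[a], columns[b]) for b in names} for a in names}


data = {"x": [1, 2, 3, 4, 5], "y": [1, 4, 9, 16, 25], "z": [5, 4, 3, 2, 1]}
matrix = corr_matrix(data, method="spearman")
```

Because `y` is a monotonic but nonlinear function of `x`, the Spearman coefficient between them is 1 while the Pearson coefficient is slightly lower.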

classmethod density_plot(config, df, title)

Create density plot (draws kernel density estimate), using seaborn and pandas.

Parameters:
config : dict

Configuration data.

df : pandas.DataFrame

Pandas dataframe of raw data.

title : str

Chart title.

Returns:
cm

Bokeh or seaborn plot.
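
A kernel density estimate places a small smooth bump on each observation and averages the bumps. The sketch below illustrates the idea with a Gaussian kernel and the normal reference rule of thumb for the bandwidth (1.06 · s · n^(−1/5)). It is a standard-library illustration with a hypothetical function name, and seaborn's own bandwidth choice may differ.

```python
from math import exp, pi, sqrt
from statistics import stdev


def gaussian_kde(samples, grid, bandwidth=None):
    """Evaluate a Gaussian kernel density estimate at each grid point."""
    n = len(samples)
    if bandwidth is None:
        # Normal reference rule of thumb (an assumption, not seaborn's default).
        bandwidth = 1.06 * stdev(samples) * n ** (-1 / 5)
    norm = 1 / (n * bandwidth * sqrt(2 * pi))
    return [
        norm * sum(exp(-0.5 * ((g - s) / bandwidth) ** 2) for s in samples)
        for g in grid
    ]


samples = [1.0, 2.0, 2.5, 3.0, 4.5]
step = 0.05
grid = [-5 + step * i for i in range(301)]  # evenly spaced from -5 to 10
density = gaussian_kde(samples, grid)
area = sum(density) * step  # numerical check that the curve integrates to about 1
```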

classmethod html(destination=None, df=None, raw_data=None, name=None, path=None, plots=None, source=None, title=None, intro=None, footnote=None, script='', **kwargs)

Create HTML output.

Parameters:
destination : str

Path to save file to.

df : pandas.DataFrame

Pandas dataframe of analysis results data.

raw_data : pandas.DataFrame

Pandas dataframe of raw data.

name : str

(If source is logit) The name of the csv file created.

path : str

The directory path of the html file.

plots : dict

If generating seaborn images, the list of plots used.

source : str

The type of data being received.

trial : str

(If Bokeh) Trial Number.

session : str

(If Bokeh) Session Number.

bokeh_type : str

(If Bokeh) Control directory location. If trial, create trial plots.

title : str

The title of the table or figure.

intro : str

The introduction of the group of figures or tables.

footnote : str

The footnote of the table or figure.

metadata : dict

Additional data to be included.

**kwargs : str, int, or None, optional

Additional properties, relevant for specific content types. Here’s a list of available properties:

Property Description
short, long : str Short (aoi) and long form (Area of Interest) label of html page. This is primarily used for constructing metadata tags in html.
display : str (For bokeh) The type of calibration/validation display.
trial : str (For bokeh) The trial number for the eyetracking task.
session : int (For bokeh) The session number for the eyetracking task.
day : str (For bokeh) The day the eyetracking task was run.
Returns:
html : str

String of html code.
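
As a rough picture of what "a string of html code" built from a title, an introduction, tabular data, and a footnote can look like, here is a minimal standard-library sketch. `render_page` and its arguments are hypothetical. The actual method takes many more options and its markup is not documented here.

```python
from html import escape


def render_page(title, intro, rows, footnote=None):
    """Return a small HTML document as a string (hypothetical helper)."""
    body_rows = "\n".join(
        "<tr>" + "".join(f"<td>{escape(str(cell))}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    parts = [
        "<!DOCTYPE html>",
        f"<html><head><title>{escape(title)}</title></head><body>",
        f"<h1>{escape(title)}</h1>",
        f"<p>{escape(intro)}</p>",
        f"<table>\n{body_rows}\n</table>",
    ]
    if footnote:
        parts.append(f"<footer>{escape(footnote)}</footer>")
    parts.append("</body></html>")
    return "\n".join(parts)


# Made-up content; escape() keeps "<raw>" from being read as a tag.
page = render_page("Example table", "Summary of <raw> data.", [["a", 1], ["b", 2]], footnote="n = 2")
```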

classmethod logit_plot(config, df, path, param)

Create logistic regression plot using seaborn and pandas.

Parameters:
df : pandas.DataFrame

Pandas dataframe of raw data.

path : str

The directory path to save the seaborn plot.

param : dict

x, y, groupby parameters.

Returns:
lmp

seaborn plot.

classmethod onset_diff_plot(config, df, meta, drop, y, clip=None)

Plot onset differences using pandas and seaborn.

Parameters:
config : dict

Configuration data.

df : pandas.DataFrame

Pandas dataframe of raw data.

meta : pandas.DataFrame

Metadata for each chart.

drop : pandas.DataFrame

Participants to be dropped.

y : str

Variable of interest.

clip : int

Clip value for single subject plot.

Returns:
odp

Bokeh or seaborn plot.

classmethod qq_plot(config, y, residuals, path)

Create probability plot using seaborn, pandas, and rpy2.

Parameters:
config : dict

Configuration data.

y : str

Predictor variable.

residuals : pandas.DataFrame

Pandas dataframe of residuals vs fitted, qq data, and raw data.

path : str

The directory path to save the seaborn plot.

Returns:
lmp

seaborn plot.
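
A normal probability (Q-Q) plot pairs the sorted, standardized residuals with the quantiles a standard normal distribution would produce at the same positions. The sketch below computes those pairs with the standard library. `qq_points` is a hypothetical helper, the plotting positions (i − 0.5) / n are one common convention chosen here as an assumption, and the residual values are made up.

```python
from statistics import NormalDist, mean, stdev


def qq_points(residuals):
    """Return (theoretical quantile, standardized residual) pairs."""
    n = len(residuals)
    mu, sigma = mean(residuals), stdev(residuals)
    observed = sorted((r - mu) / sigma for r in residuals)
    # Standard normal quantiles at plotting positions (i - 0.5) / n.
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, observed))


points = qq_points([-1.2, 0.3, 0.8, -0.4, 1.9, -0.9, 0.1])
```

If the residuals are approximately normal, the pairs fall close to the line y = x.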

classmethod residual_plot(config, y, residuals, path)

Create residual plot using seaborn, pandas, and rpy2.

Parameters:
config : dict

Configuration data.

y : str

Predictor variable.

residuals : pandas.DataFrame

Pandas dataframe of residuals vs fitted, qq data, and raw data.

path : str

The directory path to save the seaborn plot.

Returns:
lmp

seaborn plot.

classmethod single_subject(config, df, path)

Create single subject scatterplot using seaborn and pandas.

Parameters:
df : pandas.DataFrame

Pandas dataframe of raw data.

path : str

The directory path to save the bokeh or seaborn plot.

Returns:
cm

Bokeh or seaborn plot.