imhr.Webgazer.processing

@purpose: Hub for processing and analyzing raw data.   
@date: Created on Sat May 1 15:12:38 2019   
@author: Semeon Risom   
@email: semeon.risom@gmail.com   
@url: https://semeon.io/d/R33-analysis   

Classes

Processing(config[, isLibrary, isDebug]) Hub for processing and analyzing raw data.

class imhr.Webgazer.processing.Processing(config, isLibrary=False, isDebug=False)

Bases: object

Hub for processing and analyzing raw data.

Methods

`append_classify`(self, df, cg_df)	Append classification to the DataFrame.
`classify`(self, config, df[, ctype, …])	Classify eye-movement samples using the I-VT, I-DT, or simple algorithm.
`dwell`(self, df[, cores, isMultiprocessing])	Calculate dwell time for sad and neutral images.
`filter_data`(self, df, filter_type, config)	Butterworth: Design an Nth-order digital or analog Butterworth filter and return the filter coefficients.
`getData`(self[, path])	Prepare data for use in analysis.
`getEstimatedMonitor`(self, diagonal, window)	Calculate estimated monitor size (w, h; cm) from the estimated monitor diagonal (hypotenuse; cm).
`onset_diff`(self, df0[, merge, cores])	Calculate differences in onset presentation (stimulus, dotloc) using bokeh, seaborn, and pandas.
`preprocess`(self, df, window)	Initial data cleaning.
`process`(self, window, filters, gxy_df, trial)	Plotting and preparing data for classification.
`roi`(self[, filters, flt, df, manual, …])	Check if fixation is within bounds.
`run`(self, path[, task_type, single_subject, …])	Processing of data.
`subject_metadata`(self, fpath, spath)	Collect all subjects' metadata.
`variables`(self, df)	Output list of variables for easy HTML viewing.

getEstimatedMonitor(self, diagonal, window)

Calculate estimated monitor size (w, h; cm) from the estimated monitor diagonal (hypotenuse; cm).

Attributes:	df_raw : `pandas.DataFrame` Pandas dataframe of subjects.
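
As an illustration of the geometry only, the following minimal sketch recovers width and height from a diagonal. It assumes square pixels, so that the physical aspect ratio equals the pixel aspect ratio of `window`; the function name `estimated_monitor_size` is hypothetical and is not the library's implementation.

```python
import math


def estimated_monitor_size(diagonal_cm, window):
    """Return (width_cm, height_cm) for a monitor with the given diagonal.

    Assumption: pixels are square, so the physical aspect ratio matches
    the pixel aspect ratio of `window` (horizontal, vertical resolution).
    """
    res_w, res_h = window
    # Length of the screen diagonal in pixels (hypotenuse of the window).
    diagonal_px = math.hypot(res_w, res_h)
    # Physical size of one pixel along either axis.
    cm_per_px = diagonal_cm / diagonal_px
    return res_w * cm_per_px, res_h * cm_per_px


# A 50 cm diagonal at 1600x1200 is a 3-4-5 triangle: roughly 40 cm by 30 cm.
width_cm, height_cm = estimated_monitor_size(50.0, (1600, 1200))
```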

preprocess(self, df, window)

Initial data cleaning.

Parameters:	df : `pandas.DataFrame` Pandas dataframe of raw data.
	window : `tuple` Horizontal, vertical resolution.
Attributes:	m_delta : `int` Maximum one-sample change in velocity.

Notes

remove_missing:: Remove samples with null values.
remove_bounds:: Remove samples outside of window bounds (1920,1080).
remove_spikes:: Remove one-sample spikes if the x- and y-axis delta is greater than 5.
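
The three cleaning steps above can be sketched as follows. This is a plain-Python illustration rather than the pandas-based method itself: the function name `preprocess_samples`, the use of `None` for missing values, and the exact spike rule (a sample that differs from both neighbours by more than `m_delta` on both axes) are assumptions.

```python
def preprocess_samples(samples, window=(1920, 1080), m_delta=5):
    """Clean a list of (x, y) gaze samples; missing coordinates are None."""
    width, height = window

    # remove_missing: drop samples with a null coordinate.
    kept = [(x, y) for x, y in samples if x is not None and y is not None]

    # remove_bounds: drop samples that fall outside the window.
    kept = [(x, y) for x, y in kept if 0 <= x <= width and 0 <= y <= height]

    # remove_spikes (assumed rule): drop a sample when it jumps more than
    # m_delta away from BOTH its previous and next sample on BOTH axes.
    cleaned = []
    for i, (x, y) in enumerate(kept):
        if 0 < i < len(kept) - 1:
            (px, py), (nx, ny) = kept[i - 1], kept[i + 1]
            spike_x = abs(x - px) > m_delta and abs(x - nx) > m_delta
            spike_y = abs(y - py) > m_delta and abs(y - ny) > m_delta
            if spike_x and spike_y:
                continue
        cleaned.append((x, y))
    return cleaned


raw = [(100, 100), (101, 100), (None, 50), (2500, 300),
       (102, 101), (180, 190), (103, 101), (104, 102)]
clean = preprocess_samples(raw)
```

In this example the missing sample, the out-of-bounds sample, and the isolated jump to (180, 190) are dropped, leaving five samples.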

getData(self, path=None)

Prepare data for use in analysis.

Parameters:	path : `str` The directory path of the subject data.
Attributes:	path : `str` Specific directory path used.
Returns:	df : `pandas.DataFrame` Pandas dataframe of raw data.
	_path : `list` List of files used for analysis.

Notes

You can either get data from all subjects within a directory, or from a specific subject (subject_session).

Examples

>>> #if using path:
>>> df_raw = getData(path=self.config['path']['raw'])

>>> #if getting data for single subject:
>>> df_raw = getData(path=self.config['path']['raw'],subject_session=['1099','1', '0'])

filter_data(self, df, filter_type, config)

Butterworth: Design an Nth-order digital or analog Butterworth filter and return the filter coefficients.

Parameters:	df : `pandas.DataFrame` Pandas dataframe of raw data.
	filter_type : `str`, optional Type of filter.
	config : `dict` Configuration data, i.e. trial number, location.
Attributes:	filter_type : `str` Filter type: ‘butterworth’
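
To make the idea of filter coefficients concrete, here is a minimal sketch of the simplest case, a first-order (N = 1) digital Butterworth low-pass designed with the bilinear transform. It is written in plain Python for self-containment and is not the code used by `filter_data`; the function names, the 5 Hz cutoff, and the 60 Hz sampling rate are illustrative assumptions.

```python
import math


def butter1_lowpass_coeffs(cutoff_hz, fs_hz):
    """Coefficients (b, a) of a first-order digital Butterworth low-pass.

    Derived from H(s) = wc / (s + wc) via the bilinear transform with
    frequency pre-warping: k = tan(pi * cutoff / fs).
    """
    k = math.tan(math.pi * cutoff_hz / fs_hz)
    b = (k / (1 + k), k / (1 + k))
    a = (1.0, (k - 1) / (k + 1))
    return b, a


def lowpass(values, cutoff_hz, fs_hz):
    """Apply y[n] = b0*x[n] + b1*x[n-1] - a1*y[n-1] to a non-empty sequence."""
    (b0, b1), (_, a1) = butter1_lowpass_coeffs(cutoff_hz, fs_hz)
    filtered = []
    # Start from steady state at the first value to avoid a start-up transient.
    x_prev = y_prev = values[0]
    for x in values:
        y_cur = b0 * x + b1 * x_prev - a1 * y_prev
        filtered.append(y_cur)
        x_prev, y_prev = x, y_cur
    return filtered


# Hypothetical horizontal gaze trace with one noisy sample (160).
gaze_x = [100, 101, 99, 160, 100, 102, 101]
smooth_x = lowpass(gaze_x, cutoff_hz=5, fs_hz=60)
```

The filter has unit gain at 0 Hz, so a constant signal passes through unchanged while the isolated jump is attenuated.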

classify(self, config, df, ctype='ivt', filter_type=None, v_th=None, dr_th=None, di_th=None, missing=None, maxdist=None, mindur=None)

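Classify eye-movement samples with the algorithm selected by `ctype`: velocity threshold (I-VT), dispersion threshold (I-DT), or simple.

The I-DT algorithm takes into account the distribution or spatial proximity of eye position points in the eye-movement trace.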
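
One common formulation of I-DT slides a window over the trace: a window that spans at least the duration threshold and whose dispersion, (max x − min x) + (max y − min y), stays within the dispersion threshold is grown until the dispersion is exceeded, then recorded as a fixation. The sketch below follows that formulation; whether `classify` uses exactly this dispersion measure is an assumption, and the function name and sample layout are hypothetical.

```python
def classify_idt(samples, di_th, dr_th):
    """Detect fixations in a list of (t_ms, x, y) samples.

    di_th: dispersion threshold in pixels; dr_th: duration threshold in ms.
    Returns a list of (start_ms, end_ms, centroid_x, centroid_y) tuples.
    """
    def dispersion(win):
        xs = [s[1] for s in win]
        ys = [s[2] for s in win]
        return (max(xs) - min(xs)) + (max(ys) - min(ys))

    fixations = []
    i, n = 0, len(samples)
    while i < n:
        # Grow the initial window until it spans the duration threshold.
        j = i
        while j < n and samples[j][0] - samples[i][0] < dr_th:
            j += 1
        if j >= n:
            break  # Not enough remaining data to span dr_th.
        if dispersion(samples[i:j + 1]) <= di_th:
            # Extend the window while the dispersion stays within di_th.
            while j + 1 < n and dispersion(samples[i:j + 2]) <= di_th:
                j += 1
            win = samples[i:j + 1]
            cx = sum(s[1] for s in win) / len(win)
            cy = sum(s[2] for s in win) / len(win)
            fixations.append((win[0][0], win[-1][0], cx, cy))
            i = j + 1
        else:
            # Window too dispersed: drop its first sample and retry.
            i += 1
    return fixations


trace = [(0, 100, 100), (20, 102, 101), (40, 101, 99), (60, 103, 100),
         (80, 102, 102), (100, 101, 101), (120, 100, 100),
         (140, 400, 300), (160, 401, 301), (180, 399, 300)]
idt_fixations = classify_idt(trace, di_th=20, dr_th=100)
```

Here the first seven samples form one fixation from 0 to 120 ms; the last three samples are too short to span the 100 ms duration threshold.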

In the I-VT model, the velocity value is computed for every eye position sample. The velocity value is then compared to the threshold. If the sampled velocity is less than the threshold, the corresponding eye-position sample is marked as part of a fixation, otherwise it is marked as a part of a saccade.
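
A minimal sketch of this rule, assuming samples are (time in seconds, x, y) tuples and that the first sample, which has no predecessor, is given a velocity of zero. The function name and the 1000 px/sec threshold are illustrative, not the library's defaults.

```python
import math


def classify_ivt(samples, v_th):
    """Label each (t_sec, x, y) sample as 'fixation' or 'saccade'.

    v_th: velocity threshold in pixels per second.
    """
    labels = []
    for i, (t, x, y) in enumerate(samples):
        if i == 0:
            # Assumption: no previous sample, so treat velocity as zero.
            velocity = 0.0
        else:
            t0, x0, y0 = samples[i - 1]
            # Point-to-point velocity: distance travelled over elapsed time.
            velocity = math.hypot(x - x0, y - y0) / (t - t0)
        labels.append("fixation" if velocity < v_th else "saccade")
    return labels


gaze = [(0.00, 100, 100), (0.02, 101, 100), (0.04, 300, 250),
        (0.06, 301, 251), (0.08, 301, 250)]
ivt_labels = classify_ivt(gaze, v_th=1000)
```

Only the third sample, which covers roughly 249 px in 20 ms, exceeds the threshold and is marked as part of a saccade.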

The simple model detects fixations, defined as consecutive samples with an inter-sample distance of less than a set number of pixels (disregarding missing data).
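
The simple model can be sketched as below. The parameter names mirror `missing`, `maxdist`, and `mindur` from the signature, but the function name, the choice to compare each sample with the one immediately before it, and reporting the mean position of the run are assumptions.

```python
import math


def detect_fixations_simple(x, y, time_ms, missing=0.0, maxdist=25, mindur=50):
    """Return fixations as (start_ms, end_ms, duration_ms, mean_x, mean_y)."""
    # Disregard samples flagged with the missing-data value.
    valid = [(t, xi, yi) for t, xi, yi in zip(time_ms, x, y)
             if xi != missing and yi != missing]

    fixations = []
    start = 0  # Index in `valid` of the first sample of the current run.
    for i in range(1, len(valid) + 1):
        # A run ends at the end of the data or when the distance to the
        # previous valid sample reaches maxdist.
        ended = i == len(valid) or math.hypot(
            valid[i][1] - valid[i - 1][1], valid[i][2] - valid[i - 1][2]
        ) >= maxdist
        if ended:
            run = valid[start:i]
            duration = run[-1][0] - run[0][0]
            # Discard fixation candidates shorter than mindur.
            if duration >= mindur:
                mean_x = sum(s[1] for s in run) / len(run)
                mean_y = sum(s[2] for s in run) / len(run)
                fixations.append((run[0][0], run[-1][0], duration, mean_x, mean_y))
            start = i
    return fixations


time_ms = [0, 20, 40, 60, 80, 100, 120, 140]
xs = [100, 102, 101, 0.0, 103, 400, 402, 401]
ys = [100, 101, 100, 0.0, 102, 300, 301, 300]
simple_fixations = detect_fixations_simple(xs, ys, time_ms)
```

The sample at 60 ms is treated as missing and skipped; the first run lasts 80 ms and is kept, while the second run lasts only 40 ms and is discarded.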

Parameters:	config : `dict` Configuration data, i.e. trial number, location.
	df : `pandas.DataFrame` Pandas dataframe of classified data.
	ctype : `str` Classification type: velocity threshold (‘ivt’), dispersion threshold (‘idt’; used by SR-Research and Tobii), or ‘simple’.
	filter_type : `str`, optional Filter type: ‘butter’.
	v_th : `int` Velocity threshold in pix/sec (ivt).
	dr_th : `int` Fixation duration threshold in msec (idt).
	di_th : `int` Dispersion threshold in pixels (idt).
	missing : `float` Value to be used for missing data (simple).
	maxdist : `int` Maximal inter-sample distance in pixels (simple).
	mindur : `int` Minimal duration of a fixation in milliseconds; detected fixation candidates will be disregarded if they are below this duration (simple).
Returns:	df : `pandas.DataFrame` Pandas dataframe of classified data.
Raises:	ValueError Unknown classification type.

roi(self, filters=None, flt=None, df=None, manual=False, monitorSize=None)

Check if fixation is within bounds.

Attributes:	manual : `bool` Whether or not processing.roi() is accessed manually.
	monitorSize : `list` Monitor size.
	filters : `list` Filter parameters. Default [[‘SavitzkyGolay’, ‘sg’]].
	df : `pandas.DataFrame` Pandas dataframe of classified data.
Returns:	df : `pandas.DataFrame` Pandas dataframe of classified data.
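
The bounds check itself reduces to a point-in-rectangle test. The sketch below is illustrative only: the ROI names, their pixel coordinates on a 1920x1080 window, and the helper names are hypothetical and are not taken from the library.

```python
def in_bounds(x, y, bounds):
    """Return True if (x, y) lies inside bounds = (left, top, right, bottom)."""
    left, top, right, bottom = bounds
    return left <= x <= right and top <= y <= bottom


def label_fixations(fixations, rois):
    """Map each (x, y) fixation to the first ROI that contains it, else None."""
    labels = []
    for x, y in fixations:
        name = next((n for n, b in rois.items() if in_bounds(x, y, b)), None)
        labels.append(name)
    return labels


# Hypothetical regions of interest: one image on each side of the screen.
rois = {"left": (160, 340, 760, 740), "right": (1160, 340, 1760, 740)}
fixation_labels = label_fixations([(400, 500), (1500, 600), (960, 540)], rois)
```

A fixation at the screen centre falls in neither region and is labelled `None`.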

process(self, window, filters, gxy_df, trial, _classify=True, ctype='simple', _param='', log=False, v_th=20, dr_th=200, di_th=20, _missing=0.0, _maxdist=25, _mindur=50)

Plotting and preparing data for classification. Combined plot of each filter.

Parameters:	window : `list` Horizontal, vertical resolution.
	filters : `list` List of filters along with short-hand names.
	gxy_df : `pandas.DataFrame` Pandas dataframe of raw data. Unfiltered raw data.
	trial : `str` Trial number.
	_classify : `bool` Whether to include classification.
	ctype : `str` Classification type: ‘simple’, ‘idt’, or ‘ivt’.
	_param : `str` (default is ‘’)
	log : `bool` (default is False)
	v_th : `int` Velocity threshold in px/sec (ivt).
	dr_th : `int` Fixation duration threshold in msec (idt).
	di_th : `int` Dispersion threshold in px (idt).
	_missing : `float` Value to be used for missing data (simple).
	_maxdist : `int` Maximal inter-sample distance in pixels (simple).
	_mindur : `int` Minimal duration of a fixation in milliseconds; detected fixation candidates will be disregarded if they are below this duration (simple) (default is 50).
Attributes:	_fxy_df : `pandas.DataFrame` Pandas dataframe of raw data. Filtered data. Subset of _fgxy_df.
Returns:	_fgxy_df : `pandas.DataFrame` Pandas dataframe of filtered data.
	c_xy : `pandas.DataFrame` Pandas dataframe of classified data.

append_classify(self, df, cg_df)

Append classification to the DataFrame.

Parameters:	df : `pandas.DataFrame` Pandas dataframe of raw data.
	cg_df : `pandas.DataFrame` Pandas dataframe of raw data of classification events.

run(self, path, task_type='eyetracking', single_subject=False, single_trial=False, subject=0, trial=0, isMultiprocessing=True, cores=1)

Processing of data. Steps here include: cleaning data, fixation identification, and exporting data.

Parameters:	path : `str` Path of raw data.
	task_type : `str` Running analysis on eyetracking or behavioral data.
	single_subject : `bool` Whether to run function with all or single subject.
	single_trial : `bool` Whether to run function with all or single trial.
	subject : `int` Subject number. Only if single_subject = True.
	trial : `int` Trial number. Only if single_trial = True.
	isMultiprocessing : `bool` Whether multiprocessing of data will be used. Only if single_subject = False.
	cores : `int` Number of cores to use for multiprocessing. Only if single_subject = False & isMultiprocessing = True.
Attributes:	process : `bool` Process all data for export.

subject_metadata(self, fpath, spath)

Collect all subjects' metadata.

Parameters:	fpath : `str` The directory path of all participant data.
	spath : `str` The directory path of all participant data.
Returns:	df : `ndarray` Pandas dataframe of subject metadata.

variables(self, df)

Output list of variables for easy HTML viewing.

Parameters:	df : `pandas.DataFrame` Pandas dataframe of raw data. This is used as a filter to prevent unused participants from being included in the data.
	path : `str` The directory path used to save and read the HDF5 dataframe.
Returns:	df_definitions : `pandas.DataFrame`

dwell(self, df, cores=1, isMultiprocessing=False)

Calculate dwell time for sad and neutral images.

Parameters:	df : `pandas.DataFrame` Pandas dataframe of raw data. This is used as a filter to prevent unused participants from being included in the data.
	cores : `int` Number of cores to use for multiprocessing.
	isMultiprocessing : `bool` Whether multiprocessing of data will be used.
Returns:	df : `pandas.DataFrame` Pandas dataframe with dwell time.
	error : `list` List of participants that were not included in dataframe.
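
Dwell time is commonly computed as the summed duration of fixations that land on a region. The sketch below assumes each fixation has already been labelled with a trial, a region (‘sad’, ‘neutral’, or `None`), and a duration; the field names and the function `dwell_time` are hypothetical, and the library's own calculation may differ.

```python
from collections import defaultdict


def dwell_time(fixations):
    """Sum fixation durations (ms) for each (trial, roi) pair."""
    totals = defaultdict(float)
    for fix in fixations:
        # Fixations outside every region do not count toward dwell time.
        if fix["roi"] is not None:
            totals[(fix["trial"], fix["roi"])] += fix["duration_ms"]
    return dict(totals)


fixations = [
    {"trial": 1, "roi": "sad", "duration_ms": 220},
    {"trial": 1, "roi": "neutral", "duration_ms": 180},
    {"trial": 1, "roi": "sad", "duration_ms": 310},
    {"trial": 1, "roi": None, "duration_ms": 90},
    {"trial": 2, "roi": "neutral", "duration_ms": 400},
]
dwell = dwell_time(fixations)
```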

onset_diff(self, df0, merge=None, cores=1)

Calculate differences in onset presentation (stimulus, dotloc) using bokeh, seaborn, and pandas.

Parameters:	df0 : `pandas.DataFrame` Pandas dataframe of raw data. This is used to merge variables that may be useful for analysis.
	merge : `list` or None Variables to merge into returned df.
	cores : `int` Number of cores to use for multiprocessing.
Returns:	df1 : `pandas.DataFrame` Pandas dataframe.
	error : `pandas.DataFrame` Dataframe of each participant and the number of trials included in their data.
	drop : `list` List of participants that are more than 3 SD from the median.
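
The `drop` rule can be illustrated with a short sketch. It assumes one summary value per participant (for example, a mean onset difference in milliseconds), uses the sample standard deviation, and flags values lying more than 3 SD from the median; the function name, the participant IDs, and the data are hypothetical.

```python
import statistics


def flag_outliers(onset_diff_ms, n_sd=3):
    """Return participant IDs whose value is more than n_sd SDs from the median.

    onset_diff_ms: dict mapping participant ID to a summary onset
    difference in ms (needs at least two participants).
    """
    values = list(onset_diff_ms.values())
    center = statistics.median(values)
    spread = statistics.stdev(values)  # Sample standard deviation.
    return [pid for pid, value in onset_diff_ms.items()
            if abs(value - center) > n_sd * spread]


onset_diff_ms = {
    "s01": 10, "s02": 11, "s03": 10, "s04": 12, "s05": 11, "s06": 10,
    "s07": 11, "s08": 12, "s09": 10, "s10": 11, "s11": 11, "s12": 100,
}
drop = flag_outliers(onset_diff_ms)
```

With a median of 11 ms and a standard deviation of about 25.8 ms, only participant `s12` lies beyond the 3 SD cut-off.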