compact.reciprocal_hits
Module to find protein pairs that are reciprocal “top” hits, with selectable criteria for the “top” hit cut-off.
- main functions:
get_reciprocal_top_hits: determine reciprocal top hits from rbo score matrix between two datasets
- compact.reciprocal_hits.get_reciprocal_top_hits(scores, score_type='between', criterium='percent', percent=1, out_type='series')
determine reciprocal top hits for given score matrix
- Args:
- scores (pd df): matrix with interaction scores
in case of ‘within’ scores: symmetric matrix with same ids in index and columns.
- score_type (str, optional): Defaults to ‘between’.
type of interaction scores, either ‘within’ or ‘between’: ‘within’ for interaction scores within a single sample, ‘between’ for interaction scores between two samples.
- criterium (‘best’ or ‘percent’): top hit criterium type. Defaults to ‘percent’.
if ‘best’: returns only the single best hit; if ‘percent’: takes the top n % of highest-scoring hits.
- percent (numeric, optional): top percentage. Defaults to 1.
if criterium is ‘percent’, returns the top n percent of hits
- out_type (str, optional): Defaults to ‘series’.
if ‘dict’, output is a dict. if anything else, output is a pd series
- Raises:
ValueError: when score_type parameter is invalid
- Returns:
- pd series: reciprocal top hits in pd series format
2-level multiindex with id pair, values are scores OR
- top_hits_dict (dict): dictionary with reciprocal
top hits. structure: {(‘l_id’, ‘r_id’): score}
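The idea of a reciprocal top hit can be illustrated with a minimal pandas sketch. This is not the compact implementation: the function name reciprocal_top_hits_sketch is hypothetical, it covers only ‘between’ scores, and it assumes that “top n percent” is taken separately for each id (keeping at least one hit per id). The actual function may define the cut-off differently.

```python
import math

import pandas as pd


def reciprocal_top_hits_sketch(scores, criterium="percent", percent=1):
    """Illustrative sketch only; not the compact implementation.

    Assumes 'between' scores: rows are ids of one sample, columns of the other.
    """
    if criterium not in ("best", "percent"):
        raise ValueError(f"unknown criterium: {criterium!r}")

    def top_mask(axis):
        # Rank 1 is the highest score; axis=1 ranks the columns within each
        # row, axis=0 ranks the rows within each column. Ties share a rank.
        ranks = scores.rank(axis=axis, ascending=False, method="min")
        if criterium == "best":
            n_keep = 1
        else:
            # Assumption: "top n percent" is taken per id, keeping at least one.
            n_keep = max(1, math.ceil(scores.shape[axis] * percent / 100))
        return ranks <= n_keep

    # A pair is reciprocal when it is a top hit from both directions.
    reciprocal = top_mask(axis=1) & top_mask(axis=0)
    hits = {
        (l_id, r_id): scores.at[l_id, r_id]
        for l_id in scores.index
        for r_id in scores.columns
        if reciprocal.at[l_id, r_id]
    }
    # Tuple keys become a 2-level MultiIndex, values are the scores.
    return pd.Series(hits, dtype=float)


# Toy 'between' score matrix with made-up ids and scores.
scores = pd.DataFrame(
    [[0.9, 0.2, 0.1], [0.3, 0.8, 0.4], [0.7, 0.1, 0.6]],
    index=["a1", "a2", "a3"],
    columns=["b1", "b2", "b3"],
)
best = reciprocal_top_hits_sketch(scores, criterium="best")
# best holds the pairs (a1, b1) and (a2, b2); a3 prefers b1, but b1 prefers a1.
top_hits_dict = best.to_dict()  # same hits as {(l_id, r_id): score}
```

In this toy matrix a3 and b3 drop out under ‘best’ because neither is the other's single best hit. A looser cut-off such as percent=50 keeps more pairs.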